Structural basis for nuclear import selectivity of pioneer transcription factor SOX2

SOX (SRY-related HMG-box) transcription factors perform critical functions in development and cell differentiation. These roles depend on precise nuclear trafficking, with mutations in the nuclear targeting regions causing developmental diseases and a range of cancers. SOX protein nuclear localization is proposed to be mediated by two nuclear localization signals (NLSs) positioned within the extremities of the DNA-binding HMG-box domain and, although mutations within either cause disease, the mechanistic basis has remained unclear. Unexpectedly, we find here that these two distantly positioned NLSs of SOX2 contribute to a contiguous interface spanning 9 of the 10 ARM domains on the nuclear import adapter IMPα3. We identify key binding determinants and show this interface is critical for neural stem cell maintenance and for Drosophila development. Moreover, we identify a structural basis for the preference of SOX2 binding to IMPα3. In addition to defining the structural basis for SOX protein localization, these results provide a platform for understanding how mutations and post-translational modifications within these regions may modulate nuclear localization and result in clinical disease, and also how other proteins containing multiple NLSs may bind IMPα through an extended recognition interface. The SOX2 pioneer transcription factor performs critical roles in pluripotency and self-renewal of embryonic stem cells. Here the authors show that SOX2’s two nuclear localization signal sequences form a contiguous binding interface on the nuclear import receptor importin-α3, and provide a structural basis for the preference of SOX2 binding to IMPα3.

The human SOX (sex-determining region Y (SRY)-related HMG-box) family of transcription factors comprises 20 members that play critical roles in organogenesis, stem cell maintenance, and cancer progression 1. These proteins can act as either tumor suppressors or activators depending on the cellular environment 2. SOX2 is essential for neural development and for the self-renewal of undifferentiated embryonic and neural stem cells. It is spatially and temporally expressed during development; initial expression occurs at the preimplantation embryo stage, with restriction to the blastocyst inner cell mass and epiblast 3. Expression is also found in the anterior ectoderm to facilitate formation of the neuroectoderm and anterior surface ectoderm 4. During later stages of development, SOX2 expression is found in the primitive foregut endoderm 3,5. SOX2 is one of the key factors required to convert somatic cells into induced pluripotent stem cells (PSCs) and, in concert with Nanog and Oct4, plays a central role in embryonic stem cell maintenance 3,6. These roles require precise and timely localization of SOX2 to the nucleus. To gain access to the nucleus, SOX proteins harbor two NLSs, located distally at the N- and C-termini of the DNA-binding domain 7-15. This arrangement is conserved across all SOX family members (Fig. 1), and mutations within these regions can impair nuclear localization, cause severe developmental disease, and are associated with poor prognosis in cancer 9,16-19 (see also "Abstract" section).
The localization of proteins to the nucleus via the classical import pathway is an active process and requires a cargo bearing an NLS to be recognized by importin-α (IMPα) 20. The cargo:IMPα complex is transported through the nuclear pore complex by importin-β (IMPβ) 21,22 to the nucleus, where it is disassembled by RanGTP. Humans harbor seven IMPα isoforms (IMPα1-7), each containing an N-terminal IMPβ-binding domain (residues 1-70) and a C-terminal NLS-binding domain constructed from ten armadillo (ARM) repeats (residues 70-500). Many nuclear import cargos exhibit specificity toward these isoforms 14,23. For example, RCC1, the Ran nucleotide exchange factor that establishes the directionality of nuclear transport, and HIV-1 integrase, responsible for integrating the HIV-1 genome into the DNA of an infected cell, bind specifically to IMPα3 (refs. 24,25); STAT1, a signaling molecule in the innate immune system, binds specifically to the convex C-terminal surface of IMPα5-7 (refs. 23,26). Ebola VP24 binds specifically to IMPα5 to selectively compete with the nuclear import of phosphorylated STAT1 (ref. 27). SOX proteins show remarkable isoform specificity and, strikingly, undergo isoform-specific switching during differentiation 14. For example, neural differentiation of embryonic stem cells is mediated by IMPα isoform switching such that Oct3/4 is driven to the nucleus by IMPα1 in undifferentiated stem cells; however, during neural development, upregulation of IMPα3/5 mediates SOX2/Brn2 nuclear import and neural differentiation 14. The molecular basis for this specificity is unclear, and understanding IMPα specificity is complicated by the seven human IMPα isoforms all containing highly conserved NLS-binding regions.

SOX2 NLSs are bound by IMPα3 through a contiguous interface. To better understand the mechanisms of how critical signaling regions in SOX proteins interact with nuclear import receptors to drive nuclear transport, we crystallized the HMG domain of SOX2 (comprising residues […]) in complex with IMPα3. SOX2 bound IMPα3 through an extensive and contiguous interface across ARM domains 1-9 of IMPα3 (Fig. 2). The N-terminal NLS (NLS1) was previously reported to be bipartite 7,15,28, and was therefore expected to be bound at both the major and minor sites on IMPα3. However, we found instead that SOX2 residues Arg40, Lys42, and Arg43 were bound at the minor site (IMPα3 ARM domains 6-8; Fig. 2) and that SOX2 Arg57 was bound at ARM 9, outside of the minor site. The C-terminal NLS (NLS2) was bound in the major site of IMPα3, with SOX2 residues Pro112-Met120 bound to ARM domains 1-4 (Fig. 2). The HMG domain of SOX2, which is located between these NLS regions, formed additional interactions with IMPα3, including SOX2 Lys95 bound to ARM 4; SOX2 Arg98 bound to ARM 5; SOX2 His101 and Asp107 bound to ARM 6; and SOX2 Glu104 bound to ARM 7 (see also Supplementary Table 2 for detailed interactions). Overall, this structure showed that these two distally positioned NLSs within the HMG domain form a contiguous NLS interface on IMPα3 through ARM domains 1-9. That the two NLSs form a single, contiguous binding interface requires a fundamental reevaluation of how SOX proteins are recognized by the nuclear import machinery, how mutations in either region can cause disease, and how post-translational modification can regulate this process 29. More broadly, the structure provides a striking illustration of how NLS regions in different parts of a molecule can contribute to forming a contiguous NLS on the IMPα adapter.
To investigate the functional importance of the key binding determinants identified in the SOX2:IMPα3 structure, we engineered structure-guided mutations and examined their influence on nuclear localization, stem cell maintenance, and development. A total of 11 single, structure-guided mutations were assessed for their ability to disrupt interaction with IMPα3. These included R40A, K42A, R43A, R56A, R57A, R98A, R113A, R114A, K115A, T116E (to mimic phosphorylation), and K117A. We found that three of these single-point mutations reduced in vitro binding to IMPα3: K42A, R43A, and K115A (Fig. 3a). Based on a previous study demonstrating that mutating both SOX2 NLS regions imparts the most dramatic reduction in nuclear localization 30, and combining this with our knowledge of the structural interface, we designed a SOX2 K42A, R43A, and K115A triple mutant (SOX2x3Mut) and tested its effect on the cell biology processes that SOX2 mediates. Binding of SOX2x3Mut to IMPα3 was abrogated, as shown in both pull-down (Fig. 3a) and microscale thermophoresis (MST) assays (Fig. 3b).
To examine the impact of these mutations on neural stem cell biology, we compared the ability of SOX2 and the SOX2x3Mut to maintain cell proliferation in a neural stem cell assay (Fig. 4). Neonatal mouse neural stem/progenitor cells (NSC) can be exponentially expanded for extended periods of time (several months) in vitro; in contrast, Sox2-deleted (Sox2 −/−) NSC replicate slowly, and progressively lose their ability to self-renew until the culture becomes completely exhausted 31,32. We attempted to rescue the ability of Sox2 −/− NSC to self-renew long term by transducing them with lentiviruses expressing human wild-type SOX2 (ref. 32) or SOX2x3Mut (Fig. 4a). Sox2 −/− NSC transduced with wild-type SOX2 recovered the ability to efficiently self-renew (Fig. 4b, c), growing with kinetics comparable to wild-type NSC (doubling time: 41.81 ± 7.45 h for mutant cells transduced with wild-type SOX2, versus 44.22 ± 4.95 h for wild-type cells, n = 3 independent experiments; error is standard deviation). In contrast, NSC transduced with the SOX2x3Mut demonstrated inefficient expansion, progressively slowing until a plateau was reached, after which their numbers started to decline (Fig. 4b, c). In an independent experiment with NSC from a different Sox2 mutant mouse transduced with the same vectors, essentially identical results were obtained, with growth curves closely overlapping those of the first experiment (Supplementary Fig. 1). NSC transduced with SOX2x3Mut had a clear tendency to attach to the plastic, elongate, and aggregate (a possible sign of initial differentiation), in contrast to Sox2 −/− NSC transduced with wild-type SOX2, which formed neurospheres, as expected for normal cells (Fig. 4d). FACS analysis of the transduced cells compared to untransduced cells, measuring GFP expressed from the lentiviral vector, showed that in the same culture the transduced cells rapidly exceeded the untransduced cells in number, demonstrating a growth advantage over the rapidly declining Sox2 −/− NSC. Notably, NSC transduced with SOX2x3Mut also grew faster and for longer than untransduced Sox2 mutant NSC (or empty-vector (EV)-transduced NSC, Supplementary Fig. 2), suggesting that SOX2x3Mut may retain some activity, though clearly too low to maintain efficient long-term growth (Fig. 4b, c). Confocal microscopy (Fig. 4e, f) demonstrated predominantly cytoplasmic localization in most of the SOX2x3Mut-transduced NSC, whether they had initiated differentiation (white arrowheads) or not (black arrowheads); in contrast, SOX2 was invariably nuclear in wild-type SOX2-transduced NSC. These results indicate a clear correlation between the predominantly cytoplasmic localization of the SOX2x3Mut and the severe impairment of its ability to sustain long-term NSC self-renewal.
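The doubling times quoted for the rescued and wild-type cultures rest on an exponential-growth model. As a minimal sketch of that relationship (the cell counts and time interval below are hypothetical illustrations, not data from this study, and the function name is our own), the doubling time between two counts N0 and N1 separated by Δt hours is Δt·ln 2 / ln(N1/N0):

```python
import math


def doubling_time(n_start: float, n_end: float, hours: float) -> float:
    """Return the doubling time in hours, assuming exponential growth.

    n_start and n_end are cell counts at the beginning and end of an
    interval lasting `hours`. The culture must actually be growing.
    """
    if n_start <= 0 or n_end <= n_start or hours <= 0:
        raise ValueError("requires 0 < n_start < n_end and hours > 0")
    return hours * math.log(2) / math.log(n_end / n_start)


# Hypothetical example: a culture expanding from 1e5 to 8e5 cells in 126 h
# has gone through three doublings, i.e. roughly one doubling every 42 h.
example_td = doubling_time(1e5, 8e5, 126.0)
```

A culture that plateaus or declines, as described for the SOX2x3Mut-transduced cells, has no defined doubling time under this model, which is why the sketch rejects non-growing inputs rather than returning a number.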
Homologous SOX2:IMPα3 mutations affect Drosophila development. The role of SOX2 in mammalian embryogenesis is well established, with Sox2 −/− known to be embryonic lethal in mice.
The key roles of SOX2 are conserved throughout metazoans, with the SOX2 homolog in Drosophila, Dichaete (87% sequence identity to human SOX2 in the HMG domain and NLS-binding regions), playing a key role in central nervous system development. To examine whether this interface is required for Dichaete function, we generated transgenic strains that expressed HA-tagged Dichaete or the orthologous Dichaete3xMut (Dichaete K143A, R144A, and K216A) from an upstream activation sequence (UAS) promoter. We drove expression using ptc-Gal4 (P{GawB}ptc 559.1), which is expressed in multiple tissues during development, including third instar salivary glands. Ectopic expression of Dichaete results in developmental defects when expressed from a variety of promoters 33-35, and we also observed that no ptc-Gal4, UAS-Dichaete adults emerged (0/63 siblings, Supplementary Fig. 3), indicating that ectopic expression is lethal when animals are raised at 25°C. In contrast, expression of Dichaete3xMut had no effect on development, and ptc-Gal4, UAS-Dichaete3xMut animals emerged at approximately a Mendelian ratio (37/93 siblings, Supplementary Fig. 3). We were able to isolate third instar larval salivary glands from both allelic combinations and used an anti-HA antibody to observe the intracellular localization of the ectopically expressed Dichaete proteins. Wild-type HA-Dichaete was predominantly localized in nuclei of both the polytene salivary gland cells and the salivary duct cells, whereas HA-Dichaete3xMut localization was much more cytoplasmic. This mislocalization explains why expression of the Dichaete3xMut did not produce a phenotype, as it was not efficiently translocated into the nucleus, where it could have an effect on target gene expression (Fig. 5).
Structural basis for SOX2 specificity toward IMPα isoforms. SOX2 shows specificity toward IMPα isoforms during neural stem cell differentiation 14. Moreover, many transcription factors and viral proteins exhibit specificity for IMPα isoforms, but the molecular mechanisms are unclear and analysis is complicated by the high level of conservation of NLS-binding sites across IMPα isoforms. To assess whether SOX2 interacts differentially with IMPα isoforms, we performed an immunoprecipitation assay with representative members of each IMPα subfamily (SF; SF1:α1, SF2:α3, and SF3:α5/7; Fig. 6a). Here, we expressed Flag-tagged IMPα1, α3, and α7 together with HA-tagged SOX2, and performed HA-immunoprecipitation and western analysis. We found that SOX2 bound IMPα3, whereas interaction with IMPα1 and IMPα7 was not detected (Fig. 6a). As expected, the SOX2x3Mut demonstrated a loss of interaction with IMPα3. In a bead-binding assay, bacterially expressed His-tagged SOX2 immobilized on Ni²⁺ agarose bound IMPα3 directly, whereas IMPα1, α2, and α5 (same SF as IMPα7) bound more weakly (Fig. 6b). Next, we compared the affinity of these interactions using MST. Here we found that IMPα3 bound SOX2 with the strongest affinity, with a KD of 102 ± 15 nM (mean ± SD, n = 3), whereas IMPα1 and α5 bound with significantly lower affinity, with KD values of 447 ± 27 nM (P < 0.0001) and 247 ± 30 nM (P = 0.0017), respectively (Fig. 6c). Overall, our results indicate that SOX2 interacts most strongly with the IMPα3 isoform.
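To put the reported dissociation constants on a common scale, they can be expressed as fold differences relative to IMPα3 and as standard binding free energies, ΔG° = RT ln(KD). The sketch below is illustrative only: the KD means are those quoted above, but the temperature of 298.15 K and the 1 M standard state are our assumptions (the measurement temperature is not stated in this section), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15       # ASSUMED temperature (25 C); not given in the text

# Mean K_D values (nM) for SOX2 binding, as reported from the MST assays.
KD_NM = {"IMPalpha1": 447.0, "IMPalpha3": 102.0, "IMPalpha5": 247.0}


def binding_free_energy(kd_nm: float, temp_k: float = TEMP_K) -> float:
    """Standard binding free energy in kcal/mol (1 M standard state)."""
    return R_KCAL * temp_k * math.log(kd_nm * 1e-9)


def fold_weaker(kd_nm: float, reference_nm: float) -> float:
    """How many times weaker a K_D is than the reference K_D."""
    return kd_nm / reference_nm


for isoform, kd in KD_NM.items():
    dg = binding_free_energy(kd)
    fold = fold_weaker(kd, KD_NM["IMPalpha3"])
    print(f"{isoform}: dG = {dg:.2f} kcal/mol, {fold:.1f}-fold vs IMPalpha3")
```

On this scale the isoform preference is modest in energetic terms (a several-fold difference in KD corresponds to less than 1 kcal/mol at this assumed temperature), which is consistent with the text's emphasis on local structural differences rather than wholesale changes in the binding site.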
To investigate the possible basis for IMPα isoform specificity, we aligned the amino acid sequences of IMPα isoforms and compared the binding interface residues (Supplementary Fig. 4). We found that of the 28 binding interface residues on IMPα3, 25 are identical across all IMPα isoforms and the remaining 3 are conserved. This suggests that differences in the IMPα NLS-binding sites are not responsible for isoform specificity. To further probe the basis of isoform specificity, we solved the structures of SOX2 bound to both IMPα2 and IMPα5 (2.7 and 2.8 Å resolution, respectively). Consistent with the cellular and in vitro binding data, we observed differences at the SOX2:IMPα isoform interfaces, with that from IMPα3 being more extensive. The IMPα3:SOX2 interface was mediated through a buried surface area of 2034 Å², 14 salt bridges, and 34 hydrogen bonds. In comparison, the IMPα2:SOX2 interface buried 1082 Å² of surface area, and contained 4 salt bridges and 24 hydrogen bonds. The IMPα5:SOX2 interface buried 1106 Å² of surface area, and contained 4 salt bridges and 23 hydrogen bonds (see Supplementary Tables 2-4 for detailed interactions). Moreover, superimposing the IMPα isoform structures indicated that local differences in IMPα structure could contribute to the differences in affinity for SOX2 (Fig. 7). Although the overall structures of the IMPα isoforms were very similar (RMSD of IMPα3 against IMPα2 and IMPα5 of 1.5 and 1.9 Å, respectively), we found that the ARM 7 domain is positioned differently in IMPα3 compared to IMPα2 and IMPα5. This region is where the SOX2 HMG domain interacts with IMPα3, but this interaction was not observed in the IMPα2 and IMPα5 crystals, where instead the HMG domain appeared to be disordered. The position adopted by the ARM 7 domain in both IMPα2 and IMPα5 would generate a steric clash with SOX2 that could impair its binding (Fig. 7). Furthermore, Pro106 of SOX2, which induces a sharp bend adjacent to the SOX-HMG domain, would clash with the main chain of both IMPα2 and α5 ARM 7. In contrast, the ARM 7 domain of IMPα3 is set back by 4 Å, and so provides a favorable interaction with the HMG domain. In addition, SOX2 cis-Pro44 is also positioned close to the ARM 7 interface. These two prolines lie at either end of the SOX2 HMG domain and facilitate the extended interactions with IMPα3 across ARM domains 1-9 that are not possible with IMPα2 and α5. The reduced binding of R43A-SOX2 to IMPα3 was consistent with the position of ARM 7 contributing to the interaction, because Arg43 forms the majority of the interactions with this region (Fig. 3). The different positioning of ARM 7 within each isoform was independent of the type of cargo bound (Supplementary Fig. 5), suggesting that this region does not adjust to accommodate different cargo.
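The interface-conservation comparison described above (counting how many binding-interface positions are identical, conserved, or variable across aligned isoforms) can be sketched as follows. This is only an illustration of the bookkeeping: the residue similarity groups are an assumption of ours rather than the scheme used in the study, the toy alignment is made up, and the function names are hypothetical.

```python
from typing import Dict, List

# ASSUMED grouping of physicochemically similar residues (illustrative only).
SIMILAR_GROUPS = [
    set("KRH"), set("DE"), set("ST"), set("NQ"), set("AVLIM"), set("FYW"),
]


def classify_column(residues: str) -> str:
    """Label one alignment column as identical, conserved, or variable."""
    unique = set(residues)
    if len(unique) == 1:
        return "identical"
    if any(unique <= group for group in SIMILAR_GROUPS):
        return "conserved"
    return "variable"


def summarize_interface(aligned: List[str], positions: List[int]) -> Dict[str, int]:
    """Count column classes at the given 0-based alignment positions."""
    counts = {"identical": 0, "conserved": 0, "variable": 0}
    for pos in positions:
        column = "".join(seq[pos] for seq in aligned)
        counts[classify_column(column)] += 1
    return counts


# Toy alignment: three made-up "isoform" sequences, six interface columns.
toy_alignment = ["WNKRDS", "WNRRET", "WNKRDA"]
toy_summary = summarize_interface(toy_alignment, [0, 1, 2, 3, 4, 5])
```

Applied to a real alignment, a summary with no variable positions (as reported here: 25 identical and 3 conserved out of 28) argues that sequence differences at the contact residues alone are unlikely to explain isoform preference.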

Discussion
SOX proteins localize to the nucleus where they alter nucleosome structure and function 36, play critical roles in development, and are associated with many cancers. For example, mutations in the SRY protein, within either the N- or C-terminal NLS, reduce the ability of SRY to translocate to the nucleus in sex-reversed patients 9,17,18. In addition, mutations in the NLS region of SOX9 result in reduced nuclear accumulation in campomelic dysplasia patients with XY sex reversal 12, and SOX10 mutants that fail to localize to the nucleus cause Waardenburg syndrome, resulting in sensorineural hearing defects and auditory-pigmentary disorder 37. SOX2 plays a critical role in PSCs 3,6,14, in the maintenance of neural and other stem cell types 31,32, and in developmental and tumor biology 3-6,14,16,38-40. SOX proteins may either maintain or antagonize tumorigenesis 38-42. Cytoplasmic SOX9 correlates with poor clinical cancer outcomes, including both shorter disease-specific survival and relapse-free survival 19. SOX9 is localized in the cytoplasm of 25-30% of invasive ductal carcinomas and lymph node metastases, and its cytoplasmic accumulation significantly correlates with enhanced proliferation in breast tumors 43. Cytoplasmic SOX18 correlates with poor patient outcome in adenocarcinoma and is associated with non-small cell lung cancer progression 44. Moreover, many post-translational modifications occur within the SOX NLS regions (reviewed in ref. 45). The proposal that SOX proteins localize to the nucleus through two NLSs made it difficult to understand how mutation or modification in only one NLS would affect function, because the remaining NLS appeared to provide a high level of redundancy.
However, our results demonstrate that these two NLS regions bind as a single continuous interface on IMPα, and so provide a more easily understood mechanistic basis for contextualizing SOX function and how it can be impaired by mutations in the NLS-HMG region of the molecule. The structural insights from our study may also assist with contextualizing how mutations in other SOX proteins may cause aberrations in nuclear transport and disease. While no naturally occurring mutations in the NLSs of SOX2 have been documented, SRY mutants have been shown to impede nuclear localization and result in sex reversal. Mutations such as SRY R62G 15, R75M 46, and R76P 47 lie within NLS1 (the bipartite region) of SOX proteins, at positions shown in this study to bind within the IMPα minor site (IMPα3 ARMs 6-8) and ARM 9. Similarly, the NLS2 region harbors mutations, such as SRY R133W 48, at a position shown to bind at the major site of IMPα3 (within ARM 3). It is unlikely, however, that the interfaces identified in this study can account for all disease-causing mutations across the SOX family, since these sites are also subject to complex regulation, including calmodulin binding (also shown to regulate nuclear import). This may explain, for example, why some disease-causing mutations, such as SRY R76P 47 (equivalent to SOX2 Arg57), shown to be important for nuclear import regulation through calmodulin, did not disrupt the IMPα3:SOX2 interaction 8.
Establishing the mechanism by which nuclear cargoes are recognized specifically by different IMPα isoforms is critical for understanding many key regulatory, developmental, and cancer-related processes. For example, neural differentiation of embryonic stem cells is mediated by IMPα isoform switching such that Oct3/4 is driven to the nucleus by IMPα1 in undifferentiated stem cells; however, during neural development, upregulation of IMPα3/5 mediates SOX2/Brn2 nuclear import and neural differentiation 14. Moreover, SOX proteins may also use alternate pathways for import, such as IMPβ 49, a calmodulin-mediated pathway 12, or exportin-4 (ref. 50), suggesting that import may be cell or tissue dependent. Structural insights into isoform specificity are limited, and our present understanding of cargo binding to different receptor isoforms is restricted to RCC1 (ref. 24), influenza A PB2 (ref. 51), and Henipavirus W proteins. It is noteworthy that viral cargo show some of the most remarkable specificity, and early indications suggest this may be an important strategy for viruses, since innate immune responses also require the nuclear transport of STAT1 and NF-κB, which depends on the isoform specificity of IMPα3 (ref. 52). Viral accessory proteins have been shown to specifically inhibit transport of these proteins to the nucleus and block innate immune responses. The ability of viruses to specifically target nuclear import to dampen immune responses, while not altering the import of cellular proteins important for cell maintenance (and viral replication), is likely an important viral replication strategy. The greater flexibility of IMPα3 compared with other IMPα isoforms is important for its binding RCC1 selectively 24,51, whereas we have shown here that the differential positioning of ARM 7 in IMPα3 (the position of which does not change relative to ARM 6 and ARM 8 in all published IMPα3 structures; see Supplementary Fig. 5) makes an important contribution to its selective binding of SOX2, similar to that seen for the W protein of Henipaviruses 52 (Fig. 8).
Finally, the NLS1 and NLS2 regions within SOX2 that mediate a single interface on IMPα3 are likely to have functions that overlap with other aspects of SOX2 biology. A recent structure of SOX2 bound to nucleosomes 36 identified that these regions may adopt strikingly different conformations (Fig. 9 and Supplementary Movie file 1). When bound to IMPα for nuclear import, these regions are positioned in an open conformation to allow a single continuous interface. In contrast, when bound to nucleosomes, these NLS regions are in close proximity and in a closed conformation. That the IMPα and nucleosome binding sites overlap suggests a possible release (and recycling) mechanism for IMPα; however, this requires further experimental investigation.
[Figure legend fragment] […] Supplementary Fig. 9 for uncropped gels). b Pull-down assays using recombinant proteins expressed in E. coli. A representative from each subfamily was expressed. His-tagged SOX2 WT (wild type) was immobilized on Ni²⁺ agarose beads in the presence of SF1:α1 and α2, SF2:α3, or SF3:α5 as indicated (see input), washed, and eluted through TEV cleavage. The eluate was further concentrated by precipitation to confirm binding differences between IMPα3 and the other isoforms. See also Supplementary Fig. 10 for uncropped gels. Results were reproduced in three independent experiments. c Binding affinities of SOX2 to IMPαs within different subfamilies. SOX2 bound to IMPα1 with 447 ± 27 nM affinity (n = 3); IMPα2 with 283 ± 10 nM affinity (n = 3); IMPα3 with 102 ± 15 nM affinity (n = 3); and IMPα5 with 247 ± 30 nM affinity (n = 3). In each case, n = 3 represents three independent experiments. The data points are presented as mean values ± standard deviation. The differences in binding observed between IMPα3 and other isoforms were significant: IMPα1 P = 0.000042; IMPα2 P = 0.000064; and IMPα5 P = 0.0017. P values were determined using a two-tailed unpaired Welch's t test in GraphPad Prism 8 software; no adjustments were made for multiple comparisons.

Purification of 6xHis-tagged proteins was performed by injecting clarified cell lysate onto a GE HisTrap 5 mL column using PB, washing the column with 15 column volumes, and then eluting over 5 column volumes, using a gradient elution with high imidazole (500 mM imidazole, 300 mM NaCl, and 50 mM phosphate pH 8.0). Samples were pooled, and the affinity tag removed by TEV proteolysis. Size-exclusion purification of pooled samples was performed on a Superdex 200 pg 26/600 column, using TBS pH 8.0. Eluted proteins were pooled and concentrated using 10 kDa MW centrifuge filters.
Complex formation was performed by treating the SOX2 samples with RNase, and mixing IMPα isoforms with SOX2 in a 1:2 molar ratio. The samples were repurified using size-exclusion chromatography, concentrated using 10 kDa MW centrifuge filters, and aliquoted.
Crystallization, data collection, and processing. All crystals were obtained using the hanging drop vapor diffusion method over a 300 μL reservoir solution. IMPα2:SOX2 was crystallized in 750 mM sodium citrate (pH 7.0) and 10 mM DTT, with single rod-shaped crystals forming within 14 days. IMPα3:SOX2 crystallized in 1.2 M (NH4)2SO4, 0.1 M HEPES pH 6.5, with plate-shaped crystals forming within 3 days. IMPα5:SOX2 crystallized in 0.2 M sodium/potassium phosphate, 20% (w/v) PEG3350, with needle-shaped crystals forming within 7 days. X-ray diffraction data were collected at the Australian Synchrotron on the MX1 and MX2 macromolecular beam lines, using an ADSC Quantum 210r, ADSC Quantum 315r detector, and Eiger 16M detector, respectively 55,56. Data reduction and integration were performed using iMosflm 57. Merging, space group assignment, scaling, and selection of 5% of reflections for Rfree calculations were performed using Aimless 58 and the CCP4 suite 59. The anisotropy of the SOX2:IMPα3 and SOX2:IMPα5 data sets was addressed using the UCLA MBI server (https://services.mbi.ucla.edu/anisoscale/). Phasing was performed using molecular replacement in Phaser MR, with PDBID 3UL1 (ref. 60 […]

Mice homozygous for a "floxed" Sox2 allele were crossed with mice that were compound heterozygotes for a βgeo gene knocked into the Sox2 gene (generating a null Sox2 mutation) and a nestin-cre transgene (which deletes the floxed Sox2 allele specifically in the nervous system), to obtain mutant mice with homozygous Sox2 deletion in the brain (Favaro et al. 32). Control Sox2-wild-type mice are generated in the same crosses when the nestin-cre gene is not inherited, and have a Sox2 floxed allele together with an intact Sox2 gene. Mice were sacrificed at P0 to obtain forebrains for NSC cultures (mice of either sex were used). The lines (Favaro et al. 32) are maintained by matings between cousins, and outbred every two to three generations with B6D2F1 mice, to maintain the mutant alleles.
Mice were housed at a temperature of 19-23°C, with 40-60% humidity, and a 13 h light/11 h dark cycle. The experiments were approved by the Italian Ministry of Health as conforming to the relevant regulatory standards.
Drosophila expression studies. The Drosophila melanogaster Dichaete coding sequence was cloned into pUASTattB with modification of the stop codon to introduce Gly-Ser residues followed by a 3xHA tag. The Dichaetex3Mut transgene was generated by mutating K42, R43, and K115 to alanine residues (Genscript). Both transgenes were introduced into the same genomic position in the Drosophila genome (BestGene, Inc.) and expression was induced at 25°C, using the ptc-Gal4 (P{GawB}ptc 559.1) driver (Bloomington Drosophila Stock Center). Immunostaining was conducted as described previously 67: salivary glands were dissected from wandering third instar larvae in PBS and fixed in 4% formaldehyde/PBS for 15 min. Following fixation, tissues were washed three times for 5 min each in PBT (1% Triton-X100/PBS) and blocked in PBTH (PBT + 5% horse serum) for 1 h. This was followed by incubation in primary antibody, rat anti-HA 1:100 (Sigma-Aldrich, cat# 11867423001), overnight at 4°C. After primary antibody incubation, tissues were washed three times in PBT and incubated in secondary antibody, AlexaFluor594 anti-rat 1:500 (ThermoFisher, cat# A-21209), for 2 h. Tissues were again washed three times in PBT prior to mounting in ProLong Gold Antifade Mountant with DAPI (ThermoFisher, cat# P36935). Unless otherwise stated, all steps were carried out at room temperature. Samples were then imaged on a Zeiss LSM800 Airyscan confocal microscope.
Co-immunoprecipitation assay. […]
Cells. HEK293T cells (CRL-3216) were obtained from ATCC and were maintained in DMEM supplemented with 10% FBS and cultured at 37°C and 5% CO2.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The data that support this work are available from the corresponding author upon reasonable request. Protein Data Bank files associated with the structures generated in this study have been deposited to the Protein Data Bank, and issued PDB accession codes 6WX7, 6WX8, and 6WX9.