Crystal structure of the DNA-binding domain of Myelin-gene Regulatory Factor

Myelin-gene Regulatory Factor (MyRF) is one of the master transcription factors controlling myelin formation and development in oligodendrocytes, a process that is crucial for brain function. The N-terminal region of MyRF, which contains a proline-rich region and a DNA-binding domain (DBD), is auto-cleaved from the ER membrane and then enters the nucleus to participate in transcriptional regulation of the myelin genes. Here we report the crystal structure of MyRF DBD. It shows an Ig-fold-like architecture consisting of two antiparallel β-sheets with 7 main strands that pack against each other to form a β-sandwich. Compared to its homolog Ndt80, MyRF has a smaller and less complex DBD that lacks the helices and the large loops outside the core. Structural alignment reveals that MyRF DBD possesses fewer interaction sites with DNA than Ndt80 and may bind only at the major groove of DNA. Moreover, the structure reveals a trimeric assembly, in agreement with the previous report that MyRF DBD functions as a trimer. The mutant that we designed based on the structure disturbed trimer formation but did not affect the auto-cleavage reaction. This demonstrates that activation of the self-cleavage reaction of MyRF is independent of the presence of its N-terminal DBD homotrimer. The structure reported here will help to understand the molecular mechanism underlying the important roles of MyRF in myelin formation and development.

complex and showing more extensive contacts with DNA. Thus, the interaction of MyRF DBD with DNA may differ from that of Ndt80. Moreover, previous reports showed that MyRF DBD functions as a trimer 6, while Ndt80 binds DNA as a monomer 10. Recent bioinformatic studies revealed the preferred DNA binding target of MyRF DBD 7, which is very different from the Ndt80 target sequence. Their DNA binding properties are therefore likely to differ as well. The details of the interaction between MyRF and its target DNA still need to be investigated. Further studies will also help to understand the molecular mechanism by which MyRF cooperates with other transcription factors such as Sox10 and Olig2 during oligodendrocyte development and myelin formation 11,12. In this report, we solved the crystal structure of MyRF DBD and demonstrate that it forms a trimer in the crystal lattice. Unlike Ndt80, MyRF DBD may interact only with the major groove of target DNA. The mutations designed on the basis of our structure disrupted trimer formation but had no effect on auto-cleavage. Our data will help to understand the detailed molecular mechanism of myelin gene transcriptional regulation.

Results and Discussion
Overall structure of MyRF DBD. The expressed MyRF fragment (residues 351-717 plus an N-terminal His tag and linker) showed a smaller molecular weight than calculated (33 vs. 45 kDa) due to auto-cleavage, consistent with previous reports 6,7. Using limited proteolysis, we generated a proteolytically resistant DBD core of about 24 kDa. This DBD core was purified and crystallized. The structure was determined from selenomethionyl-substituted crystals by single-wavelength anomalous diffraction (SAD) phasing using 4 ordered Se atoms per molecule of 210 residues (see Methods). The resulting experimental electron density map was of sufficient quality to build an initial chain trace and assign the amino acid residues. The structural model was completed and refined to 2.46 Å resolution (Rwork = 18% and Rfree = 23%). The space group is P321 (a = b = 104.0 Å, c = 46.7 Å, α = β = 90°, γ = 120°), with one molecule per asymmetric unit. Of the 210 amino acids corresponding to the DBD, only residues 351-532 were clearly visible in the electron density map (Fig. 1A).
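As a rough plausibility check on the statement that the asymmetric unit holds one molecule, the unit-cell volume and Matthews coefficient can be estimated from the cell parameters quoted above. The sketch below is illustrative only: it assumes that the crystallized species is the ~24 kDa proteolytic core, uses the 6 general positions of space group P321, and applies the conventional 1.23 Å³/Da factor for the solvent estimate. None of these derived values are reported in the paper.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (cubic angstroms) of a general unit cell; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def matthews_coefficient(volume, molecules_per_cell, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return volume / (molecules_per_cell * mass_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm

# Cell parameters quoted in the text for the P321 crystal form.
CELL = (104.0, 104.0, 46.7, 90.0, 90.0, 120.0)
# P321 has 6 general positions; with one molecule per asymmetric unit
# the cell therefore contains 6 molecules.
MOLECULES_PER_CELL = 6
# Assumption: the crystallized species is the ~24 kDa proteolytic core.
CORE_MASS_DA = 24_000

volume = unit_cell_volume(*CELL)
vm = matthews_coefficient(volume, MOLECULES_PER_CELL, CORE_MASS_DA)

print(f"cell volume      : {volume:,.0f} cubic angstroms")
print(f"Matthews coeff.  : {vm:.2f} cubic angstroms per Da")
print(f"solvent fraction : {solvent_fraction(vm):.2f}")
```

Under these assumptions the Matthews coefficient comes out near 3.0 Å³/Da, which lies within the range commonly observed for protein crystals and is compatible with a single molecule per asymmetric unit.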
MyRF DBD is highly conserved in vertebrates (Fig. 1C) and its structure shows a typical Ig-fold architecture. The main part of MyRF DBD consists of two antiparallel β-sheets that have three strands (a, b and e) and five strands (c', c, f, g and β1), respectively, labeled according to the standard nomenclature of the Ig-fold 13. MyRF DBD can be classified as an s-type Ig-fold, although the fourth, switched strand c' is not obvious. Instead, the N-terminal β1 strand is located next to the g strand of the c-f-g β-sheet, as observed in Ndt80. The sheets pack against each other, forming a β-sandwich with a "Greek key" topology (Fig. 1B). At the end of the β-sandwich, the sheets pack tightly, forming a compact barrel-like structure (Fig. 1A). In addition, there are nine outer strands between the core strands, which are numbered according to their order in the primary sequence. Each group of 3 of these outer strands forms a sheet at the end of the core β-barrel. The ribbon representation of the DBD and its topology are illustrated in Fig. 1A and B, respectively.
Generally, Ig-fold proteins do not display high identity to one another at the primary sequence level, although they are similar in their 3D architecture. Consistent with this, the MyRF DBD sequence does not show high homology with other members of the Ig-fold family, but structures with similar topology are found in the Protein Structure Database.
Comparison with other Ig-fold transcription factors. MyRF DBD is similar in topology to the Ndt80 transcription factors (Fig. 2): Ndt80 was the best match in a DALI search 14, with a root mean square deviation (RMSD) of 2.1 Å for 162 Cα atoms and 16% sequence identity. Although MyRF DBD and the core of Ndt80 align well with each other, Ndt80 is significantly larger than MyRF DBD and other Ig-fold transcription factors. MyRF DBD consists mainly of β-strands. No helical segments or large loops are observed between the core strands, which makes it smaller and less complex than Ndt80. In Ig-fold proteins, the outer strands in the sheets, and the loops between the core strands in particular, are flexible and can be relocated in different proteins without perturbing the core structure of the Ig-fold. In Ndt80, the N-terminal β-hairpin-loop-helix insert, the helix-loop-helix insert in the c'-e loop, the C-terminal helix and the helix in the c-c' loop interact directly with target DNA and form an essential sequence-specific recognition site. Because MyRF DBD lacks all of these elements, its interaction with target DNA should differ from that of Ndt80. The fact that MyRF shows fewer contacts with target DNA than Ndt80 indicates that it may bind DNA as a multimer or cooperate with other transcription factors to fulfil its myelin gene activation function.
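For readers less familiar with the RMSD figures quoted from DALI, the quantity is the root of the mean squared distance between equivalent Cα atoms after the two structures have been superposed. The following minimal sketch shows only that final step. It assumes that the coordinates are already superposed and paired one-to-one (DALI performs the alignment and superposition itself), and the toy coordinates are invented for illustration and are not taken from MyRF or Ndt80.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired, pre-superposed points."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical toy coordinates in angstroms: one atom pair coincides,
# the other pair is displaced by 2 angstroms along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]

toy_rmsd = rmsd(structure_a, structure_b)
print(f"RMSD = {toy_rmsd:.3f} angstroms")
```

Because the squared deviations are averaged over all aligned atoms, an RMSD of 2.1 Å over 162 Cα atoms can hide both very well matched core strands and a few poorly matched loop positions.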
The second-best match among the Ig-like transcription factors in the DALI search was with members of the p53 family 15,16, with an RMSD of 2.9 Å for 112 Cα atoms and 13% sequence identity. Although the core strands of p53 are shorter than those of MyRF DBD, 7 of the core strands can be aligned, while the outer strands between the core strands differ. Their DNA binding sites are also different. Other transcription factors identified in the DALI search as having a high degree of structural homology to MyRF DBD include members of the STAT 17 and Runx 18 families. Some other Ig-fold proteins, such as the Cholesterol-binding Protein 19,20, the ATP-binding part of the ABC Transporter 21 and the sugar-binding domain of β-galactosidase 22, have even higher scores in the DALI search than p53, STAT and Runx, but they do not belong to the transcription factor family.
Oligomerization of MyRF DBD. MyRF DBD forms a trimer with a crystallographic three-fold symmetry axis in the crystal (Fig. 3A-D). The trimer interface buries a surface area of 1497.6 Å², accounting for 14.5% of the total monomer surface area (http://pdbe.org/pisa/). The trimer interaction is mainly mediated by 2 contact regions. Residues 438-442 of the loop between strands c and β7 of one protomer form 3 hydrogen bonds with residues 469-471 of strand e of the neighboring protomer. Residues 354-360 of the loop between strands β1 and β2 of one protomer form 4 hydrogen bonds with residues 371-377 of the loop before strand a of the other protomer (Fig. 3E). In addition, Arg378 and Lys473 in one protomer are positioned in close proximity to Glu525 and Asp392, respectively, in the other protomer, creating electrostatic attraction that strengthens trimer formation. The PISA program predicts that MyRF DBD forms a metastable trimer. The molecular weight of MyRF DBD after auto-cleavage (351-586 plus N-terminal His tag and linker) is similar to that of the previously reported N-terminal fragment (StrepII-MyRF-319:577) 6.
All monomers run at a similar position on SDS-PAGE (Fig. 4A, lane 3). On gel filtration, MyRF DBD eluted as a single peak corresponding to a trimer (Fig. 4B, blue peak). Moreover, even the proteolytically resistant MyRF DBD core alone (obtained by limited proteolysis with trypsin) still remains a trimer (Fig. 4B, cyan peak).
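The interface and oligomer figures quoted above lend themselves to a quick arithmetic check. The sketch below back-calculates the total monomer surface area implied by the two PISA numbers and lists the trimer masses that would be expected on gel filtration. It is an illustration only: the derived values are not reported in the paper, and the monomer masses used are the approximate 33 kDa (auto-cleaved fragment) and 24 kDa (proteolytic core) values given in the text.

```python
# Values quoted in the text.
INTERFACE_AREA_A2 = 1497.6      # buried trimer interface area, square angstroms
INTERFACE_FRACTION = 0.145      # fraction of the total monomer surface

# Total monomer surface area implied by the two figures above.
implied_monomer_surface = INTERFACE_AREA_A2 / INTERFACE_FRACTION

def oligomer_mass_kda(monomer_kda, copies):
    """Ideal mass of an oligomer made of identical subunits."""
    return monomer_kda * copies

# Approximate monomer masses from the text (kDa).
CLEAVED_FRAGMENT_KDA = 33
PROTEOLYTIC_CORE_KDA = 24

trimer_cleaved = oligomer_mass_kda(CLEAVED_FRAGMENT_KDA, 3)
trimer_core = oligomer_mass_kda(PROTEOLYTIC_CORE_KDA, 3)

print(f"implied monomer surface : {implied_monomer_surface:,.0f} square angstroms")
print(f"expected trimer (cleaved fragment): {trimer_cleaved} kDa")
print(f"expected trimer (proteolytic core): {trimer_core} kDa")
```

Apparent masses from size exclusion chromatography depend on molecular shape as well as mass, so these ideal values are only a guide to where a trimer peak should elute relative to a monomer.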
Previous studies have shown that full-length MyRF forms a trimer before cleavage. The ICA domain was thought to play the main role in MyRF trimer formation because of its C-terminal trimeric helix bundle. Trimerization is known to be required for the auto-cleavage reaction in ICA-containing phage tailspike proteins 8. Our structure indicates that the N-terminal part of MyRF (which contains only the proline-rich and DBD regions) can still maintain the trimeric architecture after auto-cleavage, in agreement with previous results 6. As observed in the trimer structure of MyRF DBD, residues 469-471 and 438-442 lie on the trimer interfaces and form 3 hydrogen bonds between protomers. Based on this structural information, the site-directed mutant MyRF DBD E469R&439-441del was designed. Size exclusion chromatography showed that the E469R&439-441del mutation disturbed trimer formation (Fig. 4B, red peak). The crystal structure of the bacteriophage K1F endosialidase tailspike protein shows that its ICA domain and the upstream endosialidase domain are connected by a triple β-helix. In this triple β-helix, three polypeptide chains wind around a common threefold symmetry axis. Proper assembly of this triple-β-helical fold depends on the trimeric helix bundle at the ICA C-terminus. The cleavage reaction of ICA acts as a control mechanism for generating a correctly folded protein. The self-cleavage reaction can occur only if the endosialidase domain upstream of the triple β-helix folds into a proper trimer and the side chains of the key residues involved are in the right position 8. Surprisingly, the E469R&439-441del mutation in MyRF did not affect its self-proteolytic processing, although it disturbed the formation of the MyRF DBD trimer (Fig. 4A, lane 2). This demonstrates that activation of self-cleavage of MyRF is independent of its N-terminal DBD trimer.
The proper assembly of the triple-β-helical fold of MyRF, which depends on the ICA domain and its C-terminal trimeric helix bundle, triggers a serine-lysine catalytic dyad to activate self-proteolysis. Consistent with this, the DBD of MrfA (the ortholog of MyRF in Dictyostelium) is separated from the triple β-helix of the ICA domain by a large disordered fragment, which makes it difficult for the two regions to affect each other. Mutations in the trimeric helix bundle at the ICA C-terminus, which disturb trimer assembly, prevented the self-cleavage of MyRF 6. The control mechanism of the self-cleavage reaction of MyRF and its relationship to the generation of a correctly folded trimeric MyRF DBD still need to be investigated.
Predicted DNA binding. Analysis of the electrostatic surface potential of the MyRF DBD structure reveals a highly electropositive surface owing to the presence of lysine and arginine residues at the bottom of the β-barrel core (Fig. 5A). Superposition of MyRF DBD and Ndt80 showed that these residues may be involved in the interaction of the protein with DNA (Fig. 5B). In particular, 3 conserved basic residues in MyRF (K399, R454 and R478) are superimposable with the residues of Ndt80 (R111, R177 and R254) that interact with the major groove of DNA, implying that they too may be involved in the interaction with DNA (Fig. 5B). Individual point mutations of these residues dramatically decreased the DNA binding capacity of MyRF 7,23. The aligned structures show that MyRF DBD may have only 4 binding sites, while Ndt80 possesses 6 binding sites with the target DNA. The 2 extra binding sites of Ndt80 are located at the minor groove of DNA and come from the extra N-terminal β-hairpin-loop-helix insert and the helix-loop-helix insert in the c'-e loop. The other 4 binding sites are all located at the major groove of DNA. The comparison of the two structures suggests that MyRF DBD monomers have lower binding affinity for DNA than Ndt80. The differences in binding residues and their positions suggest that the two proteins may recognize different target DNA sequences.
Trimer-DNA binding. MrfA, the ortholog of MyRF in Dictyostelium, was reported to bind DNA via three distinct elements with 3-5 bp spacing. Mutagenesis studies showed that all three elements are needed for optimal expression in vivo 23. Previous studies also suggested that MyRF is functional as a trimer 6,7, though some of the MyRF binding sites identified by ChIP-Seq had only a single binding motif 7. Consistent with these results, our crystal structure shows that MyRF DBD tends to assemble into a homo-trimer. The fact that the promoters of target genes contain different numbers of MyRF binding motifs suggests that different genes may be regulated differently. How this trimer interacts with three elements of target DNA is still unclear. According to the model of a single MyRF DBD interacting with DNA, the DNA molecule should bind at the bottom of a trimer. However, it seems difficult to arrange three DNA elements with only 3-5 bp spacing on the 3 protomers of a trimer simultaneously (Fig. 5C). Generally, transcription factors interact with DNA as monomers, dimers or tetramers, while homo-trimeric transcription factors are rare. Well-characterized homo-trimeric transcription factors are heat shock factor 1 24 and ArgR 25. Their DBDs do not form trimers directly but are connected to a trimeric domain by a flexible linker, and are thereby able to bind 2 or 3 tandem, adjacent elements at the same time. MyRF is the first transcription factor discovered so far in which the DBD forms a homo-trimer by itself. The relative position of the protomer's DNA binding sites does not allow major structural changes, otherwise, the MyRF DBD trimer
MyRF may cooperate with other transcription factors to control myelin gene expression in oligodendrocytes. It induces a cohort of genes that work together to wrap axons with myelin sheaths in the central nervous system.
Our structure shows that the N-terminus of MyRF DBD is located at the bottom of the β-barrel core, close to its DNA binding sites (Fig. 5A). Thus the N-terminal proline-rich region, located before the DBD of MyRF, may interact with other transcription factors directly. The MyRF homo-trimer could be either stabilized or destabilized by these interactions to regulate myelin gene expression.
In conclusion, we have presented a high-resolution structure of the MyRF DBD protein with an Ig-fold architecture consisting of two antiparallel β-sheets with 7 main strands that pack against each other to form a β-sandwich. The structure of MyRF DBD is similar to the core of Ndt80, but with fewer interaction sites and a different DNA recognition sequence. The structure reveals that MyRF DBD forms a trimer, in agreement with previous reports demonstrating that MyRF DBD is functional as a trimer. The mutant designed on the basis of the structure demonstrates that activation of self-cleavage of MyRF is independent of its N-terminal DBD trimer. The electrostatic surface potential analysis and the comparison with the structure of the Ndt80-DNA complex reveal the possible interaction sites between MyRF DBD and DNA, thereby providing molecular mechanistic insight into its transcriptional regulation.

Methods
16 °C. The clarified cell lysate (lysis buffer: 50 mM Tris-HCl, pH 8.0, 150 mM NaCl and 1 mM PMSF, Sigma), obtained by sonication and subsequent centrifugation, was incubated with nickel-sepharose affinity resin (GE Healthcare) and washed with lysis buffer. The recombinant protein was analyzed by limited proteolytic mapping and then subjected to digestion with trypsin (Worthington). First, the enzyme was dissolved in a buffer of 50 mM Tris-HCl, pH 8.0, and 20 mM CaCl2 to make a 1 mg/ml trypsin stock solution, and aliquots were frozen at −20 °C. Subsequently, the enzyme was added to the purified MyRF protein at a trypsin-to-protein (w/w) ratio of 1:500,000 in a buffer of 50 mM Tris-HCl, pH 8.0, and 150 mM NaCl. The reaction was incubated at 4 °C for 30 minutes and was terminated by the addition of a 10-fold excess of soybean trypsin inhibitor (Sigma) to the reaction mixture.
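The limited proteolysis step uses trypsin at a 1:500,000 (w/w) ratio to protein, taken from a 1 mg/ml stock. The amount of enzyme involved is small enough that a short calculation helps when planning the pipetting. The sketch below is hypothetical: the 10 mg protein amount and the 1:1000 intermediate dilution of the stock are assumptions chosen for illustration and are not stated in the protocol.

```python
def trypsin_mass_ug(protein_mg, protein_per_trypsin=500_000):
    """Mass of trypsin (micrograms) for a given protein mass at a w/w ratio."""
    return protein_mg * 1000.0 / protein_per_trypsin

def stock_volume_ul(trypsin_ug, stock_mg_per_ml=1.0, predilution=1.0):
    """Volume (microlitres) of (optionally prediluted) stock to add.

    1 mg/ml equals 1 microgram per microlitre, so the working
    concentration in micrograms per microlitre is stock / predilution.
    """
    working_conc = stock_mg_per_ml / predilution
    return trypsin_ug / working_conc

# Hypothetical example: 10 mg of purified protein (assumed amount).
protein_mg = 10.0
needed_ug = trypsin_mass_ug(protein_mg)

# Pipetting directly from the 1 mg/ml stock would need an impractically
# small volume, so an intermediate 1:1000 dilution is assumed here.
direct_ul = stock_volume_ul(needed_ug)
diluted_ul = stock_volume_ul(needed_ug, predilution=1000.0)

print(f"trypsin needed         : {needed_ug:.3f} micrograms")
print(f"from undiluted stock   : {direct_ul:.3f} microlitres")
print(f"from 1:1000 predilution: {diluted_ul:.1f} microlitres")
```

In this example only 0.02 µg of trypsin is required, which is why some form of serial dilution of the stock would be needed in practice.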
The stable fragments were subsequently purified by size exclusion chromatography on a Superdex-75 column (GE Healthcare) pre-equilibrated in a buffer of 20 mM Tris-HCl, pH 7.4, 150 mM NaCl and 1 mM DTT. The peak fraction was concentrated to 20 mg/ml for crystallization. Selenomethionine-derivatized MyRF was obtained in Escherichia coli strain BL21 (DE3) using methionine pathway inhibition at 289 K. The procedure for protein expression was similar to that for the native protein, except that the cells were cultured in M9 medium containing 0.4% glucose, 2 mM magnesium sulfate, 0.1 mM calcium chloride and 50 µg/L kanamycin. Cells were cultured to an OD600 of 0.8 and, before IPTG was added, selenomethionine was supplemented along with leucine, isoleucine and valine to a final concentration of 50 µg/L, and lysine, threonine and phenylalanine to 100 µg/L.
The trimer formation was checked by size exclusion chromatography on a Superdex-200 column (GE Healthcare) pre-equilibrated in a buffer with 20 mM Tris-HCl at pH 7.4, 50 mM NaCl and 5% glycerol.
Crystallization and Data collection. The purified proteins were crystallized by the sitting-drop vapor diffusion method, with protein mixed 1:1 with reservoir solution. Crystals appeared within 2 weeks at 18 °C in reservoir buffer containing 0.1 M Tris-HCl, pH 8.5, 50 mM magnesium acetate and 20% PEG4000. The crystals were frozen in a cryoprotectant consisting of the reservoir solution supplemented with 20% glycerol. Data were collected on the BL17U1 station of the Shanghai Synchrotron Radiation Facility (SSRF) 26 and were then processed using the HKL2000, XDS and Xia2 software 27,28.
Structure determination, Refinement and Analysis. One dataset for the selenomethionyl-substituted MyRF DBD crystal was collected at the Se peak wavelength. The Se sites (4 sites, 4 Se per molecule) were identified by single-wavelength anomalous diffraction from Se-Met using the program Phenix.autosol. The model was built using Phenix.autobuild. Crystallographic refinement was performed using all data to 2.46 Å resolution in the COOT and PHENIX refinement programs 29,30. The orientations of the amino acid side chains and bound water molecules were modeled on the basis of 2Fobs − Fcalc and Fobs − Fcalc difference Fourier maps. Detailed data collection and refinement statistics are listed in Table 1. The model figures were generated with PyMol and CCP4mg. The interactions were analyzed with PyMol and LigPlus 31.
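The text states that data were collected at the Se peak wavelength without giving its value. As a hedged illustration, the sketch below converts a photon energy to a wavelength and computes the Bragg angle at which the 2.46 Å resolution limit would be recorded. It assumes the tabulated Se K-edge energy of about 12.658 keV; the actual peak energy used at the beamline is determined from a fluorescence scan and is not reported here.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, keV * angstrom

def energy_to_wavelength(energy_kev):
    """Convert photon energy (keV) to wavelength (angstroms)."""
    return HC_KEV_ANGSTROM / energy_kev

def bragg_angle_deg(d_spacing, wavelength):
    """Bragg angle theta (degrees) for a given d-spacing, first order."""
    return math.degrees(math.asin(wavelength / (2.0 * d_spacing)))

# Assumption: tabulated Se K absorption edge, not a value from the paper.
SE_K_EDGE_KEV = 12.658
RESOLUTION_LIMIT_A = 2.46  # resolution quoted in the text

wavelength = energy_to_wavelength(SE_K_EDGE_KEV)
theta = bragg_angle_deg(RESOLUTION_LIMIT_A, wavelength)

print(f"wavelength near Se K edge: {wavelength:.4f} angstroms")
print(f"Bragg angle at 2.46 A    : {theta:.2f} degrees (2-theta = {2 * theta:.2f})")
```

With these assumptions the wavelength is close to 0.98 Å and the highest-resolution reflections fall at a scattering angle (2θ) of roughly 23 degrees.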