PionX sites mark the X chromosome for dosage compensation


The rules defining which small fraction of related DNA sequences can be selectively bound by a transcription factor are poorly understood. One of the most challenging tasks in DNA recognition is posed by dosage compensation systems that require the distinction between sex chromosomes and autosomes. In Drosophila melanogaster, the male-specific lethal dosage compensation complex (MSL-DCC) doubles the level of transcription from the single male X chromosome, but the nature of this selectivity is not known (ref. 1). Previous efforts to identify X-chromosome-specific target sequences were unsuccessful as the identified MSL recognition elements lacked discriminative power (refs 2, 3). Therefore, additional determinants such as co-factors, chromatin features, RNA and chromosome conformation have been proposed to refine targeting further (ref. 4). Here, using an in vitro genome-wide DNA binding assay, we show that recognition of the X chromosome is an intrinsic feature of the MSL-DCC. MSL2, the male-specific organizer of the complex, uses two distinct DNA interaction surfaces (the CXC and proline/basic-residue-rich domains) to identify complex DNA elements on the X chromosome. Specificity is provided by the CXC domain, which binds a novel motif defined by DNA sequence and shape. This motif characterizes a subclass of MSL2-binding sites, which we name PionX (pioneering sites on the X) as they appeared early during the recent evolution of an X chromosome in D. miranda and are the first chromosomal sites to be bound during de novo MSL-DCC assembly. Our data provide the first, to our knowledge, documented molecular mechanism through which the dosage compensation machinery distinguishes the X chromosome from an autosome. They highlight fundamental principles in the recognition of complex DNA elements by protein that will have a strong impact on many aspects of chromosome biology.


Figure 1: Genome-wide MSL2 in vitro binding partially recapitulates the in vivo pattern.
Figure 2: The CXC domain of MSL2 increases X-chromosomal specificity.
Figure 3: The CXC domain reads out nucleotide sequence and additional features.
Figure 4: The CXC-dependent sites are pioneer HAS.

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

The next-generation sequencing data have been deposited at the Gene Expression Omnibus (GEO) under accession number GSE75033.


  1. Lucchesi, J. C. & Kuroda, M. I. Dosage compensation in Drosophila. Cold Spring Harb. Perspect. Biol. 7, a019398 (2015)
  2. Alekseyenko, A. A. et al. A sequence motif within chromatin entry sites directs MSL establishment on the Drosophila X chromosome. Cell 134, 599–609 (2008)
  3. Straub, T., Grimaud, C., Gilfillan, G. D., Mitterweger, A. & Becker, P. B. The chromosomal high-affinity binding sites for the Drosophila dosage compensation complex. PLoS Genet. 4, e1000302 (2008)
  4. McElroy, K. A., Kang, H. & Kuroda, M. I. Are we there yet? Initial targeting of the Male-Specific Lethal and Polycomb group chromatin complexes in Drosophila. Open Biol. 4, 140006 (2014)
  5. Fauth, T., Müller-Planitz, F., König, C., Straub, T. & Becker, P. B. The DNA binding CXC domain of MSL2 is required for faithful targeting the Dosage Compensation Complex to the X chromosome. Nucleic Acids Res. 38, 3209–3221 (2010)
  6. Zheng, S. et al. Structural basis of X chromosome DNA recognition by the MSL2 CXC domain during Drosophila dosage compensation. Genes Dev. 28, 2652–2662 (2014)
  7. Gossett, A. J. & Lieb, J. D. DNA immunoprecipitation (DIP) for the determination of DNA-binding specificity. CSH Protoc. (2008)
  8. Guertin, M. J., Martins, A. L., Siepel, A. & Lis, J. T. Accurate prediction of inducible transcription factor binding intensities in vivo. PLoS Genet. 8, e1002610 (2012)
  9. Copps, K. et al. Complex formation by the Drosophila MSL proteins: role of the MSL2 RING finger in protein complex assembly. EMBO J. 17, 5409–5417 (1998)
  10. Villa, R. et al. MSL2 combines sensor and effector functions in homeostatic control of the Drosophila dosage compensation machinery. Mol. Cell 48, 647–654 (2012)
  11. Li, F., Schiemann, A. H. & Scott, M. J. Incorporation of the noncoding roX RNAs alters the chromatin-binding specificity of the Drosophila MSL1/MSL2 complex. Mol. Cell. Biol. 28, 1252–1264 (2008)
  12. Jolma, A. et al. DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527, 384–388 (2015)
  13. Abe, N. et al. Deconvolving the recognition of DNA shape from sequence. Cell 161, 307–318 (2015)
  14. Joshi, R. et al. Functional specificity of a Hox protein mediated by the recognition of minor groove structure. Cell 131, 530–543 (2007)
  15. Zhou, T. et al. Quantitative modeling of transcription factor binding specificities using DNA shape. Proc. Natl Acad. Sci. USA 112, 4654–4659 (2015)
  16. Zhou, T. et al. DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res. 41, W56–62 (2013)
  17. Dahlsveen, I. K., Gilfillan, G. D., Shelest, V. I., Lamm, R. & Becker, P. B. Targeting determinants of dosage compensation in Drosophila. PLoS Genet. 2, e5 (2006)
  18. Lucchesi, J. C. Gene dosage compensation and the evolution of sex chromosomes. Science 202, 711–716 (1978)
  19. Alekseyenko, A. A. et al. Conservation and de novo acquisition of dosage compensation on newly evolved sex chromosomes in Drosophila. Genes Dev. 27, 853–858 (2013)
  20. Zhou, Q. et al. The epigenome of evolving Drosophila neo-sex chromosomes: dosage compensation and heterochromatin formation. PLoS Biol. 11, e1001711 (2013)
  21. Ellison, C. E. & Bachtrog, D. Dosage compensation via transposable element mediated rewiring of a regulatory network. Science 342, 846–850 (2013)
  22. Hallacli, E. et al. Msl1-mediated dimerization of the dosage compensation complex is essential for male X-chromosome regulation in Drosophila. Mol. Cell 48, 587–600 (2012)
  23. Park, Y., Kelley, R. L., Oh, H., Kuroda, M. I. & Meller, V. H. Extent of chromatin spreading determined by roX RNA recruitment of MSL proteins. Science 298, 1620–1623 (2002)
  24. Soruco, M. M. et al. The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes Dev. 27, 1551–1556 (2013)
  25. Ramírez, F. et al. High-affinity sites form an interaction network to facilitate spreading of the MSL complex across the X chromosome in Drosophila. Mol. Cell 60, 146–162 (2015)
  26. Schauer, T. et al. CAST-ChIP maps cell-type-specific chromatin states in the Drosophila central nervous system. Cell Reports 5, 271–282 (2013)
  27. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009)
  28. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010)
  29. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009)
  30. Sumner, M., Frank, E. & Hall, M. Speeding Up Logistic Model Tree Induction 675–683 (Springer, 2005)



This work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement number 293948 (P.B.B.) and the German Research Council (CRC 1064, T.St.). We thank P. Korber for suggesting the DIP experiment, F. Gebauer for sharing her antibody against SXL, and S. Schunter, V. Flynn, S. Krause and A. Zabel for technical assistance. We thank N. Gompel for critical reading of the manuscript and S. Krebs and H. Blum for their sequencing service.

Author information



R.V. and T.St. conceived the project. R.V. conducted all the experiments except for the ones in Kc cells that were performed by T.Sc. All bioinformatics analyses were conducted by T.St. with the exception of machine learning procedures that were performed by P.S. P.B.B. supervised the experiments and provided intellectual support toward design and interpretation of the results. R.V., T.St. and P.B.B. wrote the manuscript.

Corresponding authors

Correspondence to Tobias Straub or Peter B. Becker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information

Nature thanks J. Larsson, R. Rohs and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Figure 1 Analysis of in vitro versus in vivo MSL2-binding sites.

a, Venn diagram showing the genome-wide overlap of robust MSL2 in vivo and in vitro DNA binding peaks. b, MSL2 enrichment (immunoprecipitate (IP) over input) of all 57 overlapping peaks from in vitro DIP–seq and in vivo ChIP–seq experiments. The average of two biological replicates is shown, and the Pearson correlation coefficient is indicated. c, X-chromosomal enrichment over autosomes of MSL2 DIP–seq peaks using genomic DNA from S2 cells, Kc cells or synthetic gDNA (whole-genome amplified). S2 peaks correspond to an overlapping set of two biological replicate experiments; Kc cell and whole-genome amplification experiments were performed once. d, Chromosomal distribution of MSL2 DIP–seq peaks of experiments shown in c. The relative size of chromosomes and the genome serve as a reference for uniform distribution. e, Representative profiles of in vivo MSL2 ChIP–seq and the corresponding chromatin input on chromosome 3R. Red bars indicate the positions of CXC-dependent in vitro binding sites. Gene models are depicted in grey at the bottom.
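One way to read the X-chromosomal enrichment in panels c and d is as the peak density on the X chromosome divided by the peak density on all other chromosomes. The Python sketch below illustrates that reading only; the exact normalization used in the study is not stated in this legend, `x_enrichment` is a hypothetical helper name, and the chromosome sizes and peak counts are made-up numbers chosen for easy arithmetic.

```python
def x_enrichment(peak_counts, chrom_sizes, x_name="chrX"):
    """Peak density on X divided by peak density on the remaining chromosomes.

    peak_counts: dict mapping chromosome name -> number of peaks
    chrom_sizes: dict mapping chromosome name -> size (any consistent unit)
    This definition is an assumption made for illustration.
    """
    x_density = peak_counts.get(x_name, 0) / chrom_sizes[x_name]
    auto_peaks = sum(n for c, n in peak_counts.items() if c != x_name)
    auto_size = sum(s for c, s in chrom_sizes.items() if c != x_name)
    if auto_peaks == 0:
        # No autosomal peaks: the ratio is unbounded (or undefined if X is empty too)
        return float("inf") if x_density > 0 else float("nan")
    return x_density / (auto_peaks / auto_size)


# Illustrative numbers only, not values from the paper
sizes = {"chrX": 20, "chr2": 40, "chr3": 40}
peaks = {"chrX": 30, "chr2": 5, "chr3": 5}
enrichment = x_enrichment(peaks, sizes)
print(f"X enrichment over autosomes: {enrichment:.1f}")
```

A value of 1 would correspond to the uniform distribution that the chromosome-size reference in panel d represents.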

Extended Data Figure 2 Analysis of MSL2 mutants in DIP–seq assays.

a, Western blots showing input and anti-Flag immunoprecipitated MSL2 proteins from a representative DIP experiment (for gel source data see Supplementary Fig. 1). b, Chromosomal distribution of DIP–seq peaks obtained with MSL2, MSL2 mutants and HSF (ref. 8; see Fig. 2d). The chromosomal size distribution (genome) is provided for reference.

Extended Data Figure 3 Comparison between the CXC-dependent motif and the MRE.

a, Consensus motif in CXC-independent binding regions (present in 164 out of 201 regions; E = 2.0 × 10^−1,191). b, ROC curves representing the PWM performances of MRE and the new motif in predicting whether an instance of the new motif (n = 2,651) will overlap with HAS (170). AUCs are provided in brackets. As our method slightly penalizes the MRE performance estimation (see Methods), this figure represents a symmetrical analysis of the new motif hits of Fig. 3b. c, Top, motif logos of MRE as reported previously (ref. 2). Middle, MRE as reported in this study (see also Fig. 4c, top). Bottom, PionX motif as reported in this study (Fig. 3a). d, ROC curves representing the PWM performance comparison analogous to the result presented in Fig. 3b, including the MRE as reported previously (ref. 2; labelled MRE 2008), the MRE as reported in this study (labelled MRE) and the PionX motif (labelled new motif) in classifying MRE instances (35,659) within HAS (266) or not. AUCs are provided in brackets. e, Genome-wide search with the PWM of the new motif using FIMO. q-value cut-off relation with the total number of genomic hits (top), the number of CXC-dependent in vitro binding sites (middle) and the X-chromosomal enrichment of motif hits (bottom). f, To ensure that the enrichment is not solely due to performing de novo motif discovery on mainly X-chromosomal sequences, we performed the analysis as presented in e excluding the training regions. We conducted the same analysis for the new motif (left) as well as the MRE (right). Top panels depict the q-value distribution and the cut-offs used. The total numbers of genomic hits are displayed in the centre panels, with the corresponding X-chromosomal enrichments displayed at the bottom.
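The AUC values quoted for the ROC curves in panels b and d summarize how well a PWM score separates motif instances inside HAS from those outside. As a minimal sketch of what an AUC measures (not the authors' pipeline), the Python below computes it as the probability that a randomly chosen positive instance scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: PWM-like scores, higher meaning more motif-like
    labels: truthy for positives (e.g. instance overlaps a HAS), falsy otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


# Made-up scores and labels for illustration
scores = [9.1, 7.4, 6.8, 5.0, 4.2, 3.3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a score with no discriminative power and 1.0 to perfect separation. The pairwise loop is quadratic in the number of instances; a rank-based formulation would be preferable for very large sets.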

Extended Data Figure 4 Importance of k-mer frequencies and DNA shape for CXC-dependent MSL2 in vitro binding.

a, PCA on the set of all extended features in 2,667 genomic hit regions of the new motif (q ≤ 0.2). Scatter plots and corresponding scaled density plots of PC1 versus PC2. 2,613 sites not bound in vitro in a CXC-dependent manner and 54 bound in a CXC-dependent manner are coloured grey and red, respectively. b, ROC curves depicting the performance of simple logistic classifiers for CXC-dependent binding on 2,667 low-stringency motif hits (q ≤ 0.2; 54 sites CXC-bound, 2,613 sites non-CXC-bound) based on different combinations of motif PWM scores and extended features. AUCs are provided in brackets. c, DIP experiments testing the binding affinities of DNA oligonucleotides representing two unbound sites (unbound 2 and 3) and their respective mutated sites (unbound 2 mut and unbound 3 mut) to increase the roll at position +1. Results from qPCR amplification were normalized for their input and shown as enrichment over an unbound fragment. Data are mean ± s.e.m for 4 biological replicates. d, DNA shape features at each base position comparing CXC-bound motifs (n = 16) to non-CXC-bound ones (n = 18) in the highest-scoring hit regions of the new motif (q < 0.05). Differences of shape features at all positions were evaluated by applying Wilcoxon exact rank tests with two-sided alternatives. Only roll at position +1 had P < 0.001. As roll and helix twist specify inter-base structural features, the corresponding bar graph representations have been centred between the respective nucleotide positions.
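The quantification described for panel c (qPCR signal normalized to input, expressed as enrichment over an unbound fragment, then summarized as mean ± s.e.m.) can be sketched as follows. This is an illustrative reading of the legend, assuming the enrichment is a ratio of input-normalized signals; the helper names and all numbers are hypothetical and do not come from the study.

```python
import math
import statistics


def enrichment_over_control(target_ip, target_input, ctrl_ip, ctrl_input):
    """Input-normalized target signal divided by input-normalized control signal."""
    return (target_ip / target_input) / (ctrl_ip / ctrl_input)


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("s.e.m. needs at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Four made-up biological replicates: (target IP, target input, control IP, control input)
replicates = [
    (8.0, 100.0, 1.0, 100.0),
    (12.0, 100.0, 2.0, 100.0),
    (6.0, 100.0, 1.0, 100.0),
    (10.0, 100.0, 1.0, 100.0),
]
enrichments = [enrichment_over_control(*r) for r in replicates]
mean, sem = mean_sem(enrichments)
print(f"enrichment = {mean:.2f} ± {sem:.2f} (n = {len(enrichments)})")
```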

Extended Data Figure 5 In vivo analysis of PionX sites.

a, Consensus motif found in the 25 regions where MSL2 binding is most sensitive to depletion of MLE. b, MSL2 signal changes on 37 HAS matching CXC-dependent in vitro binding sites or 272 non-matching ones during MLE knockdown in S2 cells. Displayed are the mean differences of three biological replicates. c, Western blot analysis of whole-cell extracts from S2 and Kc cells treated with either RNAi against Sxl (two different double-stranded RNAs) or control RNAi directed against irrelevant Gfp sequences at different time points (for gel source data see Supplementary Fig. 1). d, Clustered heat map of MSL2 peaks from ChIP–seq experiments in female Kc cells treated with RNAi against Sxl for 3, 6 and 9 days. Red bar indicates 30 sites characterized by strong MSL2 recruitment. e, Enrichment of PionX motif hits (score > 22) and MRE motif hits (score > 27) on D. miranda and D. pseudoobscura chromosomes relative to Müller-B, normalized for chromosome length. The analysis included 225 and 400 PionX hits in D. miranda and D. pseudoobscura, respectively. A total of 784 and 755 MRE hits were considered in D. miranda and D. pseudoobscura, respectively. f, Sequence from the neo-X chromosome chromatin entry sites compared to its counterpart on the neo-Y chromosome as in Supplementary Fig. 2 of ref. 19. Motifs are highlighted in green (neo-Y-chromosomal) and in red (neo-X-chromosomal) with their corresponding PionX motif score in blue.

Extended Data Table 1 CXC-dependent sites (PionX)
Extended Data Table 2 List of oligonucleotides used in the DIP experiments

Supplementary information

Supplementary Information

This file contains Supplementary Figure 1, uncropped scans with size marker indications. (PDF 318 kb)

Supplementary Data

This file contains Supplementary Tables 1-2 and a Supplementary Table Guide. (ZIP 450 kb)

About this article

Cite this article

Villa, R., Schauer, T., Smialowski, P. et al. PionX sites mark the X chromosome for dosage compensation. Nature 537, 244–248 (2016).
