The structure and evolution of centromeric transition regions within the human genome

Abstract

An understanding of how centromeric transition regions are organized is a critical aspect of chromosome structure and function; however, the sequence context of these regions has been difficult to resolve on the basis of the draft genome sequence. We present a detailed analysis of the structure and assembly of all human pericentromeric regions (5 megabases). Most chromosome arms (35 out of 43) show a gradient of dwindling transcriptional diversity accompanied by an increasing number of interchromosomal duplications in proximity to the centromere. At least 30% of the centromeric transition region structure originates from euchromatic gene-containing segments of DNA that were duplicatively transposed towards pericentromeric regions at a rate of six–seven events per million years during primate evolution. This process has led to the formation of a minimum of 28 new transcripts by exon exaptation and exon shuffling, many of which are primarily expressed in the testis. The distribution of these duplicated segments is nonrandom among pericentromeric regions, suggesting that some regions have served as preferential acceptors of euchromatic DNA.

Figure 1: Models of centromeric transition regions.
Figure 2: Pericentromeric architecture.
Figure 3: Sequence properties of centromeric transition regions.
Figure 4: Cohorts of pericentromeric duplication.
Figure 5: Ancestral duplicons within 2p11.

Acknowledgements

We are grateful to the large-scale sequencing centres (Baylor College of Medicine, Cold Spring Harbor Laboratory, Genome Therapeutics Corporation, Harvard Partners Genome Center, Joint Genome Institute, The NIH Intramural Sequencing Center, The UK-MRC Sequencing Consortium, The University of Oklahoma Advanced Center for Genome Technology, The University of Texas Southwest, The Whitehead Institute for Biomedical Research, The Washington University Genome Sequencing Center and the Wellcome Trust Sanger Institute) for access to all large-scale finished sequence, genome assembly and trace sequence data from the human genome before publication. This work was supported by grants from NIH and DOE to E.E.E. and grants from P.R.I.N.C.E., MURST and Telethon to M.R.

Author information

Corresponding author

Correspondence to Evan E. Eichler.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

Supplementary information

Supplementary Methods

This file includes a detailed description of the Methods and additional references. (DOC 63 kb)

Supplementary Table 1a

Gaps, repeats and exons within 2Mb pericentromeric regions. (XLS 24 kb)

Supplementary Table 1b

Gaps, repeats and exons within 5Mb pericentromeric regions. (XLS 22 kb)

Supplementary Table 1c

Duplications and pairwise alignments within 2Mb pericentromeric regions. (XLS 31 kb)

Supplementary Table 1d

Duplications and pairwise alignments within 5Mb pericentromeric regions. (XLS 28 kb)

Supplementary Table 1e

Homology between each 2Mb pericentromeric region and all other chromosomes. (XLS 30 kb)

Supplementary Table 1f

Homology between each 5Mb pericentromeric region and all other chromosomes. (XLS 32 kb)

Supplementary Table 2

Analysis of alpha satellite DNA placement in build 34 (July 2003) for each chromosome. (XLS 29 kb)

Supplementary Table 3

Cytogenetic vs. in silico analysis of human alpha-satellite-containing sequences. (XLS 23 kb)

Supplementary Table 4

Assessment of all gaps within the finished human genome. (XLS 19 kb)

Supplementary Table 5

Complete analysis of cytogenetic vs. in silico analysis of human segmental duplications. (XLS 116 kb)

Supplementary Table 6

Brief summary of cytogenetic vs. in silico analysis of human segmental duplications. (XLS 16 kb)

Supplementary Table 7

Paralogous STS content of 10 pericentromeric duplicons in the human genome. (XLS 18 kb)

Supplementary Table 8

Duplicon junction analysis. (XLS 19 kb)

Supplementary Table 9

Non-human primate mapping of pericentromeric duplications on 2p11. (XLS 17 kb)

Supplementary Table 10

Summary of ancestral duplicons and corresponding gene composition. (XLS 80 kb)

Supplementary Table 11a

RefSeq genes located within 2Mb pericentromeric regions. (XLS 81 kb)

Supplementary Table 11b

RefSeq genes located within 5Mb pericentromeric regions. (XLS 241 kb)

Supplementary Table 12

Known genes and mRNAs within pericentromeric duplications from ancestral donors. (XLS 29 kb)

Supplementary Table 13

RT-PCR analysis of a subset of pericentromeric genes, mRNAs and ESTs. (XLS 28 kb)

Supplementary Figure 1a

Repeat density plotted within 10Mb pericentromeric regions. (PDF 28 kb)

Supplementary Figure 1b

Exon density plotted within 10Mb pericentromeric regions. (PDF 358 kb)

Supplementary Figure 2

The 20 largest pericentromeric-pericentromeric alignments within 2Mb pericentromeric regions. (PDF 39 kb)

Supplementary Figure 3

Distribution of pericentromeric duplications for each chromosome by divergence. (PDF 651 kb)

Supplementary Figure 4

Methodology for identification of ancestral duplicons using mousenet. (PDF 26 kb)

Supplementary Figure 5

Ancestral duplicons within 15q11. (PDF 108 kb)

Supplementary Figure 6

Sequence similarity between acceptor duplicons and donor duplicons. (PDF 97 kb)

Supplementary Figure 7

Sequence structure of genes and mRNAs within pericentromeric regions. (PDF 287 kb)

About this article

Cite this article

She, X., Horvath, J., Jiang, Z. et al. The structure and evolution of centromeric transition regions within the human genome. Nature 430, 857–864 (2004). https://doi.org/10.1038/nature02806
