Abstract
An understanding of how centromeric transition regions are organized is a critical aspect of chromosome structure and function; however, the sequence context of these regions has been difficult to resolve on the basis of the draft genome sequence. We present a detailed analysis of the structure and assembly of all human pericentromeric regions (5 megabases). Most chromosome arms (35 out of 43) show a gradient of dwindling transcriptional diversity accompanied by an increasing number of interchromosomal duplications in proximity to the centromere. At least 30% of the centromeric transition region structure originates from euchromatic gene-containing segments of DNA that were duplicatively transposed towards pericentromeric regions at a rate of six–seven events per million years during primate evolution. This process has led to the formation of a minimum of 28 new transcripts by exon exaptation and exon shuffling, many of which are primarily expressed in the testis. The distribution of these duplicated segments is nonrandom among pericentromeric regions, suggesting that some regions have served as preferential acceptors of euchromatic DNA.
Acknowledgements
We are grateful to the large-scale sequencing centres (Baylor College of Medicine, Cold Spring Harbor Laboratory, Genome Therapeutics Corporation, Harvard Partners Genome Center, Joint Genome Institute, The NIH Intramural Sequencing Center, The UK-MRC Sequencing Consortium, The University of Oklahoma Advanced Center for Genome Technology, The University of Texas Southwest, The Whitehead Institute for Biomedical Research, The Washington University Genome Sequencing Center and the Wellcome Trust Sanger Institute) for access to all large-scale finished sequence, genome assembly and trace sequence data from the human genome before publication. This work was supported by grants from NIH and DOE to E.E.E. and grants from P.R.I.N.C.E., MURST and Telethon to M.R.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Supplementary information
Supplementary Methods
This file includes a detailed description of the Methods and additional references. (DOC 63 kb)
Supplementary Table 1a
Gaps, repeats and exons within 2Mb pericentromeric regions. (XLS 24 kb)
Supplementary Table 1b
Gaps, repeats and exons within 5Mb pericentromeric regions. (XLS 22 kb)
Supplementary Table 1c
Duplications and pairwise alignments within 2Mb pericentromeric regions. (XLS 31 kb)
Supplementary Table 1d
Duplications and pairwise alignments within 5Mb pericentromeric regions. (XLS 28 kb)
Supplementary Table 1e
Homology between each 2Mb pericentromeric region and all other chromosomes. (XLS 30 kb)
Supplementary Table 1f
Homology between each 5Mb pericentromeric region and all other chromosomes. (XLS 32 kb)
Supplementary Table 2
Analysis of alpha satellite DNA placement in build 34 (July 2003) for each chromosome. (XLS 29 kb)
Supplementary Table 3
Cytogenetic vs. in silico analysis of human alpha-satellite containing sequences. (XLS 23 kb)
Supplementary Table 4
Assessment of all gaps within the finished human genome. (XLS 19 kb)
Supplementary Table 5
Complete analysis of cytogenetic vs. in silico analysis of human segmental duplications. (XLS 116 kb)
Supplementary Table 6
Brief summary of cytogenetic vs. in silico analysis of human segmental duplications. (XLS 16 kb)
Supplementary Table 7
Paralogous STS content of 10 pericentromeric duplicons in the human genome. (XLS 18 kb)
Supplementary Table 8
Duplicon junction analysis. (XLS 19 kb)
Supplementary Table 9
Non-human primate mapping of pericentromeric duplications on 2p11. (XLS 17 kb)
Supplementary Table 10
Summary of ancestral duplicons and corresponding gene composition. (XLS 80 kb)
Supplementary Table 11a
Refseq genes located within 2Mb pericentromeric regions. (XLS 81 kb)
Supplementary Table 11b
Refseq genes located within 5Mb pericentromeric regions. (XLS 241 kb)
Supplementary Table 12
Known genes and mRNAs within pericentromeric duplications from ancestral donors. (XLS 29 kb)
Supplementary Table 13
RT-PCR analysis of a subset of pericentromeric genes, mRNAs and ESTs. (XLS 28 kb)
Supplementary Figure 1a
Repeat density plotted within 10Mb pericentromeric regions. (PDF 28 kb)
Supplementary Figure 1b
Exon density plotted within 10Mb pericentromeric regions. (PDF 358 kb)
Supplementary Figure 2
The 20 largest pericentromeric-pericentromeric alignments within 2Mb pericentromeric regions. (PDF 39 kb)
Supplementary Figure 3
Distribution of pericentromeric duplications for each chromosome by divergence. (PDF 651 kb)
Supplementary Figure 4
Methodology for identification of ancestral duplicons using mousenet. (PDF 26 kb)
Supplementary Figure 5
Ancestral duplicons within 15q11. (PDF 108 kb)
Supplementary Figure 6
Sequence similarity between acceptor duplicons and donor duplicons. (PDF 97 kb)
Supplementary Figure 7
Sequence structure of genes and mRNAs within pericentromeric regions. (PDF 287 kb)
Cite this article
She, X., Horvath, J., Jiang, Z. et al. The structure and evolution of centromeric transition regions within the human genome. Nature 430, 857–864 (2004). https://doi.org/10.1038/nature02806
DOI: https://doi.org/10.1038/nature02806