  • Letter

Comparative analyses of multi-species sequences from targeted genomic regions

Abstract

The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding [1–6] and conserved non-coding [4,6,7] regions, including regulatory elements [8–10], and provide insight into the forces that have rendered modern-day genomes [6]. As a complement to whole-genome sequencing efforts [3,5,6], we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.

Figure 1: Patterns of sequence conservation.
Figure 2: Detection of MCSs by using different mammalian sequences.
Figure 3: Comparison of genome dynamics among species.

References

  1. Batzoglou, S., Pachter, L., Mesirov, J. P., Berger, B. & Lander, E. S. Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res. 10, 950–958 (2000)

  2. Roest Crollius, H. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000)

  3. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001)

  4. Chen, R., Bouck, J. B., Weinstock, G. M. & Gibbs, R. A. Comparing vertebrate whole-genome shotgun reads to the human genome. Genome Res. 11, 1807–1816 (2001)

  5. Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301–1310 (2002)

  6. Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002)

  7. Dubchak, I. et al. Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. 10, 1304–1306 (2000)

  8. Gottgens, B. et al. Analysis of vertebrate SCL loci identifies conserved enhancers. Nature Biotechnol. 18, 181–186 (2000)

  9. Hardison, R. C. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 16, 369–372 (2000)

  10. Pennacchio, L. A. & Rubin, E. M. Genomic strategies to identify mammalian regulatory sequences. Nature Rev. Genet. 2, 100–109 (2001)

  11. Rommens, J. M. et al. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245, 1059–1065 (1989)

  12. Felsenfeld, A., Peterson, J., Schloss, J. & Guyer, M. Assessing the quality of the DNA sequence from The Human Genome Project. Genome Res. 9, 1–4 (1999)

  13. Schwartz, S. et al. Human–mouse alignments with BLASTZ. Genome Res. 13, 103–107 (2003)

  14. Schwartz, S. et al. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res. 31, 3518–3524 (2003)

  15. Murphy, W. J. et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294, 2348–2351 (2001)

  16. Poux, C., Van Rheede, T., Madsen, O. & de Jong, W. W. Sequence gaps join mice and men: phylogenetic evidence from deletions in two proteins. Mol. Biol. Evol. 19, 2035–2037 (2002)

  17. Huelsenbeck, J. P., Larget, B. & Swofford, D. A compound Poisson process for relaxing the molecular clock. Genetics 154, 1879–1892 (2000)

  18. Cooper, G. M. et al. Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes. Genome Res. 13, 813–820 (2003)

  19. Siepel, A. & Haussler, D. Proc. 7th Annual Int. Conf. Research in Computational Molecular Biology (ACM, New York, 2003)

  20. Hardison, R. C. et al. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13, 13–26 (2003)

  21. Green, P. et al. Transcription-associated mutational asymmetry in mammalian evolution. Nature Genet. 33, 514–517 (2003)

  22. Frazer, K. A. et al. Genomic DNA insertions and deletions occur frequently between humans and nonhuman primates. Genome Res. 13, 341–346 (2003)

  23. Britten, R. J. Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels. Proc. Natl Acad. Sci. USA 99, 13633–13635 (2002)

  24. Springer, M. S., Murphy, W. J., Eizirik, E. & O'Brien, S. J. Placental mammal diversification and the Cretaceous/Tertiary boundary. Proc. Natl Acad. Sci. USA 100, 1056–1061 (2003)

  25. Li, W. H., Ellsworth, D. L., Krushkal, J., Chang, B. H. & Hewett-Emmett, D. Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol. Phylogenet. Evol. 5, 182–187 (1996)

  26. Kumar, S. & Subramanian, S. Mutation rates in mammalian genomes. Proc. Natl Acad. Sci. USA 99, 803–808 (2002)

  27. Shizuya, H. et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Natl Acad. Sci. USA 89, 8794–8797 (1992)

  28. Thomas, J. W. et al. Parallel construction of orthologous sequence-ready clone contig maps in multiple species. Genome Res. 12, 1277–1285 (2002)

  29. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997)

  30. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002)

Acknowledgements

We thank J. Weissenbach and H. Roest Crollius for Tetraodon BACs; M. Diekhans for computational expertise; N. Goldman and Z. Yang for advice on phylogenetic analyses; and F. Collins and J. Mullikin for critically reading the manuscript. We acknowledge the support of the National Human Genome Research Institute (National Institutes of Health) and the Howard Hughes Medical Institute.

Author information

Corresponding author

Correspondence to E. D. Green.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

About this article

Cite this article

Thomas, J., Touchman, J., Blakesley, R. et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424, 788–793 (2003). https://doi.org/10.1038/nature01858

  • DOI: https://doi.org/10.1038/nature01858
