Closing gaps in the human genome with fosmid resources generated from multiple individuals

Abstract

The human genome sequence has been finished to very high standards; however, more than 340 gaps remained when the finished genome was published by the International Human Genome Sequencing Consortium in 2004. Using fosmid resources generated from multiple individuals, we targeted gaps in the euchromatic part of the human genome. Here we report 2,488,842 bp of previously unknown euchromatic sequence, 363,114 bp of which close 26 of 250 euchromatic gaps, or 10%, including two remaining euchromatic gaps on chromosome 19. Eight (30.7%) of the closed gaps were found to be polymorphic. These sequences allow complete annotation of several human genes as well as the assignment of mRNAs. The gap sequences are 2.3-fold enriched in segmentally duplicated sequences compared to the whole genome. Our analysis confirms that not all gaps within 'finished' genomes are recalcitrant to subcloning and suggests that the paired-end-sequenced fosmid libraries could prove to be a rich resource for completion of the human euchromatic genome.
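The proportions quoted in the abstract can be rechecked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated above; the variable and function names are illustrative and do not come from the paper.

```python
# Counts taken directly from the abstract; no other data is assumed.
NEW_SEQUENCE_BP = 2_488_842   # previously unknown euchromatic sequence
GAP_CLOSING_BP = 363_114      # portion of that sequence that closes gaps
GAPS_TARGETED = 250           # euchromatic gaps considered
GAPS_CLOSED = 26              # gaps closed in this study
POLYMORPHIC_CLOSED = 8        # closed gaps found to be polymorphic


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


closed_pct = percent(GAPS_CLOSED, GAPS_TARGETED)
polymorphic_pct = percent(POLYMORPHIC_CLOSED, GAPS_CLOSED)
closing_share_pct = percent(GAP_CLOSING_BP, NEW_SEQUENCE_BP)

print(f"Gaps closed: {closed_pct:.1f}% of targeted gaps")
print(f"Polymorphic: {polymorphic_pct:.2f}% of closed gaps")
print(f"Gap-closing sequence: {closing_share_pct:.1f}% of new sequence")
```

Under these inputs, 26 of 250 gaps is 10.4%, consistent with the abstract's figure of about 10%. Eight of 26 is 30.77%, which the abstract reports as 30.7% and which would be 30.8% if rounded to one decimal place. The gap-closing sequence accounts for roughly 14.6% of the newly reported bases.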

Figure 1: Strategies for closing or extending into gaps in the human genome.
Figure 2: Closing euchromatic gaps using haplotype-specific sequence assemblies.
Figure 3: Overview of CGH experiments for the closed gap-contigs.

Accession codes

Accessions

Gene Expression Omnibus

Acknowledgements

The authors would like to acknowledge the University of Washington Genome Center staff and A. Yamada at Agilent Technologies for technical assistance. This work was supported by US National Institutes of Health grants 3 U54 HG002043 to M.V.O. and HG004120 to E.E.E. and M.V.O. E.E.E. is an investigator of the Howard Hughes Medical Institute.

Author information

Contributions

R.K. and M.V.O. designed the overall study. R.K. oversaw the overall data production and analysis. R.K. and E.E.E. carried out data analysis and wrote the manuscript with comments from M.V.O. D. Bovee was responsible for data curation, identification of clones from paired-end-sequenced fosmid libraries and incorporation of publicly available sequence data. E.H. and D.J. provided the informatics support and submitted the sequence data to GenBank. Y.Z. and Z.W. were responsible for finishing the fosmid clones. J.C. was responsible for custom fosmid library construction, and R.L., D. Buckley and S.S. generated shotgun-sequencing data. H.S.H., W.G. and K.P. generated and analyzed the MCD fingerprint data. G.M.C. and N.S. designed and analyzed the aCGH data; E.E.E., E.T., V.A.M. and J.S. initially identified and cherry-picked the fosmid clones from G248 and ABC libraries. D.R.S. was responsible for generating ABC fosmid libraries and their paired-end-sequence data.

Corresponding author

Correspondence to Rajinder Kaul.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–7 (PDF 560 kb)

About this article

Cite this article

Bovee, D., Zhou, Y., Haugen, E. et al. Closing gaps in the human genome with fosmid resources generated from multiple individuals. Nat Genet 40, 96–101 (2008). https://doi.org/10.1038/ng.2007.34

  • DOI: https://doi.org/10.1038/ng.2007.34
