  • Commentary

Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology

Abstract

The increasing volume of whole-genome sequence (WGS) and multi-omics data requires new approaches for analysis. As one solution, we have created the cloud-based Analysis Commons, which brings together genotype and phenotype data from multiple studies in a setting that is accessible by multiple investigators. This framework addresses many of the challenges of multicenter WGS analyses, including data-sharing mechanisms, phenotype harmonization, integrated multi-omics analyses, annotation and computational flexibility. In this setting, the computational pipeline facilitates a sequence-to-discovery analysis workflow illustrated here by an analysis of plasma fibrinogen levels in 3,996 individuals from the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) WGS program. The Analysis Commons represents a novel model for translating WGS resources from a massive quantity of phenotypic and genomic data into knowledge of the determinants of health and disease risk in diverse human populations.

Figure 1: Analysis Commons design.
Figure 2: Plasma fibrinogen association results.

Acknowledgements

TOPMed. WGS for the TOPMed program was supported by the NHLBI. WGS for 'NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study' (phs000974.v1.p1) and 'NHLBI TOPMed: Genetics of Cardiometabolic Health in the Amish' (phs000956.v1.p1) was performed at the Broad Institute of MIT and Harvard (HHSN268201500014C and 3R01HL121007-01S1 (NHLBI, B.D.M.)). Centralized read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity quality control and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1 (NHLBI, B.M.P., K.M.R. and S.S.R.)). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The infrastructure for the Analysis Commons is additionally supported by R01HL105756 (NHLBI, B.M.P.), U01HL130114 (NHLBI, B.M.P.) and 5RC2HL102419 (NHLBI, E.B.).

Old Order Amish Study. This investigation was supported by National Institutes of Health grants R01 HL121007 (NHLBI, B.D.M.), U01 GM074518, U01 HL084756 (NHLBI, J.R.O.), U01 HL137181 (NHLBI, J.R.O.) and K23 GM102678 (NIGMS, J.P.L.), as well as Mid-Atlantic Nutrition and Obesity Research Center grant P30 DK072488 (NIDDK, B.D.M.). We also gratefully acknowledge our Amish liaisons and field workers and the extraordinary cooperation and support of the Amish community.

Framingham Heart Study. The Framingham Heart Study was supported by the NHLBI Framingham Heart Study (contract nos. N01-HC-25195 and HHSN268201500001I (NHLBI, R.S.V. and L.A.C.)). Fibrinogen measurement was supported by NIH R01-HL-48157. J.E.H. and A.D.J. were supported by NHLBI Intramural Research Program funds. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the NHLBI, the National Institutes of Health or the US Department of Health and Human Services.

Author information

Corresponding authors

Correspondence to Jennifer A Brody or L Adrienne Cupples.

Ethics declarations

Competing interests

B.M.P. reports serving on the data and safety monitoring board for a clinical trial funded by the manufacturer Zoll LifeCor and on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson. J.R.O. has a consulting agreement with Regeneron Pharmaceuticals that focuses on development of statistical analysis and software tools. A.C. and D.C.A. are employed by DNAnexus.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Supplementary information

Supplementary Text and Figures

Supplementary Note (PDF 158 kb)

About this article

Cite this article

Brody, J., Morrison, A., Bis, J. et al. Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology. Nat Genet 49, 1560–1563 (2017). https://doi.org/10.1038/ng.3968

  • DOI: https://doi.org/10.1038/ng.3968
