Abstract
The increasing volume of whole-genome sequence (WGS) and multi-omics data requires new approaches for analysis. As one solution, we have created the cloud-based Analysis Commons, which brings together genotype and phenotype data from multiple studies in a setting that is accessible by multiple investigators. This framework addresses many of the challenges of multicenter WGS analyses, including data-sharing mechanisms, phenotype harmonization, integrated multi-omics analyses, annotation and computational flexibility. In this setting, the computational pipeline facilitates a sequence-to-discovery analysis workflow illustrated here by an analysis of plasma fibrinogen levels in 3,996 individuals from the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) WGS program. The Analysis Commons represents a novel model for translating WGS resources from a massive quantity of phenotypic and genomic data into knowledge of the determinants of health and disease risk in diverse human populations.
Acknowledgements
TOPMed. WGS for the TOPMed program was supported by the NHLBI. WGS for 'NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study' (phs000974.v1.p1) and 'NHLBI TOPMed: Genetics of Cardiometabolic Health in the Amish' (phs000956.v1.p1) was performed at the Broad Institute of MIT and Harvard (HHSN268201500014C and 3R01HL121007-01S1 (NHLBI, B.D.M.)). Centralized read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity quality control and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1 (NHLBI, B.M.P., K.M.R. and S.S.R.)). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The infrastructure for the Analysis Commons is additionally supported by R01HL105756 (NHLBI, B.M.P.), U01HL130114 (NHLBI, B.M.P.) and 5RC2HL102419 (NHLBI, E.B.).
Old Order Amish Study. This investigation was supported by National Institutes of Health grants R01 HL121007 (NHLBI, B.D.M.), U01 GM074518, U01 HL084756 (NHLBI, J.R.O.), U01 HL137181 (NHLBI, J.R.O.) and K23 GM102678 (NIGMS, J.P.L.), as well as Mid-Atlantic Nutrition and Obesity Research Center grant P30 DK072488 (NIDDK, B.D.M.). We also gratefully acknowledge our Amish liaisons and field workers and the extraordinary cooperation and support of the Amish community.
Framingham Heart Study. The Framingham Heart Study was supported by the NHLBI Framingham Heart Study (contract nos. N01-HC-25195 and HHSN268201500001I (NHLBI, R.S.V. and L.A.C.)). Fibrinogen measurement was supported by NIH R01-HL-48157. J.E.H. and A.D.J. were supported by NHLBI Intramural Research Program funds. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the NHLBI, the National Institutes of Health or the US Department of Health and Human Services.
Ethics declarations
Competing interests
B.M.P. reports serving on the data and safety monitoring board for a clinical trial funded by the manufacturer Zoll LifeCor and on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson. J.R.O. has a consulting agreement with Regeneron Pharmaceuticals that focuses on development of statistical analysis and software tools. A.C. and D.C.A. are employed by DNAnexus.
Additional information
A full list of members and affiliations appears in the Supplementary Note.
Supplementary information
Supplementary Text and Figures
Supplementary Note (PDF 158 kb)
About this article
Cite this article
Brody, J., Morrison, A., Bis, J. et al. Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology. Nat Genet 49, 1560–1563 (2017). https://doi.org/10.1038/ng.3968
DOI: https://doi.org/10.1038/ng.3968