Article

A wellness study of 108 individuals using personal, dense, dynamic data clouds

  • Nature Biotechnology volume 35, pages 747–756 (2017)
  • doi:10.1038/nbt.3870

Abstract

Personal data for 108 individuals were collected during a 9-month period, including whole genome sequences; clinical tests, metabolomes, proteomes, and microbiomes at three time points; and daily activity tracking. Using all of these data, we generated a correlation network that revealed communities of related analytes associated with physiology and disease. Connectivity within analyte communities enabled the identification of known and candidate biomarkers (e.g., gamma-glutamyltyrosine was densely interconnected with clinical analytes for cardiometabolic disease). We calculated polygenic scores from genome-wide association studies (GWAS) for 127 traits and diseases, and used these to discover molecular correlates of polygenic risk (e.g., genetic risk for inflammatory bowel disease was negatively correlated with plasma cystine). Finally, behavioral coaching informed by personal data helped participants to improve clinical biomarkers. Our results show that measurement of personal data clouds over time can improve our understanding of health and disease, including early transitions to disease states.
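
The correlation-network idea in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the choice of Spearman correlation, a permutation test for p-values, Benjamini-Hochberg control of the false discovery rate, and connected components as a simple stand-in for community detection are all assumptions made for illustration. The analyte names and values are synthetic toy data, not measurements from the study.

```python
import itertools
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            out[order[k]] = average
        start = end + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    if sx == 0 or sy == 0:
        return 0.0
    return cov / math.sqrt(sx * sy)


def spearman_permutation(x, y, n_perm, rng):
    """Spearman rho plus a two-sided permutation p-value (assumed test)."""
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, q):
    """Flag which p-values are significant at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            cutoff = rank  # largest rank that passes its threshold
    significant = [False] * m
    for i in order[:cutoff]:
        significant[i] = True
    return significant


def connected_components(nodes, edges):
    """Group nodes linked by edges; a crude proxy for communities."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for node in nodes:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Synthetic values for eight hypothetical participants (illustration only).
analytes = {
    "glucose": [82, 85, 88, 91, 95, 99, 104, 110],
    "insulin": [3.1, 3.9, 4.6, 5.8, 7.0, 8.4, 10.2, 12.5],
    "adiponectin": [14.0, 12.5, 11.1, 9.8, 8.2, 7.1, 6.0, 4.9],
    "crp": [2.2, 1.1, 4.1, 0.4, 5.5, 0.7, 3.0, 1.6],
    "il6": [2.4, 1.5, 3.7, 0.9, 4.5, 1.2, 3.0, 1.9],
    "saa": [11, 7, 16, 4, 20, 5, 13, 9],
}

rng = random.Random(0)  # fixed seed so the permutation test is repeatable
pairs = list(itertools.combinations(analytes, 2))
results = [spearman_permutation(analytes[a], analytes[b], 999, rng) for a, b in pairs]
keep = benjamini_hochberg([p for _, p in results], q=0.05)
edges = [pair for pair, flag in zip(pairs, keep) if flag]
communities = connected_components(list(analytes), edges)
print(communities)
```

On this toy input the two constructed groups (a metabolic trio and an inflammatory trio) are perfectly rank-correlated within each group and uncorrelated across groups, so only within-group edges survive the correction. With real data, a dedicated community-detection method would likely replace the connected-components step.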



Acknowledgements

We would like to acknowledge significant contributions to this study from our 108 Pioneers, S. Kaplan, S. Mecca, S. Bell, G. Sorensen, C. Lewis, T. Kilgallon, M. Brunkow, S. Huang, C.-Y. Huang, D. Mauldin, S. Speck, M. Raff, J. Pizzorno, J. Guiltinan, R. Green, L. Smarr, E. Lazowska, C. Witwer, M. Flores, and many others who helped us on this wellness journey. This work was supported in part by the Robert Wood Johnson Foundation (L.H., N.D.P.), the M.J. Murdock Charitable Trust (L.H., N.D.P.), NIH grants 2P50GM076547 (L.H., N.D.P.), P30ES017885 (G.S.O.), U24CA2210967 (G.S.O.), RC2HG005805 (R.L.M.), and Arivale.

Author information

Author notes

    • Daniel T McDonald

    Present address: University of California, San Diego, San Diego, California, USA.

    • Nathan D Price, Andrew T Magis & John C Earls

    These authors contributed equally to this work.

    • Nathan D Price & Leroy Hood

    These authors jointly supervised this work.

Affiliations

  1. Institute for Systems Biology, Seattle, Washington, USA.

    • Nathan D Price, Gustavo Glusman, Roie Levy, Christopher Lausted, Daniel T McDonald, Ulrike Kusebauch, Christopher L Moss, Yong Zhou, Shizhen Qin, Robert L Moritz, Gilbert S Omenn, Jennifer C Lovejoy & Leroy Hood

  2. Arivale, Seattle, Washington, USA.

    • Nathan D Price, Andrew T Magis, John C Earls, Kristin Brogaard & Jennifer C Lovejoy

  3. Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

    • Gilbert S Omenn

  4. Providence St. Joseph Health, Seattle, Washington, USA.

    • Leroy Hood

Authors

  1. Nathan D Price
  2. Andrew T Magis
  3. John C Earls
  4. Gustavo Glusman
  5. Roie Levy
  6. Christopher Lausted
  7. Daniel T McDonald
  8. Ulrike Kusebauch
  9. Christopher L Moss
  10. Yong Zhou
  11. Shizhen Qin
  12. Robert L Moritz
  13. Kristin Brogaard
  14. Gilbert S Omenn
  15. Jennifer C Lovejoy
  16. Leroy Hood

Contributions

L.H. and N.D.P. conceived of and led the study. J.C.L. designed and managed the clinical and coaching aspects of the study. A.T.M. and J.C.E. performed most of the computational analyses. G.G. contributed many important ideas from the beginning of the study. G.G., R.L., and D.T.M. performed additional computational analysis. N.D.P., A.T.M., J.C.E., G.G., R.L., D.T.M., G.S.O., J.C.L., and L.H. analyzed data. C.L. generated the Olink proteomics data. U.K., C.L.M., Y.Z., S.Q., and R.L.M. generated the mass spectrometry proteomics data. K.B. managed most of the logistics of implementing the study. A.T.M., N.D.P., and L.H. were the primary writers of the paper, with contributions from all authors.

Competing interests

L.H. and N.D.P. are co-founders of Arivale and hold stock in the company. N.D.P. is on the Arivale board of directors; L.H. is chair and G.S.O. a member of Arivale's scientific advisory board. A.T.M., J.C.E., K.B., and J.C.L. are employees of Arivale and have stock options in the company, as do G.G. and G.S.O.

Corresponding authors

Correspondence to Nathan D Price or Leroy Hood.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–9 and Supplementary Tables 6–7, 9–13

Zip files

  1. Supplementary Dataset 1
  2. Supplementary Dataset 2
  3. Supplementary Code

Excel files

  1. Supplementary Table 1: All analytes measured in the P100
  2. Supplementary Table 2: Complete inter-omic correlation network for cross-sectional correlations
  3. Supplementary Table 3: Complete intra- and inter-omic correlation network for cross-sectional correlations
  4. Supplementary Table 4: Complete inter-omic correlation network for delta correlations
  5. Supplementary Table 5: Complete intra- and inter-omic correlation network for delta correlations
  6. Supplementary Table 8: Polygenic score quantitative traits tested in the P100
  7. Supplementary Table 14: Age and sex adjustments for the correlation networks