Plasma protein patterns as comprehensive indicators of health

Williams, Stephen A.; Kivimaki, Mika; Langenberg, Claudia; Hingorani, Aroon D.; Casas, J. P.; Bouchard, Claude; Jonasson, Christian; Sarzynski, Mark A.; Shipley, Martin J.; Alexander, Leigh; Ash, Jessica; Bauer, Tim; Chadwick, Jessica; Datta, Gargi; DeLisle, Robert Kirk; Hagar, Yolanda; Hinterberg, Michael; Ostroff, Rachel; Weiss, Sophie; Ganz, Peter; Wareham, Nicholas J.

doi:10.1038/s41591-019-0665-2

Letter
Published: 02 December 2019

Stephen A. Williams¹ (ORCID: orcid.org/0000-0002-8661-4315),
Mika Kivimaki² (ORCID: orcid.org/0000-0002-4699-5627),
Claudia Langenberg³ (ORCID: orcid.org/0000-0002-5017-7344),
Aroon D. Hingorani⁴,⁵,⁶,
J. P. Casas⁷,
Claude Bouchard⁸ (ORCID: orcid.org/0000-0002-0048-491X),
Christian Jonasson⁹,
Mark A. Sarzynski¹⁰,
Martin J. Shipley²,
Leigh Alexander¹,
Jessica Ash¹,
Tim Bauer¹,
Jessica Chadwick¹,
Gargi Datta¹ (ORCID: orcid.org/0000-0002-1314-7824),
Robert Kirk DeLisle¹,
Yolanda Hagar¹,
Michael Hinterberg¹,
Rachel Ostroff¹,
Sophie Weiss¹,
Peter Ganz¹¹ &
Nicholas J. Wareham³

Nature Medicine volume 25, pages 1851–1857 (2019)

Abstract

Proteins are effector molecules that mediate the functions of genes^{1,2} and modulate comorbidities^{3,4,5,6,7,8,9,10}, behaviors and drug treatments^{11}. They represent an enormous potential resource for personalized, systemic and data-driven diagnosis, prevention, monitoring and treatment. However, the concept of using plasma proteins for individualized health assessment across many health conditions simultaneously has not been tested. Here, we show that plasma protein expression patterns strongly encode for multiple different health states, future disease risks and lifestyle behaviors. We developed and validated protein-phenotype models for 11 different health indicators: liver fat, kidney filtration, percentage body fat, visceral fat mass, lean body mass, cardiopulmonary fitness, physical activity, alcohol consumption, cigarette smoking, diabetes risk and primary cardiovascular event risk. The analyses were prospectively planned, documented and executed at scale on archived samples and clinical data, with a total of ~85 million protein measurements in 16,894 participants. Our proof-of-concept study demonstrates that protein expression patterns reliably encode for many different health issues, and that large-scale protein scanning^{12,13,14,15,16} coupled with machine learning is viable for the development and future simultaneous delivery of multiple measures of health. We anticipate that, with further validation and the addition of more protein-phenotype models, this approach could enable a single-source, individualized so-called liquid health check.

**Fig. 1: Model outputs compared to the truth standards against which they were derived.**

Data availability

Pre-existing data access policies for each of the five parent cohort studies specify that research data requests can be submitted to each steering committee; these will be promptly reviewed for confidentiality or intellectual property restrictions and will not unreasonably be refused. Individual-level patient or protein data may further be restricted by consent, confidentiality or privacy laws/considerations. These policies apply to both clinical and proteomic data.

References

1. Sun, B. B. et al. Genomic atlas of the human plasma proteome. Nature 558, 73–79 (2018).
2. Emilsson, V. et al. Co-regulatory networks of human serum proteins link genetics to disease. Science 361, 769–773 (2018).
3. Tasaki, S. et al. Multi-omics monitoring of drug response in rheumatoid arthritis in pursuit of molecular remission. Nat. Commun. 9, 2755 (2018).
4. O’Dwyer, D. N. et al. The peripheral blood proteome signature of idiopathic pulmonary fibrosis is distinct from normal and is associated with novel immunological processes. Sci. Rep. 7, 46560 (2017).
5. Christensson, A. et al. The impact of the glomerular filtration rate on the human plasma proteome. Proteom. Clin. Appl. 12, e1700067 (2018).
6. Ganz, P. et al. Development and validation of a protein-based risk score for cardiovascular outcomes among patients with stable coronary heart disease. J. Am. Med. Assoc. 315, 2532–2541 (2016).
7. Wood, G. C. et al. A multi-component classifier for nonalcoholic fatty liver disease (NAFLD) based on genomic, proteomic, and phenomic data domains. Sci. Rep. 7, 43238 (2017).
8. Han, Z. et al. Validation of a novel modified aptamer-based array proteomic platform in patients with end-stage renal disease. Diagnostics (Basel) 8, 71 (2018).
9. Menni, C. et al. Circulating proteomic signatures of chronological age. J. Gerontol. A 70, 809–816 (2014).
10. Thrush, A. et al. Diet-resistant obesity is characterized by a distinct plasma proteomic signature and impaired muscle fiber metabolism. Int. J. Obes. 42, 353–362 (2018).
11. Williams, S. A. et al. Improving assessment of drug safety through proteomics: early detection and mechanistic characterization of the unforeseen harmful effects of torcetrapib. Circulation 137, 999–1010 (2018).
12. Rohloff, J. C. et al. Nucleic acid ligands with protein-like side chains: modified aptamers and their use as diagnostic and therapeutic agents. Mol. Ther. Nucleic Acids 3, e201 (2014).
13. Gold, L. et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE 5, e15004 (2010).
14. Brody, E. et al. Life’s simple measures: unlocking the proteome. J. Mol. Biol. 422, 595–606 (2012).
15. Kim, C. H. et al. Stability and reproducibility of proteomic profiles measured with an aptamer-based platform. Sci. Rep. 8, 8382 (2018).
16. Candia, J. et al. Assessment of variability in the SOMAscan assay. Sci. Rep. 7, 14248 (2017).
17. Collaborators GBDRF, Forouzanfar, M. H. et al. Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks in 188 countries, 1990–2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet 386, 2287–2323 (2015).
18. Maruthappu, M. Delivering triple prevention: a health system responsibility. Lancet Diabetes Endocrinol. 4, 299–301 (2016).
19. Robson, J. et al. The NHS Health Check in England: an evaluation of the first 4 years. BMJ Open 6, e008840 (2016).
20. Valabhji, J. et al. Efficacy and effectiveness of screen and treat policies in prevention of type 2 diabetes: systematic review and meta-analysis of screening tests and interventions. BMJ 356, i6538 (2017).
21. Middleton, K. R., Anton, S. D. & Perri, M. G. Long-term adherence to health behavior change. Am. J. Lifestyle Med. 7, 395–404 (2013).
22. Dimitrov, D. V. Medical internet of things and big data in healthcare. Health Inf. Res. 22, 156–163 (2016).
23. Flores, M., Glusman, G., Brogaard, K., Price, N. D. & Hood, L. P4 medicine: how systems medicine will transform the healthcare sector and society. Per. Med. 10, 565–576 (2013).
24. Musich, S., Wang, S., Hawkins, K. & Klemes, A. The impact of personalized preventive care on health care quality, utilization, and expenditures. Popul. Health Manag. 19, 389–397 (2016).
25. Ezkurdia, I. et al. Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum. Mol. Genet. 23, 5866–5878 (2014).
26. Lin, H. et al. Discovery of a cytokine and its receptor by functional screening of the extracellular proteome. Science 320, 807–811 (2008).
27. Harrell, F. E. Jr. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis (Springer, 2015).
28. Pencina, M. J. et al. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat. Med. 27, 157–172 (2008).
29. Fielding, C. M. & Angulo, P. Hepatic steatosis and steatohepatitis: are they really two distinct entities? Curr. Hepatol. Rep. 13, 151–158 (2014).
30. Yki-Jarvinen, H. Non-alcoholic fatty liver disease as a cause and a consequence of metabolic syndrome. Lancet Diabetes Endocrinol. 2, 901–910 (2014).
31. Shuster, A., Patlas, M., Pinthus, J. H. & Mourtzakis, M. The clinical importance of visceral adiposity: a critical review of methods for visceral adipose tissue analysis. Br. J. Radiol. 85, 1–10 (2012).
32. Ross, R. et al. Importance of assessing cardiorespiratory fitness in clinical practice: a case for fitness as a clinical vital sign: a scientific statement from the American Heart Association. Circulation 134, e653–e699 (2016).
33. de Souza e Silva, C. G. et al. Association between cardiorespiratory fitness, obesity, and health care costs: The Veterans Exercise Testing Study. Int. J. Obes. (Lond.) https://doi.org/10.1038/s41366-018-0257-0 (2018).
34. Hobbs, F. D., Jukema, J. W., Da Silva, P. M., McCormack, T. & Catapano, A. L. Barriers to cardiovascular disease risk scoring and primary prevention in Europe. QJM 103, 727–739 (2010).
35. Ostroff, R. M. et al. Unlocking biomarker discovery: large scale application of aptamer proteomic technology for early detection of lung cancer. PLoS ONE 5, e15003 (2010).
36. Ostroff, R. M. et al. Early detection of malignant pleural mesothelioma in asbestos-exposed individuals with a noninvasive proteomics-based surveillance tool. PLoS ONE 7, e46091 (2012).
37. Usher-Smith, J. A., Sharp, S. J. & Griffin, S. J. The spectrum effect in tests for risk prediction, screening, and diagnosis. BMJ 353, i3139 (2016).
38. Ganna, A. et al. Risk prediction measures for case-cohort and nested case-control designs: an application to cardiovascular disease. Am. J. Epidemiol. 175, 715–724 (2012).
39. Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612 (2009).
40. Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B 67, 301–320 (2005).
41. Tibshirani, R. Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. B 58, 267–288 (1996).

Acknowledgements

The Whitehall II study is supported by the UK Medical Research Council (no. MR/R024227/1, to M.K.), the US National Institute on Aging (NIH, nos. R01AG056477 and R01AG062553, to M.K.) and the British Heart Foundation (no. RG/16/11/32334, to M.J.S.). A.D.H. is an NIHR Senior Investigator and was also supported, in part, by the National Institute for Health Research University College London Hospitals Biomedical Research Centre and the UCL BHF Research Accelerator (AA/18/6/34223). FENLAND (the Fenland study, no. 10.22025/2017.10.101.00001) is funded by the UK Medical Research Council (no. MC_UU_12015/1), and N.W. is an NIHR Senior Investigator. We also thank the Fenland Study Investigators, Fenland Study Co-ordination team and the Epidemiology Field, Data and Laboratory teams. HUNT3 is funded by the Norwegian Ministry of Health, Norwegian University of Science and Technology and Norwegian Research Council, Central Norway Regional Health Authority, the Nord-Trondelag County Council and the Norwegian Institute of Public Health. The HERITAGE Family study was funded by US National Heart, Lung and Blood Institute grants (NIH/NHLBI, nos. R01HL146462, to M.A.S., and HL45670, to C.B.). All authors are grateful to all volunteers/participants in all of the cohorts, and to the general practitioners, other physicians and practice staff for assistance with recruitment. SomaScan assays and the Covance study were funded by SomaLogic, Inc. The authors also thank A. Lowell (leader of the SomaLogic assay team), D. Perry for the bioinformatics of quality control, J. Williams for the agreements with the study institutions and J. Zach for clinical data organization and management.

Author information

These authors contributed equally: Stephen A. Williams, Peter Ganz, Nicholas J. Wareham.

Authors and Affiliations

SomaLogic, Inc., Boulder, CO, USA
Stephen A. Williams, Leigh Alexander, Jessica Ash, Tim Bauer, Jessica Chadwick, Gargi Datta, Robert Kirk DeLisle, Yolanda Hagar, Michael Hinterberg, Rachel Ostroff & Sophie Weiss
Department of Epidemiology and Public Health, University College London, London, UK
Mika Kivimaki & Martin J. Shipley
MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
Claudia Langenberg & Nicholas J. Wareham
Institute of Cardiovascular Science, University College London, London, UK
Aroon D. Hingorani
University College London, British Heart Foundation Research Accelerator, London, UK
Aroon D. Hingorani
Health Data Research UK, London, UK
Aroon D. Hingorani
Massachusetts Veterans Epidemiology and Research Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, USA
J. P. Casas
Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, LA, USA
Claude Bouchard
HUNT Research Center and K. G. Jebsen Center for Genetic Epidemiology, Faculty of Medicine and Health Sciences, NTNU–Norwegian University of Science and Technology, Trondheim, Norway
Christian Jonasson
Department of Exercise Science, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
Mark A. Sarzynski
Division of Cardiology, Center of Excellence in Vascular Research, Zuckerberg San Francisco General Hospital, University of California San Francisco, San Francisco, CA, USA
Peter Ganz

Contributions

In an academic–industry partnership, SomaLogic, Inc. and the academic collaborators worked together on study design, interpretation of the data and preparation of the manuscript. S.A.W., P.G. and N.W. were responsible for designing, writing and final editing of the manuscript and responses to reviewer comments. In addition to all authors being generally involved in the program, specific contributions were as follows: M.K. and M.J.S. were accountable for the data from the Whitehall II study and advised on the study design for the CV and diabetes models. C.L. and N.W. were accountable for the data from the Fenland study and advised on the diabetes risk and behavioral models. C.B. and M.A.S. were accountable for the data from the HERITAGE Family study. C.J. was accountable for the data from the HUNT3 study. R.O. was accountable for the data from the Covance study. L.A., G.D., R.K.D., Y.H., M.H. and S.W. designed and executed the machine learning tactics and developed the models. R.O., J.A., T.B., J.C. and S.A.W. were responsible for the design and integration of the program across studies. A.D.H. and J.P.C. were particularly involved in the design, execution and interpretation of the CV risk evaluations.

Corresponding author

Correspondence to Stephen A. Williams.

Ethics declarations

Competing interests

The SomaLogic co-authors (S.A.W., L.A., J.A., T.B., J.C., G.D., R.K.D., Y.H., M.H., R.O. and S.W.) were/are all employees of SomaLogic, Inc., which has a commercial interest in the results. N.W. and C.L. declared that SomaLogic, Inc. has given a grant to the University of Cambridge. P.G. is a member of the SomaLogic Medical Advisory board, for which he receives no remuneration of any kind. The remaining authors (M.K., A.D.H., J.P.C., C.B., C.J., M.A.S. and M.J.S.) have no competing interests.

Additional information

Peer review information: Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Descriptors of parent studies and fractions used for model derivation and validation.

Solid black arrows designate how fractions of samples and clinical data were utilized independently; blue dashed arrows designate the validation of finalized models either in new fractions of the same dataset or in independent datasets. eGFR, estimated glomerular filtration rate; VO₂max, maximum rate of oxygen consumption; kg, kilograms. *For Fenland, the precise numbers available for the 70%/15%/15% fractions depended on the numbers of participants with data for each endpoint, as follows: n = 9,654 for self-reported alcohol units, n = 11,471 with DEXA scans for body composition, n = 10,077 with ultrasound for liver fat and n = 11,695 with individually calibrated heart rate and movement sensing for caloric expenditure due to physical activity. **For HERITAGE, the model was trained on the pre-training time point from half of the 523 participants and the post-training time point from the other half. The model was tested on samples from the opposite time points in the same participants and finally replicated in the 10% fraction not used for training.
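
To make the 70%/15%/15% fractions described in this legend concrete, the following is a minimal sketch of one way such a partition could be drawn. It is illustrative only and is not the study's actual procedure: the function name `split_70_15_15`, the fraction labels (`train`, `verify`, `holdout`), the use of a seeded random shuffle and the choice to let the last fraction absorb rounding are all assumptions.

```python
import random


def split_70_15_15(participant_ids, seed=0):
    """Shuffle IDs and partition them into roughly 70% / 15% / 15% fractions.

    Hypothetical helper for illustration; not taken from the study.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)  # fixed seed so the partition is reproducible
    rng.shuffle(ids)

    n_train = int(0.70 * len(ids))   # floor of 70%
    n_verify = int(0.15 * len(ids))  # floor of 15%

    train = ids[:n_train]
    verify = ids[n_train:n_train + n_verify]
    holdout = ids[n_train + n_verify:]  # remainder absorbs any rounding
    return train, verify, holdout


# Example with the number of Fenland participants reporting alcohol units.
train, verify, holdout = split_70_15_15(range(9654))
print(len(train), len(verify), len(holdout))
```

Because each endpoint has a different number of participants with data, a split like this would be drawn separately per endpoint, which is why the legend reports a different n for each.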

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary Tables 1–6.

Source data

Source Data Fig. 1

Statistical Source Data for 12 individual panels in Fig. 1

About this article

Cite this article

Williams, S.A., Kivimaki, M., Langenberg, C. et al. Plasma protein patterns as comprehensive indicators of health. Nat Med 25, 1851–1857 (2019). https://doi.org/10.1038/s41591-019-0665-2

Received: 20 June 2019
Accepted: 23 October 2019
Published: 02 December 2019
Issue Date: December 2019
DOI: https://doi.org/10.1038/s41591-019-0665-2
