Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies

Abstract

Improved understanding of genetic regulation of the proteome can facilitate identification of the causal mechanisms for complex traits. We analyzed data on 4,657 plasma proteins from 7,213 European American (EA) and 1,871 African American (AA) individuals from the Atherosclerosis Risk in Communities study, and further replicated findings in 467 AA individuals from the African American Study of Kidney Disease and Hypertension. We identified 2,004 proteins in EA and 1,618 in AA, with most overlapping, that showed associations with common variants in cis-regions. Availability of AA samples led to smaller credible sets and a notable number of population-specific cis-protein quantitative trait loci (cis-pQTLs). Elastic Net produced powerful models for protein prediction in both populations. An application of proteome-wide association studies (PWAS) to serum urate and gout implicated several proteins, including IL1RN, revealing the promise of the drug anakinra to treat acute gout flares. Our study demonstrates the value of large and ancestrally diverse studies for investigating the genetic mechanisms of molecular phenotypes and their relationship with complex traits.

Fig. 1: cis-pQTL analysis.
Fig. 2: Fine-mapping analysis.
Fig. 3: cis-h2 and evaluation of models for genetic prediction of proteins.
Fig. 4: Miami plots for PWAS and TWAS analyses for serum urate level and gout.

Data availability

Genome-wide summary-level statistics for all single-SNP cis-pQTL analyses, irrespective of significance level, and the data required to perform PWAS are available from http://nilanjanchatterjeelab.org/pwas. For individual-level plasma protein data, pre-existing data access policies for each of the parent cohort studies (ARIC and AASK) specify that research data requests can be submitted to each steering committee; these will be reviewed promptly for confidentiality or intellectual property restrictions and will not unreasonably be refused. Please refer to the data-sharing policies of these studies. Individual-level patient or protein data may further be restricted by consent, confidentiality or privacy laws/considerations. These policies apply to both clinical and proteomic data. The CKDGen Consortium makes all data reported in its original publications publicly available (https://ckdgen.imbi.uni-freiburg.de/). For European-specific gout GWAS data, further data requests can be submitted to the CKDGen steering committee; these will be reviewed promptly for confidentiality or intellectual property restrictions and will not unreasonably be refused. GRCh38 reference genome data from the 1000 Genomes Project Phase 3 are available from https://www.internationalgenome.org/data. Access to UK Biobank individual-level data can be requested from https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. Gene expression imputation models previously built from GTEx V7 data, and the data required to perform TWAS, are available from http://gusevlab.org/projects/fusion/#reference-functional-data (accessed 28 July 2021); models based on GTEx V8 are available on request from N. Mancuso and A. Gusev. cis-eQTL summary statistics are available from https://gtexportal.org/home/. VEP was obtained from https://useast.ensembl.org/index.html. The Therapeutic Target Database was downloaded from http://db.idrblab.net/ttd/full-data-download. Source data are provided with this paper.

Code availability

All custom code used for the data analyses in this paper, including protein data cleaning, cis-pQTL mapping and building PWAS models, is available at Zenodo (https://zenodo.org/record/6332981; https://doi.org/10.5281/zenodo.6332981). Example code to perform PWAS using external GWAS data is available from http://nilanjanchatterjeelab.org/pwas. Most of our statistical analysis was performed using R v.3.6.1 and R v.4.0.2, and the R packages biomaRt v.2.42.1, peer v.1.0, plink2R v.1.1, glmnet v.4.0, ggplot2 v.3.3.3, gaston v.1.5.6, GGally v.2.0.0, ggpubr v.0.4.0, readr v.1.3.1, bigreadr v.0.2.0, readxl v.1.3.1, xlsx v.0.6.3, dplyr v.1.0.4, stringr v.1.4.0 and latex2exp v.0.4.0. cis-pQTL mapping was performed using QTLtools v.1.2 (Binary CentOS 7.8). Processing of the publicly available summary-level statistics and analyses of genotype data were performed using PLINK v.2.0 and PLINK v.1.9. cis-h2 analysis was performed using GCTA v.1.93.0 beta. Plasma protein imputation models were trained using FUSION, available from https://github.com/gusevlab/fusion_twas. Downstream analyses, including enrichment and colocalization, were performed using VEP (v.85), TORUS (https://github.com/xqwen/torus) and coloc v.3.2.1. Fine-mapping was performed using SuSiE v.0.11.42 for ancestry-specific analysis, and MANTRA [1.0; February 2012] (available on request from A.P. Morris) for transancestry analysis.

Acknowledgements

The ARIC study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health (NIH), Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I). The authors thank the staff and participants of the ARIC study for their important contributions. SomaLogic Inc. conducted the SomaScan assays in exchange for use of ARIC data. This work was supported in part by NIH/NHLBI grant R01 HL134320. The UK Biobank data was obtained under the UK Biobank resource application 17712. Research of J.Z., D.D. and N.C. was supported by grant R01 from the National Human Genome Research Institute [1 R01 HG010480-01]. B.H. was supported by the Bloomberg Distinguished Professorship Endowment fund available to N.C. The work of A.K. was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Project-ID 431984000—SFB 1453. The work of P.S. was funded by the EQUIP Program for Medical Scientists, Faculty of Medicine, University of Freiburg. The work of A.T. was funded by R01 AR073178. The work of J.C. and E.B. was funded by the ARIC contract. The work of M.G. and J.C. was funded by the multiomics grant R01 DK124399. The work of B.Y. was funded by HL148218. We acknowledge N. Mancuso and A. Gusev for providing preliminary TWAS models built with GTEx V8 data.

Author information

Contributions

J.Z., J.C. and N.C. conceived the project. J.Z. and D.D. carried out all data analyses with supervision from N.C. B.H. developed online resources for data visualization and sharing. J.Z., D.D., A.K. and N.C. drafted the manuscript, and A.T., P.S., M.E.G., B.Y. and E.B. provided comments. All authors reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Nilanjan Chatterjee.

Ethics declarations

Competing interests

Proteomic assays in ARIC were conducted free of charge as part of a data exchange agreement with SomaLogic. The authors declare no other competing interests.

Peer review

Peer review information

Nature Genetics thanks Anders Mälarstig and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Cis-pQTLs’ effect sizes across two populations.

Effect sizes for common (MAF > 1%) sentinel cis-pQTLs across EA and AA populations. Each dot represents a common sentinel SNP detected in either the EA (left panel) or the AA population (right panel). The x-axis shows the effect size in the population in which the cis-pQTL was identified, and the y-axis shows the effect size in the other population. Minor allele frequencies (MAF) were checked for some outliers; those corresponding to a large difference in allele frequency across populations are marked in orange. The red line is the diagonal.

Source data

Extended Data Fig. 2 Overlap and colocalization of cis-pQTLs and cis-eQTLs.

(a) Proportion of sentinel cis-pQTLs in EA (including their LD proxies; SNPs with LD > 0.8) that are identified as cis-eQTLs across 49 different tissues in GTEx V8. Results are ordered by the size of overlap. (b) Proportion of SOMAmers showing high colocalization probability (PP.H4 > 0.8) of underlying cis-pQTLs and cis-eQTLs in the same gene across tissues in GTEx V8. Results are ordered by the size of overlap reported in (a) for ease of comparison.

Source data

Extended Data Fig. 3 Cis-pQTLs tended to be significant cis-eQTLs across multiple tissues.

Distribution of number of tissues with significant cis-eQTL effects in GTEx V8 for the sentinel cis-pQTLs (and SNPs in high LD) (blue) compared to that of cis-eQTLs in GTEx V8 irrespective of their cis-pQTL status (red). Sentinel cis-pQTLs are restricted to those which show cis-eQTL effect in at least one tissue. cis-eQTL effects are evaluated for the same underlying genes for which significant cis-pQTLs are detected.

Source data

Extended Data Fig. 4 Functional enrichment.

Functional enrichment of all sentinel cis-pQTLs and SNPs in high LD with them (r2 > 0.8) for EA (a) and AA (b). Functional enrichment of sentinel cis-pQTLs that have effects independent of protein-altering variants is shown for EA (c) and AA (d). The red dots denote the estimated log2-enrichment statistic, and the black lines represent the corresponding 95% confidence intervals using TORUS (see Methods for details). Sample sizes for the EA and AA populations are n=7,213 and 1,871, respectively.

Source data

Extended Data Fig. 5 Cis-heritability comparison between gene expression and plasma protein levels.

Comparison of cis-heritability (cis-h2) estimates of plasma protein (P) and gene expression (T) for a common set of overlapping genes. For each population, the overlap is defined by the set of genes that have significant cis-h2 for both plasma protein and gene expression in the given tissue (liver and whole blood) in GTEx (a) V7 and (b) V8. Sample sizes for the EA and AA populations are n=7,213 and 1,871, respectively. In boxplots, the boxes span the first to third quartiles, with the median marked inside, and the whiskers extend to 1.5 times the interquartile range from the box boundaries. Figures are truncated in the y-axis at cis-h2=0 and 0.5 for better display.

Source data

Extended Data Fig. 6 Correlation between imputed gene expression and measured plasma protein levels in ARIC EA samples.

Measured plasma protein levels are pre-processed by inverse-rank normalization and adjusted for covariates and 90 PEER factors. Gene expression imputation models for TWAS analyses across all tissues are built based on GTEx V7 datasets (see Supplementary Table 13 for available sample sizes). The imputation models for plasma proteins are built based on n=7,213 EA individuals in the ARIC study. In boxplots, the boxes span the first to third quartiles, with the median marked inside, and the whiskers extend to 1.5 times the interquartile range from the box boundaries. The figure is truncated in the y-axis at correlation = −0.15 and 0.45 for better display.
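As a minimal sketch of the inverse-rank normalization step mentioned in this caption, the function below maps each value to a standard normal quantile according to its rank. The function name, the rank offset of 0.5 and the use of average ranks for ties are assumptions made for illustration; the caption does not state which variant the authors used, and the adjustment for covariates and PEER factors is not shown.

```python
from statistics import NormalDist


def inverse_rank_normalize(values, offset=0.5):
    """Map values to standard normal quantiles by rank.

    Tied values share their average rank. The offset (0.5 here) is an
    illustrative choice, not a value taken from the paper.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # (rank - offset) / n lies strictly between 0 and 1 for offset = 0.5.
    return [nd.inv_cdf((r - offset) / n) for r in ranks]


protein_levels = [3.2, 1.1, 2.5]  # toy values, not real data
normalized = inverse_rank_normalize(protein_levels)
```

The transform keeps the ordering of the measurements while forcing their marginal distribution toward a standard normal, which limits the influence of extreme values in downstream linear models.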

Source data

Extended Data Fig. 7 Control of type-1 error of PWAS.

A quantile-quantile plot of p-values is shown for a continuous phenotype that is simulated under the null hypothesis of no genetic association for unrelated European ancestry individuals in the UK Biobank study (n=337,484). Results are based on two-sided z-tests of association between the cis-genetic regulated plasma protein level and the simulated null trait. The red diagonal line represents expected p-values under the null hypothesis of no genetic association, and the 95% confidence band, which is calculated based on standard errors of order statistics under normal approximation, represents regions of uncertainty in the q-q plot under the null hypothesis of no association.
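One way to obtain the expected line and an approximate 95% band for such a q-q plot is sketched below. It assumes independent null p-values, so that the k-th smallest of n follows a Beta(k, n − k + 1) distribution, and takes the band as the mean plus or minus 1.96 standard deviations of that distribution. The function name and this exact construction are assumptions; the caption only states that the band relies on standard errors of order statistics under a normal approximation.

```python
import math


def qq_null_band(n, z=1.96):
    """Expected -log10(p) and an approximate 95% band for n null p-values.

    Returns a list of (expected, lower, upper) tuples on the -log10 scale,
    ordered from the smallest expected p-value to the largest.
    """
    rows = []
    for k in range(1, n + 1):
        mean = k / (n + 1)  # mean of Beta(k, n - k + 1)
        sd = math.sqrt(k * (n - k + 1) / ((n + 1) ** 2 * (n + 2)))
        lo = max(mean - z * sd, 1e-300)  # floor avoids log10 of a non-positive value
        hi = min(mean + z * sd, 1.0)
        rows.append((-math.log10(mean), -math.log10(hi), -math.log10(lo)))
    return rows


band = qq_null_band(1000)
```

The normal approximation is crude for the very smallest p-values, where the lower limit can hit the floor; an exact band from Beta quantiles would be a natural alternative in that region.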

Source data

Extended Data Fig. 8 PWAS of serum urate level and gout.

Quantile-quantile plots of PWAS p-values obtained from two-sided z-tests of association between the cis-genetic regulated plasma protein levels and the trait of interest, serum urate level (n=288,649) and gout (n=754,056). The diagonal lines represent expected p-values under the null hypothesis of no genetic association, and the 95% confidence bands, which are calculated based on standard errors of order statistics under normal approximation, represent regions of uncertainty in the q-q plot under the null hypothesis of no association.
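The two-sided z-test behind these p-values can be sketched from summary statistics as follows. The sketch assumes the standard summary-statistic form used by TWAS software such as FUSION, z = wᵀz_GWAS / sqrt(wᵀRw), where w holds the SNP weights of a protein prediction model, z_GWAS the GWAS z-scores and R the LD correlation matrix. The function name and toy inputs are hypothetical, and this is not the authors' code.

```python
import math
from statistics import NormalDist


def pwas_z_test(weights, gwas_z, ld):
    """Return (z, two-sided p) for one protein.

    weights : per-SNP weights of the protein prediction model
    gwas_z  : GWAS z-scores for the same SNPs, in the same order and
              aligned to the same effect alleles
    ld      : SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    m = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null: w' R w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    if variance <= 0:
        raise ValueError("weights and LD matrix give a non-positive variance")
    z = numerator / math.sqrt(variance)
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


# Toy inputs for illustration only.
z_stat, p_value = pwas_z_test(
    weights=[0.5, -0.2],
    gwas_z=[3.0, -1.0],
    ld=[[1.0, 0.3], [0.3, 1.0]],
)
```

With a single SNP and unit weight, the statistic reduces to the GWAS z-score itself, which is a convenient sanity check when wiring up real inputs.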

Source data

Extended Data Fig. 9 PWAS identify repurposing opportunity for anakinra to treat gout.

The blue particle is interleukin-1 (IL-1), which produces the proinflammatory effect of interleukin-1 signaling. The green particle is interleukin-1 receptor antagonist protein (IL1RN), which competes for binding but does not lead to a signal. The red particle is anakinra, which has the same shape as IL1RN and can also bind to IL1R1 without eliciting a signal. Anakinra is a synthetic drug that mimics the function of the natural protein IL1RN. It is approved for treating rheumatoid arthritis. Our study shows that genetically higher IL1RN levels are associated with protection from gout. This suggests that anakinra may also be effective in treating gout (repurposing). The plot was created with BioRender.com.

Extended Data Fig. 10 Top five genetic principal components (PC) of ARIC data.

Genetic PCs represent the major population structure in the aggregated sample of EA (blue) and AA (green) populations, colored by self-reported ancestry.

Source data

Supplementary information

Source data

Source Data Fig. 1

Statistical Source Data.

Source Data Fig. 2

Statistical Source Data.

Source Data Fig. 3

Statistical Source Data.

Source Data Fig. 4

Statistical Source Data.

Source Data Extended Data Fig. 1

Statistical Source Data.

Source Data Extended Data Fig. 2

Statistical Source Data.

Source Data Extended Data Fig. 3

Statistical Source Data.

Source Data Extended Data Fig. 4

Statistical Source Data.

Source Data Extended Data Fig. 5

Statistical Source Data.

Source Data Extended Data Fig. 6

Statistical Source Data.

Source Data Extended Data Fig. 7

Statistical Source Data.

Source Data Extended Data Fig. 8

Statistical Source Data.

Source Data Extended Data Fig. 10

Statistical Source Data.

About this article

Cite this article

Zhang, J., Dutta, D., Köttgen, A. et al. Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies. Nat Genet 54, 593–602 (2022). https://doi.org/10.1038/s41588-022-01051-w

  • DOI: https://doi.org/10.1038/s41588-022-01051-w
