A unified framework identifies new links between plasma lipids and diseases from electronic medical records across large-scale cohorts

Abstract

Plasma lipids are known heritable risk factors for cardiovascular disease, but increasing evidence also supports shared genetics with diseases of other organ systems. We devised a comprehensive three-phase framework to identify new lipid-associated genes and study the relationships among lipids, genotypes, gene expression and hundreds of complex human diseases from the Electronic Medical Records and Genomics (347 traits) and the UK Biobank (549 traits). Aside from 67 new lipid-associated genes with strong replication, we found evidence for pleiotropic SNPs/genes between lipids and diseases across the phenome. These include discordant pleiotropy in the HLA region between lipids and multiple sclerosis and putative causal paths between triglycerides and gout, among several others. Our findings give insights into the genetic basis of the relationship between plasma lipids and diseases on a phenome-wide scale and can provide context for future prevention and treatment strategies.

Fig. 1: Workflow to identify lipid-associated genes, suggestive pleiotropy between lipids and diseases and putative diseases for which lipids are modifiable exposures.
Fig. 2: Replication of lipid-associated genes across four cohorts from lipid TWAS.
Fig. 3: Comparison of results between Xpress-PheWAS and lipid-guided PheWAS in eMERGE and UKB.
Fig. 4: Lipid–disease pleiotropy from Xpress-PheWAS and colocalization in either eMERGE or the UKB.
Fig. 5: Concordant/discordant pleiotropy for SNPs that replicate in both eMERGE and the UKB for the same lipids/diseases.
Fig. 6: Protective/risk effect genes from Xpress-PheWAS and colocalization analyses that replicate in both eMERGE and the UKB for the same lipids/diseases.
Fig. 7: Two-sample univariable Mendelian randomization.
Fig. 8: Pictorial depiction of suggestive genetic mechanisms underlying the analyses conducted in this study.

Data availability

This project corresponds to UKB application ID 32133 and eMERGE Network phase III (dbGaP study accession no. phs001584.v1.p1). Lipid GWAS summary statistics for GLGC 2013 (ref. 3) are publicly available for download (http://csg.sph.umich.edu/willer/public/lipids2013/). Lipid GWAS summary statistics for GERA (ref. 5) are available via dbGaP (accession no. phs000674.v2.p2). Expression prediction models with LD reference data using MASHR are available on Zenodo (https://zenodo.org/record/3518299/files/mashr_eqtl.tar?download=1). GTEx Analysis Release v8 (dbGaP accession no. phs000424.v8.p2) is available for download via the GTEx Portal (https://gtexportal.org/home/datasets/). Summary statistics for lipid GWAS, lipid TWAS, lipid-guided PheWAS and Xpress-PheWAS generated in this study are available on Figshare (https://figshare.com/s/d62961bbc6c45c8dc2b0).

Code availability

Code for identifying LD-contaminated genes and detecting secondary independent associations at a locus is shared on GitHub (https://github.com/RitchieLab/Gene-level-statistical-colocalization/).

References

  1. Castelli, W. P. Cholesterol and lipids in the risk of coronary artery disease—the Framingham Heart Study. Can. J. Cardiol. 4, 5A–10A (1988).

  2. Kannel, W. B., Dawber, T. R., Kagan, A., Revotskie, N. & Stokes, J. Factors of risk in the development of coronary heart disease—six year follow-up experience. The Framingham Study. Ann. Intern. Med. 55, 33–50 (1961).

  3. Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).

  4. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).

  5. Hoffmann, T. J. et al. A large electronic-health-record-based genome-wide study of serum lipids. Nat. Genet. 50, 401–413 (2018).

  6. Klarin, D. et al. Genetics of blood lipids among ~300,000 multiethnic participants of the Million Veteran Program. Nat. Genet. 50, 1514–1523 (2018).

  7. Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue-specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).

  8. Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).

  9. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).

  10. Gottesman, O. et al. The Electronic Medical Records and Genomics (eMERGE) network: past, present and future. Genet. Med. 15, 761–771 (2013).

  11. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  12. González-Gay, M. A. & González-Juanatey, C. Inflammation and lipid profile in rheumatoid arthritis: bridging an apparent paradox. Ann. Rheum. Dis. 73, 1281–1283 (2014).

  13. Pietrzak, A., Michalak-Stoma, A., Chodorowska, G. & Szepietowski, J. C. Lipid disturbances in psoriasis: an update. Mediators Inflamm. 2010, 535612 (2010).

  14. Ference, B. A., Graham, I., Tokgozoglu, L. & Catapano, A. L. Impact of lipids on cardiovascular health. J. Am. Coll. Cardiol. 72, 1141–1156 (2018).

  15. Reale, M. & Sanchez-Ramon, S. Lipids at the cross-road of autoimmunity in multiple sclerosis. Curr. Med. Chem. 24, 176–192 (2017).

  16. Di Paolo, G. & Kim, T.-W. Linking lipids to Alzheimer’s disease: cholesterol and beyond. Nat. Rev. Neurosci. 12, 284–296 (2011).

  17. Chesmore, K., Bartlett, J. & Williams, S. M. The ubiquity of pleiotropy in human disease. Hum. Genet. 137, 39–44 (2018).

  18. Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).

  19. Sivakumaran, S. et al. Abundant pleiotropy in human complex diseases and traits. Am. J. Hum. Genet. 89, 607–618 (2011).

  20. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  21. Webb, T. R. et al. Systematic evaluation of pleiotropy identifies 6 further loci associated with coronary artery disease. J. Am. Coll. Cardiol. 69, 823–836 (2017).

  22. Andreassen, O. A. et al. Abundant genetic overlap between blood lipids and immune-mediated diseases indicates shared molecular genetic mechanisms. PLoS ONE 10, e0123057 (2015).

  23. Kim, Y. K. et al. Evaluation of pleiotropic effects among common genetic loci identified for cardio-metabolic traits in a Korean population. Cardiovasc. Diabetol. 15, 1–11 (2016).

  24. Ligthart, C. et al. Bivariate genome-wide association study identifies novel pleiotropic loci for lipids and inflammation. BMC Genomics 17, 443 (2016).

  25. Nikpay, M., Turner, A. W. & McPherson, R. Partitioning the pleiotropy between coronary artery disease and body mass index reveals the importance of low frequency variants and central nervous system-specific functional elements. Circ. Genom. Precis. Med. 11, e002050 (2018).

  26. Zhang, X. et al. Detecting potential pleiotropy across cardiovascular and neurological diseases using univariate, bivariate and multivariate methods on 43,870 individuals from the eMERGE network. Pac. Symp. Biocomput. 24, 272–283 (2019).

  27. Davey Smith, G. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).

  28. Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).

  29. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2013).

  30. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).

  31. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

  32. Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).

  33. Butler, R. The ICD-10 General Equivalence Mappings. Bridging the translation gap from ICD-9. J. AHIMA 78, 84–85 (2007).

  34. Xu, L. et al. An association study between genetic polymorphisms related to lipoprotein-associated phospholipase A2 and coronary heart disease. Exp. Ther. Med. 5, 742–750 (2013).

  35. Wolpin, B. M. et al. Prospective study of ABO blood type and the risk of pulmonary embolism in two large cohort studies. Thromb. Haemost. 104, 962–971 (2010).

  36. Hajizadeh, R., Kavandi, H., Nadiri, M. & Ghaffari, S. The association of ABO blood group with incidence and outcome of acute pulmonary embolism. Turk Kardiyol. Dern. Ars. 44, 397–403 (2016).

  37. Zhang, J., Zhao, Z., Guo, X., Guo, B. & Wu, B. Powerful statistical method to detect disease-associated genes using publicly available genome-wide association studies summary data. Genet. Epidemiol. 43, 941–951 (2019).

  38. Lumish, H. S., O’Reilly, M. P. & Reilly, M. P. Sex differences in genomic drivers of adipose distribution and related cardiometabolic disorders: opportunities for precision medicine. Arterioscler. Thromb. Vasc. Biol. 40, 45–60 (2020).

  39. Reshef, Y. A. et al. Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk. Nat. Genet. 50, 1483–1493 (2018).

  40. Cantuti-Castelvetri, L. et al. Defective cholesterol clearance limits remyelination in the aged central nervous system. Science 359, 684–688 (2018).

  41. Fard, M. K. et al. BCAS1 expression defines a population of early myelinating oligodendrocytes in multiple sclerosis lesions. Sci. Transl. Med. 9, eaam7816 (2017).

  42. Kung, J. T. Y., Colognori, D. & Lee, J. T. Long noncoding RNAs: past, present and future. Genetics 193, 651–669 (2013).

  43. Ginn, L., Shi, L., La Montagna, M. & Garofalo, M. LncRNAs in non-small-cell lung cancer. Noncoding RNA 6, 25 (2020).

  44. Zhong, R. et al. LINC01149 variant modulates MICA expression that facilitates hepatitis B virus spontaneous recovery but increases hepatocellular carcinoma risk. Oncogene 39, 1944–1956 (2020).

  45. Feng, X. & Yang, S. Long noncoding RNA LINC00243 promotes proliferation and glycolysis in non-small-cell lung cancer cells by positively regulating PDK4 through sponging miR-507. Mol. Cell. Biochem. 463, 127–136 (2020).

  46. Yu, X., Chen, H., Huang, S. & Zeng, P. Evaluation of the causal effects of blood lipid levels on gout with summary level GWAS data: two-sample Mendelian randomization and mediation analysis. J. Hum. Genet. 66, 465–473 (2021).

  47. Marien, E. et al. Non-small-cell lung cancer is characterized by dramatic changes in phospholipid profiles. Int. J. Cancer 137, 1539–1548 (2015).

  48. Eggers, L. F. et al. Lipidomes of lung cancer and tumour-free lung tissues reveal distinct molecular signatures for cancer differentiation, age, inflammation and pulmonary emphysema. Sci. Rep. 7, 11087 (2017).

  49. Tiwary, S. et al. Metastatic brain tumors disrupt the blood–brain barrier and alter lipid metabolism by inhibiting expression of the endothelial cell fatty acid transporter Mfsd2a. Sci. Rep. 8, 8267 (2018).

  50. Sun, H., Zhang, X., Shi, W. & Fang, B. Association of soft tissue infection in the extremity with glucose and lipid metabolism and inflammatory factors. Exp. Ther. Med. 17, 2535–2540 (2019).

  51. Gao, S., Cui, X., Wang, X., Burg, M. B. & Dmitrieva, N. I. Cross-sectional positive association of serum lipids and blood pressure with serum sodium within the normal reference range of 135–145 mmol/l. Arterioscler. Thromb. Vasc. Biol. 37, 598–606 (2017).

  52. Goldstein, I. et al. p53, a novel regulator of lipid metabolism pathways. J. Hepatol. 56, 656–662 (2012).

  53. Mäkinen, N. et al. Exome sequencing of uterine leiomyosarcomas identifies frequent mutations in TP53, ATRX and MED12. PLoS Genet. 12, e1005850 (2016).

  54. Parrales, A. & Iwakuma, T. p53 as a regulator of lipid metabolism in cancer. Int. J. Mol. Sci. 17, 2074 (2016).

  55. Veturi, Y. & Ritchie, M. D. How powerful are summary-based methods for identifying expression-trait associations under different genetic architectures? Pac. Symp. Biocomput. 23, 228–239 (2018).

  56. Olafsdottir, T. et al. Genome-wide association identifies seven loci for pelvic organ prolapse in Iceland and the UK Biobank. Commun. Biol. 3, 129 (2020).

  57. Wainberg, M. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019).

  58. Verma, S. S. et al. Imputation and quality-control steps for combining multiple genome-wide datasets. Front. Genet. 5, 370 (2014).

  59. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  60. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  61. Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in ~700,000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).

  62. Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283–285 (2015).

  63. Macarthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).

  64. Eicher, J. D. et al. GRASP v2.0: an update on the Genome-Wide Repository of Associations between SNPs and Phenotypes. Nucleic Acids Res. 43, 799–804 (2014).

  65. Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).

  66. Yavorska, O. O. & Burgess, S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int. J. Epidemiol. 46, 1734–1739 (2017).

  67. Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).

  68. anastasia-lucas/hudson. A Hudson Plot Package version 0.1.0. GitHub. https://rdrr.io/github/anastasia-lucas/hudson/. Accessed 5 March 2020.

  69. Conway, J. R., Lex, A. & Gehlenborg, N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33, 2938–2940 (2017).

  70. Zuguang Gu. Circlize R package. https://cran.r-project.org/web/packages/circlize/index.html (2019).

  71. Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).

Acknowledgements

Phase III of the eMERGE Network was initiated and funded by the NHGRI through the following grants: U01HG8657 (Group Health Cooperative/University of Washington); U01HG8685 (Brigham and Women’s Hospital); U01HG8672 (Vanderbilt University Medical Center); U01HG8666 (Cincinnati Children’s Hospital Medical Center); U01HG6379 (Mayo Clinic); U01HG8679 (Geisinger Clinic); U01HG8680 (Columbia University Health Sciences); U01HG8684 (Children’s Hospital of Philadelphia); U01HG8673 (Northwestern University); U01HG8701 (Vanderbilt University Medical Center serving as the Coordinating Center); U01HG8676 (Partners Healthcare/Broad Institute); and U01HG8664 (Baylor College of Medicine). For the UKB, all data for this cohort pertained to project 32133: ‘Integration of multi-organ imaging phenotypes, clinical phenotypes and genomic data’. Y.V., R.M.K., T.J.H., N.R., M.W.M., E.T. and M.D.R. acknowledge funding from the National Institutes of Health (NIH GM115318: Pharmacogenomics of Statin Therapy (POST)). Y.V. and M.D.R. also acknowledge NIH AI077505 (Pharmacogenomics of HIV Therapy). J.E.M. acknowledges NHGRI T32HG009495–01; C.M.S. acknowledges R35GM131770 (Pharmacogenetics to improve Drug Therapy). B.F.V. acknowledges NIH DK101478, NIH HG010067 and a Linda Pechenik Montague Investigator Award for their time on this project.

Author information

Contributions

Y.V. and M.D.R. conceptualized and designed the study. Y.V. conducted all statistical analyses. Y.V. and D.H. conducted phase III analyses. Y.V., A.L. and S.D. performed data visualization. Y.V., Y.B. and A.L. conducted phenotype curation. Y.V., M.D.R. and A.V. performed data acquisition for the UKB. H.H., P.S., I.K., D.S., C.M.S., D.R.V.E., Q.F. and W.-Q.W. performed data acquisition for eMERGE. T.J.H., N.R., R.M.K., M.W.M. and E.T. performed data acquisition for GERA. Y.V. and B.F.V. conceptualized phase III of this study. Y.V. and J.E.M. performed overrepresentation analysis. D.J.R. provided guidance for phases I and II. Y.V. and M.D.R. wrote the manuscript. All authors provided interpretation of the results and critical feedback on the manuscript.

Corresponding author

Correspondence to Marylyn D. Ritchie.

Ethics declarations

Competing interests

M.D.R. is on the scientific advisory board for Goldfinch Bio and Cipherome. D.J.R. serves on Scientific Advisory Boards for Alnylam, Novartis, Pfizer and Verve and is a founder of Staten Biotechnology. The other co-authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Case-control distribution for ICD codes.

Distribution of cases (blue) and controls (yellow) for the collapsed 3-digit ICD codes in eMERGE (top) and UKB (bottom). eMERGE has predominantly ICD-9 codes whereas UKB has predominantly ICD-10 codes.

Extended Data Fig. 2 Lipid GWAS in eMERGE.

Manhattan plots from GWAS (two-sided linear regression) conducted on the four plasma lipid traits (HDL-C, LDL-C, TC, TG) for the eMERGE cohort. Each plot shows chromosomes 1 to 22 on the x-axis and the -log(P) value on the y-axis.

Extended Data Fig. 3 Lipid GWAS in UKB.

Manhattan plots from GWAS (two-sided linear regression) conducted on the four plasma lipid traits (HDL-C, LDL-C, TC, TG) for the UKB cohort. Each plot shows chromosomes 1 to 22 on the x-axis and the -log(P) value on the y-axis.

Extended Data Fig. 4 Lipid TWAS P-values for novel lipid genes.

Synthesis-view plot indicating -log10(P) values for Bonferroni-significant ‘novel’ genes (two-sided gene-based tests: P < 5.57 × 10⁻⁷) from lipid TWAS. These genes passed the coloc P[H3] < 0.5 filter in at least one cohort. The direction of each triangle corresponds to the direction of the gene effect from TWAS (left-facing, negative; right-facing, positive). Colors indicate the five selected tissues from GTEx v8 (adipose subcutaneous, adipose visceral omentum, liver, small intestine terminal ileum, whole blood).

Extended Data Fig. 5 Colocalization probabilities of shared causal variant between lipids and gene expression for novel lipid genes.

Synthesis-view plot indicating coloc P[H4] for Bonferroni-significant ‘novel’ genes (two-sided gene-based tests: P < 5.57 × 10⁻⁷) obtained from lipid TWAS. These genes passed the coloc P[H3] < 0.5 filter in at least one cohort. The direction of each triangle corresponds to the direction of the gene effect from TWAS (left-facing, negative; right-facing, positive). Colors indicate the five selected tissues from GTEx v8 (adipose subcutaneous, adipose visceral omentum, liver, small intestine terminal ileum, whole blood). We present coloc results for all regions corresponding to a gene.

Extended Data Fig. 6 Overlap of detected ICD codes between cohorts.

UpSet plot indicating overlap of diseases (ICD codes) with Bonferroni-significant genes between PheWAS and Xpress-PheWAS conducted on eMERGE and UKB, respectively.

Extended Data Fig. 7 Overlap of significant SNPs between lipid GWAS and lipid-guided PheWAS across cohorts.

UpSet plot indicating overlap of GWAS-significant SNPs (Bonferroni threshold) between each of the four plasma lipids (HDL-C, LDL-C, TC, TG) aggregated across the four considered cohorts (eMERGE, GERA, GLGC, UKB) and lipid-guided PheWAS conducted in eMERGE and UKB, respectively.

Extended Data Fig. 8 Lipid-disease pleiotropy from lipid-guided PheWAS in either eMERGE or UKB.

Circos plot indicating Bonferroni-significant SNPs in either cohort (eMERGE or UKB) from lipid-guided PheWAS (two-sided logistic regression). Outer track, the number of SNPs detected in either cohort; inner track, significant ICD codes per disease category. Links, SNPs connecting lipids (in salmon) to diseases (in blue); link thickness, number of SNPs; link color, chromosome. Owing to the large number of SNP associations involved, this plot does not show associations (links) in the HLA region (chromosome 6).

Extended Data Fig. 9 Overlap of significant genes between lipid TWAS and Xpress-PheWAS across cohorts.

UpSet plot indicating overlap of detected Bonferroni-significant genes between lipid TWAS and Xpress-PheWAS conducted on eMERGE and UKB, respectively. Lipid TWAS genes have been split into two categories: (1) novel; (2) previously reported.

Extended Data Fig. 10 Effect sizes and confidence intervals from two-sample univariable Mendelian randomization analyses.

Mendelian randomization funnel plots depicting MR effect size (using two-sided IVW and Egger approaches) across ICD codes detected as FDR significant (excluding proof-of-concept diseases such as E78 Disorders of lipoprotein metabolism and other lipidemias and I10 Essential primary hypertension; see Fig. 7 for a full list of FDR-significant diseases). Top 5 plots: exposure dataset (lipid), GERA; outcome dataset, UKB. Remaining plots: exposure dataset (lipid), UKB; outcome dataset, eMERGE.
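For readers unfamiliar with the two estimators named in this caption, the following is a minimal sketch of two-sample MR from summary statistics: a fixed-effect inverse-variance-weighted (IVW) estimate and an MR-Egger weighted regression with an intercept. It is an illustration only and not the authors' pipeline (the reference list points to the MendelianRandomization R package and MR-Base); the function names and the example numbers are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate: weighted regression through the origin.

    Inputs are per-SNP effects on the exposure (e.g. a lipid), per-SNP effects
    on the outcome (e.g. a disease) and the standard errors of the latter.
    Returns (causal effect estimate, standard error).
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den, math.sqrt(1.0 / den)


def egger_estimate(beta_exposure, beta_outcome, se_outcome):
    """MR-Egger: weighted regression of outcome effects on exposure effects
    with an intercept. Returns (slope, intercept); a non-zero intercept
    suggests directional horizontal pleiotropy.
    """
    # Orient every SNP so that its exposure effect is positive.
    x, y = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = -1.0 if bx < 0 else 1.0
        x.append(sign * bx)
        y.append(sign * by)
    w = [1.0 / se ** 2 for se in se_outcome]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx ** 2)
    intercept = (sy - slope * sx) / sw
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical summary statistics for three instruments (not study data).
    bx = [0.10, 0.20, 0.30]
    by = [0.05, 0.10, 0.15]
    se = [0.01, 0.02, 0.01]
    print(ivw_estimate(bx, by, se))
    print(egger_estimate(bx, [0.15, 0.20, 0.25], se))
```

The IVW estimator forces the regression through the origin, which assumes no directional pleiotropy; MR-Egger relaxes that assumption by estimating the intercept, at the cost of precision.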

Cite this article

Veturi, Y., Lucas, A., Bradford, Y. et al. A unified framework identifies new links between plasma lipids and diseases from electronic medical records across large-scale cohorts. Nat Genet 53, 972–981 (2021). https://doi.org/10.1038/s41588-021-00879-y
