Abstract
A genome-wide association study (GWAS) of 94,674 ancestrally diverse Kaiser Permanente members using 478,866 longitudinal electronic health record (EHR)-derived measurements for untreated serum lipid levels empowered multiple new findings: 121 new SNP associations (46 primary, 15 conditional, and 60 in meta-analysis with Global Lipids Genetic Consortium data); an increase of 33–42% in variance explained with multiple measurements; sex differences in genetic impact (greater impact in females for LDL, HDL, and total cholesterol and the opposite for triglycerides); differences in variance explained among non-Hispanic whites, Latinos, African Americans, and East Asians; genetic dominance and epistatic interaction, with strong evidence for both at the ABO and FUT2 genes for LDL; and tissue-specific enrichment of GWAS-associated SNPs among liver, adipose, and pancreas eQTLs. Using EHR pharmacy data, both LDL and triglyceride genetic risk scores (477 SNPs) were strongly predictive of age at initiation of lipid-lowering treatment. These findings highlight the value of longitudinal EHRs for identifying new genetic features of cholesterol and lipoprotein metabolism with implications for lipid treatment and risk of coronary heart disease.
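A genetic risk score of the kind mentioned above is commonly computed as a weighted sum of effect-allele dosages across SNPs. The sketch below illustrates that general idea only. It assumes per-SNP weights are effect-size estimates and dosages lie between 0 and 2; the function name, the toy weights, and the toy dosages are hypothetical and are not taken from the study, whose exact scoring procedure is not described in this abstract.

```python
from typing import Sequence


def genetic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele counts (0..2, possibly fractional if imputed), one per SNP.
    weights: per-allele effect-size estimates, in the same SNP order (assumed input).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical toy data: three SNPs with made-up effect sizes (arbitrary units).
toy_weights = [0.10, -0.05, 0.20]
toy_dosages = [2.0, 1.0, 0.0]
toy_score = genetic_risk_score(toy_dosages, toy_weights)
```

In practice such a score would be computed for every individual and then used as a predictor, for example in a time-to-event model for age at treatment initiation.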
Acknowledgements
We are grateful to the Kaiser Permanente Northern California members who have generously agreed to participate in the Kaiser Permanente Research Program on Genes, Environment, and Health. We would like to thank R. Dobrin for his contribution of the liver and adipose eQTL results table from the RYGB cohort. This work was supported by NIH P50 GM115318 to R.M.K., which partially supported T.H., E.T., M.W.M., C.I., C.S., E.J., R.M.K., and N.R. This work was supported by grants R21 AG046616 and K01 DC013300 to T.J.H. from the US National Institutes of Health for imputation. Support for participant enrollment, survey completion, and biospecimen collection for the RPGEH was provided by the Robert Wood Johnson Foundation, the Wayne and Gladys Valley Foundation, the Ellison Medical Foundation, and Kaiser Permanente national and regional community benefit programs. Genotyping of the GERA cohort was funded by a grant from the National Institute on Aging, the National Institute of Mental Health, and the National Institutes of Health Common Fund (RC2 AG036607 to C.S. and N.R.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Contributions
T.J.H., E.T., T.H., E.J., M.W.M., C.S., R.M.K., C.I., and N.R. conceived and designed the study. P.-Y.K. supervised the creation of genotype data. D.K.R., in collaboration with C.I., C.S., N.R., and T.J.H., extracted phenotype data from the EHRs. T.J.H., E.T., T.H., D.K.R., and N.R. performed the statistical analysis. T.J.H., E.T., T.H., D.K.R., E.J., M.W.M., C.S., R.M.K., C.I., and N.R. interpreted the results of analysis. T.J.H., E.T., T.H., D.K.R., E.J., M.W.M., M.N.K., P.-Y.K., C.S., R.M.K., C.I., and N.R. contributed to the drafting and critical review of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1 and 3–14, and Supplementary Tables 1, 2, 10, 11, and 13–17
Supplementary Figure 2
Plots of each genetic locus
Supplementary Table 3
Novel GERA lipid results
Supplementary Table 4
GERA results for previously identified loci
Supplementary Table 6
Novel GERA + GLGC lipid results
Supplementary Table 8
Conditional SNP results
Supplementary Table 12
Discovery was done in the fixed-effects meta-analysis (n = 94,674) of GERA non-Hispanic whites (n = 76,627), Latinos (n = 7,795), East Asians (n = 6,855), African Americans (n = 2,958), and South Asians (n = 439), each analyzed using linear regression.
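As a rough illustration of how per-group regression estimates can be combined in a fixed-effects meta-analysis, the sketch below uses inverse-variance weighting, a common choice. The legend above does not state which weighting scheme was used, so this is an assumption. The function name and the toy effect sizes and standard errors are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effects_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance-weighted fixed-effects pooling of per-group estimates.

    betas: per-group effect estimates for one SNP (e.g., from linear regression).
    ses:   matching standard errors (must be positive).
    Returns (pooled_beta, pooled_se).
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]  # weight = 1 / variance
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Hypothetical toy input: two groups with equal precision.
toy_beta, toy_se = fixed_effects_meta([1.0, 3.0], [1.0, 1.0])
```

With equal standard errors the pooled estimate is the simple mean of the group estimates; groups with smaller standard errors (typically the larger groups) would otherwise dominate the pooled value.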
About this article
Cite this article
Hoffmann, T.J., Theusch, E., Haldar, T. et al. A large electronic-health-record-based genome-wide study of serum lipids. Nat Genet 50, 401–413 (2018). https://doi.org/10.1038/s41588-018-0064-5
DOI: https://doi.org/10.1038/s41588-018-0064-5