A genome-wide association study (GWAS) of 94,674 ancestrally diverse Kaiser Permanente members using 478,866 longitudinal electronic health record (EHR)-derived measurements for untreated serum lipid levels yielded multiple new findings: 121 new SNP associations (46 primary, 15 conditional, and 60 in meta-analysis with Global Lipids Genetic Consortium data); an increase of 33–42% in variance explained with multiple measurements; sex differences in genetic impact (greater impact in females for LDL, HDL, and total cholesterol and the opposite for triglycerides); differences in variance explained among non-Hispanic whites, Latinos, African Americans, and East Asians; genetic dominance and epistatic interaction, with strong evidence for both at the ABO and FUT2 genes for LDL; and tissue-specific enrichment of GWAS-associated SNPs among liver, adipose, and pancreas eQTLs. Using EHR pharmacy data, both LDL and triglyceride genetic risk scores (477 SNPs) were strongly predictive of age at initiation of lipid-lowering treatment. These findings highlight the value of longitudinal EHRs for identifying new genetic features of cholesterol and lipoprotein metabolism with implications for lipid treatment and risk of coronary heart disease.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Acknowledgements

We are grateful to the Kaiser Permanente Northern California members who have generously agreed to participate in the Kaiser Permanente Research Program on Genes, Environment, and Health. We would like to thank R. Dobrin for his contribution of the liver and adipose eQTL results table from the RYGB cohort. This work was supported by NIH P50 GM115318 to R.M.K., which partially supported T.H., E.T., M.W.M., C.I., C.S., E.J., R.M.K., and N.R. This work was supported by grants R21 AG046616 and K01 DC013300 to T.J.H. from the US National Institutes of Health for imputation. Support for participant enrollment, survey completion, and biospecimen collection for the RPGEH was provided by the Robert Wood Johnson Foundation, the Wayne and Gladys Valley Foundation, the Ellison Medical Foundation, and Kaiser Permanente national and regional community benefit programs. Genotyping of the GERA cohort was funded by a grant from the National Institute on Aging, the National Institute of Mental Health, and the National Institutes of Health Common Fund (RC2 AG036607 to C.S. and N.R.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information


  1. Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA

    Thomas J. Hoffmann, Tanushree Haldar, Mark N. Kvale, Pui-Yan Kwok & Neil Risch

  2. Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA

    Thomas J. Hoffmann & Neil Risch

  3. Children’s Hospital Oakland Research Institute, Oakland, CA, USA

    Elizabeth Theusch, Marisa W. Medina & Ronald M. Krauss

  4. Division of Research, Kaiser Permanente, Northern California, Oakland, CA, USA

    Dilrini K. Ranatunga, Eric Jorgenson, Catherine Schaefer, Carlos Iribarren & Neil Risch




Contributions

T.J.H., E.T., T.H., E.J., M.W.M., C.S., R.M.K., C.I., and N.R. conceived and designed the study. P.-Y.K. supervised the creation of genotype data. D.K.R., in collaboration with C.I., C.S., N.R., and T.J.H., extracted phenotype data from the EHRs. T.J.H., E.T., T.H., D.K.R., and N.R. performed the statistical analysis. T.J.H., E.T., T.H., D.K.R., E.J., M.W.M., C.S., R.M.K., C.I., and N.R. interpreted the results of the analysis. T.J.H., E.T., T.H., D.K.R., E.J., M.W.M., M.N.K., P.-Y.K., C.S., R.M.K., C.I., and N.R. contributed to the drafting and critical review of the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Thomas J. Hoffmann or Neil Risch.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1 and 3–14, and Supplementary Tables 1, 2, 10, 11, and 13–17

  2. Life Sciences Reporting Summary

  3. Supplementary Figure 2

    Plots of each genetic locus

  4. Supplementary Table 3

    Novel GERA lipid results

  5. Supplementary Table 4

    GERA results for previously identified loci

  6. Supplementary Table 6

    Novel GERA + GLGC lipid results

  7. Supplementary Table 8

    Conditional SNP results

  8. Supplementary Table 12

    Discovery was done in the fixed-effects meta-analysis (n = 94,674) of GERA non-Hispanic whites (n = 76,627), Latinos (n = 7,795), East Asians (n = 6,855), African Americans (n = 2,958), and South Asians (n = 439), each using linear regression

  9. Supplementary Tables 5, 7 and 9