Systematic evaluation of coding variation identifies a candidate causal variant in TM6SF2 influencing total cholesterol and myocardial infarction risk


Blood lipid levels are heritable, treatable risk factors for cardiovascular disease. We systematically assessed genome-wide coding variation to identify new genes influencing lipid traits, fine map known lipid loci and evaluate whether low-frequency variants with large effects exist for these traits. Using an exome array, we genotyped 80,137 coding variants in 5,643 Norwegians. We followed up 18 variants in 4,666 Norwegians and identified ten loci with coding variants associated with a lipid trait (P < 5 × 10^−8). One variant in TM6SF2 (encoding p.Glu167Lys), residing in a known genome-wide association study locus for lipid traits, influences total cholesterol levels and is associated with myocardial infarction. Transient TM6SF2 overexpression or knockdown of Tm6sf2 in mice alters serum lipid profiles, consistent with the association observed in humans, identifying TM6SF2 as a functional gene within a locus previously known as NCAN-CILP2-PBX4 or 19p13. This study demonstrates that systematic assessment of coding variation can quickly point to a candidate causal gene.

Figure 1: Power estimates for the current study compared to the estimated effect sizes for coding variants and GWAS index SNPs.
Figure 2: Functional follow up in C57BL/6J mice implicates TM6SF2 in lipid metabolism.

Accession codes


NCBI Reference Sequence


  1. Go, A.S. et al. Heart disease and stroke statistics—2013 update: a report from the American Heart Association. Circulation 127, e6–e245 (2013).

  2. The Lipid Research Clinics Coronary Primary Prevention Trial results. II. The relationship of reduction in incidence of coronary heart disease to cholesterol lowering. J. Am. Med. Assoc. 251, 365–374 (1984).

  3. Shen, L., Peng, H., Xu, D. & Zhao, S. The next generation of novel low-density lipoprotein cholesterol-lowering agents: proprotein convertase subtilisin/kexin 9 inhibitors. Pharmacol. Res. 73, 27–34 (2013).

  4. Willer, C.J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).

  5. Teslovich, T.M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).

  6. Pilia, G. et al. Heritability of cardiovascular and personality traits in 6,148 Sardinians. PLoS Genet. 2, e132 (2006).

  7. Manolio, T.A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).

  8. Cirulli, E.T. & Goldstein, D.B. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat. Rev. Genet. 11, 415–425 (2010).

  9. Eichler, E.E. et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat. Rev. Genet. 11, 446–450 (2010).

  10. 1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

  11. 1000 Genomes Project Consortium et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  12. Huyghe, J.R. et al. Exome array analysis identifies new loci and low-frequency variants influencing insulin processing and secretion. Nat. Genet. 45, 197–201 (2013).

  13. Jostins, L., Morley, K.I. & Barrett, J.C. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets. Eur. J. Hum. Genet. 19, 662–666 (2011).

  14. Musunuru, K. & Kathiresan, S. HapMap and mapping genes for cardiovascular disease. Circ. Cardiovasc. Genet. 1, 66–71 (2008).

  15. Sanna, S. et al. Fine mapping of five loci associated with low-density lipoprotein cholesterol detects variants that double the explained heritability. PLoS Genet. 7, e1002198 (2011).

  16. Nejentsev, S., Walker, N., Riches, D., Egholm, M. & Todd, J.A. Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324, 387–389 (2009).

  17. Willer, C.J. et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat. Genet. 40, 161–169 (2008).

  18. Kathiresan, S. et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat. Genet. 40, 189–197 (2008).

  19. Krokstad, S. et al. Cohort profile: the HUNT Study, Norway. Int. J. Epidemiol. 42, 968–977 (2013).

  20. Jacobsen, B.K., Eggen, A.E., Mathiesen, E.B., Wilsgaard, T. & Njolstad, I. Cohort profile: the Tromso Study. Int. J. Epidemiol. 41, 961–967 (2012).

  21. Golledge, J. et al. Apolipoprotein E genotype is associated with serum C-reactive protein but not abdominal aortic aneurysm. Atherosclerosis 209, 487–491 (2010).

  22. Hegele, R.A. et al. A hepatic lipase gene mutation associated with heritable lipolytic deficiency. J. Clin. Endocrinol. Metab. 72, 730–732 (1991).

  23. Nelis, M. et al. Genetic structure of Europeans: a view from the North-East. PLoS ONE 4, e5472 (2009).

  24. Hofman, A., Grobbee, D.E., de Jong, P.T. & van den Ouweland, F.A. Determinants of disease and disability in the elderly: the Rotterdam Elderly Study. Eur. J. Epidemiol. 7, 403–422 (1991).

  25. Gottesman, O. et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genet. Med. 15, 761–771 (2013).

  26. Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).

  27. Inatani, M. et al. Upregulated expression of neurocan, a nervous tissue specific proteoglycan, in transient retinal ischemia. Invest. Ophthalmol. Vis. Sci. 41, 2748–2754 (2000).

  28. Schunkert, H. et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat. Genet. 43, 333–338 (2011).

  29. Speliotes, E.K. et al. Genome-wide association analysis identifies variants associated with nonalcoholic fatty liver disease that have distinct effects on metabolic traits. PLoS Genet. 7, e1001324 (2011).

  30. Li, B. & Leal, S.M. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321 (2008).

  31. Lee, S., Wu, M.C. & Lin, X. Optimal tests for rare variant effects in sequencing association studies. Biostatistics 13, 762–775 (2012).

  32. Cholesterol Treatment Trialists' (CTT) Collaborators. The effects of lowering LDL cholesterol with statin therapy in people at low risk of vascular disease: meta-analysis of individual data from 27 randomised trials. Lancet 380, 581–590 (2012).

  33. Blattmann, P., Schuberth, C., Pepperkok, R. & Runz, H. RNAi-based functional profiling of loci from blood lipid genome-wide association studies identifies genes with cholesterol-regulatory function. PLoS Genet. 9, e1003338 (2013).

  34. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

  35. Holmen, J. et al. The Nord-Trøndelag Health Study 1995–97 (HUNT 2): objectives, contents, methods and participation. Norsk Epidemiologi 13, 19–32 (2003).

  36. Friedewald, W.T., Levy, R.I. & Fredrickson, D.S. Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clin. Chem. 18, 499–502 (1972).

  37. Goldstein, J.I. et al. zCall: a rare variant caller for array-based genotyping: genetics and population analysis. Bioinformatics 28, 2543–2545 (2012).

  38. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  39. Kang, H.M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).

  40. Fan, Y. et al. Kruppel-like factor-11, a transcription factor involved in diabetes mellitus, suppresses endothelial cell activation via the nuclear factor-κB signaling pathway. Arterioscler. Thromb. Vasc. Biol. 32, 2981–2988 (2012).

  41. Fan, Y. et al. Suppression of pro-inflammatory adhesion molecules by PPAR-δ in human vascular endothelial cells. Arterioscler. Thromb. Vasc. Biol. 28, 315–321 (2008).



The Nord-Trøndelag Health Study (The HUNT Study) is a collaboration between the HUNT Research Centre (Faculty of Medicine, Norwegian University of Science and Technology NTNU), the Nord-Trøndelag County Council, the Central Norway Health Authority and the Norwegian Institute of Public Health. C.J.W. is supported by HL094535 and HL109946 from the National Heart, Lung and Blood Institute. M.B. is supported by DK062370 from the National Institute of Diabetes and Digestive and Kidney Diseases. Y.E.C. is supported by HL068878 and HL117491 from the National Heart, Lung and Blood Institute. G.R.A. is supported by HL117626 from the National Heart, Lung and Blood Institute and HG007022 from the National Human Genome Research Institute. For frequency look up for the RNF111 variant, we thank the following: R. Loos and K. Lu of the BioMe Clinical Care Cohort operated by The Charles Bronfman Institute for Personalized Medicine (IPM) at the Mount Sinai Medical Center (the Mount Sinai IPM Biobank Program is supported by The Andrea and Charles Bronfman Philanthropies); A. Metspalu of the Estonian Genome Center (the Estonian Biobank data were provided by E. Mihailov from the Estonian Genome Center of University of Tartu, Estonia); and A. Uitterlinden, F. Rivadeneira and K. Estrada of Erasmus University Rotterdam.

Author information

Authors and Affiliations



A.L., I.N., K.H., H.D., C.P., E.B.M., T.W., L.V., F.S., M.-L.L. and O.L.H. obtained, contributed and analyzed the phenotype data. O.L.H. and T.W. were responsible for sample selection. O.L.H. and H.Z. were responsible for genetic data analysis and interpretation. H.Z. and J.C. performed variant calling from sequence data. D.H.H., E.M.S. and W.Z. generated figures and performed secondary analyses. L.V., M.-L.L., S.K.G., A.L., E.B.M., I.N. and K.H. provided epidemiological expertise. H.D. and C.P. provided clinical expertise. G.R.A., F.S. and M.B. provided genotyping and genetic epidemiology expertise. Y.F., Y.G., Ji Zhang, S.P. and Jifeng Zhang conducted mouse experiments under the supervision of Y.E.C. with assistance from C.J.W. C.J.W., G.R.A. and O.L.H. drafted the manuscript with assistance from D.H.H., H.Z., M.B. and K.H. Y.F., E.M.S., W.Z., Y.G., Ji Zhang, Jifeng Zhang, A.L., M.-L.L., S.K.G., L.V., F.S., H.D., J.C., C.P., E.B.M., T.W., I.N. and Y.E.C. critically reviewed the manuscript and provided comments and feedback. C.J.W., O.L.H. and K.H. conceived the study. C.J.W., O.L.H., K.H., M.B. and G.R.A. designed the study. K.H. and C.J.W. provided overall leadership for the project.

Corresponding authors

Correspondence to Kristian Hveem or Cristen J Willer.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Effect sizes for known lipid index SNPs from non-fasting HUNT samples are correlated with estimates from large GWAS

This figure compares effect sizes estimated from non-fasting HUNT samples (N = 5,643) with those from the large population-based cohorts used in GWAS (N = 196k). The HUNT estimates are correlated with the previously published effect sizes (LDL cholesterol r² = 0.46, HDL cholesterol r² = 0.75, total cholesterol r² = 0.47, triglycerides r² = 0.84).
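
The r² values in this caption can be read as squared correlations between two sets of per-SNP effect estimates. The following is a minimal sketch of that calculation, assuming r² denotes the squared Pearson correlation (the caption does not state the exact method). The function name `squared_pearson_r` and the effect-size values are hypothetical placeholders, not data from the study.

```python
from math import fsum


def squared_pearson_r(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = fsum(x) / len(x)
    mean_y = fsum(y) / len(y)
    # Sums of cross-products and squares around the means.
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-SNP effect sizes (illustrative only, not study data).
cohort_effects = [0.10, -0.05, 0.20, 0.02, -0.12]
published_effects = [0.08, -0.04, 0.15, 0.05, -0.10]
print(f"r^2 = {squared_pearson_r(cohort_effects, published_effects):.2f}")
```

An r² close to 1 would indicate that the two sets of estimates agree closely in relative size and direction; it does not by itself show that they agree in absolute scale.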

Supplementary Figure 2 Quantile-quantile plots for single-variant analysis results of lipid traits

Quantile-quantile (Q-Q) plots of single-variant association results for lipid traits in Stage 1 samples (N = 5,643). Separate Q-Q curves are shown for previously known GWAS variants published by the Global Lipids Genetics Consortium (ref. 1; red), variants within 500 kb of a GWAS variant (blue) and other coding variants more than 500 kb from lead GWAS SNPs (black). The 95% confidence intervals under the null hypothesis of no association are shown in grey. HDL, high-density lipoprotein; LDL, low-density lipoprotein. Genomic control lambda values were: HDL λGC = 1.06; LDL λGC = 1.04; TC λGC = 1.08; TG λGC = 1.04.
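
The genomic control values quoted in this caption summarize how far the bulk of the test statistics departs from the null expectation. Under one common definition, λGC is the median observed 1-degree-of-freedom chi-square statistic divided by the median expected under the null (about 0.455). The sketch below assumes two-sided P values from 1-df tests; the function name `genomic_control_lambda` is hypothetical, and the caption does not say how the authors computed their values.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution, i.e. the squared 75th
# percentile of the standard normal (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Median observed 1-df chi-square statistic over its null median."""
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        # Convert a two-sided p-value to a z-score, then square it.
        z = _STD_NORMAL.inv_cdf(p / 2.0)
        chi2_stats.append(z * z)
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a well-calibrated null distribution.
n = 1001
null_like = [(i - 0.5) / n for i in range(1, n + 1)]
print(f"lambda_GC = {genomic_control_lambda(null_like):.3f}")
```

Values near 1 suggest little inflation of the test statistics, whereas values well above 1 can indicate population structure, relatedness or other sources of systematic bias.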

Supplementary Figure 3 Regional association plot of the LIPC locus

The regional association plots show the LIPC region on chromosome 15. The −log10 P value for association with HDL cholesterol is shown for several data sets: (a) Global Lipids Genetics Consortium GWAS results (ref. 4) in 95,000 individuals, (b) association in Stage 1 samples (N = 5,643) and (c) association in Stage 1 samples conditioning on rs113298164 (LIPC p.Thr405Met). Supplementary Figure 3a shows that LIPC p.Thr405Met and RNF111 p.Pro836Ser were not observed in GWAS. Comparison of Supplementary Figure 3b with the conditional analysis in Supplementary Figure 3c demonstrates that the association signal at RNF111 p.Pro836Ser is due to linkage disequilibrium with LIPC p.Thr405Met. Furthermore, the two variants discovered by GWAS as independent signals (rs1532085 and rs261334) remain strongly associated after conditioning on p.Thr405Met, suggesting that the associations at these two SNPs are independent of p.Thr405Met.
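
The logic of the conditional analysis in this caption can be illustrated with a toy regression: a variant that is associated with the trait only through linkage disequilibrium with a lead variant loses its effect once the lead variant is adjusted for. The sketch below is a simplified illustration using ordinary least squares and residualization (the Frisch-Waugh-Lovell approach). It ignores covariates and sample relatedness, is not the authors' pipeline, and uses hypothetical function names and made-up genotype data.

```python
from math import fsum


def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxx = fsum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    b = fsum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b * mean_x, b


def residuals(x, y):
    """Residuals of y after regressing it on x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def conditional_slope(trait, test_variant, lead_variant):
    """Effect of test_variant on trait after adjusting both for lead_variant."""
    trait_resid = residuals(lead_variant, trait)
    test_resid = residuals(lead_variant, test_variant)
    return fit_line(test_resid, trait_resid)[1]


# Toy genotype dosages (0/1/2 copies of an allele), illustrative only.
lead = [0, 0, 0, 0, 1, 1, 1, 2]
linked = [0, 0, 0, 1, 1, 1, 0, 2]   # correlated with, but distinct from, lead
trait = [2.0 * g for g in lead]     # trait driven only by the lead variant

marginal = fit_line(linked, trait)[1]
conditional = conditional_slope(trait, linked, lead)
print(f"marginal effect of linked variant:    {marginal:.3f}")
print(f"conditional effect of linked variant: {conditional:.3f}")
```

In this toy data the linked variant shows a clear marginal effect that vanishes after conditioning, which mirrors the pattern described for RNF111 p.Pro836Ser. A variant that stays associated after conditioning, as reported for rs1532085 and rs261334, would instead keep a non-zero conditional effect.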

Supplementary Figure 4 Regional association plot of the TM6SF2 locus

This figure shows a LocusZoom plot of the −log10 P value for association with total cholesterol levels, by chromosomal position, for all markers on the exome array in the TM6SF2 region. Markers are color-coded by their linkage disequilibrium (LD; r² from HapMap CEU) with the lead SNP. SNPs are annotated as coding (square) or other (circle), and lead SNPs are colored purple. (a) GWAS P values as published by Teslovich et al. (ref. 4; N > 100,000), showing an absence of data for the TM6SF2 coding variant. (b) Association observed in Stage 1 samples (N = 5,643), showing the most significant association for the TM6SF2 variant (purple) and the GWAS variant (red), with a less significant association for the NCAN coding variant (yellow).

Supplementary Figure 5 Expression pattern of endogenous TM6SF2 in wild-type C57BL/6J mouse

This figure shows the tissue distribution of TM6SF2 in the C57BL/6J mouse. Endogenous Tm6sf2 was highly expressed in the liver at both the mRNA and protein levels. (a) The mRNA level of Tm6sf2 was determined by northern blotting. Ethidium bromide (EB) staining of the 28S and 18S ribosomal RNAs (rRNAs) was used as a positive control. (b) The protein level of TM6SF2 in tissues was detected by western blotting, and fast green staining of the blot served as an internal control. Tissues were collected from wild-type C57BL/6J mice. SKM, skeletal muscle.

Supplementary Figure 6 No evidence for accumulation of triglyceride in mouse liver following adenovirus overexpression of TM6SF2 or shTm6sf2

This figure shows representative photomicrographs of liver sections stained with Oil red O after mice were fasted overnight. We found no evidence of triglyceride accumulation in the liver of any experimental or control mouse, suggesting that neither TM6SF2 expression changes nor adenovirus injection caused any liver damage. In total, 20 animals were analyzed, 5 in each study group (Ad-LacZ, Ad-TM6SF2, Ad-shLacZ, Ad-shTM6SF2). A total of 10 tissue sections were analyzed for each animal. Scale bars in black, 2 mm.

Supplementary Figure 7 Adenovirus-mediated TM6SF2 overexpression and knockdown do not elevate alanine aminotransferase

This figure shows no significant differences in alanine aminotransferase (ALT) levels between experimental and control mice after either overexpression or knockdown of Tm6sf2. ALT activity was determined in serum collected from overnight-fasted wild-type mice and from experimental mice (Ad-LacZ, Ad-TM6SF2, Ad-shLacZ and Ad-shTM6SF2). No significant differences were observed between any experimental group and the control group (no adenovirus injection). In total, 25 animals were analyzed, 5 in each study group.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–7 and Supplementary Figures 1–7 (PDF 2556 kb)

Cite this article

Holmen, O., Zhang, H., Fan, Y. et al. Systematic evaluation of coding variation identifies a candidate causal variant in TM6SF2 influencing total cholesterol and myocardial infarction risk. Nat Genet 46, 345–351 (2014).
