TM6SF2 rs58542926 influences hepatic fibrosis progression in patients with non-alcoholic fatty liver disease

Non-alcoholic fatty liver disease (NAFLD) is an increasingly common condition, strongly associated with the metabolic syndrome, that can lead to progressive hepatic fibrosis, cirrhosis and hepatic failure. Subtle inter-patient genetic variation and environmental factors combine to determine variation in disease progression. A common non-synonymous polymorphism in TM6SF2 (rs58542926 c.449 C>T, p.Glu167Lys) was recently associated with increased hepatic triglyceride content, but whether this variant promotes clinically relevant hepatic fibrosis is unknown. Here we confirm that TM6SF2 minor allele carriage is associated with NAFLD and is causally related to a previously reported chromosome 19 GWAS signal that was ascribed to the gene NCAN. Furthermore, using two histologically characterized cohorts encompassing steatosis, steatohepatitis, fibrosis and cirrhosis (combined n=1,074), we demonstrate a new association, independent of potential confounding factors (age, BMI, type 2 diabetes mellitus and PNPLA3 rs738409 genotype), with advanced hepatic fibrosis/cirrhosis. These findings establish new and important clinical relevance to TM6SF2 in NAFLD.

Non-alcoholic fatty liver disease (NAFLD) represents a spectrum of progressive liver disease characterized by increased hepatic triglyceride content (HTGC) in the absence of excess alcohol consumption 1. NAFLD includes simple steatosis, non-alcoholic steatohepatitis, fibrosis and ultimately cirrhosis, and is strongly associated with features of the metabolic syndrome (obesity, insulin resistance/type 2 diabetes mellitus (T2DM) and dyslipidaemia) 1. Reflecting the increasing prevalence of these conditions, NAFLD is estimated to affect approximately one-third of the population in many developed countries. Simple steatosis is generally considered to have a benign course and therefore to be of limited prognostic relevance 2,3. However, some NAFLD patients exhibit progressive steatohepatitis leading to cirrhosis and/or hepatocellular carcinoma (HCC), conditions that confer increased morbidity and mortality 1,4. Despite its high prevalence, only a minority of NAFLD patients progress to significant fibrosis and experience the associated morbidity 1. Thus, similar to other common diseases (for example, obesity, T2DM and cardiovascular disease), NAFLD is best considered as a complex trait in which disease phenotype results from environmental exposures acting on a susceptible polygenic background that comprises multiple independent modifiers 5-7.
Recently, Kozlitina et al. 19 showed that a non-synonymous SNP in TM6SF2 (rs58542926 c.449 C>T, p.Glu167Lys), transmembrane 6 superfamily member 2, a gene of unknown function on chromosome 19, was associated with proton magnetic resonance spectroscopy (¹H-MRS) quantified HTGC based on genotyping with a genome-wide exome chip 19. This variant has also been associated with dyslipidaemia and cardiovascular risk 20. The TM6SF2 rs58542926 SNP lies within 50 kb of an NCAN gene variant (rs2228603 c.274 C>T, p.Pro92Ser) that has previously been associated with HTGC in another GWAS 9,21. Both SNPs are in strong linkage disequilibrium (D′ = 0.926, r² = 0.798). Conditioning on the TM6SF2 variant abrogated the effect of the NCAN variant while the reverse did not occur, suggesting that TM6SF2 rs58542926 is more strongly associated with the HTGC phenotype. Homozygote TM6SF2 rs58542926 minor (T) allele carriage was shown to be associated with a modest but statistically significant increase in ¹H-MRS measured HTGC from 5.86 ± 0.25% in CC homozygotes to 15.04 ± 2.23% in TT homozygotes 19. In vitro and in vivo functional studies also supported this conclusion but were unable to determine whether the effect of TM6SF2 was limited to steatosis or had broader clinical relevance, as has already been shown for PNPLA3 (ref. 19).
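The linkage disequilibrium statistics quoted above (D′ and r²) can be derived from haplotype and allele frequencies. The following is a minimal Python sketch of the standard definitions, assuming phased haplotype frequencies are available. The function name `ld_stats` and the frequencies in the usage lines are illustrative only and are not data from this study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allele indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies chosen only to exercise the function.
d, d_prime, r2 = ld_stats(p_ab=0.06, p_a=0.07, p_b=0.08)
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

A high D′ with a somewhat lower r², as reported for these two SNPs, is the pattern expected when two variants are rarely separated by recombination but differ slightly in allele frequency.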
The aim of the current study was, first, to determine whether the association with NAFLD reported by Kozlitina et al. 19 could be independently validated; and, second, to establish whether the TM6SF2 rs58542926 variant was associated with clinically important disease end points that have prognostic relevance (in particular stage of hepatic fibrosis or development of NAFLD-related HCC). To address this, we perform a quantitative analysis within a well-characterized European Caucasian 'discovery' cohort with histologically characterized NAFLD, controlling for relevant co-morbidities and factors that have previously been linked with disease progression (age, gender, body mass index (BMI), presence of T2DM and PNPLA3 rs738409 genotype), and replicate our findings in a separate histologically characterized European Caucasian 'validation' cohort. To discover whether the TM6SF2 rs58542926 variant also confers an increased risk of NAFLD-related HCC, we perform a secondary case-control analysis comparing the overall 'combined' cohort of NAFLD patients to a cohort of NAFLD-HCC patients.

Results
Increased TM6SF2 rs58542926 C>T minor allele carriage in NAFLD. In the NAFLD discovery cohort, the TM6SF2 rs58542926 genotypes were confirmed to be in Hardy-Weinberg equilibrium with a minor allele frequency of 0.12, significantly higher than that observed in a reference Northern European population sample (MAF 0.07, http://browser.1000genomes.org) or a cohort of 265 Caucasian self-reported 'healthy workers' recruited from offices and factories locally in the North East of England (MAF 0.07) and so supportive of an association between these variants and NAFLD. Indeed, a gene-dosage effect was observed for both variants in the discovery cohort with the incidence of NAFLD increasing with the number of minor alleles possessed (χ² for trend, P = 0.0008; Supplementary Table 1). A similar association was also confirmed for PNPLA3 rs738409 (Table 2). Specific histological components of the NAFLD disease phenotype were next assessed individually.
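For readers unfamiliar with the Hardy-Weinberg check mentioned above, the following Python sketch computes the minor allele frequency and a one-degree-of-freedom χ² goodness-of-fit statistic from genotype counts. It is an illustration of the textbook test, not the procedure used in this study. The function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_cc, n_ct, n_tt):
    """Return (minor allele frequency, chi-square, P) for a biallelic SNP.

    Counts are the numbers of CC, CT and TT individuals; T is treated as
    the minor allele. The P value uses the chi-square distribution with
    one degree of freedom.
    """
    n = n_cc + n_ct + n_tt
    q = (2 * n_tt + n_ct) / (2 * n)  # frequency of the T allele
    p = 1 - q
    observed = (n_cc, n_ct, n_tt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 d.f. equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Hypothetical genotype counts, for illustration only.
maf, chi2, p_value = hwe_chi_square(n_cc=270, n_ct=74, n_tt=5)
print(f"MAF = {maf:.3f}, chi-square = {chi2:.3f}, P = {p_value:.3f}")
```

A large P value here indicates no detectable departure from Hardy-Weinberg proportions; for rare genotypes an exact test is often preferred over the χ² approximation.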
TM6SF2 and degree of histological steatosis. As a positive control, and consistent with our previously reported analysis 12, carriage of the PNPLA3 rs738409 minor allele was significantly associated with degree of steatosis in multivariate analysis adopting an additive model adjusted for gender, age at biopsy, BMI and presence of T2DM (β = 0.192 ± 0.056, 95% confidence interval (CI) 0.082-0.301; P = 6.74 × 10⁻⁴). However, in contrast to the report by Kozlitina et al. 19, neither TM6SF2 rs58542926 (β = 0.087 ± 0.083, 95%CI −0.076 to 0.250; P = 0.296) nor NCAN rs2228603 (β = 0.050 ± 0.085, 95%CI −0.116 to 0.216; P = 0.554) was found to be significantly associated with degree of histologically determined steatosis in the 349-patient discovery cohort. This was also the case in the 725-patient validation cohort (P = 0.17). However, a trend towards significance was observed when the two cohorts were combined (β = 0.111 ± 0.059, 95%CI −0.0041 to 0.2268; P = 0.053), suggesting that an underlying effect on degree of steatosis may be present but of relatively small size. An effect became apparent when the multivariate analysis in the combined cohort was repeated after subdividing the cohort into those with mild steatosis (S0-1) and pronounced steatosis (S2-3). Here, carriage of each copy of the TM6SF2 rs58542926 C>T minor allele was associated with increased risk of greater steatosis (odds ratio (OR) 1.379, 95%CI 1.019-1.865; P = 0.037), although with a marginal level of significance.
TM6SF2 and severity of histological steatohepatitis. Next, the association with steatohepatitis activity was tested using a composite score incorporating severity of necroinflammation and ballooning hepatocyte degeneration. TM6SF2 rs58542926, but not NCAN rs2228603, was associated with severity of steatohepatitis in the discovery cohort by multivariate analysis adopting an additive model adjusted for gender, age at biopsy, BMI, T2DM and PNPLA3 rs738409 genotype (β = 0.288 ± 0.139, 95%CI 0.015-0.561; P = 0.039). However, this effect was not replicated in the validation or combined cohorts.
This strong association between TM6SF2 rs58542926 and fibrosis stage was replicated independently in the validation cohort (β = 0.238 ± 0.097, 95%CI 0.047-0.428; P = 0.014) and also clearly demonstrated in the combined cohort (β = 0.357 ± 0.079, 95%CI 0.203-0.511; P = 6.36 × 10⁻⁶) by using an additive model adjusted for gender, age at biopsy, BMI, T2DM and PNPLA3 rs738409 genotype in both cases. To illustrate the potential clinical relevance of this finding, when the multivariate analysis was repeated subdividing the NAFLD cohort into those with mild fibrosis (F0-1) and advanced fibrosis (F2-4), carriage of each copy of the TM6SF2 rs58542926 C>T minor allele was associated consistently with a significant increased risk of advanced fibrosis, independent of gender, age at biopsy, BMI, T2DM and PNPLA3 rs738409 genotype across each cohort studied (Table 1).
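The relationship between a regression coefficient, its standard error and the quoted 95% confidence interval can be illustrated with a short Python sketch. It assumes a symmetric Wald interval (estimate ± z × standard error), which is an assumption about how the intervals were formed rather than something stated in the text. The function name `wald_ci` is illustrative.

```python
from statistics import NormalDist


def wald_ci(estimate, se, level=0.95):
    """Return the (lower, upper) Wald confidence limits for an estimate."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


# Coefficient and standard error reported for the combined cohort.
low, high = wald_ci(0.357, 0.079)
print(f"95% CI {low:.3f} to {high:.3f}")
```

Under this assumption the computed limits agree with the reported interval (0.203-0.511) to within rounding of the published coefficient and standard error.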
TM6SF2 and risk of HCC. There is increasing evidence that NAFLD predisposes to an increased risk of HCC 22, an effect influenced by PNPLA3 rs738409 genotype independent of the presence of cirrhosis 18. We therefore sought to determine whether TM6SF2 rs58542926 had a similar effect. A cohort of 99 consecutive Northern European Caucasian patients with primary NAFLD-related HCC was identified according to the joint European Association for the Study of the Liver and European Association for the Research and Treatment of Cancer (EASL-EORTC) guidelines 23. TM6SF2 rs58542926 allele and genotype frequencies in this cohort were compared with the combined NAFLD cohort described above (n = 1,074). In univariate analysis, homozygote carriage of the TM6SF2 rs58542926 minor allele was associated with an increased risk of NAFLD-HCC with respect to CC (OR 1.922, 95%CI 1.31-2.81; P = 6.81 × 10⁻⁴); however, significance was lost in multivariate analysis incorporating known risk factors including age, gender, BMI, T2DM and presence of cirrhosis (P = 0.42).
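Confidence intervals for odds ratios are usually symmetric on the logarithmic scale rather than the natural scale. The sketch below, which assumes the reported interval is a Wald interval on log(OR), recovers the implied point estimate and standard error from the published limits. The function name `log_scale_summary` is illustrative.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Return (implied OR, SE of log OR) from 95% confidence limits."""
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    # The interval spans 2 * z standard errors on the log scale.
    se_log_or = (log_high - log_low) / (2 * z)
    # The point estimate is the geometric mean of the two limits.
    implied_or = math.exp((log_low + log_high) / 2)
    return implied_or, se_log_or


# Limits reported for the univariate NAFLD-HCC comparison.
implied_or, se_log_or = log_scale_summary(1.31, 2.81)
print(f"implied OR = {implied_or:.2f}, SE(log OR) = {se_log_or:.3f}")
```

The implied odds ratio lies close to the reported 1.922, with the small difference attributable to rounding of the published limits.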

Discussion
The region on chromosome 19 (19p13) flanking TM6SF2 has been reported to be associated with NAFLD 9,19,21 as well as variations in plasma cholesterol, triglyceride and low-density lipoprotein levels 20,24,25 in several previous studies. In particular, a variant within the NCAN gene (rs2228603 C>T) that is in strong linkage disequilibrium (D′ = 0.926, r² = 0.798) with TM6SF2 rs58542926 was reported to be associated with radiologically and histologically characterized NAFLD in both GWAS and candidate-gene studies 9,21. Before the recent publication by Kozlitina et al. 19, examination of linkage disequilibrium patterns across the region had already brought that association into question 7. It was, however, the use of a genome-wide exome-chip genotyping approach, combined with detailed association analysis conditioning on previously published variants across the 19p13 region, that determined that the causative variant affecting HTGC was TM6SF2 rs58542926 (ref. 19). When considered alongside a separate study by Holmen et al. 20, which demonstrated an association with cardiovascular disease and circulating triglyceride/total cholesterol levels, it appears that TM6SF2 rs58542926 C-allele carriage increases circulating triglyceride/total cholesterol while T-allele carriage promotes hepatic triglyceride/cholesterol retention. In clinical practice, simple steatosis is generally considered to have a benign course and so degree of HTGC is of limited prognostic relevance 1,3,26. In contrast, progressive hepatic fibrosis leading to cirrhosis is the principal common pathway to hepatic failure and liver-related death 2,3. Using two large, well-characterized European Caucasian cohorts with biopsy-proven NAFLD, we demonstrate that carriage of the TM6SF2 rs58542926 variant is strongly associated with the presence of NAFLD and, in particular, with a significantly greater risk of developing advanced hepatic fibrosis/cirrhosis.
Evidence to support a modifier effect of the TM6SF2 rs58542926 variant on histologically determined HTGC (steatosis), seen only when the 1,074-patient combined cohort was studied, is arguably more modest than might be expected. Our findings do support the previously reported association 9,19,21, although differences in sensitivity to subtle changes in HTGC between radiological and histological modalities may have reduced the power to detect this effect 27. Kozlitina et al. 19 reported that the maximal effect of the TM6SF2 variant in European Caucasians was only a mean 9.2% increase in ¹H-MRS quantified HTGC in TT homozygotes above the ~5.9% observed in CC homozygotes 19. Histological assessment of hepatic steatosis uses broad microscopic categories reflecting the proportion of hepatocytes that are visibly steatotic (S0 <5%, S1 5-33%, S2 33-66% and S3 >66%) 28. Based on data from previous comparative modality analysis 27, the modest gene effect size reported by Kozlitina et al. 19 (less than a threefold increase in HTGC above normal) would likely be encompassed within the histological S1 bracket, and therefore may not be apparent histologically. Combined with the relatively low minor allele frequency in the background population, smaller cohorts may therefore have insufficient statistical power for an association to become evident. The modifier effect of the TM6SF2 variant on grade of steatohepatitis (disease activity) was apparent in the initial discovery cohort analysis; however, statistical significance was not reached in the subsequent validation analysis.
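The effect size discussed above can be checked with simple arithmetic. The Python sketch below uses the mean ¹H-MRS HTGC values quoted earlier in the text for CC and TT homozygotes (5.86% and 15.04%) and adds a small helper that maps the proportion of visibly steatotic hepatocytes to the histological categories listed above. The helper name `steatosis_grade` and the handling of values that fall exactly on a category boundary are assumptions made for illustration.

```python
# Mean 1H-MRS HTGC (%) quoted in the text for each homozygote group.
HTGC_CC = 5.86
HTGC_TT = 15.04

absolute_increase = HTGC_TT - HTGC_CC  # percentage points
fold_increase = HTGC_TT / HTGC_CC      # ratio relative to CC homozygotes

print(f"absolute increase = {absolute_increase:.2f} percentage points")
print(f"fold increase = {fold_increase:.2f}")


def steatosis_grade(percent_steatotic_hepatocytes):
    """Map % of visibly steatotic hepatocytes to a grade S0-S3.

    Boundary handling (33% assigned to S1, 66% to S2) is an assumption;
    the text lists overlapping bracket edges.
    """
    if percent_steatotic_hepatocytes < 5:
        return "S0"
    if percent_steatotic_hepatocytes <= 33:
        return "S1"
    if percent_steatotic_hepatocytes <= 66:
        return "S2"
    return "S3"
```

Note that HTGC measured by ¹H-MRS and the proportion of steatotic hepatocytes seen on biopsy are different quantities, so the helper should not be applied directly to spectroscopy values.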
The validation cohort comprised a mixture of patients recruited from both hepatology and bariatric services and, although the cohorts appear well matched histologically, the validation cohort exhibited higher mean BMI levels (38.5 ± 9.1 versus 34.5 ± 5.7 kg m⁻², P < 0.0001), a younger mean age (47.6 ± 12.4 versus 51.5 ± 12.0 years, P < 0.0001), a greater female preponderance (56.1 versus 32.1%, P < 0.0001) and a lower prevalence of T2DM (32.4 versus 46.1%, P < 0.0001) than the discovery cohort (Table 2). These factors may have impacted on our ability to replicate the initial association with steatohepatitis in a multivariate analysis. Further study of the variant in other patient cohorts and exploration of the functional effects of TM6SF2 on inflammatory response will be needed to address this point.
The key finding of the current study is that carriage of the TM6SF2 rs58542926 C>T minor allele is unequivocally associated with an increased risk of advanced NAFLD-associated hepatic fibrosis. This highly significant effect was consistently demonstrated across all the cohorts studied and was independent of potentially confounding factors including gender, age at time of biopsy, BMI, T2DM and PNPLA3 rs738409 genotype.
Conditional analysis undertaken as part of the present study adds further weight to the assertion that the 19p13 signal is causally related to TM6SF2 and not NCAN, not only for HTGC as was previously reported 19 but now also for stage of hepatic fibrosis. These findings therefore establish a new and important clinical relevance to the recently described association between TM6SF2 and NAFLD, and suggest that TM6SF2 should be considered alongside PNPLA3 (refs 8,12) and GCKR 9,29,30 , as one of a handful of genes so far identified that are associated not only with variations in hepatic triglyceride accumulation but also with fibrogenesis 7 . It is noteworthy that across all the cohorts studied, the OR for advanced fibrosis conferred by each copy of the TM6SF2 variant carried was consistently of similar or up to twofold greater magnitude than that which was observed, or has previously been reported 12,31 , for the widely studied PNPLA3 rs738409 variant ( Table 1).
Carriage of the TM6SF2 variant was associated with increased risk of progression to NAFLD-HCC in univariate analysis. In contrast to PNPLA3 (ref. 18), this effect was not sustained when confounding factors including age, T2DM and presence of underlying cirrhosis were included in the model. Carriage of the TM6SF2 variant therefore does not appear to further increase HCC risk independent of its effect on fibrosis stage. It should, however, be noted that the NAFLD-HCC cohort contained only 99 patients and so, combined with a relatively modest TM6SF2 rs58542926 minor allele frequency, the current study had ~70% power to detect an effect if an additive genetic model and risk similar to that seen for fibrosis is assumed (α = 0.05) 32. An association cannot therefore be completely excluded, but would seem unlikely. Studies using larger cohorts of NAFLD-HCC patients than are presently available will be required to provide sufficient power to study this further.
The TM6SF2 rs58542926 c.449 C>T variant is a non-synonymous change producing a glutamate to lysine amino-acid substitution at residue 167 (Glu167Lys), which is highly conserved across mammals 19. First identified as part of a large-scale sequencing project, little is currently known about the biological function of the TM6SF2 protein product 33. Adenovirus-mediated short hairpin RNA knockdown of Tm6sf2 in mice has been shown to increase HTGC and reduce very low-density lipoprotein (VLDL) secretion, suggesting that TM6SF2 activity is necessary for normal VLDL secretion, and that impaired TM6SF2 function causally contributes to NAFLD 19. However, these in vivo studies were of too short a duration to adequately address the effects on steatohepatitis or fibrogenesis. Furthermore, previous experimental evidence has shown that hepatic triglyceride accumulation may not itself be directly hepatotoxic. This was elegantly demonstrated in mice by silencing hepatic gene expression of diacylglycerol O-acyltransferase 2 (Dgat2), a key enzyme mediating the conversion of free fatty acids to triglyceride 34. Rather than ameliorating steatohepatitis, the consequent reduction in hepatocyte triglyceride synthesis was associated with increased fatty acid oxidation, particularly through Cyp2e1, leading to greater oxidative stress, cellular damage and higher serum transaminase levels 34. It is therefore tempting to speculate that the function of TM6SF2 and the mechanism through which TM6SF2 drives NAFLD-associated hepatic fibrosis may be other than through increased triglyceride accumulation.
In conclusion, the current study confirms that TM6SF2 is associated with histologically defined NAFLD, and is the first demonstration that this gene serves as a powerful modifier of hepatic fibrogenesis. That this gene is also associated with disturbed cholesterol metabolism and so may modify risk of cardiovascular events including myocardial infarction 20 suggests that TM6SF2 is an important determinant of clinical outcome across several facets of metabolic syndrome-related end-organ damage. In light of evidence that cholesterol accumulation in hepatic stellate cells promotes NAFLD fibrosis 35, it is tempting to speculate that TM6SF2 may act as a 'switch' with TM6SF2 rs58542926 T-allele-mediated hepatic retention of triglyceride and cholesterol predisposing to NAFLD fibrosis while C-allele carriage promotes VLDL excretion, protecting the liver at the expense of increased risk of cardiovascular disease. These data mandate further mechanistic study to determine the physiological and pathophysiological role of this gene in various tissues and cell types as a modifier of fibrogenesis and a putative therapeutic target.

Methods
All participants gave informed consent. In all cases, alternative diagnoses were excluded, including excess alcohol intake (alcohol intake <20 g per day for women and <30 g per day for men), chronic viral hepatitis (hepatitis B and hepatitis C), autoimmune liver diseases, hereditary hemochromatosis, α1-antitrypsin deficiency, Wilson's disease and drug-induced liver injury. Clinical and laboratory data were collected at the time of diagnosis including basic anthropometrics so that BMI could be calculated, and relevant co-morbidity including the presence of T2DM (fasting glucose ≥7.1 mmol l⁻¹ (≥128 mg dl⁻¹) or treatment with anti-diabetic drugs) and evidence of underlying cirrhosis was recorded.
The degree of steatosis (S0-3), activity of steatohepatitis (A0-4) and stage of fibrosis (F0-4) were scored according to the validated semi-quantitative SAF score 28. The main study cohorts were:
An initial discovery cohort of 349 consecutive European Caucasian patients from the United Kingdom with histologically characterized NAFLD of different stages of disease. These were unrelated patients with histologically characterized NAFLD, derived from a patient population originally identified as having ultrasonographically detected bright liver and abnormal biochemical tests (alanine transaminase and/or gamma-glutamyl transferase). A validation cohort of 725 consecutive European Caucasian patients from centres in the UK and mainland Europe with histologically characterized NAFLD of different stages of disease. Patients in this cohort were unrelated patients with histologically characterized NAFLD, derived from a patient population originally identified as having ultrasonographically detected bright liver and abnormal biochemical tests (alanine transaminase and/or gamma-glutamyl transferase) or identified as having evidence of NAFLD at the time of bariatric surgery.
Together, these comprised the combined cohort of 1,074 patients with histologically characterized NAFLD. Demographic and histological details are shown in Table 2. A description of the 'healthy workers' cohort recruited in the North East of the United Kingdom has previously been published 36 .
A separate cohort of 99 consecutive Northern European Caucasian patients with primary HCC arising on a background of NAFLD was identified (NAFLD-HCC cohort). The diagnosis of HCC was established histologically or through non-invasive assessment according to the EASL-EORTC clinical practice guidelines 23 . The presence of NAFLD was determined through histological assessment of non-tumour liver tissue or, when biopsy was not clinically appropriate, through radiological evidence of hepatic steatosis.
Liver biopsy. Liver biopsy was performed under radiological guidance. Specimens (at least 1.6 cm length and 1.5 mm thick) were fixed in 10% neutral formalin for evaluation and embedded in paraffin for histological examination. Tissue sections were stained with haematoxylin and eosin, impregnated with silver for visualizing reticulin framework and stained with Sirius Red Fast Green for visualizing collagen. Liver biopsies were reviewed by a single expert liver pathologist at each participating centre, unaware of clinical or genetic data. The degree of steatosis (S0-3), activity of steatohepatitis (A0-4) and stage of fibrosis (F0-4) were scored according to the validated semi-quantitative SAF score (Supplementary Table 3) 28,37. In 25 HCC patients, the diagnosis of HCC was confirmed histologically and graded according to Edmondson and Steiner 38, adapted for needle biopsy specimens.
DNA preparation. Venous blood was collected from each patient and DNA was prepared from peripheral blood lymphocytes using a perchlorate-chloroform isolation method 39. In brief, 35 ml lysis buffer (10 mM Tris-HCl (pH 8.0), 320 mM sucrose, 5 mM magnesium chloride and 1% Triton X-100) was added to 5 ml venous blood in a 50-ml polypropylene centrifuge tube. After mixing, the tube was centrifuged at 3,000 g for 10 min. The supernatant was discarded and the cell pellet was re-suspended in 2 ml of solution B (400 mM Tris-HCl (pH 8.0), 60 mM EDTA, 150 mM NaCl and 1% SDS). A quantity of 500 μl of sodium perchlorate (5 M) was added and the sample was mixed at room temperature for 15 min before incubating in a preheated hot block at 65°C for 30 min. Next, 2 ml of chloroform was added and the sample was mixed for 10 min at room temperature. The tube was then centrifuged at 1,400 g for 10 min, and the upper, clear DNA-containing phase was transferred to a new 15-ml polypropylene tube. Two volumes of cold ethanol were added to the aqueous phase, and the tube was gently inverted until the DNA precipitated. The DNA was spooled using a soft plastic sterile loop and allowed to air dry for 20 min. DNA was then re-suspended by incubation in 200 μl water at 60°C. Samples were quantitated and quality assessed by absorbance measurements at 260 and 280 nm. Genotyping was performed by personnel unaware of clinical status or histology of patients.
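Absorbance readings at 260 and 280 nm are conventionally converted to a DNA concentration and a purity ratio. The Python sketch below uses the widely used approximation that an A260 of 1.0 corresponds to about 50 ng μl⁻¹ of double-stranded DNA at a 1 cm path length; the text does not state which conversion or acceptance thresholds were applied, so the constant, the function names and the example readings are assumptions for illustration.

```python
# Conventional conversion factor for double-stranded DNA (ng/ul per A260 unit,
# 1 cm path length). Assumed here; not specified in the text.
NG_PER_UL_PER_A260 = 50.0


def dna_concentration(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration in ng/ul from an A260 reading."""
    return a260 * NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio; values near 1.8 suggest low protein carry-over."""
    return a260 / a280


# Hypothetical readings from a tenfold-diluted sample.
print(f"{dna_concentration(0.5, dilution_factor=10):.0f} ng/ul")
print(f"A260/A280 = {purity_ratio(0.9, 0.5):.2f}")
```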
Statistical analysis. Statistical analyses were performed using SPSS v19.0 (IBM, USA) to collate and analyse cohort phenotype data. Continuous variables were tested using Student's t-test/one-way analysis of variance and categorical variables by χ² test unless otherwise stated. PLINK v1.07 (ref. 40) (via the gPLINK v2.050 GUI) was used to conduct the genetic analysis. An initial univariate χ² analysis was performed. Subsequently, multivariate logistic regression analysis was conducted incorporating biologically relevant covariates that were associated with risk of NAFLD progression (age, gender, BMI, presence of T2DM and PNPLA3 rs738409 genotype) to test the genetic association. An additive genetic model best fitted the data and was reported. Results are expressed as β ± s.e.m. or OR with 95%CI as appropriate. Significance was taken as P < 0.05 throughout.
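Under an additive genetic model, each genotype is coded by the number of minor alleles carried (0, 1 or 2), and in logistic regression the per-allele odds ratio multiplies with each additional copy. The Python sketch below illustrates this coding; it is not the PLINK implementation, and the function names are hypothetical. The per-allele odds ratio of 1.379 is the value reported in the Results for pronounced steatosis.

```python
def minor_allele_count(genotype, minor="T"):
    """Code a genotype string such as 'CC', 'CT' or 'TT' as 0, 1 or 2."""
    return genotype.upper().count(minor)


def odds_ratio_for_copies(per_allele_or, copies):
    """Odds ratio relative to non-carriers implied by a log-additive model."""
    return per_allele_or ** copies


for genotype in ("CC", "CT", "TT"):
    copies = minor_allele_count(genotype)
    implied = odds_ratio_for_copies(1.379, copies)
    print(f"{genotype}: {copies} minor allele(s), implied OR = {implied:.3f}")
```

The implied odds ratio for homozygous carriers is therefore the square of the per-allele value, which is a consequence of the model assumption rather than a separately estimated quantity.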