MBOAT7 rs641738 variant and hepatocellular carcinoma in non-cirrhotic individuals

Nonalcoholic fatty liver disease (NAFLD) represents an emerging cause of hepatocellular carcinoma (HCC), especially in non-cirrhotic individuals. The rs641738 C > T MBOAT7/TMC4 variant predisposes to progressive NAFLD, but the impact on hepatic carcinogenesis is unknown. In Italian NAFLD patients, the rs641738 T allele was associated with NAFLD-HCC (OR 1.65, 1.08–2.55; n = 765), particularly in those without advanced fibrosis (p < 0.001). The risk T allele was linked to 3′-UTR variation in MBOAT7 and to reduced MBOAT7 expression in patients without severe fibrosis. The number of PNPLA3, TM6SF2, and MBOAT7 risk variants was associated with NAFLD-HCC independently of clinical factors (p < 0.001), but did not significantly improve their predictive accuracy. When combining data from an independent UK NAFLD cohort, in the overall cohort of non-cirrhotic patients (n = 913, 41 with HCC) the T allele remained associated with HCC (OR 2.10, 1.33–3.31). Finally, in a combined cohort of non-cirrhotic patients with chronic hepatitis C or alcoholic liver disease (n = 1121), the T allele was independently associated with HCC risk (OR 1.93, 1.07–3.58). In conclusion, the MBOAT7 rs641738 T allele is associated with reduced MBOAT7 expression and may predispose to HCC in patients without cirrhosis, suggesting it should be evaluated in future prospective studies aimed at stratifying NAFLD-HCC risk.

patients even in the absence of severe liver fibrosis 6 . Because NAFLD is highly prevalent and HCC can develop in non-cirrhotic patients who are unaware of being at risk, classic HCC screening strategies aimed at early diagnosis and curative treatment 7 are unfeasible. Thus, there is an urgent need for non-invasive biomarkers able to stratify NAFLD-HCC risk, especially in patients without severe fibrosis 8, 9 . Inherited factors contribute to HCC susceptibility, and strong familial aggregation is observed 10 . NAFLD also has a strong heritable component 11 , and the PNPLA3 I148M variant is the main common genetic determinant of hepatic fat content and of progressive NAFLD [12][13][14][15] . The mechanism is related to accumulation of the mutated protein 16 , which interferes with lipid droplet remodeling in hepatocytes 15,17,18 and with retinol release by hepatic stellate cells 19,20 . The PNPLA3 variant predicts HCC development in European patients with NAFLD 21 , suggesting that genetic risk factors may prove helpful to select high-risk individuals for screening [21][22][23] , but its specificity is too low for it to be used as a single prognostic biomarker 24 . Furthermore, the PNPLA3 variant also predisposes to HCC in other liver diseases associated with steatosis, namely alcoholic liver disease (ALD) and chronic hepatitis C (CHC) 23 . The TM6SF2 E167K variant also predisposes to progressive NAFLD by altering the secretion of very low-density lipoproteins [25][26][27] , but its direct role in HCC predisposition is disputed 26,28 .
The MBOAT7/TMC4 locus rs641738 C > T sequence variant predisposes to liver fibrosis development in individuals with excessive alcohol intake 29 and chronic hepatitis C 30 , and to the development and progression of NAFLD in individuals of European descent 31,32 . However, whether the rs641738 variant is also associated with HCC risk is still unknown. The aim of this study was therefore to evaluate whether the rs641738 variant predisposes to HCC in NAFLD patients stratified by the presence of severe fibrosis, and in other liver diseases characterized by hepatic steatosis.

Results
The Italian NAFLD cohort. The clinical features of Italian NAFLD patients stratified by HCC diagnosis are presented in Table 1. Patients who developed HCC were older (p < 0.001) and had a higher prevalence of type 2 diabetes (T2DM; p < 0.001) and of severe fibrosis (stage F3-F4) than those who did not (Table 1), whereas sex distribution and prevalence of obesity were not different (p = NS). Concerning genetic risk factors, at univariate analysis HCC development was associated with PNPLA3 and MBOAT7 variants (p < 0.001 and p = 0.003, respectively; Table 1).
The clinical features of Italian NAFLD-HCC patients according to presence of severe fibrosis are reported in Table S1. Patients who developed HCC in the absence of severe fibrosis (n = 21, 17%) were more frequently males (p = 0.040), and carriers of the MBOAT7 rs641738 risk T allele (p = 0.006).
MBOAT7 variation is associated with NAFLD-HCC in Italian patients. The frequency distribution of the MBOAT7 rs641738 C > T polymorphism in Italian NAFLD patients stratified by the presence of HCC is shown in Fig. 1a. There was a significant over-representation of the rs641738 T allele in HCC vs. non-HCC NAFLD patients (p = 0.003, Fig. 1a and Table 1). In the NAFLD population, which was selected due to referral for suspected steatohepatitis/HCC, there was a borderline deviation from Hardy-Weinberg equilibrium for the frequency distribution of the rs641738 MBOAT7 variant (p = 0.03 in both groups). However, the frequency distribution of the rs641738 variant did not violate Hardy-Weinberg equilibrium in 243 unselected healthy control subjects (p = NS; Table S3).
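The Hardy-Weinberg check mentioned above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. The following is a minimal sketch, not the authors' procedure: the function name `hwe_chi_square` and the genotype counts are hypothetical, and the study may have used a different (for example, exact) test.

```python
import math

def hwe_chi_square(n_cc, n_ct, n_tt):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of the C allele
    q = 1.0 - p                      # frequency of the T allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: P(X > chi2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (CC, CT, TT), not taken from the study
chi2, p_value = hwe_chi_square(n_cc=90, n_ct=120, n_tt=40)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A small p-value indicates that the observed genotype counts depart from the proportions expected from the allele frequencies, which in a selected case series can reflect the selection itself rather than genotyping error.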
The clinical features of Italian NAFLD patients stratified by the rs641738 genotype are shown in Table S2. The rs641738 T allele was borderline associated with T2DM (p = 0.05) in the overall cohort, but this was not confirmed in patients stratified by HCC diagnosis, and was possibly explained by the confounding effect of the association between both rs641738 T allele and T2DM with HCC. In HCC patients, the T allele was associated with obesity (p = 0.035), and lack of severe fibrosis F3-F4 (p = 0.006).
As expected, the rs641738 T allele showed a borderline association with severe fibrosis (stage F3-F4; OR 1.24, 95% c.i. 1.00-1.54; p = 0.052). The frequency distribution of PNPLA3, TM6SF2, and MBOAT7 variants according to hepatocellular carcinoma (HCC) diagnosis in Italian NAFLD patients stratified by the severity of fibrosis (stage F0-F2 vs. F3-F4) is presented in Table 2. While the PNPLA3 variant was associated with HCC development in patients with (p = 0.011), but not in those without severe fibrosis (p = NS), the MBOAT7 T allele was associated with HCC in patients without (p < 0.001; Fig. 1b and Table 2), but not in those with (p = 0.55; Fig. 1c and Table 2) severe fibrosis.
Combined effect of genetic risk factors for NAFLD-HCC. The relationship between the total number of risk alleles including PNPLA3 I148M, TM6SF2 E167K, and MBOAT7 rs641738 T and HCC risk is presented in Fig. 2. There was a significant association between the number of risk alleles and HCC (OR per allele 1.56, 95% c.i. ). The model had a 0.96 ± 0.4 area under the receiver operating characteristic curve (AUROC) for detecting HCC cases. The optimal cutoff (identifying the best combination of sensitivity and specificity) had 96% sensitivity and 89% specificity for HCC (Fig. S1). The corresponding AUROC of a model taking into consideration only clinical factors was slightly lower (0.93 ± 0.5). In the subgroup of patients without severe fibrosis, the AUROC for clinical factors alone was 0.91 ± 0.5, whereas the full model incorporating genetic risk factors maintained an AUROC of 0.96 ± 0.4 (p = NS vs. clinical factors alone).
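To make the notion of a risk-allele count and its discriminative ability concrete, the sketch below sums the risk alleles at the three loci and computes an AUROC as the probability that a randomly chosen case scores higher than a randomly chosen control. It is an illustration under stated assumptions: the function names and genotypes are hypothetical, and the score uses the allele count alone, whereas the model described above also included clinical factors.

```python
def risk_allele_count(pnpla3, tm6sf2, mboat7):
    """Sum risk alleles (0, 1 or 2 per variant) across the three loci."""
    for g in (pnpla3, tm6sf2, mboat7):
        if g not in (0, 1, 2):
            raise ValueError("genotype must be coded as 0, 1 or 2 risk alleles")
    return pnpla3 + tm6sf2 + mboat7

def auroc(scores_cases, scores_controls):
    """AUROC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for sc in scores_cases:
        for sn in scores_controls:
            if sc > sn:
                wins += 1.0
            elif sc == sn:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))

# Hypothetical genotypes (PNPLA3, TM6SF2, MBOAT7), not study data
cases = [(2, 0, 1), (1, 1, 2), (2, 0, 2)]
controls = [(0, 0, 1), (1, 0, 0), (1, 0, 2)]
case_scores = [risk_allele_count(*g) for g in cases]
control_scores = [risk_allele_count(*g) for g in controls]
auc = auroc(case_scores, control_scores)
print(f"AUROC = {auc:.3f}")
```

The pairwise definition is equivalent to the Mann-Whitney statistic divided by the number of case-control pairs; it is quadratic in sample size, which is acceptable for cohorts of this scale.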

Relationship between MBOAT7 locus variants and gene expression.
To investigate the biological basis for the stronger association of MBOAT7 locus variation with HCC in patients without severe fibrosis, we next examined the association of the rs641738 variant with variants that may influence MBOAT7 mRNA stability, and with hepatic MBOAT7 mRNA expression levels in patients stratified by the severity of liver fibrosis.
In 98 severely obese patients, the rs641738 variant was in high linkage disequilibrium with the MBOAT7 3′-UTR rs8736 C > T polymorphism (R² = 0.98; only 1/98 discordant case). Despite this, the rs8736 polymorphism was slightly, though not significantly, more closely associated with NAFLD (p = 0.048 vs. p = 0.057) and MBOAT7 expression (p = 0.042 vs. p = 0.046) than rs641738. These data are in line with the hypothesis that rs641738 is not the causal variant, but may be in linkage disequilibrium with variants influencing MBOAT7 expression. Gene expression of MBOAT7 in 47 patients from the Hepatology service characterized by more severe liver damage (Table S4) is shown in Fig. 3. The rs641738 T allele was associated with reduced hepatic MBOAT7 expression in patients with absent or mild fibrosis (stage F0-F1; p = 0.02), but not in those with moderate-severe fibrosis (stage F2-F4; p = 0.1).
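As an illustration of how the strength of linkage between two biallelic variants can be summarized, the sketch below computes the squared Pearson correlation between allele counts (0, 1 or 2 copies of the T allele) at two loci. This is an assumption-laden approximation rather than the exact computation performed in the study: the function name `genotype_r2` and the genotype vectors are hypothetical.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two allele-count vectors (0/1/2)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return sxy * sxy / (sxx * syy)

# Hypothetical T-allele counts at two loci in ten individuals, not study data;
# the genotypes agree in all but the last individual.
locus_a = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
locus_b = [0, 1, 2, 1, 0, 2, 1, 1, 0, 1]
print(f"r2 = {genotype_r2(locus_a, locus_b):.3f}")
```

Values close to 1 mean that the genotype at one locus almost fully predicts the genotype at the other, which is why association signals at two such variants are hard to separate statistically.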

MBOAT7 variation and NAFLD-HCC risk in UK non-cirrhotic NAFLD patients.
To increase the study power, we next evaluated the association of the rs641738 T allele with HCC risk in non-cirrhotic NAFLD patients from the UK, whose clinical features are presented in Table S5. The frequency distribution of the rs641738 C > T genotype in patients stratified by HCC diagnosis is shown in Table 4. Although in the UK non-cirrhotic NAFLD cohort (N = 358, of whom 20 had HCC) MBOAT7 variation was not significantly associated with HCC (p = 0.32), in the overall combined UK/Italian cohort of NAFLD patients without advanced fibrosis/cirrhosis the T allele remained associated with an increased risk of HCC (allelic OR 2.10, 95% c.i. 1.33-3.31).
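For readers unfamiliar with allelic odds ratios, the sketch below shows one common way to obtain an unadjusted allelic OR and its 95% confidence interval from a 2 × 2 table of allele counts (Woolf's logit method). It is only an illustration: the function name and the counts are hypothetical, and the estimates reported above may have been derived differently, for instance from regression models.

```python
import math

def allelic_odds_ratio(case_t, case_c, control_t, control_c, z=1.96):
    """Unadjusted allelic OR with a Woolf (logit) confidence interval."""
    odds_ratio = (case_t * control_c) / (case_c * control_t)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / case_t + 1 / case_c + 1 / control_t + 1 / control_c)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts (two alleles per patient), not study data
odds_ratio, lower, upper = allelic_odds_ratio(
    case_t=50, case_c=32, control_t=750, control_c=994
)
print(f"allelic OR = {odds_ratio:.2f} (95% c.i. {lower:.2f}-{upper:.2f})")
```

Because each patient contributes two alleles, this approach treats alleles as independent observations, an assumption that is reasonable only when genotype frequencies are close to Hardy-Weinberg proportions.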

Impact of MBOAT7 variation on HCC risk in non-cirrhotic patients with other liver diseases.
We finally evaluated the impact of the rs641738 T allele on HCC risk in 1121 non-cirrhotic patients with CHC and ALD (25 [2%] with HCC; Table S6). Results are presented in Table 5. The rs641738 T allele was associated with increased risk of HCC, independently of age, sex, and the etiology of liver disease (OR 1.93 for each T allele, 95% c.i. 1.07-3.58; p = 0.035), with an effect size comparable to that observed in NAFLD. We observed a similar trend for association of the T allele with non-cirrhotic HCC in patients with CHC and ALD analyzed separately (Table S7). The PNPLA3 I148M variant was also associated with HCC development outside cirrhosis, with an effect size similar to that of MBOAT7 variation (Table 5; p = 0.021).

Discussion
In this study, we evaluated whether the rs641738 C > T MBOAT7 locus sequence variant, associated with the development and progression of NAFLD 31,32 , influences susceptibility to NAFLD-HCC. The main result is that in the Italian NAFLD cohort each MBOAT7 rs641738 T allele conferred an approximately 80% increased risk of HCC. The association was mostly driven by a strong enrichment in the risk T allele in patients without advanced liver fibrosis, suggesting that MBOAT7 variation predisposes to HCC development particularly in non-cirrhotic patients. To confirm this hypothesis, we also examined the frequency of the MBOAT7 T allele in an independent UK cohort of non-cirrhotic NAFLD patients. Although the T allele was not significantly associated with NAFLD-HCC in this replication cohort, it remained significant in the combined cohort of 913 non-cirrhotic European NAFLD patients (41 with HCC), where it was associated with a greater than 2-fold increased risk of NAFLD-HCC, a relatively large effect size for a common genetic variant. This genetic polymorphism might thus represent a first useful biomarker to stratify HCC risk among individuals affected by NAFLD without advanced liver fibrosis.
The MBOAT7 protein catalyzes the transfer of polyunsaturated fatty acids such as arachidonoyl-CoA to lyso-phosphatidylinositol, thereby maintaining an adequate level of desaturation in cell membranes 31 . The rs641738 T allele is associated with reduced MBOAT7 expression and altered phosphatidyl-inositol plasma and hepatic composition 31,32 , favoring hepatocellular fat accumulation and the production of inflammatory mediators 31 . However, rs641738 is not likely the causal variant underpinning susceptibility to NAFLD and HCC, as we observed that it is in strong linkage disequilibrium with other polymorphisms in the 3′-UTR of MBOAT7, which may be more closely related to the phenotype and are potentially involved in the regulation of MBOAT7 mRNA stability. It could be speculated that the effect size of rs641738 on HCC risk was larger in patients without severe fibrosis because the presence of the risk variant may somewhat compensate for the lack of the cirrhotic pro-carcinogenic environment. However, we observed that the rs641738 T allele is associated with reduced hepatic expression of MBOAT7 31 only in NAFLD patients without severe fibrosis. Therefore, the MBOAT7 variant may exert its deleterious effect specifically at early stages of liver disease. Alteration of the hepatic parenchymal structure and of the relative representation of cell types may then hamper the impact of the MBOAT7 variant during severe fibrosis, because MBOAT7 is highly expressed in hepatic stellate cells and inflammatory cells 31,33 . In keeping with this interpretation, the rs641738 T allele was also associated with development of early stages, but not severe fibrosis in patients at risk of NASH 31 , and in a large cohort of CHC patients 30 . Therefore, MBOAT7 variation might have a dual impact on liver disease during initial stages: it either predisposes to HCC development before severe fibrosis ensues, or facilitates the evolution to early-intermediate fibrosis.
In line with this hypothesis, we also showed that the rs641738 T allele was associated with HCC development in non-cirrhotic patients with ALD or CHC.
In the Italian NAFLD cohort, the overall impact of the MBOAT7 rs641738 variant on HCC risk was similar to that of the I148M PNPLA3 variant. However, the effect of the I148M variant on HCC risk was not independent of severe fibrosis, suggesting that the mechanism is partly mediated by promotion of hepatic fibrogenesis and alteration of hepatic stellate cell biology 15,19,34 . Notably, the effect size of the PNPLA3 I148M variant was larger and only partially attenuated by the impact on liver fibrosis in a previous study conducted in a UK cohort 21 . This difference may be due to lifestyle factors, and to the higher prevalence of clinical cofactors, as opposed to genetic risk variants (lower frequency of the I148M variant) in the UK cohort. In addition, we also report for the first time an association between the TM6SF2 E167K variant and NAFLD-HCC. However, this association was not detected by univariate analysis due to an interaction of the TM6SF2 variant with clinical factors, and it was not found when previously sought in the UK NAFLD cohort 26 , whereas a predisposing effect on HCC was reported in an Italian cohort of patients with alcoholic cirrhosis 28 . Further studies are therefore required to confirm whether the E167K variant is an independent risk factor for HCC.
All in all, data suggest that genetic variants predisposing to hepatic fat accumulation promote hepatic carcinogenesis. Indeed, hepatocellular fat accumulation represents a key feature of hepatic carcinogenesis 35,36 . Therefore, these variants might represent useful biomarkers for risk stratification. In fact, in the Italian NAFLD cohort the number of genetic risk variants carried was strongly associated with HCC, with 13.4-fold higher risk in those carrying five risk variants as compared to none. Remarkably, the number of genetic risk variants was able to classify NAFLD patients in three groups with different HCC risk: 9% in the 36% of patients with 0-1 risk alleles, 19% in the 55% with 2-3 risk alleles, and 31% in those carrying more than 3 risk alleles. This could in principle allow a better stratification of HCC risk than that allowed by the PNPLA3 I148M variant alone, which was proposed by the EASL-EASD-EASO NAFLD guidelines 21,37 . However, in the present cross-sectional Italian cohort genetic risk variants did not significantly improve the predictive accuracy of clinical factors.
Limitations of the study include its cross-sectional retrospective nature, resulting similarly to previous reports 21 in an uneven representation of clinical risk factors (age, sex, T2DM, severe fibrosis) between HCC cases and controls. However, the majority of NAFLD-HCC patients are still diagnosed incidentally outside regular follow-up 38 , so that prospective studies in patients with advanced disease would not be more informative, especially for the risk of non-cirrhotic NAFLD-HCC. This could have led to an underestimation of the impact of inherited genetic risk variants on NAFLD-HCC, whereas the impact of clinical risk factors may have been overestimated. Therefore, the weight of specific factors in determining the HCC risk score should be reassessed in larger prospective cohorts with long follow-up and availability of the genetic risk profile before evaluation of genetic risk variants can be considered for implementation in clinical practice. Due to the relatively low number of patients included, these limitations are particularly relevant for the association of MBOAT7 variation with HCC development in patients without severe liver fibrosis. Finally, these results may not be applicable to other ethnic groups.
In conclusion, the MBOAT7 rs641738 T allele is associated with reduced MBOAT7 expression and may predispose to HCC in European individuals without cirrhosis, suggesting it should be evaluated in future prospective studies aimed at stratifying NAFLD-HCC risk.

Methods
Patients. We enrolled 132 consecutive unrelated patients with NAFLD-HCC of Italian descent, referred between January 2008 and January 2015 to the Milan, Udine, Turin, Rome, and Palermo hospitals, for whom DNA samples were available. Diagnosis of HCC was based on the EASL-EORTC Clinical Practice Guidelines 7 . In the absence of liver biopsy, diagnosis of NAFLD required detection of ultrasonographic steatosis plus at least one criterion of the metabolic syndrome.
As controls, we selected Italian-ancestry patients with histologically confirmed NAFLD followed at the same referral outpatient Hepatology services during the same study period 27,39 , from a recently published database 31 , who did not develop HCC during follow-up. We did not consider for this study bariatric patients from Milan because of the different recruitment criteria (indication to metabolic surgery), leading to different demographic and clinical features 27,39 , and the lack of incident HCC cases in the cohort.
All patients were tested for secondary causes of steatosis including alcohol abuse (≥30/20 g/day in M/F) and the use of drugs known to precipitate steatosis. Viral and autoimmune hepatitis, hereditary hemochromatosis, Wilson's disease, alpha-1-antitrypsin deficiency and present or previous active infection with HBV and HCV were ruled out using standard clinical and laboratory evaluation, as well as liver biopsy features.
Advanced fibrosis was defined as fibrosis stage F3-F4 40 , when liver biopsy was available. In HCC patients with radiological diagnosis, advanced fibrosis was defined by the presence of clinical, endoscopic or ultrasonographic signs of portal hypertension or cirrhosis (n = 46), or of liver stiffness ≥8.4 kPa evaluated by elastometry (n = 3), or by a positive NAFLD fibrosis score (n = 5) 41 . Obesity was defined as BMI > 30 kg/m² . All NAFLD patients without HCC included in the study underwent liver biopsy, while among those who developed HCC, fibrosis staging was histologically performed in 78 (59%) of the cases. Clinical features of subjects included according to the presence of HCC are shown in Table 1.
The UK NAFLD cohort comprising HCC cases and controls recruited at a single tertiary centre (the Newcastle upon Tyne Hospitals NHS Foundation Trust, UK) has previously been described 21 . The study had all the necessary ethical approvals and all participants gave informed consent. For this study, we considered patients for whom liver disease staging scored according to the NASH CRN histopathological system and DNA samples for MBOAT7 genotyping were available. Clinical features of the 358 individuals analyzed in this study are presented in Table S5.
We next evaluated the impact of the rs641738 variant on HCC risk in an Italian multicenter cohort of non-cirrhotic patients with chronic hepatitis C (CHC; n = 597) and alcoholic liver disease (ALD; n = 524). CHC patients were from the well-described histological Milan CHC cohort 42,43 , whereas HCC cases were previously described by our group (Milan HCC cohort, where presence of cirrhosis was carefully assessed) 44 .
ALD patients with and without HCC included patients from the Milan center, who were partly described previously (again the Milan HCC cohort), and for whom liver disease was evaluated by histology or as described above for NAFLD 44 . We also considered consecutive individuals who were admitted to the Outpatient Clinic at the Department of Clinical Medicine, Policlinico Umberto I, Rome for alcohol abuse or dependence between 2005 and 2014, for whom DNA samples were still available and genotyping was successful 45 . At-risk alcohol consumption was defined as ≥3/2 alcohol units per day for M/F, respectively. Cirrhosis was ruled out either by histological analysis or by the absence of the following features: (i) current or past cirrhosis complications; (ii) at least two parameters among hyperbilirubinaemia, hypoalbuminaemia, prolonged prothrombin time, low platelet count, irregular liver surface at ultrasound/CT, reduced portal vein flow at ultrasound, and gastroesophageal varices at endoscopy. Individuals with other coexistent liver diseases were excluded. Clinical features of this cohort are presented in Table S6.
The study protocol conformed to the ethical guidelines of the 1975 Declaration of Helsinki, was approved by the Ethical Committee of the Fondazione IRCCS Ca' Granda of Milan, as well as by the other involved Institutions, and was performed according to the recommendations of the hospitals involved. Informed consent was obtained from each patient.
Gene expression analysis. Expression of MBOAT7 and TMC4 was determined in two different subsets of patients. The first one was made up of 98 severely obese patients, with a very low prevalence of advanced liver fibrosis, and has previously been described in detail 31 . This was used to analyze the association of a 3′-untranslated region MBOAT7 locus variant, rs8736, possibly influencing MBOAT7 mRNA stability, with MBOAT7 expression. The second one was made up of 47 patients from the Hepatology service, and was characterized by a higher prevalence of liver fibrosis. Clinical features of these patients are presented in Table S4. This was used to evaluate the impact of liver fibrosis on the association between the rs641738 variant and MBOAT7 expression.
MBOAT7 expression was quantified as previously described 31 . Association analysis between the rs641738 variant (additive model) and gene expression, and analysis of linkage with rs8736, were conducted with the PLINK v1.07 genetic analysis software.
Statistical analysis. For descriptive statistics, continuous variables are shown as mean and standard deviation or median. Categorical variables are presented as number and proportion. The ORs of HCC or severe fibrosis per MBOAT7 rs641738 T allele and for other risk factors were estimated by logistic regression models, assuming an additive effect of genetic variants, and adjusted for clinical risk factors (the major known clinical risk factors for NAFLD-HCC in previous studies) as specified, PNPLA3 I148M and TM6SF2 E167K genotypes (clinical and genetic factors previously associated with or candidate for liver disease evolution to HCC) 21,26 . A NAFLD-HCC risk score was developed according to a previously described procedure 41 .
Statistical analyses were carried out using the JMP 13.0 Statistical analysis software (SAS Institute, Cary, NC), R statistical analysis software version 3.3.2 (http://www.R-project.org/), and PLINK v1.07 46 . P-values < 0.05 were considered statistically significant. The study methods and results have been reported according to the STROBE/STREGA guidelines for genetic association studies.
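The additive genetic model described above can be illustrated with a minimal, unadjusted logistic regression in which the only predictor is the number of T alleles (0, 1 or 2). The sketch below is not the analysis code of the study, which was run in JMP, R and PLINK with adjustment for covariates: the function name `fit_additive_logistic`, the Newton-Raphson fitting routine and the counts are all hypothetical. The counts were chosen so that the odds of HCC double with each T allele, so the fitted OR per allele should come out close to 2.

```python
import math

def fit_additive_logistic(genotypes, outcomes, n_iter=25):
    """Fit logit P(HCC) = b0 + b1 * (number of T alleles) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # Fisher information (negative Hessian)
        for x, y in zip(genotypes, outcomes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of b1 from the inverse information matrix,
    # evaluated at the last iterate (effectively the converged estimate)
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical (cases, controls) by number of T alleles, not study data
counts = {0: (10, 90), 1: (20, 90), 2: (40, 90)}
genotypes, outcomes = [], []
for g, (n_cases, n_controls) in counts.items():
    genotypes += [g] * (n_cases + n_controls)
    outcomes += [1] * n_cases + [0] * n_controls

b0, b1, se_b1 = fit_additive_logistic(genotypes, outcomes)
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
print(f"OR per T allele = {odds_ratio:.2f} (95% c.i. {ci_low:.2f}-{ci_high:.2f})")
```

Under the additive coding, exp(b1) is the multiplicative change in the odds of HCC for each additional T allele; adjusted estimates would be obtained by adding further covariates to the linear predictor.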