Copy number variation in exportin-4 (XPO4) gene and its association with histological severity of non-alcoholic fatty liver disease

A recent genome-wide copy number variation (CNV) scan identified a 13q12.11 duplication in the exportin-4 (XPO4) gene to be associated with non-alcoholic steatohepatitis (NASH). We sought to confirm the finding in a larger cohort and to assess the serum XPO4 pattern in a broad spectrum of non-alcoholic fatty liver disease (NAFLD) cases. We genotyped 249 NAFLD patients and 232 matched controls using a TaqMan assay and measured serum XPO4. Copy number distribution was as follows: copy number neutral (NAFLD: 53.8%, controls: 68.6%), copy number losses (NAFLD: 13.3%, controls: 12.9%), copy number gains (NAFLD: 32.9%, controls: 18.5%). CNV gain was significantly associated with a greater risk of NAFLD (adjusted OR 2.22, 95% CI 1.42–3.46, P = 0.0004) and NASH (adjusted OR 2.33, 95% CI 1.47–3.68, P = 0.0003). Interestingly, subjects carrying extra copies showed significantly higher serum ALT and triglyceride levels (P < 0.05). Serum XPO4 levels progressively declined (P = 0.043) from controls (24.6 ng/mL) to simple steatosis (20.8 ng/mL) to NASH (13.8 ng/mL). In conclusion, XPO4 CNV duplication was associated with the histological severity of NAFLD and was accompanied by changes in serum XPO4 levels. These findings provide insights into NAFLD pathogenesis and have potential for biomarker development.

Scientific Reports | 5:13306 | DOI: 10.1038/srep13306
(fatty infiltration of the liver) to the more severe form, non-alcoholic steatohepatitis (NASH; fat with inflammation and/or fibrosis); the latter can progress to cirrhosis and liver cancer 3 . Overall, however, GWAS approaches have identified only a fraction of the heritability of NAFLD, and candidate gene approaches have so far been inconclusive in linking genetic variants to the severity of NAFLD, with the exception of patatin-like phospholipase domain containing 3 (PNPLA3) 4 .
CNVs constitute a substantial fraction of genetic and phenotypic variability, affecting segments greater than 1 kb in length; these include both duplications (copy number gains) and deletions (copy number losses) of genetic material 5 . CNVs have been estimated to affect ~12% of the human genome 6 with over 1000 genes mapped within or close to CNV-affected regions 7 . Two genome-wide association scans have pioneered the study of CNV in NAFLD 8,9 . These have suggested that both the risk of NAFLD and its progression can be linked to DNA copy number variations (CNVs), which adds a new level of complexity to the study of the molecular determinants of NAFLD.
We have identified several genomic regions that are potentially relevant to non-alcoholic steatohepatitis (NASH) 9 , a severe form of NAFLD, but these need to be confirmed in a larger set of samples. In the present study, we have further assessed CNVR 13q12.11 because (i) CNV gain at locus 13q is one of the most differentially expressed CNVs in the hepatocellular carcinoma (HCC) genome 10 ; (ii) it contains exportin-4 (XPO4), a tumor suppressor gene involved in the pathogenesis of HCC 11 ; and (iii) a previous report has shown that XPO4 is expressed in both the liver and peripheral blood 12 . The XPO4 gene is a member of the importin β family that mediates the nuclear-cytoplasmic transport of protein cargoes 13 . Zender and colleagues, in their oncogenomics-based in vivo RNAi screen, demonstrated that the reintroduction of XPO4 selectively suppresses tumors, including HCC, with XPO4 deletion 11 . Increased expression of XPO4 was correlated with better prognosis and survival rate among patients with HCC 14,15 . The expression of XPO4 also seems to be significantly decreased in cirrhotic livers and after chronic hepatitis B infection when compared with normal healthy controls 12 .

Results
Study subjects. Subject demographic and clinical data are shown in Table 1. There were a total of 249 NAFLD patients: 110 (44%) were Malays, 86 (35%) Chinese and 53 (21%) Indians. Of the 232 controls, 79 (34%) were Malays, 97 (42%) Chinese and 56 (24%) Indians. NAFLD patients and controls differed significantly (P < 0.05) in gender, BMI, HbA1c, liver enzymes and lipid profiles, but not in age. These differences reflect the selection criteria applied to the two groups. NAFLD patients were further grouped into simple steatosis and NASH (Table 2). Histological parameters, waist circumference, BMI, HbA1c, and liver enzymes were significantly higher (P < 0.05) in patients with NASH than in those with simple steatosis.
Association of CNV 13q12.11 with NAFLD. On the basis of the GWAS discovery, we followed up the CNVR on chromosome 13q12.11 in 481 case-control replication samples, in whom we detected eight subjects (1.7%) with homozygous deletions, 55 subjects (11.4%) with heterozygous deletions, 293 subjects (60.9%) with two copies, 99 subjects (20.6%) with three copies, 20 subjects (4.2%) with four copies, and six subjects (1.2%) with five copies. Overall, 134 NAFLD patients (53.8%) were copy number neutral, 33 (13.3%) had deletions and 82 (32.9%) had duplications. Among the controls, 159 (68.6%) were copy number neutral, 30 (12.9%) had copy number losses and 43 (18.5%) had copy number gains. As shown in Table 3, the frequency of CNV gain was significantly higher in the NAFLD patients than in the controls (adjusted OR 2.32, 95% CI 1.49-3.61, P = 0.0002). However, there was no significant difference between the two groups for CNV loss. CNV gain was also significantly associated with NASH (adjusted OR 2.43, 95% CI 1.54-3.83, P = 0.0001) but not with simple steatosis. When NASH was further divided into NASH with no significant fibrosis (fibrosis score < 2) and NASH with significant fibrosis (fibrosis score ≥ 2), both groups were significantly associated with CNV gain (adjusted OR 1.87, 95% CI 1.08-3.21, P = 0.029 and adjusted OR 3.21, 95% CI 1.90-5.72, P = 5.94 × 10⁻⁵, respectively), with a stronger effect observed in the latter group. We also evaluated the association of CNV gain with NASH at the severe stage (fibrosis score ≥ 3; bridging fibrosis and cirrhosis) and found that patients with CNV gain had a 2.55-fold higher risk of advanced NASH (P = 0.012).
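As a rough cross-check of the association described above, the crude (unadjusted) odds ratio for CNV gain can be computed directly from the reported counts. The sketch below assumes that copy number neutral subjects form the reference group and uses a Wald interval on the log odds ratio; the function name is hypothetical, and the result is not expected to match the adjusted estimates exactly, because those come from logistic regression with covariates.

```python
import math

def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) with a Wald CI on the log odds ratio."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))

# Counts from the text: CNV gain vs. copy number neutral
# (assumed reference group), NAFLD patients vs. controls.
or_gain, ci_low, ci_high = crude_odds_ratio(82, 134, 43, 159)
print(f"Crude OR = {or_gain:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The crude estimate lands close to the adjusted odds ratios reported in the text, which suggests that adjustment for age and gender changes the estimate only modestly.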
When the subjects were stratified by ethnicity, the frequency of CNV gain was highest in the Malays (case: 43%, control: 27%), followed by the Chinese (case: 39%, control: 19%) and Indians (case: 27%, control: 18%). The CNV gain was associated with risk of NAFLD in the Malays (adjusted OR 2.02, 95% CI 1.02-3.99, P = 0.043) and Chinese (adjusted OR 2.80, 95% CI 1.32-5.95, P = 0.007) but not in the Indians. CNV gain was also associated with risk of NASH in the Chinese (adjusted OR 3.29, 95% CI 1.52-7.13, P = 0.003) (Supporting Information Table S1). At an α of 0.05, our replication study had a power of 98%, 22% and 98% to detect associations of duplications at 13q12.11 with NAFLD, simple steatosis and NASH, respectively.
To further investigate the effect of the CNV gain, we first validated the 67 (68%) available samples from the 98 discovery samples 9 using qPCR. The frequency of CNV gain (20.7%) in the control discovery samples was similar to that of the controls from the replication samples (18.5%). Results from the discovery samples indicated that CNV gain at 13q12.11 was associated with risk of NAFLD (OR 5.88, 95% CI 1.94-17.82, P = 0.002) and NASH (OR 5.11, 95% CI 1.67-15.67, P = 0.004), but not with simple steatosis. We then performed a meta-analysis including both discovery and replication studies (Fig. 1), which confirmed the significant association with NAFLD (OR 2.68, 95% CI 1.79-4.02, P < 0.0001) and NASH (OR 2.64, 95% CI 1.75-4.00, P < 0.0001).

Comparison of clinical parameters by different copy number status.
We compared clinical parameters in NAFLD patients with different copy number status (Table 4). Significant differences were found in serum ALT (P = 0.038), GGT (P = 0.007) and triglyceride levels (P = 0.046); both serum ALT and triglyceride are indicators of NAFLD severity. Higher copy numbers were associated with higher serum ALT (copy number deletion vs. neutral vs. duplication: 72.1 vs. 82.3 vs. 90.3 IU/L) and triglycerides (136.0 vs. 154.1 vs. 160.9 mg/dL). However, after adjustment for covariates such as age, gender and ethnicity, the differences were no longer significant (Table 4). To investigate the magnitude and direction of the effect, we used linear regression, which revealed positive correlations between CNV gains and log-transformed serum ALT (P = 3.20 × 10⁻⁵) and triglycerides (P = 0.004). Spearman's correlation showed a similar trend (P = 4.01 × 10⁻⁵ and P = 0.003, respectively).

Sensitivity analysis.
Because triglyceride is a NAFLD-associated risk factor 16 , and there was a significant correlation between the CNV and serum triglyceride, we wanted to exclude the possibility of collider stratification bias. We therefore performed a sensitivity analysis by excluding those patients with serum triglyceride levels that were above normal.
Serum XPO4 levels declined progressively from controls (24.6 ng/mL) to simple steatosis (20.8 ng/mL) to NASH (13.8 ng/mL) (P = 0.043; Table 2). As expected, the serum XPO4 levels were not different between controls and simple steatosis but were significantly different between controls and NASH (P = 0.014). We then assessed the effect of CNV 13q12.11 on serum XPO4 levels: there was a suggestion of a decrease in levels with extra copies of the DNA segment, although this was not statistically significant (CNV losses: 23.0 ng/mL; CNV neutral: 20.4 ng/mL; CNV gains: 18.0 ng/mL; Table 4).

Discussion
In this study, using a larger cohort of NAFLD patients with well-defined histological characteristics, we extended our previous work 9 and confirmed the association between the CNV 13q12.11 in the XPO4 gene and NAFLD. The XPO4 gene is a tumour suppressor gene located on the long arm of chromosome 13. Previous studies have suggested that XPO4 plays a role in the initiation of HCC as its expression level decreases with the development of cancer 14,15 . Furthermore, the inactivation of XPO4 is associated with HCC development in mice 11 . In the current study, we have shown that CNV gains in XPO4 are associated with severity of NAFLD, in particular with NASH with significant fibrosis. Lower serum XPO4 levels were found in patients with NAFLD when compared to healthy controls. Thus, we suggest that the presence of CNV in the XPO4 gene is a predictor of the histological severity of NAFLD. Taken together, our data suggest that XPO4 CNV duplication is associated with histological severity of NAFLD, especially with that of NASH, but may not be a distinctive marker for the progression from simple steatosis to NASH. In our CNV association study, we demonstrated that CNV gains may also be associated with increased serum ALT and triglyceride levels. Although it is unclear how the variations may cause these abnormalities, we speculate that it is related to the involvement of XPO4 in the signal transduction and nuclear export of SMAD family member 3 (Smad3) protein. Smad3 is an intracellular mediator of transforming growth factor beta 1 (TGF-β1) that has a multifaceted regulatory effect on metabolic homeostasis 17 . The downstream effects of Smad3 were observed in the aggravation of insulin resistance and adiposity 18,19 . By contrast, Smad3-knockout mice exhibited improved insulin sensitivity and β-oxidation, thereby ameliorating glucotoxicity and lipotoxicity in several organs including the liver 18 . However, the effects of the CNV on these parameters need to be confirmed in other populations and using larger samples.
The XPO4 gene is down-regulated in HCC tissues compared with normal tissue 14,15 . Recently, Zhang and colleagues showed that the expression was also lower in HCC when compared to liver cirrhosis and chronic hepatitis B 12 . In this study, we also report for the first time that serum XPO4 levels decrease in NAFLD in association with disease severity. This trend may be due to the role of XPO4 as a tumor suppressor. XPO4 mediates TGF-β1 that recruits and phosphorylates Smad3, and the resulting phosphorylation causes Smad3 to bind to DNA to modulate transcriptional events 20 . Notably, proteins such as XPO4 that are associated with signal transduction have been found to be expressed at lower levels as the degree of phosphorylation increases 21 . Our data also suggest that serum XPO4 levels may decline in relation to the number of copies, although this was not significant. This potential functional effect needs further study, particularly since there seems to be a trend towards a decrease in levels as the copy number increases. Interestingly, although most CNVs are gene dosage insensitive 22 , about 10% of the CNV duplications in the human genome are dosage reversed 23,24 . This may be due to reduced transcription and gene silencing as a consequence of the gene duplication 22 .
Our study has several limitations. Among them is the relatively small number of simple steatosis patients, which may have resulted in the negative association in this sub-group. However, since the observed association was consistent with the discovery study 9 , we feel the association is likely to hold true. In addition, Royo et al., in a genome-wide scan in a set of simple steatosis patients, did not find an association of CNV 13q12.11 with simple steatosis 8 . This limitation also arises because the recruitment centre, UMMC, is a tertiary referral centre where more severely affected patients, i.e. those with NASH, are likely to be seen. Association analysis was not performed for subjects with homozygous deletions, four copies or five copies because of the limited sample size. It is also important to note that in the discovery study using array comparative genomic hybridization (aCGH), this CNV was found exclusively in NASH and not in controls 9 . However, the present study showed that the frequency of CNV gain was approximately doubled in NASH (33.6%) compared to controls (18.5%). The observed discrepancy could be due to the following reasons: (i) the sample size of the discovery genome-wide study was relatively small; (ii) in aCGH, as used in the discovery study, the signal ratio between a case and a control sample is normalised and converted to a log2 ratio, which is used for the copy number call, so a gain may have been present in a control but not called because the ratio did not reach the significance threshold; and (iii) unlike monogenic disease with high penetrance and clear patterns of inheritance, NAFLD is a polygenic disease with greatly varying degrees of penetrance. Validation studies in a prospective setting are warranted. Our data were also not supported by XPO4 gene expression levels, which would have added strength to the findings. The strength of this study, however, is that the association findings were consistent with the discovery GWAS 9 . Furthermore, meta-analysis of the data from the discovery and replication studies confirmed the association with NASH (Fig. 1). To the best of our knowledge, this is the first study to confirm the previous genome-wide report on the association of the CNV with NASH.
In conclusion, we have demonstrated and confirmed the association of CNV gains at locus 13q12.11 with NASH. Lower serum XPO4 levels were observed in patients with NASH compared to those with simple steatosis and there was a suggestive decline in levels of serum XPO4 with extra copy number. Knowledge of the biological actions of XPO4 will enhance our understanding of its role in NAFLD progression. This study also needs to be replicated in a larger cohort and in various other ethnic populations.

Methods
Study subjects. We included 249 consecutive biopsy-proven NAFLD patients from the University Malaya Medical Centre (UMMC). NAFLD was confirmed through liver histology upon detection of symptoms or signs attributable to liver disease through imaging, or upon discovery of abnormal liver biochemistry 25 . NAFLD stages were evaluated according to the NASH Clinical Research Network (CRN) criteria 26,27 . All liver biopsy specimens were on average 1.5 cm long and contained at least six portal tracts. None of the subjects had evidence of hepatitis B or hepatitis C infection, autoimmune hepatitis, a history of alcohol consumption > 10 g/day 28 , exposure to drugs known to cause steatosis, or Wilson's disease. Based on the NASH CRN criteria, NAFLD patients were grouped into simple steatosis (n = 32) and NASH (n = 217); the latter was further stratified into NASH without significant fibrosis (fibrosis score < 2, n = 114) and NASH with significant fibrosis (fibrosis score ≥ 2, n = 103) 29 . For subsequent analysis, NASH was also grouped into early NASH (fibrosis score ≤ 2, n = 167) and advanced NASH (fibrosis score ≥ 3, n = 50) 25 .
All controls (n = 232) were genetically unrelated healthy subjects who were confirmed to have normal liver function and had no indication of fatty liver as determined by the following parameters: body mass index (BMI) < 23 kg/m², fasting plasma glucose < 110 mg/dL, and normal lipid profile. NAFLD was actively excluded in the controls by ultrasonography, based on the absence of all of the following findings: (i) slight diffuse increase in bright homogeneous echoes in the liver parenchyma with normal visualization of the diaphragm and portal and hepatic vein borders, and normal hepatorenal echogenicity contrast; (ii) diffuse increase in bright echoes in the liver parenchyma with slightly impaired visualization of the peripheral portal and hepatic vein borders; and (iii) marked increase in bright echoes at a shallow depth with deep attenuation, impaired visualization of the diaphragm and marked vascular blurring 30 . All experimental protocols were approved by the responsible Medical Ethics Committee of UMMC (ethics reference number: 702.11) and the methods were carried out in accordance with the approved guidelines. Written informed consent was obtained from all the patients prior to recruitment into the study.
Biochemical and clinical assessments. Anthropometric data, such as height and weight for the determination of body mass index (BMI, kg/m²) and waist circumference, were obtained using standard protocols. Blood pressure (mmHg) was measured according to standard recommendations and clinical practice guidelines. Hemoglobin A1c (HbA1c), high-density lipoprotein cholesterol (HDL), low-density lipoprotein cholesterol (LDL), total cholesterol, triglycerides, alanine aminotransferase (ALT), aspartate aminotransferase (AST), and gamma-glutamyl transpeptidase (GGT) levels were determined using standard clinical laboratory methods in an accredited laboratory at UMMC.

Measurement of serum XPO4 levels.
From the serum samples available, we randomly selected subjects that represented different copy number status from different disease stages (42 controls: 7 losses, 23 neutral and 12 gains; 19 simple steatosis: 2 losses, 9 neutral and 8 gains; and 34 NASH: 5 losses, 18 neutral and 11 gains). XPO4 evaluation was performed on an aliquot of serum collected after overnight fasting at the time of sampling and stored at −80 °C. Serum XPO4 levels were determined using a commercial enzyme-linked immunosorbent assay (ELISA) kit (SunRed Biotech, Shanghai, PRC) according to the manufacturer's recommendations; the lower limit of detection was 0.172 ng/mL.
CNVR 13q12.11 genotyping. Genomic DNA was extracted from the blood samples using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany). Extracted DNA of good quality (OD260/OD280 = 1.8-2.0) was diluted to a final concentration of 5 ng/μL. Copy number at 13q12.11 was called for every sample using a duplex TaqMan real-time quantitative polymerase chain reaction (qPCR) method (Assay Hs03857719_cn) following the Applied Biosystems protocols. Each reaction (20 μL) contained 10 μL master mix, 1 μL TaqMan Copy Number Assay, 1 μL TaqMan Copy Number Reference Assay, 4 μL nuclease-free water, and 4 μL genomic DNA. All reactions were run in quadruplicate with the following PCR cycling conditions: 1 cycle at 95 °C for 10 min, followed by 40 cycles at 95 °C for 15 sec and 60 °C for 1 min. Negative controls were included in every run to ensure genotyping quality.
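The text does not state how cycle threshold (Ct) values from the duplex assay were converted into copy number calls. A common approach for this kind of assay is the comparative Ct (ΔΔCt) method, in which the target Ct is normalised to the reference assay and compared with a calibrator of known copy number. The sketch below illustrates that calculation only; the function name, the Ct values and the two-copy calibrator are hypothetical assumptions, not details taken from the study.

```python
from statistics import mean

def estimate_copy_number(target_ct, reference_ct,
                         calibrator_delta_ct, calibrator_copies=2):
    """Estimate copy number from replicate Ct values (comparative Ct method).

    target_ct / reference_ct: replicate Ct values for the CNV assay and the
    reference assay measured in the same wells (duplex reaction).
    calibrator_delta_ct: mean (target - reference) Ct of a sample assumed to
    carry `calibrator_copies` copies.
    Returns (raw_estimate, rounded_call).
    """
    delta_ct = mean(target_ct) - mean(reference_ct)
    delta_delta_ct = delta_ct - calibrator_delta_ct
    # One cycle earlier than the calibrator means twice the template.
    raw = calibrator_copies * 2 ** (-delta_delta_ct)
    return raw, round(raw)

# Hypothetical quadruplicate Ct values for one sample.
raw, call = estimate_copy_number(
    target_ct=[24.0, 24.1, 23.9, 24.0],
    reference_ct=[25.0, 25.0, 25.1, 24.9],
    calibrator_delta_ct=0.0,
)
print(f"raw estimate = {raw:.2f}, call = {call} copies")
```

This sketch does not handle homozygous deletions, where the target assay gives no amplification and the Ct is undetermined; such samples would need to be called separately.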
CNV validation and meta-analysis. Validation of the previous CNV typing was done on the available samples using qPCR as above. To add strength to the findings, results of the discovery study and replication study were meta-analysed.
Statistical analysis. All statistical tests were performed using SPSS version 16.0 (IBM Corp., Chicago, IL, USA), unless otherwise mentioned. Data were presented as percentage or mean ± standard deviation (S.D.). Categorical and continuous variables were compared between NAFLD patients and controls using Pearson's χ² test, independent t-test and Mann-Whitney U test as appropriate. Odds ratios and 95% confidence intervals (CI) were computed using logistic regression. Multivariate analysis revealed that gender was a contributing factor for NAFLD, and hence gender was adjusted for in the subsequent analysis. We also included age in the adjustment, even though it was matched between NAFLD patients and controls, because age is a known risk factor for NAFLD. Parameter comparisons among CNV status groups were tested using Analysis of Variance (ANOVA) and the Kruskal-Wallis test as appropriate. Subsequently, Analysis of Covariance (ANCOVA) using the general linear model was applied with age, gender and ethnicity as covariates. The P-values were corrected for multiple testing using the false discovery rate (FDR) method of the Benjamini-Hochberg procedure 31 . Linear regression was used to assess the correlation between genetic variants and clinical parameters for normally distributed variables; otherwise Spearman's correlation test was adopted. Variables were log-transformed to achieve normality. Sensitivity analysis was performed according to the approach by Mefford and Witte 32 . A two-sided P-value of < 0.05 was considered statistically significant.
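The Benjamini-Hochberg correction mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not the authors' SPSS workflow; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P-values for four clinical parameters.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
for p, q in zip(raw_p, adj_p):
    print(f"raw P = {p:.3f} -> FDR-adjusted P = {q:.3f}")
```

A parameter is then declared significant when its adjusted value falls below the chosen FDR threshold, for example 0.05.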
Meta-analysis was conducted using the Review Manager (RevMan 5.3) of the Cochrane Collaboration utilising a Mantel-Haenszel test to estimate the pooled ORs and corresponding 95% CIs by assuming either fixed or random effect meta-analysis, where appropriate.
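The fixed-effect Mantel-Haenszel pooled odds ratio named above weights each study's 2×2 table by its size. The sketch below shows the point estimate only (RevMan also reports confidence intervals and heterogeneity statistics). The replication counts are taken from the Results, with copy number neutral subjects assumed as the reference group; the second table is a hypothetical placeholder, because the discovery counts are not given in the text.

```python
def mantel_haenszel_or(strata):
    """Pool 2x2 tables into a Mantel-Haenszel odds ratio.

    Each stratum is (a, b, c, d) = (exposed cases, unexposed cases,
    exposed controls, unexposed controls).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Replication study: CNV gain vs. neutral, NAFLD vs. controls (from the text).
replication = (82, 134, 43, 159)
# Hypothetical placeholder standing in for the discovery study counts.
placeholder_discovery = (12, 20, 4, 25)

pooled = mantel_haenszel_or([replication, placeholder_discovery])
print(f"Pooled Mantel-Haenszel OR (illustrative only) = {pooled:.2f}")
```

With a single stratum the estimator reduces to the ordinary cross-product odds ratio, which is a convenient sanity check.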
Power calculations were performed using Quanto power calculator version 1.2.4, with the following assumptions: the CNV frequency was 0.19, the baseline risk for the Malaysian population was 0.17 33,34 and the detectable odds ratio ranged from 1.5 to 2.0.