Variants in ABCG8 and TRAF3 genes confer risk for gallstone disease in admixed Latinos with Mapuche Native American ancestry

Latin Americans and Chilean Amerindians have the highest prevalence of gallstone disease (GSD) and gallbladder cancer (GBC) in the world. A handful of loci have been associated with GSD in populations of predominantly European ancestry, however, they only explain a small portion of the genetic component of the disease. Here, we performed a genome-wide association study (GWAS) for GSD in 1,095 admixed Chilean Latinos with Mapuche Native American ancestry. Disease status was assessed by cholecystectomy or abdominal ultrasonography. Top-10 candidate variants surpassing the suggestive cutoff of P < 1 × 10−5 in the discovery cohort were genotyped in an independent replication sample composed of 1,643 individuals. Variants with positive replication were further examined in two European GSD populations and a Chilean GBC cohort. We consistently replicated the association of ABCG8 gene with GSD (rs11887534, P = 3.24 × 10−8, OR = 1.74) and identified TRAF3 (rs12882491, P = 1.11 × 10−7, OR = 1.40) as a novel candidate gene for the disease in admixed Chilean Latinos. ABCG8 and TRAF3 variants also conferred risk to GBC. Gene expression analyses indicated that TRAF3 was significantly decreased in gallbladder (P = 0.015) and duodenal mucosa (P = 0.001) of GSD individuals compared to healthy controls, where according to GTEx data in the small intestine, the presence of the risk allele contributes to the observed effect. We conclude that ABCG8 and TRAF3 genes are associated with GSD and GBC in admixed Latinos and that decreased TRAF3 levels could enhance gallbladder inflammation as is observed in GSD and GSD-associated GBC.


Results
Discovery GWAs for GsD in an admixed Chilean population. The Discovery GWAS stage involved 1,235 admixed Chileans Latinos with Mapuche Native American Ancestry genotyped using the Affymetrix Axiom ® Genome-Wide LAT 1 World Array 4 (see complete pipeline in Supplementary Fig. 1). After genotype calling and quality controls, 1,095 individuals (529 GSD cases and 566 controls) and 677,835 SNPs remained for analysis (Table 1). We imputed genotypes in these individuals using the 1000 Genomes Project Phase 3 reference panel 16 and achieved a total number of 9,433,911 SNPs and Indels. We tested for genetic association with GSD by performing a logistic regression analysis adjusted by age (in years; cases = 51.32 ± 10.67, controls = 49.87 ± 9.57, one-way ANOVA P = 0.018), since other known associated covariates such as sex (women%; cases = 92.43%, controls = 92.57%, Z score test P = 0.928), body mass index (BMI; cases = 29.44 ± 4.28, controls = 29.66 ± 3.94, one-way ANOVA P = 0.376) and native American ancestry proportion (cases = 46.3% ± 7.0%, controls = 45.5% ± 7.5%, one-way ANOVA P = 0.069), did not differ significantly between cases and controls.

Association of ABCG8 and TRAF3 variants with GsD in an independent Chilean population.
The top-ten candidate variants were tested for replication in an independent admixed Chilean population composed of 1,643 individuals (626 cases and 1,017 controls) from Santiago de Chile (Table 1). Replication was performed by real-time qPCR with TaqMan probes for most leading variants, with the exception of SNP3 (4q12), SNP4 (TRAF3) and SNP9 (11p15.3), which were examined by proxy SNPs in high linkage disequilibrium (LD r 2 > 0.88). Nine of the ten candidates were in Hardy-Weinberg equilibrium, where only the SNP3 showed a significant deviation (p = 2 × 10 −40 ,  Table 2). Both variants passed Bonferroni correction for 10 tests (P < 0.005) and displayed an effect with the same direction as it was observed in the discovery stage of the GWAS. To assess for the combined effect between ABCG8 and TRAF3 SNPs with GSD, we performed a meta-analysis 17 between the Discovery and Replication stages and observed an increased significance in the association of ABCG8 (rs11887534, P = 3.24 × 10 −8 , OR = 1.74, CI 95% = 1.45-2.02) and TRAF3 (rs12882491, P = 1.11 × 10 −7 , OR = 1.40, CI 95% = 1.20-1.60) variants (Table 2). Notably, the OR obtained for ABCG8 represents a PAR of 7.1%, while for TRAF3 (rs12882491, risk allele C) the OR of 1.40 (CI 95% = 1.20-1.60) represents a PAR of 20.2%. Taking together, the combined effect of both associations gives an estimated PAR of 25.9%, meaning that if their contribution for the disease is hypothetically eliminated 18 , there would be 25.9% fewer affected individuals in the population. In addition, SNPxSNP epistasis analysis showed non-significant interaction between ABCG8 and TRAF3 association signals (P = 0.399), suggesting that the genetic contribution of both risk factors is independent from each other. Altogether these results confirm and replicate the previously observed associations of ABCG8 with GSD and identify TRAF3 as a novel candidate gene associated with the disease in the Chilean Admixed population.
ABCG8 and TRAF3 variants are associated with GBC. Although GBC is the most severe complication of GSD 5,6,19 , few studies have investigated the effect of ABCG8 as a genetic determinant for this cancer 20,21 . We therefore examined the orientation and magnitude of the effect for ABCG8 and TRAF3 variants in a Chilean population of GBC patients (n = 397 women), using sex-matched controls from the replication population (n = 667 women) ( Table 1). After logistic regression analyses adjusted by age (in years; cases = 62.65 ± 13.08, controls = 43.38 ± 12.13, one-way ANOVA P < 0.001), we observed that both SNPs were associated with GBC (ABCG8 rs11887534: P = 6.9 × 10 −4 , OR = 1.77, CI 95% = 1.27-2.45; TRAF3 rs12882491: P = 0.045, OR = 1.24, CI 95% = 1.004-1.53), with same direction of the effect and similar risk allele frequency (RAF), as it was observed in the discovery and replication populations (RAF ABCG8 : GBC = 0.14, discovery = 0.14, replication = 0.12; RAF TRAF3 : GBC = 0.66, discovery = 0.69, replication = 0.66) (Fig. 3). We also observed that the association of ABCG8 and TRAF3 variants with GBC was significant only when GBC samples were compared to gallstone-free  Table 2), indicating that the risk contribution to GBC was explained by the presence of gallstones. These results support the genetic association of ABCG8 with GBC and reveal TRAF3 as a novel marker for the disease.
Replication Analyses for ABCG8 and TRAF3 variants in europeans populations. Since we originally identified ABCG8 as a risk factor for GSD in a German population from Kiel 10,11 , we therefore examined ABCG8 rs11887534 and TRAF3 rs12882491 variants in an extended sample from the original POPGEN-Kiel population consisting in 1,938 individuals (1,027 cases and 911 controls) 22 , as well as in 4,154 individuals (882 cases and 3,272 controls) from the north-east side of Germany identified as SHIP-Greifswald 23 ( TRAF3 expression in human gallbladder and duodenal samples. TRAF3 is a member of the tumor necrosis factor (TNF) receptor-associated factor (TRAF) protein family that mediates signal transduction during the activation of immune and inflammatory responses [24][25][26] . To date, seven TRAF genes are known in humans (TRAF1 to TRAF7) and the products encoded by these genes are involved in cytokine production and cell survival 27 . Considering that the acute and chronic inflammation of the gallbladder (cholecystitis) are known hallmarks of GSD pathophysiology 28 , we explored the expression levels of the TRAF3 protein in 9 gallbladder mucosa samples (4 cases and 5 controls) and mRNA in 25 gallbladder (12 cases and 13 controls) and 41 duodenal (22 cases and 19 controls, independent from the mRNA samples) samples. We used the duodenal mucosa as a comparison for three reasons: First, it is a tissue in contact with bile coming from the liver and gallbladder; second, it is easily accessible by endoscopy in order to obtain biopsies from affected and healthy individuals; and third, evidence show the existence of a similar gene expression pattern compared with both liver and gallbladder tissues 29 . Immunohistochemical staining in the gallbladder mucosa revealed that TRAF3 protein was observed in different gallbladder cell types such as the epithelium, smooth muscle fibers, arterioles and veins (Fig. 4a). Notably, we found a significant decrease of protein levels in the duodenal mucosa of GSD cases compared to control individuals (P < 0.001, two-tailed t-test; Fig. 4b,c and Supplementary Fig. 3) and also observed a significant decrease of the gene transcripts in the set of mRNA samples from duodenal (P < 0.001, two-tailed t-test) and gallbladder mucosa (P = 0.015, two-tailed t-test) (Fig. 4d). These results show lower levels of TRAF3 in GSD affected individuals,

Discussion
Herein we report that common genetic variants within the ABCG8 and TRAF3 genes confer risk to GSD and GBC in the admixed Chilean population. The ABCG8 polymorphism rs11887534 (D19H) has been reported in several world populations including Latinos 15 , thus the results reported by the present study serves as a novel replication and adds strength to the observed association of the gene. In the discovery stage, the leading ABCG8 SNP showed a lower magnitude of association, compared with the estimated from the German population 10 , which could be explained by the difference in the risk allele frequency (C allele) between Chilean and German controls (0.079 vs. 0.05, respectively). This could be a reason for not being able to detect in the discovery stage a genome-wide signal (P < 5 × 10 −8 ) in the ABCG8 locus, suggesting that larger sample sizes are required to reach such significance level in future studies on this population.   ABCG8 was consistently associated with GSD in all the populations analyzed in this study, however, TRAF3 association was only observed in Chilean samples. These results suggested that TRAF3 association might have an ethnic-specific effect. Interestingly, data from the 1000 genomes project from Peru (PEL), Mexico (MXL), Colombia (CLM) and Puerto Rico (PUR) show a higher risk allele frequency compared with the one from the overall European population (RAF PEL = 0.84, MXL = 0.55, CLM = 0.53, PUR = 0.49, EUR = 0.34) indicating that would be of interest to validate the TRAF3 association in closely related Latino populations like Peru, Argentina and Bolivia, that also have a high GSD prevalence [30][31][32] .
GSD has known confounding factors such as age, gender, BMI and T2D, and past studies have considered some of these biological variables into their association analyses 10,15,33,34 . Furthermore, admixed populations present a special challenge in GWAS analyses due to the presence of 2 or more ancestry components that needs to be considered when performing the association testing 35,36 . In this regard, we performed 2 additional analyses to examine whether the association of ABCG8 and TRAF3 variants with GSD could be affected by these confounding factors. First, besides age as the main covariate, we included the 5 first eigenvectors from the principal component analysis (PCA, see methods) to take into account any hidden population structure that could still be present. Second, we adjusted the logistic regression by age, the aforementioned 5 eigenvectors and the other known GSD covariates such as gender and BMI to test their effect regardless of their distribution in cases and controls. Remarkably, we found that these additional adjustments did not alter the selection of ABCG8 or TRAF3 as candidate genes for the replication stage since the association P values for their leading variants remained above the significance cutoff of P < 1 × 10 −5 (Supplementary Table 3).
Here we also observed that ABCG8 and TRAF3 variants are significantly associated with GBC in admixed Native Americans. For instance, the ABCG8 rs11887534 variant has been previously associated with this malignancy in China 21 , India 20 and European populations 37 and most of the studies show that the GBC risk is more pronounced in patients that also had biliary stones in the gallbladder, as it is likely observed in this Chilean cohort (i.e. composed of 90% women GBC patients with cholesterol gallstones). This agrees with our findings where both ABCG8 and TRAF3 variants only contribute to risk when comparing GBC subjects with gallstone free controls 38 . A recent report from a group in India showed a GWAS scan for GBC in their population and found that the genes ABCB1 and ABCB4 were significantly associated with GBC 38 . We looked for their association in our GSD GWAS discovery population and found that none of the 3 reported SNPs showed at least nominally significant P values (Supplementary Table 4). This could be indicating that these new GBC risk factors are not be necessarily associated to the GSD phenotype linked to GBC in the Chilean population.
TRAF3 belongs to the TNF receptor associated factor (TRAF) protein. TRAF3 function has been related to the promotion of autoimmunity and predisposition to various cancers 39 , including multiple myeloma 40 , Hodgkin lymphoma 41 and splenic neoplasms 42 . The protein physically interacts with the NF-κB-inducing kinase (NIK) promoting NIK degradation by the proteasome, which results in the inactivation of the non-canonical NF-κB pathway 43 . Indeed, siRNA knockdown of TRAF3 in human cancer cell lines stabilizes NIK and activates NF-κB dependent transcription 44 . Interestingly, recent evidence indicates that activation of NF-κB results in invasion, lymphangiogenesis and tumor growth in GBC tissue and cell lines 45,46 , and therefore genetic alterations in one of its modulators could be directly involved in GBC pathogenesis and may serve as a target of future therapies and prevention.
Inflammation is a hallmark of GSD pathogenesis and it has been suggested that the formation of biliary stones is preceded by histopathologic alterations in the gallbladder wall that indicate tissue inflammation 47 , including edema, increased wall thickness, decreased motility and altered transport function in the epithelium. Therefore, a crosstalk between cholesterol stone formation and inflammatory processes might exist that promote the development or progression of the disease and suggests that ABCG8 and TRAF3 could be both relevant actors in the pathology. First, it is known that the ABCG8 variant promotes gallstone formation by enhancing the efflux of cholesterol-saturated bile 11 . Second, lower expression levels of TRAF3 in duodenal mucosa and gallbladder epithelium of GSD patients (as observed in the present study) could be responsible for increased production of pro-inflammatory factors, such as NF-κB, IL-6 and IL-12B, as well as for decreased activity of anti-inflammatory factors such as type 1 interferon and IL-10 44,48-50 . Notably, the IL-10 has already been associated with GSD and GBC 51 . Interestingly, according to the Genotype-Tissue Expression (GTEx) project database, which offers gene expression and quantitative trait loci associations from 53 human tissues 52 , individuals with the TRAF3 risk allele (rs12882491, C) present decreased TRAF3 expression levels in the small intestine, the closer tissue to our duodenum data (Supplementary Fig. 4). Although the correlation is not statistically significant, the direction of the effect agrees with our findings, indicating that the GWAS risk variant could be functionally linked to the decreased gene levels observed in the study.
In summary, our results confirm the association of ABCG8 and identify TRAF3 as a novel candidate gene for GSD and GBC in the high-risk Chilean admixed population. We hypothesize that both genes are functionally involved in the cascade of events that triggers gallstone formation and the inflammatory response observed in affected patients. Further research is needed to determine the role of the variations within the TRAF3 gene that could explain the observed differential gene expression and its contribution to the pathophysiology of the disease.

Methods ethical statement. All procedures involving genetic and clinical data usage from Chilean GSD and GBC
patients were approved by the Ethical Scientific Committee at the Pontificia Universidad Catolica de Chile and were conducted in accordance with the guidelines of the National Commission on Science and Technology (CONICYT-Chile). All participants provided informed consent. In the German replication samples, Ethical Scientific Committees have previously approved these studies 22 Table 1). To reduce the contribution of confounding factor in the following genetic association analyses, cases and controls were matched for known GSD covariates and comorbidities, including age at sampling, sex distribution and BMI (Table 1). We excluded T2D individuals from the discovery stage in order to reduce the number of factors to consider in the adjusted analyses.
Chilean GSD replication stage. The independent replication population consisted of 1,643 Chilean individuals (626 cases and 1,017 controls), coming from the ANCORA family health centers and from a population base study in La Florida, which has been previously described 7 . Information regarding age, sex, BMI and T2D was gathered for all individuals to control for covariates/comorbidities in the genetic association analyses.
Chilean gallbladder cancer (GBC) population. A secondary Chilean replication population was derived from GBC patients, with more than 90% presenting gallstones. The population is composed of a total number of 397 incident and prevalent women cases, recruited from different health centers of the country including: Hospital Sótero del Río, Clinical Hospital of the Pontificia Universidad Católica de Chile, Hospital Regional de Valdivia, Hospital Regional de Temuco and Hospital Regional de Concepción. GBC phenotype was defined according to criteria and steps described by the Classification of Malignant Tumors and the American Joint Committee on Cancer (TNM-AJCC) where only adenocarcinomas were included. Each histological cut was reviewed and selected by two independent expert pathologists. After delimiting the healthy tissue, either adjacent to the tumor or other non-tumoral tissues such as liver or cystic ganglion, blocks were selected for sequential cuts of 10 µm thickness and used for extraction of genomic DNA. DNA concentration and integrity were determined by the NanoDrop ND-1000 ® Spectrophotometer and by agarose gel electrophoresis, respectively.
European replication populations. We analyzed two European GSD population from Germany. The first population comes from the POPGEN biobank of the Medicine Faculty at the University of Kiel, consisting in 1,938 individuals (1,027 cases and 911 healthy controls) where GSD status was determined by cholecystectomy or ultrasound 22 . Considering that GSD prevalence increases with age 10 , the statistical analysis performed on this sample did not include age as a covariate, since stone-free controls were chosen and they had a higher median age (59.43 years old) than affected individuals (45.5 years old). The second population comes from the Study of Health in Pomerania (SHIP) from the Greifswald region in northeast Germany, comprising 4,154 subjects (882 GSD cases and 3,272 healthy controls) where GSD cases were identified by cholecystectomy/ultrasound and control individuals by senlf-reports 23 .
Genomic DNA extraction and genotyping. Genomic DNA was extracted from samples of the Chilean discovery population, replication and GBC incident cases (alive patients) using peripheral blood (Invisorb Blood Universal kit, Invitek, Berlin, Germany). For GBC prevalent cases, DNA was extracted from normal (non-tumoral) tissue samples using the QIAamp DNA-Mini-Kit © (Qiagen, Mainz, Germany). Extracted DNA was quantified with a nanodrop spectrophotometer to a fixed concentration of 50 ng/μl. For the 2 European populations, DNA extraction methods is described in their respective publications. Genome-wide genotyping on the Chilean discovery population was performed with the Affymetrix Axiom ® Genome-Wide LAT 1 World Array 4 at the Cologne Center for Genomics (University of Cologne, Germany). The microarray includes probes to identify 818,155 SNPs and it has been optimized to enhance the capture of variants in populations with Amerindian ancestry 54  Global Ancestry estimation. Global ancestry proportions for each individual in the Chilean discovery population was calculated with the ADMIXTURE v.1.3.0 software 60 using the reference panel of the 1000 Genomes project Phase-3 and 11 individuals from the Mapuche-Huilliche ethnic group as a Chilean Native American reference panel (HUI) 59 . The mean proportion of the Amerindian ancestry was calculated in cases and controls and tested for significant deviation using a one-way ANOVA test.
Genotype imputation and GWAs. Genotypes were imputed using IMPUTE2 v2.3.1 without the prephasing step in order to gain accuracy 61,62 . We used the genotype reference panel coming from the 1000 Genomes Project Phase 3 for autosomic variation (October 2014 release) and 1000 Genomes Project Phase 1 for the X chromosome (March 2012 release). After imputation, we selected SNPs and Indels with an INFO score > 0.4, MAF > 0.01 and HWE > 1 × 10 −6 leaving 9,433,911 analysis-ready variants. Information on whether a candidate variant comes from imputation or actual chip genotyping is presented in Table 1. Genetic association analyses were performed using the score statistic as an additive logistic regression model implemented in SNPTEST v2.5 63 and adjusted for age as a covariate. Other GSD known confounding factors such as gender, BMI and TD2 were either matched or non-existent in cases and controls (see results), therefore we did not include them in the main association analysis in order to avoid overfitting. The Population Attributable Risk (PAR) was calculated for ABCG8 and TRAF3 leading SNPs in the combined Chilean discovery and replication population, using the formula: PAR = P R F × (RR − 1)/1 + P RF × (RR − 1) × 100, where P RF is the prevalence of the risk factor in the general population. The allelic odds ratio (OR) was used as an approximation for the estimated relative risk (RR) of disease due to exposure to the risk allele. Combined PAR for the 2 risk variants was calculated as: . SNPxSNP interaction analysis was performed with PLINK v1.9 epistasis function.
statistical analyses in the replication populations. Replication association analysis was performed using additive logistic regression tests with PLINK v1.9, adjusting by age, sex, BMI and T2D status as they were available for the data. Fixed effect meta-analysis was performed between the discovery stage and variants showing successful replication after Bonferroni correction for 10 tests (P < 0.005), using the inverse-variance method implemented in the -meta-analysis command in PLINK. Forest plots were generated using the rmeta package v.2.16 in R. Regional association plots were generated using LocusZoom (v1.1) 65 .
Differential gene expression analysis of TRAF3. For expression and histological analyses (see below) human gallbladder mucosa samples were used as described 66 . Briefly, duodenal mucosa samples were extracted from gallstone and gallstone-free individuals that met the following criteria: women 18 to 35 years of age, non-obese (BMI 18-30 kg/m 2 ), without significant pathologies and consumption of any drug 2 months prior to recruitment and not pregnant at the time of contact. Duodenal biopsies were taken from the second segment of the duodenum (distal to the ampulla of Vater), during gastrointestinal endoscopy performed at the Endoscopic Unit of the Gastroenterology Department of the Pontificia Univesidad Católica de Chile. Total RNA from was extracted using the PureLink TM RNA Mini Kit (Ambion Life technologies, Carlsbad, USA) and reverse-transcribed into cDNA with a High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific Inc., Carlsbad, USA). Quantitative RT-PCR from 25 gallbladders (12 cases and 13 controls) and 41 duodenal (22 cases and 19 controls) samples was performed using specific DNA primers for TRAF3 (Forward primer = 5′-TACAGCGTGTCAAGAGAGCATCGT-3′, reverse primer = 5′-CACAACCTCTGCTTTCATTCCGACAATAG -3′) and SYBR-Green Brilliant III Ultra-Fast SYBR-Green qPCR Master Mix (Agilent Technologies Inc., Santa Clara, USA). Data were expressed in arbitrary units and were normalized to RNA18S levels. Statistical significance was determined using a two-tailed t-test (*P < 0.05).
Western blot. Fifty micrograms of total proteins from duodenal mucosa lysate of 9 samples (5 control and were used as secondary antibodies (1:1000, Santa Cruz Biotechnology, Inc., Santa Cruz, CA, USA) for 1 h at room temperature. The signals were detected using the chemiluminescence kit ECL Western Blotting (Pierce Biotechnology, Inc., Rockford, IL, USA). Complete Western Blot gels from where Fig. 4b. was composed is presented in the Supplementary Fig. 3.
Immunohistochemistry analysis. Two-micron sections of gallbladder samples were fixed with formalin and embedded in paraffin. For TRAF3 immunohistochemistry analysis, the rabbit anti-TRAF3-ab155298 antibody was used (1:100, ABCAM, Ltd, Cambridge, UK) and was visualized using the Envision Flex/HRP SM802 complex method (Dako, Agilent Technologies Company, Santa Clara, CA 95051, USA). Labeled sections were examined and captured using an Olympus BX51 microscope (Olympus, Tokyo, Japan) with the software Stereo Investigator v.11.03 (MBF bioscience, Williston, VT, USA) and analyzed with the Image-Pro Express program (Media Cybernetics, Bethesda, MD, USA) as described previously 66 .

Data Availability
All genotype files for the Chilean discovery population are available upon request to the corresponding authors.