Common variation in FAM155A is associated with diverticulitis but not diverticulosis

Colonic diverticulosis is a very common condition. Many patients develop diverticulitis or other complications of diverticular disease. Recent genome-wide association studies (GWAS) consistently identified three major genetic susceptibility factors for both conditions, but did not discriminate between diverticulitis and diverticulosis, in particular due to the limitations of registry-based approaches. Here, we aimed to confirm the role of the identified variants for diverticulosis and diverticulitis, respectively, within a well-phenotyped cohort of patients who underwent colonoscopy. Risk variants rs4662344 in Rho GTPase-activating protein 15 (ARHGAP15), rs7609897 in collagen-like tail subunit of asymmetric acetylcholinesterase (COLQ) and rs67153654 in family with sequence similarity 155 A (FAM155A) were genotyped in 1,332 patients. Diverticulosis was assessed by colonoscopy, and diverticulitis by imaging, clinical symptoms and inflammatory markers. Risk of diverticulosis and diverticulitis was analyzed in regression models adjusted for cofactors. Overall, the variant in FAM155A was associated with diverticulitis, but not diverticulosis, when controlling for age, BMI, alcohol consumption, and smoking status (ORadjusted 0.49 [95% CI 0.27–0.89], p = 0.002). Our results contribute to the assessment of specific genetic variants identified in GWAS in the predisposition to the development of diverticulitis in patients with diverticulosis.

risk of developing DD in the Icelandic population, containing a replication cohort of Danish individuals with DD 10. The initial analysis included 15,220 Icelanders who were tested for associations with DD (5,426 cases) and its more severe form, diverticulitis (2,764 cases). Subsequently, after applying weighted thresholds 11, 16 sequence variants identified in the GWAS were followed up in a DD sample from Denmark with 5,970 cases and 3,020 controls. In the combined analysis of these sample sets, three genetic loci were identified that showed genome-wide significance and may be associated with the risk of DD and/or diverticulitis: intronic variants at the ARHGAP15 (Rho GTPase-activating protein 15), COLQ (collagen-like tail subunit of asymmetric acetylcholinesterase) and FAM155A (family with sequence similarity 155 A) loci were significantly associated with DD. The second GWAS 12 included 27,444 patients from the European component of the UK Biobank resource and compared them with 382,284 controls. Overall, 154 associated variants were further tested in 31,221 patients from the Michigan Genomics Initiative, finally confirming 42 associated variants including the three previously identified variants 10. Most recently, a third European GWAS 13 containing 451,099 patients, in addition to identifying further loci, also confirmed these three major loci.
Notably, the dissection of the specific phenotypes diverticulosis and diverticulitis was incomplete in the GWAS, since the assessment of diverticulosis or diverticulitis was based on the ICD code. The studies applied the corresponding ICD codes in the ICD10 (K572-K579) and ICD9 (K562.) [10][11][12][13] systems, which also encompass patients with diverticulitis (as well as diverticular bleeding) among the patients with diverticulosis. Even though ICD codes can identify patients with diverticulosis and DD, they do not discriminate between diverticulosis and diverticulitis 14. A subanalysis of the patients from the Icelandic subcohort of the Icelandic/Danish GWAS included only patients with either surgically treated or complicated diverticulitis 10, whereas mild and often outpatient-treated cases of diverticulitis were probably mostly missed. This information is not available in the Danish subcohort at all. Additionally, patients with asymptomatic diverticulosis were not analyzed in these GWAS. A part of our samples was used in the GWAS by Schafmayer et al. 13. After adding further samples, we re-analyzed them using additional clinical covariates from the database as outlined below, specifically using patients with diverticulosis and no prior diverticulitis as controls for the endpoint diverticulitis, and healthy individuals with no diverticula as controls for the endpoint diverticulosis.
In this study, we therefore aimed to evaluate the associations of three major SNPs reported in the Icelandic GWAS, which applied weighted thresholds, and confirmed in the North-American 12 and European 11,13 GWAS: ARHGAP15 (rs4662344), COLQ (rs7609897) and FAM155A (rs67153654). The risk of developing diverticulosis and diverticulitis, respectively, was determined in a Caucasian (German/Lithuanian) cohort phenotypically characterized for the specific phenotypes diverticulosis and diverticulitis. Due to the similar genetic background of Germans and Lithuanians 15, a combined analysis was performed.

Patients and Methods
Study population. All patients taking part in the study were recruited between 2012 and 2016 among patients referred for colonoscopy at the Department of Medicine II, Saarland University Medical Center, Homburg, the Clinic for Gastroenterology and Hepatology, University Hospital of Cologne, Cologne, Germany, and the Department of Gastroenterology at the Lithuanian University of Health Sciences, Kaunas, Lithuania. A part of the samples from the Lithuanian cohorts came from our previous studies on colonic diseases and diverticulosis [16][17][18] and were also used in a previous GWAS with less clinical information 13. All patients and controls were of self-reported Caucasian ancestry (including grandparents). Risk factors, epidemiological and baseline data were assessed using a structured interview, performed by a physician assisting the patients with the questionnaires. The presence of diverticula was assessed by colonoscopy in all patients, which is the most widely accepted standard to detect diverticula. Only patients with complete colonoscopy including inspection of the cecum and at least adequate preparation, as assessed by the physician performing colonoscopy, were included in the study. All colonoscopies were performed using digital video endoscopes (high-resolution scopes Olympus CF 160, 180 or 190) by a senior gastroenterologist. Intestinal lavage for endoscopic examination was performed using 2 liters of a solution containing polyethylene glycol. Patients with inherited connective tissue disorders such as Ehlers-Danlos or Marfan syndrome, patients of non-Caucasian ethnicity, and relatives of included patients were excluded. The diagnosis of diverticulitis was established according to the current classifications for DD 19,20. It was based on imaging by computed tomography and/or ultrasound as well as corresponding clinical (pain in the lower left abdomen) and laboratory characteristics (increased serum inflammation markers).
Suspected complicated diverticulitis was assessed with computed tomography in all cases. The study protocol was approved by the Research Ethics Committee of the Saarland University (approval 63/11), the Research Ethics Committee of the University of Cologne (approval 16-397) and the Regional Kaunas Ethics Committee (protocol No BE-10-2). The study was performed according to the Declaration of Helsinki. All patients signed an informed consent form to participate in the study. For the purpose of this study, cases were defined as patients with diverticulosis or diverticulitis, respectively.
DNA extraction and genotyping. Genomic DNA was extracted from peripheral blood mononuclear cells using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). DNA concentrations were measured using a NanoDrop spectrophotometer. DNA samples were stored at −20 °C until analysis. Genotyping of the three genetic polymorphisms (rs4662344, rs7609897 and rs67153654) with Taqman assays was performed in 856 patients with diverticulosis and 479 controls of Caucasian descent in our accredited laboratory (DIN EN ISO 15189) in Homburg by a technician blinded to the phenotype of the patients. The fluorescence data were analyzed with allelic discrimination 7500 software v.2.0.6.
Statistical analysis. Statistical Package for the Social Sciences (SPSS, Version 20, IBM, Munich, Germany) was used for statistical analysis. Power calculations were performed using PS (http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampeSize) to detect a significantly increased OR of 2 with a power of 80%, based on the corresponding frequencies of the risk alleles in rs4662344, rs7609897 and rs67153654, and assuming type I error rates of 0.05. Quantitative data were expressed as medians and ranges. Comparisons of frequencies of genotypes at the three loci were performed in 3 × 2 contingency tables listing cases and controls.
Genotypic and allelic association tests were performed using χ² or Fisher's exact tests (https://ihg.gsf.de/cgi-bin/hw/hwa1.pl). Due to the few homozygous carriers of the risk alleles, we applied a dominant model. Genotype association analysis between SNPs and diverticulosis was performed using multiple logistic regression models adjusted for age, BMI, smoking status, and alcohol consumption, assuming log-additive effects. P-values < 0.05 were considered statistically significant. Results are expressed as odds ratios (OR) and 95% confidence intervals (CI). Pairwise linkage disequilibrium (r²) of the identified variants in all three GWAS 10,12,13 was calculated with LDpair 21 using a Caucasian reference population (CEU).

Results
Patient characteristics. In total, 1,332 patients (634 men, 47.6%) were included. Table 1 summarizes the baseline data of this study cohort. Frequency of diverticulosis in our cohort was similar to prior data 20 [...] in Supplementary Table 1. The call rate was >95% for all variants. The genotype frequencies (cut-off P > 0.05) were in Hardy-Weinberg equilibrium (HWE) in all controls for the variants in ARHGAP15 and FAM155A. The genotype distribution of the variant in COLQ deviated from HWE in controls (p < 0.001 in both the diverticulosis and the diverticulitis analyses), and this variant was therefore not included in further analysis. The minor allele frequencies (MAF) (Table 2) were similar to prior data 10,12,13. Patients with diverticulosis were significantly older (P < 0.001) and more obese (P < 0.001) than individuals with no diverticulosis. Compared with patients with diverticulosis and no prior diverticulitis, patients with diverticulitis were significantly younger (P < 0.001), more often smokers (P = 0.006), and more frequently current alcohol drinkers (P = 0.001).
No association was detected for BMI (P = 0.20) and gender (P = 0.26). Table 3 presents the data on linkage disequilibrium (LD) of all variants identified in the GWAS 10,12,13.
Associations of variants and diverticulosis. Table 2 presents the allelic and genotypic frequencies comparing patients with diverticulosis to controls. MAF of the variant in ARHGAP15 (rs4662344) was increased compared to controls, as described in the GWAS 10,12,13. The major (T) allele of rs4662344 in ARHGAP15 was significantly (OR 1.28; 95% CI 1.00-1.63) associated with diverticulosis. However, this association did not persist after adjusting for the corresponding environmental cofactors (OR 1.22; 95% CI 0.93-1.61) (Table 4). The MAF of the variant rs67153654 in FAM155A did not differ between cases and controls, and no association with diverticulosis could be detected (Table 4).
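As an illustration of how an unadjusted odds ratio and its 95% confidence interval of the kind reported above can be derived from a 2 × 2 table, the following minimal sketch uses the Woolf (log-based) interval. This is an assumption for illustration only: the text does not state which interval method was used, the function name `odds_ratio_ci` is hypothetical, and the counts are invented and not taken from the study.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) for a 2 x 2 table using the Woolf method.

    'Exposed' means carrying the allele of interest (dominant model).
    The interval is computed on the log scale and back-transformed.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT study data): 20/80 carriers/non-carriers among
# cases and 10/90 among controls.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR {or_value:.2f}; 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that includes 1.00, as in this invented example, corresponds to an association that is not significant at the 0.05 level. Adjusted estimates such as those in Table 4 require a logistic regression model and cannot be reproduced from a single 2 × 2 table.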

Table 3. Pairwise linkage disequilibrium (r²) calculated using LDpair (Machiela et al. 31) with a Caucasian reference population (CEU) of the identified variants in all three GWAS 10,12,13. Bold values indicate where r² is >0.8, representing strong LD. Variants from the initial GWAS 10 are underlined.

Associations of SNPs and diverticulitis. The minor allele of the variant rs4662344 in ARHGAP15 was more frequent in diverticulitis cases in comparison to controls with diverticulosis and no prior diverticulitis, as similarly described previously in the GWAS 10,12,13. The frequency of the major (A) allele of rs67153654 in FAM155A was markedly reduced (OR 0.66; 95% CI 0.47-0.92) in patients with prior diverticulitis compared to controls (Table 2), as also previously described 10,12,13. These associations remained significant after adjusting for environmental cofactors (Table 5). The variant rs4662344 in ARHGAP15 was borderline significantly (OR 1.43; 95% CI 1.00-2.03; P = 0.05) associated with diverticulitis after adjusting for the corresponding cofactors. Even though hampered by small sample size (n = 64), similar results were obtained when analyzing patients with surgically treated diverticulitis (Supplementary Tables 2 and 3).
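The borderline result for rs4662344 (OR 1.43; 95% CI 1.00-2.03; P = 0.05) can be checked for internal consistency by back-calculating an approximate two-sided p-value from the odds ratio and its confidence interval. The sketch below assumes a normal approximation on the log-odds scale; it is not the authors' stated procedure, the function name `p_from_or_ci` is hypothetical, and the result is limited by the rounding of the published values.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate (z, two-sided p) from an OR and its 95% CI.

    Assumes the CI is symmetric on the log scale, so that
    SE(log OR) = (log(upper) - log(lower)) / (2 * z_crit).
    """
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided p-value of a standard normal test statistic
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values as reported in the text for rs4662344 and diverticulitis
z_stat, p_value = p_from_or_ci(1.43, 1.00, 2.03)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

With these inputs the back-calculated p-value lies close to 0.05, in line with the reported borderline significance and with a lower confidence bound of 1.00.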

Discussion
The aim of our present study was to assess the role of genetic variations consistently identified in the three large recent GWAS 10,12,13 for the specific risks for diverticulosis and diverticulitis, respectively. Our results are in line with previous data identifying age and BMI as risk factors for diverticulosis [22][23][24], as well as alcohol consumption 25,26 and smoking status [27][28][29] as risk factors for diverticulitis.
Our major finding is that the rs67153654 risk allele in FAM155A is significantly associated with diverticulitis after adjusting for cofactors, but not with diverticulosis. The data on the risk variant rs4662344 in ARHGAP15 were less consistent: it was borderline significantly associated with both diverticulosis and diverticulitis, but additional larger studies are warranted for confirmation. MAF of the analyzed SNPs was similar to previous data 10,12,13,30. Even though the variants in ARHGAP15 and FAM155A initially discovered in the Icelandic/Danish GWAS 10 are at partially different positions in the same genes compared to the variants identified in the following GWAS 12,13, their LD indicates that they are inherited together. Therefore, an analysis using the SNPs initially identified in the Icelandic GWAS 10, which applied weighted thresholds, is justified.
All of the analyzed SNPs are located in introns, supporting a molecular mechanism at the level of RNA expression of the surrounding gene or LD with another, as yet unidentified causal variant.
One of the major strengths of our study is the availability of clinical and endoscopic data and covariates, allowing the exact separation of patients with uncomplicated diverticulosis from patients developing diverticulitis. Patients treated for diverticulitis as outpatients were also included in our analysis. Our study adds new insights into the susceptibility for diverticulitis, specifically attributing the association with the risk variant in FAM155A (protective effect) to diverticulitis, but not diverticulosis. However, our study has certain limitations that have to be acknowledged. Due to the retrospective design of the study, we could not investigate the outcomes of diverticulosis and diverticulitis, including DD-associated mortality. Even though a similar genetic background is shared between Germans and Lithuanians 15, our results cannot necessarily be transferred to non-Caucasians and have to be validated in other ethnicities; an ethnicity-specific analysis was not feasible due to sample size. Furthermore, several other variants associated with DD in the prior GWAS were not assessed in our analysis, and could also contribute significantly to the genetic risk of developing diverticulitis in patients with diverticulosis. Ultimately, a genome-wide polygenic score 31 should be developed for both clinical entities, diverticulosis and diverticulitis. The role of the risk variants in complicated DD could also not be explored, as our sample size for these additional subgroups was too small. Additionally, to the best of our knowledge, only one study from 2010 assessing the application of ICD codes for the discrimination of diverticulosis and diverticulitis is available. Therefore, confirmatory investigations are necessary. Further studies are also needed to understand which of the variants in FAM155A are the causal mutations.
Furthermore, as the function of the involved genes is largely unknown, further elucidation of the molecular background of FAM155A deficiency in the pathogenesis of diverticulitis is warranted.

Conclusions
Our results indicate that the variant in FAM155A is associated with diverticulitis, but not diverticulosis, in Caucasians, whereas a risk variant in ARHGAP15 might be associated with both diverticulosis and diverticulitis. Our results contribute to the assessment of these genetic variants identified in GWAS in the predisposition to the development of diverticulitis in patients with diverticulosis.

Data availability
The datasets generated and/or analyzed during the current study are available on request as permitted by data protection laws and patients' consent.