Replication analysis of variants associated with multiple sclerosis risk

Multiple sclerosis (MS) is a complex chronic neurodegenerative disorder resulting from an autoimmune reaction against myelin. Many genetic variants have been reported to associate with MS risk; however, their association is inconsistent across populations. Here we investigated the association of the most consistently reported genetic MS risk variants in the Kuwaiti MS population in a case-control study design. Of the 94 reported MS risk variants, four showed MS risk association in the Arab exome analysis (EVI5 rs11808092 p = 0.0002; TNFRSF1A rs1800693 p = 0.00003; MTHFR rs1801131 p = 0.038; and CD58 rs1414273 p = 0.00007). Replication analysis in Kuwaiti MS cases and healthy controls confirmed EVI5 rs11808092A (OR: 1.6, 95%CI: 1.19–2.16, p = 0.002) and MTHFR rs1801131G (OR: 1.79, 95%CI: 1.3–2.36, p = 0.001) as genetic MS risk factors, while TNFRSF1A rs1800693C had a marginal MS risk association (OR: 1.36, 95%CI: 1.04–1.78, p = 0.025) in the Kuwaiti population. CD58 rs1414273 did not sustain its risk association (p = 0.37). In conclusion, EVI5 rs11808092A, TNFRSF1A rs1800693C and MTHFR rs1801131G are MS risk factors in the Kuwaiti population. Further investigations into their roles in MS pathogenesis and progression are merited.

ethnic, geo-epidemiological, and environmental factors associating with MS risk 17. To confirm the association of reported genetic MS risk factors, replication studies should be conducted across different MS populations to understand the influence of ethnic and geo-epidemiological factors on MS risk. Here, we report our findings from a replication study investigating reported non-HLA MS risk variants in a sampled Kuwaiti MS population. Because these genetic risk factors were identified in multi-ethnic MS populations, our objective was to assess their association in a semi-ethnically homogeneous Kuwaiti cohort.

Results
Reported MS variants in Arab exomes. The demographics and clinical characteristics of Kuwaiti MS patients and healthy Arab controls included in the exome analysis are shown in Table 1. Of the 96 selected variants, 87 (90.6%) had variable detection frequency in SureSelect V5 library only, and 9 (9.4%) were not covered by our exome library. The final list of reported MS variants with consistent detection across all exomes included 18 (20.7%) variants with acceptable detection frequencies across the two cohorts (Supplementary Table 1). Four variants had significantly different allele frequencies in MS exomes compared to healthy control exomes (Table 2). These variants were: Ecotropic Viral Integration Site 5 (EVI5) rs11808092 (OR: 2.04, 95%CI: 1.4-2.9), TNF Receptor Superfamily Member 1A (TNFRSF1A) rs1800693 (OR: 2.01, 95%CI: 1.45-2.75), Lymphocyte Function-Associated Antigen 3 (CD58) rs1414273 (OR: 2.2, 95%CI: 1.5-3.2), and Methylenetetrahydrofolate Reductase (MTHFR) rs1801131 (OR: 1.4, 95%CI: 1.02-1.9), whose association was only nominally significant and not clearly robust. However, genotype frequencies differed significantly between the two cohorts for only three variants: TNFRSF1A rs1800693 (p = 0.0001), MTHFR rs1801131 (p = 0.024), and CD58 rs1414273 (p = 0.0001). Since these variants showed risk association in Kuwaiti MS patients when compared to healthy Arab controls, we further investigated their association with MS in a case-control population sample of exclusively Kuwaiti nationality.
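The odds ratios and 95% confidence intervals reported above can be illustrated with a minimal sketch that assumes a 2x2 table of allele counts and a Wald (log-scale) interval. The function name and the counts below are hypothetical and are not taken from this study; the authors' analysis was run in SPSS and may have used a different interval method.

```python
import math


def odds_ratio_wald_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Return (odds ratio, lower, upper) for a 2x2 table of allele counts.

    Uses the Wald interval on the log odds ratio, which requires all four
    cells to be positive. z=1.96 corresponds to a 95% interval.
    """
    cells = (case_alt, case_ref, ctrl_alt, ctrl_ref)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (illustration only, not study data).
or_value, ci_low, ci_high = odds_ratio_wald_ci(60, 140, 40, 160)
print(f"OR: {or_value:.2f}, 95%CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at the 5% level before any multiple-testing correction.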

Replication analysis of MS variants in Kuwaiti samples. The replication Kuwaiti population sample included 170 Kuwaiti MS patients and 311 healthy Kuwaitis. The demographics and clinical characteristics of the replication cohorts are shown in Table 3. Allelic and genotype frequencies for TNFRSF1A rs1800693, EVI5 rs11808092, CD58 rs1414273 and MTHFR rs1801131 are shown in Table 4. Genotype frequencies of TNFRSF1A rs1800693, EVI5 rs11808092 and MTHFR rs1801131 in healthy Kuwaiti controls were in Hardy-Weinberg equilibrium, whereas genotype frequencies of CD58 rs1414273 were significantly different from those expected in European populations (p < 0.0001). TNFRSF1A rs1800693C allele did not sustain significant association with    Table 2). However, the effect sizes of these associations are variable and relatively small, depending largely on the number of samples analyzed. TNFRSF1A is a membrane-bound and soluble receptor for tumor necrosis factor-alpha (TNFα) that plays a role in cellular survival, apoptosis and inflammation in the immune and nervous systems 18. TNFRSF1A rs1800693 is an intronic variant that has been shown to affect the splicing of TNFRSF1A mRNA, resulting in a novel isoform that blocks the action of TNFα 19. The association of rs1800693 with MS risk has consistently shown a small effect size in different MS populations 20. It is possible that the marginal association of rs1800693 with an earlier age of MS onset seen in our study explains the small effect size reported for this variant in other MS populations, in which cases with a younger age of onset were well represented. Therefore, it is probable that, owing to the low representation of cases with a younger age of onset in our replication MS cohort (17% with age of onset ≤20 years), the association of rs1800693 with MS risk did not reach statistical significance following adjustment for multiple testing.
The EVI5 rs11808092 variant lies in the 3′-end intron of the EVI5 gene, in a region that resembles an enhancer element and has been shown to act as a strong enhancer on the promoter of an adjacent gene (GFI1) implicated in MS risk 21. Nevertheless, EVI5 rs11808092 as well as other variants in EVI5 have been reported to associate with MS risk and clinical characteristics 22,23. On the other hand, MTHFR rs1801131 had a nominal association with MS risk in our exome analysis, but later showed a robust association with MS risk in the Kuwaiti-only cohorts. MTHFR is an enzyme involved in the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate. The missense variant rs1801131 (E429A) has been shown to decrease MTHFR enzymatic function and is associated with impaired folate metabolism and mild increases in homocysteine levels (hyperhomocysteinemia), which is reported to occur in MS patients 24,25. Sustained elevated homocysteine levels associate with cardiovascular disturbances that are thought to predispose to MS pathogenesis and progression 26.
Lastly, CD58 was first identified as an MS susceptibility factor through the association of its rs12044852 variant in a study of African American MS patients, and was further supported by evidence of altered expression in MS 27,28. A follow-up investigation identified two other CD58 variants (rs2300747, rs1335532) associated with altered CD58 expression in MS 29. Structural analysis of the CD58 coding region revealed rs1414273 to be in strong linkage disequilibrium with the MS-associated CD58 SNP rs1335532 30. A recent report highlighted the role of this variant in processing of the CD58 transcript and associated microRNA in MS pathology 31. Our exome analysis detected none of the other CD58 variants, only rs1414273, which initially showed an association with MS risk. However, the association was later lost in the Kuwaiti-only replication analysis. It should be noted that rs1414273A is the minor allele in Kuwaitis and has a lower frequency in Kuwaitis (0.2) than in the ExAC database (0.5) and the 1000 Genomes database (0.43). Therefore, it is plausible that the rs1414273 allelic distribution is well retained in the majority of Kuwait's genetic background, and that the reported linkage of rs1414273 to rs1335532 may not apply to the Kuwaiti genetic background. Moreover, Qatar's naturalization laws are very flexible, with an estimated 11.6% Arab Qataris contributing to the largely immigrant Qatari population of 2,723,003 32. Therefore, the association of rs1414273 with MS risk in our exome study was probably influenced by a non-uniform genetic background in the Qatari controls. Nevertheless, our replication study findings support the lack of association of rs1414273 with MS risk in the Kuwaiti MS population, but do not refute the association of other CD58 variants with MS risk or altered CD58 expression in MS.
In conclusion, we have analyzed the association of a set of reported genetic MS risk variants in a semi-homogeneous exome case-control study, and confirmed the results in a replication study with a homogeneous genetic background. Population genetic analyses should strive to maintain genetic background uniformity to accurately ascertain population-specific genetic risk susceptibilities. Our findings support the existing evidence on the roles of these genes in MS risk and pathogenesis, providing further evidence on MS etiology and potential therapeutic targets.

(2020) 10:7327 | https://doi.org/10.1038/s41598-020-64432-3 www.nature.com/scientificreports

Methods
Patient selection. Blood samples of 283 Kuwaiti MS patients were collected at Kuwait's Dasman Diabetes Institute (DDI), and 311 healthy Kuwaiti control volunteers were recruited by word of mouth and social networks from the Kuwaiti population. Inclusion criteria for MS patients were: a complete clinical MS disease profile (demographics, age of onset, type of MS, disease duration, expanded disability status scale (EDSS) score, and history of MS treatments), being a Kuwaiti citizen, and agreement to provide a 4 mL blood sample. Exclusion criteria were: having an EDSS score of 0, and a disease duration of ≤1 year. Exclusion criteria for healthy controls were: being a non-Kuwaiti expatriate, having a family history of MS, and having a diagnosis of any complex disorder. All study protocols were approved by Kuwait's Health Sciences Center's Joint Committee for The Protection of Human Subjects and DDI's ethical review committee, both of which adhere to the guidelines of the Declaration of Helsinki's Ethical Principles for Medical Research Involving Human Subjects. All study information and protocols were fully explained to all participants before their written informed consent was obtained.
Of the 283 MS patients collected, 113 were prioritized as most informative for exome sequencing so as to include equally distributed subgroups for each of the following characteristics: EDSS score, disease duration, age group, and a male to female ratio of ~1:2. In addition, 56 healthy Kuwaiti control samples were age and sex matched to the selected MS samples and prioritized for exome sequencing.
DNA extraction. Blood samples were collected from all 283 MS patients, while healthy controls provided either blood or saliva samples. Blood samples were centrifuged at 2,500 × g for 10 minutes at room temperature and three fractions were retrieved. Plasma fractions were stored at −80 °C, and white blood cell fractions (buffy coat) were subjected to DNA extraction using the QIAamp DNA blood mini kit (Qiagen, CA, USA) according to the manufacturer's standard protocol. For saliva samples the same kit was used with minor modifications. For every 1 mL of saliva, 4 mL of sterile phosphate buffered saline (PBS) were added, and samples were centrifuged at 1,800 × g for 5 minutes at room temperature. The resultant pellet was suspended in 180 µl of PBS and 20 µl of kit-provided proteinase K was added. The subsequent steps followed the manufacturer's standard kit protocol. DNA quality and quantity were assessed with a NanoDrop spectrophotometer.
Exome sequencing. Exome sequencing of 113 Kuwaiti MS patients and 56 healthy Kuwaiti controls was performed on an Illumina HiSeq2000 platform (Illumina, CA, USA) using SureSelectXT v5 library preparation with a target coverage of 50X. Raw Illumina paired-end reads, captured with the Agilent exome library, were mapped to the human reference assembly (hg19) using the Burrows-Wheeler Aligner (BWA) 33. Sequence Alignment Map data with an average of 50X coverage were then compiled and converted to a single compressed binary file for each sample using SAMtools 34. Picard software (http://picard.sourceforge.net) was used to flag PCR duplicate reads. The Genome Analysis Toolkit (GATK) was used for (a) local realignment of reads overlapping an INDEL, (b) recalibration of base quality, and (c) variant calling 35,36.

MS variant selection.
A literature search was conducted using the following keywords: "Multiple sclerosis genetics, Multiple sclerosis gene, multiple sclerosis polymorphism, multiple sclerosis variant, multiple sclerosis GWAS, multiple sclerosis exome sequencing, multiple sclerosis genome sequencing." In addition, variants included in the MSgene database (MSgene.org) were retrieved. Collected variants were filtered according to the following criteria: being a single nucleotide variant, being within intronic gene sequence, being encoded in the nuclear genome, and being an MS risk factor. A total of 96 genetic MS risk factors were selected and used for mining exome data of MS patients and healthy controls (Supplementary Table 1). Two healthy control datasets were retrieved from publicly available databases that are believed to share genetic background with Kuwaitis: Qatari and Middle Eastern exome sequences. Next generation sequencing data of selected Qatari and Middle Eastern (non-diabetic) individuals were obtained from the National Center for Biotechnology Information (NCBI) Sequence Read Archive under accessions SRP060765, SRP061943 and SRP061463. Collectively, the case-control cohorts included 113 MS exomes and 460 healthy control exomes. A Fisher exact test was carried out for each variant, and variants with a p-value < 0.05 were selected.
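The per-variant screening step can be sketched as a two-sided Fisher exact test on a 2x2 table of allele counts (cases versus controls, risk allele versus other allele), followed by a p < 0.05 filter. This is an illustrative standard-library implementation, not the software used in the study; the function name, variant labels and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p_x = prob(x)
        # Small relative tolerance guards against floating-point ties.
        if p_x <= p_observed * (1 + 1e-7):
            p_value += p_x
    return min(1.0, p_value)


# Hypothetical allele counts per variant:
# (case risk allele, case other allele, control risk allele, control other allele)
allele_counts = {
    "variant_A": (60, 140, 40, 160),
    "variant_B": (50, 150, 52, 148),
}
p_values = {name: fisher_exact_two_sided(*t) for name, t in allele_counts.items()}
selected = [name for name, p in p_values.items() if p < 0.05]
print(p_values, selected)
```

The exact test avoids the large-sample approximation of the chi-square test, which matters when a variant is rare in either cohort.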

Variant genotyping in replication cohorts. DNA from a replication cohort of 170 MS patients and 311 healthy controls was used to genotype polymorphisms with p-values < 0.05. Genotyping was performed using TaqMan genotyping assays (Life Technologies, CA, USA) according to the manufacturer's standard protocols and analyzed using an ABI 7500 Fast Real-time PCR system (Life Technologies, CA, USA). Genotype allelic discrimination was determined with SDS v1.4.1 software (Applied Biosystems, CA, USA).

Statistical analysis.
Hardy-Weinberg equilibrium was assessed in the Kuwaiti healthy control cohort for the four replicated variants using allele frequencies of European ancestry, as the majority of detected MS risk variants were reported from European populations. For genotype analysis, Fisher's exact test, the chi-square test, and linear and logistic regression analyses were used. All statistical analyses were performed using SPSS v.25 (IBM, NY, USA). For the replication analysis, a Bonferroni-adjusted p-value ≤ 0.0125 (0.05 divided by the four variants tested) was considered significant.
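The Bonferroni threshold can be reproduced with a short sketch that divides a family-wise alpha of 0.05 by the four replicated variants and applies the result to the replication p-values given in the abstract. The variable names are illustrative, and the 0.05 alpha is an assumption inferred from the stated 0.0125 threshold.

```python
# Family-wise significance level (assumed; 0.05 / 4 matches the stated 0.0125).
ALPHA = 0.05

# Replication p-values as reported in the abstract.
replication_p = {
    "EVI5 rs11808092": 0.002,
    "MTHFR rs1801131": 0.001,
    "TNFRSF1A rs1800693": 0.025,
    "CD58 rs1414273": 0.37,
}

# Bonferroni correction: divide alpha by the number of tests performed.
threshold = ALPHA / len(replication_p)

significant = {name: p <= threshold for name, p in replication_p.items()}
for name, passed in significant.items():
    status = "significant" if passed else "not significant"
    print(f"{name}: p = {replication_p[name]} ({status} at p <= {threshold:.4f})")
```

Under this threshold EVI5 rs11808092 and MTHFR rs1801131 remain significant, while TNFRSF1A rs1800693 (p = 0.025) is significant only nominally, consistent with its description as a marginal association.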