Introduction

Multiple sclerosis (MS) is a chronic inflammatory disorder of the central nervous system of possible autoimmune etiology, and its association with the HLA complex has been known for many years.1 As with most immune-mediated diseases, MS is thought to have a complex genetic etiology, but the search for non-HLA susceptibility genes proved difficult until the introduction of genome-wide association studies (GWASs).2, 3 Since 2007, four GWASs in MS have identified several common susceptibility variants outside the HLA complex,4, 5, 6, 7 many of which are shared with other autoimmune diseases. However, low-frequency alleles, defined in this study as those with a minor allele frequency (MAF) below 0.05, are seldom included in such studies because moderate genetic effects are difficult to detect at this frequency level.8

A rare variant of the TYK2 gene (rs34536443, MAF=0.04), located on chromosome 19p13, was first identified as a possible susceptibility factor in the combined analysis of autoimmune inflammatory thyroiditis, ankylosing spondylitis and MS in an association scan of 14 500 non-synonymous single-nucleotide polymorphisms (ns-SNPs) carried out by the Wellcome Trust Case Control Consortium (WTCCC) and the Australo-Anglo-American Spondylitis Consortium (TASC) in 2007.5 The associated variant encodes a proline (major allele) to alanine substitution in exon 21 of the TYK2 gene. The TYK2 association in MS was recently substantiated in a replication analysis by Ban et al.9 in a sample set of 4234 MS patients, 2983 controls and 2053 trio families from Australia, Belgium, Norway, the United Kingdom and the United States of America (P=2.7 × 10^−6). However, because of the low frequency of the minor allele, conclusive association with MS at the genome-wide significance level (P<5 × 10^−7) was not established.

Materials and methods

In this Nordic collaboration, samples from 5429 MS cases and 6167 healthy controls were collected in Denmark, Finland, Norway and Sweden (sample distribution shown in Table 1). Patients were diagnosed with MS according to the Poser and/or McDonald criteria.10, 11 The Norwegian samples did not overlap with the samples included in the replication study by Ban et al.9 Informed consent was obtained from all participants, and the study had local ethical approvals. The samples were genotyped for the TYK2 rs34536443 SNP using TaqMan chemistry on an ABI7900 HT real-time PCR system (Applied Biosystems, Foster City, CA, USA) or on a Sequenom iPLEX platform (Sequenom Inc., San Diego, CA, USA).

Table 1 Association testing of the TYK2 gene SNP rs34536443

The statistical analyses of the Nordic genotype data were carried out using PLINK v1.05 (http://pngu.mgh.harvard.edu/purcell/plink/).12 The homogeneity of the odds ratios (OR) from the four Nordic populations was tested using the Breslow-Day test, and a combined analysis was performed using the Cochran-Mantel-Haenszel (CMH) test. The Nordic genotype data were then combined with the data from the replication analysis performed by Ban et al9 and the original WTCCC/TASC study.5 This mega-analysis, comprising 10 642 MS patients, 10 620 controls and 2110 MS trios, was performed using likelihood ratio chi-square testing with nationality set as a discrete confounder in Unphased v3.1.3 statistical software (http://www.mrc-bsu.cam.ac.uk/personal/frank/software/unphased/).13
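The stratified analysis described above can be illustrated with a minimal sketch of the Cochran-Mantel-Haenszel statistic for per-country 2 × 2 allele-count tables. This is not the PLINK implementation; the function name `cmh_test`, the table layout and the allele counts are assumptions made purely for illustration (they are not the study data), and no continuity correction is applied.

```python
import math

def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each stratum is (a, b, c, d):
      a = minor alleles in cases,    b = major alleles in cases,
      c = minor alleles in controls, d = major alleles in controls.
    Returns (pooled Mantel-Haenszel odds ratio, chi-square with 1 df, P-value).
    """
    or_num = or_den = 0.0
    diff_sum = var_sum = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        # Mantel-Haenszel pooled odds ratio terms
        or_num += a * d / n
        or_den += b * c / n
        # Observed minus expected count of cell a under no association
        expected_a = (a + b) * (a + c) / n
        diff_sum += a - expected_a
        # Hypergeometric variance of cell a
        var_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = or_num / or_den
    chi2 = diff_sum ** 2 / var_sum  # no continuity correction (assumption)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return odds_ratio, chi2, p_value

# Hypothetical allele counts for two strata (NOT the Nordic data)
example_strata = [(30, 970, 40, 960), (25, 975, 35, 965)]
pooled_or, chi2, p_value = cmh_test(example_strata)
print(pooled_or, chi2, p_value)
```

A pooled estimate of this kind is only meaningful when the per-stratum odds ratios are homogeneous, which is why the Breslow-Day test is run first.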

Results

In all Nordic populations, the frequency of the minor allele C of the ns-SNP analyzed was lower in cases than in controls (Table 1). The genotyping success rate was above 97%, and there was no significant deviation from Hardy-Weinberg equilibrium (P>0.05). The Breslow-Day test excluded significant heterogeneity within the Nordic populations (P=0.8), allowing for a combined Nordic analysis (n=5429 MS cases and 6167 healthy controls) that revealed significant association (P=5 × 10^−4, OR 0.78, 95% CI 0.68–0.90). In the analysis of the Nordic data combined with the raw genotype data from the replication analysis by Ban et al.9 and the original WTCCC/TASC study (n=10 642 MS patients, 10 620 controls and 2110 MS trios),5 the Breslow-Day test yielded a P-value of 0.05; thus, the CMH test was replaced by a likelihood ratio test, accounting for population heterogeneity. In this mega-analysis, association with the TYK2 SNP reached a convincing level of genome-wide significance (P=5.08 × 10^−9, OR 0.77, 95% CI 0.70–0.84).
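As a rough consistency check on the combined Nordic result, the standard error of the log odds ratio can be recovered from the reported 95% CI, and a two-sided Wald P-value derived from it. The sketch below assumes a CI that is symmetric on the log scale; the function name `wald_from_ci` is hypothetical, and because the inputs are rounded published figures, the output can only be expected to approximate the reported P-value.

```python
import math
from statistics import NormalDist

def wald_from_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover SE(log OR), the Wald z statistic and a two-sided P-value
    from an odds ratio and its confidence interval (assumed symmetric
    on the log scale)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * Phi(-|z|)
    return se_log_or, z, p_two_sided

# Combined Nordic analysis as reported in the text: OR 0.78, 95% CI 0.68-0.90
se_log_or, z, p_two_sided = wald_from_ci(0.78, 0.68, 0.90)
print(se_log_or, z, p_two_sided)
```

The same check is less informative for the mega-analysis, where a likelihood ratio test with nationality as a confounder was used in place of a simple Wald or CMH statistic.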

Discussion

GWASs are powerful tools to detect common genetic susceptibility variants in complex diseases. However, low-frequency susceptibility alleles (MAF <0.05) that exert a modest effect are harder to establish, as large sample sets are required to show association at the genome-wide significance level (P<5 × 10^−7). Although we saw the same trend toward a protective effect of the minor allele of the analyzed SNP in all Nordic populations, analysis of more than 5000 patients and 6000 controls was not sufficient to reach genome-wide significant association. More than 10 000 cases and 10 000 controls are estimated to be required to have 80% power to detect associations of similar frequencies and effects with genome-wide significance (Quanto power calculator; http://hydra.usc.edu/gxe/). Indeed, this was achieved by increasing our sample size to 10 642 MS patients, 10 620 controls and 2110 MS trios, whereby association reached P=5.08 × 10^−9.
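How power depends on sample size for a low-frequency allele can be sketched with a simple normal approximation for comparing allele frequencies between cases and controls. This is not the Quanto model, so its figures should not be expected to reproduce the estimate quoted above; the function name `allelic_power` is hypothetical, and treating the reported MAF of 0.04 as the control allele frequency is an assumption.

```python
from statistics import NormalDist

def allelic_power(control_maf, odds_ratio, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allelic test (normal approximation
    to the difference of two allele frequencies, two alleles per person).
    The negligible opposite rejection tail is ignored."""
    std_normal = NormalDist()
    # Allele frequency in cases implied by the allelic odds ratio
    case_odds = odds_ratio * control_maf / (1 - control_maf)
    case_maf = case_odds / (1 + case_odds)
    # Standard error of the frequency difference under the alternative
    se = (case_maf * (1 - case_maf) / (2 * n_cases)
          + control_maf * (1 - control_maf) / (2 * n_controls)) ** 0.5
    z_effect = abs(case_maf - control_maf) / se
    z_crit = -std_normal.inv_cdf(alpha / 2)
    return std_normal.cdf(z_effect - z_crit)

# Assumed inputs: control MAF 0.04, OR 0.77, threshold P < 5e-7
for n in (5000, 10000, 20000):
    print(n, allelic_power(0.04, 0.77, n, n, 5e-7))
```

Under any such approximation, power at a fixed stringent threshold rises steeply with sample size, which is the point made in the paragraph above.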

Low-frequency SNPs, as analyzed in this study, have often not been selected for genome-wide screens, in which the focus traditionally has been on common variants. More than 1500 published genetic susceptibility variants in complex traits are now listed in the Office of Population Genomics: A catalogue of published GWASs (www.genome.gov, accessed 23 June 2009), but low-frequency risk alleles make up less than 3% of the total associations reported. However, the WTCCC ns-SNP study took a different approach, namely to scan only markers that alter the encoded protein sequence.5 This also implies that the associated TYK2 ns-SNP rs34536443 might have a direct functional significance. TYK2 is a proximal tyrosine kinase in the STAT signaling pathway that is important for signaling by type I interferons and induction of Th1 cell differentiation upon antigen stimulation of dendritic cells.14 The rs34536443 variant alters the amino acid sequence at position 1104 within the kinase domain of the TYK2 protein, and the protective minor allele, encoding alanine, is predicted to give a less efficient variant of TYK2.9 Interestingly, another functional SNP in TYK2 was recently confirmed to confer susceptibility to systemic lupus erythematosus (SLE).15 Comparing the MS- and SLE-associated TYK2 SNPs, that is, rs34536443 with rs2304256, in Haploview v4.1 (using the CEU individuals from the HapMap project phases 1, 2 and 3 (http://www.hapmap.org/ and http://www.ncbi.nlm.nih.gov/)), there was no evidence of linkage disequilibrium (LD) between these SNPs (r^2=0.044).16 Further analysis is required to see whether these SNPs are associated with both MS and SLE; however, should this be the case, the low LD suggests that these functional variants within the TYK2 gene confer independent effects or point toward a third common causal variant.
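For reference, the pairwise LD measure quoted above, r^2, is the squared correlation between alleles at two loci and can be computed from haplotype and allele frequencies. The sketch below is not the Haploview implementation (which first estimates haplotype frequencies from genotypes); the function name `ld_r_squared` and the frequencies in the example are hypothetical.

```python
def ld_r_squared(hap_freq_ab, freq_a, freq_b):
    """r^2 between two biallelic loci.

    hap_freq_ab: frequency of the haplotype carrying allele A at locus 1
                 and allele B at locus 2
    freq_a, freq_b: frequencies of alleles A and B
    """
    d = hap_freq_ab - freq_a * freq_b  # LD coefficient D
    return d * d / (freq_a * (1 - freq_a) * freq_b * (1 - freq_b))

# Hypothetical frequencies (NOT HapMap estimates): a rare allele and a
# common allele that co-occur slightly more often than expected by chance
print(ld_r_squared(0.02, 0.04, 0.30))
```

Because r^2 is bounded by the allele frequencies, a rare and a common allele can never reach high r^2 even when they always occur on the same haplotype, so a low value on its own says little about whether the two variants share a haplotype background.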

In conclusion, we confirm a protective effect of a rare variant in the TYK2 gene in MS. The study illustrates the requirement of large sample sizes for confirmation of low-frequency susceptibility alleles, and to our knowledge, this is the first low-frequency allele shown to be associated with MS at a genome-wide significance level. Although GWASs are establishing associations with more common variants, many rare variants are yet to be detected. The next generation of sequencing methods will hopefully facilitate the discovery of these variants.