Introduction

Multiple sclerosis (MS (MIM 126200)) is an inflammatory autoimmune disorder, with strong evidence for both genetic and environmental risk factors in its aetiology. The importance of the major histocompatibility complex (MHC) as a genetic risk locus has been recognised for more than 30 years, and early linkage studies consistently demonstrated the significance of the human leukocyte antigen (HLA) class II locus.1, 2, 3 Fine-mapping and admixture studies suggest the importance of the HLA-DRB1 gene, especially the HLA-DRB1*1501 allele in European populations, but complex interactions within the MHC have further complicated interpretations of susceptibility.4, 5, 6, 7 It is only with the recent advances in genetic technology that influential loci outside the MHC have begun to be identified, and considerable progress has been made in recent years in identifying common variants of modest effect using genome-wide association studies (GWAS). However, these common variants account for only a portion of the heritability of MS, and GWAS approaches are limited in their ability to identify rare variants, epistatic effects or gene-environment interactions that are likely to be important in explaining this missing heritability. To date, alternative approaches have remained relatively neglected.

The prevalence of MS in Orkney and Shetland is among the highest in the world, and accounting for the excess of MS in these isolated island groups may help to elucidate MS aetiology.8 With relatively low genetic and environmental variability, these populations are ideal for studying genetic risk factors. The Northern Isles of Scotland have therefore long been of interest to MS researchers, and studies have ranged across the gamut of environmental and genetic risk factors.9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20

Given the isolated location of these islands off the northern coast of Scotland, and consequently the relatively limited movement in and out of the populations, several of these studies have examined the potential effects of inbreeding and consanguinity within these populations as a means of indirectly assessing the role of recessive genetic effects in MS.9, 11, 12 Using pedigree data for MS cases and controls in Orkney, Roberts9 found 13% in each group to be the product of consanguineous unions. However, the inbreeding coefficients (the probability that two alleles at a random locus are identical-by-descent, having been inherited from a common ancestor) were twice as high in cases (F=0.00364) as in controls (F=0.00170). Similarly, kinship coefficients (the probability that alleles drawn at random from two individuals at a given locus are identical-by-descent) between parents of cases were twice those of parents of controls (ϕ=364 × 10^−5 compared with ϕ=170 × 10^−5). Despite the small size of this study sample, these results indicate a potential role for recessive alleles inherited identical-by-descent from both parents in influencing MS risk. A study in Shetland found similar, albeit less marked, differences in that population.11
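To make the two coefficients concrete, the following Python sketch applies the standard pedigree path-counting rule. Under this rule, each common ancestor linked to two relatives by a chain of n parent-offspring steps contributes (1/2)^(n+1) × (1 + F_A) to their kinship, and an individual's inbreeding coefficient equals the kinship of his or her parents. The function name and the first-cousin example are illustrative assumptions and are not taken from Roberts' analysis; only the two reported F values come from the text.

```python
def kinship_from_paths(paths):
    """Kinship coefficient by path counting.

    `paths` is a list of (n_steps, ancestor_f) pairs, one per common-ancestor
    path: n_steps is the number of parent-offspring steps linking the two
    relatives through that ancestor, ancestor_f the ancestor's own inbreeding
    coefficient (0.0 if unknown or outbred).
    """
    return sum(0.5 ** (n_steps + 1) * (1.0 + ancestor_f)
               for n_steps, ancestor_f in paths)


# Illustrative pedigree (an assumption, not study data): first cousins share
# two grandparents, and each connecting path has four parent-offspring steps.
phi_first_cousins = kinship_from_paths([(4, 0.0), (4, 0.0)])

# The inbreeding coefficient of a child equals the kinship of its parents.
f_child_of_first_cousins = phi_first_cousins

# Mean inbreeding coefficients reported in the text for cases and controls.
f_cases, f_controls = 0.00364, 0.00170
ratio = f_cases / f_controls

print(f"F for offspring of first cousins: {f_child_of_first_cousins}")
print(f"Case/control ratio of mean F: {ratio:.2f}")
```

The identity between a child's F and the parents' kinship is why the reported ϕ values for parents (364 × 10^−5 and 170 × 10^−5) match the F values for cases and controls.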

Further evidence for a recessive effect in MS has been found elsewhere. A study in a Dutch isolate found MS patients to be significantly more related to each other than the control group, and more often inbred (82% compared with 36%).21 Sadovnick et al22 found the sibling recurrence rate of MS to be almost four times as high among offspring of consanguineous parents than among those of unrelated parents (8.0% compared with 2.3%). Evidence for the importance of a founder effect in explaining the high prevalence of MS in the Southern Ostrobothnian region of western Finland has been found, which is also consistent with a possible role of recessive effects in MS aetiology.23 Callander and Landtblom24 reported a cluster of MS cases in Lysvik in the Swedish county of Värmland occurring in a family with documented marriages between cousins, increasing the opportunity for recessive effects to influence MS risk.

Amidst the recent enthusiasm for GWAS in risk allele identification, the potential of homozygosity studies for identifying recessive effects has been relatively neglected in MS, as in other diseases. Genetic isolates, such as Orkney and Shetland, are ideal for investigating various aspects of homozygosity in disease aetiology, an approach which has been very successful at finding recessive effects in Mendelian diseases and which is increasingly being used for complex diseases.25, 26, 27 Although earlier methods of estimating homozygosity relied on either single-point or complex multi-point inferences from limited marker data, McQuillan28, 29 recently developed a more robust observational method that exploits the availability of high-density single nucleotide polymorphism (SNP) data. Such high-density data can be used to describe both ancient and more recent parental relatedness through comparing the total proportion of homozygosity and unbroken runs of homozygous markers in study participants. The application of this more accurate measure of homozygosity to a population possibly enriched for MS-associated alleles, such as those of the Northern Isles, has the potential to expose further evidence of the recessive effects indicated by earlier studies, and presents a complementary strategy to GWAS in the description of the genetic architecture of MS. This study aims to investigate the cumulative impact of (semi-)recessive effects on MS risk using a variety of measures of genome-wide homozygosity.

Materials and methods

Study participants

The Northern Isles Multiple Sclerosis (NIMS) study recruited MS patients and unaffected controls from the Orkney and Shetland Islands under approval from the North of Scotland Research Ethics Committee. A total of 266 individuals were recruited, of whom 88 were cases, 89 age-matched controls and 89 elderly controls. All known patients with MS living on the islands or in the Grampian or Highland regions of mainland Scotland were identified by contacting general practices on the islands, and reviewing MS databases held in secondary care in Aberdeen, Inverness, Orkney and Shetland. Recruitment of cases to the study was conducted through letters forwarded by general practitioners inviting those of Orcadian or Shetlandic descent to participate. Some cases volunteered directly, following media coverage.

Two groups of controls were recruited: one matched with cases by age, sex and ancestry within the islands; the other matched by sex and ancestry within the islands, but having passed the majority of lifetime risk of developing MS. Ancestry was considered an important factor to control for, in order to account for the fine-scale population genetic structure that has developed between parishes and isles as a result of the very short distances over which marriages were traditionally conducted in these isolated islands. Both groups were recruited from among volunteers who came forward after requests in the local media for general controls and for those born between 1935 and 1937. NHS Shetland also assisted by forwarding invitations to a random selection of their patients born between 1935 and 1937. These years were selected on the basis that evidence indicates that the risk of developing MS is lower after the age of 70, and that it is difficult to reliably construct complete six-generation pedigrees in all ascendant lineages for people born before c.1930.28 Controls with first- or second-degree relatives with MS, or those who screened positive for possible undiagnosed MS, were excluded and a new control was chosen.

Informed consent was obtained from all participants. A neurology clinical research fellow (EV) then performed a structured history and examination (including the Kurtzke Expanded Disability Score) of all patients and controls and reviewed their hospital and GP notes.30 The Poser and McDonald diagnostic criteria for MS were then applied and the type of MS classified.31, 32, 33 Blood samples were subsequently collected and detailed family history questionnaires administered.

Genotyping

Genomic DNA was extracted from whole blood for MS cases and controls. Samples were genotyped using the Illumina Infinium HumanOmni1-Quad BeadChip (Illumina Inc., San Diego, CA, USA), which includes 1 140 419 markers, according to the manufacturer's protocols. Alleles were coded according to the top strand allele coding scheme.

Data analysis

Data cleaning and quality control were performed on the genotyping data using PLINK version 1.04, and SNPs with low genotyping call rates (<0.95, n=35 241), low minor allele frequency (<0.01, n=231 500) and Hardy–Weinberg equilibrium failures (P<0.001, n=2845) were removed.34 Individuals were checked for low genotyping call rates (<0.97), but none required removal (n=0). X chromosomes display different levels of homozygosity to autosomes (and are hemizygous in males), and their inclusion in an analysis of this type would give misleading results. For this reason, non-autosomal SNPs were also removed (n=19 176), resulting in a total of 266 individuals and 794 367 SNPs. The genotyping rate per individual was 0.999032. The length of the typed autosomal genome, excluding centromeres, was 2 689 340 kb and the density of SNP coverage was 3.38 kb/SNP.
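The per-SNP filters described above can be sketched in Python as follows. This is a minimal illustration and not the PLINK implementation: the function names and the genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call) are assumptions, and the Hardy–Weinberg check uses a one-degree-of-freedom chi-square approximation in place of the test PLINK applies. Only the three thresholds are taken from the text.

```python
import math


def snp_qc_metrics(genotypes):
    """Return (call_rate, minor_allele_frequency, hwe_p) for one SNP.

    `genotypes` holds 0, 1 or 2 (copies of allele B) per individual,
    or None where the genotype call is missing.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 0.0, 0.0, 1.0
    call_rate = n / len(genotypes)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    freq_b = (n_ab + 2 * n_bb) / (2 * n)
    maf = min(freq_b, 1.0 - freq_b)
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * (1 - freq_b) ** 2,
                2 * n * freq_b * (1 - freq_b),
                n * freq_b ** 2)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_aa, n_ab, n_bb), expected) if exp > 0)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    return call_rate, maf, hwe_p


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01, min_hwe_p=0.001):
    """Apply the three SNP-level thresholds reported in the text."""
    call_rate, maf, hwe_p = snp_qc_metrics(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p


# Toy genotype vectors (illustrative only, not study data).
balanced = [0] * 25 + [1] * 50 + [2] * 25         # in HWE, fully called
all_heterozygous = [1] * 100                      # strong HWE departure
monomorphic = [0] * 100                           # minor allele frequency 0
poorly_called = [0] * 25 + [1] * 50 + [2] * 15 + [None] * 10

for name, snp in [("balanced", balanced), ("all_het", all_heterozygous),
                  ("monomorphic", monomorphic), ("poorly_called", poorly_called)]:
    print(name, passes_qc(snp))
```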

Reported and genomic sex were compared and no anomalies were identified, resulting in no removals. Pedigree-genomic anomalies were also assessed, indicating the removal of three individuals: one on the basis of sample contamination and two on the basis of inconsistency between genomic and pedigree information suggestive of DNA switching or other errors. Multidimensional scaling plots of identity-by-state sharing distances revealed no extreme outliers, but identified three principal components to be included in subsequent regression analyses. The data set was pruned for strong local LD, and the resultant set contains 41 004 SNPs, a typed autosomal genome length of 2672.78 Mb and a SNP coverage density of 65.18 kb/SNP.
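As an illustration of the identity-by-state (IBS) sharing distances that feed the multidimensional scaling step, the Python sketch below scores a pair of individuals by the average number of alleles they share per SNP. The function name and the 0/1/2 genotype coding are assumptions made for this example; PLINK computes the pairwise matrix internally, and this sketch only shows the underlying idea.

```python
def ibs_distance(geno_a, geno_b):
    """Identity-by-state distance between two individuals.

    Genotypes are coded 0, 1 or 2 (copies of one allele), None if missing.
    Each SNP contributes 2 - |a - b| shared alleles (2, 1 or 0); the distance
    is 1 minus the shared proportion over all SNPs called in both individuals.
    """
    shared = [2 - abs(a - b)
              for a, b in zip(geno_a, geno_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no SNPs called in both individuals")
    return 1.0 - sum(shared) / (2 * len(shared))


# Toy genotypes (illustrative only, not study data).
print(ibs_distance([0, 1, 2], [0, 1, 2]))           # identical genotypes
print(ibs_distance([0, 0], [2, 2]))                 # opposite homozygotes
print(ibs_distance([0, 1, 2, None], [1, 1, 0, 2]))  # missing call is skipped
```

A matrix of such pairwise distances is the kind of input a multidimensional scaling routine takes, and the leading dimensions it returns play the role of the three components used as covariates.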

Three measures of genome-wide homozygosity were generated in PLINK for each individual.34 These included the following:

  1. Hobs: observed homozygosity is the simplest measure. It is defined as the percentage of SNPs that are homozygous, and estimates total parental relatedness; that is, this measure reflects the size of both an individual's ancient as well as recent ancestral genepool.

  2. FROH: the proportion of the typed autosomal genome in runs of homozygosity longer than 1 Mb, as a measure of recent ancestral history.29

  3. FROH_pr: a measure similar to FROH, but a more stringent assessment of recent parental relatedness, as it utilises a panel of independent SNPs (after LD pruning).
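The definitions above can be illustrated with a short Python sketch. It is a deliberate simplification and not PLINK's procedure: a run is taken to be any uninterrupted stretch of homozygous calls on one chromosome, so a single heterozygous or missing genotype ends it, whereas PLINK uses a sliding window with tolerances. The function names, the 0/1/2 genotype coding and the toy data are assumptions; the 1 Mb threshold is from the text. Running the same FROH calculation on an LD-pruned SNP panel would correspond to the pruned measure.

```python
def observed_homozygosity(genotypes):
    """Hobs: proportion of called SNPs that are homozygous (coded 0 or 2)."""
    called = [g for g in genotypes if g is not None]
    return sum(1 for g in called if g != 1) / len(called)


def runs_of_homozygosity(positions_bp, genotypes, min_length_bp=1_000_000):
    """Return (start, end) of homozygous runs longer than min_length_bp.

    Positions must be sorted and belong to a single chromosome. Heterozygous
    or missing calls terminate the current run (a simplifying assumption).
    """
    runs, start, end = [], None, None
    for pos, g in zip(positions_bp, genotypes):
        if g in (0, 2):
            if start is None:
                start = pos
            end = pos
        else:
            if start is not None and end - start > min_length_bp:
                runs.append((start, end))
            start = None
    if start is not None and end - start > min_length_bp:
        runs.append((start, end))
    return runs


def froh(runs, typed_length_bp):
    """FROH: summed run length divided by the typed autosomal length."""
    return sum(end - start for start, end in runs) / typed_length_bp


# Toy chromosome with one SNP every 0.5 Mb (illustrative only).
positions = [0, 500_000, 1_000_000, 1_500_000, 2_000_000, 2_500_000, 3_000_000]
genotypes = [0, 2, 2, 0, 1, 2, 2]

hobs = observed_homozygosity(genotypes)
runs = runs_of_homozygosity(positions, genotypes)
froh_value = froh(runs, typed_length_bp=3_000_000)
print(hobs, runs, froh_value)
```

In this toy example the first four SNPs form a 1.5 Mb run that counts towards FROH, while the final 0.5 Mb stretch is homozygous but too short to qualify. This is how Hobs can capture ancient shared ancestry that FROH ignores.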

Using R version 2.10.1,35 these measures were included in logistic regression analyses of cases and age-matched controls, and of cases and past lifetime risk controls (those born between 1935 and 1937), adjusting for sex and the three principal components identified in the multidimensional scaling plots.
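The form of this analysis can be sketched in Python as a logistic regression fitted by Newton-Raphson. This is an illustration under stated assumptions and not the R code used in the study: the function names are hypothetical, and the toy data use a single binary covariate so that the fitted coefficient can be checked by hand against the log odds ratio of a 2 × 2 table. In the study's setting, each row would instead hold a homozygosity measure, sex and the three multidimensional scaling components.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_logistic(rows, outcomes, iterations=25):
    """Maximum-likelihood logistic regression; returns [intercept, betas...].

    `rows` holds one list of covariates per individual (no intercept column);
    `outcomes` holds 1 for cases and 0 for controls.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(design, outcomes):
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        # Newton step: solve (information matrix) @ step = score.
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Toy data (illustrative only): 3 cases and 1 control with the covariate set,
# 1 case and 3 controls without it, giving an odds ratio of 9.
toy_rows = [[1], [1], [1], [1], [0], [0], [0], [0]]
toy_outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
coefficients = fit_logistic(toy_rows, toy_outcomes)
odds_ratio = math.exp(coefficients[1])
print(coefficients, odds_ratio)
```

Exponentiating a fitted coefficient gives the odds ratio per unit of the corresponding covariate, which is the scale on which associations between homozygosity and MS are usually reported.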

Results

Demographic characteristics for the final 263 participants are summarised in Table 1. The three measures of homozygosity generated (FROH, FROH_pr and Hobs) are summarised in Table 2. To minimise the potential confounding influence of differential numbers of half-Orcadian and half-Shetlandic participants (that is, those with one set of grandparents born outside the Northern Isles – a group with significantly lower levels of homozygosity than other subjects), figures are given for both the whole sample and for a sample restricted to those with all four grandparents born in the Northern Isles. In his study of inbreeding and MS, Roberts further divided his sample into those with all four grandparents born in either Orkney or Shetland.9 To facilitate comparison with Roberts' work, analogous summaries are given in Table 3.

Table 1 Demographic characteristics of cases, age-matched controls and past lifetime risk controls
Table 2 Means (%) and SD for three measures of homozygosity by MS status in all participants and in those with four grandparents born in Orkney or Shetland
Table 3 Means (%) and SD for three measures of homozygosity by MS status in participants with all four grandparents born in the Orkney Islands, and in those with four grandparents born in the Shetland Islands

None of these measures of homozygosity was found to be positively associated with MS in any of the samples (Table 4). In fact, within the cases and past lifetime risk control group, there was marginal evidence of a negative association between genome-wide homozygosity and MS. The effects are mostly in the opposite direction to that indicated by previous research, although this trend is reversed when only those with Shetland-born grandparents are examined. Given the very small subsample and consequently wide confidence intervals, the importance of this finding is questionable.

Table 4 Association between genome-wide homozygosity and MS susceptibility

Discussion

The small population and historically limited migration into the Northern Isles mean that these populations experience higher levels of parental relatedness than elsewhere in the British Isles, leading to a greater proportion of alleles at a given locus in an individual being identical-by-descent.36 It could reasonably be expected that if recessive effects were an important part of MS aetiology, they could easily be discerned in such a population enriched for homozygosity. This, combined with rates of MS among the highest in the world, makes the populations of Orkney and Shetland ideal for investigating the effect of genome-wide homozygosity on MS risk.

That runs of homozygosity increase with endogamous matings is demonstrated by the consistently higher mean values of FROH and FROH_pr in participants with four Northern Isles grandparents compared with the whole sample. Mean Hobs values are also higher in the endogamous group, but to a lesser extent, as this measure also includes more ancient shared parental ancestry common in all participants. Participants in the past lifetime risk control group, being older, are more likely to have four grandparents born in the Northern Isles (or within one isle or parish), and consequently exhibited higher mean FROH, FROH_pr and Hobs values than either cases or age-matched controls. It is this differential indigenousness that biases the results to produce the apparent protective effect of homozygosity against MS when cases are compared with the past lifetime risk control group, rather than any real effect.

Overall, no effect of genome-wide homozygosity was detected. Although this does not discount the possibility of recessive effects, particularly for individual rare recessive variants, it does suggest a limit to the size of any genome-wide effect, if the failure to detect it is a consequence of inadequate sample size. Further, it places some doubt upon the value of Roberts' extrapolation of the potential involvement of recessive effects from cruder measures.9 It is worth noting that, before that study, Roberts, Roberts and Poskanzer12 found no association between inbreeding or kinship coefficients and disease status, and it was only when Roberts expanded upon this study to include a larger number of participants that an effect was found. This might suggest that the present study has failed to detect a real effect through small numbers. However, Roberts used a total of 69 cases, with corresponding controls, and a pedigree-based measure of inbreeding, which owing to its incomplete and error-prone nature, can provide only expected, rather than realised, levels of homozygosity.9 Conversely, this study included a total of 88 cases, with two sets of corresponding controls, and a more accurate, genomic measure of realised homozygosity. The number of cases was, necessarily, reduced when only those with all four grandparents born in Orkney or Shetland were retained (n=52), but the increased precision of the more direct measure of homozygosity should have enabled replication of Roberts' earlier result, if present.

Standard regression techniques assume independent observations but, as the sample includes related individuals, this assumption is potentially violated. Although this suggests that a regression model accounting for relatedness may be indicated, it is salient to note that the sample is not highly related (mean pair-wise kinship: pi hat=0.004), nor is MS highly heritable. Thus, standard logistic regression techniques are appropriate for use in this instance.

The poor levels of precision indicated by the wide confidence intervals are the result of the relatively small sample size. This is an unavoidable consequence of restricting the scope of the study to the small populations of Orkney and Shetland, as there are only a limited number of MS cases available for recruitment. Furthermore, the limited number of cases precludes use of this data set as a stand-alone GWAS, although the data may be included as part of a meta-analysis. Despite this, the use of genetic isolates to investigate the genetic risk factors of complex diseases provides a unique opportunity to elucidate the causative role of recessive and rare variants. While the findings of such investigations may be less widely generalisable than those of GWAS, they nevertheless contribute to greater understanding of the aetiology of disease and, furthermore, are of particular relevance to the study population.

No effect of genome-wide homozygosity was observed; this study therefore provides no evidence that inbreeding or consanguinity is a risk factor for MS in these populations. However, recessive effects at individual variants may still prove important in explaining the high rates of MS in these populations. Techniques for performing homozygosity mapping have recently been developed for use beyond the previously limited scope of consanguineous families, and this represents the logical next step in assessing how homozygosity and recessive alleles might contribute to the heritable risk of developing MS.37