Introduction

Systemic lupus erythematosus (SLE) is an inflammatory disease characterized by autoantibody production and tissue injury. The vast majority of patients are women of childbearing age (sex ratio 9:1). Symptoms flare up and recede over time and can appear in a variety of forms, ranging from mild to very severe. SLE is a complex disease with an unknown etiology. Both environmental factors, such as certain drugs or viruses, and genetic factors are involved in the development of the disease.1,2 The mode of inheritance is unknown, but both familial aggregation and twin studies support a strong genetic component for SLE: the concordance rate is about 10-fold higher in monozygotic than in dizygotic twins (>20 vs 2%) and the risk for a first-degree relative is 10–20 times higher than in the general population.3,4 Several candidate loci (including the HLA region, Fcγ receptors, complement components and, recently, the programmed cell death 1 gene (PDCD1)) have been implicated through association studies, and candidate susceptibility loci have been detected in inbred mouse models of SLE.5,6

To date, six genome-wide scans using SLE as the phenotype without stratification have been carried out in different ethnic groups using different analytical approaches.7,8,9,10,11,12 Each genome scan revealed several suggestive loci, but only six have surpassed the significance threshold of LOD ≥3.3. These loci are on chromosomes 1q22–q23, 1q41–q42, 2q37, 4p16, 6p21–p11 and 16q13. The highest LOD score of 4.24 was found on 2q37, where PDCD1 was recently found to associate with SLE in a large multinational study.6 Moreover, Johanneson et al13 have recently shown significant linkage to 1q31 (Z=3.79) in a set of 87 multicase families from Europe, Mexico, Colombia and the United States.

We have focused our genetic studies of SLE on the Finnish population, which is characterized by multiple rural founder effects.14 We reasoned that such a population structure might offer advantages for studying a relatively rare complex disease. Studies in multiple sclerosis, asthma, type II diabetes and obesity, as examples of complex diseases, have already revealed susceptibility loci in Finland, and SLE with its lower incidence might show even more discrete founder effects. We conducted a genome scan in 35 multiplex families with 388 microsatellite markers with subsequent fine mapping and found evidence for suggestive linkage in regions on 6q23–q27 (NPL=2.47, P=0.008) and 14q21–q23 (NPL=2.2, P=0.02) as well as HLA on 6p (NPL=2.17, P=0.02). The 6q23–q27 linkage region also associates with other autoimmune diseases such as diabetes and rheumatoid arthritis, and hence might be a candidate region for susceptibility to several autoimmune diseases.15,16,17 Our genome scan also suggested a peak on 14q21–q23 (at D14S587), within 1 cM of the peak of Gaffney et al7 (NPL=2.81, P<0.00016 at D14S276). Furthermore, this region has been reported in two other independent studies, and thus should be considered a confirmed susceptibility locus for SLE.9,12

To proceed toward the identification of the susceptibility genes and to test our initial hypothesis of founder effects, we conducted further fine mapping of two of the susceptibility regions, on 6q and 14q. In addition to the 35 multiplex families used in the linkage study, we ascertained 31 simplex families from a restricted region in Central Eastern Finland (Savo province, Kuopio). Kuopio was included because an initial recruitment phase in Helsinki suggested that more patients than expected originated from the Kuopio region, where the prevalence of SLE appeared to be slightly elevated. To verify our linkage results, blood samples of 104 SLE patients and 158 healthy relatives were genotyped with 44 new markers (19 markers on 6q and 25 markers on 14q) and analyzed for haplotype associations. The results reveal short haplotype conservation between families, lending support to the founder hypothesis and suggesting a method of positional cloning of the susceptibility genes.

Subjects and methods

Patient recruitment

The recruitment of patients as well as their phenotypes have been described.18 Briefly, about 1200 SLE patients from a total of 21 hospitals in Finland were asked in person or by mail whether they had a relative or family member with SLE or a connective tissue disease similar to SLE. Patients with a positive family history of SLE and patients with sporadic SLE attending the Helsinki and Kuopio University Hospitals were asked to participate in the study. Kuopio was included because an initial recruitment phase in Helsinki suggested that more patients than expected came from the Kuopio region. All patients included in this study met the American College of Rheumatology criteria for the diagnosis of SLE.19 Based on the estimated prevalence of SLE in Finland, 28/100 000, we were able to identify 80–85% of all Finnish SLE patients requiring hospital-based treatment.20

Selection of multiplex families

A total of 53 families multiply affected by SLE were identified among all contacted SLE patients. Of these families, 35 families with 73 affected pedigree members and 96 healthy relatives were suitable for a linkage mapping study.18 One patient per pedigree was selected at random for association analyses using a computer program designed for this purpose.

Singleton trios

To facilitate linkage disequilibrium mapping, we recruited SLE patients from the same geographical area, the Savo region, in Central Eastern Finland. This region is also known for a previously characterized founder effect for the MLH1 susceptibility gene for colon cancer.21 Altogether, 38 patients and 68 relatives were collected. To allow the unambiguous reconstruction of haplotypes, available parents or alternatively spouses and offspring were collected. All SLE patients had at least one parent or grandparent from the Savo region. Population records were used to track possible distant relationships between families.

Genotyping

PCR assays, with 20 ng of genomic DNA, were conducted as described previously.22 Our linkage peaks were initially genotyped with an average density of 3.5 cM between markers. Those data were combined with the data from the present study for an overall association analysis. Combined, we have data for a total of 32 markers for the linkage peak on chromosome 14q and for 25 markers on chromosome 6q, yielding a final density of <1 cM between markers (Figure 1). All markers for fine mapping were selected from Marshfield genetic maps at http://research.marshfieldclinic.org/genetics/ and the marker order was verified using the NCBI sequencing database (http://www.ncbi.nlm.nih.gov) and UCSC human genome assembly (http://genome.ucsc.edu). The marker order was in accordance with the deCode genetic linkage map.23 For multiplex families, Mendel errors were checked using the Pedmanager and PedCheck software24 and overall genotype frequencies were verified for Hardy–Weinberg equilibrium.

Figure 1

(a) Linkage maps of chromosomes 6 and 14 from 35 Finnish families multiply affected by SLE, full chromosome views (Koskenmies et al, 2003). The distances between markers in linkage regions (black bars) are on average 3.5 cM. (b) Combined linkage results from 35 multiplex and 31 singleton (Savo) families after addition of fine-mapping markers, detailed maps. The intermarker distances are ≤1 cM. Gray bars show the shared associating haplotypes in 10 patients by HPM analysis. Detailed haplotypes are shown in Figure 2. (c) Association analyses using HPM. The dotted line depicts the empirical significance level of 0.01. The y-axis shows the P-value, expressed in logarithmic scale. The x-axis shows the genotyped region in cM (same scale as b).

Association and linkage analyses

We combined all available genotypes for an overall association analysis. To find allele and haplotype associations and study their effects, we used two approaches: haplotype pattern mining (HPM) and transmission disequilibrium test (TDT). HPM is a data mining-based algorithmic approach to genetic association analysis, in which frequent patterns of haplotypes associated with a trait are sought.25,26 HPM analyses case–control data using nontransmitted parental chromosomes as controls (ie, pseudo-controls). We extracted independent trios from the pedigrees using an in-house software tool designed for this purpose. In total, 181 disease-associated chromosomes and 175 control chromosomes were obtained from 89 trios. The following parameters were used for HPM: maximum length of the pattern, 10 markers; maximum number of gaps, 1; and minimum χ2 for a pattern, 3. To compensate for variable marker densities and marker information contents, a total of 10 000 permutations, fashioned as described,25 were run to obtain empirical P-values. Linkage analyses were performed with Genehunter2.27
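The idea of an empirical P-value obtained by permutation can be illustrated with a minimal sketch. This is not the HPM algorithm or its permutation scheme (which is described in reference 25); it is a simplified, single-pattern illustration that assumes case/control labels are shuffled across chromosomes and a 2×2 χ2 statistic is recomputed each time. The function names, the seed and the example data are hypothetical.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no association signal
    return n * (a * d - b * c) ** 2 / denom


def pattern_statistic(carries_pattern, is_case):
    """Chi-square for 'carries the pattern' versus case/control status."""
    n_case = sum(is_case)
    n_control = len(is_case) - n_case
    a = sum(1 for p, s in zip(carries_pattern, is_case) if p and s)
    c = sum(1 for p, s in zip(carries_pattern, is_case) if p and not s)
    return chi_square_2x2(a, n_case - a, c, n_control - c)


def empirical_p(carries_pattern, is_case, n_permutations=2000, seed=1):
    """Share of label permutations whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = pattern_statistic(carries_pattern, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between pattern and status
        if pattern_statistic(carries_pattern, labels) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: 181 case and 175 control chromosomes, 10 carriers, all cases.
carries = [True] * 10 + [False] * 346
status = [True] * 181 + [False] * 175
p_value = empirical_p(carries, status)
```

Because the sketch tests one pattern in isolation, its P-values are not comparable with the marker-wise empirical P-values reported in the Results, which account for the search over many patterns and for variable marker density.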

Results

Patients

The fine mapping and association study included 262 individuals from 66 families. Families originating from the Savo region comprised 38 patients and 68 healthy relatives. With the exception of three multiplex families, already included in the genome scan, the other 30 families were simplex trios, mostly with mother, father and offspring. In 21 of 30 families with mother, father and child, the affected individual was either an offspring (in 13 families) or one of the parents (in eight families), available for haplotype reconstruction. In families with a missing family member (mostly spouse or father), another relative (sib or first-degree relative) was available for haplotyping. Consanguinity was studied from population records back to the late 18th century (in most families back to the mid-19th century) and none was observed between the families. Relationships between pedigree members in 35 multiplex families have been described previously.22

Haplotype association in 14q

A total of 32 markers were analyzed on chromosome 14q covering a region of 24 cM across the implicated region between markers D14S70 and D14S63 (Figure 1). There were three intervals with 3 cM between markers (the consecutive intervals D14S980-D14S274-D14S592-D14S63) and one with 2 cM (D14S587-D14S1064). The average polymorphism information content (PIC) value was 0.72. By linkage, the highest peak after fine mapping was observed at D14S1055 (50.30 cM), NPL 2.22 (P=0.015), 5 cM centromeric from our previous peak at D14S587 using a lower density of markers.22
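For readers unfamiliar with the PIC, the following minimal sketch computes it from allele frequencies using the conventional definition for a codominant marker, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The text does not state which software or formula was used to obtain the reported averages, so this is an assumption, and the allele frequencies in the example are hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(allele_freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings in which parent and offspring genotypes
    # cannot be told apart (both heterozygous for the same allele pair).
    uninformative = sum(2 * (p * p) * (q * q)
                        for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - uninformative


# Hypothetical microsatellite with eight equally frequent alleles.
pic_eight_alleles = polymorphism_information_content([0.125] * 8)

# A marker with two equally frequent alleles gives the biallelic maximum.
pic_two_alleles = polymorphism_information_content([0.5, 0.5])
```

Under this definition a marker with two equally frequent alleles cannot exceed 0.375, so average values above 0.7, as reported here, indicate multiallelic markers with several common alleles.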

Trios from 65 families were analyzed with TDT and HPM using the randomization procedure to assess empirical significance. A significant association was identified for a 2 cM haplotype spanning the markers D14S978-D14S589-D14S562 at 53.2–55.2 cM (alleles 3-2-8) (Figure 1). The pattern was present in 10 chromosomes from SLE patients but in none of the control chromosomes (χ2=9.95, empirical P=0.00598 for marker D14S562, odds ratio not defined). The next best region of allele sharing was seen between markers D14S1009 and D14S748 (alleles 3-1) at 50.1–50.2 cM. Altogether, 15 patient chromosomes and two control chromosomes shared this haplotype, which, however, did not reach significance (χ2=9.99, empirical P=0.14, odds ratio=13, 95% confidence interval 2.9–59).
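The χ2 values quoted for the two 14q haplotypes are consistent with a plain 2×2 Pearson statistic without continuity correction. The sketch below assumes that all 181 disease-associated and 175 control chromosomes described in the Methods were scored for each pattern; the text does not state this explicitly, so the denominators are an assumption. The function name is illustrative.

```python
def haplotype_chi_square(case_carriers, n_case, control_carriers, n_control):
    """Pearson chi-square (1 df, no continuity correction) for haplotype carriage."""
    a = case_carriers
    b = n_case - case_carriers          # case chromosomes without the haplotype
    c = control_carriers
    d = n_control - control_carriers    # control chromosomes without the haplotype
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Haplotype 3-2-8: 10 patient chromosomes, no control chromosomes.
chi2_328 = haplotype_chi_square(10, 181, 0, 175)

# Haplotype 3-1: 15 patient chromosomes, two control chromosomes.
chi2_31 = haplotype_chi_square(15, 181, 2, 175)

print(round(chi2_328, 2), round(chi2_31, 2))  # 9.95 9.99
```

The two haplotypes give nearly identical χ2 values yet very different empirical P-values, which illustrates why the permutation procedure, rather than the nominal χ2 distribution, was used to judge significance.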

Interestingly, both shared haplotypes 3-1 and 3-2-8 were found on the same chromosomes in three individuals, suggesting a common origin. Indeed, the longest shared haplotype between two unrelated SLE patients covered 8 cM from D14S288 to D14S281 (Figure 2). Only five meioses were informative for TDT analyses for these haplotypes. The 3-2-8 haplotype was transmitted four times and not transmitted once (P=0.18). For the 3-1 haplotype, there were six transmitted and two nontransmitted chromosomes (P=0.157). The TDT thus remained inconclusive.

Figure 2

Detailed analysis of independent SLE-associated haplotypes for marker loci on 14q from a set of patients of 35 multiplex and 31 singleton (Savo) Finnish families. The haplotype 3-2-8 is present in 10 patients and no controls and is transmitted four times and not transmitted once. The haplotype 3-1, present in 15 patients and two controls, is transmitted six times and not transmitted two times.

Haplotype association in 6q

On chromosome 6q, the number of mapped markers was 25 across a 22 cM region from D6S308 to D6S1035. With the exception of two gaps with 4 and 3 cM between markers D6S308-D6S1703 and D6S437-D6S1035, respectively, all other markers had an intermarker distance of 1 cM or less, as listed in Figure 1. The mean marker information content (PIC) was 0.74. The highest NPL score observed in the genome scan decreased slightly from 2.47 (marker D6S960 at 151.4 cM) to 2.27 (P=0.013) and the peak marker was shifted 5 cM telomeric to D6S1708 at 157.8 cM. The NPL scores and the corresponding information content are shown in Figure 1.

Trios used for the HPM and TDT analyses were the same as for chromosome 14q analysis. By HPM analysis, some allele sharing was observed between markers D6SGATA184A08 (allele 1) and D6S1637 (allele 3) at 147–148 cM, 10 cM away from the peak of linkage. The haplotype GATA184A08-D6S1637 (alleles 1-3) was seen in 27 patient chromosomes and nine control chromosomes (χ2=8.017, empirical P=0.0719 for marker D6S1637, odds ratio=5.8, 95% confidence interval 2.6–13). The empirical association thus remained nonsignificant. However, in TDT analyses, the GATA184A08-D6S1637 haplotype 1-3 was transmitted 15 times and not transmitted five times (χ2=5.00, P=0.025).
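The TDT P-values reported for the 14q and 6q haplotypes agree with the standard McNemar-type statistic, χ2 = (T − NT)² / (T + NT) with one degree of freedom, where T and NT are the transmitted and nontransmitted counts from heterozygous parents. The following minimal sketch assumes that this asymptotic form was used; the text does not name the software, and the function name is illustrative.

```python
import math


def tdt(transmitted, not_transmitted):
    """Return (chi-square, asymptotic P-value) for the transmission disequilibrium test."""
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - not_transmitted) ** 2 / informative
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts taken from the Results: (transmitted, not transmitted).
for label, t, nt in [("14q 3-2-8", 4, 1), ("14q 3-1", 6, 2), ("6q 1-3", 15, 5)]:
    chi2, p = tdt(t, nt)
    print(f"{label}: chi2={chi2:.2f}, P={p:.3f}")
```

With as few informative transmissions as seen for the 14q haplotypes, the χ2 approximation is rough and an exact binomial test is often preferred, which is in keeping with treating those TDT results as inconclusive.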

Geographical distribution of the associated haplotypes

Our ascertainment scheme was intentionally biased to include an excess of families from Savo in central eastern Finland, in anticipation of a founder effect. Tracing back the birthplaces of parents or grandparents of the patients, we found that five out of 10 chromosomes (5/9 families) with the 3-2-8 haplotype on chromosome 14 came from Savo. The population of this geographical region has previously shown founder effects in studies of monogenic diseases.21,28 On closer inspection, two of the Savo families from neighboring parishes shared a long haplotype of 8 cM, comprising both associated haplotypes 3-1 and 3-2-8, as described above. Two families with the 3-2-8 haplotype originate from western Finland. There are three families (two originating from Savo) where both the chromosome 14q haplotypes and also the 6q two-marker haplotype (3-1, 3-2-8 on chromosome 14 and 1-3 on chromosome 6) are found, and five families (three originating from Savo) with six patients, where the haplotype 3-2-8 or 3-1 on chromosome 14q and 1-3 on chromosome 6q are present. Considering the exclusion of close relationships between the families, such haplotype sharing is highly unexpected unless it is associated with the selection criterion (ie, SLE), and the clustering of the associated haplotypes in eastern Finland may suggest a founder effect, possibly involving two loci on chromosomes 14q and 6q.

Discussion

The positional cloning of susceptibility genes in complex diseases proceeds from linkage mapping to association study under the hypothesis of shared ancestral mutations among the patients. In this study, we have verified our linkage study results with a dense map of markers and taken the first steps toward a genetic association study for two loci on chromosomes 14q and 6q. We designed our study to take advantage of the subisolate structure of Finland and targeted especially one such region in central eastern Finland. Remarkably, we found significantly excessive haplotype sharing between patients from neighboring parishes whose close relationships were, however, excluded at least back to the mid-19th century. Even more surprisingly, six patients from five unrelated families shared two loci in different chromosomes, an observation that is highly unlikely if not causally associated with the disease.

The length of the associated haplotypes is roughly in accordance with the population history. The chromosome 14q haplotype 3-2-8 spans 2 cM and the chromosome 6q haplotype 1-3 spans 1 cM. Haplotype sharing over such distances is not expected unless caused by a relatively young founder effect. The area was settled in the early 16th century and similar founder effects have been noted there before.29,30

Our finding is compatible with previous studies on three independent data sets, but refines the search region considerably. Gaffney et al7 found linkage of SLE to D14S276 (LOD 2.81, P<0.00016) using multipoint nonparametric methods with 105 sib pairs. Our conserved haplotype lies less than 1 cM away from the peak of Gaffney et al.7 Moreover, Shai et al,9 in a study of 80 multiplex families, and Lindqvist et al12 with Swedish families have reported linkages to 14q with LOD scores of 2.02 (P=0.02, D14S258) and 1.15 (D14S592), respectively. For chromosome 6q, linkage to SLE has previously been reported about 20 cM telomeric to our linkage peak.8 Moreover, the same general region has been implicated in previous studies with insulin-dependent diabetes mellitus (6q25, IDDM5, between markers D6S476 and D6S473, LOD 4.5) and rheumatoid arthritis (D6S311 and D6S440, P=0.016 and 0.017, respectively).15,16,17 The overtransmitted haplotype on 6q identified in our study coincides with the peak of Myerscough et al.17

No obvious candidate genes reside within the conserved haplotype regions in gene maps. It will be necessary to perform further fine mapping with tens of more tightly spaced SNP markers within and near the identified haplotypes on both 14q and 6q to confirm the suggested identity-by-descent of the haplotypes. Thereafter, positional cloning can proceed to verifying new transcripts and studying their polymorphisms and possible associations with SLE in this and other sample sets. We conclude that our fine-mapping results suggest potentially highly interesting regions for additional susceptibility genes in SLE.