The autoimmune reactions associated with many human diseases are still only partially understood. Unravelling the molecular pathogenesis of inherited diseases with a strong autoimmune component in their clinical expression could help to dissect the molecular background of abnormal immune responses. One such genetic disorder is autosomal recessive polyglandular autoimmune disease type I, also known as autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED, MIM number 240300) [1,2]. The disease is especially enriched in the genetically isolated population of Finland [3]: to date 70 patients have been identified in Finland and fewer than 100 elsewhere, the incidence of APECED in Finland being at least 1:25,000. The clinical phenotype of APECED includes variable combinations of the failure of parathyroid glands, the adrenal cortex, gonads, pancreatic beta cells, gastric parietal cells and the thyroid gland. Additional features include hepatitis, chronic mucocutaneous candidiasis, dystrophy of dental enamel and nails, alopecia, vitiligo and keratopathy. The diagnosis is typically made in childhood, but symptoms may appear as late as the fifth decade of life [4].

The pathogenesis of APECED is unknown and there are no reports of abnormal biochemical findings or chromosomal rearrangements in the affected individuals. Consequently, a random search of the human genome directly at the DNA level is a relevant approach for revealing the defective gene locus. Here we have taken advantage of recently identified amplifiable polymorphisms assigned to human chromosomes [5]. The microtiter well format of the PCR [6] was adapted for screening these markers, and high informativeness of the markers guarantees relatively wide regions of exclusion in linkage analysis. The map constructed from multiallelic markers was complemented with biallelic markers also detected in the microtiter well format using the solid-phase minisequencing technique [7]. The screening for the APECED locus using 62 polymorphisms suggests that the most probable location for this autoimmune-disease gene is chromosome 22.

Materials and Methods

A total of 15 APECED families were included in this study (fig. 1). In total, 94 blood samples were collected, 25 from affected siblings. Genomic DNA was extracted by standard procedures from 10-ml blood samples [8].

Fig. 1. The collection of APECED pedigrees included in this study.

In this study 62 polymorphic markers were analyzed, most of them highly polymorphic microsatellite markers. Alleles were detected by PCR amplification [6] followed by polyacrylamide gel electrophoresis. PCR was performed in a volume of 15 µl containing 20 ng of template DNA, 2 pmol of primers, 0.2 mM each of dATP, dCTP, dGTP and dTTP, 20 mM Tris-HCl (pH 8.8), 15 mM (NH4)2SO4, 1.5 mM MgCl2, 0.1% Tween 20, 0.01% gelatin and 0.8 U of Thermus aquaticus DNA polymerase (Promega). One of the primers was labeled at the 5′ end using 32P. The PCR reactions were performed in multiwell microtitration plates for 28 cycles: a first cycle of 95 °C for 3 min, 55 °C for 30 s and 72 °C for 30 s, followed by 26 cycles of 95 °C for 30 s, 55 °C for 30 s and 72 °C for 30 s, and a final cycle of 95 °C for 30 s and 55 °C for 30 s followed by a 4-min extension at 72 °C. The amplified fragments were separated by electrophoresis in 6% denaturing polyacrylamide sequencing gels. Twelve of the analyzed polymorphisms were single-nucleotide variations. These biallelic markers were amplified by PCR and typed by the solid-phase minisequencing technique as described in detail elsewhere [7, 9].
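The cycling conditions described above can be written out as a simple program table. The following Python sketch is only an illustration: the names are ours, the temperatures and hold times are those given in the text, and ramping time between steps is ignored.

```python
# Each step is (temperature in deg C, hold time in seconds), as given in the text.
FIRST_CYCLE = [(95, 180), (55, 30), (72, 30)]   # 3-min initial denaturation
MIDDLE_CYCLE = [(95, 30), (55, 30), (72, 30)]   # repeated 26 times
FINAL_CYCLE = [(95, 30), (55, 30), (72, 240)]   # 4-min final extension
N_MIDDLE_CYCLES = 26


def build_program():
    """Return the full list of cycles in the order they are run."""
    return [FIRST_CYCLE] + [MIDDLE_CYCLE] * N_MIDDLE_CYCLES + [FINAL_CYCLE]


def total_hold_seconds(program):
    """Sum the hold times of all steps (ramping between steps is not included)."""
    return sum(seconds for cycle in program for _, seconds in cycle)


program = build_program()
print(len(program))                      # number of cycles: 28
print(total_hold_seconds(program) / 60)  # total hold time in minutes: 48.0
```

The count confirms that the first cycle, the 26 repeated cycles and the final cycle together make up the 28 cycles stated.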

To test whether the family material available was sufficient for conclusive results in the linkage analyses, we first carried out data simulation analysis assuming a single marker locus and double heterozygosity for the parents. We used the SLINK program, specifically its MSIM option, with 1,000 replicates (one replicate equals one round of generating marker genotypes for each individual) to obtain the ELOD (expected lod score) values [10, 11].

The MLINK option of the LINKAGE package programs (version 5.10) was used to calculate the linkage between the disease and the individual markers [12]. Because the age at onset of the disease varies, the following age-dependent categories of penetrance were adopted for the healthy individuals in the linkage analyses: for age categories of <2, 2–5, 6–10, 11–15, 16–20, 21–30 and >30 years a penetrance of 5, 20, 45, 60, 70, 80 and 95%, respectively, was assigned. The original MLINK data from the pairwise linkage analysis were subsequently analyzed with the EXCLUDE program to construct an exclusion map of the disease [13]. The data input into the program consisted of the chromosomal position of the locus, the locus name, the linkage data in the form of recombination fractions (we used five θ values: 0.00, 0.10, 0.20, 0.30 and 0.40) and the corresponding lod scores calculated by the MLINK program. In all calculations the Haldane map function was used to convert recombination fractions to map distances. The locations of the markers analyzed were based on HGM11 and the map (Marshfield markers 8) released by Dr. J.L. Weber (Marshfield Medical Research Foundation, Marshfield, Wisc., USA) and expressed as a percentage from pter to qter. When the precise chromosomal location was unknown or extended over a region in a chromosome, the marker was placed in the middle of that region.
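Two of the conventions described above lend themselves to a short illustration: the assignment of a penetrance class to an unaffected individual by age, and the Haldane conversion of a recombination fraction θ to a map distance, d = −50 ln(1 − 2θ) cM. The Python sketch below is not part of the LINKAGE or EXCLUDE programs; the function names are ours, and treating age as completed whole years is an assumption made so that the published categories do not leave gaps.

```python
import math

# (inclusive upper age bound in completed years, penetrance), from the text.
PENETRANCE_BY_AGE = [
    (1, 0.05),    # < 2 years
    (5, 0.20),    # 2-5 years
    (10, 0.45),   # 6-10 years
    (15, 0.60),   # 11-15 years
    (20, 0.70),   # 16-20 years
    (30, 0.80),   # 21-30 years
]
PENETRANCE_OVER_30 = 0.95  # > 30 years


def penetrance_for_age(age_years: int) -> float:
    """Return the penetrance class for an age given in completed years (assumption)."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    for upper_bound, penetrance in PENETRANCE_BY_AGE:
        if age_years <= upper_bound:
            return penetrance
    return PENETRANCE_OVER_30


def haldane_cm(theta: float) -> float:
    """Convert a recombination fraction to a map distance in cM (Haldane)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    if theta == 0.0:
        return 0.0
    return -50.0 * math.log(1.0 - 2.0 * theta)


# The five recombination fractions used as EXCLUDE input in the text.
for theta in (0.00, 0.10, 0.20, 0.30, 0.40):
    print(f"theta = {theta:.2f} -> {haldane_cm(theta):.2f} cM")
```

Under the Haldane function, θ = 0.10 corresponds to roughly 11 cM and θ = 0.40 to roughly 80 cM, which shows how quickly map distance grows as θ approaches 0.5.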

Results and Discussion

The family material analyzed is summarized in figure 1. Although the clinical phenotype of APECED is highly variable, the disease is enriched in the genetically isolated Finnish population. Consequently the assumption of locus homogeneity is valid within this population. To test if the available family material was sufficient to obtain conclusive data in linkage analysis of this disease with heterogeneity in phenotype and age-dependent penetrance, we carried out simulations in our family material with 1,000 replicates using the SLINK program [10, 11]. Under the one-locus model for this family material the mean ELOD is 7.110 with a standard deviation of 1.424. Thus, the expected lod score at a recombination fraction of 0.01 ranges from 4.262 to 9.958 (95% confidence interval). The lod score still remained significant in over 50% of the replicates at the recombination fraction of 0.10. This simulation analysis suggests that an evenly spread set of informative markers should result in a conclusive lod score in our family material.
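The quoted range is reproduced by taking the mean ELOD plus and minus two standard deviations, as the following Python sketch shows. The function name is ours, and the reading of the 95% interval as mean ± 2 SD is inferred from the numbers rather than stated in the text.

```python
def two_sd_interval(mean, sd, k=2.0):
    """Return (mean - k*sd, mean + k*sd); k = 2 approximates a 95% range."""
    return mean - k * sd, mean + k * sd


# Mean ELOD and standard deviation from the SLINK simulation reported above.
low, high = two_sd_interval(7.110, 1.424)
print(f"{low:.3f} to {high:.3f}")  # 4.262 to 9.958
```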

We analyzed the linkage data from a total of 62 polymorphic markers on all 22 autosomes. Fifty markers detected multiallelic polymorphisms. The informativeness of these multiallelic markers is very high. We applied the PCR in the microtiter well format with multichannel pipetting, which provides a rapid semiautomated method for detecting these multiallelic DNA polymorphisms. Two subsequent loadings of the samples on large sequencing-type polyacrylamide gels made it possible to obtain information from two markers of 60 individuals per gel. In the multiwell format, 96 samples can conveniently be analyzed simultaneously and, although loading of the sequencing gels was carried out manually, the system significantly improved the efficiency of detecting allelic polymorphisms.

The problems encountered with multiallelic polymorphisms were to a large extent associated with the identification of individual alleles in the case of each polymorphism. Frequently this required simultaneous analyses of representative samples of individual families on the same sequencing gel. For linkage analyses per se this would not have been necessary since within a family the allelic pattern could be identified without any problem but for determining allelic frequencies the precise typing of alleles is a necessity.

Twelve biallelic markers were analyzed to bridge the gaps still existing in the chromosomal maps of highly polymorphic amplifiable markers. The alleles of these markers were also identified after PCR amplification, and the specific solid-phase minisequencing technique was applied to detect the alleles [7]. Using this method the results are obtained in numeric values which unequivocally identify heterozygotes and homozygotes for a given polymorphism [9].

In our family material originating from the genetically isolated Finnish population the informativeness of the markers analyzed was close to the published heterozygosity values obtained in other populations (table 1). Pairwise linkage data to individual markers resulted in some low positive lod scores but no definitive proof of linkage to any of the markers analyzed was found. The total length of the chromosomal regions revealing a relative exclusion (lod score <–1.0) was 1,597.7 cM (table 1) compared to total autosomal length of 2,885 cM (table 2).
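The proportion of the autosomal genome covered by this relative exclusion follows directly from the two lengths quoted above. A minimal Python sketch of the arithmetic (the variable names are ours):

```python
# Lengths in cM as reported in the text.
EXCLUDED_CM = 1597.7         # regions with lod score < -1.0
TOTAL_AUTOSOMAL_CM = 2885.0  # total autosomal length used in the analysis

excluded_fraction = EXCLUDED_CM / TOTAL_AUTOSOMAL_CM
print(f"{excluded_fraction:.1%}")  # 55.4%
```

Slightly more than half of the autosomal map length was thus excluded at this threshold.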

Table 1 Polymorphic markers analyzed
Table 2 Exclude analysis

In addition to performing the traditional pairwise linkage analyses using the MLINK program we combined the MLINK data and analyzed them with the EXCLUDE program [13]. The result gave evidence of some chromosomal locations with a clearly higher probability of representing the site of the APECED locus than the analyzed loci on the other chromosomes (table 2, figure 2). The result suggested a fairly high probability for the disease locus being located on chromosome 22, corresponding to a lod score of 3.1 (table 2). This value cannot be directly interpreted as proof of linkage, however. The corresponding maximal lod score achieved in multipoint (LINKMAP) analysis with the disease against the three markers of chromosome 22 is 1.4. This apparent numeric discrepancy is due to the nature of the EXCLUDE program, in which the obtained lod scores are not comparable with the values achieved in the classical linkage analyses. The program takes advantage of the fact that a gene excluded from one region must have a higher probability of being in another region of the genome, and this assumption increases the numeric values of the likelihood compared to the values achieved with the more classical linkage approaches adopted in the LINKAGE package programs [12].

Fig. 2. Graphical presentation of the exclusion map for APECED disease. The most probable areas for the gene location are colored black while the remaining white areas represent the excluded areas.

Although it is problematic to specify candidate genes for the multicomponent APECED disease with multiple phenotypic features and clinical symptoms from several tissues, the genes encoding proteins that control normal immune responses could serve as tentative candidate loci. Such proteins include, besides histocompatibility antigens and T cell antigen receptors, immunoglobulins, as well as proteins affecting antigen processing and presentation, lymphocyte proliferation and differentiation, and the effector molecules that mediate immune destruction in normal immune responses.

Association with the HLA region on chromosome 6 was excluded in an earlier study based on serologically determined HLA haplotypes in APECED families [14]. So far we have analyzed genes coding for the T cell antigen receptors (TCR) on chromosomes 7 and 14 [15–18], as well as the immunoglobulin heavy chain (IgH) gene cluster segments on chromosome 14 [18], the gene encoding the CD3 δ chain on chromosome 11 [19], and the CD8 α-chain gene on chromosome 2 [20]. The markers D7S435 on chromosome 7, TCRD and IGHJ on chromosome 14, CD3D on chromosome 11 and CD8A on chromosome 2 are located in the immediate vicinity of these regions. The result of the linkage analyses was a definitive exclusion of all these areas. Interestingly, chromosome 22, which is the most likely chromosome to carry the APECED locus, contains a gene cluster coding for structural proteins of the immunoglobulin λ chain [21], as well as genes for pre-B lymphocyte gene 1 and the interleukin-2 receptor β chain [22]. This is further justification for future analysis focusing on the chromosomal regions containing these genes.

Efficient chromosomal screening with the microsatellite markers represents a major improvement in the mapping of inherited human diseases [5]. Here we analyzed both multiallelic and biallelic markers in the microtiter well format, which is suitable for automation. Even the semiautomated protocol applied here proved to be effective and economical. The highly polymorphic, multiallelic markers are efficient tools in the mapping of any inherited disease, even with limited family resources. Coupling the microtiter well format amplification of these markers to nonradioactive labeling of the product, and using an automated sequencer to identify and store allelic information, will in the future provide the technical basis for mapping inherited human diseases.