Introduction

The Cohen syndrome (MIM No. 216550) [1] is a rare autosomal recessive disorder of unknown pathogenesis. It is characterized by nonprogressive mental and motor retardation, usually of moderate to severe degree, typical facial features, short stature, and hypotonia [2–4]. Craniofacial features include microcephaly, short philtrum, prominent upper central incisors, prominent bridge of the nose, high-arched or flame-shaped outline of the lid openings and thick eyebrows [2–4]. Other characteristic features are laxity of the joints, slender extremities and a cheerful disposition [2–4]. In addition, marked ophthalmological changes and granulocytopenia are typical features of COH1 in Finnish patients [4]. The ophthalmological findings include myopia and chorioretinal dystrophy that lead to bull’s eye-like macula, retinal pigmentary deposits, decreased visual acuity, night blindness, constriction of visual fields and extinguished electroretinogram [4]. Although the ophthalmological findings in patients outside Finland have often been described incompletely or not at all, ocular findings exactly similar to those of the Finnish patients [4] have been described in patients from several countries [5–8]. COH1 occurs with low frequency worldwide, but is enriched in the Finnish population. So far, 28 patients have been diagnosed in Finland.

Following a genome-wide search, the COH1 gene for Cohen syndrome was mapped to the long arm of chromosome 8 in five multiplex Finnish families displaying a homogeneous clinical phenotype [9]. The initial gene localization was in an approximately 10-cM region between marker loci D8S270 and D8S521 [9]. We have now extended our family panel by 11 families to include 16 Finnish families. Based on the uniformity of the diagnostic criteria, the rareness of the condition, and the typical features of a founder population we assumed that all, or at least an overwhelming majority, of these patients have inherited the same ancestral mutation. Here we used newly described markers in the COH1 region to look for putative ancestral recombinations in the disease haplotypes, and applied the Luria-Delbrück principle [10] to calculate genetic distances allowing us to greatly refine the assignment of COH1.

Materials and Methods

Subjects and Samples

Twenty-two affected and 36 unaffected individuals (22 parents and 14 sibs) were included in the study. The patients belonged to 16 kindreds, 5 of which were multiplex nuclear families [9] and 11 had a single affected individual. All patients fulfilled the diagnostic criteria for COH1 in Finland, and the affection status was verified by two of us (R.N. and S.K.-K.). From each family, only 1 affected individual and both parents, if available, were considered in the linkage disequilibrium and haplotype analysis. In families 4 and 5, which were closely related [9], the shared disease-bearing chromosome was taken into consideration only once. Venous blood was collected from each consenting individual and genomic DNA was isolated from leukocytes by standard methods [11].

Markers and PCR

Twelve polymorphic markers from Généthon [12–14] were studied. The published order and genetic distances [12–14] between the markers are: D8S270 – 3.2 cM – D8S1699 – 1.1 cM – D8S1772 – 0.1 cM – D8S1822 – 2.3 cM – D8S506 – 1.5 cM – D8S257 – 0.7 cM – D8S559 – 0.8 cM – (D8S1808, D8S1762, D8S546) – 3.0 cM – (D8S1714, D8S521). The italicized markers are novel markers that were not included in the previous linkage study [9].

PCR reaction and electrophoresis conditions were as described earlier [15], and the polymorphic alleles were visualized by the silver staining method [16]. Alleles were numbered consecutively, ‘1’ being the largest allele.

Linkage Disequilibrium Analysis

Allele frequencies on disease-bearing chromosomes were compared with allele frequencies of the normal, non-COH1 chromosomes of the carrier parents. When searching for an increase in the frequency of a single allele on disease-bearing over normal chromosomes, each allele was examined separately. We used a standard one-sided test (Fisher’s exact test) and corrected for multiple testing by using a Bonferroni correction [$1 - (1 - p)^k$, where $k$ is the number of alleles] under the assumption that the tests for each allele detected were approximately independent.
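The per-allele test and the correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the one-sided Fisher probability is computed directly from the hypergeometric tail rather than with any particular statistics package, and the allele counts in the example are invented for demonstration and are not taken from table 1.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: disease chromosomes with / without the allele
    c, d: normal chromosomes with / without the allele
    Returns P(X >= a) with all margins held fixed (hypergeometric tail).
    """
    row1 = a + b            # number of disease chromosomes
    row2 = c + d            # number of normal chromosomes
    col1 = a + c            # chromosomes carrying the allele
    total = row1 + row2
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a, upper + 1))
    return tail / comb(total, col1)


def correct_for_alleles(p, k):
    """Correction quoted in the text: 1 - (1 - p)^k for k alleles tested."""
    return 1.0 - (1.0 - p) ** k


# Hypothetical counts (assumption, for illustration only):
# allele seen on 25 of 29 disease and 9 of 24 normal chromosomes,
# at a marker with 8 alleles.
p_raw = fisher_one_sided(25, 4, 9, 15)
p_corrected = correct_for_alleles(p_raw, k=8)
print(f"raw p = {p_raw:.2e}, corrected p = {p_corrected:.2e}")
```

Because each allele is tested separately, the corrected value is always at least as large as the raw one, which makes the reported significance levels conservative with respect to the number of alleles at a marker.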

To estimate the distances between the COH1 locus and markers that were in linkage disequilibrium with it, the Luria-Delbrück-based method [10, 17, 18] was applied. The proportion of disease-causing chromosomes ($\alpha$) descending from the putative common ancestor, i.e. the degree of allelic homogeneity of COH1 in Finland, was calculated using the formula $\alpha = 1 - \mu g q^{-1}$, where $\mu$ denotes the mutation frequency at the COH1 locus, $g$ the number of generations since founding and $q$ the disease allele frequency. Allelic excess ($p_{\text{excess}}$) was calculated by the equation $p_{\text{excess}} = (p_{\text{affected}} - p_{\text{normal}})/(1 - p_{\text{normal}})$, where $p_{\text{affected}}$ and $p_{\text{normal}}$ denote the allele frequencies in COH1 and normal chromosomes, respectively [17]. The genetic distance ($\theta$) of the disease mutation from a marker locus was estimated using the equation $p_{\text{excess}} = \alpha(1 - \theta)^g$. In these calculations $\mu$ was set at $5.0 \times 10^{-6}$, as the disease is rare. The parameter $g$ was allowed to vary from 10 to 100. As a best estimate for the population under study, COH1 disease allele frequency ($q$) was set at 0.004 based on the Hardy-Weinberg principle and the actual incidence of 1:60,000 of COH1 in the regions of Finland where the disease occurs [Norio, unpubl. data].
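The three formulas in this paragraph are simple enough to evaluate directly. The following Python sketch is offered as an illustration, not as the software used in the study; the function names are hypothetical. It derives the allele frequency from the incidence, computes the ancestral proportion for several values of g, and inverts the last equation to obtain the genetic distance.

```python
import math

MU = 5.0e-6   # mutation frequency at the COH1 locus, as set in the text
Q = 0.004     # disease allele frequency, as set in the text


def allele_frequency_from_incidence(incidence):
    """Hardy-Weinberg for a recessive disorder: incidence = q^2."""
    return math.sqrt(incidence)


def ancestral_proportion(mu, g, q):
    """alpha = 1 - mu * g / q."""
    return 1.0 - mu * g / q


def allelic_excess(p_affected, p_normal):
    """p_excess = (p_affected - p_normal) / (1 - p_normal)."""
    return (p_affected - p_normal) / (1.0 - p_normal)


def genetic_distance(p_excess, alpha, g):
    """Solve p_excess = alpha * (1 - theta)^g for theta."""
    return 1.0 - (p_excess / alpha) ** (1.0 / g)


# Incidence of 1:60,000 gives q close to the 0.004 used in the text.
print(f"q from incidence: {allele_frequency_from_incidence(1 / 60000):.4f}")

# Ancestral proportion over part of the range of g explored in the text.
for g in (10, 20, 50, 100):
    print(f"g = {g:3d}: alpha = {ancestral_proportion(MU, g, Q):.4f}")
```

Writing the distance as a closed-form inversion makes the dependence on g explicit: for a fixed allelic excess, a larger number of generations implies a smaller estimated distance.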

The likelihood-based DISMULT program [19] version 2.1 was used for the computation of lod scores in a multipoint linkage disequilibrium analysis. The program has been developed for analyzing data from multiallelic markers. It uses information from all the marker loci simultaneously and has a built-in location parameter. In this analysis all markers except the most centromeric one, D8S270, were considered. The intermarker genetic distances were transformed to physical distances under the assumption 1 cM = 1 Mb. The intermarker distance of markers with a genetic distance of 0 cM was set at 50 kb. The decay parameter was allowed to vary from 5 to 500. The 99% confidence interval for the maximum likelihood ratio (Zmax) was calculated as Zmax ± 3 lod units.
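The conversion from genetic to physical distances described above can be sketched as follows. This illustrates only the stated conversion rule (1 cM = 1 Mb, 50 kb between markers at 0 cM) and is not DISMULT input or code. Two assumptions are made here and are not given in the text: markers reported at the same map position are ordered as they are listed, and the 3.0 cM gap is measured from the last marker of one cluster to the first marker of the next.

```python
KB_PER_CM = 1000     # 1 cM = 1 Mb, as assumed in the text
ZERO_CM_GAP_KB = 50  # spacing used for markers 0 cM apart

# (marker, distance in cM from the previous marker); D8S270 is omitted,
# as in the multipoint analysis. Order within clusters is an assumption.
GENETIC_MAP = [
    ("D8S1699", 0.0),
    ("D8S1772", 1.1),
    ("D8S1822", 0.1),
    ("D8S506", 2.3),
    ("D8S257", 1.5),
    ("D8S559", 0.7),
    ("D8S1808", 0.8),
    ("D8S1762", 0.0),
    ("D8S546", 0.0),
    ("D8S1714", 3.0),
    ("D8S521", 0.0),
]


def physical_positions(genetic_map):
    """Return {marker: position in kb}, measured from the first marker."""
    positions = {}
    position_kb = 0
    for index, (marker, gap_cm) in enumerate(genetic_map):
        if index > 0:
            gap_kb = round(gap_cm * KB_PER_CM)
            # Markers with no measurable genetic distance get a nominal gap.
            position_kb += gap_kb if gap_kb > 0 else ZERO_CM_GAP_KB
        positions[marker] = position_kb
    return positions


POSITIONS = physical_positions(GENETIC_MAP)
for marker, kb in POSITIONS.items():
    print(f"{marker:8s} {kb:6d} kb")
```

The nominal 50 kb spacing keeps co-localized markers at distinct positions, which a location-based likelihood needs in order to distinguish them.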

Haplotype Analysis

Haplotypes were constructed for six markers in the order suggested by the genetic map [14]: D8S257-D8S559-D8S1808-D8S1762-D8S546-D8S1714. Most likely haplotypes were constructed manually using the minimum recombination strategy. The phase of the COH1 haplotypes was determined by genotyping the patients and their parents. When parents were not available, haplotypes were inferred from children.

Results

Significance of Linkage Disequilibrium

We tested for linkage disequilibrium at eight marker loci that did not show recombinations with COH1. Alleles on 31 apparently unrelated COH1-bearing chromosomes compared with alleles on 24 normal chromosomes of the parents are shown in table 1. A highly significant (p < 0.001) linkage disequilibrium was detected with markers D8S559 and D8S1762, the most common allele occurring in 75 and 87% of the COH1-bearing chromosomes, but in 21 and 38% of the normal chromosomes, respectively. Marker D8S1808, which has the same reported map location as D8S1762 and D8S546, was less informative, resulting in a nonsignificant allelic association. Marker loci D8S506, D8S257 and D8S546 showed a less significant (0.001 < p < 0.01) linkage disequilibrium.

Table 1. Distribution of alleles on disease/normal chromosomes at the eight marker loci

Estimating Distances by Linkage Disequilibrium

Previously described strategies (see Materials and Methods for details) were applied to estimate the genetic distance between COH1 and marker loci based on linkage disequilibrium. The results are shown in table 1 and figure 1. The highest $p_{\text{excess}}$ value (table 1) of 0.79 was obtained at marker locus D8S1762, the next highest value being 0.73 at locus D8S546. Consequently, COH1 appears to lie closest to locus D8S1762 (fig. 1), the next closest marker being D8S546 residing at a genetic distance of 0.00 cM telomeric to D8S1762. Marker loci D8S1808 and D8S559 residing 0.00 and 0.80 cM, respectively, from D8S1762 on the centromeric side showed a greater distance to COH1. Based on the geographical distribution of the birthplaces of known carriers (fig. 2) and parameters characterizing the Finnish population history (see Discussion for details), we assumed that the founding COH1 mutation started to spread in the Finnish population 20–50 generations ago. Assuming 20 generations since founding results in α = 0.98 (see Materials and Methods) and a distance estimate of 1 cM of COH1 from D8S1762. The assumption of 50 generations results in α = 0.94 and a genetic distance of 0.33 cM.
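As a check on the arithmetic, the two distance estimates for D8S1762 can be reproduced by inverting the equation given in Materials and Methods. The inputs are the rounded values quoted in the text (allelic excess 0.79, mutation frequency 5.0 × 10^-6, allele frequency 0.004), so the last digit may differ slightly from the reported figures.

```latex
\begin{align*}
\theta &= 1 - \left(\frac{p_{\text{excess}}}{\alpha}\right)^{1/g},
\qquad \alpha = 1 - \frac{\mu g}{q} \\[6pt]
g = 20:\quad
\alpha &= 1 - \frac{(5.0 \times 10^{-6})(20)}{0.004} = 0.975 \\
\theta &= 1 - \left(\frac{0.79}{0.975}\right)^{1/20} \approx 0.010
\quad (\approx 1\ \text{cM}) \\[6pt]
g = 50:\quad
\alpha &= 1 - \frac{(5.0 \times 10^{-6})(50)}{0.004} = 0.9375 \\
\theta &= 1 - \left(\frac{0.79}{0.9375}\right)^{1/50} \approx 0.0034
\quad (\approx 0.3\ \text{cM})
\end{align*}
```

Both values agree, to within rounding of the inputs, with the estimates of 1 cM and 0.33 cM reported above.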

Fig. 1. Genetic distances separating the COH1 gene locus from six marker loci as a function of the number of generations since founding.

Fig. 2. Birthplaces of the parents of Cohen syndrome patients in 16 Finnish families showing the geographical distribution of COH1-associated haplotypes for marker loci D8S559-D8S1808-D8S1762-D8S546. The dotted line in the southeast is the boundary of Finland before the Second World War.

Multipoint Analysis

The results of the likelihood-based multipoint linkage disequilibrium analysis are shown in figure 3. We analyzed simultaneously six markers residing in the region of the strongest allelic association and included the distal flanking marker D8S1714 in the analysis. The maximum likelihood ratio value on a lod unit scale was 9.2 at about 0.2 cM proximal to D8S1808. The 99% confidence interval for this estimate covers 3.7 cM, from 1.2 cM telomeric of D8S506 to 1.1 cM centromeric of D8S1714.

Fig. 3. Results of multipoint linkage disequilibrium analysis using the DISMULT program [19]. The likelihood ratio estimate on a lod unit scale is plotted as a function of map distance from D8S506. A maximum lod score value of 9.8 is indicated with the arrow. The bar on the x-axis indicates the 99% confidence interval. The centromere is on the left.

Haplotype Analysis

Haplotypes for the markers D8S257, D8S559, D8S1808, D8S1762, D8S546 and D8S1714 were determined in 29 COH1 and 24 normal chromosomes. The haplotypes of COH1 chromosomes are shown in table 2 and the geographical origin of the various haplotypes in figure 2. The ‘ancestral’ haplotype 1-4-2-5-7-5 was seen in four chromosomes studied. When markers D8S257 and D8S1714 were not considered, a ‘core ancestral’ haplotype 4-2-5-7 for markers D8S559, D8S1808, D8S1762 and D8S546 was seen in 22 COH1 chromosomes. This haplotype occurred in only one normal chromosome (data for the normal chromosomes not shown). Two chromosomes had the haplotype 6-2-5-7 which deviated from the ‘core’ haplotype by a change interpretable as a historical recombinational event, placing COH1 telomeric to D8S559. In 1 chromosome, the haplotype 6-3-5-1 could be the result of two historical recombinations in the ancestral haplotype, placing COH1 telomeric to D8S1808 but centromeric to D8S546. Two chromosomes had the haplotype 6-3-1-2 and one the haplotype 6-3-1-7 that both could be derived from 6-3-5-1 as a result of a historical recombinational event placing COH1 proximal to D8S1762. An alternative explanation for haplotypes 6-3-1-2 and 6-3-1-7 is that they represent a second COH1 mutation in Finland (see Discussion). The 7-3-3-1 haplotype seen on a single chromosome may represent another ‘nonancestral’ COH1 mutation in Finland.
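The comparison of each observed haplotype with the core ancestral haplotype can be made explicit with a short Python sketch. It uses only the four-marker haplotypes and chromosome counts quoted in the paragraph above; the function and variable names are hypothetical. Note that sharing an allele with the core haplotype at a marker is identity by state and does not by itself prove descent from the ancestral chromosome.

```python
CORE_MARKERS = ["D8S559", "D8S1808", "D8S1762", "D8S546"]
CORE_HAPLOTYPE = (4, 2, 5, 7)

# Four-marker haplotypes on COH1 chromosomes and their counts, as quoted
# in the text.
OBSERVED = {
    (4, 2, 5, 7): 22,
    (6, 2, 5, 7): 2,
    (6, 3, 5, 1): 1,
    (6, 3, 1, 2): 2,
    (6, 3, 1, 7): 1,
    (7, 3, 3, 1): 1,
}


def shared_markers(haplotype):
    """Markers at which a haplotype carries the core ancestral allele."""
    return [
        marker
        for marker, allele, core_allele in zip(CORE_MARKERS, haplotype, CORE_HAPLOTYPE)
        if allele == core_allele
    ]


TOTAL_CHROMOSOMES = sum(OBSERVED.values())

for haplotype, count in OBSERVED.items():
    label = "-".join(str(allele) for allele in haplotype)
    shared = ", ".join(shared_markers(haplotype)) or "none"
    print(f"{label}  n = {count:2d}  shares core allele at: {shared}")
print(f"total chromosomes: {TOTAL_CHROMOSOMES}")
```

Haplotype 6-3-5-1 retains the core allele only at D8S1762, which is the observation behind placing COH1 between D8S1808 and D8S546.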

Table 2. Haplotypes associated with 29 COH1 chromosomes

Discussion

Using conventional recombinational mapping with novel markers, we were able to narrow the previous COH1 gene region [9] to an approximately 9-cM interval flanked by marker loci D8S1699 and D8S1714 (data not shown). Linkage disequilibrium mapping allowed us to greatly refine the COH1 gene localization. This method is a powerful tool in refining disease gene localizations for mendelian disorders that are relatively rare in the population [20]. It has already been proven to be a crucial tool in the positional cloning of several disease genes [e.g. 10, 17, 21–25]. Linkage disequilibrium mapping exploits recombinations that have occurred during the entire history of a population. In an ideal situation such as the one in Finland [18], most of the disease chromosomes descend from a single ancestral mutation, and the mutation is old enough so that recombinations have made the region of strongest linkage disequilibrium small, but not too small [10]. A strong linkage disequilibrium was detected between COH1 and several marker loci. These findings support the hypothesis of a major contribution from one founding COH1 mutation in Finland. The most striking linkage disequilibrium was detected with markers D8S1762 and D8S546 that map within 0.00 cM of each other [14].

The age of the COH1 mutation in the Finnish population was inferred from the geographical distribution of the birthplaces of carriers of the COH1 mutation. As demonstrated in figure 2, this distribution is uneven in Finland. The pattern of clustering of the disease in mid Southeast Finland, the spreading towards the East and North [areas in which people settled in the 16th century or later, ‘late settled areas’, 18, 26], and the occurrence of occasional cases in South and West Finland [early settled areas, 18, 26] suggest that at least 20, probably more, generations have elapsed since the major COH1 mutation began to spread in the population. However, it is highly unlikely that the COH1 mutation is very old in the Finnish population, as the disease gene does not occur evenly in the early settled areas. Therefore, assuming that the mutation is 50 generations old, the distance of the closest marker D8S1762 from COH1 would be 0.33 cM.

Further evidence regarding the localization of the COH1 gene was obtained by constructing haplotypes for the markers in the critical region. Analysis of markers D8S559, D8S1808, D8S1762, and D8S546 showed that most of the COH1 chromosomes are derived from a single founder chromosome carrying the haplotype 4-2-5-7 (table 2). Of the 29 haplotyped chromosomes, 22 carried the ancestral haplotype, and we propose that at least three haplotypes (chromosomes 23–25, table 2) most likely derived from it by historical recombinations. Clearly, haplotypes deviating from the ancestral haplotype at only one marker locus could represent mutation events at microsatellite repeat loci. However, the two chromosomes (chromosomes 23 and 24, table 2) differing at D8S559 from the ancestral haplotype also had a nonancestral allele at the nearest centromeric locus D8S257, consistent with a recombinational event between marker loci D8S559 and D8S1808. Moreover, extending the haplotype proximally and distally on the one chromosome (chromosome 25) sharing only allele 5 with the ancestral haplotype suggests the occurrence of two recombinational events flanking D8S1762 rather than other explanations. Therefore, these haplotype data suggest that the localization of COH1 is between marker loci D8S1808 and D8S546.

Interestingly, four chromosomes had haplotypes very different from the ancestral one. For three of these (of haplotype 6-3-1-2 and 6-3-1-7, chromosomes 26–28, table 2) we propose two alternative interpretations. First, both haplotypes could derive from haplotype 6-3-5-1 by single historical recombinations, in which case the localization of COH1 would be further narrowed to the interval between marker loci D8S1808 and D8S1762. However, as chromosomes 26–28 all contain allele 2 at the nearest proximal marker D8S257 whereas chromosome 25 contains allele 4, it is likely that more than one recombination must have occurred in case of a common origin of these haplotypes. Therefore, as a second interpretation we propose that the haplotypes on chromosomes 26–28 represent a second ‘founding’ COH1 mutation in the Finnish population. The remaining different haplotype (chromosome 29), on the other hand, probably represents yet another mutation. The hypothesis of more than one COH1 mutation in Finland is further supported by the origin of chromosomes 26–29 in the western coastal area of Finland (fig. 2). Moreover, as the disease has a worldwide occurrence one might expect that more than one COH1 mutation exists in Finland, and it is more likely to detect the rare mutations in a population with a high frequency of one main mutation.

Our haplotype analysis is limited by the small number of chromosomes available for study. For example, the relatively high mutation rate at microsatellite loci might explain in part the differences observed between haplotypes. However, given the clinical homogeneity of COH1 in Finland and the fact that patients carrying ‘atypical’ haplotypes are, with the exception of the patient carrying chromosomes 27 and 28, compound heterozygotes for the ancestral COH1 haplotype, it is highly unlikely that genetic locus heterogeneity would explain the different haplotypes. Based on our entire set of data, we feel safe in concluding that COH1 lies between marker loci D8S1808 and D8S546, at the most 0.3 cM on either side of locus D8S1762. Our strategies for physical mapping and cloning of COH1 will therefore aim at constructing a physical map covering the region D8S1808-D8S546 using marker locus D8S1762 as a starting point.

The eventual cloning and characterization of the gene underlying COH1 will allow us to test how accurate these predictions are. Data from several other diseases of the so-called Finnish disease heritage have shown that more than one single mutation exists in the Finnish population in disorders that show a distribution in both the early and late settled areas as seen in the Cohen syndrome. For these disorders, the proportion of the main ancestral mutation is in the range of 94–98%, e.g. 94% in diastrophic dysplasia, 96% in progressive myoclonus epilepsy [27] and 98% in aspartylglucosaminuria [28, 29].

One candidate gene for the Cohen syndrome is the syndecan-2 precursor gene whose function is already identified and which resides close to the COH1 gene region. It is homologous with the rat heparan sulfate proteoglycan. Five different heparan sulfate proteoglycans have been reported in humans, some affecting leukocyte differentiation, neural migration, and retinal development [30–33]. Novel EST markers residing in the COH1 region [34] comprise two sequences isolated from a retinal cDNA library. Candidate genes and incomplete genes in the form of ESTs will be mapped by physical methods and, provided they fall in the critical interval, searched for mutations in COH1 patients. Ultimately, the characterization of the COH1 gene mutations will resolve the question of possible heterogeneity and will allow a more specific classification of the Cohen syndrome.