Introduction

Genome-wide linkage analysis is an established tool for mapping inherited diseases, but its use in the prenatal diagnosis of an unknown genetic disorder has not previously been published.

When the phenotype of affected individuals in a family can be assigned to a specific syndrome and the underlying disease-causing gene is known, prenatal diagnosis is straightforward.1, 2, 3, 4 However, no reliable tests are available during pregnancy for families with rare, unknown genetic syndromes. A clinical geneticist can only tell parents with an affected child that the recurrence risk for another affected child is 25% (0.25) in recessive diseases. Many parents consider that risk too high and decide to terminate such a pregnancy, although the chance of an unaffected child is 75% (0.75). We were recently confronted with an unplanned ongoing pregnancy in such a family. By using data from a genome-wide SNP scan for subsequent linkage analyses, we identified the disease-associated haplotype in an unknown mental retardation syndrome. With microsatellite markers, such scans required months; high-throughput SNP genotyping reduces that time to only days.

High-throughput SNP genotyping combined with linkage mapping is thus a powerful tool for the prenatal diagnosis of unknown genetic disorders. The latest whole-genome sequencing technologies, in contrast, have to deal with extensive genomic variation, which makes the interpretation of potential disease-causing mutations very difficult, time-consuming and expensive.5, 6, 7, 8, 9 Nevertheless, next-generation sequencing (NGS) technologies combined with linkage analyses will be even more powerful and will improve the quality of prenatal counseling for many patients.

Materials and methods

Patients

We studied a consanguineous family of Kurdish origin and control individuals. The core family had five children, three of whom presented a similar phenotype of a so far unknown genetic syndrome. The parents of the affected individuals were first cousins (individuals 782 and 783 in the pedigree, Figure 2c and Table 1). In a second family branch, another affected cousin had the same clinical symptoms (individual 9408, Figure 2c and Table 1). The affected individuals were two boys and one girl in the core family, and another girl in a second family branch.

Table 1 Clinical characteristics of patients

Transmission of the disease was consistent with autosomal recessive inheritance. Blood samples or skin biopsies were collected after obtaining written informed consent from the participants in accordance with our ethics committee. Genomic DNA was extracted from blood samples, patient fibroblasts, lymphoblastoid cell lines and material obtained by chorionic villus sampling (CVS), according to standard procedures.

Genome-wide scan and finemapping

We performed a 10k Affymetrix linkage scan (including family members 782, 783, 784, 785, 786, 787, 788 and C052) and analyzed the data using ALLEGRO v1.2c and GENEHUNTER v2.1r5 under the graphical user interface easyLINKAGE v6.00. We assumed a recessive model with complete penetrance and equally distributed marker allele frequencies. Additional finemapping with microsatellites was done as described elsewhere.10 We included all initially available family members, and reconstructed haplotypes with GENEHUNTER and also manually. DNA obtained later from another affected cousin and her healthy sister was analyzed by microsatellite typing and further sequencing.

Risk estimation

Risk calculation was performed using standard methodology, that is, via Bayes’ theorem. The corresponding likelihoods (which are proportional to the relevant probabilities) were calculated via a modification of MERLIN.11 A ratio was formed with the likelihood of the fetus being affected in the numerator, and the sum of the likelihoods of the fetus being either affected or unaffected in the denominator. Given the proportionality between the likelihoods and the corresponding probabilities, the value of this ratio is exactly equal to the risk as calculated by Bayes’ theorem (Supplementary Figure 1).
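The risk calculation can be illustrated with a minimal sketch: Bayes' theorem reduces here to the likelihood for an affected fetus divided by the sum of the likelihoods for an affected and an unaffected fetus. The function name and the example values are hypothetical and are not output of MERLIN; the sketch only assumes that both likelihoods share the same proportionality constant.

```python
def posterior_risk(lik_affected: float, lik_unaffected: float) -> float:
    """Risk that the fetus is affected, given two pedigree likelihoods.

    Both likelihoods are assumed to be computed on the same marker data,
    once with the fetus coded as affected and once as unaffected, so that
    the common proportionality constant cancels in the ratio.
    """
    total = lik_affected + lik_unaffected
    if lik_affected < 0 or lik_unaffected < 0 or total == 0:
        raise ValueError("likelihoods must be non-negative and not both zero")
    return lik_affected / total


# Illustrative values only: without any marker information the likelihoods
# are proportional to the Mendelian priors 1/4 and 3/4 (phase 1 risk).
phase1_risk = posterior_risk(0.25, 0.75)
```

Because only the ratio matters, scaling both likelihoods by the same factor leaves the risk unchanged.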

Assuming autosomal recessive inheritance, the risk for the fetus to be affected was 25% in phase 1 of prenatal counseling before we performed our linkage analysis. Phase 2 indicates the initial SNP mapping results with the 10k Affymetrix chip. Phase 3 includes data of the finemapping with additional SNPs and microsatellite markers, including the affected cousin.

Sequencing

Genes within the critical region were identified via the UCSC genome browser. Primers for the amplification of the entire coding regions and exon-intron boundaries of all screened genes were designed using the software PRIMER3. We sequenced genes within the linkage interval using the Sanger method. Primer sequences are available on request.

Results

We studied the members of a consanguineous family, where the parents were first cousins (Figures 1a and b). The mother of three severely mentally retarded children was pregnant in gestational week 9. She consulted a clinical geneticist with the intention of terminating her unplanned pregnancy. Two clinical geneticists and two pediatric neurologists independently examined the patients, and ruled out the possibility of already known diseases (Table 1, Figures 1a and b).

Figure 1

Facial appearance of the patients and their healthy brother who was born subsequent to prenatal gene mapping. (a) All three affected siblings (784, 787 and 788) showed slightly downward-slanted palpebral fissures, heavy eyebrows, a prominent root of the nose, a short philtrum, an unusual preauricular hairline, posteriorly rotated ears and a prominent lower lip; in two of the patients (784 and 788) divergent strabismus was diagnosed. (b) After the final result of our indirect prenatal analysis, the parents decided to continue the pregnancy and the mother gave birth to a healthy baby. The boy started walking and talking at the age of 16 months, and to date his motor and mental development has been normal.

The major consistent symptoms in all the affected family members are severe mental retardation, extremely retarded motor development and a very similar facial appearance. Affected individuals 784, 787 and 788 were born with no abnormalities; antenatal ultrasound examinations were unremarkable. In the first affected individual, 784, the mother noticed progressive microcephaly and motor retardation at the age of six months. Similarly, the other two siblings, 787 and 788, presented with central hypotonia, resulting in extremely retarded neuromuscular development. At the age of 2 years, the three children suffered their first seizures, and autoaggressive behavior became apparent. The age at first unsupported sitting was about 4 years and one patient started walking at 14 years. None of them achieved verbal communication (Table 1 and Figure 2c).

Figure 2

Prenatal genome-wide linkage scan with subsequent finemapping and haplotype analyses indicated that the fetus was not affected. (a) Initial LOD score after the genome-wide SNP mapping with linkage to chromosome 1p (LOD score of 2.56). The initial mapping included the parents (individuals 782 and 783), as well as their affected and unaffected children (784, 785, 786, 787 and 788). Subsequently, we analyzed the fetus (C052) at week 12 of gestation. (b) Final LOD score after additional finemapping with microsatellite markers, including the core family as well as one affected and one unaffected cousin from a second family branch (individuals 9408 and 13389, respectively), resulting in a significant LOD score of 3.98. (c) Haplotypes after final mapping results. The disease-linked region shown here is located on chromosome 1p36.12. The affected children (784, 787, 788 and 9408) inherited two disease-associated dark-green haplotypes from their parents. The risk region for the fetus (C052) is narrowed down to a 120 717-bp interval, which contains three exons (plus one alternative exon 1) of the EPHB2 gene. We sequenced all coding exons and did not find any mutations. Thus, it was extremely unlikely that the fetus was affected. The remaining critical region of homozygosity between markers D1S2826 and rs1961413 in all the affected individuals (784, 787, 788 and 9408) contains 58 candidate genes (markers that were tested for this region are listed in order from the p-terminal end of the chromosome (NCBI build 36.1)). Haplotypes for all tested genetic markers and SNPs are shown in columns beneath the family members who underwent linkage analysis. The disease-associated haplotype is marked in dark green; haplotypes marked in blue or light orange are not disease-linked. Non-informative segments are marked in gray (non-informativity results from recombination events in regions where the parents are homozygous for the marker allele).

The condition was not associated with any visible antenatal abnormality, thus making prenatal ultrasound diagnosis impossible. Birth history was completely uneventful in all offspring. Developmental delay was not recognized until after 6 months of life. Magnetic resonance imaging of the brain was performed in one affected child at age 3 years and it did not reveal any abnormality. Similarly, no pathological ophthalmologic findings, except a divergent strabismus, were identified (patients 784 and 788). Apart from the severe developmental delay and the similar phenotype (especially the facial presentation, Figure 1a), no other organic anomaly could be found in the patients. The phenotype did not match any known syndrome.12, 13

Screening for neurodegenerative metabolic diseases (eg, peroxisomal disorder and cerebral organic acidemia) was without pathological findings. Apart from karyotyping (normal result), no further genetic testing was performed in any of the affected individuals in the family prior to the analyses described here. To our knowledge there was no prior genetic testing in affected individual 9408.

Disease transmission was consistent with autosomal recessive inheritance (Figure 2c). Initially, we obtained DNA from the core family, including the parents (782 and 783), their three affected children (784, 787 and 788) and the unaffected children (785 and 786). Multipoint linkage simulations revealed a maximum possible LOD score of 2.56 by using ALLEGRO v1.2c14 under the interface easyLINKAGE v6.00.15, 16 We performed 1000 replicates assuming 100% penetrance, a disease allele frequency of 0.001, and equally distributed SNP alleles in 500 markers with an intermarker distance of 0.3 cM, which conforms to the average marker density of a genome-wide scan with 10 085 SNPs. We used the 10k SNP chip by Affymetrix and found a single region on chromosome 1p36.12 with a LOD score of 2.56 (Figure 2a), which matched the maximum expected LOD score. Because of parental consanguinity, we expected homozygosity for the disease region in the affected family members. This homozygous region was inherited by all the affected siblings and enclosed a 25-cM segment between SNPs rs1934489 and rs728337, corresponding to 16 million bp (66 000 coding bp). None of the unaffected family members was homozygous for this region (Figure 2c). No other homozygous, disease-associated region with such a significant LOD score was found in the family.
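The homozygosity expectation described above can be sketched as a simple scan for runs of consecutive SNPs at which all affected individuals carry the same homozygous genotype while no unaffected relative carries the identical run. This is an illustrative simplification and not the multipoint likelihood computation performed by ALLEGRO or GENEHUNTER; the function name, the genotype encoding ("AA", "AB", "BB" in map order) and the toy genotypes are assumptions, and missing calls are not handled.

```python
from typing import Dict, List, Tuple


def shared_homozygous_runs(
    affected: Dict[str, List[str]],
    unaffected: Dict[str, List[str]],
    min_snps: int = 3,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP index ranges that are identically
    homozygous in all affected individuals and not carried identically by
    any unaffected individual."""
    reference = next(iter(affected.values()))
    n = len(reference)

    def consistent(i: int) -> bool:
        # All affected share one genotype at SNP i, and it is homozygous.
        calls = {genotypes[i] for genotypes in affected.values()}
        if len(calls) != 1:
            return False
        call = next(iter(calls))
        return call[0] == call[1]

    runs: List[Tuple[int, int]] = []
    start = None
    for i in range(n + 1):  # one extra step closes a run at the end
        if i < n and consistent(i):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None

    # Discard runs that an unaffected relative carries in identical form.
    return [
        (s, e)
        for s, e in runs
        if not any(g[s:e + 1] == reference[s:e + 1] for g in unaffected.values())
    ]


# Toy genotypes (hypothetical, five SNPs in map order).
affected = {
    "784": ["AA", "BB", "AA", "AA", "AB"],
    "787": ["AA", "BB", "AA", "AA", "BB"],
    "788": ["AA", "BB", "AA", "AA", "AA"],
}
unaffected = {
    "785": ["AB", "BB", "AA", "AB", "AB"],
    "786": ["AA", "AB", "AB", "AA", "AB"],
}
candidate_runs = shared_homozygous_runs(affected, unaffected)
```

In practice such a scan would only flag candidate regions; the significance of a region still has to be assessed by a linkage program.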

At 12 weeks of gestation, we performed CVS that revealed a normal male karyotype, and in addition typed the DNA of the fetus (C052) on the 10k SNP chip. In the first phase of the prenatal testing, the fetus was at risk for about 11 cM (corresponding to 10 million bp) of the disease-associated haplotype. Within the middle part of the 11-cM segment, the fetus had a recombination subdividing the risk region for the fetus into a non-informative interval of 5 cM and another homozygous segment of 6 cM (data not shown). Thus, at that time we could not determine the disease status of the fetus, because the recombination within the disease-linked haplotypes made exclusion or confirmation impossible. To further specify the haplotypes of the fetus within this region, we analyzed more microsatellites and SNPs, which were not included in the initial SNP scan. Furthermore, after the initial SNP mapping with the core family was completed, we were able to obtain DNA from another affected and one unaffected cousin of a second family branch (individuals 9408 and 13389 in Figure 2c). We then included these cousins and typed more microsatellite markers. This narrowed down the disease-associated haplotype in all the affected individuals to an 8.8-cM interval between markers D1S2826 and rs1961413. In phase 3 the LOD score was 3.98, corresponding to the maximal expected LOD score obtained for this specific situation (Figure 2b). Figure 3 shows how the region at risk in the fetus developed, in terms of cM (Figure 3a) and coding genes (Figure 3b), in phases 2 and 3 of the prenatal diagnosis as more SNPs and additional family members were analyzed.

Figure 3

Development of the risk region in the fetus and the affected individuals throughout the different steps of the prenatal diagnosis. The risk region for the fetus in terms of cM at risk (a) and coding genes at risk (b) in phases 2 and 3 of the prenatal diagnosis is shown in comparison with the affected region. Especially in prenatal genetic counseling, this figure could help to demonstrate the power of mapping and finemapping results, and the possible reduction of the risk for the fetus from the initial 25% risk in phase 1 (autosomal recessive inheritance) to a minimized final risk of 4 coding exons (1 coding gene), which could also be excluded.

A recombination event in the affected cousin narrowed down the risk region in the fetus on chromosome 1p36.12. However, a remaining small segment in the distal part of the haplotype of the fetus was still non-informative. Therefore, we typed additional, previously not described, microsatellite markers and sequenced more SNPs within this small interval. This approach narrowed down the risk region for the fetus to a 120 717-bp interval (896 coding bp and 4 coding exons; marked gray in Figure 2c).

In phase 2 of the prenatal testing, the risk region for the fetus covered about 62% of the physical disease region in the affected siblings, and included 1380 possible disease-causing genes.
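The proportions quoted for the fetal risk region can be checked with simple arithmetic. The sketch below uses only the rounded sizes given in the text (16 million bp for the disease region in the affected siblings, 10 million bp for the fetal risk region after the initial SNP scan, and 120 717 bp after finemapping); the variable names are illustrative.

```python
# Rounded region sizes as stated in the text (base pairs).
DISEASE_REGION_BP = 16_000_000     # homozygous region in the affected siblings
FETAL_RISK_PHASE2_BP = 10_000_000  # fetal risk region after the 10k SNP scan
FETAL_RISK_PHASE3_BP = 120_717     # fetal risk region after finemapping

# Share of the disease region that was still at risk in the fetus in phase 2.
phase2_share = FETAL_RISK_PHASE2_BP / DISEASE_REGION_BP

# Share of the phase 2 risk region that remained after phase 3.
phase3_share = FETAL_RISK_PHASE3_BP / FETAL_RISK_PHASE2_BP

print(f"phase 2: {phase2_share:.1%} of the disease region")
print(f"phase 3: {phase3_share:.1%} of the phase 2 risk region")
```

With these rounded inputs the phase 2 share is 62.5%, in line with the figure of about 62%, and finemapping leaves roughly 1.2% of the phase 2 interval.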

By further extensive finemapping in phase 3, we could narrow down this risk region to 120 717 bp of the initially 10 million bp enclosing segment. Finally, four coding exons of the EPHB2 gene were remaining. The protein encoded by this gene is a receptor for ephrin-B family members. The EPHB receptor tyrosine kinases are involved in formation and remodeling of synapses, and are therefore important in many pathways of developmental processes in the central nervous system.

Supplementary Figures 2a and b show the development of the risk region for the fetus in phase 2 (after the initial 10k SNP scan) and phase 3 (after further finemapping and sequencing) in comparison to the disease region in the affected children.

Within this region we sequenced all known and predicted coding regions in the affected individuals as well as in the fetus and did not find mutations. Thus, it was extremely unlikely that the fetus would be affected. The parents decided to continue the pregnancy and the mother gave birth to a healthy baby boy (Figure 1b).

Pediatricians and clinical geneticists have intensively followed this child. At the age of 16 months he started walking and speaking, and reached milestones that his affected siblings never attained. To date, his motor and mental development has been normal.

By means of Supplementary Figures 2a and b, we explained to the parents the rapid progress of our analysis and the small remaining region at risk in the fetus. The parents were told that mutations in noncoding regions (for example, the promoter region or regulatory elements of the EPHB2 gene), a fetal-placental mosaic, genetic heterogeneity in the family, double recombination or inaccuracies in the current genetic databases could not be excluded. We assumed the residual risk for the fetus to develop the same disease as the affected siblings to be <1%. Furthermore, we explained to the parents that other autosomal recessive diseases are possible in a consanguineous family and could not be excluded by the present genetic mapping. We further explained that the background risk of having a child with any type of congenital or genetic disorder remains at a level of 3–5%. Based on this information, the parents were able to reach an informed, balanced decision about the ongoing pregnancy.

Discussion

This example demonstrates the power of genome-wide linkage analysis in prenatal diagnostics and the potential benefits for the family. We accurately localized the disease-linked gene region and subsequently evaluated whether or not the fetus was affected. We believe that this approach could be used in diagnosing other unknown syndromes that are not amenable to diagnosis by prenatal imaging.

Generally, genome-wide linkage analyses in affected and unaffected family members should be performed before any new pregnancy. However, sometimes the issue first arises when the pregnancy is discovered (usually weeks 4–8). Even then, there is sufficient time in general to recruit family members for initial genome-wide scanning and finemapping of the disease region before the pregnancy is sufficiently advanced for CVS or amniocentesis (usually 12–16 weeks).17, 18 Chip analyses can be done in days, similar to karyotyping, leaving enough time for any subsequent decisions. This diagnostic approach will improve prenatal counseling and the chance for an informed decision for many families. It can further be expected that the costs will decrease significantly along with the development of new genetic technologies.

Exact phenotyping is essential. Unknown syndromes, such as in our family, may be difficult to characterize and require specific diagnostic strategies.19, 20 All affected individuals must show exactly the same phenotype to exclude the possibility of phenocopies. Two experienced clinical geneticists performed this task in our family, and their independently established characterizations of the affected status were congruent. Moreover, pedigree size is important for successful linkage.21, 22 Simulation studies are required to determine the power of a pedigree for linkage analyses. As a rough estimate, three affected individuals from one second-cousin marriage can suffice for an appropriate LOD score. If two second-cousin marriages have each resulted in an affected child, these two affected individuals are already sufficient to reach significant linkage. Affected families are highly motivated to help recruit family members.23, 24, 25, 26 If the fetus has a recombination with several adjacent non-informative SNPs within the critical region, as in our family, additional microsatellite and SNP markers help to further specify the haplotypes.
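The pedigree-size estimates can be reproduced with a common rule of thumb from homozygosity mapping, sketched below. It is an approximation under stated assumptions (a rare, fully penetrant recessive allele and fully informative markers) and not a substitute for the simulation studies mentioned above: the first affected child of a consanguineous marriage contributes about log10(1/F), where F is the inbreeding coefficient of the child, each further affected sibling about log10(4), and each unaffected sibling about log10(4/3). The function name and constants are illustrative.

```python
import math

# Inbreeding coefficients of the offspring of cousin marriages.
F_FIRST_COUSINS = 1 / 16
F_SECOND_COUSINS = 1 / 64


def approx_max_lod(inbreeding_coefficient: float,
                   n_affected: int,
                   n_unaffected: int = 0) -> float:
    """Rule-of-thumb maximum LOD score for one consanguineous sibship.

    Assumes a rare, fully penetrant recessive allele and fully informative
    markers, so the result is an optimistic approximation.
    """
    if n_affected < 1:
        raise ValueError("at least one affected child is required")
    lod = math.log10(1 / inbreeding_coefficient)   # first affected child
    lod += (n_affected - 1) * math.log10(4)        # further affected siblings
    lod += n_unaffected * math.log10(4 / 3)        # unaffected siblings
    return lod


# Three affected children of one second-cousin marriage.
one_sibship = approx_max_lod(F_SECOND_COUSINS, n_affected=3)

# Two second-cousin marriages with one affected child each (LOD scores add).
two_sibships = 2 * approx_max_lod(F_SECOND_COUSINS, n_affected=1)
```

Under these assumptions both configurations reach the conventional significance threshold of 3 (about 3.0 and 3.6, respectively), which is consistent with the estimates given in the text.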

Rapid genome analyses using SNP chips allowed completion of a genome scan within days. Upcoming NGS technologies are even more powerful.27, 28 However, the variability of the human genome is probably the most important challenge for total genome sequencing. Present technologies reveal not only pathogenic mutations, but also sequence variations (especially in non-coding genomic regions) of unknown significance.5, 6, 7, 8, 9, 29, 30 Clinical management of such sequence variations of unknown significance, including the consideration of abortion, makes prenatal genetic testing very difficult and ambivalent.5, 6

In addition, recent studies have shown that even healthy subjects can carry several mutations in their DNA that were previously thought to always cause severe disease.31 Thus, diagnostic whole-genome resequencing will require methods to determine which of several suspicious mutations is disease-causing. Combined approaches, including bioinformatics, functional and segregation analyses, can be helpful in that regard. The combination of linkage studies with new genome-sequencing technologies could be the best approach towards fast and reliable prenatal counseling.

Linkage analyses have been very successful in the identification of countless inherited disorders.32, 33 They can map a phenotype to a particular genomic region, and thus provide additional evidence whether or not a mutation is relevant. We suggest that genome-wide linkage analysis should be introduced into national guidelines. Better counseling is sorely needed for many families. Linkage analysis has been a generously funded research tool. We suggest that the method also has clinical utility and should be incorporated into qualified clinical laboratories.