Introduction

Genome-wide copy-number analysis by microarray is a frontline test for the diagnosis of microdeletion/microduplication syndromes.1,2 Single-nucleotide polymorphism (SNP) microarrays can also detect regions of homozygosity (ROH) throughout the genome. Depending on the genomic context, constitutional ROH may indicate ancestral homozygosity, uniparental disomy, or parental consanguinity.3,4,5 Short ROH (up to 5 Mb) are considered ancestral markers of an outbred population.3,4,6,7 The presence of a single large ROH or a couple of large ROH on the same chromosome most likely indicates uniparental disomy, especially if the ROH is telomeric.4 Multiple large ROH spread across different chromosomes are representative of a parental blood relationship.3,6,8,9

Clinically, ROH of any size can aid in diagnosis of autosomal recessive disease through homozygosity mapping and selection of a candidate gene for sequence analysis.10,11 In consanguineous families, the risk for autosomal recessive disease is directly proportional to the degree of parental relationship.3 As a guide, it is estimated that offspring of first cousins have an additional 1.7–2.8% risk for congenital malformations and an additional 4.4% risk for prereproductive mortality.12 Close consanguineous unions, mostly first-cousin marriages, occur in up to 60% of relationships in some parts of the world13 but fall in the range of 0.1–1.5% in North America.14,15,16,17

Although several laboratories perform SNP microarray, and reviews are available to describe the capability of these arrays to detect ROH,3,4,5 literature describing patients with ROH due to parental consanguinity in a clinical setting is limited.11,18 As more laboratories incorporate this test, it becomes increasingly important to discuss the advantages and disadvantages of the technology. Our report highlights the ability of SNP microarray to detect parental consanguinity in a patient population and uses select cases to illustrate the utility of this tool in clinical diagnosis.

Methods

Patient population

Study data were compiled from consecutive samples sent to our clinical laboratory between May 2008 and July 2011 for SNP microarray analysis. Patient indications were chosen by referring physicians from a broad range of specialties. Ethnicity/race was extracted from the patient medical record. This study included patients with at least two ROH on two separate chromosomes, each >10 Mb. The 10 Mb cutoff was based on the suggestion by Kirin et al.19 that ROH >10 Mb are rarely seen in cosmopolitan populations; the suggestion by Kearney et al.5 that conservative clinical thresholds for ROH are between 3 and 10 Mb; and the exclusion of patients from uniparental disomy analysis by Papenhausen et al.4 when a second chromosome had an ROH >10 Mb, indicating identity by descent.
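
The inclusion criterion above (ROH >10 Mb on at least two separate chromosomes) can be written as a short check. This is a minimal sketch only: the `ROH` tuple and the function name are hypothetical, and whether ROH on sex chromosomes counted toward the criterion is not stated in the text, so the sketch simply counts whatever chromosomes it is given.

```python
from typing import NamedTuple

MB = 1_000_000


class ROH(NamedTuple):
    """One region of homozygosity (hypothetical container for illustration)."""
    chrom: str  # chromosome label, e.g. "1" or "17"
    start: int  # start position in bp
    end: int    # end position in bp

    @property
    def length(self) -> int:
        return self.end - self.start


def meets_inclusion_criterion(rohs, min_len=10 * MB, min_chroms=2):
    """True if ROH longer than min_len occur on at least min_chroms distinct chromosomes."""
    chroms_with_long_roh = {r.chrom for r in rohs if r.length > min_len}
    return len(chroms_with_long_roh) >= min_chroms


# Two long ROH on the same chromosome do not qualify (more suggestive of
# uniparental disomy); long ROH on two different chromosomes do.
same_chrom = [ROH("1", 0, 12 * MB), ROH("1", 50 * MB, 65 * MB)]
two_chroms = [ROH("1", 0, 12 * MB), ROH("7", 5 * MB, 16 * MB), ROH("9", 0, 3 * MB)]
print(meets_inclusion_criterion(same_chrom), meets_inclusion_criterion(two_chroms))
```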

When multiple family members were submitted simultaneously, only one was included in the study as the proband. Percentage of homozygosity was determined for the other siblings but not included in the cohort summary. Retrospective chart reviews were done in accordance with the institutional review board policies at Cincinnati Children’s Hospital Medical Center.

DNA extraction and SNP microarray analysis

DNA was extracted using MagnaPure Compact kits (Roche, Indianapolis, IN) following the manufacturer’s instructions. Between May and September of 2008, microarray analysis identified ROH in three patients using the Illumina Human CNV 370-duo DNA Analysis BeadChip platform (Illumina, San Diego, CA). Mean spacing of SNPs on this platform is 7.7 kb; median spacing is 5 kb. Between September 2008 and October 2010, microarray analysis identified ROH in 32 patients using the Illumina Human610-DUO Quad v1.0 DNA Analysis BeadChip platform. Mean spacing of SNPs is 4.7 kb; median spacing is 2.7 kb. Between October 2010 and July 2011, microarray analysis identified ROH in 24 patients using the Illumina HumanOmni1-Quad. Mean spacing of SNPs is 2.4 kb; median spacing is 1.2 kb. The Illumina Infinium Assay was performed as described by the manufacturer on 250 ng DNA. B-allele frequency and log2R ratio were analyzed with Illumina Genome Studio V2009.2 software (Illumina, San Diego, CA). DNA copy-number changes were prioritized using cnvPartition Plug-in v2.3.4. The software identifies ROH based on the presence of homozygosity in the B-allele frequency but no change in the log2R ratio, to exclude regions that are hemizygous due to deletion. ROH interrupted by homozygous deletions or genotyping errors were manually adjusted.
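
The principle described above (homozygous B-allele frequency with an unchanged log2R ratio) can be illustrated with a simplified run-finding sketch. This is not the cnvPartition algorithm, whose internals are not described here; the function name and all thresholds (`hom_margin`, `lrr_tol`, `min_len`) are assumptions chosen only for illustration.

```python
def call_roh(positions, baf, lrr, hom_margin=0.15, lrr_tol=0.25, min_len=1_000_000):
    """Return (start, end) runs of consecutive SNPs that look homozygous and copy-neutral.

    positions: SNP coordinates in bp, sorted, for a single chromosome
    baf:       B-allele frequency per SNP (near 0 or 1 when homozygous)
    lrr:       log2R ratio per SNP (near 0 when copy number is normal)
    All thresholds are illustrative assumptions, not vendor defaults.
    """
    runs = []
    run_start = run_end = None
    for pos, b, r in zip(positions, baf, lrr):
        homozygous = b <= hom_margin or b >= 1 - hom_margin
        copy_neutral = abs(r) <= lrr_tol  # excludes hemizygous deletions
        if homozygous and copy_neutral:
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            if run_start is not None and run_end - run_start >= min_len:
                runs.append((run_start, run_end))
            run_start = None
    if run_start is not None and run_end - run_start >= min_len:
        runs.append((run_start, run_end))
    return runs


# Toy data: 20 SNPs spaced 500 kb apart. SNP 8 is heterozygous; SNPs 15-19
# are homozygous but have a low log2R ratio, as expected for a deletion.
positions = list(range(0, 10_000_000, 500_000))
baf = [0.0, 1.0] * 4 + [0.5] + [1.0, 0.0] * 5 + [1.0]
lrr = [0.0] * 15 + [-0.6] * 5
toy_runs = call_roh(positions, baf, lrr)
print(toy_runs)
```

In this sketch a single heterozygous call ends a run, which is why the source describes manual adjustment of ROH interrupted by genotyping errors or homozygous deletions.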

Calculation of percentage of homozygosity

Percentage of homozygosity (Froh) was calculated by summing ROH >5 Mb across the covered autosome (Lroh auto) and dividing by the total autosome base pairs (Lauto) represented on each respective microarray platform ( Supplementary Tables S1–S3 online). The calculation included ROH >5 Mb based on the suggestion by Papenhausen et al.4 that multiple, long ROH >5 Mb are likely to represent identity by descent. The sum of covered autosomes was 2,691,971,030 bps for the 370-duo, 2,691,868,142 bps for the 610-Quad, and 2,699,116,387 bps for the Omni1-Quad. This calculation, adapted from McQuillan et al.,6 excludes mitochondrial DNA and sex chromosomes:

$$F_{\mathrm{roh}} = \frac{\sum L_{\mathrm{roh\ auto}}}{L_{\mathrm{auto}}}$$

Separate calculations were performed on female patients to evaluate the impact of the X chromosome on total homozygosity. ROH >5 Mb were summed across the entire covered female genome (Lroh genome), including the X chromosome, and divided by the total number of base pairs in the genome (Lgenome). The sum of the covered female genomes was 2,843,859,790 bps for the 370-duo, 2,843,756,902 bps for the 610-Quad, and 2,851,012,420 bps for the Omni1-Quad:

$$F_{\mathrm{roh\ genome}} = \frac{\sum L_{\mathrm{roh\ genome}}}{L_{\mathrm{genome}}}$$
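
As an illustration of the two calculations, the following sketch computes the autosomal and the female-genome proportions from a list of ROH. The covered lengths are taken from the text; the function name, the tuple layout, and the chromosome labels ("X", "Y", "MT") are assumptions made for the example.

```python
MB = 1_000_000

# Covered lengths in bp, as reported in the text for each platform.
AUTOSOME_BP = {
    "370-duo": 2_691_971_030,
    "610-Quad": 2_691_868_142,
    "Omni1-Quad": 2_699_116_387,
}
FEMALE_GENOME_BP = {
    "370-duo": 2_843_859_790,
    "610-Quad": 2_843_756_902,
    "Omni1-Quad": 2_851_012_420,
}


def froh(rohs, platform, include_x=False, min_len=5 * MB):
    """Proportion of the covered autosome (or female genome) in ROH longer than min_len.

    rohs: iterable of (chrom, start, end) tuples; hypothetical input layout.
    """
    excluded = {"Y", "MT"} if include_x else {"X", "Y", "MT"}
    roh_bp = sum(
        end - start
        for chrom, start, end in rohs
        if chrom not in excluded and end - start > min_len
    )
    denominator = FEMALE_GENOME_BP[platform] if include_x else AUTOSOME_BP[platform]
    return roh_bp / denominator


# Toy input: the 4 Mb ROH is below the 5 Mb cutoff and is ignored.
example = [("1", 0, 100 * MB), ("2", 0, 4 * MB), ("X", 0, 50 * MB)]
auto_froh = froh(example, "Omni1-Quad")
genome_froh = froh(example, "Omni1-Quad", include_x=True)
print(f"autosome: {100 * auto_froh:.2f}%  genome: {100 * genome_froh:.2f}%")
```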

Classification of degree of consanguinity based on the proportion of ROH

On the basis of theoretical coefficients of inbreeding, the expected degree of homozygosity in offspring of consanguineous matings is 25, 12.5, 6.25, 3.125, and 1.5625% for first-, second-, third-, fourth-, and fifth-degree relatives, respectively.20,21 However, estimates of homozygosity in any one individual may vary by chance. Therefore, to molecularly classify degrees of consanguinity, we determined the 95% confidence intervals for expected proportions of first- through fifth-degree relatives. To calculate confidence intervals on expected degree of homozygosity, we assumed that degree of homozygosity was based on an underlying binary distribution (homozygosity for any one region scored as yes/no). Of note, the distribution of degree of homozygosity could also be viewed as a continuous trait, but as this trait exhibits a multimodal distribution, calculation of confidence intervals would be extremely challenging. Calculation of confidence intervals was based on autosome coverage on the Omni1-Quad ( Supplementary Table S3 online). To calculate confidence intervals, the effective number of regions (n) across the autosome must be determined. Determining this is complicated because the autosome is not linear, is not captured completely by current SNP chips, and is not completely independent. To address nonlinearity, the analysis was performed following breakdown of each chromosome into its respective p and q arms. To address incomplete capture, only regions covered by the Omni1-Quad were considered in the calculation. As not all regions of the autosome segregate independently due to linkage, we considered the number of possible 5 Mb regions covered on the Omni1-Quad chip for each chromosomal arm. Five megabase pairs was selected to be consistent with the minimum ROH considered in this study; however, other lengths could have been selected and would have resulted in different confidence intervals. 
The number of 5 Mb regions for each chromosomal arm was calculated and rounded down to the next whole number, yielding 519 (n) regions. The 95% confidence interval was then calculated using the following equation:

$$\mathrm{CI}_{95\%} = p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

In this formula, p is the theoretic inbreeding coefficient and n is the number of regions, 519. Using this confidence interval, individuals were molecularly classified to the estimated degree of relationship.
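
A minimal sketch of this calculation follows, assuming the standard normal-approximation interval for a binomial proportion, p ± 1.96·sqrt(p(1 − p)/n). With n = 519 this gives a lower bound of about 21.3% for first-degree relatives, which agrees with the threshold reported in the Results. The function names and the per-arm lengths in the usage lines are hypothetical; the per-arm coverage behind n = 519 is in the supplementary tables and is not reproduced here.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

# Theoretical coefficients of inbreeding for offspring of first- to
# fifth-degree relatives, as listed in the text.
EXPECTED_F = {
    "first": 0.25,
    "second": 0.125,
    "third": 0.0625,
    "fourth": 0.03125,
    "fifth": 0.015625,
}


def effective_regions(arm_lengths_bp, region_bp=5_000_000):
    """Sum the number of whole 5 Mb regions per chromosome arm (rounded down)."""
    return sum(length // region_bp for length in arm_lengths_bp)


def confidence_interval(p, n, z=Z_95):
    """Normal-approximation confidence interval for a proportion p over n regions."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def classify(froh, n=519):
    """Return every degree of relationship whose interval contains froh.

    An empty list means the value falls between intervals; two entries
    mean it falls where adjacent intervals overlap.
    """
    return [
        degree
        for degree, p in EXPECTED_F.items()
        if confidence_interval(p, n)[0] <= froh <= confidence_interval(p, n)[1]
    ]


for degree, p in EXPECTED_F.items():
    low, high = confidence_interval(p, 519)
    print(f"{degree:>6}-degree: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the third- and fourth-degree intervals overlap slightly, so a value such as 4.3% is assigned to both, whereas a value such as 9% falls in the gap between the third- and second-degree intervals. Values above the first-degree upper bound return an empty list in this sketch; the study treats any value above 21.3% as consistent with a first-degree relationship.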

Results

Demographics for patients with regions of homozygosity

Most samples collected for SNP microarray were from pediatric patients. However, 5/59 samples with ROH were nonpediatric; these included one product of conception, three prenatal amniocentesis samples, and one adult sample. There were 34 male and 25 female SNP samples. Ethnicity/race was documented for 43/59 patients. Of these, 4/43 were Hispanic/Other and 39/43 were Non-Hispanic. Of the Non-Hispanics, 22 were Caucasian, 3 were African-American, 3 were Asian, 8 were Middle Eastern, 1 was Other (Italian/Spanish), and 2 had no documented race provided. The most common postnatal indications for microarray testing were developmental delay/mental retardation, hypotonia, seizures, and dysmorphic features.

Findings of ROH indicating parental consanguinity are not rare in patients undergoing SNP microarray testing

Retrospective data analysis of 3,217 patients identified 59 (1.8%) with at least two ROH >10 Mb on two separate chromosomes. In this cohort, SNP microarray detected a range of homozygosity from 0.9 to 30.1%, indicating parental relationships from first-degree to distant relatives. Representative microarray plots ( Figure 1 ) provide visualization of the autosome in the offspring of consanguineous parents. Figure 1a represents the offspring of parents with a first-degree relationship, as indicated by ROH >5 Mb covering 24.4% of the autosome. Figure 1b shows a more distant third-degree parental relationship with homozygosity covering 7.3% of the autosome. A first-cousin relationship was confirmed for this patient’s parents.

Figure 1

Single-nucleotide polymorphism (SNP) microarray data indicate regions of homozygosity (ROH) associated with parental consanguinity. (a) ROH in a patient with closely related parents. Based on ROH >5 Mb, ~24.4% of the autosome is identical by descent indicating parental first-degree relatives. The coefficient of inbreeding for first-degree relatives is one-fourth (25%). (b) ROH in a patient whose parents are reported to be third-degree relatives. Based on ROH >5 Mb, ~7.3% of the autosome is identical by descent. The coefficient of inbreeding for third-degree relatives is one-sixteenth (6.25%). Parents were confirmed to be first cousins. Grayed blocks indicate ROH identified by SNP microarray across the autosome.

By calculating the confidence interval for the coefficients of inbreeding for different types of consanguineous matings, we were able to categorize the parents’ suspected degree of relationship ( Figure 2 ). Individuals who were unable to be clearly classified fell between the calculated intervals ( Table 1 ).

Figure 2

Estimated degree of parental relationship in patients with regions of homozygosity (ROH). A confidence interval was generated to predict the parental degree of relationship in patients with two or more ROH, found on different chromosomes, each >10 Mb. Patients who fell between categories were considered to have parents with uncertain degrees of relationship and were categorized accordingly.

Table 1 Predicted degree of relationship between parents of individuals with two or more ROH >10 Mb on separate chromosomes

Using generated confidence intervals, individuals with homozygosity exceeding 21.3% were categorized as suspected offspring of first-degree relatives. In this study, 11/59 (18.6%) patients met this threshold. The remaining 48 patients had <21.3% homozygosity, representing parents who were more distantly related.

Impact of the X chromosome on estimates of homozygosity

Separate calculations were performed on 25 female patients to determine whether inclusion of the X chromosome impacts percentage of homozygosity. Fifteen of 25 (60%) females had ROH on the X chromosome. Comparisons of total percentage of homozygosity were made between the autosome and genome. The mean difference was 0.5% and the median difference was −0.1%. Parental classification remained the same in 13/25 (52%) patients. In 12/25 (48%) patients, the parental classification was more ambiguous. In 20/25 (80%) females, the difference was ≤1% ( Supplementary Table S4 online). The remaining five patients had differences of 1.6, 2.5, 2.6, 2.6, and 4%. One patient had a ROH covering the entire X chromosome, and this increased percentage of homozygosity from 25.2 to 29.2%.
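
The last example can be checked arithmetically. The sketch below assumes that the patient was tested on the Omni1-Quad (the platform is not stated for this case) and that the covered X chromosome equals the difference between the covered female genome and the covered autosome reported in the Methods; the function name is hypothetical.

```python
AUTOSOME_BP = 2_699_116_387       # covered autosome, Omni1-Quad (from the Methods)
FEMALE_GENOME_BP = 2_851_012_420  # covered female genome, Omni1-Quad (from the Methods)
X_BP = FEMALE_GENOME_BP - AUTOSOME_BP  # assumed covered length of the X chromosome


def genome_froh_with_full_x(autosomal_froh):
    """Genome-wide proportion of homozygosity when the entire X chromosome is an ROH."""
    autosomal_roh_bp = autosomal_froh * AUTOSOME_BP
    return (autosomal_roh_bp + X_BP) / FEMALE_GENOME_BP


# 25.2% autosomal homozygosity plus a fully homozygous X chromosome.
print(round(100 * genome_froh_with_full_x(0.252), 1))  # 29.2
```

Because the X chromosome accounts for roughly 5% of the covered female genome, a fully homozygous X can shift the total by about 4 percentage points, consistent with the reported change from 25.2 to 29.2%.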

Discrepancies in clinical documentation of family history and molecular findings of ROH

Parental consanguinity was documented on the requisition in 8/59 (13.6%) cases. There was no documentation of consanguinity before 2010. An internal medical chart review was performed on 43 patients. In 19/43 (44%) cases, consanguinity was documented in the clinical chart before microarray testing, but only 3 (16.7%) physicians provided this information on the requisition.

Following chart review, patients were classified as family history unknown (10/43, 23%), molecular results consistent with family report (12/43, 28%), consanguinity denied (10/43, 23%), ROH more than expected (9/43, 21%), and ROH less than expected (2/43, 5%).

Patients/families that denied consanguinity fell into four categories. Two patients had a high level of homozygosity most consistent with a first-degree parental relationship (27.2 and 23.8%). Three patients with 4.6, 4.1, and 2.5% homozygosity denied consanguinity but were Jordanian, Pakistani, and American Amish, respectively. Four patients had relatively low levels of homozygosity (1.7, 1.7, 1.1, and 0.9%). One patient with 5.4% homozygosity denied consanguinity, and the reason for the discrepancy is unclear.

In 19 cases, microarray results suggested more homozygosity than would be expected by clinical documentation. Ten cases have been discussed above as they denied consanguinity. The remaining nine are divided into three categories. Two patients had high levels of homozygosity that may be consistent with a first-degree parental relationship (28.4 and 21.0%). Six patients admitted a parental relationship, but the microarray finding suggested a closer relationship than reported. All six were from ethnicities in which multiple generations of consanguinity are common. In one family, parents reported a second-cousin relationship, but microarray detected 3.8% homozygosity; the reason for the discrepancy is unclear.

Finally, in two cases, microarray suggested less homozygosity than would be expected by clinical documentation for unknown reasons. In one family, the mother confirmed that the father of the baby was her paternal half-brother. The expected coefficient of inbreeding based on the patient report was 12.5% but the observed percentage of homozygosity was 8%. Table 2 provides a summary of expected versus observed homozygosity.

Table 2 Comparison of clinical documentation of family history and molecular findings of ROH on SNP microarray

Consistency of homozygosity calculations in siblings

Percentage of homozygosity was calculated for two sibling pairs and one sibling trio to appraise consistency of the microarray tool. In family 1, the siblings had 7.3 and 6.9% homozygosity and would be classified in the same category of parental relatedness. In family 2, the siblings had 9.8 and 11.1% homozygosity and would be classified in the same category. In family 3, the siblings had 4.3, 4.4 and 6.4% homozygosity. For this trio, two siblings were categorized as offspring of third/fourth-degree relatives whereas the third sibling was categorized as offspring of third-degree relatives.

SNP microarray-based homozygosity mapping aids in selection of candidate genes for diagnosis of autosomal recessive disease

In two known consanguineous families, homozygosity mapping led to molecular testing/patient diagnosis. In family 1, both the proband and an affected sibling displayed profound mental retardation, hypotonia, and seizures. Severe parenchymal volume loss with white matter signal abnormalities (abnormal bright signal on FLAIR/T2-weighted images) in the cerebral/cerebellar hemispheres bilaterally was identified by magnetic resonance imaging. SNP microarray revealed 15 ROH covering 7.3% of the autosome in the proband, and 10 ROH covering 6.9% of the autosome in the affected sibling. Consistent with percentage of ROH identified by microarray, the parents were known first cousins. There were seven regions of overlapping homozygosity ( Supplementary Table S5 online). The physician suspected a deficiency of acyl-CoA oxidase 1, palmitoyl, encoded by the ACOX1 gene on 17q25, because of elevated very long chain fatty acids with normal urinary bile acids. Homozygosity was not identified around ACOX1, ACOX2, or ACOX3. However, the siblings shared an ROH at 5q23.1 ( Figure 3a,b ) that included HSD17B4 (hydroxysteroid (17-β) dehydrogenase 4), which encodes the D-bifunctional protein ( Figure 3b ). Mutation of this gene can present clinically with features similar to ACOX deficiency.22 Enzymatic testing of D-bifunctional protein showed a deficiency (OMIM no. 261515) in the siblings, and sequencing identified a previously reported23 homozygous 3 bp deletion (c.233_235del, p.Glu78del) in HSD17B4.

Figure 3

Single-nucleotide polymorphism microarray analysis from affected siblings narrowed regions of homozygosity (ROH), leading to identification of candidate genes. (a) ROH on chromosome 5 is overlapping in siblings with profound mental retardation, hypotonia, seizures, and abnormal magnetic resonance imaging results. The box demarcates the region that is enhanced in (b). (b) Overlapping ROH and clinical features implicate the HSD17B4 gene. (c) ROH on chromosome 17 is overlapping in siblings with seizures associated with an unspecified disorder of metabolism. The box demarcates the region that is enlarged in (d). (d) Overlapping ROH and clinical features implicate the PNPO gene.

In family 2, sisters presented with global developmental delay, muscle weakness, profound hypotonia, lack of coordination, and intractable neonatal seizures beginning in the late prenatal period, recurring within the first hours of life, and resistant to multiple antiepileptic medications. Microarray analysis revealed 9.8 and 11.1% ROH in the siblings. Overlapping ROH narrowed the autosome to eight regions of interest ( Supplementary Table S5 online). The siblings were diagnosed with pyridoxamine 5′-phosphate oxidase deficiency (OMIM no. 610090), which is caused by recessive mutations in pyridoxamine 5′-phosphate oxidase (PNPO), a candidate gene in an ROH on chromosome 17q21.32 ( Figure 3c,d ). Gene sequencing in both siblings identified a homozygous nonsynonymous (missense) mutation (c.674G→T) in the coding sequence that resulted in the substitution of a highly conserved amino acid (p.R225L). Both patients were started on pyridoxine supplementation. The younger sibling had excellent seizure control on pyridoxine monotherapy. The older sibling had recurrent seizure activity and was on seizure medications in addition to pyridoxine. However, she has had an episode of status epilepticus that was successfully aborted using repeated doses of intravenous pyridoxine.
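
The sibling comparison used in both families amounts to intersecting each child's ROH and asking which candidate genes fall inside the shared segments. The sketch below illustrates that step only; the function names are hypothetical, and the coordinates and gene names in the usage lines are placeholders rather than the patients' data or real gene positions.

```python
MB = 1_000_000


def shared_roh(rohs_a, rohs_b):
    """Intersect two lists of (chrom, start, end) ROH and return the overlapping segments."""
    shared = []
    for chrom_a, start_a, end_a in rohs_a:
        for chrom_b, start_b, end_b in rohs_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                shared.append((chrom_a, start, end))
    return sorted(shared)


def genes_in_regions(genes, regions):
    """Return the names of genes lying entirely within any region.

    genes: mapping of gene name to (chrom, start, end).
    """
    return sorted(
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if any(
            g_chrom == r_chrom and r_start <= g_start and g_end <= r_end
            for r_chrom, r_start, r_end in regions
        )
    )


# Placeholder coordinates for two affected siblings and two candidate genes.
proband = [("5", 100 * MB, 130 * MB), ("17", 40 * MB, 50 * MB)]
sibling = [("5", 110 * MB, 140 * MB), ("2", 0, 20 * MB)]
candidates = {
    "GENE_A": ("5", 118 * MB, 119 * MB),
    "GENE_B": ("17", 45 * MB, 46 * MB),
}

overlap = shared_roh(proband, sibling)
print(overlap, genes_in_regions(candidates, overlap))
```

Only genes inside segments that are homozygous in every affected sibling remain as candidates, which is how a gene outside the shared ROH (as with ACOX1 in family 1) can be deprioritized.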

Discussion

This study describes the use of SNP microarray to detect ROH associated with parental consanguinity and illustrates the clinical utility of ROH with two pertinent families. In an earlier study, Bruno et al.11 identified ROH >5 Mb, in 5/117 patients tested; however, no diagnoses were made and the study concluded that most regions were unlikely to be clinically significant. In this study, homozygosity mapping of probands and affected siblings identified a causative gene mutation in four patients from two families ( Figure 3 ). This is expected to represent the lower limit of clinical utility as not all patients had comprehensive follow-up and for some families the gene of interest may be located within an ROH <5 Mb, the cutoff used in this study. In siblings with a homozygous PNPO mutation, identification of the genetic defect led to important clinical management decisions, improved patient care, and potentially benefitted the long-term neurodevelopmental outcome. It should be emphasized that successful homozygosity mapping is dependent on excellent communication between the managing clinician and the laboratory regarding patient phenotype and clinical suspicion for disease. Web tools are in development to aid in the analysis of ROH for candidate recessive disease genes in the context of clinical features (Genomic Oligoarray and SNP array evaluation tool v1.0, University of Miami, Coral Gables, FL and Oklahoma University Health Sciences Center, Oklahoma City, OK).

Homozygosity calculations from this study revealed patients with coefficients of inbreeding ranging from 0.9 to 30.1%. This includes 11 probands who had ROH consistent with a parental first-degree kinship (>21.3% homozygosity), clearly demonstrating that SNP microarray has potential to identify a high degree of parental relatedness including potentially illegal incestuous relationships. Particularly concerning are cases where one parent is underage or mentally incapacitated. States have different laws defining incest, with some including only first-degree relatives and others extending to first- and second-degree relatives.24,25 Due in part to this legal ambiguity, there is no clear national consensus or standard of care to provide guidance to laboratories and physicians in responding appropriately and consistently to homozygosity data generated by genetic testing. Although every attempt should be made to address these ethical/legal issues in a thorough pretest counseling session, it would be helpful to have national consensus guidelines for a structured and consistent response to this type of result.

Because the proportion of the autosome covered by ROH is a continuous measure, but the theoretical coefficient of inbreeding is a single-point estimate, there is difficulty in classifying the likely degree of parental consanguinity. Thus, confidence intervals were obtained in this study to classify parental degree of relationship. Although the utility of reporting the location of ROH in clinical cases is recognized, we do not suggest including an estimate of parental degree of relatedness in a clinical laboratory report. Such calculations are only an estimate and cannot account for multiple generations of consanguinity or the random nature of crossovers in meiosis. Although this approach was successful in classifying some cases, not all patients fit cleanly into the designated categories. Furthermore, we had only a small number of individuals with clinically reported consanguinity; therefore, it is difficult to determine the accuracy of our assumptions. Clearly, further studies are required to determine the most appropriate thresholds for ROH reporting.

In some cases, the degree of clinically reported relationship did not match up with molecular findings. In families that denied consanguinity, discrepancies could be explained by a distant parental relationship that may be unknown to the family, families that were from isolated populations, or a close parental relationship that the family did not wish to disclose. In families where observed ROH was more than expected, discrepancies could be explained by patient ethnicities where multiple generations of consanguinity are common or a close parental relationship that the family was unaware of or did not wish to disclose. In two cases, we were unable to explain the discrepancy. It is not surprising that these types of discrepancies were identified, as many families have incomplete or inaccurate information about their ancestors.26

In two families, molecular results suggested parents were more distantly related than expected. Although these results cannot be explained with certainty, it is possible that the parental relationship was unclear in one family, as the child was adopted. In the second case, it is possible that a recombination event in the formation of gametes led to an overall decrease in percentage of homozygosity. The offspring of this union, a male child, was expected to have 12.5% homozygosity based on clinical report that the mother and father were paternal half-siblings. However, microarray identified 8% homozygosity, a significant departure from the expected value.

Even when parental relationships were clinically documented, the relationship was rarely communicated to the laboratory performing microarray analysis but was instead found on chart review. However, our laboratory has seen an increase in documentation of consanguinity as an indication, suggesting that physicians recognize the utility of SNP microarray in identifying ROH in these populations.

In this study, the impact of the X chromosome on homozygosity calculations was evaluated by comparing autosome and genome homozygosity in 25 females. Fifteen of 25 females had homozygosity on the X chromosome, but in most cases it changed the total percentage by ≤1%; the mean net difference was 0.5%. However, 5/25 (20%) of the females had a difference >1%, suggesting that inclusion of the X chromosome may have a net impact of increasing total percentage of homozygosity in some cases. Of note, the parental classification system was made more ambiguous for 12/25 females when the X chromosome was added to the calculation, as minor adjustments were made to the degree of relationship ( Supplementary Table S4 online). In one patient where a first-degree parental relationship was suspected, the entire X chromosome was homozygous, increasing the percentage of homozygosity by 4% and supporting the classification of the parental relationship. On the basis of these findings, it may be prudent to exclude the X chromosome from homozygosity calculations, but ROH on the X chromosome should not be completely ignored. Additional information may be obtained by consideration of ROH on the X chromosome. Such information may strengthen suspicion of parental incest.

Consistency of microarray prediction of ROH was also evaluated in three families where microarray analysis was done on more than one child. For the most part, molecular findings in the offspring were consistent according to our classification strategy. In one trio, percentage of homozygosity was 4.3, 4.4, and 6.4%. Two siblings had a parental classification that fell into the overlapping third- or fourth-degree category, whereas the third sibling was classified as offspring of third-degree relatives. The parents in this family denied close consanguinity, explicitly stating they were not first cousins. However, they are from Palestine, a relatively isolated population where rates of consanguinity approach 44.3% in some populations.27 This family further illustrates the difficulty of classifying families with multiple generations of consanguinity. Future studies addressing the genomic landscape of ROH may be useful in delineating actual parental relationship categories. As a general observation, this study noted increased centromeric, increased telomeric, and decreased intra-arm ROH in the offspring of parents who were first-degree relatives as compared with parents who were more distantly related.

In this study, microarray testing was performed on three different platforms. The mean spacing of SNPs on the oldest platform (370-duo) was 7.7 kb; the median spacing was 5 kb. It is possible that reported ROH from this platform is slightly inflated due to spacing of SNPs. However, this platform was used for only three patients and based on spacing of SNPs, the impact on total percentage of homozygosity is expected to be minor. Density of SNPs is an important consideration in the assessment of homozygosity in a patient population. Another issue that needs to be addressed in future studies is the presence of common ROH in populations.4,7,28 Tracking these regions may be beneficial in the exclusion of uniparental disomy, but it is unclear how such regions impact percentage of homozygosity. The authors are unaware of a publicly available database that tracks common ROH. Such a database would be a valuable resource in the determination of consistently reported common ROH in specific ethnic populations.

This study represents the first systematic review of clinical patients determined to be offspring of a consanguineous relationship with SNP microarray technology. Although the technology can be used to estimate parental degree of relationships, future studies are important to delineate categories of consanguinity. The ability to detect ROH is not limited to microarray; results of parental consanguinity also can be detected by whole-exome/whole-genome sequencing. Although this study provides evidence that ROH can be helpful in directing clinical diagnosis, it also highlights challenges associated with the discovery of illegal/unethical parental relationships. As many laboratories are currently offering SNP microarray, and whole-exome/whole-genome testing is on the rise, the field must consider laboratory and physician guidelines for handling results indicating parental relatedness.

Disclosure

K.L.S., S.L.Z., L.B., and T.A.S. are employed by the Cytogenetics Laboratory in the Division of Human Genetics at Cincinnati Children’s Hospital Medical Center.