Introduction

In humans, the normal patterning of internal organs on the left or right side of the body is a complex, yet highly orchestrated, developmental process. Consequently, disordered left-right (LR) patterning can lead to a broad spectrum of laterality defects, including situs inversus totalis (SIT), in which there is complete transposition of visceral organs with maintenance of organ concordance, and heterotaxy (HTX) in which at least one organ is discordant along the left-right axis [1]. SIT is rarely associated with congenital organ malformations, whereas heterotaxy is highly associated with a host of congenital malformations that include isomerisms (in which normally asymmetric structures appear duplicated on both left and right e.g., bilateral bilobed lung), segmental reversals of sidedness (e.g., dextrocardia), and failures of embryonic structures to regress (e.g., persistent left superior vena cava) or complete loss of structures (e.g., asplenia). The conventional anatomic classification of HTX has been based on splenic and atrial/bronchial situs—namely asplenia-type/right atrial isomerism and polysplenia-type/left atrial isomerism. Although there are characteristic lesions within each of these classifications, their considerable anatomic and mechanistic overlap justifies their combination in epidemiological and genetic studies [2].

Heart formation in the early embryo requires proper LR axis signaling [3], and therefore, it is not surprising that cardiovascular malformations occur in approximately 80% of individuals with heterotaxy. The resulting spectrum of lesions is extensive and includes anomalies of pulmonary and systemic venous return, atrioventricular septal defects, ventricular inversion and hypoplasia, transposition of the great arteries (TGA), double outlet right ventricle (DORV), and heart rhythm disorders [4]. In fact, abnormal LR patterning may contribute to as much as 3–7% of all congenital heart defects, and, with the associated multi-organ involvement, the medical and surgical management of such cases is particularly challenging [5]. The mortality for HTX thus remains appreciably high [6].

The etiologies of congenital laterality defects are complex and thought to include both environmental and genetic factors. Contributing environmental factors are poorly understood, but have been consistently observed [7]. From a genetic standpoint, laterality defects are occasionally observed in patients with chromosome abnormalities and genomic disorders [6, 8], and rare families have been reported with apparent segregation of HTX or related heart defects as autosomal dominant, autosomal recessive or X-linked traits [9,10,11,12]. There is thus some support for highly-penetrant rare variants, i.e., monogenic or Mendelian inheritance, causing laterality defects. At the same time, it has been noted that the same gene can cause the divergent phenotypes of SIT and HTX, and that polysplenia- or asplenia-types of HTX can be observed in families segregating the same pathogenic allele in the same disease gene, as well as in isolated cases with distinct alleles in the same gene [9, 10, 13]. In this context, the complexity of the heart defects and associated birth defects observed among patients with presumed protein-damaging variants in ZIC3 and CFC1 is most illustrative; there is little genotype-phenotype correlation among patients with ZIC3 variants, and the spectrum of heart defects is extensive, including defects not traditionally associated with laterality phenotypes and complex extra-cardiac malformations [10, 13]. Similarly, CFC1 variants have been associated with a variety of isolated congenital heart defects in addition to laterality abnormalities [9]. These observations demonstrate that there is substantial overlap in the genetic etiology of congenital heart defects, regardless of their phenotypic classification.

The known genes contributing to laterality defects account for only 15–20% of cases, and are largely associated with NODAL/TGFβ signaling (NODAL, CFC1, ACVR2B, LEFTYB, GDF1, TGFBR2, FOXH1), SHH signaling (ZIC3, LZTFL1) and monocilia functions (NPHP2, NPHP3, NPHP4, PKD2, TTC8) (see Supplementary Table S1). This connection to ciliary function is further reinforced by the observation that ~50% of patients with primary ciliary dyskinesia (PCD) defects have SIT and ~10% present with heterotaxy [8, 14]. Other gene variants with well-studied connections to early cardiac development (NKX2-5, CRELD1, MMP21, PKD1L1) and genes with relatively limited functional annotation (BCL9L, SHROOM3, MEGF8) have also been implicated in laterality defects. A critical review of these reports demonstrates that, although significant recurrence in first-degree relatives [15] and increased frequency in consanguineous populations [16] support a single gene view of laterality defects, putative pathogenic variants are often transmitted from unaffected parents, suggesting additional factors affect disease penetrance. Thus, these primary epidemiological observations are compatible with both Mendelian and oligogenic or complex inheritance mechanisms.

To explore the role of genetic variation in laterality defects, we performed singleton whole-exome sequencing (WES) of 323 unrelated cases, and investigated rare predicted-damaging variation, homo/hemizygous exon deletions, and gene-based burden of rare variation in a set of biologically-implicated cardiovascular genes defined a priori.

Results

High-quality candidate variants identified in the laterality cohort (Methods) were prioritized using three criteria: (1) extremely low allele frequency compared to population-based databases (minor allele frequency <0.05% in ARIC, 1000 Genomes Project, ESP, ExAC, and gnomAD; see Methods), (2) prediction of a deleterious functional effect, including loss-of-function (LOF; e.g., frameshift, stopgain, and splice site variants within 2 base pairs of intron-exon boundaries) or damaging nonsynonymous variation (Methods), and (3) a priori evidence for the gene playing a role in laterality or cardiac development (Methods; Supplementary Table S3). A total of 26 single nucleotide variants (SNVs) or small insertions/deletions met these criteria, representing 15 distinct genes (Table 1). The cases carrying these candidate SNVs represented ~7% of our total laterality cohort. Of these genes, nine have been previously implicated in human laterality disorders, and six represent novel candidate genes. We further investigated the clinical presentation of variant-carrying patients as well as their associated inheritance patterns.
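
The three prioritization criteria above can be read as a single predicate applied to each variant. The following is a minimal sketch of that logic, not the pipeline used in the study; the `Variant` record, its field names, and the effect labels are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

MAX_AF = 0.0005  # 0.05% minor allele frequency ceiling quoted in the text
LOF_EFFECTS = {"frameshift", "stopgain", "splice_site"}  # hypothetical labels


@dataclass
class Variant:
    gene: str
    effect: str               # e.g. "frameshift", "missense"
    predicted_damaging: bool  # result of a nonsynonymous consensus call
    population_afs: dict      # database name -> allele frequency


def is_candidate(variant, a_priori_genes):
    """Apply the three criteria: rare, predicted deleterious, gene known a priori."""
    # Criterion 1: rare in every population database that reports the variant.
    rare = all(af < MAX_AF for af in variant.population_afs.values())
    # Criterion 2: loss-of-function, or missense with a damaging prediction.
    damaging = variant.effect in LOF_EFFECTS or (
        variant.effect == "missense" and variant.predicted_damaging
    )
    # Criterion 3: gene is on the a priori laterality/cardiac list.
    return rare and damaging and variant.gene in a_priori_genes
```

In this sketch a variant absent from a database simply contributes no frequency, which is an assumption about how missing entries would be handled.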

Table 1 Candidate variants in laterality defect cases identified by whole exome sequencing

Known laterality genes

Five cases had rare LOF or nonsynonymous damaging SNVs in two known dominant laterality genes, NODAL and ACVR2B. The p.(R383C) missense variant observed in ACVR2B was inherited from an affected parent, and further analysis of this pedigree demonstrated clear segregation of the variant with five affected individuals from three generations (Fig. 1a). Conversely, all four NODAL variants were inherited from apparently unaffected parents; this is consistent with the incomplete penetrance that is well described for pathogenic variants in this gene [17]. For example, in the one available NODAL multiplex pedigree, a splice variant in patient LAT0022 was shown to be inherited from an apparently unaffected father and was shared with an affected paternal aunt (Fig. 1b). Somewhat surprisingly, three of the four predicted-damaging SNVs in NODAL were predicted loss-of-function variants, an uncommon observation in this gene (pLI = 0.95 [18]). We also observed compound heterozygous, rare, predicted-damaging SNVs in five genes previously implicated in either PCD (DNAI1, DNAH5, DNAH11, and HYDIN; 1 case each) or HTX (MMP21, 2 cases; PKD1L1). PCD was not a part of the study entry criteria, and none of the cases harboring variants in PCD genes were known to have PCD. A genetic and phenotypic overlap between PCD and laterality has been previously postulated [19] and our data support this. Homozygous and compound heterozygous variants in MMP21 have been reported as the cause of HTX in a total of 14 cases (from 12 families) [20, 21]. Our recent report of homozygous recessive PKD1L1 variants as a cause of HTX included both the case reported here and another case with a homozygous splicing variant in PKD1L1 [22].

Amongst known laterality genes, the largest contribution of predicted-damaging SNVs was in ZIC3—we observed five LOF SNVs (5 cases) and one predicted-damaging missense variant (1 case), including pathogenic variants in three cases that we have previously reported [10]. Consistent with X-linked disease inheritance, ZIC3 candidate variants were all found in the hemizygous state in male offspring and were inherited from heterozygous mothers, with evidence of more extensive X-linked segregation of the phenotype in available pedigrees (Fig. 1c and ref. [10]).

Fig. 1

Multiplex families with laterality defects and CHD. Families segregating variants in ACVR2B (a, dominant), NODAL (b, dominant), ZIC3 (c, X-linked), and SMAD2 (d, dominant) are shown. Shaded individuals are affected with a laterality defect. Presence (+) or absence (−) of variant is indicated for tested individuals. Accompanying numbers indicate LAT IDs (see Supplementary Table S2). Phenotypes are provided for affected individuals for whom a DNA sample was not available.

Novel human laterality candidate genes

Finding likely pathogenic variants in known laterality genes served as a benchmark for our a priori filtering approach. Next, we focused on candidate laterality variation in genes not previously implicated in human disease. De novo variants have been shown to be of particular relevance in CHD [23,24,25,26]; however, we did not observe any of the candidate variants in our cohort to occur de novo. We observed rare heterozygous singleton LOF SNVs in four candidate genes with strong supporting evidence for a role in cardiac development and a low tolerance for loss-of-function variation (pLI) in gnomAD: SMAD2, ROCK2, ISL1, and SUPT16H (Table 1). Where parents and family members were available to assess segregation, these variants, with the exception of SUPT16H but like those in NODAL, all showed evidence of dominant inheritance with incomplete penetrance; all were transmitted from parents without a reported history of significant cardiac disease. De novo damaging variants in SMAD2 were previously reported in two cases from a large genetic study of CHD [23, 24], which, on review, both had phenotypes consistent with laterality. In our cohort, the SMAD2 LOF variant was inherited from an unaffected father whose brother passed away with a diagnosis of HTX prior to genetic testing (Fig. 1d).

We did not systematically perform echocardiograms or additional molecular (e.g., mosaicism) investigations on carrier parents; therefore, reportedly ‘unaffected’ parents could still have less severe, undetected, laterality phenotypes. Given the known incomplete penetrance of dominant inherited variants in these disorders [25], the predicted molecular effect of the identified variants on the protein, and robust prior biological support, we chose to maintain these presumed partially penetrant dominant genes on our candidate list.

We observed biallelic, rare, predicted-damaging variants occurring in trans in one gene not previously associated with either PCD or laterality defects in humans: ZFYVE16 (Table 1). ZFYVE16 encodes a FYVE zinc finger domain protein; related proteins in this family have been implicated in TGF-β and BMP signaling through Smads [27], as well as in ventral folding of the developing mouse embryo, including heart development [28]. Notably, these ZFYVE16 variants were observed in the same male individual with a hemizygous frameshift variant in ZIC3 (LAT0264; see Table 1). In addition to dextrocardia, this individual was also noted to have an absent stomach, an unusual feature suggestive of a more severe phenotype outside of the true laterality spectrum.

We also assessed evidence for X-linked inheritance models among our candidate genes. We noted a male proband with a hemizygous, maternally-inherited RAI2 variant and SIT in combination with asplenia as well as other cardiac and vascular abnormalities (Table 1; Supplementary Table S5). RAI2 (retinoic-acid induced 2) encodes a protein (RAI2) of unclear function that is thought to regulate transcription. RAI2 is expressed in fetal heart, brain, and kidney [29], and copy number loss of chromosome Xp22, encompassing RAI2, has been noted in two families with a syndromic diagnosis that included congenital heart lesions [30]. Another retinoic-acid induced gene, RAI1, is the putative phenotypic driver in Smith–Magenis syndrome (MIM# 182290), a neurodevelopmental syndrome that includes cardiac defects and is caused by haploinsufficiency of 17p11.2 due to deletion CNV or LOF SNV in RAI1.

Homozygous and hemizygous exon deletions in human laterality disorders

Next, we assessed structural genomic variation in our cohort; specifically, small and rare hemizygous and homozygous deletions detectable from comparison of whole exome sequence reads [31]. We observed a total of 14 high-confidence events (Z < −1.5 for autosomes, Z < −1.0 for the X chromosome; maximum 3 events per person and 3 overlapping event calls) (Supplementary Table S6). One of these events overlapped our original a priori list: a hemizygous single exon deletion of ZIC3 observed in a patient with a common AV canal, asplenia, single ventricle, and atrioventricular valve regurgitation. Subsequent validation and parental testing showed this deletion to be maternally inherited (Fig. 2). We also validated an inherited, hemizygous single exon deletion of RIPPLY1 (Ripply Transcriptional Repressor 1) in a male patient with abdominal situs inversus, asplenia, interrupted inferior vena cava (IIVC), total anomalous pulmonary venous return (TAPVR) and complex CHD (D-TGA, common AV canal, ventricular septal defect, atrial septal defect, pulmonary atresia) (Supplementary Figure F1). Although not on our original a priori list, RIPPLY1 encodes a putative transcriptional repressor that acts in the NOTCH pathway and is paralogous to RIPPLY2, which we have recently described in a case of autosomal recessive Klippel-Feil syndrome with SIT [32]. In zebrafish embryos, Ripply1 interacts with Tbx6 and Mesp-b to regulate somite and vertebral development [33], and T-box genes are known to be important regulators of the establishment of asymmetry mediated by cilia in Kupffer’s vesicle [34], making RIPPLY1 a strong candidate gene for laterality defects.
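
As an illustration of how read-depth Z-score cut-offs of this kind could be applied, the sketch below scores one sample's per-exon RPKM against the remaining samples and flags exons that fall below the chromosome-specific threshold. It is an assumption about how such a score might be defined, not a description of the HMZDelFinder implementation; the function names and exon identifiers are hypothetical.

```python
import statistics

Z_AUTOSOME = -1.5  # thresholds quoted in the text
Z_CHRX = -1.0


def exon_z_scores(sample_rpkm, cohort_rpkm):
    """Z-score of one sample's RPKM per exon against the other samples.

    cohort_rpkm maps each exon to a list of at least two RPKM values.
    """
    scores = {}
    for exon, value in sample_rpkm.items():
        others = cohort_rpkm[exon]
        mean = statistics.mean(others)
        sd = statistics.stdev(others)
        # An exon with no spread in the cohort carries no signal here.
        scores[exon] = (value - mean) / sd if sd > 0 else 0.0
    return scores


def flag_deletions(sample_rpkm, cohort_rpkm, exon_chrom):
    """Return exons whose Z-score falls below the chromosome-specific cut-off."""
    flagged = []
    for exon, z in exon_z_scores(sample_rpkm, cohort_rpkm).items():
        cutoff = Z_CHRX if exon_chrom[exon] == "X" else Z_AUTOSOME
        if z < cutoff:
            flagged.append(exon)
    return flagged
```

A real caller would also enforce the per-person and overlapping-call limits mentioned above, which are omitted from this sketch.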

Fig. 2

Hemizygous partial deletions of ZIC3. a WES read count data (RPKM) plotted for subject LAT0097 (red line) and all other BHCMG subjects (black lines) in the indicated regions of the X chromosome. Near-zero RPKM values in (a) suggest a hemizygous deletion of the final exon of ZIC3; b Confirmation and mapping of ZIC3 CNV by array comparative genomic hybridization using a custom 8 × 60k probe Agilent array (design ID = 064211). This array and all array procedures have been described previously [49]. c To more accurately map the CNV, long-range PCR with primers spanning the predicted breakpoints was employed, resulting in a smaller product compared to wild-type (ctrl DNA); sequencing of the resulting PCR product illustrated the deleted region (d)

Possible multigenic contribution to human laterality disorders

Recent studies have revealed associations between common variants and a variety of cardiovascular traits [35, 36] and have highlighted the complex multigene/multi-variant interactions in hypoplastic left heart syndrome [37]. Therefore, we chose to expand our interrogation beyond single marker Mendelian analyses to perform gene-based aggregation analyses across the entire exome by comparing the burden of putative loss-of-function and predicted-damaging nonsynonymous variation between laterality cases and ARIC controls (see Methods). This analysis revealed compelling statistical evidence (surpassing our significance threshold of p < 7.06 × 10⁻⁶) for two genes, PXDNL and BMS1 (Supplementary Table S7), with little evidence of systematic stratification between the cohorts.

Although neither Peroxidasin-like (PXDNL; p = 1.61 × 10⁻⁶) nor Ribosomal biogenesis factor (BMS1; p = 7.29 × 10⁻⁷) was included on our a priori list, both are good biological candidates in the context of the broader laterality phenotype. PXDNL is a homolog of human peroxidase that is exclusively expressed by cardiomyocytes in the human heart and thought to interact with peroxidase in remodeling of the extracellular matrix after stress exposure [38]. BMS1 is highly conserved across species. The zebrafish homolog, bms1l, is strongly expressed throughout the digestive tract and accessory organs including the liver, and mutant bms1l has been associated with hypoplasia of the liver and digestive tract [39]. Thus, BMS1 may be of particular relevance to the approximately two-thirds of our cohort with extra-cardiac laterality defects, which included liver abnormalities and intestinal malrotation (Supplementary Tables S2 and S5), as was evident in two of the six LOF variant carriers.

Discussion

Our analysis of rare damaging variation and exonic deletions within suspected and established CHD genes detected 28 compelling monogenic candidate variants (26 SNV/indels, 2 deletion CNVs) in 23 of 323 cases, or 7.1% of our starting cohort of unrelated laterality patients. This approach identified pathogenic and likely pathogenic variants in known and putative laterality genes, expanding the spectrum of disease-causing variation within these genes. Though we incorporated evidence from familial segregation patterns of these variants when possible, these variants were not primarily ascertained using family-based methods. This demonstrates that stringent variant filtering strategies can be used to identify candidate variants in rare disease case cohorts without the necessity of performing WES in all family members. Nevertheless, the overall “solve” rate would likely improve using a combination of these approaches.

We anticipate that further genetic studies of this phenotype will confirm many of our candidates and identify new ones, thereby improving the overall molecular diagnostic rate for laterality defects. For instance, our approach was able to confirm SMAD2 (dominant) as a laterality CHD gene alongside two reported de novo cases identified from large CHD cohorts [23, 24], and was instrumental in identifying and confirming PKD1L1 (recessive) [22]. We thus posit RIPPLY1 (X-linked hemizygous) and RAI2 (two X-linked hemizygous cases) as particularly compelling laterality candidates emerging from our analyses.

The majority of cases remained without an identifiable genetic etiology. In part, this reflects the limitations of our somewhat conservative analytical framework: we restricted our Mendelian SNV analyses to high-confidence biological candidates and imposed a high threshold for pathogenic variant classification among identified variants. This enabled us to bring our analytical framework in line with recent recommendations for inferring pathogenic and likely-pathogenic genomic variants [40]; the variants in newly described candidate genes presented here would be considered to have (at least) moderate evidence of pathogenicity. Nevertheless, our stringent criteria and cohort approach could miss potential pathology-contributing variants in individual patients and families. Future surveys that incorporate data from the entire exome/genome, utilize segregation in larger pedigrees, and prioritize functional validation of novel variation will be of high value in further expanding the genetic spectrum of these defects.

The variants and candidate genes we identified in individual subjects suggest that alleles contributing to laterality-related traits largely segregate with either recessive or X-linked inheritance, although we did observe suggestive examples of dominant and complex/polygenic modes of inheritance. Almost all of the dominant heterozygous LOF variants we observed, both in known and new candidate genes, were inherited from apparently unaffected parents. With new candidate genes, it is difficult to know a priori whether haploinsufficiency is sufficient to cause disease; however, this phenomenon is consistently observed in known laterality-causing genes such as NODAL, and was congruent with the disease segregation of SMAD2 LOF variants in extended families. One potential mechanism contributing to this observation is parental mosaicism. We did not perform WES on parents, and although validation traces were not highly suggestive of mosaicism, a role for low-level gonadal or germline mosaicism cannot be completely ruled out. Another consideration is that the phenotype of dominantly-inherited laterality genes may involve more complex mechanisms of non-penetrance [37] and variable expressivity [41], such that ‘unaffected’ carrier parents might have milder phenotypes that do not readily come to clinical attention. It may also be that multiple ‘hits’ are required to disturb the complex processes involved in early embryonic left-right patterning, such that the effect of a single heterozygous ‘hit’ is most relevant in the context of epistatic modifiers around the genome [42]. Similarly, multi-locus variation may provide an adverse variant burden in a pathway, system or interactome [43]. The latter two speculations are bolstered by the associations between the burden of rare variation in BMS1 and PXDNL and laterality defects. A similar argument can be made for rare compound heterozygous damaging variants in ZFYVE16 occurring in the same individual (LAT0264) carrying a ZIC3 frameshift variant (Table 1). Functional studies of our candidate variants in model organisms together with consideration of more complex multi-locus genetic models may help to confirm or refute these notions.

Expanded assessments of coding-sequence variation, however, may leave our understanding of the etiology of laterality defects incomplete. For instance, LOF variants accounted for all but one of the ZIC3 variants, and included distinct types of LOF variants (LAT0180—frameshift; LAT1177—stopgain; LAT0097—hemizygous deletion). The effects of these variants converge on absence of ZIC3 protein; thus, it may follow that other variants with a similar effect could also cause HTX and congenital heart disease (e.g., miRNA, gene silencer motifs). Screening for these and other categories of LOF-like variation in ZIC3 could be prioritized in future genomic analyses of HTX, in addition to the exome-centric analyses presented here.

This study provides a comprehensive coding-sequence survey for variants at known and putative candidate loci in laterality defects. The results implicate a total of six candidate genes for human laterality defects and reinforce that the genetic architecture of such defects is complex, spanning several single gene inheritance models, potential multi-locus contributions and other putative disease mechanisms. These data provide a basis for future investigation of additional monogenic causes of heterotaxy and related defects in left-right patterning as well as a starting point for discovery of complex genetic mechanisms underlying CHD and other human birth defects.

Methods

Case ascertainment

Cases were recruited through Texas Children’s Hospital (TCH) in Houston, TX. The Institutional Review Board of Baylor College of Medicine approved the study and all participating subjects gave informed consent. When available, parents and affected family members of the index cases were also invited to participate. The details of case ascertainment have been previously published [17]; briefly, patients were eligible if they presented with evidence of disturbed left-right patterning, which included situs abnormalities (SIT, HTX) or an isolated congenital malformation consistent with disturbed left-right patterning such as D-TGA or DORV (summarized in Supplementary Table S2). The clinical diagnoses were confirmed by a detailed review of medical records, including radiologic and cardiac-specific imaging. DNA samples were obtained from blood, saliva, and skin fibroblasts as previously described [17].

Cardiovascular gene set

Given the strong relationship between laterality and CHD, we compiled a list of 1702 human genes with a priori evidence for a role in laterality and/or cardiovascular malformation (CVM) from multiple public resources (Supplementary Table S3). This list was compiled from searches of human disorders including laterality defects and CVM (OMIM, NCBI, literature), relevant biological pathways and interactions (HH, NOTCH, TGFβ, PITX2) as determined by the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, and model organism support including zebrafish heart expression (ZFIN) and abnormal cardiac morphology in mouse models (MGI, MP:0000266).

Population-based allele frequency comparison

In addition to population-based databases of variant minor allele frequencies (Exome Aggregation Consortium, ExAC v0.3.1; Genome Aggregation Database, gnomAD v2.0.1; Exome Variant Server, EVS), WES data from 5492 European American (EA) individuals from the population-based Atherosclerosis Risk in Communities (ARIC) study, sequenced and annotated using the same pipeline as the laterality cases [44], were utilized as a variant frequency comparison group. ARIC samples with heart failure, major Q-wave, or LVH by the Cornell definition were excluded from the analyses.

Whole exome sequencing, annotation, and validation

WES was performed on 323 individual cases at the Human Genome Sequencing Center at Baylor College of Medicine through the Baylor-Hopkins Center for Mendelian Genomics initiative using the Illumina HiSeq platform and the Mercury pipeline [45]. ARIC samples were captured using VCRome 2.1 (42 Mb) and HTX cases were captured using HGSC-CORE (52 Mb), and all analyses were restricted to the intersection of the targeted regions of these reagents. Furthermore, we analyzed only high-quality genotypes including single base substitutions with a minimum coverage of 10x, and small insertions/deletions (indels) with >30x coverage. Read mapping to Genome Reference Consortium Human Build 37 (GRCh37) was performed with the Burrows-Wheeler Aligner (BWA) [46], and allele calling was performed with the Atlas2 suite [47] (Atlas-SNP, Atlas-Indel) in order to identify high-quality variants annotated for their potential protein-damaging effect. All candidate SNVs from the laterality cohort were validated on an orthogonal platform (dideoxy Sanger sequencing) in cases. In order to assess inheritance patterns, validated variants were then genotyped, using the same methodology, in parents and available family members.
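
Restricting analyses to the intersection of two capture designs amounts to intersecting two sets of genomic intervals. The sketch below shows one conventional way to do this for a single chromosome, assuming each design is given as a sorted list of non-overlapping, half-open (start, end) intervals; it illustrates the idea and is not the procedure used in the Mercury pipeline.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of half-open (start, end) intervals.

    Both lists must describe the same chromosome and contain no
    overlapping intervals within themselves.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Only positions inside the returned intervals would then be eligible for genotype comparison between cases and controls.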

Variant filtering

The Variant Call Format (VCF) file contained flagged low-quality variants, including SNPs with posterior probability lower than 0.95, total depth of coverage less than 10x, fewer than 3 variant reads, an allelic fraction less than 10%, 99% of reads in a single direction, and homozygous reference alleles with <6x coverage. We increased stringency to remove low-quality indels with a total depth less than 30x and allelic fraction below 30%. Eight cases presented extremely high or low heterozygosity (more than 3 standard deviations from the sample mean) and were excluded from burden analyses. Variants were annotated to RefSeq gene definitions using ANNOVAR. Conservative loss-of-function (LOF) annotation included only premature stopgains in non-terminal exons, essential splice sites used by all gene isoforms, and frameshift indels similarly mapping to all isoforms. Damaging nonsynonymous (DNS) variation was defined as protein-altering substitutions predicted to be damaging by a consensus of at least 3 out of 6 prediction scores downloaded via dbNSFP (SIFT, PolyPhen-2 HDIV, LRT, MutationTaster, MutationAssessor, FATHMM). A PHRED-like scaled C-score (CADD) was also used to assess pathogenicity of variants (LOF and DNS), but was not used to exclude candidate sites. LOF constraint was quantified by first calculating the sum of all LOF alleles in ARIC (gene-wise observed LOF). Next, we simulated all potential nucleotide substitutions in exonic regions to determine the number of total potential LOF sites for each gene and calculated the ratio of observed to potential LOF alleles (OP ratio). OP ratios were used alongside pLI (the probability of loss-of-function intolerance, from ExAC/gnomAD) to filter for genes with a very low OP ratio (zero, or lowest 30th percentile) or high pLI score (>0.9) amongst laterality candidates.
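
The quality flags and the 3-of-6 consensus rule described above can be expressed compactly. The following is a minimal sketch under stated assumptions: the function and argument names are hypothetical, each indel threshold is treated as an independent failure condition, and a predictor with no score is counted as not damaging. None of these choices is confirmed by the text.

```python
PREDICTORS = (
    "SIFT", "PolyPhen2_HDIV", "LRT",
    "MutationTaster", "MutationAssessor", "FATHMM",
)


def passes_snv_quality(posterior, depth, variant_reads, strand_fraction):
    """Apply the SNV flags listed in the text to a single genotype call."""
    allelic_fraction = variant_reads / depth if depth else 0.0
    return (
        posterior >= 0.95
        and depth >= 10
        and variant_reads >= 3
        and allelic_fraction >= 0.10
        and strand_fraction < 0.99  # fraction of reads on the dominant strand
    )


def passes_indel_quality(depth, variant_reads):
    """Stricter indel thresholds: 30x depth and 30% allelic fraction."""
    allelic_fraction = variant_reads / depth if depth else 0.0
    return depth >= 30 and allelic_fraction >= 0.30


def is_damaging_nonsynonymous(calls, min_votes=3):
    """Consensus call: at least min_votes of the six predictors say damaging."""
    votes = sum(1 for name in PREDICTORS if calls.get(name) is True)
    return votes >= min_votes
```

Here `calls` maps a predictor name to `True` when that tool labels the substitution damaging, so a variant with three damaging calls and three missing scores would still qualify.
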
Variants identified in known laterality genes were also curated for pathogenicity using ACMG/AMP criteria [48], and then annotated as pathogenic (P), likely pathogenic (LP), of uncertain significance (VUS), or likely benign (LB) (Table 1; Supplementary Table S8). Finally, for large genes in which we observed singleton biallelic variants in trans (e.g., DNAH5, DNAH11 and HYDIN), we also calculated the chance of observing two such variants: we took the maximum frequency of the two observed variants and catalogued all other variants (in gnomAD) in the gene at that frequency or lower. The resulting frequencies were added, assuming independent segregation, to give the proportion of people with rare variants. Assuming random mating, we then estimated the chance that any two variants would be observed in the same person. If this probability of biallelic variants occurring by chance was higher than observed in our cohort, we removed the gene in question (e.g., DNAH11), keeping those for which it was lower than observed.
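
One way to read the chance calculation for large genes is as a squared cumulative allele frequency. The sketch below takes that reading: it sums gnomAD frequencies at or below the rarer-variant ceiling to obtain a per-chromosome rate q, and uses q² as the probability that one person carries two such variants under random mating. Both the q² approximation and the function names are assumptions made for illustration, not the authors' stated formula.

```python
def biallelic_chance(observed_afs, gene_afs):
    """Approximate chance that one person carries two qualifying rare variants.

    observed_afs: frequencies of the two variants seen in the case.
    gene_afs: gnomAD frequencies of all catalogued variants in the gene.
    """
    ceiling = max(observed_afs)
    # Cumulative frequency of variants at or below the ceiling.
    q = sum(af for af in gene_afs if af <= ceiling)
    # Random mating: two independent draws from the same allele pool.
    return q * q


def keep_gene(observed_afs, gene_afs, carriers, cohort_size):
    """Keep the gene only if chance alone predicts fewer carriers than observed."""
    return biallelic_chance(observed_afs, gene_afs) < carriers / cohort_size
```

Under this reading, a gene carrying many variants near the ceiling frequency accumulates a large q and is more likely to be removed.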

WES of laterality cases revealed 1,217,653 total variants (1,152,651 single nucleotide substitutions, and 65,002 small indels; Supplementary Table S4) across the entire allele frequency spectrum.

Validation of hemizygous/homozygous copy losses

CNVs were confirmed and mapped by array comparative genomic hybridization using a custom 8 × 60k probe Agilent array (design ID = 064211). This array and all array procedures have been described previously [49]. To more accurately map the CNV and provide substrate for breakpoint sequencing, long-range PCR with primers spanning the predicted breakpoints was employed. PCR reagents and concentrations have been described previously [50]. The thermal cycler was programmed as follows: 94 °C x 1 min; 30 cycles of 94 °C x 30 sec followed by 68 °C x 7 min; 72 °C x 10 min. PCR primers are listed in the Supplemental Methods. Breakpoint PCR products were treated with ExoSAP-IT (Affymetrix) according to the manufacturer’s instructions, then sequenced by Sanger di-deoxynucleotide sequencing (Baylor College of Medicine Sequencing Core, Houston, TX, USA).

Gene burden testing

Firth logistic regression was performed on case-control status using the total number of heterozygous sites per individual as a covariate in order to address potential platform differences between sequencing batches. These analyses were restricted to samples of reported European ancestry (111 HTX cases; 5,752 ARIC participants) and excluded genes in the MHC region of chromosome 6 and multiallelic variants. Only rare DNS/LOF variants that were covered at 10x or better and met our variant-filtering criteria in both cohorts were included in our analyses; a p-value of 7 × 10⁻⁶, reflecting a Bonferroni correction for 7086 genes harboring DNS in the case cohort, was deemed statistically significant.
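
The significance threshold follows directly from dividing a conventional alpha by the number of genes tested. The short sketch below reproduces that arithmetic and checks the two gene-level p-values reported in the Results against it; the alpha of 0.05 is an assumption, since the text states only the resulting threshold.

```python
ALPHA = 0.05    # assumed family-wise error rate
N_GENES = 7086  # genes harboring DNS in the case cohort

# Bonferroni correction: divide alpha by the number of tests.
threshold = ALPHA / N_GENES

# Gene-level p-values reported in the Results.
reported = {"PXDNL": 1.61e-6, "BMS1": 7.29e-7}
significant = {gene: p < threshold for gene, p in reported.items()}
```

With these inputs the threshold evaluates to roughly 7.06 × 10⁻⁶, matching the value quoted in the Results.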

Web resources

1000 Genomes, http://browser.1000genomes.org

ExAC Browser, http://exac.broadinstitute.org/

gnomAD Browser, http://gnomad.broadinstitute.org/

dbNSFP, http://varianttools.sourceforge.net/Annotation/DbNSFP

OMIM, http://www.omim.org/

UCSC Genome Browser, http://genome.ucsc.edu

HMZDelFinder, https://github.com/BCM-Lupskilab/HMZDelFinder