INTRODUCTION

Inherited axonopathies (IA) are a group of disorders unified by a common pathological mechanism: length-dependent axonal degeneration. They are traditionally classified into two broad genetic disorders, hereditary spastic paraplegia (HSP, OMIM PS303350) and Charcot–Marie–Tooth disease (CMT, OMIM PS118220), depending on upper or lower motor neuron involvement, respectively. Historically, CMT and HSP have been treated as distinct disorders, but their increasingly apparent clinical and genetic overlap challenges this classification. CMT and HSP can be caused by variants within the same gene (e.g., KIF1A, REEP1, and BSCL2), yet the additional factors that determine peripheral or central nerve involvement in each IA patient remain unclear. Currently, more than 50% of cases do not receive a genetic diagnosis from next-generation sequencing (NGS).1 The high percentage of genetically undiagnosed IA cases may result from undiscovered highly penetrant alleles in both known and yet to be associated disease genes, from cases without a true genetic etiology, or from variation that is currently difficult to detect and/or interpret, such as deep intronic, regulatory, or structural variants. Additionally, growing evidence suggests that rare mutational mechanisms or modes of inheritance that are likely overlooked in standard exome sequencing (ES) analysis may also contribute to the overall genetic etiology.2 Finally, environmental contributions to phenotypes are very difficult to assess and may have a larger than estimated importance in IA patients.

The phenotypic variability and reduced penetrance observed within IA support the possibility of multilocus inheritance or genetic modification. These two events are closely related and both result in phenotypic effects that are caused by more than a single Mendelian allele. A distinction between these processes lies in the sufficiency of the primary allele to cause disease.3 If the presence of the primary allele alone manifests the phenotype, then the secondary allele is a genetic modification of the phenotype, such as the severity of progression or the age at onset.3 However, if the presence of an allele in a second gene or multiple genes is required to cause disease, then inheritance is multilocus in nature.3 Non-Mendelian modes of inheritance have been independently demonstrated in both CMT and HSP,4,5 but cohort-level sequencing analyses are limited. In this pilot study, we gathered over 800 exomes from IA cases to determine whether multilocus inheritance warrants deeper investigation in classically Mendelian disease groups.

MATERIALS AND METHODS

Ethics statement

The cases were collected from the Inherited Neuropathies Consortium, the University of Miami (UM), Children’s Hospital of Philadelphia (CHOP), the University Hospital Tübingen, and McGill University. All participating individuals gave informed consent prior to initiating this study in agreement with the institutional review boards.

Methods

Families included in the study are affected by IA (either CMT or HSP). CMT cases were diagnosed with CMT (type 1, 2, 4, or intermediate), distal hereditary motor neuropathy, hereditary sensory autonomic neuropathy, or hereditary sensory neuropathy; HSP cases were diagnosed with pure or complicated HSP. ES was performed at UM (CMT and HSP cases), McGill (HSP cases), and at CHOP (controls). Enrolled cases had previously negative testing for key IA genes; however, solved research cases were included in the cohort (8.7% [30/343] CMT cases were solved across 21 disease genes while 5.8% [30/515] HSP cases were solved across 19 disease genes). All samples were sequenced on Illumina HiSeq2000 and joint-genotyped according to the GATK (v.3.3) germline ES best practices. After extensive quality control (including duplication percentage, sex and relatedness, depth and missingness metrics, ancestry), the cohort contained 343 CMT cases, 515 HSP cases, and 935 non-neurological controls of predominantly European ancestry.

To detect risk alleles, a gene-based rare variant association test was performed by the C-alpha test in the PLINK/SEQ suite. Following recommended protocol, tests with an i-statistic greater than 10⁻³ were removed, and Bonferroni correction was applied.6 To compare the mutational burden across known disease genes (CMT: n = 88, HSP: n = 95), the number of rare variants (nonsynonymous or loss-of-function [LoF] at ExAC minor allele frequency [MAF] ≤0.01 and ≤0.001) within disease genes was computed for each sample, and the average counts were compared between case and control using a Mann–Whitney–Wilcoxon test followed by 10,000 iterations of affection permutation for significance. To assess multilocus inheritance, the number of known disease genes carrying at least one qualifying variant was determined per sample. Case and control carrier status was organized into 2 × 2 contingency tables and assessed by Fisher’s exact test.
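As an illustration of the burden comparison described above, the following is a minimal sketch of a Mann–Whitney U statistic combined with case/control label permutation. It is not the pipeline used in the study: the function names, the one-sided direction (cases greater than controls), the add-one empirical p value, and the fixed random seed are all assumptions made for the example.

```python
import random


def rank_sum_u(cases, controls):
    """Mann-Whitney U for the case group, using midranks for tied values."""
    # Tag each value with its group: 0 = case, 1 = control.
    pooled = sorted([(v, 0) for v in cases] + [(v, 1) for v in controls])
    n = len(pooled)
    case_rank_sum = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j (1-based) share the same midrank.
        midrank = (i + 1 + j) / 2.0
        n_cases_in_run = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        case_rank_sum += midrank * n_cases_in_run
        i = j
    n1 = len(cases)
    return case_rank_sum - n1 * (n1 + 1) / 2.0


def permutation_p(cases, controls, n_iter=10000, seed=0):
    """Empirical one-sided p value from shuffling affection status.

    Assumption: the alternative is 'cases carry more variants than controls'.
    """
    rng = random.Random(seed)
    observed = rank_sum_u(cases, controls)
    pooled = list(cases) + list(controls)
    n1 = len(cases)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if rank_sum_u(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)
```

Here `cases` and `controls` would be per-sample counts of qualifying variants across the disease gene list; any input values used to try the sketch are hypothetical.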

CMT disease genes

AARS, AIFM1, ARHGEF10, ATL3, ATP7A, CHCHD10, CLCF1, COX6A1, CRLF1, DCAF8, DCTN1, DGAT2, DHTKD1, DNAJB2, DNM2, DNMT1, DST, DYNC1H1, EGR2, ERBB3, FAM134B, FBLN5, FBXO38, FGD4, FIG4, GARS, GDAP1, GJB1, GNB4, HARS, HINT1, HK1, HSPB1, HSPB3, HSPB8, IGHMBP2, IKBKAP, INF2, KARS, LITAF, LMNA, LRSAM1, MARS, MED25, MFN2, MME, MORC2, MPZ, MTMR2, NAGLU, NDRG1, NEFH, NEFL, NGF, NTRK1, PDK3, PLEKHG5, PMP2, PMP22, PRPS1, PRX, RAB7A, SBF1, SBF2, SCN10A, SCN11A, SCN9A, SEPTIN, SETX, SH3TC2, SLC5A7, SMN1, SPTLC1, SPTLC2, SYT1, TRIM2, TRPA1, WNK1, YARS.

HSP disease genes

ADD3, AFG3L2, AIMP1, ALDH18A1, AP4B1, AP4E1, AP4M1, AP4S1, AP5Z1, ARSA, ATP13A2, ATXN1, ATXN2, ATXN3, AUH, B4GALNT1, C12ORF65, C19ORF12, CAPN1, CYP27A1, CYP2U1, CYP7B1, DARS2, DDHD1, DDHD2, ELOVL4, ENTPD1, ERLIN1, ERLIN2, FA2H, FAM126A, FBXO7, FLRT1, FXN, GAD1, GALC, GAN, GBA2, GFAP, GJC2, GLRX5, HSPD1, KANK1, KCNA2, KCND3, KIAA0196, KIF1C, KIF5A, KLC2, L1CAM, LMNB1, MAG, MTHFR, MTPAP, NIPA1, NT5C2, OPA1, OPA3, PDYN, PGAP1, PLA2G6, PLP1, PNPLA6, POLR3A, POLR3B, PPP2R2B, PRNP, RTN2, SACS, SLC16A2, SLC33A1, SPAST, SPG20, SPG21, SPG7, STUB1, SYNE1, TBP, TECPR2, TGM6, TUBB4A, VAMP1, VCP, VPS37A, VWA3B, ZFYVE26.

CMT and HSP disease genes

ATL1, BICD2, BSCL2, CCT5, KIF1A, REEP1, SPG11, TFG, TRPV4.

RESULTS

Association of EXOC4 with CMT cases

Exome-wide association analysis was performed at 17,637 protein coding loci by the C-alpha test. The PLINK/SEQ suite computes an estimate of the minimal achievable p value for a locus, the i-statistic. We followed the recommended protocol to filter out loci with an i-statistic greater than 10⁻³ before Bonferroni correction to remove noncontributing genes.6 Based on the 2145 remaining loci, the p value threshold for experiment-wide significance (α = 0.05) was 2.3 × 10⁻⁵. After filtering results by the PLINK/SEQ i-statistic and applying Bonferroni multiple-testing correction, three genes reached experiment-wide significance (Fig. 1a): KDM5A (p value = 9.9 × 10⁻⁷, odds ratio [OR] = 3.6), EXOC4 (p value = 6.9 × 10⁻⁶, OR = 2.6), and CEP78 (p value = 2.3 × 10⁻⁵, OR = 4.4). KDM5A and EXOC4 both contained a single allele in cases that drove the association: NM_001042603.1(KDM5A):c.11T>G and NM_021807.3(EXOC4):c.1648G>A. Sanger sequencing confirmed the driver allele in EXOC4 (Fig. 1b) and revealed a false positive exome call in KDM5A (indicating a false positive association to KDM5A). We did not follow up with CEP78 since the gene did not contain a single driving allele. At the EXOC4 gene level, heterozygous carriers were 2.6 (95% confidence interval [CI]: 1.28–5.37) times more likely to be affected, while at the driver allele level, heterozygous carriers were 9.07 times more likely to be affected (95% CI: 2.94–28.01) (Fig. 1c). The variant was a nonsynonymous missense change (p.Gly550Arg) predicted to be disease causing by MutationTaster2 with a gnomAD MAF of 0.00411.
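The significance threshold reported above follows directly from dividing α by the number of retained loci (0.05 / 2145). The sketch below reproduces that arithmetic and also shows one common way to attach a 95% CI to a carrier odds ratio, the Wald (Woolf) interval on the log scale. The text does not state which CI method was used, so the interval method is an assumption, and the carrier counts in the usage lines are hypothetical placeholders rather than data from this cohort.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test p value threshold for a family-wise error rate of alpha."""
    return alpha / n_tests


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval.

    Assumption: all four cells are non-zero; z = 1.96 gives a ~95% interval.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_carriers + 1 / case_noncarriers
        + 1 / control_carriers + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# 2145 loci remained after i-statistic filtering (value taken from the text).
threshold = bonferroni_threshold(0.05, 2145)

# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(10, 90, 5, 95)
```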

Fig. 1: Risk allele and multilocus inheritance in inherited axonopathies.

(a–c) Charcot–Marie–Tooth (CMT) gene-based rare variant association analysis. (a) Q–Q plot of the observed p values from C-alpha gene-based association analysis. Blue line indicates multiple testing correction threshold. Known CMT genes with nominal significance are annotated. (b) Transcript model of EXOC4 annotated with variant positions and counts (green bubble). (c) Heterozygous carrier risk for CMT at gene and variant level. CI confidence interval, OR odds ratio. (d, e) Cumulative mutational burden across disease genes. Distribution of the average count of qualifying variants in known hereditary spastic paraplegia (HSP) and CMT disease genes per case at 1% and 0.1% ExAC minor allele frequency (MAF) for (d) nonsynonymous and (e) loss-of-function variation. Difference in case/control distribution tested with Mann–Whitney U test (*p value ≤ 0.05). (f–i) Multilocus variant counts across disease genes. Proportion (f, g) and absolute counts (h, i) of cases carrying nonsynonymous (f, h) and loss-of-function (g, i) variants in the indicated number of mutated disease genes (1, 2, 2+, or 3+) at 0.1% and 1% ExAC MAF.

Increased mutational burden across known disease genes in IA cases

IA cohorts were independently tested for a mutational burden (an excess of rare variants) across known disease genes. In our CMT and HSP cohorts, we identified a significant mutational burden (Mann–Whitney, nominal p value ≤ 0.05) in each tested variant set (nonsynonymous [NS] and LoF variation at ExAC MAF ≤ 0.001 or ≤ 0.01) (Fig. 1d, e). As a further test of our observations, we repeated the mutational burden comparison with permuted case/control status over 10,000 iterations. Each tested variant set remained statistically significant (empirical p value ≤ 0.05), supporting that the mutational burden found across disease genes is specific to each IA cohort. The number of samples with variant counts above the 3 standard deviation upper limit (3 SD) is as follows [cohort, n, 3 SD]: HSP NS 1%: [cases, 3, 9.8], [controls, 4, 8.1]; HSP NS 0.1%: [cases, 6, 7.5], [controls, 10, 5.4]; HSP LoF 1%: [cases, 9, 2.8], [controls, 19, 1.4]; HSP LoF 0.1%: [cases, 9, 2.7], [controls, 17, 1.3]; CMT NS 1%: [cases, 2, 8.3], [controls, 10, 7.7]; CMT NS 0.1%: [cases, 2, 5.5], [controls, 6, 5.1]; CMT LoF 1%: [cases, 5, 1.4], [controls, 11, 1.2]; CMT LoF 0.1%: [cases, 5, 1.3], [controls, 11, 1.1].
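The 3 SD upper limits listed above can be read as the group mean plus three standard deviations of the per-sample variant count. A minimal sketch of that calculation follows. Whether the sample or the population standard deviation was used is not stated, so the use of the sample standard deviation here is an assumption, as are the function names.

```python
import statistics


def upper_limit_3sd(counts):
    """Mean plus three sample standard deviations of per-sample counts."""
    return statistics.mean(counts) + 3 * statistics.stdev(counts)


def n_above_limit(counts):
    """Number of samples whose count exceeds the 3 SD upper limit."""
    limit = upper_limit_3sd(counts)
    return sum(1 for c in counts if c > limit)
```

Applied separately to cases and controls for each variant set, this would yield one `n` and one limit per group, in the same layout as the bracketed triples in the text.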

Multilocus inheritance suggested in IA cases

Next, we sought to determine whether the observed mutational burden was more likely to follow a monogenic (single gene), digenic (two genes), or oligogenic (more than two genes) inheritance. Unlike the mutational burden, the significance of each inheritance type was influenced by the MAF (Fig. 1f, g). HSP cases showed consistent evidence for oligogenic inheritance (≥3 genes) of NS variation and monogenic inheritance (1 gene) of LoF variation at both ExAC MAF ≤ 0.01 and ≤ 0.001 (Fisher’s exact, p value ≤ 0.05). HSP cases also displayed significant di/oligogenic inheritance (≥2 genes) of NS variation at the less common ExAC MAF ≤ 0.001 (p value ≤ 0.05). Furthermore, di/oligogenic inheritance of both NS and LoF variation for HSP cases is suggested at ExAC MAF ≤ 0.01 (p value = 0.0598 and 0.0572, respectively). Evidence for inheritance types in CMT was not as consistent as in HSP, possibly due to a lower CMT sample size. At ExAC MAF ≤ 0.01, CMT cases demonstrated monogenic inheritance for LoF variation and oligogenic inheritance for NS variation (p value ≤ 0.05) with potential di/oligogenic inheritance for NS variation (p value = 0.521). Lastly, at ExAC MAF ≤ 0.001, CMT cases only showed significant evidence for monogenic inheritance of NS variation (p value ≤ 0.05) with potential evidence for oligogenic NS inheritance and monogenic LoF inheritance (p value = 0.0641 and 0.0536, respectively). The counts of samples carrying variants are summarized in Fig. 1h–i.
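The inheritance-type comparisons above rest on 2 × 2 tables of case and control carrier status, tested with Fisher's exact test. The sketch below shows how such a table could be built from the number of mutated disease genes per sample and how a two-sided p value can be computed from the hypergeometric distribution. The helper names, the two-sided definition (summing all tables no more probable than the observed one), and the example predicate are assumptions for illustration.

```python
from math import comb


def carrier_table(case_gene_counts, control_gene_counts, qualifies):
    """Build (a, b, c, d): qualifying/non-qualifying cases, then controls.

    Each input holds, per sample, the number of disease genes that carry
    at least one qualifying variant.
    """
    a = sum(1 for g in case_gene_counts if qualifies(g))
    b = len(case_gene_counts) - a
    c = sum(1 for g in control_gene_counts if qualifies(g))
    d = len(control_gene_counts) - c
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x qualifying samples among cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as probable as, or less probable than, the
    # observed one; the small tolerance guards against float rounding.
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Example predicate: oligogenic carriers, i.e. three or more mutated genes.
def is_oligogenic(n_genes):
    return n_genes >= 3
```

The monogenic and digenic comparisons would use the same functions with a different predicate, for example exactly one gene or at least two genes.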

DISCUSSION

As the cost of NGS continues to drop and its availability grows, we are now reaching sample sizes large enough to apply statistical approaches to rare diseases. In this study, we sought to assess the mutational burden and multilocus involvement of rare variation in a cohort of inherited axonopathies as well as identify potential risk loci.

To identify genes that could potentially carry non-Mendelian risk alleles, we performed an unbiased exome-wide rare variant burden analysis with the C-alpha test. After filtering results and performing Sanger sequencing, EXOC4 stood out as a candidate CMT gene. EXOC4 is involved in vesicle transport and membrane tethering in polarized cells and is expressed in Schwann cells.7 In a CMT4B1 mouse model, Exoc4 (Sec8) formed a complex with Mtmr2 and Dlg1 to coordinate homeostatic control of myelination.7 Exoc4 is abundantly expressed at the Drosophila neuromuscular junction and required for in vivo regulation of synaptic microtubule formation.8 Furthermore, Exoc4 is suggested to play a central role in oligodendrocyte membrane formation through the regulation of vesicular transport of myelin proteins.9 Although EXOC4 has biological plausibility, this result should be interpreted with a degree of caution. Stronger genetic evidence for EXOC4 is necessary, including replication of the association or identification of highly penetrant Mendelian variants. Unfortunately, a secondary large CMT exome cohort does not currently exist for follow-up replication analysis.

From the rare variant burden analysis, we were also able to re-identify several established monogenic CMT2 genes, including MME,10 MORC2,11 and MFN2.12 This is despite a general effort to exclude cases with MFN2 and other common CMT genes from exome analysis. Similarly, known familial amyotrophic lateral sclerosis (ALS) genes showed strong associations in a gene-based rare variant burden analysis of sporadic ALS cases.13 These results give us confidence in the utility of association studies in rare disease cohorts, and may indicate the presence of additional risk alleles contributing to the phenotype in these known CMT genes. We interpret these results as additional evidence supporting cohort-level statistical approaches to identify Mendelian and non-Mendelian factors involved in classically monogenic disease.

Additionally, we observed a significant mutational burden across CMT and HSP disease genes in cases compared with non-neurological controls. The aggregation of rare, damaging alleles in disease-associated genes may contribute to risk, severity, and clinical heterogeneity. This inheritance model has been suggested in CMT based on exome sequencing from 37 individuals.14 Gonzaga-Jauregui et al. observed an average of 1.8 variants per case across 58 neuropathy genes compared with 1.3 variants per control. They followed up this observation with a second small cohort of 32 cases of Turkish descent, and observed a mutational burden of 2.1 versus 1.6 nonsynonymous rare variants in cases versus controls. The mutational burden hypothesis was functionally evaluated in vivo in zebrafish experiments, which resulted in increased phenotypic severity when pairs of neuropathy genes were inactivated.13 Our cohort is roughly ten times larger than the previous cohorts, and is now the third independent CMT cohort to support the mutational burden hypothesis. Furthermore, rare nonsynonymous variation was also significantly distributed across two or three disease genes in our cohort, indicating multilocus inheritance—which remains underexplored in rare diseases because of functional validation challenges. Additional variants in multiple disease genes can have either a combinatorial effect on the same biological pathways or a destabilizing effect on the entire disease module.

The primary goal of this study was to move beyond the “one disease–one gene” model to assess an expanded genetic architecture in IA. An appreciation for the extent of allelic and locus heterogeneity, reduced penetrance, and variable expressivity within IA has come from traditional family-based approaches. These insights across Mendelian diseases are driving the genetics community to delineate the more complicated and nuanced patterns of inheritance. First, a gene-based variant burden test was successfully applied to a cohort of ALS cases and identified a new risk gene.13 Using this approach, we observed an enrichment of qualifying variants (in a candidate gene and in known disease genes) that influence disease risk. We are extremely cautious about overstating any potential involvement of EXOC4 in disease pathogenesis. However, we interpret these results as evidence supporting the hypothesis of “risk alleles” in IA. Second, a mutational burden that can modulate phenotypic severity was observed in two small CMT cohorts, and the increased burden of protein-altering variants was functionally tested in a zebrafish model and demonstrated phenotypic modification.13 We have replicated this finding in a larger cohort of CMT cases and discovered a similar result in HSP cases. Beyond CMT and HSP, a rare variant aggregation has also been shown to influence susceptibility to Parkinson disease, and the age of onset of ALS.15

These results are subject to several limitations. First, copy-number variation (CNV) and structural variation (SV) were not included in this exome pipeline due to the high false positive rate of CNV calls from short-read NGS.11 Family-based study designs with genomic regions of interest allow for high confidence filtering approaches. However, with a proband-only design, we were concerned about identifying false associations from false positive calls. Given the importance of SV and CNV in human disease, future studies should consider genome sequencing, long-read technology, or family-based approaches. Another limitation of short-read NGS technology is the difficulty of phasing rare variation. As such, we were unable to identify compound heterozygous variation. Finally, this study is limited by the availability of a replication cohort for rare disease. Until the EXOC4 risk allele is replicated or the EXOC4 locus is supported by additional genetic evidence, this association can only be clarified to a point in functional in vitro or in vivo studies.

Concluding remarks

Concepts such as risk alleles, mutational burden, and multilocus inheritance within rare Mendelian diseases lie at the intersection of rare and common diseases. Recent discoveries have shed light on the architecture of common disease, including increased risk for a common disease from heterozygous alleles in recessive Mendelian genes.16 However, the impacts of multilocus inheritance on Mendelian disease, including phenotypic severity, oligogenic inheritance, blended phenotypes, and phenotypic expansion, require further exploration.15 Investigating these non-Mendelian concepts will lead to a unified model of human disease and facilitate precision genetic therapies. Here, we continue pushing these boundaries in IA, suggest potential involvement of EXOC4 in disease pathogenesis of CMT, and provide further evidence supporting a multilocus Mendelian model. Clinicians should be aware of these developments when interpreting negative genetic testing results, and future studies will investigate specific combinations of risk alleles with potential clinical actionability.