Estimating the occurrence of primary ubiquinone deficiency by analysis of large-scale sequencing data

Hughes, Bryan G.; Harrison, Paul M.; Hekimi, Siegfried

doi:10.1038/s41598-017-17564-y

Article
Open access
Published: 18 December 2017

Scientific Reports volume 7, Article number: 17744 (2017)

Abstract

Primary ubiquinone (UQ) deficiency is an important subset of mitochondrial disease that is caused by mutations in UQ biosynthesis genes. To guide therapeutic efforts, we sought to estimate the number of individuals who are born with pathogenic variants likely to cause this disorder. We used the NCBI ClinVar database and literature reviews to identify pathogenic genetic variants that have been shown to cause primary UQ deficiency, and used the gnomAD database of full genome or exome sequences to estimate the frequency of both homozygotes and compound heterozygotes within seven genetically-defined populations. We used known population sizes to estimate the number of afflicted individuals in these populations and in the mixed population of the USA. We then performed the same analysis on predicted pathogenic loss-of-function and missense variants that we identified in gnomAD. When including only known pathogenic variants, our analysis predicts 1,665 affected individuals worldwide and 192 in the USA. Adding predicted pathogenic variants, our estimate grows to 123,789 worldwide and 1,462 in the USA. This analysis predicts that there are many undiagnosed cases of primary UQ deficiency, and that a large proportion of these will be in developing regions of the world.

Introduction

Mitochondrial disease is a complex and heterogeneous collection of disorders that can result in death or prolonged disability. The prevalence of these disorders could be as high as 1 in 4,300 individuals, which makes it one of the most common forms of inherited illness^1,2. There is no general cure for these disorders but the most widespread treatments are vitamin and nutritional supplementation, most commonly with L-carnitine, creatine and ubiquinone (UQ; a.k.a. Coenzyme Q), despite the fact that there is little evidence supporting their effectiveness^3,4.

UQ is a redox-active lipid-like molecule that plays a number of critical roles in biological membranes. Its best characterized role is as a key electron carrier of the mitochondrial electron transport chain⁵. UQ is also a co-factor in a number of other enzymatic processes as well as a potential membrane antioxidant. The rationale for generalized UQ supplementation in mitochondrial disease is thus the hope that it might support mitochondrial function, and that its antioxidant function could ameliorate any increase in oxidative stress. Furthermore, UQ deficiency secondary to other mitochondrial defects is observed in a substantial subset of mitochondrial disease patients⁶.

There is, however, one patient population that could directly benefit from effective UQ supplementation: individuals suffering from primary UQ deficiency due to mutations in genes required for UQ biosynthesis. Although such patients have been much discussed^7,8,9, we are not aware of any formal attempt to estimate the prevalence of primary UQ deficiency. At this point, approximately 70 patients have been described in the published literature, and it has been informally estimated that their prevalence may be less than 1 in 100,000⁸. Despite clear genetic evidence that UQ deficiency is the primary cause in these patients, UQ supplementation has not met with consistent success, possibly due to poor bioavailability of the highly lipophilic UQ molecule^10,11. A better understanding of the possible prevalence of this disorder would help guide decisions regarding investigations into novel UQ formulations or potential drugs which could modulate the UQ biosynthesis pathway.

UQ is composed of a redox-active benzoquinone ring with a lipid tail consisting of a species-specific number of isoprenoid sub-units (ten in humans). Although UQ biosynthesis has been most extensively studied in yeast, human homologues of the critical genes have been identified^7,8,9. Thirteen yeast genes are required for UQ biosynthesis (COQ1 – COQ11, YAH1, ARH1). In brief, COQ1 (or the human homologues PDSS1 and PDSS2 acting as a hetero-tetramer) assembles an isoprenoid tail from precursors produced by the mevalonate pathway. COQ2 joins this isoprenoid tail to a tyrosine-derived benzoquinone ring precursor, and COQ3, COQ5, COQ6 and COQ7 are responsible for various methylation and hydroxylation reactions affecting the benzoquinone ring. COQ8 appears to play a regulatory role by modulating phosphorylation of COQ3, COQ5 and COQ7. COQ8 has two human homologues, COQ8A (also known as ADCK3 or CABC1) and COQ8B (ADCK4), and mutations in either can independently result in UQ deficiency^12,13. The roles of COQ4 and COQ9 are not well defined, although COQ4 appears to play a role in the assembly of COQ2 – COQ7 into a complex and COQ9 is required for COQ7 function. YAH1 (human homologue FDX1L) and ARH1 (FDXR) transfer electrons to COQ6, while also participating in other pathways. There are two modification steps of the UQ benzoquinone ring that have yet to be assigned an enzyme.

To date, pathogenic variants in nine of these genes (PDSS1, PDSS2, COQ2, COQ4, COQ6, COQ7, COQ8A, COQ8B and COQ9) have been shown to cause UQ deficiency in human patients^7,9. We sought to leverage the recent availability of exome or genome sequences of very large numbers of individuals in order to estimate the frequency of known pathogenic variants in these genes. We used the NCBI ClinVar database¹⁴ and conducted a literature search to identify variants in the known UQ biosynthesis genes that result in illness and UQ deficiency. The gnomAD exome and genome database¹⁵, with sequences for 138,632 individuals divided into seven genetically-distinct populations, was used to estimate the frequencies of these variants. Using these frequencies, we estimated the birth prevalence of individuals homozygous or compound heterozygous for known or predicted pathogenic genetic variants for primary UQ deficiency (assuming Hardy-Weinberg equilibria) on a population-by-population basis and used known population sizes and distributions to estimate the actual numbers of afflicted individuals due to each variant world-wide, as well as in a population with the particular size and mix of the USA. Importantly, calculating the number of afflicted individuals on a per-variant, per-population basis eliminates a potential confounding factor when working with large numbers of variants present at very low frequencies: many individual variants may be too rare to result in any homozygous or compound heterozygous individuals, yet the traditional method of summing their frequencies could yield a combined frequency high enough to artificially suggest that individuals are affected.

It is likely that many pathogenic variants simply have not been clinically documented at this relatively early stage in our awareness of primary UQ deficiency. To account for this, we also estimated the number of individuals who would be homozygous or compound heterozygous for variants observed in gnomAD but that have not yet been observed in the clinic, focusing on predicted loss-of-function (LoF) or pathogenic missense mutations.

There are many challenges to making estimates of this nature. For example, it is not possible to conclusively determine the pathogenicity of missense variants based on sequence information alone. We attempt to address this by conservatively including only those variants independently predicted to be pathogenic by two separate bioinformatic algorithms (see Methods). There is also extreme variability in severity of primary UQ deficiency, ranging from neonatal lethality (with mouse studies suggesting that embryonic lethality is a possible outcome for null alleles for some genes^16,17,18,19) to mild disease that becomes apparent only in later decades of life. This makes accurate predictions of disease prevalence based on allelic frequencies extrapolated from public databases of genomic variants challenging, which is why our results are best interpreted as birth prevalence of individuals homozygous or compound heterozygous for variants likely to cause disease. Actual disease prevalence would be expected to diverge from these estimates. We discuss these issues in greater detail below.

We found that the carrier frequencies for most previously identified pathogenic variants were low (averaging 1/6,420 for the populations in which they were present), and given known population sizes we estimated they would result in a total of 1,016 individuals worldwide due to homozygosity and an additional 649 due to compound heterozygosity, with a total of 192 in the USA. The addition of all predicted loss-of-function and pathogenic missense variants results in a predicted total of 123,789 individuals worldwide and 1,462 in the USA.

Methods

Identification of known pathogenic variants

We identified pathogenic variants of UQ biosynthesis genes (PDSS1, PDSS2, COQ2 – COQ7, COQ8A/ADCK3, COQ8B/ADCK4 and COQ9) using the NCBI ClinVar database and via PubMed literature searches. ClinVar is a public archive (https://www.ncbi.nlm.nih.gov/clinvar/) describing human genetic variants and their relationship to human health¹⁴. Variants are extracted from the peer-reviewed literature or directly reported by CLIA-certified or ISO 15189-accredited clinical testing laboratories. Variant pathogenicity is reported by the submitter according to the ordinal scale recommended by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (“pathogenic”, “likely pathogenic”, “uncertain significance”, “likely benign” or “benign”)²⁰. Note that ClinVar results cannot be used to directly estimate birth prevalence and the database does not include fields for incidence frequency. Each ClinVar entry describes a unique variant, and may be derived from multiple submissions.

We queried ClinVar (search conducted on 2017-03) for each gene (e.g., ‘COQ2[gene]’) and identified pathogenic variants using the following inclusion criteria:

(i) At least one submission describes the variant as “pathogenic” or “likely pathogenic”.

(ii) No submitter assigns a significance as “benign” or “likely benign”.

(iii) Variant only affects one gene (i.e., no multi-gene deletions or duplications).
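
The three criteria above amount to a simple record filter. The following is a minimal sketch only: `ClinVarRecord`, its fields and the example records are hypothetical simplifications introduced for illustration, not the real ClinVar schema and not the procedure the authors actually ran (they also reviewed every retained record manually).

```python
from dataclasses import dataclass

PATHOGENIC = {"pathogenic", "likely pathogenic"}
BENIGN = {"benign", "likely benign"}


@dataclass
class ClinVarRecord:
    # Hypothetical, simplified stand-in for one ClinVar entry.
    variant: str
    genes: list        # genes overlapped by the variant
    assertions: list   # one clinical-significance label per submission


def meets_inclusion_criteria(rec):
    labels = {a.lower() for a in rec.assertions}
    has_pathogenic = bool(labels & PATHOGENIC)   # criterion (i)
    no_benign = not (labels & BENIGN)            # criterion (ii)
    single_gene = len(set(rec.genes)) == 1       # criterion (iii)
    return has_pathogenic and no_benign and single_gene


# Made-up records, one per way of passing or failing the criteria.
records = [
    ClinVarRecord("variant_A", ["COQ2"], ["Pathogenic"]),
    ClinVarRecord("variant_B", ["COQ8A"], ["Pathogenic", "Benign"]),
    ClinVarRecord("variant_C", ["COQ3", "OTHER_GENE"], ["Pathogenic"]),
    ClinVarRecord("variant_D", ["COQ4"], ["Uncertain significance"]),
]
kept = [r.variant for r in records if meets_inclusion_criteria(r)]
print(kept)
```

Only `variant_A` survives: `variant_B` has a conflicting benign assertion, `variant_C` spans more than one gene, and `variant_D` has no pathogenic assertion.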

Complete records for all variants meeting our inclusion criteria were manually reviewed, including confirming that the record matches the description in any cited studies. To ensure as complete a record of known variants as possible, we also conducted a systematic literature search via PubMed (search conducted on 2017-01), where we reviewed all clinical studies in the search results for each gene name.

COQ2 transcript start

It was recently shown that the canonical COQ2 transcript, used by most previous studies, erroneously includes a 150 base N-terminal region that is only rarely, if ever, transcribed in humans²¹. Thus, any variant involving this region is unlikely to be pathogenic. For ease of comparability with previous studies we have retained the canonical numbering, but we have not included any variant affecting this region. For this reason, we did not consider variants such as p.Ala17Argfs (ClinVar allele ID 237155).

Estimation of variant birth prevalence

To determine the birth prevalence of these variants we used the Genome Aggregation Database (gnomAD, http://gnomad.broadinstitute.org/), an updated version of the previously released dataset from the Exome Aggregation Consortium (ExAC)¹⁵. The gnomAD release includes exome or genome sequences from a total of 138,632 individuals without severe pediatric diseases. The database assigns ancestry by a principal components analysis based upon a subset of samples of known ancestry. Most sequences clustered into one of seven geographic or endogamous groups (African, Ashkenazi Jews, East Asian, Finnish, non-Finnish Europeans, Latin Americans, and South Asian), with the remainder (3,234) considered to be ‘other’. Hence, almost all samples included in this dataset have an ancestry that is well-defined on a genetic basis. This dataset has undergone extensive quality control measures to remove poor-quality sequences and related individuals and to flag variants of questionable reliability¹⁵, and assessments of variant pathogenicity are provided via the SIFT and PolyPhen2 tools.

When querying gnomAD for each of our genes of interest, variants affecting protein-coding regions were considered equivalent to known pathogenic variants if they resulted in the same change to protein structure (i.e., the same amino acid conversion, a stop codon introduced in the same location, or a frameshift resulting in the same residue changes). For variants affecting splice sites, only variants that exactly matched the nucleotide changes of the pathogenic variants were included. We only considered variants that had passed gnomAD random forest filters.

We acquired estimates of population sizes from various sources (Table S1). The population estimates summed to 6 billion, accounting for 80% of total global population. Estimates of population sizes within the USA summed to 309 million (vs. a total population of approx. 319 million). To estimate the number of affected individuals, we used the individual frequency of each variant in each population (i.e., not summed frequencies), and the estimate for each variant in each population was rounded down to the nearest whole number.
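
To illustrate why the per-variant calculation matters, the sketch below contrasts it with squaring a summed frequency under Hardy-Weinberg assumptions. The allele frequencies and population size are invented for illustration and are not taken from gnomAD or Table S1; the function name is likewise an assumption.

```python
import math


def expected_homozygotes(allele_freq, population_size):
    """Hardy-Weinberg homozygotes (q^2 * N), rounded down to whole individuals."""
    return math.floor(allele_freq ** 2 * population_size)


# Hypothetical allele frequencies of three rare variants in one population.
freqs = [1e-5, 2e-5, 1.5e-5]
population = 1_000_000_000  # hypothetical population size

# Per-variant approach: each variant is evaluated and rounded down on its own.
per_variant = sum(expected_homozygotes(q, population) for q in freqs)

# Summed-frequency approach: the frequencies are pooled before squaring.
summed = expected_homozygotes(sum(freqs), population)

print(per_variant, summed)
```

With these made-up inputs no single variant is common enough to yield even one expected homozygote, whereas the pooled frequency predicts two individuals. Note that the squared sum also contains the cross terms 2·qᵢ·qⱼ, which correspond to compound heterozygotes and are handled as a separate calculation in this study.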

Rates of compound heterozygosity were determined using data tables of missense and loss-of-function variants for each gene obtained from the gnomAD browser. An R script (available upon request) was written to systematically strip out undesirable variants (e.g., those affecting non-canonical transcripts) and make the multiple comparisons required. We calculated the predicted frequency of compound heterozygotes within each population in which it was possible (i.e., some variants were not observed in the same population, making compound heterozygosity for that variant impossible in that population). When determining rates of compound heterozygosity for the group of predicted loss-of-function (LoF) and pathogenic missense variants, we included known pathogenic/predicted LoF variant pairs in our calculations.
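
The authors' R script is not reproduced in the article. As a rough sketch of the kind of pairwise calculation described, and assuming random mating and that only pairs of variants within the same gene are combined, the expected frequency of compound heterozygotes in a population is the sum of 2·qᵢ·qⱼ over all pairs of distinct variants observed there. The data layout and values below are hypothetical.

```python
from itertools import combinations


def compound_het_frequency(freqs):
    """Sum of 2*q_i*q_j over all unordered pairs of distinct variants."""
    return sum(2 * qi * qj for qi, qj in combinations(freqs, 2))


# Hypothetical allele frequencies for one gene: population -> {variant: q}.
gene_freqs = {
    "pop_A": {"var1": 1e-3, "var2": 2e-3, "var3": 0.0},
    "pop_B": {"var1": 5e-4, "var2": 0.0, "var3": 0.0},
}

result = {}
for pop, by_variant in gene_freqs.items():
    # Variants never observed in a population cannot form a pair there.
    observed = [q for q in by_variant.values() if q > 0]
    result[pop] = compound_het_frequency(observed)

print(result)
```

In `pop_B` only one variant is present, so no compound heterozygotes are possible and the frequency is zero, which mirrors the "impossible in that population" case mentioned above.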

Pearson’s chi-square test of goodness of fit was calculated in Excel 14.0.7180.5002 (Microsoft, USA), and other calculations were performed in R. 95 percent confidence intervals were calculated with the exact binomial test.
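
For readers unfamiliar with the exact binomial interval, the sketch below implements the Clopper-Pearson construction with the standard library only, solving for the bounds by bisection. It is an illustrative re-implementation, not the authors' R code, and the allele counts in the usage lines are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_increasing(g, target, iterations=100):
    """Find p in [0, 1] with g(p) = target, for g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical example: 39 variant alleles among 250,000 genotyped alleles.
low, high = exact_binomial_ci(39, 250_000)
print(f"allele frequency 95% CI: {low:.2e} to {high:.2e}")
```

For very rare variants the interval is wide relative to the point estimate, which is one reason estimates built from low allele counts carry substantial uncertainty.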

Identification of predicted pathogenic variants

To identify predicted pathogenic variants in the gnomAD database, we first excluded variants that did not pass quality-control filters and those in non-canonical transcripts (as defined by gnomAD, the canonical transcript is the longest consensus coding sequence translation with no stop codons). To identify LoF variants, we extracted those annotated as “stop gained”, “frameshift”, “splice donor” or “splice acceptor” and excluded variants which gnomAD had flagged as low-confidence LoF. To identify the missense variants that were most likely to be pathogenic, we extracted only those variants for which gnomAD reported an assessment of “probably damaging” and “deleterious” by PolyPhen2 and SIFT respectively. To reduce the risk of obtaining false positives, we excluded variants with high minor allele frequencies (MAF). Although a MAF cut-off of 0.5% has been suggested²², we chose a more conservative approach, instead using the highest observed MAF in the list of “known” pathogenic variants as a threshold: thus variants with a global MAF greater than 0.019% or a MAF for any population greater than 0.31% were excluded.
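
The selection rules in this paragraph can be summarised as a single predicate. This is a sketch under stated assumptions: the dictionary keys are hypothetical field names chosen for readability and do not correspond to the real gnomAD export columns; only the annotation labels and the two MAF thresholds come from the text.

```python
LOF_ANNOTATIONS = {"stop gained", "frameshift", "splice donor", "splice acceptor"}
GLOBAL_MAF_MAX = 0.019 / 100       # 0.019 %, threshold quoted in the text
POPULATION_MAF_MAX = 0.31 / 100    # 0.31 %, threshold quoted in the text


def is_predicted_pathogenic(v):
    """v is a dict with hypothetical keys, not the real gnomAD schema."""
    if not v["passes_qc"] or not v["canonical_transcript"]:
        return False
    if v["global_maf"] > GLOBAL_MAF_MAX:
        return False
    if max(v["population_mafs"].values()) > POPULATION_MAF_MAX:
        return False
    if v["annotation"] in LOF_ANNOTATIONS:
        return not v["low_confidence_lof"]
    if v["annotation"] == "missense":
        # Both tools must agree on the most severe category.
        return v["polyphen2"] == "probably damaging" and v["sift"] == "deleterious"
    return False


# Made-up variants built from a shared template.
base = {
    "passes_qc": True, "canonical_transcript": True, "low_confidence_lof": False,
    "global_maf": 1e-5, "population_mafs": {"pop_A": 2e-5, "pop_B": 0.0},
    "polyphen2": None, "sift": None,
}
examples = {
    "lof_ok": {**base, "annotation": "frameshift"},
    "missense_both": {**base, "annotation": "missense",
                      "polyphen2": "probably damaging", "sift": "deleterious"},
    "missense_one_tool": {**base, "annotation": "missense",
                          "polyphen2": "probably damaging", "sift": "tolerated"},
    "too_common": {**base, "annotation": "missense",
                   "polyphen2": "probably damaging", "sift": "deleterious",
                   "population_mafs": {"pop_A": 0.005}},
}
selected = sorted(name for name, v in examples.items() if is_predicted_pathogenic(v))
print(selected)
```

Requiring agreement between both predictors trades sensitivity for specificity, which matches the conservative intent described above.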

Data Availability

The datasets analyzed during the current study are available at www.ncbi.nlm.nih.gov/clinvar/ and http://gnomad.broadinstitute.org/.

Results

Through ClinVar, we identified 552 reported genetic variants affecting UQ biosynthesis genes (for complete listing, see File S1). Of these, 143 were deletions or duplications affecting multiple genes (all 17 variants reported for COQ3 and COQ5 fell into this category), and were not considered further because the pathogenicity of these variants could potentially be related to the activity of multiple genes. Of the remainder, 315 were excluded because submitters did not assess them as pathogenic (in only one case, ClinVar variation 3645, COQ8A p.Phe331=, were there both pathogenic and benign interpretations; in this case, a MAF of 1.57% supports the benign interpretation). Fourteen of those remaining were subsequently excluded because close inspection of the records and cited works revealed a number of problems, including single-copy variants not consistent with the typically recessive nature of UQ deficiencies (4 records), duplicate records (4), risk factors mis-categorized as causative pathogenic variants (2), an incomplete ClinVar entry (1), a multi-variant haplotype not testable in gnomAD (1), reliance on a secondary, unreferenced, source (1), and one variant present in the untranscribed N-terminal region of COQ2 (see Methods).

Of the remaining 80 records, the majority (49) had been extracted from the peer-reviewed literature, 22 were from the genetic testing company GeneDx (MD, USA), with the remainder from 6 other testing labs (see Table S2 for detailed information). GeneDx and five other testing labs provided detailed assertion criteria for the determination of variant pathogenicity, all adhering to established standards.

To account for the possibility that not all known pathogenic variants are included in ClinVar, we carried out an independent review of the literature, identifying 17 additional pathogenic variants: 2 affecting COQ2, 3 COQ4, 1 COQ6, 9 COQ8A, and 2 COQ8B (see Table S2 for literature references).

In total, we identified 97 pathogenic variants. Of these, 57 resulted in a single residue substitution, 21 in frameshifts, 10 premature stop codons, 7 variants altering splice-site donor or acceptor regions in ways predicted to be pathogenic, and 3 single-residue indels (see Table S2 for a complete listing of all identified known pathogenic variants). COQ8A was most frequently affected, with 40 variants.

To better understand the birth prevalence of these variants we queried the gnomAD exome and genome database. We found 441 carriers, with 49 of 97 pathogenic variants present (Table 1). No variants were present in homozygous form, all missense variants were predicted to be damaging by PolyPhen2, SIFT, or both, and all premature stop, frameshift or splice site-disrupting variants were predicted to be high-confidence loss-of-function. All of these findings are fully consistent with the reported pathogenic nature of these variants. Global allele frequencies ranged from 4.1 × 10⁻⁶ to 1.7 × 10⁻⁴, yielding a combined frequency of 1.76 × 10⁻³, implying that 1/321,368 individuals will be homozygous for pathogenic variants at birth.

Table 1 Known pathogenic variants from ClinVar or literature review that are represented in gnomAD sequence database.

Through casual observation it was apparent that several of the known pathogenic variants were not distributed evenly within the different populations. For example, the COQ8A p.Met555Ile variant was observed in 39 European or Finnish individuals, but in no other population, and the COQ8B p.Glu483* variant was observed in 10 individuals from South Asia but only in 1 European, despite the almost 4-fold greater number of European alleles genotyped. Indeed, the six variants with the greatest numbers of carriers had frequencies that were distributed unevenly between populations (Pearson’s chi-squared 24.3 to 537.5, p < 0.001) (Figure S1 - statistically significant differences were rarer among the variants with lower allele counts, potentially due to the decreased statistical power inherent in a lower sample size). Because of this unevenness, subsequent analysis was conducted on a population-by-population basis.
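
A minimal sketch of the kind of goodness-of-fit comparison described here is given below. It assumes the test compares observed variant-allele counts per population against the counts expected if the variant had the same frequency everywhere; the exact table the authors used is not stated, and the counts are hypothetical (chosen to echo a variant seen 10 times in one population and once in another with 4-fold more alleles genotyped).

```python
def chi_square_homogeneity(allele_counts, alleles_genotyped):
    """Pearson goodness-of-fit statistic for one variant across populations.

    Expected counts assume a single pooled allele frequency in every
    population. Only the variant-allele term is included; for rare variants
    the reference-allele term is negligible.
    """
    pooled_freq = sum(allele_counts) / sum(alleles_genotyped)
    expected = [pooled_freq * n for n in alleles_genotyped]
    statistic = sum((o - e) ** 2 / e for o, e in zip(allele_counts, expected))
    degrees_of_freedom = len(allele_counts) - 1
    return statistic, degrees_of_freedom


# Hypothetical counts for two populations.
stat, dof = chi_square_homogeneity([10, 1], [30_000, 120_000])
print(round(stat, 2), dof)
# A p-value could be obtained from the chi-square survival function,
# e.g. scipy.stats.chi2.sf(stat, dof), if SciPy is available.
```

With one degree of freedom, a statistic above roughly 10.83 corresponds to p < 0.001. The chi-square approximation becomes unreliable when expected counts are small, which fits the remark above that significance was rarer among variants with low allele counts.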

Each pathogenic variant was observed in an average of 1.9 populations (not counting ‘Other’), with an average allele frequency of 1.56 × 10⁻⁴ (Table 2). Combined estimates of Hardy-Weinberg homozygosity for all variants for each of the 7 populations averaged 1/5,492,983, ranging from 1/12,021,014 (Latin Americans) to 1/60,113 (Ashkenazi Jews). Predicted homozygous frequency for individual variants averaged 1/5.4 M, with the variant found at the greatest frequency being COQ4 p.Arg240Cys (with a 1/162 carrier frequency among Ashkenazi Jews which would result in the birth of homozygotes at a frequency of 1/104,733). With an estimated worldwide population of 10 M, this would imply 95 afflicted Ashkenazi Jews susceptible to UQ deficiency due to homozygosity for this one variant alone. Considering all variants across all populations, we can predict 1,016 homozygous-at-birth individuals globally, or 122 in the USA (Table 2).
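
The COQ4 p.Arg240Cys figures quoted above can be cross-checked with a few lines of Hardy-Weinberg arithmetic. The sketch starts from the reported homozygote frequency (1/104,733) and population size (10 M) and recovers the carrier frequency and the rounded-down head count; the variable names are illustrative.

```python
import math

homozygote_rate = 1 / 104_733       # reported q^2 among Ashkenazi Jews
population = 10_000_000             # estimated worldwide population (10 M)

q = math.sqrt(homozygote_rate)      # allele frequency under Hardy-Weinberg
carrier_rate = 2 * q * (1 - q)      # heterozygous carriers, 2pq
affected = math.floor(homozygote_rate * population)

print(f"allele frequency ~ 1/{1 / q:.0f}")
print(f"carrier frequency ~ 1/{1 / carrier_rate:.0f}")
print(f"expected homozygotes: {affected}")
```

The recovered carrier frequency of about 1/162 and the 95 expected homozygotes agree with the values stated in the text.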

Table 2 Population breakdown and predicted prevalence of afflicted individuals for known pathogenic variants present in the gnomAD database.

The presence of multiple variants within the same populations is consistent with the numerous reports of compound heterozygosity in patients with primary UQ deficiency⁷. When estimating birth prevalence of compound heterozygosity, pathogenic variants in COQ8A again exhibited a greater prevalence relative to other genes. In fact, the birth prevalence of compound heterozygotes for COQ8A among Ashkenazi Jews (1/725,578), Finns (1/1.4 M) or non-Finnish Europeans (1/1.6 M) alone was in each case greater than the combined prevalence of all other genes (1/17.1 M) (Table 3, full variant-by-variant breakdown in Table S3). We can estimate that 649 individuals worldwide are born as compound heterozygous for pathogenic genetic variants causing UQ deficiency, with 70 in a USA-like population.

Table 3 The predicted occurrence of compound heterozygotes of known pathogenic variants.

Premature stop codons, frameshifts or the disruption of canonical splice sites (LoF) or critical protein residues (via missense mutations) are all expected to result in significant impairments to protein function. Although we can expect an unknown proportion of these predicted pathogenic variants to result in embryonic lethality, those that do allow survival to birth are likely to result in clinically significant illness. We therefore determined the birth prevalence of all predicted pathogenic variants in UQ biosynthesis genes, as described in Methods. Across all UQ biosynthesis genes there were a total of 782 predicted pathogenic variants (including all known pathogenic variants), and 618 possible compound heterozygote combinations (summarized in Table 4, complete variant list in Table S4 and Table S5). The two genes with the highest frequency of predicted pathogenic variants (combining homozygotes and compound heterozygotes) were COQ8A and COQ8B, with cross-population average incidences of 1/193,621 and 1/198,391, resulting in a predicted 27,321 and 44,727 afflicted individuals worldwide, respectively, and 391 and 398 afflicted individuals in the USA. The gene with the lowest frequency was COQ3 (1/57 M), with only 146 predicted affected individuals worldwide, and none predicted in the USA. The population with the greatest total frequency of pathogenic variants was that of East Asia (1/20,170), with a predicted 79,423 afflicted individuals worldwide. The variant with the greatest prevalence in any population was COQ4 p.Arg240Cys in the Ashkenazi Jewish population, with a MAF of 0.0001719 (Table S4).

Table 4 Predicted prevalence of homozygous and compound heterozygous afflicted individuals for all known and predicted pathogenic variants.

Considering the occurrence of both homozygotes and compound heterozygotes averaged across all populations, our results predict a global birth prevalence of 1/52,092. However, not all the populations considered are of equal size, and the predicted number of afflicted individuals worldwide was 41,555 due to homozygosity and 85,581 due to compound heterozygosity, for a total of 123,789 (1/48,495). In the USA, our analysis predicts 1,462 afflicted individuals (1/211,917).

Discussion

Overall, our results predict a worldwide total of 123,789 individuals suffering from primary UQ deficiency, and 1,462 in a population with a composition similar to the USA. Of these, 1,665 and 192 respectively are due to variants that are known to be pathogenic, with the remainder due to predicted LoF and pathogenic missense variants (summarized in Fig. 1a and b). However, the extent to which known pathogenic variants contributed to the total varied between populations. The addition of predicted LoF variants had less impact for Western populations (Ashkenazi Jews, Finnish and non-Finnish Europeans: blue in Fig. 1), with inclusion of predicted pathogenic variants resulting in an average 3.5-fold increase in the number of afflicted individuals, relative to known pathogenic variants only (Fig. 1c). In contrast, in populations from non-Western, developing regions (South and East Asians, Latin Americans and Africans: red in Fig. 1), the addition of predicted LoF variants resulted in an average 122-fold increase in the number of afflicted individuals (Fig. 1c). The increased likelihood of pathogenic variants to have been identified in Western populations is consistent with the reality of their relatively higher clinical coverage compared to non-Western populations, where the expense of clinical sequencing has limited the genetic characterization of patients suffering from mitochondrial disease. Our results imply that primary UQ deficiency is substantially under-diagnosed in Latin American, African and Asian populations.

There are several factors that could introduce error into our predictions. For example, LoF variants may be so harmful that a homozygous individual is not viable in the first place. That this is possible is supported by the embryonic lethality of the complete genetic ablation of PDSS2, COQ2, COQ3, COQ6 and COQ7 in mice^16,17,18,19, with COQ4 exhibiting pre-weaning lethality¹⁷. In contrast, COQ8A-null²³ and COQ9-null²⁴ mice have been reported as viable. Indeed, among patients with pathogenic variants in the UQ biosynthesis genes likely to be necessary for life (PDSS1 – COQ7), very few are homozygous or compound heterozygous for severe variants expected to result in significant LoF. Among the severe variants (nonsense, frameshift, splice site affecting) for these genes described in the literature we reviewed, only COQ2 p.Asn401Ilefs*15 was present in homozygous form, resulting in multi-organ failure and death in an infant patient²⁵, and there was only one patient compound heterozygous for LoF variants (COQ6 p.Trp447* and p.Gln461fs*478²⁶). In all three variants the region affected was close to the C-terminus (closer than any other known pathogenic variant for these genes), implying that these patients may have retained some partially functional protein, and that other severe variants may have resulted in complete LoF and embryonic lethality.

It is therefore likely that some of the LoF variants that contribute to our final totals may not actually contribute to disease rates due to embryonic or pre-natal lethality. Variant severity is not easy to predict: for example, COQ9^R239X mice that express a partial protein have a much more severe phenotype than COQ9^Q95X mice with no measurable protein expression, presumably due to the destabilization of a multiprotein UQ biosynthesis complex by the truncated protein²⁷. However, homozygous or compound heterozygous severe variants in PDSS1 through COQ7 account for only 6,142 out of 123,789 predicted individuals worldwide, and 219 out of 1,462 in the USA. This suggests that our predictions are not greatly inflated by the inclusion of embryonically lethal allelic combinations.

Our predictions may also suffer from the opposite problem: missense variants identified as damaging by SIFT or PolyPhen2 may, in fact, not have deleterious physiological effects. We attempted to address this by requiring our “predicted pathogenic” variants to be rated as highly likely to be deleterious by both PolyPhen2 and SIFT, but such prediction algorithms are clearly not infallible. For example, COQ4 p.Arg145Gly was rated as “tolerated” by SIFT, yet was reported in homozygous form in a neonate who died 4 h after birth, and it also failed to rescue Δcoq4 yeast²⁸. It is therefore reasonable to expect a certain proportion of predicted-pathogenic missense variants to result in asymptomatic individuals. Interestingly, missense variants seem to be responsible for a lesser proportion of COQ8A and COQ8B-deficient individuals, with patients homozygous or compound heterozygous for LoF variants being relatively common^29,30,31,32. Given that COQ8A alone can rescue COQ8-null yeast³³, and COQ8A patients with truncating nonsense mutations shown to result in nonsense-mediated decay remained viable in their mid-20s³², it is likely that these genes may be relatively insensitive to some borderline-pathogenic missense variants. This has the potential to greatly impact our predictions, with homozygous or compound heterozygous variants in COQ8A and COQ8B accounting for 46,654 out of 123,789 predicted affected individuals worldwide, and 559 out of 1,462 in the USA.

COQ8A and COQ8B are also noteworthy in that most of the known patients have relatively well-defined, gene-specific, pathologies. Specifically, symptoms of ataxia (often associated with cerebellar atrophy or other neurological abnormalities) are found with 26 of the 29 known pathogenic variants of COQ8A, and all 13 of the published COQ8B pathogenic variants exhibited nephrotic syndrome (citations provided in Table S2). It would therefore be tempting to claim that our predicted patients would exhibit similar clinical conditions, with, for example, all predicted COQ8B patients suffering from nephrotic syndrome³⁴. However, it is likely (and our results support) that only a subset of primary UQ deficiency patients have been identified at this point, and they may be non-representative of the actual patient population. Of note, many of the known pathogenic variants were identified in studies where clinicians screened cohorts of patients with specific subsets of well-defined symptoms. For example, our knowledge of COQ8B variants largely comes from two studies in which large numbers of patients with nephrotic syndrome were subjected to sequencing of either whole exomes or multi-gene panels designed for nephrotic syndrome^29,35. A similar issue can be raised for the ataxic nature of COQ8A variants. For example, two studies described how, after identifying pathogenic COQ8A variants in ataxic patients, they proceeded to sequence COQ8A in other ataxic patients, identifying additional novel pathogenic variants^30,32. Additional pathogenic variants were found in later studies in which COQ8A, alone or in combination with other UQ biosynthesis genes, was specifically sequenced in ataxic patients^31,36. We hypothesize that future COQ8A or COQ8B patients identified via less targeted methodologies may present with more diverse clinical phenotypes, as is characteristic of other UQ biosynthesis genes such as COQ2 or COQ4.

There are also several factors that could increase the number of afflicted individuals beyond our estimates. For example, we conservatively assumed that primary UQ deficiency is always recessive; however, haploinsufficiency of COQ4 has been shown to cause clinically significant primary UQ deficiency³⁷. Also, violations of Hardy-Weinberg equilibrium (e.g., consanguinity or populations with a large degree of endogamy) could increase the likelihood of an individual being born with two pathogenic variants. It is also noteworthy that 6 of the 29 missense variants known to be pathogenic would not have met our criteria for inclusion as “predicted” pathogenic variants, since they were not assigned the highest level of confidence for pathogenicity by both SIFT and PolyPhen2 (Table 1). This supports the conservative nature of our selection criteria.
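The effect of departures from Hardy-Weinberg equilibrium can be made concrete with the standard population-genetics expression for the fraction of individuals carrying two pathogenic alleles of a gene, q² + F·q·(1 − q), where q is the aggregate pathogenic allele frequency for that gene and F is the inbreeding coefficient. The sketch below is illustrative only and is not the calculation used in this study: the function name and the value q = 0.001 are assumptions, and F = 1/16 is the textbook value for offspring of first cousins.

```python
def two_allele_fraction(q: float, f: float = 0.0) -> float:
    """Expected fraction of births carrying two pathogenic alleles of one gene.

    q: aggregate pathogenic allele frequency for the gene (illustrative input).
    f: inbreeding coefficient; f = 0 reduces to the Hardy-Weinberg value q**2.
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must lie in [0, 1]")
    return q * q + f * q * (1.0 - q)


q = 0.001  # hypothetical allele frequency, not taken from the study data
random_mating = two_allele_fraction(q)
first_cousins = two_allele_fraction(q, f=1 / 16)

print(f"Hardy-Weinberg (F = 0):   {random_mating:.3e}")
print(f"First-cousin (F = 1/16):  {first_cousins:.3e}")
print(f"Fold increase:            {first_cousins / random_mating:.1f}")
```

Because the extra term scales with q rather than q², the relative increase is largest for rare alleles, which is the situation for most of the variants considered here.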

Furthermore, there are several reasons why truly pathogenic variants may not appear on our list of known variants. Some variants may have been identified in clinics without being formally described in the literature. For example, COQ2 p.Met128Val and p.Arg387* have been cited as pathogenic in the secondary literature³⁸, but without a formal research citation they would not have met our inclusion criteria as known pathogenic variants. Moreover, although the latter variant was included as a predicted pathogenic variant, the former was assessed as ‘benign’ and ‘tolerated’ by PolyPhen2 and SIFT respectively, excluding it from our list of predicted pathogenic variants. In addition, many predicted pathogenic variants were more common in non-western populations, meaning that they are less likely to have been identified in the existing clinical reports, which have focussed on western populations. Our list of known pathogenic variants also may not have included variants detected as part of recent large-scale studies^39,40,41. Likewise, the fact that some UQ biosynthesis genes were found to be associated with disease earlier than others (e.g., COQ2 was first found in 2006⁴², vs. COQ8B in 2013²⁹ and COQ7 in 2015⁴³) could have delayed the introduction of some genes into widely used genetic screening panels⁴⁴, meaning that more patients were screened for some genes compared to others. Finally, after the literature review phase of our analysis was concluded, novel pathogenic variants have continued to be described in the clinical literature (e.g., COQ4⁴⁵, COQ6⁴⁶, COQ7⁴⁷, ADCK4^48,49), indicating that many remain to be reported.

Several aspects of our results point towards their general reliability. For example, there have been no reports of pathogenic variants in COQ3 or COQ5, which is consistent with our prediction of few individuals with primary UQ deficiency due to pathogenic variants in these genes (fewer than 2,000 individuals worldwide, and only 4 in the USA). Conversely, more patients with defects in COQ8A and COQ8B have been described than for any other UQ biosynthesis gene⁸, which corresponds to our finding that pathogenic variants in these genes make the greatest contribution to the number of individuals worldwide predicted to suffer from primary UQ deficiency, together accounting for 46,654 of the predicted 123,789 patients worldwide.

In conclusion, we have made the first estimates of the worldwide and within-population birth prevalence of individuals who are homozygous or compound heterozygous for pathogenic variants causing primary UQ deficiency by combining a decade's worth of clinical genetics with recently available large-scale whole-exome and whole-genome sequencing data. Our calculations suggest a minimum of 1,665 afflicted individuals worldwide or 192 in the USA (using only variants clinically shown to be pathogenic), up to a maximum of 123,789 worldwide or 1,462 in the USA (with all variants predicted to be pathogenic). Notably, the gap between predictions made using “known” vs. “predicted” pathogenic variants appears smallest in populations expected to have the greatest access to the modern methodologies of clinical genetics. This implies that healthcare providers have already made substantial headway in identifying individuals suffering from this disorder. However, it remains likely that the bulk of patients worldwide suffering from primary UQ deficiency have yet to be recognized.
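The size of the gap between the two sets of estimates can be read directly from the figures quoted above. The short calculation below is only a worked arithmetic check using those four numbers; the dictionary layout and the function name `fold_gap` are illustrative choices.

```python
# (known-variant estimate, known + predicted-variant estimate), from the text above.
estimates = {
    "worldwide": (1_665, 123_789),
    "USA": (192, 1_462),
}


def fold_gap(known: int, predicted: int) -> float:
    """Ratio of the upper (predicted) estimate to the lower (known) estimate."""
    return predicted / known


for region, (known, predicted) in estimates.items():
    print(f"{region}: {fold_gap(known, predicted):.1f}-fold")
```

The worldwide upper estimate is roughly 74 times the lower one, whereas for the USA the ratio is under 8, which is the pattern the paragraph above describes.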

References

1. Skladal, D., Halliday, J. & Thorburn, D. R. Minimum birth prevalence of mitochondrial respiratory chain disorders in children. Brain 126, 1905–1912 (2003).
2. Gorman, G. S. et al. Prevalence of nuclear and mitochondrial DNA mutations related to adult mitochondrial disease. Annals of Neurology 77, 753–759 (2015).
3. Kerr, D. S. Review of Clinical Trials for Mitochondrial Disorders: 1997–2012. Neurotherapeutics 10, 307–319 (2013).
4. Pfeffer, G., Majamaa, K., Turnbull, D. M., Thorburn, D. & Chinnery, P. F. Treatment for mitochondrial disorders. Cochrane Database of Systematic Reviews (2012).
5. Wang, Y. & Hekimi, S. Understanding Ubiquinone. Trends in Cell Biology 26, 367–378 (2016).
6. Yubero, D. et al. Secondary coenzyme Q10 deficiencies in oxidative phosphorylation (OXPHOS) and non-OXPHOS disorders. Mitochondrion 30, 51–58 (2016).
7. Wang, Y. & Hekimi, S. Molecular genetics of ubiquinone biosynthesis in animals. Critical Reviews in Biochemistry and Molecular Biology 48, 69–88 (2013).
8. Desbats, M. A., Lunardi, G., Doimo, M., Trevisson, E. & Salviati, L. Genetic bases and clinical manifestations of coenzyme Q10 (CoQ10) deficiency. Journal of inherited metabolic disease 38, 145–156 (2015).
9. Acosta, M. J. et al. Coenzyme Q biosynthesis in health and disease. Biochimica et Biophysica Acta (BBA) - Bioenergetics 1857, 1079–1085 (2016).
10. López, L. C. et al. Treatment of CoQ10 Deficient Fibroblasts with Ubiquinone, CoQ Analogs, and Vitamin C: Time- and Compound-Dependent Effects. PLoS ONE 5, e11897 (2010).
11. Wang, Y., Oxer, D. & Hekimi, S. Mitochondrial function and lifespan of mice with controlled ubiquinone biosynthesis. Nat Commun 6, 6393 (2015).
12. Ashraf, S. et al. ADCK4 mutations promote steroid-resistant nephrotic syndrome through CoQ10 biosynthesis disruption. The Journal of Clinical Investigation 123, 5179–5189 (2013).
13. Barca, E. et al. Cerebellar ataxia and severe muscle CoQ10 deficiency in a patient with a novel mutation in ADCK3. Clinical Genetics 90, 156–160 (2016).
14. Landrum, M. J. et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 44, D862–D868 (2016).
15. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
16. Peng, M. et al. Primary coenzyme Q deficiency in Pdss2 mutant mice causes isolated renal disease. Plos Genetics 4, 14 (2008).
17. Koscielny, G. et al. The International Mouse Phenotyping Consortium Web Portal, a unified point of access for knockout mice and related phenotyping data (IMPC data Release 5.0). Nucleic Acids Res. 42, D802–D809 (2014).
18. Lapointe, J., Wang, Y., Bigras, E. & Hekimi, S. The submitochondrial distribution of ubiquinone affects respiration in long-lived Mclk1+/− mice. The Journal of Cell Biology 199, 215–224 (2012).
19. Levavasseur, F. et al. Ubiquinone is necessary for mouse embryonic development but is not essential for mitochondrial respiration. J Biol Chem 276, 46160–46164 (2001).
20. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17, 405–423 (2015).
21. Desbats, M. A. et al. The COQ2 genotype predicts the severity of coenzyme Q10 deficiency. Human Molecular Genetics 25, 4256–4265 (2016).
22. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
23. Stefely, J. A. et al. Cerebellar Ataxia and Coenzyme Q Deficiency Through Loss of Unorthodox Kinase Activity. Molecular Cell 63, 608–620 (2016).
24. García-Corzo, L. et al. Dysfunctional Coq9 protein causes predominant encephalomyopathy associated with CoQ deficiency. Human Molecular Genetics 22, 1233–1248 (2013).
25. Mollet, J. et al. Prenyldiphosphate synthase, subunit 1 (PDSS1) and OH-benzoate polyprenyltransferase (COQ2) mutations in ubiquinone deficiency and oxidative phosphorylation disorders. Journal of Clinical Investigation 117, 765–772 (2007).
26. Heeringa, S. F. et al. COQ6 mutations in human patients produce nephrotic syndrome with sensorineural deafness. The Journal of Clinical Investigation 121, 2013–2024 (2011).
27. Luna Sánchez, M. et al. The clinical heterogeneity of coenzyme Q10 deficiency results from genotypic differences in the Coq9 gene. EMBO Molecular Medicine 7, 670–687 (2015).
28. Brea-Calvo, G. et al. COQ4 Mutations Cause a Broad Spectrum of Mitochondrial Disorders Associated with CoQ(10) Deficiency. American Journal of Human Genetics 96, 309–317 (2015).
29. Ashraf, S. et al. ADCK4 mutations promote steroid-resistant nephrotic syndrome through CoQ(10) biosynthesis disruption. The Journal of Clinical Investigation 123, 5179–5189 (2013).
30. Lagier-Tourenne, C. et al. ADCK3, an Ancestral Kinase, Is Mutated in a Form of Recessive Ataxia Associated with Coenzyme Q(10) Deficiency. American Journal of Human Genetics 82, 661–672 (2008).
31. Mignot, C. et al. Phenotypic variability in ARCA2 and identification of a core ataxic phenotype with slow progression. Orphanet Journal of Rare Diseases 8, 173–173 (2013).
32. Gerards, M. et al. Nonsense mutations in CABC1/ADCK3 cause progressive cerebellar ataxia and atrophy. Mitochondrion 10, 510–515 (2010).
33. Xie, L. X. et al. Expression of the Human Atypical Kinase ADCK3 Rescues Coenzyme Q Biosynthesis and Phosphorylation of Coq Polypeptides in Yeast coq8 Mutants. Biochimica et biophysica acta 1811, 348–360 (2011).
34. Sadowski, C. E. et al. A Single-Gene Cause in 29.5% of Cases of Steroid-Resistant Nephrotic Syndrome. Journal of the American Society of Nephrology: JASN 26, 1279–1289 (2015).
35. Korkmaz, E. et al. ADCK4-Associated Glomerulopathy Causes Adolescence-Onset FSGS. Journal of the American Society of Nephrology: JASN 27, 63–68 (2016).
36. Horvath, R. et al. Adult-onset cerebellar ataxia due to mutations in CABC1/ADCK3. Journal of Neurology, Neurosurgery & Psychiatry 83, 174–178 (2012).
37. Salviati, L. et al. Haploinsufficiency of COQ4 causes coenzyme Q10 deficiency. Journal of Medical Genetics 49, 187–191 (2012).
38. Salviati, L., Trevisson, E., Doimo, M. & Navas, P. Primary Coenzyme Q10 Deficiency. GeneReviews® [Internet] https://www.ncbi.nlm.nih.gov/books/NBK410087/. (2017).
39. Legati, A. et al. New genes and pathomechanisms in mitochondrial disorders unraveled by NGS technologies. Biochimica et Biophysica Acta (BBA) - Bioenergetics 1857, 1326–1335 (2016).
40. Yang, Y., Muzny, D. M. & Xia, F. et al. Molecular findings among patients referred for clinical whole-exome sequencing. Jama 312, 1870–1879 (2014).
41. Neveling, K. et al. A Post-Hoc Comparison of the Utility of Sanger Sequencing and Exome Sequencing for the Diagnosis of Heterogeneous Diseases. Human mutation 34, 1721–1726 (2013).
42. Quinzii, C. et al. A Mutation in Para-Hydroxybenzoate-Polyprenyl Transferase (COQ2) Causes Primary Coenzyme Q(10) Deficiency. American Journal of Human Genetics 78, 345–349 (2006).
43. Freyer, C. et al. Rescue of primary ubiquinone deficiency due to a novel COQ7 defect using 2,4–dihydroxybensoic acid. Journal of Medical Genetics 52, 779–783 (2015).
44. Parikh, S. et al. Practice patterns of mitochondrial disease physicians in North America. Part 1: Diagnostic and clinical challenges. Mitochondrion 14, 26–33 (2014).
45. Sondheimer, N. et al. Novel recessive mutations in COQ4 cause severe infantile cardiomyopathy and encephalopathy associated with CoQ(10) deficiency. Molecular Genetics and Metabolism Reports 12, 23–27 (2017).
46. Park, E. et al. COQ6 Mutations in Children With Steroid-Resistant Focal Segmental Glomerulosclerosis and Sensorineural Hearing Loss. American Journal of Kidney Diseases (2017).
47. Wang, Y. et al. Pathogenicity of two COQ7 mutations and responses to 2,4-dihydroxybenzoate bypass treatment. Journal of cellular and molecular medicine, n/a-n/a (2017).
48. Park, E. et al. Focal segmental glomerulosclerosis and medullary nephrocalcinosis in children with ADCK4 mutations. Pediatric Nephrology, 1–8 (2017).
49. Lolin, K. et al. Early-onset of ADCK4 glomerulopathy with renal failure: a case report. BMC Medical Genetics 18, 28 (2017).

Acknowledgements

Research in the S.H. laboratory is funded by grants from the Canadian Institutes of Health Research (MOP-114891, MOP-123295 and MOP-97869), as well as by McGill University. S.H. is Campbell Chair of Developmental Biology.

Author information

Authors and Affiliations

Department of Biology, McGill University, Montreal, Canada
Bryan G. Hughes, Paul M. Harrison & Siegfried Hekimi

Contributions

B.G.H. and S.H. devised the study; B.G.H. and P.M.H. conducted the analysis; B.G.H. and S.H. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Siegfried Hekimi.

Ethics declarations

Competing Interests

Funding to Dr. Hekimi was provided by the Canadian Institutes of Health Research.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplemental Tables

Figure S1

File S1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hughes, B.G., Harrison, P.M. & Hekimi, S. Estimating the occurrence of primary ubiquinone deficiency by analysis of large-scale sequencing data. Sci Rep 7, 17744 (2017). https://doi.org/10.1038/s41598-017-17564-y

Received: 04 July 2017
Accepted: 28 November 2017
Published: 18 December 2017
DOI: https://doi.org/10.1038/s41598-017-17564-y
