Introduction

Mitochondrial disorders are a very complex and heterogeneous group of human diseases, affecting multiple organs or tissues, and are caused by malfunction of the cellular energy production system hosted by the mitochondria. Defects in this oxidative phosphorylation (OXPHOS) system are the result of both nuclear or mitochondrial (mt) genome-encoded gene mutations.1, 2 Mitochondrial DNA (mtDNA) is a 16 569 bp circular double-stranded molecule present in hundreds to thousands of copies in the mitochondria of a single cell, which encodes 37 essential elements of the OXPHOS system. Within an individual those molecules can be all identical (homoplasmy) or be a mixture of multiple mtDNA types (heteroplasmy). Pathogenic alterations have been found along the entire mt genome.3 At the moment, it can no longer be considered good laboratory practice to test the mtDNA of patients with a strong clinical suspicion of a mtDNA aberration for the presence of deletions and a handful of common point mutations solely. Furthermore, accurate genetic counseling to exclude a mt maternal inheritance is only possible after a systematic analysis of the complete mt genome. Sanger sequencing, the Gold standard, has several limitations and disadvantages as it is costly and labor intensive for large numbers of samples, and not appropriate for accurate detection or quantification of low-level heteroplasmy.4 A novel approach is now available with the development of NGS methodologies.5, 6 We tested a budget-friendly massive parallel sequencing (MPS) device for diagnostic purposes, and we believe that our findings are of interest to the whole diagnostic community.

Materials and methods

Sample collection and DNA

The patients were selected from a population, referred to the Center of Medical Genetics or the Division of Pediatric Neurology and Metabolism, for molecular or/and biochemical investigation for a clinical suspicion of a mt disorder. For all these samples, the mtDNA sequence was previously determined with conventional dye terminator chemistry, and routine Southern blot analysis has been performed.

Sanger sequencing

The complete mtDNA was amplified for dideoxy analysis using the AmpliTaq Gold PCR Master Mix Kit (Life Technologies, Guilford, CT, USA). Data were checked with appropriate software and by manual visual inspection by a technician and a scientist on the basis of the rCRS NC_012920.1 reference sequence.

Long-range PCR

Whole mtDNA was amplified from 50 ng of gDNA, using the Roche Expand Long Template Kit (Roche Applied Science, Vilvoorde, Belgium), in three overlapping long-range (LR)-PCR amplicons according to the manual’s instructions (Zaragoza et al7 for primers). In addition, analysis of five DNA samples was also performed for a LR-PCR fragment of the entire mt genome in one single amplicon.8

Next-generation sequencing by Ion PGM Torrent

To obtain a more uniform coverage, amplicons B and C were added in a 25% excess over A, per patient. Library preparations were performed following the manufacturer’s instructions (Ion Xpress plus Fragment Library Kit; Life Technologies) for 200 bp single-end reads. Library material was purified using AMPure beads, and concentration and length were determined using the qubit 2.0 fluorimeter (Invitrogen, Merelbeke, Belgium) and microchip electrophoresis (labchip GX; Perkin-Elmer, Zaventem, Belgium). Emulsion PCR was performed on pooled libraries (Ion One Touch 200 Template Kit v2 DL Kit; Life Technologies), and template-positive Ion Sphere particles were enriched using Dynabeads Myone streptavidin C1 beads. Samples were prepared according to the instructions provided with Ion PGM 200 Sequencing Kit (Life Technologies) protocol. Five pooled samples were loaded on an Ion 316 chip.

Sequencing a clonal specimen pUC19

For determination of the technical error rate of the Ion Torrent chemistry and technology, 100 ng of pUC19 plasmid DNA (Thermo Fisher, Erembodegem-Aalst, Belgium) was directly used in a library preparation protocol, and subsequently sequenced. To define the error rate of the sequencing process, an overview was created listing per position every nucleotide signal detected. Dividing the numbers of all non-reference bases by the total numbers of bases per site resulted in the error rate for each position separately. The average error rate was deduced by calculating the average of all positional error rates of the pUC19 plasmid genome.

Preparation of artificial heteroplasmic samples to determine the detection limit of different bases at the same nucleotide position

Combining PCR products, and not gDNA of Control samples 2 and 3 (Supplementary Table 1), is a prerequisite to this experiment, as the copy numbers of mtDNA molecules vary substantially among individuals. Both control DNA samples differed at 11 positions (Figure 2). For an extra two substitutions, at m.93A>G and m.7989T>C, natural heteroplasmy prevented their inclusion into a detailed data analysis of allele frequencies. The variations of nucleotide stretches in the vicinity of m.310 were not included for analysis either. After quantification of relative DNA concentrations using real-time PCR techniques (ViiA7 Real-Time PCR System; Life Technologies), each amplicon was diluted and combined in four different ratios (2:98, 5:95, 10:90 and 30:70). A final amount of 100 ng was used as the starting material for library preparation. DNA mixtures were generated from both the three and one LR-PCR amplicon approach and run in duplicate.

Data analysis

Reads were generated using the Ion Torrent platform pipeline software v.3.6 (Life Technologies), which included standard control of the read quality. Only high-quality reads were aligned to the reference sequence using TMAP (Torrent Mapping Alignment Program). Independently, the read quality was also determined with the FastQC software (Babraham Bioinformatics, Cambridge, UK), a quality control tool for NGS data (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc). Finally, for each sample, the coverage per base was obtained from the BAM (Binary Alignment Map) files data using the SAMtools (Sequence Alignment Map) software.9 Detection of large deletions was performed by normalization of the coverage profile of each sample to the one of a non-deleted control DNA. Subsequently, the mutation load of large deletions identified was calculated by dividing the average read depth in the deletion region by the average read depth of the non-deleted part of the mt genome. Sequence variants were detected using the Variantcaller v3.6.59049 plugin of the Torrent Suite (Life Technologies), which used default the Genome Analysis Tool Kit to identify possible indel sites. The Variantcaller was run with default somatic–low stringency settings (Supplementary Table 2), except for minor allele frequency (snp-min-allele-freq at 0.05) and the general minor altered allele frequency (gen-min-alt-allele-freq at 0.05). For each of the 32 samples processed, all possible SNPs and indels were accessed through a vcf (Variant Call Format) file or excel file. All variants listed with Variantcaller v3.6.59049 were also visualized in the Integrative Genomics Viewer (IGV) to discard manually alterations generated by incorrect calling.

Variants were retrieved from mtDB (http://www.mtdb.igp.uu.se) and MITOMAP (http://www.mitomap.org).3 For each sample, the sequencing error was determined for each position, with exception of variant sites. Per position the average sequencing error and their standard deviation was calculated for all 32 samples sequenced. Potential low-frequency sites were compared with the average and standard deviation values to reduce the false-positive (and false-negative) rate of the result data.

Results

Sequencing overview and coverage analysis

The mt genome of all 32 samples was amplified in three overlapping LR-PCR amplicons. In addition, the mtDNA of five of these samples was also covered by one large LR-PCR. Control of LR-PCR amplification of gDNA extracted from rho zero cells excluded the potential coamplification of nuclear pseudogenes of mt origin (NUMTs) and thereby contamination of the mt sequence data that might result in apparent heteroplasmy. Coverage, uniformity of distribution of read coverage, read length, strand bias, data quality, types of modifications detected, range of variant detection and error rate are key issues to evaluate MPS output data. The total number of reads generated for all samples sequenced was 32 713 681, of which 96.5% mapped to the reference sequence. The average read length was 161 bp, with an average read quality of 25. An average coverage of 6302 (range 4456–10 266) for the 32 samples was obtained, whereas the five samples additionally sequenced from one large amplicon yielded an average read depth of 5886 (range 4782–7934) reads. For all amplicons, it was noticed that different areas along the mtDNA molecule were not sequenced uniformly, but displayed peaks and troughs (Figure 1). Three areas with a larger read depth were representing the regions of overlapping PCR amplicons. However, for some regions scattered throughout the genome, the read depth dropped considerably because of an extremely under-representation of the plus strand. These particular patterns were reproduced in replicates of the same and between different samples.

Figure 1
figure 1

Relative coverage of the mt genome for each of the samples sequenced. The relative coverage was calculated by dividing the read depth of each position by the average read depth of the whole mtDNA. (a) Three LR-PCR overlapping fragment assay for all 32 samples. The three major peaks, indicated with an arrow, represent the overlapping regions. (b) One large LR-PCR fragment assay for four samples.

Sequencing of the pUC19 plasmid to define error rate

Base calling errors associated with NGS data could result in false-positives as they, incorrectly, might be considered as heteroplasmic nucleotide variants. Therefore, a pUC19 plasmid was analyzed to determine the impact of the error rate of the Ion Torrent PGM chemistry and technology (including software) on minor allele frequency calling. The read depth ranged between 12 711 and 69 640 with an average of 50 846, and 81% of all positions contained a depth of at least 40 000 × . Results showed an average substitution error rate of 0.12% (range 0–7). Including indel errors increased the sequencing error rate to an average of 0.80% (range 0–52; Supplementary Table 3). In general, the majority of the sequencing errors in the pUC19 data was located in or near homopolymeric stretches, and were mostly detected on one strand only. To be at the more conservative side, the detection threshold for identification of base variants was set at 2%. Implemented on the pUC19 data set, this resulted in three credible but unexpected minor allelic substitutions within the sequence at subsequent positions 1396G>A, 1397G>C and 1398T>C, at 5.6%, 5.6% and 5.4%, respectively, and detected on both strands. IGV analysis of the reads showed clearly that these alterations are located on the same molecule. However, forced PCR amplification of this small subfraction of mutated pUC19 molecules with a forward primer complementary, with the 3 bp alteration at its 3′ end, did not generate a PCR fragment. In addition, the presence of any mosaic sequence pattern was not confirmed by MPS of this pUC19 DNA with MiSeq technology (Illumina, Leusden, The Netherlands). Taken together, these results strongly suggest that these variants are sequencing errors and not true mosaic substitutions.

Determination of the detection limit of mtDNA heteroplasmy in patient samples

To determine both the accuracy and precision of our settings for identification of heteroplasmy and for evaluation of the reliability of quantitative measurements, a DNA control sample was spiked in with another control DNA. Mixtures in four known ratios were prepared for both LR-PCR assays. Among the 11 variant positions, 10 were transitions, whereas the remaining was a single cytosine insertion, embedded in a 12-bp-long interrupted C-tract region. For the latter, a false-negative result was generated as the system failed to detect the single cytosine insert m.5899dupC at a level ≤10%. An estimated variant load of 28% was measured in the 30% mixtures (Figure 2a). All observed frequencies were consistent with the theoretical values for each of the single or multiple amplicon mixtures and a linear regression (R2=0.99) could be seen (Figure 2b). The results also demonstrated a systematical overcall of the empirical minor allele measurements of the 10 substitutions, a discrepancy that is marked more within the three LR-PCR fragment experiments. There was no significant difference in base calling for the technical replicates of the mt sequencing data at the 11 variant sites (paired t-test; P=0.40) or all other homoplasmic positions. The average difference between the duplicate runs for the four samples varied from 0.1 to 0.9%. Although neither the theoretical nor the experimental natural heteroplasmy levels present in the control samples were evaluated thoroughly in these mixtures, a reduction of these heteroplasmies consistent with increasing dilutions was observed.

Figure 2
figure 2

(a) Observed and theoretical heteroplasmy levels for 11 positions under investigation in the artificial DNA mixtures. Horizontal lines represent the theoretical values, whereas the dots represent the observed heteroplasmy levels. (b) Linear regression between the average observed and the theoretical heteroplasmy levels for the 11 positions. (▪) represents the single amplicon LR-PCR amplicon mixtures and (•) symbolizes the multiple amplicon LR-PCR assay.

A significant additional number of false positives (range 5 to 6) were detected at the threshold setting of 2% versus 5%. Most, but not all, of these false positives resulted from miscalls in homopolymeric stretches and required manual input for evaluation. To underpin the presence of non-reference bases as either low heteroplasmy or rather a technical artifact, the average sequencing error per nucleotide position throughout the genome was calculated from the compiled data of all samples processed, and compared with the frequencies found in individual DNA samples (Supplementary Table 4). On the basis of the sensitivity and specificity experiments and the sequencing error determination, it was decided to balance the number of false positives and negatives and to perform further analysis of mtDNA sequences using a stringent 5% as detection threshold.

Variant detection using the Torrent Suite 3.6

Comparison semiconductor sequencing versus Sanger sequencing

The MPS results of the mt genome of 32 mtDNA samples were compared with previous Sanger sequencing results (summary MPS data in Supplementary Table 1). Dideoxy sequencing of the mtDNA had previously revealed a total of 774 variants in the 28 patients, and an additional 88 variants in the four controls, with multiple variants present in more than one DNA sample. Using the Torrent Suite pipeline version 3.6.59049 Variantcaller plugin, 956 variants were detected in total. However, human insight and manual inspection of the assembly alignment data allowed to remove 128 (13%) false-positive results. Discarding the mala fide data resulted in 828 true variants that are 96% (828/862) concordant with the Sanger data. Problems originating from analyzing the m.302–316 region (composed of two mononucleotide C-stretches separated by a single thymidine base at position m.310) were the major reason for these 34 remaining false-negative results. This C7TC5 region is very polymorphic and several single or multiple nucleotide insertions, deletions and substitutions have been reported. For the 31 (31/32) non-rCRS mtDNA sequences, Sanger sequencing revealed an expansion of this hypervariable region with one (19/31) to up to four extra Cs (12/31). However, in almost half of the samples (13/31), the algorithm identified the cytosine insertions primarily as a heteroplasmic nucleotide substitution at position m.310. In addition, heteroplasmic sequence mixtures of molecules with an insertion of one and two Cs or a much more complex sequence pattern of several heteroplasmic C-tract inserts (differing in insert numbers) with or without a 3′ thymine were seen in the mapped BAM files using IGV in these samples. The other two nucleotide positions resulting in incorrect, false-negative data were also located in the vicinity of homopolymeric stretches. The m.16183A>C is a known polymorphism that occurred in three patients (patient 21, 24 and 25) and is embedded in an adenine-rich region. Analogous inconsistencies in this part of the D-loop were reported by Parson et al.10 The last variant not reported was a pathogenic deletion of a cytosine at position m.7402, which was embedded in a mononucleotide stretch of six Cs in patient 12. This single-nucleotide deletion was, however, identified by IGV analysis. Known pathogenic mutations and heteroplasmic variants identified in this study are listed in Table 1.

Table 1 Overview of all pathogenic and heteroplasmic variants

Detection of three novel potential pathogenic variants

In our data set, three novel alterations with a potential pathogenic character were detected with both MPS and conventional sequencing. A novel heteroplasmic m.14721G>A mutation in the MT-TE (tRNAGlu) gene was revealed in patient 7 who presented with an isolated complex I deficiency. This substitution, with a mutation load of 48% in the skeletal muscle tissue, altered a highly conserved nucleotide in the DHU stem and disturbed the original C–G pairing. Another novel mutation was identified in the MT-CYB gene of patient 10. The homoplasmic m.15453T>C mutation was detected in the leukocytes of this patient with an isolated complex III deficiency and resulted in an amino-acid change of a conserved leucine into a proline (p.(Leu236Pro)). The variant was predicted as probably damaging by in silico data analysis with Polyphen2 (Polymorphism Phenotyping v2) and SIFT (Sorting Intolerant From Tolerant). A last novel pathogenic aberration was seen in the skeletal muscle MT-COI gene of patient 12 and involved a single heteroplasmic base deletion m.7402delC in a stretch of six Cs. The deletion introduced a premature stop codon and a truncated protein. The patient presented with an isolated complex IV deficiency.

Heteroplasmic variant calling

A total of 25 heteroplasmic variants at 23 different positions (m.189A>G is present in three DNA samples) in 18 samples were present with an allele frequency of ≥5% on both forward and reverse reads (Table 1). In addition, these sites were also evaluated by calculating per position the sample-specific error rate, and comparing this value to the average error rate of all samples. In case of both an elevated average error rate and a high standard deviation, the low-frequency base variant was more likely a sequencing error than a trustworthy low-level heteroplasmy. Consequently, positions m.72T>C and m.16294C>T with a minor allele frequency of 6% and 10%, and with an average error rate of 6.7 (range 0.79–19%) and 10% (range 1.42–10.4%), respectively, were discarded. Furthermore and consistent with the Sanger sequencing data of the patient samples for these positions, the presence of six mutations with known deleterious consequences (m.3242G>A, m.3243A>G, m.3244G>A, m.7453G>A, m.9203_9204delAT, m.13513G>A) was confirmed by NGS analysis, whereas simultaneously heteroplasmies could be quantified. Twelve positions, none of them acknowledged by initial Sanger sequencing, concerned variants with an allele frequency in a range from 6 to 18%. Meticulous review of the initial electropherograms confirmed the low-level alleles for only five of these sites. Seven positions were masked by Sanger sequencing noise or were too weak to be spotted. Three minor alleles, m.9769C (MT-CO3), m.10866C (MT-ND4) and m.12071C (MT-ND4) with levels of 9%, 8% and 11%, respectively, were seen in the protein coding genes of the mtDNA of leukocytes of a patient and a control (latter one) and are not present in MITOMAP3 or mtDB. In silico prediction tools assigned a possible deleterious role to the change of the conserved amino acid p.(Ile188Thr) in the p.CO3 protein. Clinically relevant tissue such as the muscle was not available for investigations. In addition, a pathogenic m.7989T>C variant was found with a mutation load of 18% in blood mtDNA of a healthy control. In a reported patient, with symptoms and signs of exercise intolerance and myoglobinuria, the variant was present with a high mutation load (>90%) in COX-negative fibers.11

Detection of large single or multiple deletions in patients

This study population included two skeletal muscle samples (Supplementary Table 1) with single large-scale deletions. Identification of the large deletions was only possible with the single LR-PCR amplicon protocol. Data analysis revealed deletions and breakpoints congruent with previous results. The mtDNA of patient 31 contained a deletion of 2309 bp (m.12113_14421del2309), whereas a 2380 bp deletion (m.10224_12603del2380) was detected in patient 32, with a mutation load of 70% and 86%, respectively. In the mtDNA of patient 30, the adapted MPS protocol identified the presence of multiple deletions (Figure 3) not revealed with Southern blot analysis.

Figure 3
figure 3

Representation of the large-scale deletions as detected in patient nos. 30, 31 and 32. Case nos. 31 and 32 are patients with single large-scale deletion in their mtDNA. Case 30 is a patient with multiple deletions. (a and b) Results from the ‘3 overlapping LR-PCR fragment’ assay for patients 31 and 32, respectively. (ce) Results from the ‘one large LR-PCR’ assay for patients 31, 32 and 30, respectively.

Discussion

Mutations of mtDNA are frequent in humans and implicated in a heterogeneous group of mt disease pathology.1, 2 However, nucleotide analysis is not straightforward as substantial sequence variations and heteroplasmy complicate variant calling. We retrospectively analyzed 32 mtDNA samples (28 patients and 4 controls) with the Ion Torrent PGM system (including bioinformatics tools) to evaluate its diagnostic potential for mt genome inspection. A LR-PCR strategy of one single (five samples) or three overlapping amplicons (32 samples) was used to amplify the mt genome. Unexpectedly, the Ion Torrent PGM sequencing protocol did not provide uniform coverage. Unlike the pUC19 plasmid DNA, a variable read depth distribution with the presence of troughs and peaks throughout the whole mtDNA molecule was seen. This impressive strand bias for almost 10% of the sequences was not correlated to the GC content and was not PCR, primer or sample dependent. The reads were neither discarded as low-quality data by the system. As a consequence, depth of coverage is a prominent point to consider. In our experience, a mean depth of 5000 × is indicated to assure the necessary number of reads to discriminate true heteroplasmy in badly covered regions. Of course, these values are greatly influenced by the software’s performance. In the course of our experiments, we got a software update of Variant Suite 2.2.3 to 3.2 improving the interpretation of NGS data considerably. The detection and characterization of single and multiple deletions is greatly simplified with MPS, surpasses Southern blot techniques to detect deletions and allows simultaneous identification of their junctions and measurement of corresponding mutation loads. Amplification of the whole mt genome in three overlapping PCR amplificons was initially preferred to the generation of one large molecule as the latter assay might be more prone to amplification failure. Detection of (low) heteroplasmic mutations is essential for mtDNA analysis in a diagnostic setting, but the accuracy of the MPS results relies on a meticulous differentiation between sequencing errors and genuine heteroplasmic variants. Data of pUC19 sequence experiments were used to define the error threshold associated with the Ion Torrent PGM chemistry and device. The obtained average error rate of 0.8% was in accordance with error rates seen in previous studies with indels, mainly insertions associated with homopolymeric stretches, as the dominant type of error.12 The measurement of heteroplasmy amounts reflected the levels initially present with only a slight systematical overestimation of the values, as was seen in the assessment of the artificial mtDNA control mixtures for all substitutions evaluated. However, a false-negative result was generated as the single insertion m.5899dupC was solely detected for a heteroplasmy level of 30%. The sequence context of the insertion challenged the performance of mononucleotide variation identification extremely, as this position is located near the end of an almost perfect 12 bp C-stretch. However, quantification of a mtDNA mutation might be decisive for information and prognosis on the disease severity and development, or for reproductive choices. The Sanger sequence data of the 32 samples were not fully consistent with the results of the NGS analysis. Not surprisingly, higher sensitivity was obtained with the Ion Torrent PGM assay for low-level heteroplasmy as seven undetected (range 6–10%) heteroplasmic substitutions were remarked in five samples. Dideoxy sequencing is known to be all but a perfect reflection of in vivo low heteroplasmic alterations.4 Therefore, it is hard to comment on the specificity of this protocol for the detection of low-level length and point heteroplasmy. MPS strategies of mt genomes, including those of healthy individuals, with other NGS platforms reported also an elevated incidence of heteroplasmy variation in comparison with conventional sequencing.13, 14, 15, 16 Conversely, systematic comparison of the NGS results with conventional sequencing exposed both false-positive and -negative results. Indeed, 34 variants previously detected with Sanger sequencing were not identified. All but one are located at two homopolymeric sites of the D-loop, which have also shown a major pitfall for accurate base calling in a paper discussing the Ion Torrent PGM performance in a forensic context.10 Although the majority of nucleotide variants are often innocent polymorphisms or clinically insignificant, the missed call of the pathogenic single-nucleotide deletion with a 80% heteroplasmy at position m.7402 would have resulted in a misdiagnosis of a patient sample. False–positive calls outnumbered the false-negative calls largely, up to a point were only relaxing the stringent heteroplasmic threshold level of 2 to 5% balanced the numbers acceptably. Inadequate processing of homopolymeric regions is a well-known problem associated with the Ion Torrent PGM system.17 This burden, together with a 5% threshold error rate and an unequal coverage, might impede the system’s potential in its implementation as a diagnostic tool for mt genome analysis, at present. For budgetary reasons, we were not in the ability to test other software algorithms but the Variant Suite. In Parson et al10 the forensic mt genome data of imperfect homopolymer tracts in the D-loop were tested in parallel with both PGM supported and NextGENe pipelines. The sensitivity and specificity of the processed data was significantly distinct for both systems. Taken together, these results highlight the significance of software systems, and indicate an important potential for further improvement of their present performance. Simultaneous use of independent pipelines for the analysis of data is definitely a less attractive option from a diagnostic perspective, where turnaround time and cost-effective tests are key elements. A comprehensive analysis of the whole mt genome is essential in any optimal diagnostic work-up of patients suspected of a mt disorder. MPS strategies are offering us this opportunity within a reasonable time frame and at an affordable cost. Furthermore, extensive examination of the mt genomes of large cohorts of control and patient populations will contribute largely to refine further the spectrum of rare variants responsible for mtDNA disease and unusual but clinical irrelevant nucleotide diversity. NGS analysis of the human genomes is definitively here to stay, but the rate at which this technology will conquer the diagnostic mt genome bench will largely depend on the power of the analytical and bioinformatic procedures.