Respiratory complex and tissue lineage drive recurrent mutations in tumour mtDNA


Mitochondrial DNA (mtDNA) encodes protein subunits and translational machinery required for oxidative phosphorylation (OXPHOS). Using repurposed whole-exome sequencing data, in the present study we demonstrate that pathogenic mtDNA mutations arise in tumours at a rate comparable to those in the most common cancer driver genes. We identify OXPHOS complexes as critical determinants shaping somatic mtDNA mutation patterns across tumour lineages. Loss-of-function mutations accumulate at an elevated rate specifically in complex I and often arise at specific homopolymeric hotspots. In contrast, complex V is depleted of all non-synonymous mutations, suggesting that impairment of ATP synthesis and mitochondrial membrane potential dissipation are under negative selection. Common truncating mutations and rarer missense alleles are both associated with a pan-lineage transcriptional programme, even in cancer types where mtDNA mutations are comparatively rare. Pathogenic mutations of mtDNA are associated with substantial increases in overall survival of colorectal cancer patients, demonstrating a clear functional relationship between genotype and phenotype. The mitochondrial genome is therefore frequently and functionally disrupted across many cancers, with major implications for patient stratification, prognosis and therapeutic development.


Fig. 1: MtDNA mutations are among the most frequent genomic alterations in cancer.
Fig. 2: Truncating variants preferentially target CI.
Fig. 3: Non-truncating mtDNA mutations arise as rare recurrent alleles in protein-coding and RNA elements.
Fig. 4: Mitochondrial genotypes associated with transcriptional and clinical phenotypes.

Data availability

All data not available for download on public repositories as described in Methods are available on GitHub (

Code availability

R code to regenerate all figures is available on GitHub ( with the relevant data and instructions to execute the code.




We thank the members of the Reznik, Gammage and Taylor laboratories for discussion and support. We also thank L. Finley, K. Birsoy, J. Blaza, C. Winchester and N. Rusk for their feedback. A.N.G., M.K., W.K.C., A.A.H., M.F.B., B.S.T. and E.R. were supported by the National Cancer Institute (NCI) Cancer Center Support (grant no. P30 CA008748). W.K.C. was supported by a National Institutes of Health (NIH) award (no. T32 GM132083). K.C.L. was supported by an F31 Predoctoral Fellowship from the NCI (award no. 7F31CA247528-02). B.S.T. was supported by the NIH (award nos. U54 OD020355, R01 CA207244, R01 CA204749 and R01 CA245069), as well as the American Cancer Society, Anna Fuller Fund and the Josie Robertson Foundation. E.R. was supported by the Geoffrey Beene Cancer Research Center Grant Award, Department of Defense Kidney Cancer Research Program (no. W81XWH-18-1-0318), and a Kidney Cancer Association Young Investigator Award. P.A.G. was supported by core funding from CRUK BI (nos. A17196 and A31287).

Author information




A.N.G., P.A.G. and E.R. conceived the study. M.K., W.K.C., K.L., A.A.H., M.F.B. and B.S.T. assisted with genomic data analysis. A.N.G. analysed protein structures. A.N.G., P.A.G. and E.R. wrote the manuscript, with input from all authors.

Corresponding authors

Correspondence to Payam A. Gammage or Ed Reznik.

Ethics declarations

Competing interests

B.S.T. reports receiving honoraria and research funding from Genentech and Illumina, and advisory board activities for Boehringer Ingelheim and Loxo Oncology, a wholly owned subsidiary of Eli Lilly, Inc. All stated activities were outside the work described in the present study. He is currently an employee of Loxo Oncology. P.A.G. is a shareholder of Pretzel Therapeutics Inc. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Metabolism thanks Navdeep Chandel, Young Seok Ju and Carlos Moraes for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: George Caputa.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Baseline demographics of cohort and aspects of sequencing coverage.

a, The demographic distributions of patient age, race, gender, mtDNA somatic mutation status, history of neoadjuvant treatment, mtDNA coverage, and tumor sample type for each of the cancer types included in our analysis. Somatic mutation status is annotated among the subset of samples with ≥90% paired tumor-normal mtDNA sequencing coverage (see Methods: classifying sample mtDNA variant status); mtDNA status distributions are shown for cancer types with >10 such samples. Cancer types are ordered by increasing proportions of samples with VUS or truncating mtDNA mutations. b, Cancer type mtDNA coverage variation based on sequencing center. Center, the average percentage of mtDNA (among regions considered in our study) with sufficient coverage for calling mutations, compared between different cancer types in our cohort. Dot color indicates the sequencing center from which the exome sequencing data originate. Top, density histograms of the average % mtDNA coverage for each sequencing center. Samples sequenced at the Broad Institute are uniquely depleted for mtDNA off-target coverage. c, mtDNA coverage from off-target reads at each position. The number of samples for which the given mtDNA position was sequenced to at least 5 reads (top, the depth threshold used in our analyses) and 20 reads (bottom, for comparison). Red, the number of samples using unpaired tumor-only data, applicable only for protein-truncating variants, which were always assumed to be of somatic origin; blue, the number using only matched-normal samples; green, the number of samples with coverage in both tumor and matched-normal samples at the given position (applicable for all non-truncating variants, which required evidence that the variant was absent in the matched normal to be classified as somatic). Purple, the number of whole-genome sequenced samples available from ICGC/PCAWG for comparison. d, Proportion of samples with detectable mutations is not biased by cancer type sequencing coverage. There is no correlation between the fraction of well-covered samples in a cancer type and the proportion of well-covered samples with a detectable somatic mtDNA mutation. Cancer types with ≥30 well-covered samples are shown; P-value and 95% confidence intervals are from linear regression.

Extended Data Fig. 2 Strand-specific mutational signatures in our dataset.

The frequency of somatic SNVs on the light or heavy mtDNA strand with each of the 96 possible mutational signatures with trinucleotide contexts (among n = 3,872 SNVs). Blue bars indicate the prevalence of mutational signatures for heavy-strand encoded SNVs (substitutions at C or T central nucleotides); red bars indicate those for light-strand encoded SNVs (substitutions at G or A nucleotides, which were standardized to their C or T complementary nucleotide). The most prevalent mutational signatures are labeled. The underlined central position is mutated with the single nucleotide substitution labeled in the tile below.
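The standardization described in this caption, in which purine-centred substitutions are mapped to their pyrimidine complement so that every SNV falls into one of 96 trinucleotide classes, can be sketched as follows. This is a minimal illustration in Python, not the authors' R code; the function names and the returned flag are assumptions made for the example.

```python
from itertools import product
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string made of A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def standardize_snv(context: str, alt: str) -> Tuple[str, str, bool]:
    """Map an SNV to its pyrimidine-centred trinucleotide class.

    context: reference trinucleotide with the mutated base in the middle.
    alt: the alternate base at the central position.
    Returns (standardized context, substitution label, was_complemented).
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or any(b not in COMPLEMENT for b in context):
        raise ValueError("context must be three bases from A, C, G, T")
    if alt not in COMPLEMENT or alt == context[1]:
        raise ValueError("alt must be a base that differs from the reference")
    was_complemented = context[1] in "GA"
    if was_complemented:
        # Purine reference base: report the event on the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return context, f"{context[1]}>{alt}", was_complemented


# Enumerate every possible SNV and collect the distinct standardized classes.
all_classes = {
    standardize_snv("".join(ctx), alt)[:2]
    for ctx in product("ACGT", repeat=3)
    for alt in "ACGT"
    if alt != ctx[1]
}
```

The `was_complemented` flag records whether the reference base was a purine, which is the distinction the caption uses to separate the two strands.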

Extended Data Fig. 3 Analysis of mutation burden in normal tissues and of tumor mtDNA mutation burden with nuclear mutagenic processes.

a, Comparison of heteroplasmies between truncating variants detected in tumor tissue, adjacent normal tissue, and blood. P-values from two-sided Wilcoxon rank-sum test. Boxes are centered at the median and extend from the 25th to the 75th percentile; whiskers extend from the 25th/75th percentiles to the largest value within 1.5 × IQR (interquartile range, 75th percentile − 25th percentile). b, Rate of truncating variants in TCGA tumors compared to matched non-malignant tissue, matched blood, and unmatched saliva samples from HelixMTdb. Truncating variants arise at a 10- to 80-fold higher rate in tumors relative to normal tissues. Error bars are exact binomial 95% confidence intervals. P-values are from two-sided two-sample z-tests. c, The percentage of rescued truncating variants in TCGA that are recapitulated in orthogonal RNA sequencing from the same tumor sample. d, Correlation between heteroplasmies of rescued truncating variants in DNA and orthogonal RNA sequencing. Pearson correlation coefficient as shown. e, Mitochondrial and nuclear tumor mutation burdens (TMB, mutations/Mb) are shown for each well-covered tumor, among cancer types with n ≥ 100 samples. Nuclear TMBs are calculated based on mutations to 468 cancer-associated genes and their total exonic-sequence length. Pearson correlation coefficients r indicate that no linear correlation between mitochondrial and nuclear TMBs was observed for any cancer type tested. f, TMBs for somatic mtDNA mutations and mutations to cancer-associated genes are compared between microsatellite stable (MSS) and microsatellite unstable (MSI-High) tumors, for both colorectal cancer (n: MSI=65, MSS=318) and stomach adenocarcinoma (n: MSI=75, MSS=256). Although MSI-High tumors have elevated TMB for nuclear cancer genes, there is no effect on mtDNA TMB. Moreover, mtDNA TMB is similar to (or exceeds) that of nuclear cancer-associated genes in both cancer types. Error bars are 95% exact Poisson confidence intervals.
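The error bars in panel b are described as exact binomial 95% confidence intervals. One standard construction is the Clopper-Pearson interval, sketched below in Python under the assumption that this is the construction used; the caption does not name it. The function names are illustrative and the bounds are found by simple bisection rather than by a statistics library.

```python
import math
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect_decreasing(f: Callable[[float], float], target: float) -> float:
    """Find p in [0, 1] with f(p) == target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for k events observed among n samples."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper
```

A useful sanity check is the zero-event case, where the upper bound has the closed form 1 − (α/2)^(1/n).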

Extended Data Fig. 4 Age- and tumor stage-associations of somatic mtDNA mutations across cancer types.

Heatmap shows tumor mutation burden (total mutations/total covered Mbps) for samples of each tumor type (a) combined across varying patient age at time of diagnosis and (b) tumor pathologic stage. Gray tiles indicate cancer type/age combinations with fewer than 3 patients; cancer types shown had at least 2 non-gray tiles. Right column: Spearman correlation coefficient r indicating correlation between age or pathologic stage and tumor mutation burden. Asterisks denote statistically significant correlations based on FDR-corrected P-values from a Student’s t-distribution.
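The caption refers to FDR-corrected P-values without naming the correction. A minimal sketch of the Benjamini-Hochberg adjustment, which is a common choice and is assumed here for illustration, is given below in Python; the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each adjusted value is the smallest FDR level at which the corresponding test would be called significant, so a correlation would be starred when its adjusted value falls below the chosen threshold.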

Extended Data Fig. 5 Molecular features of truncating variants at homopolymeric loci.

a,b, Enrichment for truncating variants in CI and non-truncating variants in CIII when restricted to mutations with 20+ reads supporting the alternate allele. Error bars are 95% Poisson exact confidence intervals; P-values from two-sided Poisson tests. c, Comparison of frameshift indel homopolymer hotspots detected among indels supported by a minimum of 20 alt-reads (Y-axis) to those with a minimum support of 5 alt-reads (X-axis). d, Percentage of cases per cancer type with truncating frameshift indels at any of 6 indel hotspot loci. Plotted cancer types had ≥ 20 well-covered samples (n=4,432 paired tumor and matched-normal samples total). Bar height indicates the fraction of samples with any indels at homopolymer hotspots out of the total number of well-covered samples for the given cancer type; numbers above bars indicate the total number of cases. e, Validation of homopolymeric indel hotspot loci. The proportion of samples in TCGA (X-axis) or PCAWG (excluding samples also in TCGA, Y-axis) with frameshift indels at 73 homopolymeric regions. The 6 indel hotspot loci are colored red and labeled. y=x is shown as a dashed line. Pearson correlation coefficient r as indicated. f, Breakdown of homopolymer loci and their hotspot incidence rates by mitochondrial complex. Heatmap tile shading indicates overall mutation rate (total number of mutants across homopolymer loci divided by the total number of samples with sufficient sequencing coverage). Fractions in tile labels are the number of homopolymer hotspots divided by the total number of homopolymer loci. Right, histogram of the total number of loci with each homopolymer length. g, The percentage of all truncating variants that arose at 6 homopolymer hotspot loci in TCGA tumor samples and in saliva-derived normal samples from HelixMTdb. Error bars are 95% binomial confidence intervals.
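Homopolymer loci of the kind analysed in these panels are runs of a single repeated base. The Python sketch below shows one way such runs could be enumerated in a linear sequence. The minimum run length is a hypothetical parameter, since the caption does not state the threshold used to define the 73 loci, and the circularity of mtDNA is ignored.

```python
from itertools import groupby
from typing import List, Tuple


def homopolymer_runs(seq: str, min_len: int = 4) -> List[Tuple[int, str, int]]:
    """List homopolymer runs as (0-based start, base, length).

    min_len is an illustrative threshold, not one taken from the study.
    """
    runs = []
    position = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((position, base, length))
        position += length
    return runs
```

Tallying runs by length would give the kind of per-length locus count summarized in the histogram of panel f.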

Extended Data Fig. 6 Homopolymer hotspots for frameshift indels in TCGA and MSK-IMPACT cohorts.

a, Analysis of 73 homopolymer loci for enrichment of protein-truncating indels in TCGA samples is shown on the X-axis, and in MSK-IMPACT samples on the Y-axis. 5 out of 6 originally reported hotspot loci are enriched in both TCGA and MSK-IMPACT (green), while 1 was only enriched in TCGA samples (orange). Two additional candidate hotspot loci are unique to the MSK-IMPACT dataset (recurrently observed in TCGA, but not reaching statistical significance). b, Heatmap indicates the proportion of truncating indels at each of the 8 homopolymer hotspots to affect samples of different cancer types (that is, rows are proportions of samples with indels at a given homopolymer, summing to 1; clustered by hierarchical clustering). Right, histogram of the total number of affected samples in TCGA and MSK-IMPACT data. Note that the CIV hotspot discovered in MSK-IMPACT data preferentially arises in lung and prostate cancers.

Extended Data Fig. 7 Validation of VUS pathogenicity and tRNA mutation recurrence.

a, VUSs only observed in tumors are more likely to be pathogenic. Bars compare the clinical significance (annotated by ClinVar) of SNVs observed somatically in tumors but never in patients’ matched-normal samples against SNVs never observed in either tumor or matched-normal samples. P-value from a two-sided Cochran-Armitage trend test. b, Validation of tRNA structural hotspots in PCAWG. Bottom, the number of samples with SNVs in tRNAs at the indicated cloverleaf structural position; top, the statistical enrichment of the given position for mutations. Position 31 Q-value=0.014, n=196 tRNA mutations among 1,951 PCAWG samples.

Extended Data Fig. 8 mtDNA mutations produce transcriptional phenotypes.

a,b, Transcriptional dysregulation attributed to truncating (a) and VUS (b) mtDNA variants. Heatmaps show the directional significance of dysregulation of a given geneset in tumors with truncating or VUS mtDNA variants among the given cancer type; −log10(Q-value) > 2 indicates significant up-regulation in mutated compared to wild-type samples, < −2 indicates significant down-regulation. Histograms on the right show the number of wild-type samples and mutated samples used in calculating differentially expressed genes and dysregulated genesets. c, Difference in mtDNA mutation status between colorectal cancer consensus molecular subtypes. Left, the proportion of samples with wild-type mtDNA (that is, no somatic mutations), VUS (any non-truncating) or truncating variants among colorectal tumors with each consensus molecular subtype (CMS) is shown. Right, histogram of the number of well-covered colorectal tumors. There was a statistically significant difference in mtDNA mutation status between different CMS classifications (P=0.03, Chi-squared test, n=415 samples total, error bars are 95% exact binomial confidence intervals).

Extended Data Fig. 9 mtDNA mutations are protective in colorectal cancer patients in the MSK-IMPACT cohort.

a, Multivariate survival analysis based on Cox proportional hazards regression demonstrating the effect of VUS or truncating mtDNA mutations (relative to wild-type) on colorectal cancer patient overall survival in the MSK-IMPACT cohort. b, Same as in (a) but treating VUS and truncating mtDNA mutations as a single class compared to wild-type. Error bars are 95% confidence intervals from Cox proportional-hazards regression, n=172 MSK-IMPACT patients.

Extended Data Fig. 10 Repurposing whole-exome and clinical sequencing data optimizes sample size at the expense of sensitivity for low-heteroplasmy variants.

a, The number of different classes of mtDNA variants detected from either repurposed TCGA samples using our approach, or using only whole-genome sequenced tumors from PCAWG, stratified by heteroplasmy < 30% or ≥ 30%. Labels above bars indicate the exact number. b, Comparison of the difference in gene expression between (1) high-heteroplasmy truncating mutations and wild-type tumors (X-axis) and (2) low-heteroplasmy truncating mutations and wild-type tumors (Y-axis). c, Strengths and use-cases of three common tumor DNA sequencing modalities for mtDNA mutation analysis. WGS-based approaches are optimal for studying low-heteroplasmy variants and identifying structural variants and mtDNA copy number, while whole-exome and targeted gene-panel-based approaches optimize sample size, detection of recurrent variants, and clinical associations.
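Panel a stratifies variants at a heteroplasmy of 30%. As a rough illustration, heteroplasmy can be estimated as the fraction of reads at a position that support the variant allele. This read-fraction definition is an assumption of the sketch below, and the function names are hypothetical.

```python
def heteroplasmy(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("require total_reads > 0 and 0 <= alt_reads <= total_reads")
    return alt_reads / total_reads


def heteroplasmy_stratum(alt_reads: int, total_reads: int,
                         threshold: float = 0.30) -> str:
    """Label a variant 'high' (>= threshold) or 'low' (< threshold)."""
    return "high" if heteroplasmy(alt_reads, total_reads) >= threshold else "low"
```

At the shallow off-target depths typical of exome data, few reads are expected to support a low-heteroplasmy variant, which is consistent with the caption's point that repurposed exome data trade sensitivity for sample size.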

Supplementary information


Reporting Summary

Supplementary Tables

Supplementary Tables 1–6

Cite this article

Gorelick, A.N., Kim, M., Chatila, W.K. et al. Respiratory complex and tissue lineage drive recurrent mutations in tumour mtDNA. Nat Metab 3, 558–570 (2021).
