Single-cell mtDNA dynamics in tumors is driven by coregulation of nuclear and mitochondrial genomes

Kim, Minsoo; Gorelick, Alexander N.; Vàzquez-García, Ignacio; Williams, Marc J.; Salehi, Sohrab; Shi, Hongyu; Weiner, Adam C.; Ceglia, Nick; Funnell, Tyler; Park, Tricia; Boscenco, Sonia; O’Flanagan, Ciara H.; Jiang, Hui; Grewal, Diljot; Tang, Cerise; Rusk, Nicole; Gammage, Payam A.; McPherson, Andrew; Aparicio, Sam; Shah, Sohrab P.; Reznik, Ed

doi:10.1038/s41588-024-01724-8

Article
Open access
Published: 13 May 2024

Nature Genetics volume 56, pages 889–899 (2024)

Abstract

The extent of cell-to-cell variation in tumor mitochondrial DNA (mtDNA) copy number and genotype, and the phenotypic and evolutionary consequences of such variation, are poorly characterized. Here we use amplification-free single-cell whole-genome sequencing (Direct Library Prep (DLP+)) to simultaneously assay mtDNA copy number and nuclear DNA (nuDNA) in 72,275 single cells derived from immortalized cell lines, patient-derived xenografts and primary human tumors. Cells typically contained thousands of mtDNA copies, but variation in mtDNA copy number was extensive and strongly associated with cell size. Pervasive whole-genome doubling events in nuDNA associated with stoichiometrically balanced adaptations in mtDNA copy number, implying that mtDNA-to-nuDNA ratio, rather than mtDNA copy number itself, mediated downstream phenotypes. Finally, multimodal analysis of DLP+ and single-cell RNA sequencing identified both somatic loss-of-function and germline noncoding variants in mtDNA linked to heteroplasmy-dependent changes in mtDNA copy number and mitochondrial transcription, revealing phenotypic adaptations to disrupted nuclear/mitochondrial balance.

Main

Tumors commonly accumulate mutations and copy number alterations to mitochondrial DNA (mtDNA)^1,2. The functional effects of these genetic changes on cell metabolism^3,4, apoptotic potential^5,6, innate immunity⁷ and other phenotypes depend on at least the following two key factors: the fraction of mutated mitochondrial genomes in the cell (heteroplasmy) and the total number of mtDNAs in the cell (mtDNA copy number)^8,9. Furthermore, because mtDNA mutations normally arise over the course of human development, somatic cell division, aging and tumorigenesis, mtDNA genotypes are nonrandomly distributed across cells and consequently display potentially large cell-to-cell variation^2,10,11.

The prevalence of intracellular and intercellular variability in mtDNA genotype represents both a critical confounder to the characterization of phenotypes associated with mtDNA mutations and an effective cell-endogenous mutational barcode for tracing ongoing somatic evolution¹². To date, several techniques such as single-cell RNA sequencing (scRNA-seq) and single-cell transposase-accessible chromatin sequencing (scATAC-seq) have been applied to measure mtDNA genotypes across tumors, focusing exclusively on the detection of somatic mutations (as opposed to mtDNA copy number) for use as cell-endogenous lineage markers^13,14,15,16. These methods typically require DNA amplification or other approaches to library preparation that inhibit accurate quantification of the absolute mtDNA copy number in a single cell. Yet, the total number of wild-type mtDNA copies, which is determined jointly by heteroplasmy and the total mtDNA copy number, is a key property for understanding the genotype-phenotype map of pathogenic mtDNA mutations^8,17,18. A comprehensive understanding of mtDNA genotypic variability, evolution and functional consequences therefore requires joint measurement of genotype and absolute copy number.

We previously developed a single-cell whole-genome sequencing (scWGS) platform called Direct Library Prep (DLP+) to study genome plasticity, cell-to-cell variation and clonal evolution driven by copy number alterations of the nuclear genomes of human cancers and model systems^19,20,21. Because DLP+ is amplification-free and mtDNAs exist in multiple copies within each cell, it uniquely enables the simultaneous, high-fidelity interrogation of mtDNA genotype, mtDNA copy number and nuclear DNA (nuDNA) genotype across single cells. Here we analyzed DLP+ data of 72,275 single cells from engineered breast epithelial cell lines, patient-derived xenograft models of triple-negative breast cancer (TNBC) and high-grade serous ovarian cancer (HGSC) and primary HGSC tumors. Through the application of computational methods to this unique collection of single-cell genomes, we interrogated the regulatory architecture that quantitatively connects single-cell variation in mitochondrial and nuclear genotypes to downstream phenotypes.

Results

Per-cell mtDNA copy number quantification by DLP+

To study mtDNA copy number and heteroplasmy jointly at single-cell resolution, we collected scWGS (DLP+) libraries from a variety of distinct biological settings covering nontransformed cell lines, patient-derived xenografts (PDXs) and primary human tumors (Fig. 1a,b and Supplementary Table 1). These data included previously published^19,20,21 sequencing of cell lines from (1) GM18507 diploid lymphoblastoid cell line (n = 3,203 cells), (2) nontransformed 184-hTERT mammary epithelial cell line (n = 4,011 cells), (3) four TP53^−/− 184-hTERT cell lines (n = 30,012 cells), (4) engineered TP53^−/−;BRCA2^+/− 184-hTERT cell line (n = 2,012 cells), (5) two TP53^−/−;BRCA2^−/− 184-hTERT cell lines (n = 1,056 cells), (6) TP53^−/−;BRCA1^+/− 184-hTERT cell line (n = 463 cells) and (7) TP53^−/−;BRCA1^−/− 184-hTERT cell line (n = 430 cells), as well as the ovarian cancer cell line OV2295 (n = 573 cells), cervical cancer cell line HeLa (n = 507 cells) and HER2+ breast cancer cell line T-47D (n = 2,534 cells). Furthermore, our dataset included 12 different PDX models of TNBC (n = 23,466 cells), three of which were cisplatin-treated (n = 7,300 cells), one HGSC PDX (n = 38 cells) and five primary HGSC tumors (n = 4,150 cells) including two newly sequenced surgical resections for a total of 32 distinct samples. For 18 of these samples (eight cell line samples, seven PDX samples and three primary tumor samples), matching scRNA-seq from the same sample was available. Many samples include multiple sequencing libraries performed at different time points as part of a serial passaging experiment, resulting in 127 distinct libraries (median 507 cells per library).

**Fig. 1: Overview of the data and coverage information.**

We first compared the coverage of mtDNA in DLP+ and matching 3′-enriched 10× scRNA-seq. Reads aligning to mtDNA were abundant across all cells and, in contrast to the scRNA-seq, covered the entire mitochondrial genome (Fig. 1c). In total, 93.96% of the mitochondrial genome had higher coverage in DLP+ compared to scRNA-seq, enabling comparatively robust mtDNA variant calling and mtDNA copy number estimation directly from primary DLP+ sequencing data. Read depth per cell and the relative capture efficiency of mtDNA/nuDNA had a low correlation, suggesting minimal technical bias derived from sequencing depth in calculating mtDNA copy number (R = 0.03, P < 10⁻¹⁵; Extended Data Fig. 1a). Notably, DLP+ data were both broader and deeper compared to scRNA-seq data from the same sample in seven PDX (Kolmogorov–Smirnov test, all P < 2.2 × 10⁻¹⁶; Fig. 1d,e).

Unlike most other single-cell DNA sequencing technologies, DLP+ does not use pre-amplification, enabling relatively unbiased quantification of both nuclear and mtDNA copy numbers at single-cell resolution. Following prior work^2,22, we estimated mtDNA copy number by comparing the read depth of mtDNA- and nuDNA-aligned reads and calibrating mtDNA ploidy to the baseline ploidy of nuDNA. Lymphoblastoid GM18507 cells typically contained 756 copies of mtDNA per cell (25th and 75th percentiles: 575 and 999, respectively) with highly robust and reproducible mtDNA copy number estimates across sequencing libraries (Extended Data Fig. 1b). In silico downsampling of one of the libraries deeply sequenced with a median mtDNA read depth of 79× per cell to a median mtDNA read depth of 8× per cell indicated stable estimation of mtDNA copy number and heteroplasmic mtDNA variant calling down to 30% of the original sequencing depth (Extended Data Fig. 1c–g). The presence of ~1,000 copies of mtDNA per cell is consistent with lower-throughput digital droplet polymerase chain reaction estimates of single-cell mtDNA copy number²³. Together with the reproducibility of such estimates across sequencing libraries, these analyses establish DLP+ as a robust high-throughput assay for single-cell mtDNA copy number quantification.
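The depth-ratio idea described above can be illustrated with a minimal sketch. It assumes that mtDNA copy number is the ratio of mean mtDNA depth to mean nuDNA depth, scaled by the cell's nuclear ploidy; the function name, argument names and example values are hypothetical and are not taken from the DLP+ pipeline.

```python
def estimate_mtdna_copy_number(mt_depth, nuclear_depth, nuclear_ploidy):
    """Estimate per-cell mtDNA copies from read depth (illustrative only).

    mt_depth:        mean read depth over the mitochondrial genome
    nuclear_depth:   mean read depth over the nuclear genome
    nuclear_ploidy:  average nuclear copy number of the cell (2.0 if diploid)
    """
    if nuclear_depth <= 0 or nuclear_ploidy <= 0:
        raise ValueError("nuclear_depth and nuclear_ploidy must be positive")
    # Depth contributed by a single copy of the nuclear genome
    depth_per_nuclear_copy = nuclear_depth / nuclear_ploidy
    # Number of mtDNA copies needed to explain the observed mtDNA depth
    return mt_depth / depth_per_nuclear_copy


# Hypothetical diploid cell: 40x mtDNA depth, 0.1x nuclear depth
copies = estimate_mtdna_copy_number(mt_depth=40.0, nuclear_depth=0.1, nuclear_ploidy=2.0)
print(round(copies))
```

Because the estimate is a ratio of depths within one cell, it is expected to be largely insensitive to the total number of reads sequenced for that cell, which matches the low correlation with read depth reported above.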

mtDNA copy number correlates with cell size

We analyzed DLP+ sequencing data of treatment-naive samples along with 3,203 GM18507 diploid lymphoblastoid cells, included in several DLP+ runs as controls for nuDNA copy number estimation, for a total of 55,930 cells (Methods; Supplementary Table 1). Median copy number across these diverse cells varied from 531 in the TNBC PDX model SA1142 to 3,274 in the BRCA1^−/−;TP53^−/− 184-hTERT cell line sample SA1054 (Fig. 2a). Per-cell mtDNA copy number estimates were reproducible across technical replicates and exhibited a high degree of temporal stability across multiple 184-hTERT cell lines (Fig. 2b–d and Extended Data Fig. 2a). In contrast to population-level stability in mtDNA copy number, cell-to-cell variation in any single library was substantial (Fig. 2a). Most libraries exhibited a typical coefficient of variation of 0.65, consistent with observations in embryos and parathyroid^24,25 and the per-sample variation observed in Pan-Cancer Analysis of Whole Genomes (PCAWG) bulk whole genomes of the corresponding cancer type² (Extended Data Fig. 2b).

**Fig. 2: Relative increase in mtDNA copy number with larger cell size.**

Next, we used mtDNA copy number quantification from DLP+ to interrogate cell-type-specific mtDNA copy number levels in both malignant and nonmalignant cells from the tumor microenvironment. Prior analysis of mtDNA copy number levels in tumors has focused on comparing estimates of mtDNA copy number from bulk tumor sequencing to matched adjacent-normal tissue¹, potentially conflating changes in cellular composition with tumor-cell-intrinsic adaptations in mtDNA copy number. In four primary HGSC tumors with DLP+, we were able to identify both malignant and nonmalignant (corresponding to a mixture of stromal and immune) cells on the basis of nuDNA copy number profiles and found that malignant cells displayed a significantly higher mtDNA copy number (log₂(fold change) = 1.3–3.0; all P < 10⁻¹⁴; Extended Data Fig. 2c). These data indicate that mtDNA copy number is elevated in tumor cells relative to colocalized nontumor cells in TNBC and HGSC. Further conclusions from this analysis, however, are limited by the inability to definitively distinguish nontransformed cells of a common cell-of-origin to HGSC from other nontransformed cells such as immune cells.

As mitochondria provide anabolic substrates for both cellular maintenance and proliferation^26,27,28,29, we hypothesized that cell-to-cell variation in mtDNA copy number within clonally related cells may reflect bona fide variation in the energetic and anabolic demands of single cells^30,31. Such changes in anabolic demand might, for example, result from normal variation in cell size, which has been previously posited in the literature and recently quantified in budding yeast^24,25,32. We analyzed coregistered bright field images from the DLP+ platform (n = 4,011 184-hTERT breast epithelial cells, n = 26,024 of eight 184-hTERT-derived cell lines and n = 1,731 GM18507 diploid lymphoblastoid cells) and correlated estimates of cell size from these images with single-cell mtDNA copy number. The diameter of diploid cells ranged from 10.43 µm to 50.38 µm and varied significantly according to lineage (Extended Data Fig. 2d). This corresponded to an approximate 24 and 29.2 mtDNA copy number increase per micron, respectively (Extended Data Fig. 2e). In total, 46/52 sequencing libraries (covering 11 distinct cell lines) demonstrated a statistically significant positive correlation between cell size and mtDNA copy number (Pearson correlation, Q < 0.05; Fig. 2e,g), corroborating previous studies in budding yeast^27,32,33. We then studied tumor cells, analyzing 5,476 images of cells across 40 libraries of six TNBC PDX models and 4,005 images across nine libraries of five primary HGSC samples. In total, 20/24 sequencing libraries showed a significant positive correlation between cell diameter and mtDNA copy number (Fig. 2f,h). We also correlated the mtDNA-to-nuDNA ratio (MNR), that is, the number of copies of mtDNA per average haploid nuclear genome, against cell diameter, and found statistically significant results across conditions (Extended Data Fig. 2f,g). 
These findings indicate that, in both cultured cells and human tumors, cell-to-cell variation in mtDNA copy number is associated with a biophysical adaptation in cell size.
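The per-library test reported above (Pearson correlation, Q < 0.05) can be sketched as follows. The text does not state which multiple-testing correction produced the Q values, so the Benjamini–Hochberg procedure used here is an assumption, and the P values in the example are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted Q values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic Q values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical per-library P values from diameter vs. copy number tests
library_p = [0.01, 0.04, 0.03, 0.5]
library_q = benjamini_hochberg(library_p)
significant = [q < 0.05 for q in library_q]
```

In this sketch, one library would be called significant only if its adjusted Q value falls below 0.05, mirroring how the counts of significant libraries are reported.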

Stoichiometric adaptation of mtDNA copy number to whole-genome doubling (WGD)

We hypothesized that somatic alterations in the nuclear genome, and especially large-scale changes to total copy number, might contribute to the extensive variation in mtDNA copy number observed in Fig. 2a. In particular, we anticipated that WGD events, which have previously been associated with large metabolic changes and increases in cell size and are common in TNBC and HGSC, may be major contributors to mtDNA copy number variation in any given sample^34,35. WGD was a readily identifiable and frequent event in DLP+ data: we observed WGD in an average of 13% of all sequenced cells from cell lines, 4.7% of all cells from sequenced PDXs and 18% of all cells from sequenced primary tumors (Extended Data Fig. 3a). Interestingly, there was only a small difference in the number of mtDNA variants in diploid and tetraploid cells (Supplementary Table 4). On the other hand, tetraploid cells had significantly higher mtDNA copy numbers than diploid cells across all cell lines, PDX models and primary tumor samples (two-sample Wilcoxon test, all P < 10⁻¹⁵; Fig. 3a and Extended Data Fig. 3b).

**Fig. 3: MNR homeostatically balances with WGD but exhibits clone-specific differences.**

Because coordinated transcription between the nuclear and mitochondrial genomes is necessary for proper stoichiometric assembly of respiratory complexes^36,37,38, we tested whether the mtDNA copy number would increase in direct proportion to the ploidy of the nuclear genome³⁹ (Methods). To do so, we investigated how MNR varies in tetraploid versus diploid populations of related subclones in a common sequencing library. In parental 184-hTERT cells, the MNR difference between diploid and tetraploid cells was negligible (log₂(fold change) = 7.5 × 10⁻³, P = 0.067; Extended Data Fig. 3c). Similar marginal differences in MNR were observed between tetraploid and diploid cells in most 184-hTERT-derived lines, with the exception of BRCA1-null 184-hTERT cells that exhibited 31% (SA1292) and 15% (SA1054) increases in MNR in tetraploid cells relative to diploid cells (one-sided Wilcoxon test, P = 9.3 × 10⁻⁴ and P = 2.4 × 10⁻⁷, respectively; Fig. 3b). We observed a similar tendency for preservation or small increases in MNR in tetraploid cells relative to diploid cells in the majority of eight PDX and primary tumor samples with sufficient tetraploid/diploid cells for analysis (percent change −6.3% to 12.6%; 2/8 samples statistically significant: one-sided Wilcoxon test, both P < 0.025; Fig. 3c). These data establish that for the majority of samples, tetraploid cells and diploid cells derived from a common progenitor contain roughly equal numbers of mtDNA copies per haploid genome. This dosage homeostasis between mtDNA and nuDNA is remarkable, especially because mtDNA replication has generally been thought to be uncoupled from nuDNA replication^40,41; however, recent studies suggest additional preferential mtDNA replication during S phase^42,43.
Through fluorescence-activated cell sorting (FACS)-based isolation of cells from T-47D breast cancer and GM18507 lymphoblastoid cell lines, we investigated the relationship between absolute mtDNA copy number and nuDNA ploidy across cell cycle phases and found that mtDNA was preferentially replicated in the S phase. This resulted in a higher mtDNA copy number, but not MNR, in the G2 and S phases compared to the G1 phase (Extended Data Fig. 3d,e). Additionally, while the mtDNA copy number was approximately doubled in the presence of a WGD, the increase in cell diameter was not as pronounced, indicating that MNR homeostasis is not completely explained by adaptations in cell size (Extended Data Fig. 3f,g). Together, these results suggest that a combination of passive and active mechanisms homeostatically coordinate absolute mtDNA copy number and nuDNA ploidy. We subsequently focused on investigating the factors driving exceptions to this phenomenon.
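To make the stoichiometric argument concrete, the sketch below computes MNR as mtDNA copies per average haploid nuclear genome, following the definition given in the text, and compares tetraploid with diploid cells by the log₂ ratio of median MNR. Taking nuclear ploidy as the number of haploid genome equivalents, the helper names and the per-cell values are illustrative assumptions.

```python
import math
from statistics import median


def mnr(mtdna_copies, nuclear_ploidy):
    """mtDNA-to-nuDNA ratio: mtDNA copies per average haploid nuclear genome."""
    return mtdna_copies / nuclear_ploidy


def log2_fold_change(group_a, group_b):
    """log2 of the ratio of medians (group_a over group_b)."""
    return math.log2(median(group_a) / median(group_b))


# Hypothetical cells: tetraploid cells carry twice the mtDNA of diploid cells
diploid_mnr = [mnr(c, 2.0) for c in (800, 1000, 1200)]
tetraploid_mnr = [mnr(c, 4.0) for c in (1600, 2000, 2400)]

lfc = log2_fold_change(tetraploid_mnr, diploid_mnr)
print(lfc)  # 0.0 when mtDNA copy number scales exactly with ploidy
```

Under perfect dosage homeostasis the absolute copy number doubles with WGD while the log₂ fold change in MNR stays at zero, which is the pattern described for most samples.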

To better understand why some samples exhibited large increases in MNR in tetraploid cells relative to diploid cells, we investigated in detail the TP53^−/− 184-hTERT sample SA906a and the primary HGSC tumor SPECTRUM-OV-081, both of which demonstrated large, statistically significant differences in tetraploid versus diploid MNR (9.35% and 12.6%, respectively; one-sided Wilcoxon test, both P < 1.2 × 10⁻⁴). We hypothesized that, in these samples, high levels of clonal diversification produced clones with distinct MNR that could indirectly produce an apparent difference between MNR in diploid and tetraploid cells. To test this hypothesis, we ran HDBSCAN⁴⁴ to detect clusters of cells with similar nuDNA copy number profiles and assigned each cell to a specific clone. Somatic mtDNA variants determined to be informative based on a Bayesian clonal assignment model were present in both diploid and tetraploid cells of the same clones, confirming the presence of both diploid and tetraploid cells within a clone (Methods; Extended Data Fig. 3h–k). Consistent with our hypothesis, we observed substantial differences in the MNR of clone A in SA906a, which is primarily distinguished by the presence of a MYC focal amplification, compared to ancestral clone D (log₂(fold change) = 0.5, Wilcoxon test, P < 2.2 × 10⁻¹⁶). Notably, diploid and tetraploid cells in the same clone demonstrated indistinguishable MNRs, whereas the differences in MNR between ploidy-matched diploid cells across clones A and D were large and statistically significant (log₂(MNR) of 0.5, P < 10⁻¹⁵; Fig. 3d,e). A similar effect was observed in the primary tumor sample SPECTRUM-OV-081, where the clonal differences in MNR (for example, in exceptionally high MNR in clone C) dominated intraclone differences in diploid and tetraploid cells (Fig. 3f,g). 
These data indicate that clonal diversification can drive apparent differences in MNR between diploid and tetraploid cells, and when clonal identity is controlled, diploid and tetraploid cells demonstrate equivalent MNR levels.
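The confounding described here can be reproduced with a small toy example: if a high-MNR clone happens to contain mostly tetraploid cells, pooling all cells produces an apparent ploidy effect even when diploid and tetraploid cells within each clone have identical MNR. The clone labels and values below are hypothetical and chosen only to illustrate the effect.

```python
from collections import defaultdict
from statistics import median


def median_mnr_by(cells, *keys):
    """Median MNR for each combination of the given cell attributes."""
    groups = defaultdict(list)
    for cell in cells:
        groups[tuple(cell[k] for k in keys)].append(cell["mnr"])
    return {group: median(values) for group, values in groups.items()}


# Hypothetical library: clone A has high MNR and is mostly tetraploid,
# clone D has low MNR and is mostly diploid.
cells = (
    [{"clone": "A", "ploidy": 2, "mnr": 700.0} for _ in range(1)]
    + [{"clone": "A", "ploidy": 4, "mnr": 700.0} for _ in range(3)]
    + [{"clone": "D", "ploidy": 2, "mnr": 500.0} for _ in range(3)]
    + [{"clone": "D", "ploidy": 4, "mnr": 500.0} for _ in range(1)]
)

pooled = median_mnr_by(cells, "ploidy")               # ignores clone identity
stratified = median_mnr_by(cells, "clone", "ploidy")  # controls for clone
```

In this toy library the pooled medians differ between ploidy states, whereas the clone-stratified medians are identical within each clone, so the apparent ploidy effect is entirely a consequence of clonal composition.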

High MNR increases interferon (IFN) response and depletes hypoxic gene expression

We next asked if clone-specific differences in MNR elicited phenotypic consequences. We computationally assigned cells in scRNA-seq to clones identified from matched DLP+ using TreeAlign⁴⁵ across samples with both DLP+ and matching scRNA-seq data (Methods). We then compared mtDNA-encoded gene expression patterns of clones with the highest MNR to those with the lowest MNR. For instance, HGSC primary tumor SPECTRUM-OV-022 contained eight closely clustered clones (A, C, D, E, G, I, J and K; Fig. 4a,b). Clone A, which had a high MNR, had higher expression of mtDNA-encoded MT-CO2 compared to clone I, which had the lowest MNR (Fig. 4c). A similar pattern was observed in SPECTRUM-OV-081, which showed three clones in the UMAP: clones A, B and C (Fig. 4d,e). Clone C (highest MNR) had higher MT-ND3 expression compared to clone B (lowest MNR; Fig. 4f). We then expanded this analysis across all tumors, comparing tumor subclones for cases with large clonal differences in MNR (log₂(MNR) > 0.15). For each of the three tumor samples with sufficiently large differences in MNR across clones, we compared the transcriptional profiles of cells in clones with the maximal and minimal MNR (including one PDX and two primary HGSC tumors), observing that transcription of mtDNA-encoded genes was significantly higher in MNR-high clones compared to MNR-low clones (Extended Data Fig. 4a). Similarly, we observed enrichment in mtDNA-encoded gene expression for MNR-high clones for cell lines (Extended Data Fig. 4b). While an association between MNR and mtDNA expression has been suggested in earlier work^30,46, these data directly connect subclonal variation in MNR to mtDNA-encoded gene expression.

**Fig. 4: Mapping clones between DLP+ and scRNA-seq reveals that high MNR leads to enrichment in IFN signaling and depletion in the hypoxia pathway.**

To more granularly understand the association between MNR and non-mtDNA-encoded gene expression, we undertook a pathway enrichment analysis. Pathway analysis on the three tumor samples with matched DLP+ and scRNA-seq using differential expression between high and low MNR clones identified 8/51 Molecular Signatures Database (MSigDB) hallmark gene sets with recurrent enrichment/depletion, including elevated expression of the mtDNA oxidative phosphorylation (OXPHOS) pathway and innate immune-related pathways (Fig. 4g). Interestingly, only SPECTRUM-OV-022 exhibited statistically significant enrichment in nuDNA-encoded OXPHOS in the same direction as mtDNA-encoded OXPHOS, arguing against MNR as a dominant regulator of nuDNA-encoded OXPHOS transcription. Instead, high MNR clones exhibited a recurrent depletion in hypoxic gene expression: in both SPECTRUM-OV-022 and SPECTRUM-OV-081, high MNR clones (OV-022 clone A; OV-081 clone C) had significantly lower PROGENy hypoxia enrichment scores than low MNR clones (SPECTRUM-OV-022 clone I; SPECTRUM-OV-081 clone B; Wilcoxon test; OV-022, P = 0.0064, OV-081, P = 7.9 × 10⁻⁶; Fig. 4h,i). Variation in MNR in vivo is thus primarily associated with changes to mtDNA-encoded, but not nuclear-DNA-encoded, OXPHOS expression, as well as transcriptional adaptations to nuDNA-encoded metabolic pathways.

Dosage-dependent mtDNA variant effects on mtDNA copy number

Recently, a genome-wide association study (GWAS) of variation in mtDNA copy number in whole blood reported that certain germline mtDNA insertions, including those affecting the length of a homopolymeric block at m.302 associated with the balance of mtDNA replication and transcription, could potentially regulate mtDNA copy number^47,48,49. Interestingly, this study revealed (using scATAC-seq) that individual cells from the same patient often exhibited different heteroplasmic levels of such insertions, suggesting that cell-to-cell variation in mtDNA genotype could, in cis, drive variation in mtDNA copy number levels. Because of the unique ability of DLP+ to simultaneously track mtDNA copy number and genotype in phenotypically distinct tumor cells, we evaluated if DLP+ could identify the length heteroplasmy at m.302 and, if so, test the hypothesis that the length heteroplasmy is associated with changes in single-cell mtDNA copy number. For each of the 32 samples in our dataset, we genotyped the mtDNA of individual cells (Methods) and identified cells with homopolymeric insertions at m.302. We identified a single sample (SA1047) with a sufficient number of cells for subsequent analysis (at least 20 diploid cells with minimum coverage of 10 reads at position 302; Fig. 5a). Considering diploid cells only to avoid any confounding effects associated with nuDNA, we quantified both mtDNA copy number and the heteroplasmy of the reference allele (m.302A) and evaluated the association between the two. This analysis revealed that, consistent with the bulk GWAS data, cells with the reference allele (m.302A) demonstrated elevated mtDNA copy number (Wilcoxon test, P = 0.025; Fig. 5b), and m.302A heteroplasmy was associated with higher mtDNA copy number (Pearson correlation, R = 0.17 and P = 0.018; Fig. 5c), indicating that mtDNA genotype itself may modulate mtDNA copy number levels.
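The per-cell genotyping step can be sketched as below, assuming heteroplasmy is the fraction of reads at a position that support a given allele. The minimum coverage of 10 reads and the restriction to diploid cells follow the text; the field names and example cells are hypothetical.

```python
def heteroplasmy(allele_reads, total_reads, min_coverage=10):
    """Fraction of reads carrying the allele, or None if coverage is too low."""
    if total_reads < min_coverage:
        return None
    return allele_reads / total_reads


def usable_cells(cells, min_coverage=10):
    """Return (heteroplasmy, mtDNA copy number) pairs for covered diploid cells."""
    pairs = []
    for cell in cells:
        h = heteroplasmy(cell["ref_reads"], cell["total_reads"], min_coverage)
        if h is not None and cell["ploidy"] == 2:
            pairs.append((h, cell["mtdna_copies"]))
    return pairs


# Hypothetical cells genotyped at a single mtDNA position
cells = [
    {"ref_reads": 8, "total_reads": 10, "ploidy": 2, "mtdna_copies": 1200},
    {"ref_reads": 3, "total_reads": 5, "ploidy": 2, "mtdna_copies": 900},   # low coverage
    {"ref_reads": 6, "total_reads": 12, "ploidy": 4, "mtdna_copies": 2100}, # not diploid
]
pairs = usable_cells(cells)
```

The resulting pairs are the inputs one would correlate to ask whether heteroplasmy of the reference allele tracks with mtDNA copy number across cells of a sample.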

**Fig. 5: Genetic perturbation in mtDNA can elicit a dosage-dependent increase in mtDNA copy number.**

We next analyzed truncating mutations in mtDNA, which arise in approximately 20% of all cancers and are thought to impair mitochondrial respiration^2,7,50. Prior studies have suggested that somatic truncating mtDNA mutations can also elicit increases in mtDNA copy number^51,52,53. However, because prior analyses were undertaken from bulk sequencing data, there remains little understanding of the adaptive response of mtDNA copy number to single-cell variation in heteroplasmy. Analysis of mtDNA copy number and truncating mtDNA mutations in 11,691 total cells across seven distinct cell lines, five PDX samples and five primary tumor samples identified 23 truncating mutation events spanning across 19 distinct genomic positions and 20 silent mutations spanning across 20 distinct genomic positions (truncating variants shown in Fig. 5d). Consistent with prior reports^7,54, these mutations predominantly affected complex I subunits at homopolymeric hotspots (for example, m.12417).

Among the 23 truncating variants across both cell lines and tumors, we identified a statistically significant association between the heteroplasmy of m.6708G>A (encoding a complex IV truncating mutation) and mtDNA copy number (Q = 2.1 × 10⁻²; Fig. 5e). The pathogenicity and clinical significance of this variant have been reported previously in mitochondrial myopathy and rhabdomyolysis⁵⁵, confirming our prediction that somatic truncating mutations in mtDNA can have deleterious effects on the cellular fitness in the form of increased mtDNA copy number. We corroborated the presence of m.6708G>A in matched scRNA-seq data (Fig. 5f,g) and, consistent with the positive correlation between heteroplasmy and mtDNA copy number observed in DLP+ (Fig. 5h), the heteroplasmy of m.6708G>A in scRNA-seq data was positively associated with the expression of mtDNA-encoded genes (Pearson correlation against mtDNA copy number, P < 2.2 × 10⁻¹⁶; against MT-ND3 expression, P < 0.001; Fig. 5i–k). In contrast, we found no statistically significant association between heteroplasmy and mtDNA copy number levels among 20 silent mutations. Finally, we also evaluated the association between the heteroplasmy of 130 nontruncating mitochondrial variants, the vast majority of which were variants of unknown significance, and mtDNA copy number. This identified two variants (m.822G>A, affecting a nearly universally conserved locus of MT-RNR1, and m.10197G>A, a confirmed-pathogenic allele causing Leigh disease^56,57) whose heteroplasmy significantly associated with elevated mtDNA copy number, implicating these mutations as putative modifiers of resting mtDNA copy number. A fourth variant of unknown significance (m.1150G>A, also affecting MT-RNR1 and universally conserved in the human germline) was observed in two TP53^−/− 184-hTERT cell lines and associated with decreased mtDNA copy number. 
These data establish that single cells adapt to some pathogenic mtDNA mutations, but not silent mutations, by increasing mtDNA copy number in a heteroplasmy/dosage-dependent manner.

Discussion

Although the few proteins encoded by mtDNA are essential to normal cellular metabolism and physiology, both mtDNA copy number and genotype can vary dramatically across otherwise isogenic populations of cells. Neither the regulatory principles controlling this cell-to-cell variation nor the phenotypes arising from variation to mtDNA copy number in individual cells are well-understood. By applying DLP+ to simultaneously characterize mtDNA copy number, mtDNA genotype and nuDNA genotype in >72,000 cells, we were able to carry out scaled analyses of the biophysical, evolutionary and phenotypic consequences of cell-to-cell variation in mtDNA copy number.

We observed extensive variation in per-cell mtDNA copy number, which is consistent with previous observations^24,25. By characterizing the quantitative variation in per-cell mtDNA copy number in human cancer in relation to cell size, nuclear ploidy, clonal composition and expression of mtDNA-encoded OXPHOS genes, we have shown that mtDNA copy number variation reflects, at least in part, both anabolic cellular demands for increased levels of cellular building blocks to produce larger cells⁵⁸ and stoichiometric equipoise to ensure appropriate relative levels of mtDNA and nuDNA³⁶ (that is, the MNR). Remarkably, the emergence of genetically distinct subclones can perturb MNR levels, and such variation in MNR appears to have specific transcriptional consequences on mtDNA-derived, but not nuDNA-derived, OXPHOS transcription. This represents a previously poorly considered class of phenotypic variation that arises from clonal evolution in cancer with potential implications for improved understanding of cellular fitness.

We also find, in agreement with the population-scale analysis of healthy individuals and patients with cancer, that certain mtDNA genotypes were themselves associated with changes to copy number⁴⁷. Unlike bulk sequencing studies, we harnessed DLP+ to quantitatively interrogate how mutant dosage, or mtDNA heteroplasmy, in individual cells affected mtDNA copy number. We observed that both relatively common germline polymorphisms (at m.302) and highly pathogenic somatic mutations elicited adaptive increases in mtDNA copy number in a heteroplasmy-dependent manner. Given that disruption of different functional components of mtDNA (such as complex I versus complex IV subunits or tRNA genes versus protein-coding genes) is known to produce vastly different phenotypes and sensitively depend on cell-of-origin, investigation of the adaptive mtDNA copy number response to functionally distinct mtDNA mutations in diverse cellular backgrounds may prove insightful. In summary, our work here implicates the coevolution of the mitochondrial and nuclear genomes in individual cells as a regulator of cellular fitness and phenotypic states in cancer.

Methods

Experimental model and participant details

Cell culture and PDXs

Cell lines were generated as previously described^19,21. In brief, the samples included (1) an immortalized normal human female breast epithelial cell line 184-hTERT L9, (2) four sets of 184-hTERT cell lines with perturbations in TP53^−/− passaged over multiple time points, (3) five 184-hTERT cell lines with a variety of genetic perturbations in the repair pathway, including TP53^−/−, BRCA1^−/−, BRCA2^+/− and BRCA2^−/− and (4) a GM18507 lymphoblastoid cell line. The samples also included three sets of TNBC PDX models. The University of British Columbia’s Ethics Committees granted approval for all experiments involving human resources. Donors from Vancouver, British Columbia, provided their consent for the Tumor Tissue Repository protocols (TTR-H06-00289, H16-01625). These samples were then transplanted into mice following the Animal Resource Center bioethics protocol (A19-0298-A001), which received approval from both the University of British Columbia’s Animal Care Committee and the BC Cancer Research Ethics Board under protocols H20-00170 and H18-01113. Serial passaging was done by seeding approximately 1 million cells each time, and cells were profiled with DLP+ at 4–11 different passage points with a mean of 6,070 cells at each time point.

SPECTRUM

All patients from the MSK SPECTRUM cohort⁶⁰,⁶¹,⁶² provided their consent to the institutional biospecimen banking protocol. The Memorial Sloan Kettering Cancer Center’s Institutional Review Board (IRB) approved all related protocols (15-200 and 06-107). The consent process adhered to the IRB’s standard operating procedures for obtaining informed consent, ensuring that all participants were fully informed and agreed in writing before any study-specific activities commenced. This study was carried out in accordance with the principles of the Declaration of Helsinki and adhered to the Good Clinical Practice guidelines. Matched 10x Genomics 3′-end scRNA-seq and DLP+ were obtained from two patients with HGSC (OV-022 and OV-081). Single-cell suspensions were flow-sorted on CD45 to separate the immune component, and the CD45-negative fractions were then profiled with DLP+.

Quantification and statistical analysis

Mitochondrial variant calling and genotyping

A quality score is assigned to each cell as part of the DLP+ pipeline based on 18 features related to read depth and nuDNA CNV information, as described in ref. ²¹. Only live cells with a quality score of at least 0.75 were kept for further analysis. We developed a single-cell variant calling workflow to identify mtDNA variants in single cells based on our previously described variant calling pipeline⁷. Variants were called by two independent variant-calling pipelines, and only the variants identified by both pipelines were retained for further analysis. The first pipeline used Mutect2 (GATK v4.1.2.0) with the mitochondrial option; it was run on every cell, and the per-cell calls were then merged into a single VCF file. The second pipeline used samtools mpileup (v1.9) to generate a pileup file from variant-supporting reads with a minimum mapping quality (>20) and base quality (>20). This was run on the merged pseudo-bulk of all the single cells for the variant calling step. Variants were required to contain at least two variant-supporting reads in both the forward and reverse directions. PCR duplicates and reads that failed any of the quality checks were removed. As described in ref. ¹⁴, capturing the agreement of heteroplasmy between the strands is important in eliminating false positive calls. Thus, variants were further filtered to require a sufficiently high Pearson correlation between the strands (R ≥ 0.2). Next, variants in the blacklisted homopolymer repeat regions (513–525 and 3105–3109) of the mtDNA genome were filtered out as well²². The filtered variants were genotyped by running the second pipeline on individual cells for a per-cell heteroplasmy calculation. Mutational signature and strand bias were assessed as described in refs. ²²,⁶³. The trinucleotide sequence context (immediate 5′ and 3′) was extracted, and the substitution rate for each context was calculated as the number of substitutions normalized by the frequency of all the observed contexts, separately for the L and H strands.
We defined the germline variants as variants that enable us to infer the ancestral haplogroup for each cell line. Homoplasmic variants then refer to variants that are not found in the haplogroup of the sample (local private mutations) or in any of the defined haplogroups (global private mutations).
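
As an illustration only, the variant-level filters described above can be expressed as a single predicate. This is a minimal sketch and not the authors' pipeline: the `Variant` record, its field names and the function names are hypothetical, and the thresholds (two supporting reads per strand, strand correlation R ≥ 0.2, blacklisted positions 513–525 and 3105–3109) are taken from the text.

```python
from dataclasses import dataclass

# Blacklisted homopolymer regions in mtDNA (inclusive coordinates, from the text).
BLACKLIST = [(513, 525), (3105, 3109)]


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    position: int
    fwd_alt_reads: int          # variant-supporting reads on the forward strand
    rev_alt_reads: int          # variant-supporting reads on the reverse strand
    strand_correlation: float   # Pearson R of heteroplasmy between the strands
    called_by_mutect2: bool
    called_by_mpileup: bool


def in_blacklist(position: int) -> bool:
    """True if the position falls inside a blacklisted region."""
    return any(start <= position <= end for start, end in BLACKLIST)


def passes_filters(v: Variant, min_strand_reads: int = 2, min_r: float = 0.2) -> bool:
    """Keep variants found by both callers that pass the strand and region filters."""
    return (
        v.called_by_mutect2
        and v.called_by_mpileup
        and v.fwd_alt_reads >= min_strand_reads
        and v.rev_alt_reads >= min_strand_reads
        and v.strand_correlation >= min_r
        and not in_blacklist(v.position)
    )
```

Combining the criteria in one function makes the order of the filters irrelevant, because a variant is retained only if every condition holds.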

Estimation of average nuclear ploidy and baseline ploidy

Both the average ploidy and the baseline ploidy level of each cell were estimated with HMMcopy⁶⁴, as previously described in ref. ²¹. Briefly, for each cell, we calculate the average ploidy as the mean copy number across the 500 kilobase-wide bins in the entire nuclear genome, which is a nonnegative real number. On the other hand, the baseline ploidy of cells is categorized as either diploid, triploid, tetraploid or some other integer value based on the most commonly occurring copy number state across the 500 kilobase-wide bins of the entire nuclear genome.
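
The distinction between the two ploidy summaries can be shown with a short sketch. It assumes that the per-bin integer copy-number states (for example, from HMMcopy) are already available as a list; the function names are hypothetical, and the tie-breaking behavior of the mode is an assumption not specified in the text.

```python
from collections import Counter
from statistics import mean


def average_ploidy(bin_copy_numbers):
    """Mean copy number across all bins (a nonnegative real number)."""
    return mean(bin_copy_numbers)


def baseline_ploidy(bin_copy_numbers):
    """Most commonly occurring integer copy-number state across all bins.

    Ties are resolved in favor of the state seen first (an assumption).
    """
    return Counter(bin_copy_numbers).most_common(1)[0][0]
```

For a toy cell with eight bins at copy number 2 and two bins at copy number 3, the baseline ploidy is 2 (diploid) while the average ploidy is slightly higher, reflecting the gained segment.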

Estimation of mtDNA gross copy number

The mtDNA copy number was calculated for each cell as follows:

$${\rm{mtDNA}}\; {\rm{copy}}\; {\rm{number}}=\frac{{\mathrm{mtDNA}}\,{\mathrm{read}}\,{\mathrm{depth}}}{{\mathrm{nuDNA}}\,{\mathrm{read}}\,{\mathrm{depth}}}\times {\mathrm{average}}\,{\mathrm{ploidy}}$$

The MNR refers to the ratio of mtDNA read depth to nuDNA read depth. Average ploidy was calculated using the mean copy number of all bins across the nuDNA genome from the HMMcopy⁶⁴ result.
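
The calculation above translates directly into code. The following is a minimal sketch with hypothetical function and argument names; it assumes that the per-cell mtDNA and nuDNA read depths have already been computed.

```python
def mito_nuclear_ratio(mt_depth: float, nu_depth: float) -> float:
    """MNR: ratio of mtDNA read depth to nuDNA read depth."""
    if nu_depth <= 0:
        raise ValueError("nuDNA read depth must be positive")
    return mt_depth / nu_depth


def mtdna_copy_number(mt_depth: float, nu_depth: float, avg_ploidy: float) -> float:
    """mtDNA copy number = MNR x average nuclear ploidy."""
    return mito_nuclear_ratio(mt_depth, nu_depth) * avg_ploidy
```

Scaling the MNR by the average ploidy converts a depth ratio into an estimate of absolute mtDNA copies per cell, which is why two cells with the same MNR but different ploidy receive different copy number estimates.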

Determining the cell diameter from microscopic images

The DLP+ platform captures microscopic images at the nozzle before the cells are isolated into wells²¹. Microscopic images taken during the dispensing of the cells are used to automatically filter for doublets, and additional manual inspection of tetraploid cell images found that the median proportion of doublets across 25 sequencing libraries was 3.76%, suggesting that WGD predictions are unlikely to be confounded by doublets (Extended Data Fig. 2h,i and Supplementary Table 3). The diameter was calculated as the Waddel disk diameter²¹.

A linear regression model for inference of cell size

First, a linear regression model was built to predict mtDNA copy number from the average nuDNA ploidy. Then the model was expanded to a multiple linear regression model to predict mtDNA copy number from cell diameter and the average ploidy. The average ploidy level could deviate from the integer baseline ploidy level in the presence of large chromosomal arm level copy number changes. Benjamini–Hochberg correction was applied for each sequencing library to account for the multiple testing of cells. For plotting, the scale was standardized and normalized to the mean.
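
For illustration, a multiple linear regression of mtDNA copy number on cell diameter and average ploidy can be fitted by ordinary least squares. The sketch below is not the authors' code: it solves the normal equations with the Python standard library only, and the function names are hypothetical. In practice a statistics package would typically be used instead.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Least-squares fit of response ~ intercept + predictors.

    predictors: one row per cell, e.g. (cell diameter, average ploidy).
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Passing a single predictor column (average ploidy only) recovers the simpler model described first, so the same function covers both regressions.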

Comparison of mtDNA copy number across cell cycle phases

Cell cycle analysis was performed on T-47D and GM18507 cell lines by combining experimental FACS²¹ labels with PERT⁶² output. FACS cell cycle phase labels were derived by staining cells for their total DNA content using DAPI and then isolating cells into G1-, S- and G2-phase populations before sequencing. PERT was then run on these scWGS data at 500 kb resolution using default model parameters and the FACS labels as initializations for the G1/2- and S-phase populations. PERT calls cells with 5–95% replicated loci as S phase and all others as G1/2 phase. The fraction of replicated loci per cell is also used to scale the total copy number of these cells. Only cells with matching FACS and PERT phase labels were included in the downstream analysis.
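
The phase-matching rule can be sketched as follows. This is an illustrative reading of the text rather than PERT itself: the function names are hypothetical, the 5% and 95% boundaries are treated as inclusive (an assumption), and FACS G1 and G2 labels are collapsed into a single G1/2 class for comparison with the PERT call.

```python
def pert_phase(fraction_replicated: float) -> str:
    """Call S phase for cells with 5-95% replicated loci, otherwise G1/2."""
    return "S" if 0.05 <= fraction_replicated <= 0.95 else "G1/2"


def labels_match(facs_label: str, fraction_replicated: float) -> bool:
    """True if the FACS label (G1, S or G2) agrees with the PERT-style call."""
    facs_collapsed = "S" if facs_label == "S" else "G1/2"
    return facs_collapsed == pert_phase(fraction_replicated)
```

Only cells for which `labels_match` returns `True` would be carried into the downstream comparison of mtDNA copy number across phases.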

Relative change in MNR between diploid and tetraploid cells

The change in the MNR was calculated for each group as follows:

$${\mathrm{Difference}}=\frac{{\mathrm{Median}}({\mathrm{MN}}{\mathrm{R}}_{\mathrm{t}})-{\mathrm{Median}}({\mathrm{MN}}{\mathrm{R}}_{\mathrm{d}})}{{\mathrm{Median}}({\mathrm{MN}}{\mathrm{R}}_{\mathrm{d}})}\times 100$$
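
A minimal sketch of this calculation, assuming the per-cell MNR values of the tetraploid and diploid groups are available as two lists (the function name is hypothetical):

```python
from statistics import median


def relative_mnr_change(mnr_tetraploid, mnr_diploid):
    """Percent change in median MNR of tetraploid cells relative to diploid cells."""
    median_diploid = median(mnr_diploid)
    return (median(mnr_tetraploid) - median_diploid) / median_diploid * 100
```

A value near 0 indicates that the median MNR is unchanged between the two groups, whereas a value of 100 corresponds to a doubling of the median MNR in tetraploid cells.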

Inference of clones based on nuDNA read counts

Clonal assignment of the cells was done by running HDBSCAN on the two-dimensional embedding from UMAP of the per-cell GC-corrected read count profiles²⁰. Parameters used in UMAP and HDBSCAN were the same as previously described—UMAP was run with min_dist = 0.0 and metric = ‘correlation’, whereas HDBSCAN was run with approx_min_span_tree = False, cluster_selection_epsilon = 0.2 and gen_min_span_tree = True.

Model description and clonal inference using mtDNA variants

MityBayes is a Bayesian statistical model that systematically assigns cells into clones based on both the presence of mtDNA mutations and their heteroplasmy levels. The inputs to MityBayes are a prior on the number of clones, alternate read counts and the total read counts for each mtDNA variant across the cells. The alternate read counts of a variant in a cell follow a binomial distribution. The total read count at a specific genomic position where a variant is present is equivalent to the number of trials (n) and the clone-specific heteroplasmy level serves as the probability of success (p). Inference is performed using stochastic variational inference in the Pyro package. We generate the variational distributions using the AutoDelta function that uses Delta distributions to construct a MAP guide over the latent space. Optimization is performed using the Adam optimizer. By default, we set a learning rate of 0.1, and the convergence is determined when the relative change in evidence lower bound (ELBO) is lower than 10⁻⁵. We benchmarked MityBayes against the most similar method available in the literature, MQuad⁶⁵, which does not assign cells to clones based on mtDNA as MityBayes does but rather prioritizes mtDNA mutations that discriminate among different clones. MityBayes weighted the true variants with a higher probability of contribution in the clone assignment and was able to detect the clones when the input variant list was filtered (Extended Data Fig. 3l,m).
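
To make the binomial likelihood concrete, the sketch below scores one cell against a set of clones whose heteroplasmy levels are assumed to be known, and returns the most likely clone. It is a deliberately simplified stand-in and not MityBayes: it does not use Pyro or variational inference, it does not learn the heteroplasmy levels, and it assumes a uniform prior over clones. All function names are hypothetical.

```python
import math


def binomial_log_pmf(k: int, n: int, p: float) -> float:
    """Log-probability of k alternate reads among n total reads at heteroplasmy p."""
    p = min(max(p, 1e-6), 1 - 1e-6)  # clamp so log() stays finite at p = 0 or 1
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)


def assign_clone(alt_counts, total_counts, clone_heteroplasmy):
    """Return the index of the clone with the highest log-likelihood for one cell.

    alt_counts, total_counts: per-variant read counts for the cell.
    clone_heteroplasmy: one list of per-variant heteroplasmy levels per clone.
    """
    scores = [
        sum(
            binomial_log_pmf(k, n, p)
            for k, n, p in zip(alt_counts, total_counts, levels)
        )
        for levels in clone_heteroplasmy
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Because the total read count enters as the number of trials, variants with deeper coverage in a cell contribute more strongly to its assignment than sparsely covered ones.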

Integration of scDNA and scRNA data with TreeAlign

TreeAlign was used to computationally integrate scDNA and scRNA data by assigning transcriptional profiles to scDNA-based subclones. Briefly, TreeAlign explicitly models clone-specific copy number dosage effects and defines subclones informed by transcriptional changes from scDNA-based single-cell phylogenies. Here we ran TreeAlign with the following parameters: infer_b_allele = False, repeat = 8, min_clone_assign_prob = 0.9, min_clone_assign_freq = 0.75, min_consensus_gene_freq = 0.55, max_iter = 900, rel_tol = 1e-5, initialize_seed = True, min_cell_count_expr = 40, min_cell_count_cnv = 30, min_gene_diff = 150, min_snp_diff = 60, level_cutoff = 50, min_proceed_freq = 0.80, min_record_freq = 0.75.

Pathway enrichment analysis in matched scRNA-seq

CellRanger software (version 4.0.0) was used to perform read alignment, barcode filtering and UMI quantification using the 10× GRCh38 transcriptome (version 3.0.0) for gene expression. Filtered matrices were processed using the Seurat R package (version 3.0.1)⁶⁶,⁶⁷. The resulting gene-by-cell matrix was log normalized and merged by the patient. Cell-type assignments were computed on each patient with cellassign (version 0.99.2)⁶⁸ using a set of curated marker genes, and cancer cells with a high probability (>0.99) were retained. Clone labels were assigned using CNV data obtained from DLP+ with CloneAlign (version 0.99.0)⁴⁵. Cell-type annotated matrices for individual patients across time points were integrated with Harmony (version 0.1)⁶⁹ into a single batch-corrected matrix. Dimensionality reduction and visualization as a UMAP embedding were performed with the Seurat R package. Differentially expressed genes (P < 0.001, log(fold change) > 0.25) were computed with the Wilcoxon test using clone labels.

Concordance between mtDNA copy number and heteroplasmy

Because there are multiple sequencing libraries per sample with cells of different average ploidy, we used a stratified and weighted concordance model to identify pairs of heteroplasmy and mtDNA copy numbers that were consistently associated. Similar to Kendall’s tau, concordance is a nonparametric measure of correlation that relies on the concept of concordant pairs⁷⁰. The concordance analysis was adapted from ref. ⁷¹. Briefly, the calculation was done using the concordance function from the survival R package⁷². As with Somers’ D and Kendall’s tau, the magnitude of c_scaled captures the strength of the effect, with values near −1 or 1 corresponding to strong discordance and concordance, respectively. We weighted each observation by the number of cells in the corresponding library. A z score was computed as unscaled concordance minus 0.5 and divided by the square root of the variance, and the resulting value was used to derive a two-tailed P value. P values were then corrected for multiple testing using the Benjamini–Hochberg method to control the false discovery rate. We filtered for highly covered mtDNA variants with at least ten reads supporting the alternate allele. For each variant, we filtered cells with heteroplasmy less than 0.05 or greater than 0.95 to prevent clusters of cells near 0 or 1 heteroplasmy from erroneously skewing the correlation estimation. Only the variants that had a range of 0.15 were kept for downstream analysis.
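
The conversion from concordance to a corrected P value can be sketched as follows. The concordance and its variance are assumed to have been computed already (in the study, with the survival R package); the function names below are hypothetical, and the two-tailed P value assumes that the z score is approximately standard normal.

```python
import math


def concordance_p_value(concordance: float, variance: float) -> float:
    """Two-tailed P value for z = (concordance - 0.5) / sqrt(variance)."""
    z = (concordance - 0.5) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An unscaled concordance of 0.5 corresponds to no association and yields a P value of 1, whereas values far from 0.5 relative to the standard error yield small P values.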

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The sequencing data associated with the study include already publicly available datasets¹⁹,²⁰,²¹, which are available at the European Genome-Phenome Archive with the accessions EGAS00001006343, EGAS00001004448 and EGAS00001003190. The DLP+ and matching scRNA-seq data for the two patients with HGSC (patient 022 and patient 081) from the MSK SPECTRUM cohort are available via dbGaP (accession phs002857.v2.p1). The processed data are available on Zenodo (https://doi.org/10.5281/zenodo.10498240)⁷³.

Code availability

Mutect2 (GATK v4.1.2.0), Samtools (v1.9), CellRanger software (v4.0.0), R (v4.2.3) and the R packages cellassign (v0.99.2), CloneAlign (v0.99.0), Seurat (v3.0.1) and Harmony (v0.1) were used in this study. Custom R code to regenerate all figures is available on GitHub (https://github.com/reznik-lab/mtdna-dlp)⁷⁴ with the relevant data and instructions to execute the code.

References

1. Reznik, E. et al. Mitochondrial DNA copy number variation across human cancers. eLife 5, e10769 (2016).
2. Yuan, Y. et al. Comprehensive molecular characterization of mitochondrial genomes in human cancers. Nat. Genet. 52, 342–352 (2020).
3. Gaude, E. et al. NADH shuttling couples cytosolic reductive carboxylation of glutamine with glycolysis in cells with mitochondrial dysfunction. Mol. Cell 69, 581–593 (2018).
4. Vyas, S., Zaganjor, E. & Haigis, M. C. Mitochondria and cancer. Cell 166, 555–566 (2016).
5. Shidara, Y. et al. Positive contribution of pathogenic mutations in the mitochondrial genome to the promotion of cancer by prevention from apoptosis. Cancer Res. 65, 1655–1663 (2005).
6. Park, J. S. et al. A heteroplasmic, not homoplasmic, mitochondrial DNA mutation promotes tumorigenesis via alteration in reactive oxygen species generation and apoptosis. Hum. Mol. Genet. 18, 1578–1589 (2009).
7. Gorelick, A. N. et al. Respiratory complex and tissue lineage drive recurrent mutations in tumour mtDNA. Nat. Metab. 3, 558–570 (2021).
8. Filograna, R. et al. Modulation of mtDNA copy number ameliorates the pathological consequences of a heteroplasmic mtDNA mutation in the mouse. Sci. Adv. 5, eaav9824 (2019).
9. Stewart, J. B. & Chinnery, P. F. The dynamics of mitochondrial DNA heteroplasmy: implications for human health and disease. Nat. Rev. Genet. 16, 530–542 (2015).
10. Wei, W. et al. Germline selection shapes human mitochondrial DNA diversity. Science 364, eaau6520 (2019).
11. Chinnery, P. F., Samuels, D. C., Elson, J. & Turnbull, D. M. Accumulation of mitochondrial DNA mutations in ageing, cancer, and mitochondrial disease: is there a common mechanism? Lancet 360, 1323–1325 (2002).
12. Kang, E. et al. Age-related accumulation of somatic mitochondrial DNA mutations in adult-derived human iPSCs. Cell Stem Cell 18, 625–636 (2016).
13. Ludwig, L. S. et al. Lineage tracing in humans enabled by mitochondrial mutations and single-cell genomics. Cell 176, 1325–1339 (2019).
14. Lareau, C. A. et al. Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat. Biotechnol. 39, 451–461 (2020).
15. Miller, T. E. et al. Mitochondrial variant enrichment from high-throughput single-cell RNA-seq resolves clonal populations. Nat. Biotechnol. 40, 1030–1034 (2021).
16. Xu, J. et al. Single-cell lineage tracing by endogenous mutations enriched in transposase accessible mitochondrial DNA. eLife 8, e45105 (2019).
17. Jiang, M. et al. Increased total mtDNA copy number cures male infertility despite unaltered mtDNA mutation load. Cell Metab. 26, 429–436 (2017).
18. Grady, J. P. et al. mtDNA heteroplasmy level and copy number indicate disease burden in m.3243A>G mitochondrial disease. EMBO Mol. Med. 10, e8262 (2018).
19. Salehi, S. et al. Clonal fitness inferred from time-series modelling of single-cell cancer genomes. Nature 595, 585–590 (2021).
20. Funnell, T. et al. Single-cell genomic variation induced by mutational processes in cancer. Nature 612, 106–115 (2022).
21. Laks, E. et al. Clonal decomposition and DNA replication states defined by scaled single-cell genome sequencing. Cell 179, 1207–1221 (2019).
22. Ju, Y. S. et al. Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer. eLife 3, e02935 (2014).
23. Burr, S. P. & Chinnery, P. F. Measuring single-cell mitochondrial DNA copy number and heteroplasmy using digital droplet polymerase chain reaction. J. Vis. Exp., https://doi.org/10.3791/63870 (2022).
24. Müller-Höcker, J. et al. Oxyphil cell metaplasia in the parathyroids is characterized by somatic mitochondrial DNA mutations in NADH dehydrogenase genes and cytochrome c oxidase activity-impairing genes. Am. J. Pathol. 184, 2922–2935 (2014).
25. Cree, L. M. et al. A reduction of mitochondrial DNA molecules during embryogenesis explains the rapid segregation of genotypes. Nat. Genet. 40, 249–254 (2008).
26. Reber, S. & Goehring, N. W. Intracellular scaling mechanisms. Cold Spring Harb. Perspect. Biol. 7, a019067 (2015).
27. Rafelski, S. M. et al. Mitochondrial network size scaling in budding yeast. Science 338, 822–824 (2012).
28. Miettinen, T. P. & Björklund, M. Cellular allometry of mitochondrial functionality establishes the optimal cell size. Dev. Cell 39, 370–382 (2016).
29. Miettinen, T. P. & Björklund, M. Mitochondrial function and cell size: an allometric relationship. Trends Cell Biol. 27, 393–402 (2017).
30. D’Erchia, A. M. et al. Tissue-specific mtDNA abundance from exome data and its correlation with mitochondrial transcription, mass and respiratory activity. Mitochondrion 20, 13–21 (2015).
31. Basu, A., Lenka, N., Mullick, J. & Avadhani, N. G. Regulation of murine cytochrome oxidase Vb gene expression in different tissues and during myogenesis. Role of a YY-1 factor-binding negative enhancer. J. Biol. Chem. 272, 5899–5908 (1997).
32. Seel, A. et al. Regulation with cell size ensures mitochondrial DNA homeostasis during cell growth. Nat. Struct. Mol. Biol. 30, 1549–1560 (2023).
33. Osman, C., Noriega, T. R., Okreglak, V., Fung, J. C. & Walter, P. Integrity of the yeast mitochondrial genome, but not its distribution and inheritance, relies on mitochondrial fission and fusion. Proc. Natl Acad. Sci. USA 112, E947–E956 (2015).
34. Galitski, T., Saldanha, A. J., Styles, C. A., Lander, E. S. & Fink, G. R. Ploidy regulation of gene expression. Science 285, 251–254 (1999).
35. Comai, L. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 6, 836–846 (2005).
36. Soto, I. et al. Balanced mitochondrial and cytosolic translatomes underlie the biogenesis of human respiratory complexes. Genome Biol. 23, 170 (2022).
37. Couvillion, M. T., Soto, I. C., Shipkovenska, G. & Churchman, L. S. Synchronized mitochondrial and cytosolic translation programs. Nature 533, 499–503 (2016).
38. Lazarou, M., McKenzie, M., Ohtake, A., Thorburn, D. R. & Ryan, M. T. Analysis of the assembly profiles for mitochondrial- and nuclear-DNA-encoded subunits into complex I. Mol. Cell. Biol. 27, 4228–4237 (2007).
39. Gyorfy, M. F. et al. Nuclear–cytoplasmic balance: whole genome duplications induce elevated organellar genome copy number. Plant J. 108, 219–230 (2021).
40. Pica-Mattoccia, L. & Attardi, G. Expression of the mitochondrial genome in HeLa cells. IX. Replication of mitochondrial DNA in relationship to cell cycle in HeLa cells. J. Mol. Biol. 64, 465–484 (1972).
41. Antes, A. et al. Differential regulation of full-length genome and a single-stranded 7S DNA along the cell cycle in human mitochondria. Nucleic Acids Res. 38, 6466–6476 (2010).
42. Chatre, L. & Ricchetti, M. Prevalent coordination of mitochondrial DNA transcription and initiation of replication with the cell cycle. Nucleic Acids Res. 41, 3068–3078 (2013).
43. Sasaki, T., Sato, Y., Higashiyama, T. & Sasaki, N. Live imaging reveals the dynamics and regulation of mitochondrial nucleoids during the cell cycle in Fucci2-HeLa cells. Sci. Rep. 7, 11257 (2017).
44. McInnes, L., Healy, J. & Astels, S. Hdbscan: hierarchical density based clustering. J. Open Source Softw. 2, 205 (2017).
45. Campbell, K. R. et al. clonealign: statistical integration of independent single-cell RNA and DNA sequencing data from human cancers. Genome Biol. 20, 54 (2019).
46. Yang, S. Y. et al. Blood-derived mitochondrial DNA copy number is associated with gene expression across multiple tissues and is predictive for incident neurodegenerative disease. Genome Res. 31, 349–358 (2021).
47. Gupta, R. et al. Nuclear genetic control of mtDNA copy number and heteroplasmy in humans. Nature 620, 839–848 (2023).
48. Nekhaeva, E. et al. Clonally expanded mtDNA point mutations are abundant in individual cells of human tissues. Proc. Natl Acad. Sci. USA 99, 5521–5526 (2002).
49. Herbst, A. et al. Accumulation of mitochondrial DNA deletion mutations in aged muscle fibers: evidence for a causal role in muscle fiber loss. J. Gerontol. A Biol. Sci. Med. Sci. 62, 235–245 (2007).
50. Mahmood, M. et al. Mitochondrial DNA mutations drive aerobic glycolysis to enhance checkpoint blockade response in melanoma. Nat. Cancer. https://doi.org/10.1038/s43018-023-00721-w (2024).
51. De Grey, A. D. A proposed refinement of the mitochondrial free radical theory of aging. Bioessays 19, 161–166 (1997).
52. Shigenaga, M. K., Hagen, T. M. & Ames, B. N. Oxidative damage and mitochondrial decay in aging. Proc. Natl Acad. Sci. USA 91, 10771–10778 (1994).
53. DeHaan, C. et al. Mutation in mitochondrial complex I ND6 subunit is associated with defective response to hypoxia in human glioma cells. Mol. Cancer 3, 19 (2004).
54. Kollberg, G., Moslemi, A.-R., Lindberg, C., Holme, E. & Oldfors, A. Mitochondrial myopathy and rhabdomyolysis associated with a novel nonsense mutation in the gene encoding cytochrome c oxidase subunit I. J. Neuropathol. Exp. Neurol. 64, 123–128 (2005).
55. Sazanov, L. A Structural Perspective on Respiratory Complex I: Structure and Function of NADH:Ubiquinone Oxidoreductase (Springer Science & Business Media, 2012).
56. Chae, J. H. et al. A novel ND3 mitochondrial DNA mutation in three Korean children with basal ganglia lesions and complex I deficiency. Pediatr. Res. 61, 622–624 (2007).
57. O’Hara, R. et al. Quantitative mitochondrial DNA copy number determination using droplet digital PCR with single-cell resolution. Genome Res. 29, 1878–1888 (2019).
58. Schmoller, K. M. & Skotheim, J. M. The biosynthetic basis of cell size control. Trends Cell Biol. 25, 793–802 (2015).
59. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
60. Vázquez-García, I. et al. Ovarian cancer mutational processes drive site-specific immune evasion. Nature 612, 778–786 (2022).
61. Shi, H. et al. Allele-specific transcriptional effects of subclonal copy number alterations enable genotype-phenotype mapping in cancer cells. Nat. Commun. 15, 2482 (2024).
62. Weiner, A. C. et al. Single-cell DNA replication dynamics in genomically unstable cancers. Preprint at bioRxiv https://doi.org/10.1101/2023.04.10.536250 (2023).
63. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
64. Lai, D., Ha, G. & Shah, S. HMMcopy: Copy number prediction with correction for GC and mappability bias for HTS data. HMMcopy, R package version 1.44.0 https://doi.org/10.18129/B9.bioc.HMMcopy (2023).
65. Kwok, A. W. C. et al. MQuad enables clonal substructure discovery using single cell mitochondrial variants. Nat. Commun. 13, 1205 (2022).
66. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
67. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
68. Zhang, A. W. et al. Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat. Methods 16, 1007–1015 (2019).
69. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
70. Pencina, M. J. & D'Agostino, R. B. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Stat. Med. 23, 2109–2123 (2004).
71. Benedetti, E. et al. A multimodal atlas of tumour metabolism reveals the architecture of gene-metabolite covariation. Nat. Metab. 5, 1029–1044 (2023).
72. Therneau, T. M., Lumley, T., Elizabeth, A. & Cynthia, C. survival: survival analysis. R version 3.2-3. CRAN.R-project.org/package=survival (2022).
73. Kim, M. Single cell mtDNA dynamics in tumors is driven by co-regulation of nuclear and mitochondrial genomes. Zenodo 10.5281/zenodo.10498239 (2024).
74. Kim, M. et al. mtdna dlp. GitHub github.com/reznik-lab/mtdna-dlp (2024).


Acknowledgements

We acknowledge the constructive feedback of the Shah and Reznik Labs. This project was generously supported by the Cycle for Survival, the Marie-Josée and Henry R. Kravis Center for Molecular Oncology and the National Cancer Institute Cancer Center Core (grant P30-CA008748) supporting Memorial Sloan Kettering Cancer Center. S.P.S. holds the Nicholls Biondi Chair in Computational Oncology and is a Susan G. Komen Scholar (GC233085). This work was also funded in part by awards to S.P.S.: Susan G. Komen Breast Cancer Foundation (SAC220206), the Cancer Research UK Grand Challenge Program (GC-243330) and an NIH RM1 award (RM1-HG011014). E.R. was supported by the Department of Defense Kidney Cancer Research Program (W81XWH-18-1-0318 and HT9425-23-1-0995), Cycle For Survival Equinox Innovation Award, Kidney Cancer Association Young Investigator Award, Brown Performance Group Innovation in Cancer Informatics Fund and NIH (R37 CA276200). E.R. was also supported by a grant from the Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center.

Author information

Authors and Affiliations

Tri-Institutional PhD Program in Computational Biology & Medicine, Weill Cornell Medicine, New York City, NY, USA
Minsoo Kim
Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
Minsoo Kim, Alexander N. Gorelick, Ignacio Vàzquez-García, Marc J. Williams, Sohrab Salehi, Hongyu Shi, Adam C. Weiner, Nick Ceglia, Tyler Funnell, Tricia Park, Sonia Boscenco, Diljot Grewal, Cerise Tang, Nicole Rusk, Andrew McPherson, Sohrab P. Shah & Ed Reznik
Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
Alexander N. Gorelick & Hui Jiang
Department of Molecular Oncology, British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada
Ciara H. O’Flanagan & Sam Aparicio
Institute of Cancer Sciences, University of Glasgow, Glasgow, UK
Payam A. Gammage
CRUK Beatson Institute, Glasgow, UK
Payam A. Gammage

Authors

Minsoo Kim
Alexander N. Gorelick
Ignacio Vàzquez-García
Marc J. Williams
Sohrab Salehi
Hongyu Shi
Adam C. Weiner
Nick Ceglia
Tyler Funnell
Tricia Park
Sonia Boscenco
Ciara H. O’Flanagan
Hui Jiang
Diljot Grewal
Cerise Tang
Nicole Rusk
Payam A. Gammage
Andrew McPherson
Sam Aparicio
Sohrab P. Shah
Ed Reznik

Contributions

S.P.S., E.R. and S.A. conceived and supervised the study. M.K. led all data analysis. S.P.S., S.A. and C.O. designed and performed the experiments. S.P.S., E.R., S.A. and M.K. designed the statistical model. Additional data analysis was performed by A.G., N.C., T.F., I.V., D.G., S.S., A.C.W., H.S., A.M., T.P., S.B. and H.J. with genomic data collection and analytical methodology development. S.P.S., E.R. and M.K. wrote the manuscript with help from M.W., C.T., N.R. and P.A.G. All authors provided feedback on and approved the paper.

Corresponding authors

Correspondence to Sohrab P. Shah or Ed Reznik.

Ethics declarations

Competing interests

S.P.S. has an advisory role to AstraZeneca. S.A. is a founder and shareholder of GenomeTherapeutics (Inflex) and scientific advisor to Sangamo Therapeutics, Chordia Biosciences and the Institute of Cancer Research, London. All roles are outside the scope of this manuscript. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Konstantin Khrapko, Caleb Lareau, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Robust single-cell quantification of mtDNA copy number based on DLP+.

a, Scatter plot of mtDNA copy number against total mapped read counts for each cell. Two-sided Pearson correlation between mtDNA copy number and total mapped reads results in a correlation coefficient of 0.03, P < 5.6 × 10⁻¹⁶. Gray-shaded areas represent error bands indicating the 95% confidence interval, and the blue line indicates the regression line. b, Per-cell mtDNA copy number estimation of GM18507 lymphoblastoid cells across 33 libraries with at least 15 cells (n = 2,281). All boxplots represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. c, Downsampling experiment of a SA1090 (OV2295 cell line) library (n = 573 cells) showing a gradual decrease in mtDNA read depth from right to left. All boxplots in the downsampling experiment represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. d, MNR remains consistent at around 647 across all levels of downsampling (two-sided, two-sample Wilcoxon test against the original library; all P > 0.78). e, Boxplot of the number of variants detected across all levels of downsampling. A total of 37 variants were detected in the original library at 100% sequencing depth. The number of detected variants remained stable at 36 across downsampling levels and varied drastically only when the library was downsampled to 10%. f, Distribution, across all levels of downsampling, of the heteroplasmy level of one heteroplasmic variant, m.15500G>A, which was at low heteroplasmy at the original 100% sequencing depth. The median heteroplasmy was consistent with the original down to 30%, below which the median heteroplasmy increased as many cells dropped out. In addition, more cells exhibited discrete levels of heteroplasmy because of the lower sequencing depth. All boxplots represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. g, For the same variant, m.15500G>A, breakdown of the mutant-status assignment of cells at each downsampling level, with the assignment at the original 100% sequencing depth taken as the ground truth. At 30% of the original sequencing depth, true mutant cells start to be classified as wild-type cells, with a sensitivity of 0.85 and a specificity of 1.
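The downsampling evaluation in panels c to g can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the binomial read-thinning scheme, the function names and the `min_alt` calling threshold are all assumptions chosen for illustration. The sketch thins per-cell read counts to a given fraction, calls mutant status from the alternate-allele count, and scores the calls against the full-depth assignment.

```python
import random


def downsample_counts(alt, depth, fraction, rng):
    """Thin reads binomially: keep each read independently with probability `fraction`.

    `alt` is the alternate-allele read count and `depth` the total read count
    for one cell at one mtDNA position. Returns (kept_alt, kept_depth).
    """
    kept_alt = sum(rng.random() < fraction for _ in range(alt))
    kept_ref = sum(rng.random() < fraction for _ in range(depth - alt))
    return kept_alt, kept_alt + kept_ref


def call_mutant(alt, min_alt=2):
    """Hypothetical calling rule (an assumption): mutant if >= `min_alt` alternate reads."""
    return alt >= min_alt


def sensitivity_specificity(truth, predicted):
    """Score boolean calls against boolean ground truth."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    rng = random.Random(0)
    # Invented toy cells: (alternate reads, total reads) at full depth.
    cells = [(6, 40), (3, 35), (0, 50), (1, 45), (8, 60), (0, 30)]
    truth = [call_mutant(alt) for alt, _ in cells]
    thinned = [downsample_counts(alt, depth, 0.3, rng) for alt, depth in cells]
    predicted = [call_mutant(alt) for alt, _ in thinned]
    print(sensitivity_specificity(truth, predicted))
```

Under this assumed threshold rule, thinning can only remove alternate reads, so a cell that is wild-type at full depth cannot become mutant after downsampling. Specificity therefore stays at 1 while sensitivity falls, which matches the direction of the result reported for panel g.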

Extended Data Fig. 2 mtDNA copy number represents biophysical and genomic cellular energy demand.

a, Boxplots showing the distribution of mtDNA copy number across technical replicates over four different time points. P-values from the two-sided, two-sample Wilcoxon test are indicated above. The boxplot represents the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. ** denotes P < 0.01, *** denotes P ≤ 0.001 and no annotation denotes P > 0.1. b, Coefficient of variation in mtDNA copy number across cells in TNBC (n = 10) and HGSC (n = 5) samples. Red lines indicate the coefficient of variation across breast and ovary bulk tissue tumor samples in PCAWG. c, Violin plot of the per-cell mtDNA copy number in malignant and nonmalignant cells across four primary tumors with sufficiently many nonmalignant cells (OV-022: n = 1,625 cells, OV-081: n = 1,352 cells, SA1047: n = 626 cells, SA1135: n = 276 cells, all P < 10⁻¹⁶). A two-sided, two-sample Wilcoxon test indicates that malignant cells have a significantly higher mtDNA copy number than nonmalignant cells in all four tumor samples. The boxplot represents the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. d, Distributions of cell diameter in microns for GM18507 (n = 18 cells) and 184-hTERT diploid cells (n = 1,152 cells). e, Coefficient for the diameter term in the linear regression model of mtDNA copy number against cell diameter for diploid 184-hTERT and GM18507 cells only (n = 4 libraries each). Error bars represent SD of the coefficient values. f, A scatter plot showing a positive two-sided Pearson correlation between MNR and cell diameter for a sequencing library of a TP53^−/− 184-hTERT SA906b cell line, A96155B. Gray-shaded areas represent error bands indicating the 95% confidence interval, and the red dotted line indicates the regression line. g, Same as f but for a sequencing library of a TNBC SA1035 PDX, A95623A. h, Microscopic image of a true tetraploid cell from a TP53^−/− 184-hTERT breast epithelial cell library, SA906-A96228B, taken during cell dispensing as part of the DLP+. The cell is slightly larger than diploid cells in the same library. i, Microscopic image of a doublet from the same library. Although the ploidy is estimated as tetraploid, the image shows two diploid cells that were sequenced together.
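Two of the summary statistics named in this caption, the coefficient of variation (panel b) and the diameter coefficient of a linear regression (panel e), can be sketched in a few lines of Python. This is a generic illustration using only the standard library. The function names and the toy numbers are invented, and the authors' actual model may include additional terms.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless spread)."""
    return stdev(values) / mean(values)


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    # Invented toy values, not data from the study.
    diameters_um = [12.0, 14.0, 15.5, 17.0, 19.0]
    mtdna_copies = [900.0, 1400.0, 1650.0, 2100.0, 2600.0]
    print("CV:", coefficient_of_variation(mtdna_copies))
    print("slope, intercept:", ols_slope_intercept(diameters_um, mtdna_copies))
```

Here the slope plays the role of the diameter coefficient: a positive value means that larger cells tend to carry more mtDNA copies in the toy data.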

Extended Data Fig. 3 Within-clone analysis of mtDNA-nuDNA ratio in response to whole-genome doubling.

a, Total number and proportion of diploid and tetraploid cells plotted for each sample. b, Comparison of mtDNA copy number between diploid and tetraploid cells across all 9 184-hTERT cell lines and 7 tumor samples as well as GM18507 lymphoblastoid cells (two-sided, two-sample Wilcoxon test, all P < 7.7 × 10⁻⁹). All boxplots represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. c, Violin plot of MNR in diploid (n = 3,475) and tetraploid (n = 350) cells across all four 184-hTERT breast epithelial cell lines. There is no significant difference between the two groups (two-sided, two-sample Wilcoxon test, P = 0.067). All boxplots represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. d, Boxplot of the mtDNA copy number distribution of cell cycle-sorted cells in different phases, G1, S and G2, across two sequencing libraries of T-47D breast cancer cell line, SA1044-A96139A (n = 735 cells) and SA1044-A96147A (n = 823 cells) and of lymphoblastoid cell line, SA928-73044A (n = 481 cells) and SA928-A90553C (n = 1,016 cells). All boxplots represent the median, 25th percentile and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. Pairwise significance is indicated by two-sided Wilcoxon tests. e, Same as d, but for MNR. f, Bar plot of the median diameter, measured in microns, for both diploid and tetraploid cells for each library across the 12 sequencing libraries of tumor samples. g, Boxplot of the median, 25th percentile and 75th percentile of fold change of median diameter for tetraploid cells over diploid cells across the same 12 sequencing libraries in f. The whiskers correspond to 1.5 times the interquartile range. h, Graphical model of MityBayes. MityBayes takes raw counts of the alternate allele and total depth per cell across mtDNA variants. It infers the clonal assignment of the cells based on clone-specific heteroplasmy level and weighting of informative variants. i, Heatmap indicating the presence of mtDNA variant, m.1429C>T. Each heatmap cell indicates the fraction of mutant cells out of the total number of cells for each combination of ploidy (diploid or tetraploid) and clone in TP53^−/− 184-hTERT sample, SA906a. This variant is present in both diploid and tetraploid cells of clone A. j, Same as i but for m.6869C>T in TP53^−/− 184-hTERT sample, SA906a. This variant is present in both diploid and tetraploid cells of clone G. k, Same as i but for m.6708G>A in SPECTRUM-OV-081 sample. This variant is present in both diploid and tetraploid cells of clone C. l, Ranking, in descending order, of the weight variable that indicates the probability of each variant contributing to the clonal assignment. The real variants are colored in red. m, Heatmap of heteroplasmy across clones determined from mtDNA variants. The mtDNA variant-based clonal labels are shown on top.
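The quantity shown in the heatmaps of panels i to k, the fraction of mutant cells within each clone and ploidy group, can be tabulated as in the following Python sketch. The record layout and the function name are assumptions for illustration, the toy records are invented, and the sketch does not reproduce MityBayes or the authors' mutant-calling procedure.

```python
from collections import defaultdict


def mutant_fraction_by_group(cells):
    """Return {(clone, ploidy): fraction of mutant cells} for an iterable of
    (clone, ploidy, is_mutant) records, one record per cell."""
    totals = defaultdict(int)
    mutants = defaultdict(int)
    for clone, ploidy, is_mutant in cells:
        totals[(clone, ploidy)] += 1
        mutants[(clone, ploidy)] += int(is_mutant)
    return {key: mutants[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Invented toy records: (clone label, ploidy label, mutant call).
    toy_cells = [
        ("A", "diploid", True), ("A", "diploid", True),
        ("A", "tetraploid", True), ("A", "tetraploid", False),
        ("B", "diploid", False), ("B", "tetraploid", False),
    ]
    for key, fraction in sorted(mutant_fraction_by_group(toy_cells).items()):
        print(key, fraction)
```

A variant shared by the diploid and tetraploid cells of one clone, as described for panels i to k, would appear as nonzero fractions in both ploidy groups of that clone and fractions near zero elsewhere.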

Extended Data Fig. 4 Transcriptional phenotype of high MNR cells compared against low MNR cells in tumors.

a, Differential expression of mtDNA-encoded genes between clones with the highest and the lowest MNR across the tumor samples based on Wilcoxon rank sum test. Colors indicate the average log₂ fold change, while the dot size indicates the log₁₀ of adjusted p-values. b, Same as a but for engineered 184-hTERT cell lines.

Supplementary information

Reporting Summary

Supplementary Tables

- Supplementary Table 1: An overview of the mtDNA DLP+ data.
- Supplementary Table 2: Per-cell level DLP+ sequencing statistics for 184-hTERT cell lines, HGSC and TNBC tumors.
- Supplementary Table 3: Summary of doublets and tetraploid cell images across sequencing libraries.
- Supplementary Table 4: Median number of variants per diploid and tetraploid cells across samples.
- Supplementary Table 5: Descriptions and prior distributions of random variables and data in the MityBayes model.
- Supplementary Table 6: Cell cycle dataset with ploidy estimates from the PERT model.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kim, M., Gorelick, A.N., Vàzquez-García, I. et al. Single-cell mtDNA dynamics in tumors is driven by coregulation of nuclear and mitochondrial genomes. Nat Genet 56, 889–899 (2024). https://doi.org/10.1038/s41588-024-01724-8

Received: 21 June 2022
Accepted: 20 March 2024
Published: 13 May 2024
Issue Date: May 2024
DOI: https://doi.org/10.1038/s41588-024-01724-8
