Extrachromosomal DNA is associated with oncogene amplification and poor outcome across multiple cancers


Extrachromosomal DNA (ecDNA) amplification promotes intratumoral genetic heterogeneity and accelerated tumor evolution (refs. 1–3); however, its frequency and clinical impact are unclear. Using computational analysis of whole-genome sequencing data from 3,212 cancer patients, we show that ecDNA amplification frequently occurs in most cancer types but not in blood or normal tissue. Oncogenes were highly enriched on amplified ecDNA, and the most common recurrent oncogene amplifications arose on ecDNA. EcDNA amplifications resulted in higher levels of oncogene transcription compared to copy number-matched linear DNA, coupled with enhanced chromatin accessibility, and more frequently resulted in transcript fusions. Patients whose cancers carried ecDNA had significantly shorter survival, even when controlled for tissue type, than patients whose cancers were not driven by ecDNA-based oncogene amplification. The results presented here demonstrate that ecDNA-based oncogene amplification is common in cancer, is different from chromosomal amplification and drives poor outcome for patients across many cancer types.


Fig. 1: Frequency of circular amplification across tumor and nontumor tissues.
Fig. 2: Oncogene content and structural component of circular amplification.
Fig. 3: Gene expression and chromatin accessibility of amplicon classes.
Fig. 4: Presence of circular amplification is associated with poor outcomes.

Data availability

Information on accessing the data from the ICGC, including raw read files, can be found at https://docs.icgc.org/pcawg/data/. All open access TCGA data are publicly available through the National Cancer Institute Genomic Data Commons (https://gdc.cancer.gov/). The datasets marked ‘Controlled’ contain potentially identifiable information and require authorization from the ICGC and TCGA Data Access Committees. In accordance with the data access policies of the ICGC and TCGA projects, most molecular, clinical and specimen data are in an open tier that does not require access approval. To access sequencing data, researchers need to apply to the TCGA Data Access Committee via the database of Genotypes and Phenotypes (https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login) for access to the TCGA portion of the dataset and to the ICGC Data Access Compliance Office (http://icgc.org/daco) for the ICGC portion. All images analyzed are available from figshare at https://figshare.com/s/6c3e2edc1ab299bb2fa0 and https://figshare.com/s/ab6a214738aa43833391.

Code availability

AmpliconArchitect is available at https://github.com/virajbdeshpande/AmpliconArchitect. EcSeg is available at https://github.com/UCRajkumar/ecSeg.


References

1. deCarvalho, A. C. et al. Discordant inheritance of chromosomal and extrachromosomal DNA elements contributes to dynamic disease evolution in glioblastoma. Nat. Genet. 50, 708–717 (2018).
2. Turner, K. M. et al. Extrachromosomal oncogene amplification drives tumour evolution and genetic heterogeneity. Nature 543, 122–125 (2017).
3. Verhaak, R. G. W., Bafna, V. & Mischel, P. S. Extrachromosomal oncogene amplification in tumour pathogenesis and evolution. Nat. Rev. Cancer 19, 283–288 (2019).
4. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
5. Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013).
6. Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).
7. Alt, F. W., Kellems, R. E., Bertino, J. R. & Schimke, R. T. Selective multiplication of dihydrofolate reductase genes in methotrexate-resistant variants of cultured murine cells. J. Biol. Chem. 253, 1357–1370 (1978).
8. Kohl, N. E. et al. Transposition and amplification of oncogene-related sequences in human neuroblastomas. Cell 35, 359–367 (1983).
9. Nathanson, D. A. et al. Targeted therapy resistance mediated by dynamic regulation of extrachromosomal mutant EGFR DNA. Science 343, 72–76 (2014).
10. Zheng, S. et al. A survey of intragenic breakpoints in glioblastoma identifies a distinct subset associated with poor survival. Genes Dev. 27, 1462–1472 (2013).
11. Trask, B. J. Fluorescence in situ hybridization: applications in cytogenetics and gene mapping. Trends Genet. 7, 149–154 (1991).
12. Deshpande, V. et al. Exploring the landscape of focal amplifications in cancer using AmpliconArchitect. Nat. Commun. 10, 392 (2019).
13. Xu, K. et al. Structure and evolution of double minutes in diagnosis and relapse brain tumors. Acta Neuropathol. 137, 123–137 (2019).
14. Koche, R. P. et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma. Nat. Genet. 52, 29–34 (2020).
15. Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
16. Zakov, S., Kinsella, M. & Bafna, V. An algorithmic approach for breakage-fusion-bridge detection in tumor genomes. Proc. Natl Acad. Sci. USA 110, 5546–5551 (2013).
17. Rajkumar, U. et al. EcSeg: semantic segmentation of metaphase images containing extrachromosomal DNA. iScience 21, 428–435 (2019).
18. Storlazzi, C. T. et al. Gene amplification as double minutes or homogeneously staining regions in solid tumors: origin and structure. Genome Res. 20, 1198–1206 (2010).
19. Møller, H. D., Parsons, L., Jørgensen, T. S., Botstein, D. & Regenberg, B. Extrachromosomal circular DNA is common in yeast. Proc. Natl Acad. Sci. USA 112, E3114–E3122 (2015).
20. Møller, H. D. et al. Circular DNA elements of chromosomal origin are common in healthy human somatic tissue. Nat. Commun. 9, 1069 (2018).
21. Kumar, P. et al. Normal and cancerous tissues release extrachromosomal circular DNA (eccDNA) into the circulation. Mol. Cancer Res. 15, 1197–1205 (2017).
22. Shibata, Y. et al. Extrachromosomal microDNAs and chromosomal microdeletions in normal tissues. Science 336, 82–86 (2012).
23. Davoli, T. & de Lange, T. The causes and consequences of polyploidy in normal development and cancer. Annu. Rev. Cell Dev. Biol. 27, 585–610 (2011).
24. Bielski, C. M. et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat. Genet. 50, 1189–1195 (2018).
25. Cortés-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat. Genet. 52, 331–341 (2020).
26. Ly, P. et al. Chromosome segregation errors generate a diverse spectrum of simple and complex genomic rearrangements. Nat. Genet. 51, 705–715 (2019).
27. Zhang, C.-Z. et al. Chromothripsis from DNA damage in micronuclei. Nature 522, 179–184 (2015).
28. Umbreit, N. T. et al. Mechanisms generating cancer genome complexity from a single cell division error. Science 368, eaba0712 (2020).
29. Menghi, F. et al. The tandem duplicator phenotype is a prevalent genome-wide cancer configuration driven by distinct gene mutations. Cancer Cell 34, 197–210.e5 (2018).
30. Morton, A. R. et al. Functional enhancers shape extrachromosomal oncogene amplifications. Cell 179, 1330–1341.e13 (2019).
31. Wu, S. et al. Circular ecDNA promotes accessible chromatin and high oncogene expression. Nature 575, 699–703 (2019).
32. Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
33. Helmsauer, K. et al. Enhancer hijacking determines intra- and extrachromosomal circular MYCN amplicon architecture in neuroblastoma. Preprint at bioRxiv https://doi.org/10.1101/2019.12.20.875807 (2019).
34. Davoli, T., Uno, H., Wooten, E. C. & Elledge, S. J. Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 355, eaaf8399 (2017).
35. Hadi, K. et al. Novel patterns of complex structural variation revealed across thousands of cancer genome graphs. Preprint at bioRxiv https://doi.org/10.1101/836296 (2019).
36. Priestley, P. et al. Pan-cancer whole-genome analyses of metastatic solid tumours. Nature 575, 210–216 (2019).
37. Taylor, A. M. et al. Genomic and functional approaches to understanding cancer aneuploidy. Cancer Cell 33, 676–689.e3 (2018).
38. Hu, X. et al. TumorFusions: an integrative resource for cancer-associated transcript fusions. Nucleic Acids Res. 46, D1144–D1149 (2018).
39. Yoshihara, K. et al. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene 34, 4845–4854 (2015).
40. Torres-García, W. et al. PRADA: pipeline for RNA sequencing data analysis. Bioinformatics 30, 2224–2226 (2014).
41. Wala, J. A. et al. SvABA: genome-wide detection of structural variants and indels by local assembly. Genome Res. 28, 581–591 (2018).
42. Quinlan, A. R. BEDTools: the Swiss-army tool for genome feature analysis. Curr. Protoc. Bioinformatics 47, 11.12.1–11.12.34 (2014).



Acknowledgements

This work was supported by the Ludwig Institute for Cancer Research (P.S.M.), Defeat GBM Program of the National Brain Tumor Society (P.S.M.), NVIDIA Foundation, Compute for the Cure (P.S.M.), Ben and Catherine Ivy Foundation (P.S.M.), generous donations from the Ziering Family Foundation in memory of Sigi Ziering (P.S.M.) and Ruth L. Kirschstein National Research Service Award. This work was also supported by the following National Institutes of Health grants: NS73831 (to P.S.M.), GM114362 (to V.B.), R01 CA190121, R01 CA237208 and R21 NS114873. This work was supported by Cancer Center Support Grant P30 CA034196 (to R.G.W.V.), grant nos. R35CA209919 (to H.Y.C.) and RM1-HG007735 (to H.Y.C.), R35GM133600 (to C.R.B.), National Science Foundation grant nos. NSF-IIS-1318386 and NSF-DBI-1458557 (to V.B.), and grants from the Musella Foundation, B*CURED Foundation, Brain Tumour Charity and Department of Defense grant no. W81XWH1910246 (to R.G.W.V.). H.Y.C. is an Investigator of the Howard Hughes Medical Institute. The results published in this paper are in whole or part based on data generated by the TCGA Research Network (https://www.cancer.gov/tcga) and the International Cancer Genome Consortium (https://icgc.org/). Analysis of the TCGA and International Cancer Genome Consortium datasets was made possible through the Cancer Genomics Cloud of the Institute for Systems Biology (ISB-CGC) and the Amazon Web Services Cloud, respectively.

Author information




H.K., N.P.N., P.S.M., V.B. and R.G.W.V. conceived the study and designed the experiments. Data analysis was led by H.K. and N.P.N. in collaboration with S.W., J. Luebeck, V.D., S.N., S.B.A., F.M., U.R., H.Y.C., E.Y. and C.R.B. Cloud data access was performed by H.K. and S.N. The FISH experiments were performed by K.T., S.W., E.Y. and A.D.G. EcSeg was performed by U.R. and J. Liu. The CIRCLE-seq data were provided by J.H.S. and A.G.H. H.K., N.P.N., P.S.M., V.B. and R.G.W.V. wrote the manuscript. E.Y. reviewed the manuscript. All coauthors discussed the results and commented on the manuscript and the supplementary information.

Corresponding authors

Correspondence to Paul S. Mischel or Vineet Bafna or Roel G. W. Verhaak.

Ethics declarations

Competing interests

H.Y.C., P.S.M., V.B. and R.G.W.V. are scientific cofounders of Boundless Bio and serve as consultants. V.B. is a cofounder and has equity interest in Digital Proteomics, and receives income from Digital Proteomics. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. N.P.N. and K.T. are employees of Boundless Bio.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Amplicon classification.

a. Validation on cell line data. Validation of the classification scheme on cell line data with FISH experiments for detecting ecDNA from the Turner et al. and deCarvalho et al. studies, in addition to newly generated data. FISH probes were designed for selected oncogenes and DAPI staining was performed to determine whether the FISH probe landed on chromosomal DNA or ecDNA. For each cell (represented as an image of the cell in metaphase), the number of positive ecDNA probes was counted, and for each cell line, the average number of positive ecDNA per cell was reported. For each probe, we report whether it landed in an amplicon (inferred from AmpliconArchitect) and, if so, the amplicon’s classification. The distribution of the average ecDNA per cell differed significantly between the Circular and non-circular classes (p-value < 1e-9; Wilcoxon rank sum test). b–d. Circular amplicon regions derived from whole-genome sequencing (blue) were validated with Circle-seq (red) for three neuroblastoma samples (CB2001, CB2022 and CB2050, respectively) used in the Koche et al. study.

Extended Data Fig. 2 Circular vs amplified non-circular amplification comparisons.

a. 24 recurrently amplified oncogenes significantly overlap circular regions (z-score 37.8), especially compared to amplified non-circular regions (z-scores of 30.4, 29.5, 28.0 for Linear, Heavily-rearranged, and BFB). b. For all oncogenes on amplicons with copy number >= 4 and present in at least 5 samples across the cohort, we show the class distribution of that oncogene. The oncogenes are ordered by proportion on circular amplification. c. For the 24 recurrent oncogenes known to be activated via amplification (Zack et al. Nat Gen. 2013), we report the average copy number for the oncogenes for circular amplification versus amplified-noncircular amplification. d. Breakpoint location across all samples for each recurrently amplified oncogene. We identified all breakpoints from each sample containing the recurrent oncogene on ecDNA and report the total number of breakpoints across this region in 1kb binned windows. e. Distribution of breakpoint locations across all circular samples for each recurrently amplified oncogene. We identified all breakpoints from each sample containing the recurrent oncogene on ecDNA. Shown is the distribution of the number of breakpoints in each bin, which closely follows a Poisson distribution, suggesting that the breakpoints are mostly randomly distributed across the region.
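The binning described for panels d and e can be illustrated with a minimal sketch. It assumes breakpoints are given as integer genomic coordinates within a single region, and that the Poisson rate is estimated as the mean count per bin; the function names (`bin_breakpoints`, `poisson_pmf`, `observed_vs_poisson`) are hypothetical and are not taken from the authors' code.

```python
from collections import Counter
from math import exp, factorial


def bin_breakpoints(positions, region_start, region_end, bin_size=1000):
    """Count breakpoints per fixed-width window (1 kb by default).

    The region is treated as half-open: [region_start, region_end).
    Positions outside the region are ignored.
    """
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in positions:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson(lam) model."""
    return exp(-lam) * lam ** k / factorial(k)


def observed_vs_poisson(counts):
    """Map each observed per-bin count k to (observed fraction, Poisson expectation).

    Assumption: the Poisson rate is the mean number of breakpoints per bin.
    """
    n = len(counts)
    lam = sum(counts) / n
    hist = Counter(counts)
    return {k: (hist[k] / n, poisson_pmf(k, lam)) for k in sorted(hist)}


# Toy coordinates, not data from the study.
counts = bin_breakpoints([100, 950, 1000, 2500, 2999, 5000], 0, 5000)
comparison = observed_vs_poisson(counts)
```

If breakpoints fall roughly at random across the region, the observed fractions should track the Poisson expectations; a formal goodness-of-fit test would be needed to quantify the agreement.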

Extended Data Fig. 3 Genome instability vs amplicon classes.

a. Chromosome arm aneuploidy scores showing no or marginal difference in chromosomal arm level events between circular and non-circular amplification classes. b. Genome doubling events by amplification class. c. Distribution for total DNA loss segments by amplification class. WGS-inferred CNV data were used to count the total number of DNA losses within a sample. A DNA loss was defined as a segment with CN < 2. d. Distribution for total DNA gain segments by amplification class. WGS-inferred CNV data were used to count the total number of DNA gains within a sample. A DNA gain was defined as a segment with CN > 2. Circular samples contain statistically significantly more DNA gains than BFB, Heavily-rearranged, Linear, and No-fSCNA (p-value <0.03, <0.03, <1e-20, and <1e-111, respectively; Wilcoxon rank sum test). e. Breakpoint homology by amplification class. f. Comparison of amplicon versus locus-level chromothripsis (Pearson’s Chi-squared test data: X-squared = 4674.7, df = 3, p-value < 2.2e-16). g. Comparison of sample category versus sample-level chromothripsis (Pearson’s Chi-squared test data: X-squared = 21.58, df = 3, p-value = 8e-05 (excludes ‘No fSCNA detected’ category)). h. Comparison of sample category versus sample-level tandem duplication (Pearson’s Chi-squared test data: X-squared = 7.39, df = 3, p-value = 0.06 (excludes ‘No fSCNA detected’ category)).
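The gain and loss definitions in panels c and d amount to a simple threshold count over copy-number segments. The sketch below assumes each sample is represented as a list of per-segment copy numbers; the function name `count_gains_and_losses` and the `baseline` parameter are hypothetical.

```python
def count_gains_and_losses(segment_copy_numbers, baseline=2):
    """Return (gains, losses) for one sample.

    A gain is a segment with CN > baseline and a loss is a segment with
    CN < baseline, following the CN > 2 and CN < 2 definitions in the caption.
    Segments exactly at the baseline are counted as neither.
    """
    gains = sum(1 for cn in segment_copy_numbers if cn > baseline)
    losses = sum(1 for cn in segment_copy_numbers if cn < baseline)
    return gains, losses


# Toy segment copy numbers, not data from the study.
gains, losses = count_gains_and_losses([2, 1, 3, 5, 0, 2])
```

Per-sample totals computed this way could then be compared between amplification classes with a rank-based test, as the caption reports.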

Extended Data Fig. 4 Gene expression of amplicon classes.

Copy number of the oncogene versus its fold-change in FPKM for all oncogenes with a copy count greater than 4, for each oncogene on each amplicon. The fold-change in FPKM is computed as the oncogene’s (FPKM-UQ+1) divided by the average of (FPKM-UQ+1) for the same oncogene in all other tumor samples from the same cohort for which the oncogene is not on any amplicon (that is, not amplified). Linear regression lines, using fold change = m*CNV+b where m and b are selected to minimize error of the fit, are shown for each class. Tukey’s range test shows oncogenes on circular structures are significantly different from oncogenes on non-circular structures (p-value < 1e-7).
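The fold-change definition and the per-class regression line can be sketched as follows. The caption does not specify which error is minimized, so ordinary least squares is assumed here; the function names `fold_change` and `fit_line` are hypothetical and do not come from the authors' pipeline.

```python
from statistics import mean


def fold_change(fpkm_uq, reference_fpkm_uq):
    """(FPKM-UQ + 1) of the amplified sample divided by the mean (FPKM-UQ + 1)
    of the same gene in non-amplified samples from the same cohort."""
    if not reference_fpkm_uq:
        raise ValueError("need at least one non-amplified reference sample")
    baseline = mean(value + 1.0 for value in reference_fpkm_uq)
    return (fpkm_uq + 1.0) / baseline


def fit_line(copy_numbers, fold_changes):
    """Fit fold_change = m * CN + b by ordinary least squares (an assumption)."""
    if len(copy_numbers) != len(fold_changes) or len(copy_numbers) < 2:
        raise ValueError("need two or more paired observations")
    x_bar = mean(copy_numbers)
    y_bar = mean(fold_changes)
    sxx = sum((x - x_bar) ** 2 for x in copy_numbers)
    if sxx == 0:
        raise ValueError("copy numbers must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(copy_numbers, fold_changes))
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


# Toy values, not data from the study.
fc = fold_change(8.0, [1.0, 3.0])
m, b = fit_line([4, 6, 8], [3.0, 5.0, 7.0])
```

Adding 1 to FPKM-UQ before taking the ratio keeps the fold change defined when the reference expression is zero. Fitting one line per amplicon class allows slopes to be compared at matched copy number.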

Extended Data Fig. 5 Lymph node stage vs amplicon classes.

Lymph node stage for primary tumors showing samples with amplification are more likely to have spread to the lymph node at time of diagnosis (Chi-square test; df=4; p-value<1e−05).

Extended Data Fig. 6 Cell cycle and immune infiltrate gene expression signatures vs amplicon classes.

a. Cell Cycle gene expression signature single sample GSEA (ssGSEA) scores by amplification category. b. Immune infiltrate gene expression signature single sample GSEA (ssGSEA) scores by amplification category.

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary Tables 1–3


Cite this article

Kim, H., Nguyen, N., Turner, K. et al. Extrachromosomal DNA is associated with oncogene amplification and poor outcome across multiple cancers. Nat Genet 52, 891–897 (2020). https://doi.org/10.1038/s41588-020-0678-2
