Main

Since its first description in wheat embryos and boar sperm in 1964 (ref. 3), extrachromosomal circular DNA (eccDNA) has been reported in almost all cell lines and tissues4 across different species, although its abundance is highly variable1,2. Unlike the independently existing circular DNA in organelles, such as mitochondrial DNA (mtDNA), eccDNAs are derived from genomic DNA and range in size from a few hundred bases to megabases1. Although some studies have suggested that eccDNA generation might be linked to DNA damage repair5, hypertranscription5,6, homologous recombination7 and replication stress5, how exactly eccDNAs are generated is largely unknown. Similarly, it is also unclear whether eccDNA has any function, although some studies have suggested that eccDNAs might contribute to gene amplification in cancer1 or might be linked to ageing6,7,8.

To understand eccDNA biogenesis, efficient and robust methods that allow purification and sequencing of eccDNAs are needed. Most existing eccDNA purification procedures involve two sequential steps, isolation of crude extrachromosomal DNA followed by removal of contaminating linear DNA through exonuclease digestion5,6,9 and rolling circle amplification (RCA) for profiling5,9,10. However, most eccDNA samples prepared in this way contain a high level of linear DNA before RCA, as revealed by electron microscopy5,9, indicating that exonuclease digestion alone is not sufficient to eliminate all contaminating linear DNA.

An efficient eccDNA purification method

We have developed a new three-step eccDNA enrichment method that allows efficient eccDNA purification (Fig. 1a). In the first step, to minimize eccDNA loss, we replaced the conventional unbuffered sodium hydroxide lysis, which may cause irreversible denaturation or breakage of DNA circles11, with a modified alkaline buffer at pH 11.8 to lyse the whole cells. In the second step, we used the rare cutter PacI restriction enzyme to linearize mtDNA before addition of an exonuclease (Plasmid-Safe ATP-dependent DNase) to digest linear DNA. In the third step, solution A, which could selectively recover circular DNA, but not linear DNA, on silica beads (Extended Data Fig. 1a), was used to exclude any linear DNA that escaped exonuclease digestion (Fig. 1a). Additionally, vertical agarose gel electrophoresis was used to increase the sensitivity of eccDNA detection (Extended Data Fig. 1b). Using this three-step purification procedure, we purified eccDNA from 10 million HeLa cells growing at confluence, a stress condition known to increase eccDNA abundance12. The purified eccDNAs exhibited a discrete banded pattern (Fig. 1b). Furthermore, mtDNA could be removed by PacI treatment (Fig. 1b, compare lanes 1 and 2). We further confirmed the circularity of the purified eccDNAs using scanning atomic-force microscopy (SAFM) (Fig. 1c). To determine whether eccDNAs occur in non-cancer cells, mouse embryonic stem cells (mESCs) were used, and mESC-derived eccDNAs exhibited a similar banded pattern (Fig. 1d), and their purity and circularity were also verified by SAFM (Fig. 1e).

Fig. 1: Development of a three-step eccDNA purification procedure.
a, Schematic of the three-step eccDNA purification and sequencing procedure. Step 1, extract crude DNA circles from whole cells in a buffered alkaline lysis solution and bind them to a silica column; step 2, linearize mtDNA with PacI and reduce overall linear DNA levels with Plasmid-Safe (PS) DNase; step 3, selectively recover eccDNAs by excluding residual linear DNA in solution A. eccDNAs are then sequenced by Oxford Nanopore sequencing after RCA (left) or by Illumina sequencing after Tn5 tagmentation on eccDNAs (right). b, Agarose gel showing eccDNAs purified from over-confluent HeLa cells without (–) or with (+) PacI treatment. M, linear DNA marker; Mt, mtDNA. c, eccDNAs in b (lane 2: top image; lane 3: bottom image) scanned with SAFM. d, Agarose gel showing eccDNAs purified from normal cultured mESCs. Red arrowheads indicate distinct DNA bands. e, eccDNAs in d (lane 2: top image; lane 3: bottom image) scanned with SAFM. In b and d, representative gels are shown from three independent experiments. In c and e, two representative fields are shown.

eccDNAs map to the entire genome

To gain insights into the potential mechanism of eccDNA biogenesis, we determined the genomic source of eccDNAs. HeLa cells are notorious for their aberrant genome, including aneuploidy and numerous structural variations, such as deletions, duplications, inversions, translocations and rearrangements, etc.13, making interpretation of sequencing data and dissection of the eccDNA biogenesis mechanism difficult. Thus, we performed eccDNA sequencing and mapping using mESCs, whose genetic integrity is maintained during culture14. To obtain full-length eccDNA sequences, we performed RCA and subsequent long-read Nanopore sequencing of the multiple tandem copies of individual eccDNA molecules (Fig. 1a and Extended Data Fig. 2a). Repeated sequencing of the same eccDNA with long reads allows generation of a consensus sequence matching the full-length sequence of the original eccDNA by a computational threading method (Fig. 2a and Extended Data Fig. 2c). We obtained 4 million long reads with a mean size of 3.7 kb (Extended Data Fig. 2b). To reduce false positives due to RCA artefacts and sequencing errors from the Nanopore technique15, we only used the 1.9 million long reads that each contained at least two full passes covering their original eccDNA molecule to identify high-confidence eccDNAs, resulting in the identification of 1.6 million unique eccDNAs with a median size of 1 kb (Extended Data Fig. 2b). Interestingly, the eccDNAs exhibited a regular average size interval of 188 bp (Fig. 2b). The great majority (89%) of unique eccDNAs were sequenced from a single long read (single-event eccDNA), with less than 1.5% of unique eccDNAs sequenced from more than three unique long molecules (Fig. 2c); no dominant eccDNA was identified. Such large numbers of single-event eccDNAs coupled with the lack of dominant eccDNA molecules suggest that eccDNAs are unlikely to be derived from specific genome regions.
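The read filtering and event counting described above can be illustrated with a minimal Python sketch. It assumes that each long read has already been reduced to a record holding a key for its eccDNA (for example, genomic coordinates) and the number of full passes it contains. The record layout, the function name `summarize_events` and the toy values are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter

MIN_FULL_PASSES = 2  # threshold stated in the text

def summarize_events(reads, min_passes=MIN_FULL_PASSES):
    """Keep reads with enough full passes and count reads per unique eccDNA.

    `reads` is an iterable of (ecc_key, full_passes) tuples, where ecc_key
    is any hashable description of the circle, e.g. (chrom, start, end).
    """
    kept = [key for key, passes in reads if passes >= min_passes]
    events = Counter(kept)  # number of supporting reads per unique circle
    n_unique = len(events)
    n_single = sum(1 for count in events.values() if count == 1)
    frac_single = n_single / n_unique if n_unique else 0.0
    return {
        "reads_kept": len(kept),
        "unique": n_unique,
        "single_event_fraction": frac_single,
    }

# Toy input with hypothetical coordinates
toy_reads = [
    (("chr17", 1000, 1188), 3),
    (("chr17", 1000, 1188), 2),
    (("chr18", 5000, 5376), 2),
    (("chr2", 200, 760), 1),   # dropped: only one full pass
    (("chr5", 900, 1280), 4),
]
summary = summarize_events(toy_reads)
```

In this toy input, four reads pass the filter and collapse into three unique circles, two of which are supported by a single read.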

Fig. 2: eccDNAs are circularized genomic DNA fragments that map across the genome.
a, Integrative Genomics Viewer (IGV) alignments showing eccDNA examples from two genomic loci on chromosomes 17 and 18. Individual horizontal bars in the same colour represent subreads from a unique Nanopore long read that was repeatedly aligned to the same genomic locus or loci. ecc-1, ecc-2 and ecc-4 are single-fragment circles. ecc-1 partially overlaps ecc-2; ecc-4 partially overlaps one fragment of ecc-3, which is a two-fragment circle (2f eccDNA) aligned to two loci on chromosomes 17 and 18. b, Histogram showing eccDNA size distribution and relative abundance. c, Pie chart showing the percentages of eccDNAs with the indicated event numbers among the total unique eccDNAs identified. d, Bar graph showing eccDNA counts with the indicated number of fragments (1–7) in each circle. e, Circle plot showing the chromosomal origin of all two-fragment eccDNAs. Subreads from the same chromosome are in the same colour. f, Overall chromosomal distribution of eccDNAs across the genome.

Genome mapping of full-length eccDNAs revealed their various genomic alignment patterns, including at adjacent, overlapped or nested positions on the same chromosome or even across different chromosomes (Fig. 2a). We found that a great majority of eccDNAs originated from single continuous genomic loci (continuous eccDNAs, self-circularization of a single genomic fragment), and only a relatively small number of eccDNAs were formed from multiple genomic fragments (non-continuous eccDNAs, circularization of multiple genomic fragments) (Fig. 2d and Extended Data Fig. 2c), including three eccDNAs each with seven genomic fragments joined together to form a circle (7f eccDNA) (Fig. 2d). To determine whether the physical distance between genomic fragments affects the frequency of eccDNA formation, we analysed the genomic origin of two-fragment eccDNAs (2f eccDNAs). A circle plot clearly showed that paired fragments of 2f eccDNAs are not restricted to the same chromosome (Fig. 2e), but rather are randomly bridged between chromosomes, indicating that eccDNAs can be formed by joining genomic fragments from different chromosomes. Consistent with this, genome mapping of all eccDNAs revealed that eccDNAs are widespread across the entire genome (Fig. 2f).
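The distinction between continuous and non-continuous eccDNAs, and between intra- and inter-chromosomal fragment pairing, can be expressed as a short classification routine. The Python sketch below is only an illustration: it assumes each circle is represented as a list of `(chrom, start, end)` fragments, and the function name, labels and toy circles are hypothetical.

```python
from collections import Counter

def classify_circle(fragments):
    """Classify one eccDNA from its list of (chrom, start, end) fragments."""
    if not fragments:
        raise ValueError("a circle needs at least one fragment")
    if len(fragments) == 1:
        return "continuous"
    chroms = {chrom for chrom, _start, _end in fragments}
    if len(chroms) > 1:
        return "non-continuous, inter-chromosomal"
    return "non-continuous, intra-chromosomal"

# Toy circles with hypothetical coordinates
toy_circles = [
    [("chr17", 1000, 1188)],                          # single fragment
    [("chr17", 3000, 3200), ("chr18", 700, 1100)],    # 2f, two chromosomes
    [("chr4", 100, 300), ("chr4", 90000, 90400)],     # 2f, one chromosome
    [("chr1", 10, 210)],                              # single fragment
]
class_counts = Counter(classify_circle(c) for c in toy_circles)
fragment_counts = Counter(len(c) for c in toy_circles)  # cf. a per-fragment-number tally
```

Tallying `fragment_counts` corresponds to counting circles by fragment number, while `class_counts` separates 2f circles by whether their fragments share a chromosome.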

To rule out potential biases caused by uneven amplification by RCA16, we purified another batch of eccDNAs and directly tagged them with Tn5 transposase without RCA for Illumina sequencing (Fig. 1a and Extended Data Fig. 3a). eccDNA sequences obtained in this way should faithfully reveal their genomic location and relative abundance. Consistent with the Nanopore sequencing results, Illumina sequencing showed widespread alignment of eccDNAs across the entire genome (Extended Data Fig. 3b). We noticed that the eccDNA density on the X chromosome was about half that of the autosomes (Fig. 2f and Extended Data Fig. 3b), consistent with the fact that the diploid male genome of mESC/E14 cells carries one copy of the X chromosome but two copies of each autosome. The lack of eccDNAs mapped to the Y chromosome is largely due to the many undetermined sequences and repeat sequences on the Y chromosome17. Collectively, these data suggest that eccDNAs are widespread across the entire genome and their abundance is correlated with genomic copy numbers.
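The copy-number argument can be made concrete with a small calculation: if eccDNAs arise in proportion to genomic copy number, the per-megabase density on a single-copy X chromosome should be roughly half the mean autosomal density. The Python sketch below assumes per-chromosome eccDNA counts and chromosome lengths are available as dictionaries; the function names and the round toy numbers are hypothetical and are not real mouse chromosome lengths or measured counts.

```python
def density_per_mb(counts, lengths_bp):
    """Return eccDNA counts per megabase for each chromosome."""
    return {chrom: counts[chrom] / (lengths_bp[chrom] / 1e6) for chrom in counts}

def x_to_autosome_ratio(density, x_name="chrX", y_name="chrY"):
    """Ratio of X-chromosome density to the mean autosomal density."""
    autosomal = [d for chrom, d in density.items() if chrom not in (x_name, y_name)]
    mean_autosomal = sum(autosomal) / len(autosomal)
    return density[x_name] / mean_autosomal

# Toy values chosen for easy arithmetic (hypothetical)
toy_counts = {"chr1": 2000, "chr2": 1800, "chrX": 800}
toy_lengths = {"chr1": 100_000_000, "chr2": 90_000_000, "chrX": 80_000_000}

toy_density = density_per_mb(toy_counts, toy_lengths)
ratio = x_to_autosome_ratio(toy_density)
```

With these toy numbers both autosomes have 20 eccDNAs per Mb and the X chromosome has 10 per Mb, giving a ratio of 0.5.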

DNase γ is required for eccDNA generation

The great diversity, randomness and nucleosome ‘ladder’ size (Fig. 2) suggest that eccDNAs might be generated by random ligation (including self-ligation) of oligonucleosomal DNA fragments, which can be visualized as ‘ladders’ in agarose gel and are a known feature of apoptosis18. To determine whether apoptotic cells are the source of eccDNAs, mESCs were treated with the apoptosis inducers staurosporine, etoposide or UV light. Successful induction of apoptosis was confirmed by the typical nucleosomal ladder pattern of genomic DNA (Fig. 3a). When equal amounts of control (DMSO-treated) cells and apoptosis inducer-treated cells were subjected to the three-step eccDNA purification procedure and visualized on an agarose gel, all three treatments induced eccDNA production, although UV treatment resulted in the strongest induction (Fig. 3b and Extended Data Fig. 4a).

Fig. 3: Apoptotic DNA fragmentation and subsequent ligation by Lig3 are required for eccDNA production in mESCs.
a, b, eccDNA production is induced by apoptosis. mESCs were treated with the indicated apoptosis inducers, and total DNA (a) and eccDNAs (b) were purified. ETO, etoposide; STS, staurosporine. c, d, Deficiency of oligonucleosomal DNA fragmentation abolishes apoptosis-induced eccDNA production. Knockout of Dnase1l3, but not Endog (encoding endonuclease G), abolishes UV-induced oligonucleosomal DNA fragmentation (c) and eccDNA production (d) in mESCs. mtDNA was kept as an internal control. e, Confirmation of DNA ligase-deficient CH12F3 cell lines. Top, immunoblotting confirming knockout of DNA ligases in CH12F3 cell lines. Bottom, genomic structure of Lig3 with CRISPR–Cas9 specifically targeting (Δ) NucLig3 but retaining MtLig3 for cell viability. f, g, Lig3 is the major DNA ligase for eccDNA generation. Shown are staurosporine-induced oligonucleosomal DNA fragmentation (f) and eccDNA (g) from the indicated CH12F3 cell lines. In a–d, f and g, the amount of input cells and elution and loading volumes were equal among the samples on each agarose gel. Shown are representatives of three independent experiments.

We next determined whether eccDNA generation requires apoptotic DNA fragmentation (ADF), which is mediated by caspase-activated DNase (CAD)19, endonuclease G (EndoG)20 or DNase γ21 in a cell-type-specific manner. Genetic manipulation (Extended Data Fig. 4b, c) indicated that DNase γ, but not EndoG, mediates ADF in mESCs, as indicated by the lack of a ladder pattern with DNase γ (encoded by Dnase1l3) knockout21 (Fig. 3c). Dnase1l3 knockout did not affect cell viability under either normal culture conditions or UV treatment (Extended Data Fig. 4d, e). Purification of the eccDNAs from UV-treated cells demonstrated that abrogation of ADF prevented eccDNA generation (Fig. 3d and Extended Data Fig. 4f). We intentionally skipped PacI digestion to retain mtDNA as an internal control for equal cell input and circular DNA recovery (Fig. 3d). These results demonstrate that ADF is a prerequisite for eccDNA generation.

Lig3 is required for eccDNA generation

Next, we attempted to identify the DNA ligase responsible for circularizing the fragmented DNA. Mammals have three DNA ligase genes (Lig1, Lig3 and Lig4), each of which has a specific function, although they also function redundantly in DNA metabolism22. The functions of these ligases have been well studied in the CH12F3 mouse B-lymphocyte cell line23. To determine which of the three DNA ligases is responsible for ADF circularization, individual DNA ligases and their combinations were knocked out in CH12F3 cells by CRISPR–Cas9 with knockout confirmed by western blotting (Fig. 3e). Lig3 has both nuclear and mitochondrial isoforms, the latter of which is essential for mitochondrial maintenance and, consequently, cell viability24. Thus, the Lig3-knockout cell line was generated by specifically targeting the nuclear isoform (NucLig3–/–) without interfering with the mitochondrial isoform (MtLig3; Fig. 3e, lower diagram). Equal numbers of wild-type (WT) and mutant cells were treated with staurosporine to induce ADF (Fig. 3f), and eccDNAs were purified and visualized in agarose gel (Fig. 3g). The results indicated that knockout of Lig1 or Lig4 alone or in combination did not significantly affect eccDNA generation. In contrast, knockout of Lig3 greatly reduced eccDNA generation (Fig. 3g and Extended Data Fig. 4g). Because double knockout of Lig1 and Lig3 is lethal to cells23,24, it is unknown whether double knockout could completely abrogate eccDNA generation. Nevertheless, these data support Lig3 as the main ligase for eccDNA generation in CH12F3 cells.

Fig. 4: eccDNAs are potent immunostimulants.
a, eccDNAs induce Ifna (encoding IFNα) and Ifnb1 (encoding IFNβ) expression in BMDCs. Equal amounts of the indicated DNA were transfected into BMDCs at increasing concentrations for RT–qPCR analysis. Data are presented as relative mRNA fold change (y axis) with respect to that in mock transfected cells (without DNA). Li-DNA, sonicated genomic linear DNA with sizes similar to those of eccDNAs. b, ELISA analysis of IFNα and IFNβ production in medium from a. c, Confirmation of eccDNA linearization. A representative gel shows equal amounts of the indicated DNA digested with Plasmid-Safe DNase or left undigested. Li-ecc, linearized eccDNA. d, Linearized eccDNAs lose their immunostimulatory activities. The DNAs from lanes 1–3 in c were transfected into BMDCs at 30 ng ml–1 and mRNA levels were evaluated as in a. e, f, Synthetic small DNA circles are potent immunostimulants. The same experiments were performed as in a and b, except that the linear DNA and eccDNA were replaced by synthetic linear (Syn-linear) and circular (Syn-circular) DNAs with the same sequence. g, eccDNAs are present in apoptotic medium from WT cells but not Dnase1l3–/– (–/–) cells. A representative gel of eccDNA (left) and quantification (right; n = 3) are shown; ND, not detected. h, Exonuclease-resistant DNA (not mtDNA) in apoptotic medium activates Ifna and Ifnb1 expression. Supernatants from apoptotic WT or Dnase1l3–/– (–/–) cells were treated as indicated, then incubated with BMDCs for RT–qPCR analysis as in a. Benz, benzonase. In a, b, d–f and h, data are shown as the mean ± s.e.m. of replicates (n = 4 per group) of a representative from three independent experiments. Statistics were calculated on biological replicates with ordinary one-way ANOVA with Tukey’s multiple-comparison test (a, b, d–f, h) or two-tailed unpaired t-tests (g): *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; NS, not significant.
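For readers unfamiliar with the summary statistic used in these panels, the mean ± s.e.m. of a group of replicate fold-change values can be computed as in the following Python sketch. It uses the sample standard deviation divided by the square root of the number of replicates; the function name and the four toy values are hypothetical and do not represent measured data.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for the s.e.m.")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # sample s.d. / sqrt(n)
    return mean, sem

# Toy fold-change values relative to mock transfection (hypothetical, n = 4)
toy_fold_changes = [2.0, 4.0, 4.0, 6.0]
toy_mean, toy_sem = mean_and_sem(toy_fold_changes)
```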

eccDNAs are potent innate immunostimulants

The above results demonstrate that eccDNAs are ligation products of fragmented genomic DNA from apoptotic cells. DNA released from dying cells has previously been reported to promote immune responses18. Two important mediators of immune response, Toll-like receptor 9 (TLR9)25 and high-mobility-group box 1 (HMGB1)26, have been reported to preferentially bind to DNA curvatures. These observations suggest that eccDNAs may serve as immunostimulants. To test this idea, we generated bone marrow-derived dendritic cells (BMDCs) (Extended Data Fig. 5a) and compared the immunostimulant activity of sheared linear genomic DNA, eccDNAs and the widely used potent DNA ligand of cytosolic DNA sensors poly(dG:dC)27 (Extended Data Fig. 5b). We transfected BMDCs with different amounts of the three forms of DNA and then collected cells for RT–qPCR assays. Compared with transfection with linear DNA, type I interferons (IFNα, IFNβ), interleukin-6 (IL-6) and tumour necrosis factor (TNF) were all significantly induced by eccDNAs at a wide range of concentrations (10–240 ng ml–1) (Fig. 4a and Extended Data Fig. 5c). Surprisingly, the widely regarded ‘potent’ cytokine inducer poly(dG:dC) was not nearly as potent as eccDNAs at lower concentrations, and the linear DNA triggered only a mild response, even at the highest concentration, when compared with mock transfection, indicating that dendritic cells are much more sensitive to eccDNA than to linear DNA or poly(dG:dC). Consistent with this, enzyme-linked immunosorbent assays (ELISAs) confirmed the strong potency of eccDNAs in cytokine induction (Fig. 4b and Extended Data Fig. 5d).

In addition to dendritic cells, macrophages are also known to respond to immunostimulants28. To determine whether macrophages behave like dendritic cells upon eccDNA transfection, we generated bone marrow-derived macrophages (BMDMs) (Extended Data Fig. 6a). Similarly to the observations in BMDCs, eccDNAs also displayed much higher immunostimulant activities in BMDMs than linear DNA or poly(dG:dC), particularly at lower concentrations (10 and 30 ng ml–1) (Extended Data Fig. 6b, c). These data indicate that eccDNAs are very potent immunostimulants in activating both BMDCs and BMDMs. Furthermore, pretreatment of eccDNA with DNase I before transfection completely abrogated the capacity of eccDNA to induce cytokine production (Extended Data Fig. 5e, f), demonstrating that eccDNA, rather than potential concomitants of eccDNA, is responsible for the immune activation.

Circularization confers eccDNA potency

To determine whether the circular nature of eccDNAs is critical for their strong immunostimulatory activity, purified eccDNAs were first treated with FnoCas12a (Cpf1), which introduces one nick per circular DNA in the presence of Mn2+ and the absence of guide RNA29. The nicked eccDNAs were then treated with the single-strand-specific endonuclease nuclease S1, which cleaved the intact circular strand at the site opposite to the nick to generate linearized eccDNA. Linearization of eccDNAs was confirmed by their sensitivity to exonuclease digestion while intact eccDNAs were resistant (Fig. 4c, compare lanes 5 and 6). When equal amounts of linear DNAs, eccDNAs and linear eccDNAs were transfected into BMDCs, linear eccDNAs behaved like linear DNAs and failed to activate IFNα, IFNβ, IL-6 or TNF (Fig. 4d and Extended Data Fig. 7a). These results demonstrated that the circular nature of eccDNAs is critical for their strong immunostimulant activity. Because eccDNAs are derived from randomly ligated genomic fragments, their sequences are unlikely to significantly contribute to their potency. This notion was confirmed by the ability of a synthetic 200-bp circular DNA, but not its linear counterpart, to greatly induce cytokine gene transcription in BMDCs (Fig. 4e and Extended Data Fig. 7b). Similarly to native eccDNAs, synthetic circular DNA also showed higher potency in cytokine gene activation than poly(dG:dC) (Fig. 4e and Extended Data Fig. 7b). Consistent with these findings, ELISAs confirmed the strong cytokine induction capacity of synthetic circular DNA (Fig. 4f and Extended Data Fig. 7c).

To rule out the possibility that the increased immunostimulatory potency of circular DNA is due to increased stability and transfection efficiency, and to minimize the effects of exonuclease activity on linear DNA, we added phosphorothioate30 bonds on both ends of the synthetic 200-bp linear DNA. To exclude potential effects from the phosphorothioate bonds on transfection and immune stimulation, equal numbers of phosphorothioate bonds were also added in the circular counterpart. Linear and circular DNAs were separately transfected into BMDCs, and cell lysates and culture media were collected for qPCR and ELISA to compare transfection efficiency (1 h after transfection), stability and cytokine induction (12 h after transfection) (Extended Data Fig. 7d). We found no significant difference in transfection efficiency or stability between the linear and circular 200-bp DNAs (Extended Data Fig. 7e). Yet, the circular DNA induced a high level of cytokine production while its linear counterpart did not (Extended Data Fig. 7f). Collectively, these data support the idea that the circularity but not the sequence of eccDNAs confers the high potency of their immunostimulant activity.

BMDCs sense eccDNAs in medium from apoptotic cells

Because eccDNAs are generated in apoptotic cells, they could be released into the culture medium. Indeed, a substantial amount of eccDNAs could be detected in the cell-free supernatant of UV-treated mESCs undergoing apoptosis (Fig. 4g). To determine whether eccDNAs from the supernatant of apoptotic cells can be actively sensed by BMDCs without transfection, BMDCs were co-incubated with cell-free apoptotic supernatant from WT or Dnase1l3–/– mESCs. RT–qPCR analysis indicated that the supernatant from WT, but not Dnase1l3–/–, apoptotic cells stimulated IFNα and IFNβ expression (Fig. 4h). Importantly, this stimulation was not sensitive to pretreatment of the supernatant with Plasmid-Safe DNase, PacI restriction enzyme or RNases, which digest linear DNA, mtDNA and RNA, respectively (Fig. 4h), but was sensitive to pretreatment with benzonase, a nuclease that destroys all forms of DNA and RNA without proteolytic activity (Fig. 4h). These data indicate that eccDNAs, but not linear DNAs, mtDNAs or RNAs, in the supernatant of apoptotic cells are responsible for the induced immune response. Furthermore, these results also indicate that eccDNAs can be actively sensed by BMDCs without transfection. Collectively, our results indicate that eccDNAs are potent damage-associated molecular patterns of the innate immune system28.

eccDNA-triggered immune response requires Sting

To assess the global transcriptional effect of eccDNA, we performed RNA-seq analysis of BMDCs transfected with purified eccDNAs or with sonicated genomic DNA of similar size (Extended Data Fig. 8a–c). Comparative analysis indicated that eccDNAs, but not the linear DNA control, significantly increased the expression of 290 genes (fold change ≥ 5, P < 0.001), including 34 cytokines and chemokines (Fig. 5a and Supplementary Table 4), under our experimental conditions (30 ng ml–1 DNA transfected). Importantly, 9 of the top 20 most upregulated genes belong to the family of type I interferons (Fig. 5b). Gene ontology (GO) enrichment analysis revealed that the upregulated genes were enriched in terms relevant to immune response and related signalling pathways (Fig. 5c), supporting our conclusion that eccDNA is a potent innate immunostimulant that can generally increase the innate immune response. Parallel experiments further demonstrated a similar effect of eccDNAs in BMDMs (Extended Data Fig. 9a–f and Supplementary Table 5). Collectively, these data support a higher capacity and potency of eccDNA in triggering a general immune response than linear genomic DNA fragments (Fig. 5b). Importantly, this eccDNA property depends on eccDNA circularization, but not sequence, as transfection of the 200-bp synthetic circular DNA into BMDCs triggered a similar transcriptional response as purified eccDNAs (Fig. 5d and Extended Data Fig. 10a).
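The selection of eccDNA-induced genes by a fold-change and P-value cutoff can be sketched as a simple filter over differential-expression results. The Python example below assumes a table of log2 fold changes and adjusted P values keyed by gene; the function name, the placeholder gene names and the toy values are hypothetical, and the sketch does not reproduce the statistical testing itself.

```python
import math

def induced_genes(results, min_fc=5.0, max_padj=1e-3):
    """Return genes with fold change >= min_fc and adjusted P < max_padj.

    `results` maps gene name -> (log2_fold_change, adjusted_p); an adjusted
    P of None (e.g. a gene excluded from testing) never passes the filter.
    """
    min_log2fc = math.log2(min_fc)  # fold change of 5 is about 2.32 in log2
    return sorted(
        gene
        for gene, (log2fc, padj) in results.items()
        if padj is not None and log2fc >= min_log2fc and padj < max_padj
    )

# Toy results with placeholder gene names (hypothetical values)
toy_results = {
    "geneA": (3.0, 1e-5),   # passes both cutoffs
    "geneB": (2.0, 1e-6),   # fold change too small (4-fold)
    "geneC": (4.0, 1e-2),   # P value too large
    "geneD": (2.5, 5e-4),   # passes both cutoffs
    "geneE": (5.0, None),   # no adjusted P value
}
selected = induced_genes(toy_results)
```

Working in log2 space keeps the cutoff consistent with log2-transformed expression values; the thresholds are parameters so that a less stringent comparison (for example, fold change ≥ 2) can reuse the same function.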

Fig. 5: Sting is required for eccDNA-induced gene expression.
a, Scatterplot showing 290 genes (left, red dots) that are significantly induced by eccDNA, but not linear DNA, in BMDCs. Thirty-four significantly induced cytokine genes are indicated (right, red dots). FC, fold change. b, Heatmap representation of the top 20 most strongly induced genes. c, GO terms enriched in the genes activated by eccDNA treatment in BMDCs. The number of genes in each term and the P values of enrichment are indicated. d, Scatterplot indicating that the transcriptomes of BMDCs treated with eccDNAs and BMDCs treated with synthetic circular DNA are highly similar. e, Heatmap representation of the eccDNA-responsive genes in control and eccDNA-treated BMDCs of the indicated genotype. f, Scatterplots comparing the transcriptome affected by eccDNA in BMDCs from WT, Sting1–/– and Myd88–/– mice. eccDNA-responsive genes in WT BMDCs are indicated by red dots (upregulated, n = 365) and blue dots (downregulated, n = 79). In a, d and f, the x and y axes are log2-transformed normalized read counts; P values were generated using DESeq2 (ref. 42) and adjusted by IHW43. FC ≥ 5, adjusted P < 0.001 in a and f; FC ≥ 2, adjusted P < 0.01 in d.

To determine how eccDNA is sensed, two well-known mouse lines deficient in DNA sensing, with knockout of Sting1 (stimulator of interferon genes)31 or Myd88 (myeloid differentiation primary response 88)32, were used to generate BMDCs (Extended Data Fig. 10b), which were then subjected to eccDNA transfection. Comparative RNA-seq analysis demonstrated that, while loss of function of Myd88 did not affect BMDC responses to eccDNAs, loss of Sting function completely abrogated the capacity of BMDCs to respond to eccDNAs, as almost all the genes normally induced by eccDNA were not induced in the absence of Sting (Fig. 5e, f, Extended Data Fig. 10c and Supplementary Table 6). These data strongly suggest that the Sting pathway is responsible for sensing eccDNA to mediate its immune response.

Our data indicate that apoptotic oligonucleosomal DNA fragmentation (ODF) is directly linked to eccDNA generation, as blocking ADF abolishes eccDNA production (Fig. 3c, d). This notion challenges the assumption that eccDNAs isolated from a cell population or tissue are equally contributed by each cell9. On the contrary, our data suggest that eccDNAs are mostly derived from cells that undergo genomic DNA fragmentation. During apoptosis, genomic DNA is first broken into high-molecular-weight (HMW) fragments (>50 kb) and subsequently undergoes ODF to generate oligonucleosomal fragments19,33. Similarly to ODF in myoblasts21 and neuroblastomas under differentiation conditions34, we showed that DNase γ is required for ODF and subsequent eccDNA generation in mESCs (Fig. 3c, d). This suggests that the eccDNAs that we purified are generated in the late stage of apoptosis when ODF occurs33, which is consistent with our observation that the great majority (99.5%) of eccDNAs are within 3 kb in size (Fig. 2b), although we identified eccDNAs as long as 10 kb. Although random ligation by Lig3 of nucleosome-sized DNA fragments in the late stage of apoptosis explains the predominantly oligonucleosomal sizes and the absence of abundant individual eccDNAs in our study and previous studies5,9, it is possible that rare ligation of HMW fragments in the early stage of apoptosis might also occur. Our demonstration that Lig3 is responsible for nucleosome-sized eccDNA generation is consistent with the ability of Lig3 to circularize DNA fragments in vitro35. Whether Lig3 is also involved in the biogenesis of larger eccDNAs, double minutes36 and extrachromosomal DNAs37 remains to be determined. Although this study mainly focused on eccDNA generation under apoptotic conditions, we do not rule out the possibility that eccDNAs can also be generated under other conditions that cause genomic DNA fragmentation (for example, replication stress, double-strand DNA breaks, V(D)J recombination, etc.) with subsequent circularization.

We demonstrated that purified eccDNAs or synthetic circular DNA, but not their linear counterparts, have strong immunostimulatory activity (Figs. 4 and 5a–c). Importantly, Sting, but not Myd88, is required to mediate this process (Fig. 5e, f). cGAS–Sting is a well-known intracellular DNA sensing pathway38, and DNA sensing by cGAS has been reported to be enhanced by the host factors HMGB1 and TFAM, which facilitate DNA bending or form U-shaped structures39. Whether these factors are involved in eccDNA-mediated immune stimulation remains to be determined. In addition to the immunostimulatory activity of eccDNA we showed in this study, it is not clear whether eccDNAs from apoptotic cells are linked to the oncogene amplification and tumour progression shown for large extrachromosomal DNAs37.

Our demonstration that eccDNAs can dramatically induce type I interferon expression (Figs. 4 and 5), combined with previous observations that type I interferons possess adjuvant activity40 and that DNA released from dying cells mediates aluminum adjuvant activity18, prompts us to propose that eccDNAs possess high adjuvant activity. In addition, the existence of eccDNAs in plasma10 suggests that eccDNA is highly mobile. Given that increased levels of cell-free DNA and serum IL-6 and TNF are good predictors of disease severity leading to a cytokine storm41, we suspect that apoptosis-driven eccDNA generation and subsequent induction of cytokines might underlie the cytokine storm observed in diseases such as severe sepsis and coronavirus disease 2019 (COVID-19), as these diseases can involve massive cell death. If future studies confirm this notion, the eccDNA biogenesis and sensing pathway revealed in this study could serve as the basis for therapeutic interventions.

In summary, by providing answers to three key questions regarding the origin, biogenesis and biological function of eccDNAs, our study substantially advances understanding of eccDNAs. Further characterization of the molecular basis of eccDNA-mediated immune responses can provide new insight into innate immunity as well as vaccine design and immunotherapy.

Methods

Cell culture and apoptosis induction

mESC/E14 cells were cultured on dishes coated with 0.1% gelatin in standard LIF/serum medium containing mouse LIF (1,000 U ml–1), 15% foetal bovine serum (FBS), 0.1 mM non-essential amino acids, 0.055 mM β-mercaptoethanol (BME), 2 mM GlutaMAX, 1 mM sodium pyruvate and penicillin-streptomycin. HeLa S3 cells were grown in DMEM supplemented with 10% FBS and 100 U ml–1 penicillin-streptomycin. CH12F3 cells were cultured in RPMI 1640 supplemented with 10% heat-inactivated FBS, 100 U ml–1 penicillin-streptomycin, 2 mM BME and 2 mM GlutaMAX. L929 cells were cultured in DMEM supplemented with 10% heat-inactivated FBS and 100 U ml–1 penicillin-streptomycin.

Apoptotic cell death of mESCs was induced with 0.5 μM etoposide (Selleck) or 2 μM staurosporine (Selleck) for 24 h or by irradiation at 3 mJ using UV-C light in a Stratagene Stratalinker 2400 without medium and continued culture for 16 h. Apoptotic cell death was induced in CH12F3 cells by treatment with 2 μM staurosporine for 16 h. Cell viability was analysed with a BD FACSCanto II instrument after staining with the Live/Dead Fixable Far Red Dead Cell Stain Kit. FSC-A and SSC-A gates were used to exclude debris, and FSC-A and FSC-H were used to gate on singlets and then gate on APC+ cells as dead cells. Data were analysed with FlowJo v.10.8.0.

Knockout cell line generation

Dnase1l3- and Endog-knockout mESC cell lines were generated by CRISPR–Cas9 through transient transfection with px330-mCherry (Addgene, 98750). mCherry+ cells were sorted by flow cytometry. Guide RNAs targeting sequences and PCR genotyping primers for Dnase1l3 and Endog are listed in Supplementary Table 1. The Lig1–/– CH12F3 cell line was generated using CRISPR–Cas9 guide RNAs targeting introns 17 and 19 to delete exons 18 and 19, which encode the conserved ligase catalytic site. Deletion resulted in a premature stop codon. The NucLig3–/– CH12F3 cell line was generated by specifically deleting the sequences encoding the nucLig3 start codon and the two subsequent methionine residues (Met89-Met144) with CRISPR–Cas9 guide RNAs while keeping mtLig3 in frame and functional. Lig4 is a single-exon gene, and two pairs of guide RNAs were used for two rounds of targeting to obtain homozygous deletion of the entire 2.7-kb exon. Guide RNA sequences are listed in Supplementary Table 1. Knockout cell lines were confirmed by immunoblotting.

eccDNA purification and visualization on agarose gels

To purify eccDNAs, cells were first dehydrated in >90% methanol before crude extrachromosomal DNA was extracted in an alkaline lysis buffer at pH 11.8. After neutralization and precipitation, crude extrachromosomal DNA was bound to a silica column (QIAGEN Plasmid Plus Midi Kit) in binding buffer (buffer BB from the QIAGEN Plasmid Plus Midi Kit). Bound DNA was eluted, digested with PacI (NEB) and Plasmid-Safe ATP-dependent DNase (Lucigen) for 4–16 h, then extracted with phenol:chloroform:isoamyl alcohol (PCI) solution (25:24:1) in a Phase Lock Gel tube (QuantaBio) to minimize DNA loss. After precipitation with carrier glycogen (Roche) and 1/10 volume of 3 M sodium acetate (pH 5.5), the precipitated crude eccDNAs were resuspended in solution A (One-Step Max Plasmid DNAout, TIANDZ). eccDNAs were selectively bound to magnetic silica beads in this solution and were then eluted with 0.1× elution buffer (1 mM Tris-HCl, pH 8.0), and the concentration was measured by Qubit dsDNA HS Assay kit (Thermo Fisher). A detailed eccDNA purification protocol is in preparation for publication.

For comparisons of eccDNA production among treatments or genotypes, both total DNA (Quick-DNA microPrep Plus Kit, Zymo) and eccDNA were purified from equal numbers of cells, eluted and loaded onto an agarose gel in equal volumes. All DNA, except that used for PCR genotyping, was resolved by vertical agarose gel electrophoresis and visualized by staining with SYBR Gold (Fisher Scientific; 1:10,000).

Synthetic small DNA circle preparation

Synthetic small DNA circles were prepared by the procedure of ligase-assisted minicircle accumulation (LAMA)44. Random DNA sequences were generated (https://faculty.ucr.edu/~mmaduro/random.htm) with 50% G+C content. The isomers of single-strand templates as well as their amplification primer sets were synthesized by IDT, and their sequences are listed in Supplementary Table 3. Products with a 5′-end phosphate were prepared with 2× Q5 DNA polymerase (NEB). Equal amounts of isomers were added to generate 100-μl HiFi Taq DNA ligase reaction mixtures and placed in thermocyclers using the following cycles: 95 °C for 3 min, 60 °C for 10 min and 37 °C for 5 min for at least 10 cycles. Circularized products were recovered with a PCR Purification Kit (Qiagen) and digested with Plasmid-Safe ATP-dependent DNase (Lucigen) before being recovered with a PCR Purification Kit.

SAFM imaging

SAFM imaging of DNA was performed in dry mode45. Briefly, 1/10 volume of 10× imaging buffer (100 mM NiCl2 and 100 mM Tris-HCl, pH 8.0) was added to the sample to reach a final DNA concentration of 0.6–1.0 ng μl–1, and 2–5 μl of the mixture was then spread on a freshly cleaved mica (Ted Pella) surface. After 2 min of incubation, the specimen was rinsed twice with 30 μl of 2 mM magnesium acetate, drying before and after the rinses with compressed air. Images were acquired by using tip C of an SNL-10 probe on a Veeco MultiMode atomic-force microscope with a Nanoscope V Controller in ‘ScanAsyst in Air mode’ and processed with Gwyddion 2.50.

Library preparation and eccDNA sequencing

The Nanopore sequencing library for eccDNA was prepared with the Ligation Sequencing Kit (Oxford Nanopore) according to the manufacturer’s instructions after RCA and debranching. RCA was performed with phi29 DNA polymerase (NEB) with some modifications to ensure efficient amplification from 100 pg of template per reaction. Briefly, each 20-μl reaction mixture contained 2 μl of 10× phi29 DNA polymerase buffer (NEB), 2 μl of 25 mM dNTPs, 1 μl Exo-Resistant Random Primer (Thermo Fisher) and ≥100 pg of eccDNA, with ultra-pure water added to a maximum volume of 17.6 μl. Reactions were mixed and incubated at 95 °C for 5 min before ramping the temperature down to 30 °C at a 1% ramp rate. Then, 1 μl of phi29 DNA polymerase, 0.6 μl of inorganic pyrophosphatase (yeast, NEB), 0.4 μl of 0.1 M DTT (NEB) and 0.4 μl of 20 mg ml–1 BSA (NEB) were added. The reaction mixture was incubated at 30 °C for 10–16 h. Because highly branched structures in RCA products can block nanopores and abolish sequencing, RCA products for eccDNAs were further debranched with T7 endonuclease I (NEB) before being used for sequencing library construction with the Ligation Sequencing Kit (Oxford Nanopore, SQK-LSK109). The library was sequenced in a flow cell (R9.4.1, FLO-MIN106D) on a MinION instrument according to the manufacturer’s instructions.
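
The volume bookkeeping for the RCA reaction described above can be checked with a short sketch. The component volumes are taken from the text; the function names and the idea of supplying the template volume as an input are illustrative assumptions, not part of the published protocol.

```python
# Volumes in microlitres, as stated in the text for one 20-ul RCA reaction.
PRE_MIX_TOTAL_UL = 17.6  # buffer + dNTPs + primer + eccDNA + water

PRE_MIX_FIXED_UL = {
    "10x phi29 DNA polymerase buffer": 2.0,
    "25 mM dNTPs": 2.0,
    "Exo-Resistant Random Primer": 1.0,
}

# Components added after denaturation and cooling to 30 degrees C.
POST_ANNEALING_UL = {
    "phi29 DNA polymerase": 1.0,
    "inorganic pyrophosphatase": 0.6,
    "0.1 M DTT": 0.4,
    "20 mg/ml BSA": 0.4,
}


def water_volume_ul(template_ul: float) -> float:
    """Water needed to bring the pre-mix to 17.6 ul for a given template volume.

    template_ul is a hypothetical input: the volume that holds >=100 pg eccDNA.
    """
    fixed = sum(PRE_MIX_FIXED_UL.values())
    water = PRE_MIX_TOTAL_UL - fixed - template_ul
    if water < 0:
        raise ValueError("template volume exceeds the space left in the pre-mix")
    return round(water, 1)


def final_reaction_volume_ul() -> float:
    """Total volume once the enzyme mix has been added."""
    return round(PRE_MIX_TOTAL_UL + sum(POST_ANNEALING_UL.values()), 1)
```

With these numbers the pre-mix and the later additions sum to the stated 20 μl, and a 2-μl template would leave 10.6 μl for water.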

Illumina sequencing libraries for eccDNA were prepared by Tn5-transposon-based tagmentation with the Nextera XT DNA Sample Preparation Kit according to the manufacturer’s instructions. Briefly, after validating the purity of the eccDNAs with SAFM imaging, 0.5 ng of pure eccDNAs was directly tagmented with Tn5 transposase, followed by 12–14 cycles of PCR amplification with Illumina sequencing adaptors. Barcoded libraries were pooled and sequenced with the Illumina 2500 platform in 150-bp paired-end mode.

eccDNA sequencing data analyses

Nanopore base calling and reads mapping

The fast5 files generated by Nanopore MinION were fed to Guppy (version 3.5.2) for base calling. The parameters used for Guppy were as follows: --flowcell FLO-MIN106 --kit SQK-LSK109 --qscore_filtering --calib_detect --trim_barcodes --trim_strategy dna --disable_pings --device auto --num_callers 16. The generated reads in fastq format were further processed by porechop (version 0.2.4) to remove adaptor sequences for each read with the following parameters: --extra_end_trim 0 --discard_middle. To reduce artefacts due to misalignment during read mapping, we compiled a customized reference mouse genome (mm10combine) based on mm10 reference sequences. Briefly, we downloaded all the nucleotide sequences from the NCBI NT database (30 October 2018), and R/Bioconductor genbankr (version 1.10.0) was then used to distinguish mouse contigs from the contigs of other species. On the basis of each contig’s description and manual inspection, we removed all gene-related contigs, retaining only 15,984. The selected fasta sequences were extracted using the command ‘blastdbcmd -db nt_db -entry_batch selected_ids.txt -out selected_ids.fa -outfmt %f’. The fasta sequences were mapped to the mm10 genome. Finally, we selected contigs that could not be mapped, contigs for which less than 50% of the sequence mapped uniquely and contigs that were uniquely mapped but with a sequencing quality score of <10. In total, 103 contig sequences were added to mm10 to build the mm10combine reference genome. Cleaned reads were then aligned to mm10combine using minimap2 (version 2.17)46 with the following parameters: -x map-ont -c --secondary=no -t 16. The alignments for each read were stored in PAF format.
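
As a minimal sketch of how the PAF alignments might be read for downstream processing, the snippet below parses the 12 mandatory PAF columns and groups the alignments of each RCA long read in query order, so that successive subreads appear in the order in which they occur along the read. The class and function names, and the example records, are hypothetical and are not taken from the authors' tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class PafHit(NamedTuple):
    read: str      # query (read) name
    read_len: int  # query length
    q_start: int   # alignment start on the read (0-based)
    q_end: int     # alignment end on the read
    strand: str    # '+' or '-'
    chrom: str     # target sequence name
    t_start: int   # alignment start on the target (0-based)
    t_end: int     # alignment end on the target
    mapq: int      # mapping quality


def parse_paf_line(line: str) -> PafHit:
    """Parse the mandatory PAF columns; optional tags are ignored."""
    f = line.rstrip("\n").split("\t")
    return PafHit(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                  f[5], int(f[7]), int(f[8]), int(f[11]))


def subreads_by_read(lines: Iterable[str]) -> Dict[str, List[PafHit]]:
    """Group alignments per read and sort them by position along the read."""
    hits: Dict[str, List[PafHit]] = defaultdict(list)
    for line in lines:
        if line.strip():
            hit = parse_paf_line(line)
            hits[hit.read].append(hit)
    return {read: sorted(hs, key=lambda h: h.q_start) for read, hs in hits.items()}


# Made-up records for illustration only: one read covering the same locus three times.
example_lines = [
    "\t".join(["read1", "1500", "1010", "1500", "+", "chr1", "195471971",
               "1000", "1490", "480", "490", "60"]),
    "\t".join(["read1", "1500", "10", "500", "+", "chr1", "195471971",
               "1000", "1490", "480", "490", "60"]),
    "\t".join(["read1", "1500", "510", "1000", "+", "chr1", "195471971",
               "1000", "1490", "480", "490", "60"]),
]
grouped = subreads_by_read(example_lines)
```

Sorting by the query start is what exposes the repeated, tandem structure expected of an RCA product derived from a single circle.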

Consensus eccDNA generation

To obtain the consensus boundary and sequence of each eccDNA from the mapped RCA long reads, we developed a tool (https://github.com/YiZhang-lab/eccDNA_RCA_nanopore) that uses the alignments in PAF files as input and outputs eccDNA fragment composition (chromosome and genomic start and end positions of each fragment), successive fragment coverage (number of passes) and the consensus sequence derived from each RCA long read. The subreads of each RCA long read could be mapped to one genomic location or multiple locations. Subreads with mapping quality lower than 30 were discarded. The tool performed bootstrapping of successive subreads in each RCA long read to check whether the order of the mapped genomic locations for each subread was concordant with the order in the RCA long read. Because of the inaccuracy and gap-prone nature of Nanopore reads, we allowed a maximum of 20 bp offset of the mapped genomic positions (start and end positions) for two subreads to be considered as mapping to the same location. Reads with discordant subread order, location or strand were discarded. The exact boundaries of eccDNA fragments were determined by voting from the subreads’ start and end positions. Boundary positions were further refined by threading the subreads to ensure no gaps or overlaps between any successive subreads. The number of passes for each eccDNA fragment was calculated as the number of concordant subreads mapped to that location. Only eccDNAs with at least two passes were kept for downstream analysis. Each eccDNA sequence was derived from the reference genome sequence to which it mapped, with sequence variants incorporated. Sequence variants were called from subreads mapped to the corresponding location, requiring a minimum depth of 4 and a minimum allele frequency of 0.75.
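
The filtering and voting logic described above can be illustrated with a deliberately simplified sketch. It handles only the case of an eccDNA made of a single genomic fragment, assumes that partial subreads at the two ends of the long read have already been removed, and omits the gap and overlap refinement and variant calling. The thresholds come from the text; the function name and data layout are assumptions, and this is not the authors' implementation.

```python
from collections import Counter
from typing import List, Optional, Tuple

MIN_MAPQ = 30        # subreads below this mapping quality are discarded
MAX_OFFSET_BP = 20   # tolerated offset for two subreads to share a location
MIN_PASSES = 2       # minimum number of concordant subreads

# (chrom, start, end, strand, mapq) for each subread of one RCA long read
Subread = Tuple[str, int, int, str, int]


def consensus_fragment(subreads: List[Subread]) -> Optional[Tuple[str, int, int, int]]:
    """Return (chrom, start, end, passes), or None if the read is rejected."""
    kept = [s for s in subreads if s[4] >= MIN_MAPQ]
    if len(kept) < MIN_PASSES:
        return None

    chrom0, start0, end0, strand0, _ = kept[0]
    for chrom, start, end, strand, _ in kept:
        # Discordant chromosome or strand rejects the whole read.
        if chrom != chrom0 or strand != strand0:
            return None
        # Start and end must both lie within the allowed offset.
        if abs(start - start0) > MAX_OFFSET_BP or abs(end - end0) > MAX_OFFSET_BP:
            return None

    # Majority vote on each boundary across the concordant subreads.
    start = Counter(s[1] for s in kept).most_common(1)[0][0]
    end = Counter(s[2] for s in kept).most_common(1)[0][0]
    return (chrom0, start, end, len(kept))
```

For example, three high-quality subreads at chr1:1000-1500, chr1:1002-1500 and chr1:1000-1498, together with a fourth subread of mapping quality 10, would yield the fragment chr1:1000-1500 with three passes, whereas a read whose subreads map to different chromosomes would be discarded.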

Genomic distribution of eccDNAs

The eccDNA fragments were piled up across the genome. To remove PCR duplicates, eccDNA fragments on the same chromosome with the same start and end positions were treated as duplicates and only one was retained. The coverage of unique eccDNA fragments at each base of the genome was obtained using bedtools (version 2.29.2)47 and stored in bigwig files. The distribution of eccDNA fragments across each chromosome was plotted using karyoploteR (version 1.14.1)48 with the bigwig file as input.
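
The de-duplication and pile-up steps can be sketched in a few lines. The study used bedtools for the coverage calculation; the pure-Python version below is only an illustration of the same idea on a toy chromosome, and it assumes BED-style coordinates (0-based start, end exclusive). The function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Fragment = Tuple[str, int, int]  # (chrom, start, end), end exclusive


def unique_fragments(fragments: Iterable[Fragment]) -> List[Fragment]:
    """Treat fragments with identical chromosome, start and end as duplicates."""
    return sorted(set(fragments))


def per_base_coverage(fragments: Iterable[Fragment], chrom: str, chrom_len: int) -> List[int]:
    """Coverage of unique fragments at each base of one chromosome."""
    diff = [0] * (chrom_len + 1)
    for c, start, end in unique_fragments(fragments):
        if c == chrom:
            diff[start] += 1  # coverage rises at the fragment start
            diff[end] -= 1    # and falls just after its last base
    coverage = []
    running = 0
    for delta in diff[:chrom_len]:
        running += delta
        coverage.append(running)
    return coverage
```

The difference-array approach keeps the cost linear in the chromosome length plus the number of fragments, which is the same principle that makes genome-wide pile-ups tractable.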

Mapping of Illumina sequencing reads

Raw Illumina sequence reads were first processed by Trimmomatic (version 0.39)49 to remove sequencing adaptors and low-quality reads, using the following parameters: ILLUMINACLIP:adapters/NexteraPE-PE.fa:2:30:10:1:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:75 TOPHRED33. The reads were then mapped to our customized mm10combine reference genome using BWA50 MEM (version 0.7.17) with default parameters. Duplicated reads were removed by Picard (version 2.23.4). Reads with a mapping quality score of at least 60 were considered as uniquely mapped and used for downstream analysis. The genomic coverage was calculated using bamCoverage from deeptools (version 3.5.0)51 with binSize 1.

Western blot analysis

Equal numbers of cells were lysed in NuPAGE LDS sample buffer (Thermo Fisher), and protein extracts were resolved by SDS–PAGE and transferred to PVDF membrane. Antibodies against Lig1 (Proteintech; 1:1,000), Lig3 (BD Biosciences; 1:1,000), Lig4 (a gift from D. Schatz (Yale University); 1:1,000), Myd88 (ProSci; 1:1,000), Sting (Proteintech; 1:1,000) and Gapdh (Thermo Fisher; 1:20,000) were used. Uncropped and unprocessed scans of blots are provided in Supplementary Fig. 1.

BMDC and BMDM preparation and stimulation

Male mice, including WT C57BL/6, Sting1–/– (Tmem173gt, stock no. 017537) and Myd88–/– (Myd88tm1.1Defr, stock no. 009088) mice, were purchased from Jackson Labs and were housed on a 12-h light/dark cycle at 23 °C with 45–55% humidity. After at least 7 days of habituation, mice between 8 and 12 weeks old were used to collect bone marrow cells for BMDC and BMDM differentiation. BMDCs were differentiated in RPMI 1640 supplemented with 10% heat-inactivated FBS (Sigma), 10 mM HEPES, 1 mM sodium pyruvate, 100 U ml–1 penicillin-streptomycin, 2 mM GlutaMAX and 20 ng ml–1 mouse granulocyte and macrophage colony-stimulating factor (GM-CSF; Peprotech). BMDMs were differentiated in DMEM supplemented with 10% heat-inactivated FBS, 100 U ml–1 penicillin-streptomycin and 20% L929 conditioned medium. Half of the medium was replaced at day 3 and day 6. BMDCs and BMDMs were confirmed to be CD11c+MHC II+ and F4/80+CD11b+, respectively, after excluding debris with FSC-A and SSC-A gates and subsequent FSC-A and FSC-H gates on singlets. Data were analysed with FlowJo v.10.8.0. For cell stimulation, cells at days 7–9 were seeded in 96-well plates at 3.5 × 10⁴ cells per well. DNA was transfected into cells with FuGENE HD (Promega) in Opti-MEM (Gibco) according to the manufacturer’s instructions after measuring its concentration with a Qubit dsDNA HS Assay Kit (Thermo Fisher). All transfections were performed for 12 h except as indicated; media were then collected for ELISA, and cells were lysed with TRIzol (Thermo Fisher) for RNA isolation.

Transfection efficiency assays

To determine the transfection efficiency for linear and circular DNA, a set of primers (for sequences, see Supplementary Table 2) with five phosphorothioate bonds at their 5′ end were used to prepare end-protected linear DNA by PCR. To balance the effects of the phosphorothioate bonds, the circular form was prepared with the same number of phosphorothioate bonds as the linear one. Then, DNA concentration was determined by Qubit dsDNA HS Assay Kit (Thermo Fisher), and DNA was transfected into BMDCs as described above with FuGENE HD (Promega). After transfection, cells were rinsed three times with PBS and lysed in 100 μl of lysis buffer (50 mM Tris-HCl pH 8.0, 1 mM EDTA, 0.5% Tween-20, 3 U ml–1 thermolabile proteinase K (NEB, P8111S)) and incubated at 37 °C for 2 h followed by incubation for 15 min at 55 °C to inactivate proteinase K. Four microlitres of cell lysate was used for qPCR with a set of primers targeting both linear and circular DNA to determine the amount of transfected DNA.

Incubation of BMDCs with supernatant from apoptotic cells

WT and Dnase1l3–/– cells in 10-cm dishes that were 80–100% confluent were washed three times with PBS and irradiated with 3 mJ of UV-C light in a Stratagene Stratalinker 2400. Then, 10 ml of Opti-MEM (Gibco) was added, and cells were cultured for another 48 h. Medium was centrifuged at 650g for 5 min, and the supernatant was filtered through a 0.45-μm filter. Four hundred microlitres of supernatant was left untreated or treated with enzyme (PacI, Plasmid-Safe ATP-dependent DNase, RNase A/T1 or benzonase, as indicated) in a 500-μl reaction volume at 37 °C for 2 h and then dialyzed (molecular weight cut-off (MWCO), 10 kDa) with fresh Opti-MEM at 4 °C overnight to deplete ATP, which is required for the activity of Plasmid-Safe ATP-dependent DNase. Then, 100 μl per well of dialyzed supernatant was added to BMDCs in 96-well plates, and an equal volume of fresh Opti-MEM was added in parallel to separate wells as a mock control; 12 h later, cells were collected for RT–qPCR analysis, and data are presented as relative mRNA levels with respect to those of mock controls after normalizing to Gapdh.

RNA isolation, RT–qPCR, RNA-seq and ELISA analyses

Cellular RNA was isolated with a Zymo Direct-zol RNA Miniprep kit. cDNA was synthesized with SuperScript III, and qPCR was performed with Fast SYBR Green Master Mix (Thermo Fisher). The primer sequences for qPCR of each gene are listed in Supplementary Table 2. Gene induction levels are presented as the relative fold change with respect to mock treatment after normalizing to Gapdh. Bulk RNA-seq libraries were prepared by following the manufacturer’s instructions for the NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, E7420S). For ELISA analysis, ELISA kits for IFNβ, IL-6 and TNF were obtained from BioLegend, and IFNα ELISA kits were obtained from PBL Assay Science. Assays were performed according to the manufacturers’ instructions. Appropriate volumes of culture medium were used to ensure that the readouts were within the range of the standard curve.
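
The text states that gene induction is expressed relative to mock treatment after normalization to Gapdh but does not name the formula. One common way to compute such a value is the 2^(−ΔΔCt) method, sketched below under the assumption of roughly 100% amplification efficiency for both primer pairs; whether this exact calculation was used here is an assumption, and the function name is illustrative.

```python
def relative_fold_change(ct_target_treated: float, ct_gapdh_treated: float,
                         ct_target_mock: float, ct_gapdh_mock: float) -> float:
    """Fold change of a target gene versus mock, normalized to Gapdh.

    Assumes the 2^(-ddCt) method, i.e. a doubling of product per cycle.
    """
    d_ct_treated = ct_target_treated - ct_gapdh_treated  # normalize treated sample
    d_ct_mock = ct_target_mock - ct_gapdh_mock           # normalize mock sample
    dd_ct = d_ct_treated - d_ct_mock                     # compare treated with mock
    return 2.0 ** (-dd_ct)
```

For instance, with hypothetical Ct values of 22 (target) and 18 (Gapdh) in treated cells and 27 and 18 in mock-treated cells, ΔΔCt is −5 and the relative induction is 32-fold.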

RNA-seq data analysis

For RNA-seq data, adaptors and low-quality reads were trimmed using Trimmomatic (version 0.39)49 with the following parameters: ILLUMINACLIP:adapters/TruSeq3-PE.fa:2:30:10:1:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50 TOPHRED33. Cleaned paired-end reads were aligned to the mm10 reference genome with GENCODE52 mouse gene set M24, using STAR (version 2.7.6a)53 with the following parameters: --outSAMunmapped Within --outFilterType BySJout --outSAMattributes NH HI AS NM MD --outFilterMultimapNmax 20 --outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --sjdbScore 1 --outSAMtype BAM SortedByCoordinate --quantMode TranscriptomeSAM. RSEM (version 1.3.3)54 was used to quantify gene expression levels using the reads aligned to the transcriptome in the bam file as input, with the following parameters: --alignments --estimate-rspd --calc-ci --no-bam-output --seed 12345 --ci-memory 30000 --paired-end --strandedness reverse. Differentially expressed genes were identified using the DESeq2 package42.

eccDNA linearization

eccDNA linearization was performed by sequential treatment of eccDNAs with the nickase fnCpf1 (Applied Biological Materials)37 and single-strand DNA-specific nuclease. Fifty nanograms of eccDNAs was nicked in a 20-μl reaction that contained 1/8 volume of 8× fnCpf1 linearization buffer (160 mM HEPES pH 7.5, 1.2 M KCl, 4 mM DTT, 0.8 mM EDTA, 80 mM MnCl2) and 1 μl fnCpf1. After incubating at 37 °C for 1 h, the treated eccDNAs were extracted with PCI solution (25:24:1) in a Phase Lock Gel tube (QuantaBio) and precipitated at −80 °C with carrier glycogen (Roche) and 1/10 volume of 3 M sodium acetate (pH 5.5). Nicked eccDNAs were linearized in 10-μl reactions that contained 2 μl of 5× buffer (0.25 M sodium acetate pH 5.2, 1.4 M NaCl, 25 mM ZnSO4) and 1 μl S1 nuclease (Thermo Fisher) and were incubated at 37 °C for 5 min. Reactions were stopped by adding 40 μl of 10 mM Tris-HCl (pH 8.0), and linear eccDNAs were immediately recovered with 75 μl of SPRIselect beads (Beckman Coulter). Successful linearization of eccDNAs was confirmed by efficient digestion with Plasmid-Safe ATP-dependent DNase (Lucigen).

Statistics

Ordinary one-way ANOVA and two-tailed unpaired t-tests were performed with GraphPad Prism 9.

Ethics statement

All procedures on animals involved in this study were conducted according to protocols approved by the Harvard Medical School IACUC.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.