
The transcriptional regulator Aire binds to and activates super-enhancers

Nature Immunology volume 18, pages 263–273 (2017)


Aire is a transcription factor that controls T cell tolerance by inducing the expression of a large repertoire of genes specifically in thymic stromal cells. It interacts with scores of protein partners of diverse functional classes. We found that Aire and some of its partners, notably those implicated in the DNA-damage response, preferentially localized to and activated long chromatin stretches that were overloaded with transcriptional regulators, known as super-enhancers. We also identified topoisomerase 1 as a cardinal Aire partner that colocalized on super-enhancers and was required for the interaction of Aire with all of its other associates. We propose a model that entails looping of super-enhancers to efficiently deliver Aire-containing complexes to local and distal transcriptional start sites.


Medullary thymic epithelial cells (mTECs) are involved in both negative selection of effector T cells and positive selection of regulatory T cells1. A unique feature of mTECs, which is critical for their roles in tolerance induction, is expression of a large fraction of the genome, particularly scores of loci encoding antigens characteristic of fully differentiated parenchymal cells (peripheral-tissue antigens, PTAs)2,3,4. Much of this transcription is driven by Aire5. mTECs from Aire-deficient mice and humans show severely compromised PTA expression, causing these individuals to develop autoimmune infiltrates and autoantibodies targeting multiple peripheral tissues6,7.

Several observations argue that Aire is a transcriptional regulator that operates differently from conventional transcription factors6,7. First, unlike traditional factors, the transcriptional effect of Aire on mTECs involves a large, although still select, portion of the genome2,3,4. Experimental introduction of Aire into extra-thymic cells induces expression of large sets of transcripts, which differ from cell type to cell type and also diverge from those induced in mTECs8. Second, Aire-induced gene expression has a strong element of stochasticity, with individual mTECs transcribing only a small subset of the total repertoire of induced PTA transcripts3,4,9,10. The subsets of transcripts induced in individual cells exhibit both intra- and inter-chromosomal clustering3,4. Third, Aire appears to not bind to a particular promoter or enhancer motif, exhibiting only a low, non-discriminatory affinity for DNA11. Instead, it seems to recognize generic features of transcriptionally quiescent sites, such as chromatin marks typical of silenced loci, for example, unmethylated lysine 4 of histone 3 (H3K4me0)11,12, and promoters with stalled RNA polymerase II (RNA-PolII)13,14.

Screening approaches have uncovered a large cast of structural and functional Aire partners, which fall into multiple functional classes, notably nuclear transport, chromatin structure and/or binding, transcription (including the DNA-damage response), and pre-mRNA processing15,16. However, we remain ignorant of the genomic location, architecture, biogenesis and function of the resulting Aire-containing complexes. We addressed these issues by exploiting recent advances in genome-wide chromatin mapping techniques, which now permit analysis of the small mTEC numbers available ex vivo17, and by applying diverse biochemical approaches. We found that Aire was located on and activated super-enhancers, defined as chromatin regions that host exceptionally high concentrations of transcriptional regulators. The topoisomerase TOP1 emerged as a cardinal Aire partner that colocalized on super-enhancers and was required for Aire interaction with all of its other partners.


Aire is located on mTEC super-enhancers

Our first goal was to map the genome-wide distribution of Aire ex vivo in mTECs, particularly its relationship to diagnostic histone marks, using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). Aire bound to 42,124 sites scattered throughout the genome of Ly51loMHCIIhi mTECs (called mTEChi hereafter) from 4–6-week-old female C57BL/6(B6) Aire+/+ mice, in comparison with the <12,000 sites detected in Aire-transfected cell lines14,18. The Aire signal was robust and reproducible, with >75% of the binding sites being common to the two biological replicates routinely examined (Supplementary Fig. 1 and Supplementary Table 1). Genome-wide, Aire bound primarily to intergenic, transcriptional start site (TSS) and intronic regions (Fig. 1a). Aire was situated along both Aire-induced and Aire-neutral genes but there was a substantially higher ratio of intergenic/TSS localization in the former compared with the latter set of genes (7.3/1 versus 1.2/1). The density of Aire around TSSs was demonstrably lower for Aire-induced than Aire-neutral genes, whereas its representation on intergenic and intronic regions did not vary much with the gene-induction status (Fig. 1b).
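The reproducibility figure quoted above (>75% of binding sites common to the two replicates) corresponds to an interval-overlap calculation. The text does not state how shared sites were scored, so the following is only a minimal sketch under assumed conventions: peaks are half-open `(start, end)` pairs on a single chromosome, and a site counts as shared if it overlaps any peak in the other replicate by at least one base. The function names are hypothetical.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_shared(peaks_a, peaks_b):
    """Fraction of peaks in replicate A that overlap at least one peak in replicate B.

    Assumption: both lists hold half-open (start, end) intervals on one chromosome.
    """
    if not peaks_a:
        return 0.0
    merged_b = merge_intervals(peaks_b)
    starts = [s for s, _ in merged_b]
    ends = [e for _, e in merged_b]
    shared = 0
    for start, end in peaks_a:
        # Index of the first merged B interval that ends after this peak starts.
        i = bisect_right(ends, start)
        if i < len(merged_b) and starts[i] < end:
            shared += 1
    return shared / len(peaks_a)


# Toy coordinates, invented for illustration only.
rep1 = [(100, 200), (300, 400), (1000, 1100), (5000, 5100)]
rep2 = [(150, 250), (390, 450), (1050, 1060), (7000, 7100)]
print(fraction_shared(rep1, rep2))
```

A genome-wide version would apply the same function per chromosome and pool the counts.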

Figure 1: Concurrence of Aire and super-enhancers in mTECs.

(a) Distribution of 42,124 Aire-binding sites, derived from merged Aire ChIP-seq data from two independent experiments, along the entire genome in B6.Aire+/+ mTEChi (All), 3,588 sites annotated to Aire-induced genes and 6,238 sites annotated to expression-matched Aire-neutral genes. Aire-induction status was defined as Aire+/+/Aire−/− > 2 for Aire-induced and as Aire+/+/Aire−/− > 0.9 and < 1.1 for Aire-neutral genes. TTS, transcription termination site; UTR, untranslated region. Genomic architecture came from RefSeq. (b) Aire tag density 3 kb up- or downstream of Aire binding sites annotated to the TSS, intronic or intergenic regions of Aire-induced or Aire-neutral genes in B6.Aire+/+ mTEChi (P = 10⁻³² for TSS, Wilcoxon rank-sum test). (c) Delineation of super-enhancers, based on H3K27ac overloading, in B6.Aire+/+ mTEChi using the ROSE algorithm21. SE, super-enhancer; CE, conventional enhancer. (d) Heat maps of tag density for the indicated proteins 500 kb up- or downstream of H3K27ac-delimited super-enhancers in B6.Aire+/+ mTEChi. (e) Normalized ChIP-seq profiles for the indicated proteins at exemplar super-enhancers in B6.Aire+/+ mTEChi. Numbers to the right indicate the ranges of normalized tag densities. The genes indicated are Aire neutral. (f) Tag density for H3K27ac (top) and Aire (bottom) at individual H3K27ac peaks extracted from super-enhancers versus conventional enhancers, as defined in c. (g) Tag density for H3K27ac in B6.Aire+/+ versus Aire−/− mTEChi (top) and Aire-positive mTEChi versus Aire-negative mTEChi from Adig mice (bottom) (P = 4.1 × 10⁻³⁷, top; P = 2.7 × 10⁻¹⁰, bottom; Wilcoxon rank-sum test). (h) Aire-induced transcripts (Aire+/+ versus Aire−/− mTEChi) 500 kb up- or downstream of H3K27ac-delimited super-enhancers (top), compared with random, size-matched chromatin stretches (bottom). Data are representative of two independent experiments. Data for one (c–f) or both (g, bottom) H3K27ac ChIP-seq experiments came from ref. 22, whereas RNA-seq data in h came from ref. 3.
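The induction thresholds given in the legend (expression ratio > 2 for Aire-induced genes; between 0.9 and 1.1 for Aire-neutral genes) can be written as a small classifier. This is an illustrative sketch only: the function name, the "other" and "undefined" labels, and the handling of zero expression are assumptions, not part of the published analysis.

```python
def classify_aire_status(expr_wt, expr_ko):
    """Label a gene from its Aire+/+ (expr_wt) and Aire-/- (expr_ko) expression values.

    Thresholds follow the Figure 1 legend; the extra labels are assumptions.
    """
    if expr_ko <= 0:
        return "undefined"  # ratio cannot be computed
    ratio = expr_wt / expr_ko
    if ratio > 2:
        return "induced"
    if 0.9 < ratio < 1.1:
        return "neutral"
    return "other"


# Invented expression values, for illustration only.
for wt, ko in [(10.0, 2.0), (10.0, 10.0), (10.0, 7.0)]:
    print(wt, ko, classify_aire_status(wt, ko))
```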

Super-enhancers are chromatin elements that serve as extended and overloaded depots for a multiplicity of general and cell-type-specific transcription factors19,20. Because they are often associated with, and regulate, genes diagnostic of fully differentiated cell types (that is, PTA genes), we investigated whether such elements might preferentially harbor Aire in mTECs. Following standard practice21, we defined mTEC super-enhancers as stretches >3.9 kb of histone 3 acetylated at lysine 27 (H3K27ac), a chromatin mark indicative of active enhancers. By this definition, mTEChi had 1,170 super-enhancers, of 30 kb average size, scattered over the chromosomes (Fig. 1c and Supplementary Fig. 2a). H3K4me1, another active chromatin mark, was selectively included in these super-enhancers, whereas H3K27me3, indicative of inactive chromatin, was excluded (Fig. 1d,e and Supplementary Fig. 2b), supporting the validity of our mTEChi super-enhancer designations. Aire was also preferentially bound to these super-enhancers, along with its structural and functional associate RNA-PolII (Fig. 1d,e and Supplementary Fig. 2b). To determine whether the super-enhancers that we had delineated were merely a conglomerate of typical enhancers, we bioinformatically sorted all of the super-enhancers and conventional enhancers from the 'hockey plot' derived from the ROSE algorithm21 (Fig. 1c), focused on individual H3K27ac peaks in the two sets of enhancers, and compared their Aire densities according to the ChIP-seq data. Super-enhancers hosted higher densities of Aire than did conventional enhancers (Fig. 1f).
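The 'hockey plot' separation can be illustrated with a simplified sketch. It assumes the commonly described ROSE criterion: enhancers are ranked by signal, both axes are rescaled to [0, 1], and the cutoff is placed where a line of slope 1 is tangent to the curve. The code below does not stitch neighboring peaks or subtract input signal, and it is not the published ROSE implementation; function names and signal values are hypothetical.

```python
def hockey_stick_cutoff(signals):
    """Return the signal value at the slope-1 tangent point of the ranked-signal curve.

    For an increasing convex curve, the tangent point is where (y - x) is smallest
    after rescaling rank (x) and signal (y) to the [0, 1] range.
    """
    ranked = sorted(signals)
    if len(ranked) < 2 or ranked[0] == ranked[-1]:
        return None  # no meaningful curve to split
    lo, hi = ranked[0], ranked[-1]
    n = len(ranked)
    gaps = [(s - lo) / (hi - lo) - i / (n - 1) for i, s in enumerate(ranked)]
    return ranked[gaps.index(min(gaps))]


def split_enhancers(signals):
    """Split signals into (super_enhancers, conventional_enhancers) at the cutoff."""
    cutoff = hockey_stick_cutoff(signals)
    if cutoff is None:
        return [], list(signals)
    supers = [s for s in signals if s > cutoff]
    conventional = [s for s in signals if s <= cutoff]
    return supers, conventional


# Invented H3K27ac signal values, for illustration only.
h3k27ac = [1, 2, 3, 4, 5, 6, 7, 8, 50, 100]
print(split_enhancers(h3k27ac))
```

Enhancers above the cutoff form the steep 'blade' of the plot; everything at or below it is treated as conventional.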

To address the functional importance of Aire localization to super-enhancers, we compared ChIP-seq data obtained in parallel on mTEChi isolated from B6.Aire+/+ and B6.Aire−/− littermates. The density of H3K27ac marks on super-enhancers was significantly lower in Aire−/− mTEChi (Fig. 1g). In addition, a reanalysis of published ChIP-seq data sets22 revealed a lower density of H3K27ac marks on super-enhancers from immature Aire− mTECs compared with mature Aire+ mTECs from wild-type mice (Fig. 1g).

To localize the transcriptional effect of Aire in relation to super-enhancers, we compared gene expression in regions stretching 500 kb 5′ or 3′ of super-enhancers in Aire+ and Aire− mTEChi. Aire induced transcription in genomic regions extending in both directions from, but not overlapping, super-enhancers (Fig. 1h), whereas similar Aire-dependent transcriptional changes did not occur up- or downstream of random, size-matched genome stretches (Fig. 1h).
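The comparison between super-enhancer flanks and random stretches amounts to counting Aire-induced genes in a 500-kb window on either side of each region while excluding the region itself. The sketch below assumes a single chromosome, half-open `(start, end)` regions and one TSS coordinate per induced gene; how the random, size-matched controls were drawn is not specified in the text, so the control here is simply a second hypothetical region.

```python
def count_in_flanks(regions, tss_positions, flank=500_000):
    """Count TSSs within `flank` bp of any region but not inside any region."""
    count = 0
    for pos in tss_positions:
        inside = any(start <= pos < end for start, end in regions)
        nearby = any(start - flank <= pos < end + flank for start, end in regions)
        if nearby and not inside:
            count += 1
    return count


# Invented coordinates, for illustration only.
super_enhancers = [(1_000_000, 1_030_000)]
random_control = [(5_000_000, 5_030_000)]  # size-matched, hypothetical location
induced_tss = [600_000, 1_010_000, 1_400_000, 2_000_000]

print(count_in_flanks(super_enhancers, induced_tss))
print(count_in_flanks(random_control, induced_tss))
```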

In addition, we performed ATAC-seq (assay of transposase-accessible chromatin followed by high-throughput sequencing23) to provide a genome-wide view of chromatin accessibility in ex vivo mTEChi. The chromatin stretches delineated above as mTEChi super-enhancers had elevated ATAC signal densities (compared with other chromatin regions) in Aire+ mTEChi, but not in control earskin fibroblasts (Fig. 2a), indicating that they were preferentially open in the former case. As we had seen for H3K27ac signals, ATAC signals were substantially higher for individual peaks in super-enhancers than in conventional enhancers (Fig. 2b). In addition, comparison of ATAC signals in mTEChi of Aire+/+ and Aire−/− mice revealed that Aire induced the accessibility of super-enhancers (Fig. 2c) and, to a lesser extent, that of conventional enhancers (data not shown). This effect is perhaps most evident from a plot of Aire ChIP-seq signal against the fold change in ATAC signals in Aire+ versus Aire− mTEChi. This ATAC signal differential was higher in super-enhancers than in random size-matched chromatin stretches (Fig. 2d), as indicated by the preferential concentration of ATAC signal on mTEChi super-enhancers in Aire+ mTEChi in comparison with Aire− mTEChi and control earskin fibroblasts (Fig. 2e). These observations indicate that Aire was preferentially associated with and activated super-enhancers, resulting in transcriptional induction upstream and downstream.

Figure 2: Aire controls chromatin accessibility of super-enhancers.

(a) Heat maps displaying tag density for H3K27ac ChIP-seq in B6.Aire+/+ mTEChi and ATAC-seq signal in Aire+/+ mTEChi versus earskin fibroblasts at mTEChi super-enhancers defined in Figure 1c. (b) Histogram for ATAC-seq tag density at individual H3K27ac peaks in super-enhancers versus conventional enhancers in Aire+/+ mTEChi. (c) Line plot displaying ATAC-seq tag density in Aire+/+ versus Aire−/− mTEChi at mTEChi super-enhancers (P = 6.6 × 10⁻⁵², Wilcoxon rank-sum test). (d) Scatter plot for Aire-induced (Aire+/+ versus Aire−/−) chromatin accessibility versus Aire ChIP-seq signal at super-enhancers and random, size-matched genomic stretches as in Figure 1h (bottom). (e) Normalized ATAC-seq and Aire ChIP-seq profile at exemplar super-enhancers in various cell populations. Numbers to the right indicate the ranges of normalized tag densities. The indicated genes are Aire neutral. Data are representative of two independent experiments. Data for one of the H3K27ac ChIP-seq experiments in a came from ref. 22.

Aire participates in multiple multi-protein complexes

Given that the architecture and dynamics of Aire-containing multi-protein complexes are ill defined, we next investigated how Aire-containing complexes assemble to promote transcription. Because the low numbers of mTEChi that can be isolated from mouse thymi preclude this type of study, we used HEK293T cells transfected with a construct encoding Aire with a FLAG tag at the amino terminus (FLAG-Aire). A plot of genome-wide H3K27ac densities from published ChIP-seq data24 revealed that Aire and RNA-PolII were preferentially located on super-enhancers in HEK293T cells (Supplementary Fig. 3).

Gel filtration chromatography provides information on protein complex number and composition. Because Aire is known to form molecular conglomerates of >670 kDa25, we applied nuclear extracts from FLAG-Aire-HEK293T cells to a Superose-6 column and immunoblotted eluted fractions with a FLAG antibody (Ab). Aire was broadly distributed, spanning fractions 9–14 (669–2,000 kDa; Fig. 3a). We then pooled fractions 9–11 and fractions 12–14, immunoprecipitated Aire-containing complexes with a FLAG Ab, and immunoblotted the precipitated material with antibodies recognizing a panel of Aire partners. Based on their co-elution profiles, Aire partners divided into three groups: those maximally eluted in fractions 9–11 (SFRS3, DDX5), in fractions 12–14 (DNA-PKcs, Ku80, PARP-1, DSIF, CDK9, BRD4) and in all of the Aire-containing fractions (TOP2A, RNA-PolII) (Fig. 3b). These findings indicate that Aire participates in at least two multi-protein complexes.

Figure 3: Association of Aire with multiple multi-protein complexes.

(a) Coomassie-stained SDS-PAGE gel and anti-FLAG (Aire) immunoblot for gel filtration fractions (8–19) of HEK293T cells transfected with FLAG-Aire-containing expression plasmid. (b) Immunoblotting for interaction of Aire with its various partners (left margin) in pooled gel filtration fractions 9–11 and 12–14 of HEK293T cells transfected with empty plasmid (pCMV) or FLAG-Aire-containing expression plasmid (pCMV-Aire). IP, immunoprecipitation. (c) Detection of interaction of various Aire partners (left margin) with DNA-PKcs (columns C and D) or with FLAG-Aire after DNA-PKcs depletion with anti-DNA-PKcs antibody (columns G and H) in FLAG-Aire-transfected HEK293T cells. (d) Scatter plot displaying the relative interaction of various Aire partners with DNA-PKcs (as percent input; that is, column D versus B of c) versus their percent interaction with Aire post-DNA-PKcs depletion compared with post-IgG (control) treatment (that is, column H versus G of c). (e) Heat map displaying the percent interaction of the various partners with Aire post shRNA-mediated partner knockdown (normalized on LacZ-shRNA (control) transduction). HEK293T cells were transduced with shRNA against indicated Aire partners (except BRD4, which was inhibited by I-BET151), followed by transfection with a FLAG-Aire-containing expression plasmid, immunoprecipitation with an antibody to FLAG (Aire) and immunoblotting for the indicated proteins. Representative primary immunoblot data for e can be found in Supplementary Figure 4. Data are representative of two (a,b,e) or three (c,d) independent experiments with similar results.

Next, we performed antibody pre-clearing experiments to reveal the extent to which designated Aire partners co-reside in complexes. DNA-PKcs is one of the Aire-interacting proteins that is most consistently detected15. Co-immunoprecipitation experiments revealed that DNA-PKcs associated with all of the Aire partners implicated in transcription (Ku80, PARP-1, BRD4, CDK9, TOP2A, DSIF, RNA-PolII), but none of those involved in pre-mRNA processing (DDX5, SFRS3) (Fig. 3c). Pre-clearing of nuclear extracts with a DNA-PKcs Ab removed the Aire-containing complexes that also hosted DNA-PKcs and, to a differential degree, Aire-containing complexes that hosted its other partners (Fig. 3c). Plotting for each Aire partner its propensity to associate with DNA-PKcs versus its degree of interaction with Aire after DNA-PKcs depletion (Fig. 3d) revealed three classes of Aire-partner-interacting proteins: those not co-residing with DNA-PKcs in Aire-containing complexes (DDX5, SFRS3), those largely (>80%) co-residing (Ku80, BRD4, DSIF, RNA-PolII), and those partially (40–60%) co-residing (PARP-1, TOP2A, CDK9). These results also suggest the existence of two to three distinct Aire-containing complexes.

To address how removal of particular protein partners influences the formation of Aire-containing complexes, we co-transfected FLAG-Aire-HEK293T cells with one of four pre-validated cognate shRNAs for individual Aire partners (expressed in the pLKO.1 vector) and assessed the ability of Aire to interact with its remaining partners in immunoprecipitation experiments. Because knockdown of BRD4 was not efficient using this approach, we used the small molecule inhibitor I-BET151 (ref. 26). We focused on the Aire partners implicated in transcriptional regulation, as the partners involved in transcription and pre-mRNA processing are known to behave independently in this assay15. Knockdown of DNA-PKcs or PARP-1 expression in HEK293T cells abolished the interaction of Aire with all of the Aire partners examined (Fig. 3e and Supplementary Fig. 4), indicating that these factors are essential for the assembly of the Aire-containing complexes. Inhibition of BRD4 and CDK9 inhibited the interactions between Aire and its associates CDK9, TOP2A and DSIF (Fig. 3e and Supplementary Fig. 4), which are known to be involved in transcriptional elongation27,28, but not those involved in the DNA-damage response29. Knockdown of TOP2A and DSIF compromised the ability of Aire to interact with only DSIF and RNA-PolII (Fig. 3e and Supplementary Fig. 4). Together, these observations indicate that Aire participated in at least two, and potentially three, multi-protein complexes.

TOP1 is a primary Aire partner

We previously proposed that TOP2A is an early Aire partner, suggesting a scenario in which Aire 'freezes' the enzymatic activity of TOP2A, thereby stabilizing double-stranded breaks (DSBs) and inciting the DNA-damage response via DNA-PK activation15. Because shRNA-mediated knockdown of TOP2A had a limited effect and inhibited the association of Aire only with DSIF and RNA-PolII (Fig. 3e and Supplementary Fig. 4), we investigated whether the ability of Aire to promote DSBs15 reflects an early interaction with other topoisomerases. First, we revisited mass-spectrometry (MS) data from several published or unpublished experiments aimed at identifying proteins that co-immunoprecipitate with Aire in HEK293T cells (ref. 15 and data not shown). TOP2A peptides were detected in most of these experiments, but peptides from TOP2B and TOP1 were also found (Table 1). All three enzymes were detected on immunoblots of proteins co-immunoprecipitated with Aire in FLAG-Aire-HEK293T cells (Fig. 4a). shRNA-mediated knockdown of TOP2B in these same cells significantly decreased Aire interactions with only DSIF and RNA-PolII (Fig. 4b and Supplementary Fig. 5a), indicating that TOP2B had a restricted effect on the association of Aire with its partners, similar to what was observed above after dampening TOP2A expression. In contrast, knockdown of TOP1 strongly inhibited the interaction of Aire with all partners tested, except DDX5 and SFRS3, which are involved in pre-mRNA processing (Fig. 4c and Supplementary Fig. 5b), a pattern reminiscent of that seen following dampening of DNA-PKcs or PARP-1 expression. Notably, TOP1 knockdown reduced the association of Aire with TOP2B and TOP2A (Fig. 4d), whereas TOP2B knockdown inhibited Aire interaction with only TOP2A, and TOP2A knockdown failed to affect the association of Aire with either TOP2B or TOP1 (Fig. 4d). 
The antibody pre-clearing assay showed that removal of TOP1-containing complexes strongly compromised the interaction of Aire with all of its partners that were implicated in transcription (Fig. 4e and Supplementary Fig. 5c), but did not affect associations with partners involved in pre-mRNA processing. Addition of the DNA intercalator ethidium bromide during the pulldown revealed that the Aire-containing complexes were not preformed, but rather required DNA binding30 (Supplementary Fig. 5d).

Table 1: Co-immunoprecipitation of topoisomerases with Aire
Figure 4: Primacy of TOP1 as an Aire partner.

(a) Immunoblots displaying interaction of Aire (FLAG) with TOP2A, TOP2B and TOP1 in FLAG-Aire-transfected HEK293T cells. IP, immunoprecipitation. (b–d) Immunoblots (d) and quantitative graphs (b–d) displaying the percentage interaction of Aire with various partners in FLAG-Aire-expressing HEK293T cells transduced with shRNA against TOP2B (b,d), TOP1 (c,d) or TOP2A (d) relative to HEK293T cells transduced with LacZ-shRNA (control) (***P < 0.001 versus shLacZ, unpaired Student's t test). Representative primary immunoblot data for b and c can be found in Supplementary Figure 5a,b. (e) Scatter plot displaying the relative interaction of indicated Aire partners with TOP1 (as percent input; Supplementary Fig. 5c, column D versus B) versus their percentage interaction with Aire after anti-TOP1-antibody-mediated TOP1 depletion (Supplementary Fig. 5c, column H versus G) from nuclear extracts of FLAG-Aire-expressing HEK293T cells relative to control rabbit-IgG-mediated depletion. Primary immunoblot data supporting e can be found in Supplementary Figure 5c. Data are representative of two independent experiments with similar results (standard box-and-whisker graphs in b–d are from n = 8 measurements pooled from two experiments). Shown are standard box-and-whisker graphs representing lowest and highest value with box representing first and third quartile.

To evaluate the scenario in which TOP1 induces the DSBs that initiate the formation of Aire-containing multi-protein complexes, whereas TOP2A and TOP2B are involved in downstream events, we performed ChIP-seq analysis on ex vivo mTEChi, comparing the distribution of TOP1 and TOP2A (for which reliable ChIP-seq antibodies were available) with those of H3K27ac (which defines super-enhancers), Aire and γH2AX (which delineates regions adjacent to DSBs29). TOP1 and γH2AX colocalized highly preferentially with Aire at super-enhancer regions (Fig. 5a,b and Supplementary Fig. 6), whereas the distribution of TOP2A was more dispersed, spreading beyond super-enhancers locally and far-distally (Fig. 5a,b and Supplementary Fig. 6). In addition, a comparison of the Aire-induced changes in super-enhancer-localized γH2AX ChIP-seq signals with Aire-induced changes in topoisomerase ChIP-seq signals revealed a strong correlation for TOP1, but not TOP2A (Fig. 5c). Thus, Aire coordinately affected the localizations of TOP1 and DSBs.
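The correlation described above relates two per-super-enhancer quantities: the Aire-induced change in γH2AX signal and the Aire-induced change in topoisomerase signal. A minimal sketch follows, assuming that each change is expressed as a log2 ratio of Aire+/+ to Aire−/− tag density with a pseudocount of 1; both the log transform and the pseudocount are assumptions made for illustration, as is every function name and number.

```python
from math import log2, sqrt


def log2_fold_changes(wt, ko, pseudocount=1.0):
    """Per-region log2(Aire+/+ / Aire-/-) signal ratio; the pseudocount is an assumption."""
    return [log2((w + pseudocount) / (k + pseudocount)) for w, k in zip(wt, ko)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented tag densities at four super-enhancers, for illustration only.
gh2ax_change = log2_fold_changes([30, 12, 50, 8], [10, 11, 20, 9])
top1_change = log2_fold_changes([40, 15, 70, 10], [12, 14, 25, 11])
print(pearson_r(gh2ax_change, top1_change))
```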

Figure 5: Colocalization of TOP1 and Aire at super-enhancers.

(a) Heat maps of binding of the indicated proteins up- and downstream of mTEChi super-enhancers delineated in Figure 1c. (b) Normalized ChIP-seq profiles for indicated proteins at exemplar super-enhancers in B6.Aire+/+ mTEChi. Numbers to the right indicate the ranges of normalized tag densities. The indicated genes are Aire neutral. (c) Scatter plot for correlation between Aire-induced (Aire+/+ versus Aire−/−) topoisomerase binding (left, TOP1; right, TOP2A) and Aire-induced DSBs (γH2AX binding) at mTEChi super-enhancers. Red line, lowess curve; R², Pearson's correlation coefficient. TOP1, P < 2.2 × 10⁻¹⁶; TOP2A, P < 2.2 × 10⁻¹⁶ (for correlation coefficient from Student's t test). (d) Histograms of ChIP-seq tag density for TOP1 (left) or TOP2A (right) 3 kb up- or downstream of their binding sites annotated to the TSS, intronic or intergenic regions of Aire-induced genes in B6.Aire+/+ mTEChi. (e,f) Histograms of tag densities of indicated proteins (top left) 3 kb up- or downstream of TOP1 (e) and TOP2A (f) binding sites annotated to the TSS, intronic or intergenic regions of Aire-induced genes in B6.Aire+/+ mTEChi. Aire, P = 0.01; RNA-PolII, P = 0.003; γH2AX, P = 0.001; TOP1, P = 7.4 × 10⁻⁵ (for TSS versus intergenic TOP2A peaks from Wilcoxon rank-sum test). Aire, P = 0.003; RNA-PolII, P = 0.002; γH2AX, P = 0.002; TOP1, P = 8.9 × 10⁻⁵ (for TSS versus intronic TOP2A peaks from Wilcoxon rank-sum test). Data are representative of two independent experiments. Data for one of the H3K27ac ChIP-seq experiments in a and b came from ref. 22.

TOP2A was less concentrated than TOP1 at mTEChi super-enhancers, whereas the overall genomic distribution, specifically the partitioning between intergenic, TSS and intronic regions, of the two topoisomerases was similar, with relatively little TOP2A and TOP1 localizing to exonic regions (Fig. 5d). Focusing on the statistically significant TOP1 peaks annotated to Aire-induced genes, we found co-association of Aire, RNA-PolII and γH2AX at intergenic, TSS and intronic stretches; as expected, the γH2AX peaks were broad, especially at the TSSs (Fig. 5e). We detected little TOP2A in the TOP1 peaks, regardless of the genome element examined (Fig. 5e). In contrast, the statistically significant TOP2A peaks showed co-association with Aire, RNA-PolII, γH2AX and TOP1, but almost exclusively at the TSSs (Fig. 5f). These findings also suggest that TOP1 and TOP2A have divergent roles in Aire-induced transcription in mTECs, with TOP1 primarily being involved during initial complex assembly at super-enhancers and TOP2A mainly being involved during subsequent events.

TOP1 and TOP2 are required for Aire induction of gene expression

We used the topoisomerase inhibitors topotecan and etoposide, which block TOP1 and TOP2, respectively, to assess the importance of these topoisomerases in vivo. These small-molecule inhibitors stabilize the enzymes when covalently bound to the DNA they just clipped, thereby inhibiting religation28. We injected 4-week-old B6.Aire+/+ female mice intraperitoneally with vehicle (DMSO), topotecan, etoposide or both drugs every day for 3 d, and then sorted the mTEChi fraction. None of the drugs altered the proportions of the major thymocyte or stromal cell compartments, as determined by flow cytometry (Supplementary Fig. 7). Nor did topotecan or etoposide treatment detectably influence mTEChi expression of Aire or any of the topoisomerases examined, as determined by flow cytometry (Supplementary Fig. 8a,b). Microarray-based gene-expression profiling, performed in biological triplicate on RNA from sorted mTEChi from drug-treated versus vehicle-treated mice, revealed that topotecan and etoposide, and especially the two together, preferentially repressed expression of the set of genes normally upregulated by Aire in mTEChi (Fig. 6a). Although the numbers of genes inhibited by each of the three treatments were similar (625–700), the reduction was less strong for etoposide than topotecan, whereas using both drugs resulted in the strongest reduction (Fig. 6a).

Figure 6: Requirement of TOP1 and TOP2 for Aire-induced mTEC gene transcription.

(a) Volcano plots (fold change versus P value) displaying microarray-based analysis of transcriptome changes in mTEChi of B6.Aire+/+ mice treated with the indicated inhibitor(s) or just vehicle (DMSO) every day for 3 d. Orange, transcripts increased >2-fold in vehicle-treated Aire+/+ versus Aire−/− mice; cyan, transcripts decreased >2-fold. Numbers refer to transcripts up- (right) or downregulated (left) by the indicated inhibitor(s) (etoposide, P = 4 × 10⁻⁵⁸; topotecan, P = 3 × 10⁻⁸³; etoposide + topotecan, P = 2 × 10⁻¹⁰⁷; P values for Aire-induced genes from χ² test). (b) Feature-level analysis (as per ref. 14) of the transcriptome data depicted in a. Expression value fold change for each feature (exon) of Aire-induced transcripts that exhibited an exon1 imbalance in inhibitor versus vehicle-only treatment (leftmost three panels) in Aire+/+ mice or in Aire−/− versus Aire+/+ mice (rightmost panel) was plotted relative to the feature's distance from the TSS. Red horizontal lines, median of fold change for proximal features (<200 bp from TSS); black horizontal lines, median fold change for distal features (>200 bp from TSS) (etoposide, P = 4 × 10⁻⁴; topotecan, P = 2.4 × 10⁻⁹; etoposide + topotecan, P = 2.2 × 10⁻¹⁰; Aire−/−, P = 2.8 × 10⁻¹⁶; Wilcoxon rank-sum test). (c) Immunoblots for interaction of FLAG-Aire with TOP2A, TOP2B and TOP1 in FLAG-Aire-transfected HEK293T cells treated with indicated inhibitor for 24 h. Left, representative immunoblots; right, summary quantitative data. *P < 0.05 and **P < 0.01 versus DMSO (unpaired Student's t test). Data are representative of three (a,b) or two (c) independent experiments (error bars (c) represent mean ± s.d. for n = 2 measurements pooled from two experiments).
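The χ² test named in the Figure 6a legend asks whether Aire-induced genes are over-represented among drug-repressed transcripts. The legend does not give the contingency layout, so the sketch below assumes a 2 × 2 table (Aire-induced versus other genes, repressed versus not repressed) and uses the standard Pearson statistic without continuity correction. The counts are invented and the function name is hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value (1 degree of freedom) for the table

                      repressed   not repressed
    Aire-induced          a             b
    other genes           c             d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts, for illustration only.
chi2, p = chi2_2x2(30, 10, 10, 30)
print(chi2, p)
```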

Feature-level analysis of gene-expression profiling data has been used to show that the effect of Aire on transcription is minimal just after the TSS, but increases after about 200 nucleotides, reflecting an effect on RNA-PolII pausing14,26. Plotting the ratio of Aire− to Aire+ mTEChi expression values versus distance from the TSS revealed the effect of Aire on RNA-PolII pausing (Fig. 6b), as previously described14,26. Feature-level analysis of the mTEChi gene-expression data from mice treated with etoposide and topotecan versus vehicle alone suggests that these drugs had a preferential effect on distal features (>200 nucleotides from TSS), and a weaker preference with the single-drug treatments (Fig. 6b). Thus, topoisomerase inhibitors, especially in combination, operate at least in part by potentiating RNA-PolII pausing.
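The proximal-versus-distal comparison can be sketched as a median fold change on each side of the 200-nucleotide boundary. The input format (one distance and one fold change per exon-level feature), the function name and the values are assumptions made for illustration; the published analysis additionally restricted itself to transcripts with an exon 1 imbalance and tested the difference with a Wilcoxon rank-sum test, neither of which is reproduced here.

```python
from statistics import median


def proximal_distal_medians(features, boundary=200):
    """Median fold change for features closer than, and farther than, `boundary` bp from the TSS.

    `features` is an iterable of (distance_from_tss_bp, fold_change) pairs.
    Returns (proximal_median, distal_median); a side with no features yields None.
    """
    features = list(features)
    proximal = [fc for dist, fc in features if dist < boundary]
    distal = [fc for dist, fc in features if dist > boundary]
    return (
        median(proximal) if proximal else None,
        median(distal) if distal else None,
    )


# Invented drug-versus-vehicle fold changes, for illustration only.
example = [(50, 0.95), (120, 1.0), (180, 0.9), (400, 0.5), (900, 0.4), (1500, 0.6)]
print(proximal_distal_medians(example))
```

A distal median well below the proximal median is the pattern expected when transcripts initiate normally but fail to elongate past the pause site.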

Co-immunoprecipitation studies on FLAG-Aire-HEK293T cells revealed that topotecan treatment blocked the ability of Aire to interact with all three topoisomerases, whereas etoposide compromised the association of Aire with TOP2, but not TOP1 (Fig. 6c). Together, these data indicate that both TOP1 and TOP2 had a substantial role in Aire-induced gene expression in mTECs. However, their roles were different, with TOP1 appearing to exert a stronger, earlier effect.

TOP1 and TOP2 are necessary for imprinting tolerance

Finally, we examined the immunologic consequences of disrupting the interaction between Aire and TOP1 or TOP2 by evaluating the clonal deletion of self-reactive thymocytes. We quantified the fraction of CD4+ T cells that recognize peptide P2 (294–306) of the retinal protein IRBP. Clonal deletion of these cells is known to be Aire dependent31. We injected 4-week-old B6.Aire+/+ mice intraperitoneally with topotecan, etoposide or just vehicle every third day for 3 weeks, and followed this with intraperitoneal inoculation with the P2 peptide in complete Freund's adjuvant. Compared with vehicle, treatment with etoposide or topotecan reduced the expression of Irbp mRNA in whole-thymus homogenates 10 d after P2 injection (Supplementary Fig. 8c). In addition, significantly more P2-specific CD4+ T cells were found in the peripheral lymphoid organs of drug- versus vehicle-treated mice (Fig. 7a), as determined by staining of pooled lymph node and spleen leukocytes with the Ab:P2 tetramer. Together, these observations indicate that topotecan and etoposide reduced negative selection of IRBP P2-specific T cells. Using the same assay, treatment with etoposide or topotecan did not reduce the negative selection of CD4+ T cells that bound the Ab:P7 tetramer, containing peptide 786–797 of IRBP (Fig. 7b), which is selected in the thymus independently of Aire31.

Figure 7: Requirement of TOP1 and TOP2 for imprinting of immunological tolerance.

(a,b) Cytofluorometric dot plots (left) and summary quantitative data (right) for Ab:P2 (a) or Ab:P7 (b) tetramer-stained CD4+ T cells from pooled spleen and lymph nodes of 4-week-old B6.Aire+/+ mice treated with topotecan, etoposide or just vehicle (DMSO) every third day for 3 weeks. CD3+CD8−CD11b−CD11c−F4/80−B220−CD19− gated cells are displayed. Values on cytofluorometric dot plots refer to the number of cells in the indicated gate. ***P < 0.001 (unpaired Student's t test). (c) Organ histology scores at 15 weeks of age for NOD/LtJ pups (Aire+/+) treated with topotecan, etoposide or just vehicle (DMSO) at 2, 4 and 8 d after birth. Scores reflect the scale described in the Online Methods. Data are representative of three experiments with similar results (error bars (a–c) represent mean ± s.e.m. from n ≥ 9 (a), n = 8 (b), n = 10 (c) measurements).

To further elucidate the effect of topoisomerase inhibition on immunologic tolerance, we quantified the leukocyte infiltration in various organs targeted by inflammatory infiltrates in Aire−/− mice on a NOD genetic background. These mice were selected for analysis because the autoimmunity that typically occurs in the absence of Aire is prominent in NOD mice32. We intraperitoneally injected NOD.Aire+/+ pups with topotecan or etoposide at days 2, 4 and 8 after birth, and histologically assessed various organs for leukocytic infiltration at 15 weeks of age. Inhibition of TOP1 and TOP2 with topotecan and etoposide significantly augmented leukocyte attack on the retina and lung at 15 weeks of age compared with vehicle treatment, but we did not see an increase in infiltration in stomach, lacrimal gland or salivary gland (Fig. 7c). The inflammatory attack on the retina and lung was not seen when the same experiment was performed on Aire−/− mice (Supplementary Fig. 8d,e), suggesting that it was not a result of nonspecific drug toxicity. In Aire−/− mice, treatment with TOP inhibitors protected the mice from tissue pathology, as expected from prior data showing that etoposide and Aire induce the same set of transcripts in Aire-deficient cells15. These results indicate that inhibition of TOP1 or TOP2 results in a break in immunological tolerance akin to, but not a precise mimic of, that characteristic of mice lacking Aire. Considering these insights on Aire-containing multi-protein complexes and chromosomal localization, we propose a model of the molecular mechanism of Aire that involves preferential localization of Aire on super-enhancers for efficient delivery of Aire-containing complexes to the TSSs of its target genes (Supplementary Fig. 9).


We used advanced genomic and biochemical approaches to investigate the molecular mechanism of Aire action. Aire was found on both Aire-induced and Aire-neutral genes, particularly along chromatin stretches overloaded with H3K27ac and H3K4me1 and underloaded with H3K27me3, a profile of histone marks that is routinely used to delineate super-enhancers19,21,33. Furthermore, the super-enhancers of mature Aire+/+ mTEChi were enriched in H3K27ac marks relative to both mature Aire−/− mTEChi and immature Aire− mTEChi from Aire+/+ mice, indicating that Aire activates super-enhancers. In addition, multiple topoisomerases were important for Aire induction of mTEC gene expression. TOP1 was a primary Aire partner, co-concentrated on super-enhancers and critical for Aire association with all of its other partners, whereas TOP2 was more involved at later stages of transcription.

Super-enhancers are defined as exceptionally long chromatin stretches hosting exceptionally high densities of general and cell-type-specific transcription factors19,20,21. They are thought to serve as depots for effective collection of relevant transcriptional regulators to enable their efficient and coordinate delivery to TSSs via intra-chromosomal looping or inter-chromosomal interactions. Super-enhancers are preferentially associated with genes that set the identity of and control the activities of fully differentiated cell types, or that are rapidly induced following environmental or physiologic stimulation. Localization of Aire in super-enhancers could explain several of its unusual features. First, Aire has a huge effect on mTEC transcription, regulating around 20% of the genes expressed in this cell type3,4. High-concentration depots of Aire-containing multi-protein complexes could drive this prodigious activity, perhaps in the nuclear speckles reported to host Aire and certain of its critical partners (for example, CBP)34,35. Second, although Aire exerts a strong influence on the mTEC transcriptome at the population level, its effect on an individual cell is much more restrained3,4,9,10. The repertoire of Aire-induced transcripts in single mTECs exhibits both intra- and inter-chromosomal clustering3,4. Dynamic looping of super-enhancers along a chromosome or engagement of a super-enhancer and TSS on different chromosomes could provide a potential framework for understanding such cell-by-cell variation. Third, the fact that super-enhancers are characteristically associated with genes mobilized during terminal differentiation of parenchymal cells might explain the preferential influence of Aire on loci encoding PTAs. Lastly, there is a strong correspondence between those genes induced by Aire and those repressed by a small-molecule inhibitor of BET proteins, among which is BRD4, a critical Aire partner26. Super-enhancers, which are overloaded with BRD4, are also known to be particularly sensitive to BET protein inhibitors36,37. Indeed, the preferential localization of BRD4 on upstream intergenic regions and its relative depletion from TSSs26 anticipated the partitioning of Aire that we observed.

There is a growing body of evidence that topoisomerases are important for gene transcription38,39. These enzymes cleave and rapidly reseal one (TOP1) or both (TOP2A/B) DNA strands, thereby generating a transient break through which topological changes are effected. Failure to complete the enzymatic reaction leads to trapping of a covalent DNA-topoisomerase intermediate, resulting in a single-stranded nick in the case of TOP1 and a double-stranded break in the case of TOP2. Confrontation of TOP1-induced nicks by the replication or transcription machinery often culminates in DSBs40,41. Several processes involved in the transcription of protein-coding genes generate topological strain that is relieved by these topoisomerases: chromatin remodeling42,43; synthesis of enhancer RNAs (eRNAs) and nucleosome depletion in enhancers44; DNA contortions at TSSs as a result of nucleosome depletion, transcription factor binding or RNA-PolII pausing45,46; and transcriptional elongation, which induces positive supercoils downstream of (ahead of) the polymerase and negative supercoils upstream of (behind) it28,47,48.

We previously suggested that Aire stabilizes TOP2A-induced DSBs at and downstream of TSSs15, thereby initiating recruitment of a histone-eviction complex composed of DNA-PKcs, Ku80, TOP2, PARP-1 and FACT that facilitates transcriptional elongation49. This notion was based on several lines of evidence: co-immunoprecipitation of TOP2A, DNA-PKcs and the other members of the eviction complex with Aire; the ability of Aire to promote DSBs in vitro and in vivo; and the strong correspondence between mTEC genes induced by Aire and by treatment of Aire-deficient cells with the TOP2 poison etoposide. However, our current results argue that this scenario is incorrect, or at least incomplete. Instead, we found that TOP1 was a primary Aire partner, seeding the formation of Aire-containing multi-protein complexes, notably at super-enhancers, through recruitment of elements of the DNA-damage response such as γH2AX, DNA-PKcs, Ku80 and PARP-1. Both TOP2A and TOP2B were also required for the induction of gene expression by Aire, but rather in subsequent events, a function that might still be performed by the histone-eviction complex. Findings in other systems support several elements of this revised scenario: eRNAs and TOP1-mediated DNA breaks are linked44; the DNA breaks induced by TOP1 can mobilize the DNA-damage response44; and TOP1 and TOP2 cooperate to optimize transcription28,43,47.

On the basis of these and previous11,14,15,26 observations, we propose a simplified model of Aire-induced gene expression. Attracted by hypomethylated H3K4me0 and/or repelled by hypermethylated H3K4me3 (ref. 11), Aire localizes to mTEC super-enhancers and interacts with TOP1, stabilizing DNA DSBs promoted by eRNA transcription and nucleosome depletion15 and thereby promoting recruitment of γH2AX, DNA-PKcs and other elements of the DNA-damage response, including Ku80 and PARP-1 (ref. 15). General transcription factors, such as RNA-PolII, CBP and BRD4 (ref. 26), are also drawn in, resulting in super-enhancers that host high concentrations of Aire-containing multi-protein complexes. Via chromatin looping, the super-enhancers serve as transcription factor depots for regional TSSs, particularly those contorted by paused RNA-PolII14, which itself can promote TOP2-induced DSBs, and thus independently seed the formation of some Aire-containing complexes. BRD4 in the complexes recruits pTEFb (composed of CycT1+CDK9 subunits)26, which phosphorylates DSIF, thereby lifting transcriptional pausing and promoting elongation. TOP1, and especially TOP2, ride along with RNA-PolII to relieve the torsional stresses introduced behind and in front of it. Additional factors (for example, DDX5, SFRS3) are independently incorporated into Aire-containing complexes, serving to link the transcription and splicing machineries.

Further validation of this model will probably require substantial evolution of existing genome-scale methods. The cell-to-cell variability of the effect of Aire, best appreciated from single-cell RNA-seq, may ultimately demand single-cell chromatin-capture approaches.


Maintenance, generation and treatment of mice.

Mice were housed and bred under specific-pathogen-free conditions at the Harvard Medical School Center for Animal Resources and Comparative Medicine (Institutional Animal Care and Use Committee protocol #02954). C57BL/6 (B6) Aire+/− mice5 were bred to generate Aire+/+ and Aire−/− littermates for experiments. Igrp-Gfp (Adig) reporter mice were provided by Dr. Mark Anderson, and were appropriately bred to yield Aire+/+ and Aire−/− littermates. Unless specified otherwise, females were used.

For the microarray experiments, 4-week-old B6.Aire+/+ and B6.Aire−/− mice were injected intraperitoneally with 5 mg/kg topotecan (Sigma-Aldrich), etoposide (Sigma-Aldrich) or both drugs, dissolved in dimethylsulfoxide (DMSO), once a day for three consecutive days. For the tetramer-staining experiments, mice of the same types were administered 1.25 mg/kg topotecan or etoposide once every third day for 3 weeks. To analyze effects on autoimmunity, we intraperitoneally injected Aire+/+ and Aire−/− NOD/LtJ pups with 0.675 mg/kg topotecan or 1.25 mg/kg etoposide once a day on the second, fourth and eighth days after birth.

Isolation, sorting and analysis of thymic and dermal cells.

Thymus tissue from individual 4–6-week-old Aire+/+ or Aire−/− mice was minced with scissors to release thymocytes and the fragments were digested with collagenase (Roche) and DNase (Sigma-Aldrich) for 15 min, then with collagenase/dispase (Roche) for 30 min, as previously described14. The released cells were stained with primary antibodies (Abs) (MHCII-APC; Ly51-PE; CD45-PE/Cy5), and CD45+ cells were depleted by MACS separation with anti-PE beads (Miltenyi). DAPI−CD45−Ly51loMHCIIhi mTECs were sorted on a MoFlo instrument (Cytomation) into Trizol for RNA preparation (for microarray) or into fetal bovine serum (FBS) (Gibco) for ChIP-seq library preparation, while GFP+(Aire+) mTECs were sorted in FACS buffer [phosphate-buffered saline (PBS), 0.5% bovine serum albumin, 2 mM EDTA] for ATAC-seq library preparation.

For isolation of dermal fibroblasts, ear skin tissue was minced with scissors and digested with collagenase type IV (Gibco) and DNase (Sigma-Aldrich) for 60 min. The single-cell suspension was stained with primary Abs (CD45-APC; EpCAM-APC; CD31-FITC; Ter-119-FITC; Sca-1-PE), and DAPI−CD45−EpCAM−CD31−Ter-119−Sca-1+ dermal fibroblasts were sorted in FACS buffer for preparation of ATAC-seq libraries.

For flow cytometric sorting or analysis, the following Abs were used: Ly51-PE (108308, BioLegend); CD45-PE/Cy5 (103110, BioLegend); MHCII-APC (107614, BioLegend); Aire (14-5934-80, eBioscience); TOP1 (ab85038, Abcam); TOP2A (ab12318, Abcam); TOP2B (ab72334, Abcam); CD3ɛ-PE (100308, BioLegend); CD4-PerCP/Cy5.5 (100434, BioLegend); CD8a-APC/Cy7 (100714, BioLegend); CD19-PE/Cy7 (115520, BioLegend); B220-PE/Cy7 (103222, BioLegend); CD11b-PE/Cy7 (101216, BioLegend); CD11c-PB (117322, BioLegend); F4/80-PE/Cy7 (123114, BioLegend); CD45-APC (103112, BioLegend); EpCAM-APC (118214, BioLegend); CD31-FITC (1625-02, SouthernBiotech); Ter-119-FITC (116206, BioLegend); Sca-1-PE (12-5981-83, eBioscience). Anti-rat IgG secondary Abs conjugated with FITC were from SouthernBiotech, while anti-rabbit IgG-Alexa Fluor 647 Abs were purchased from Jackson Immunoresearch.

ChIP-seq analysis.

1.5 × 10⁵ mTEChi from 4–6-week-old female B6 mice were used for each ChIP-seq sample, adapting published protocols17,50. Briefly, mTECs were cross-linked with 1% formaldehyde for 8 min, sorted and lysed for 10 min on ice in RadioImmunoPrecipitation Assay (RIPA) buffer [10 mM Tris-HCl (pH 8.0), 1 mM EDTA (pH 8.0), 140 mM NaCl, 1% Triton X-100, 0.1% sodium dodecyl sulfate (SDS) and 0.1% sodium deoxycholate] supplemented with complete protease inhibitor cocktail (Roche). Chromatin was sheared using an AFA Focused-ultrasonicator (Covaris) for 15 min (duty cycle 2%, intensity 3, cycle/burst 200) and the sheared material was cleared by a 10-min centrifugation at 13,000 rpm at 4 °C. The cleared material was immunoprecipitated overnight at 4 °C with Abs conjugated to magnetic Protein-G beads (Life Technologies, Dynabeads), followed by extensive washing of the beads with ice-cold RIPA, high-salt RIPA [10 mM Tris-HCl (pH 8.0), 1 mM EDTA (pH 8.0), 500 mM NaCl, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate], LiCl [10 mM Tris-HCl (pH 8.0), 1 mM EDTA (pH 8.0), 250 mM LiCl, 0.5% NP-40 and 0.5% sodium deoxycholate] and TE [10 mM Tris-HCl (pH 8.0) and 1 mM EDTA (pH 8.0)]. Chromatin derived from 1.5 × 10⁵ cells immunoprecipitated with specific Abs was eluted from the beads, treated with 1 μg DNase-free RNase (Roche) for 30 min at 37 °C and with Proteinase K (Roche) for 2 h at 37 °C, followed by reverse cross-linking by leaving the plate at 65 °C overnight. DNA from reverse cross-linked material was purified with SPRI beads (Agencourt AMPure XP beads, Beckman Coulter), and sequential steps of end-repair, A-base addition, adaptor-ligation and PCR amplification (15 cycles) were performed to prepare the ChIP-seq library for each sample, as described previously17. ChIP-seq for H3K27me3 was performed as previously reported51.

Individual ChIP-seq libraries were size-selected for 200–500-bp fragments with SPRI beads. Equivalent amounts of barcoded libraries were pooled and sequenced using HiSeq 2500 or NextSeq 500 (Illumina) instruments. To control for background noise, we immunoprecipitated sheared chromatin with purified rabbit IgG Abs, and a ChIP-seq library was prepared and sequenced as described above.

For ChIP-seq analysis, the following Abs were used: Aire (14-5934-80, eBioscience); TOP1 (ab3825 and ab85038, Abcam); TOP2A (WH0007153M1, Sigma-Aldrich); RNA-PolII (MMS-128P, Covance); γH2AX (05-636, Millipore); H3K4me1 (ab8895, Abcam and 07-436, Millipore); H3K27ac (ab4729, Abcam) and H3K27me3 (ab6002, Abcam).

Short reads (50 bp, single end) were aligned to the mouse reference genome (mm10) using bowtie aligner version 2.2.4 (ref. 52). Reads with multiple alignments were removed with samtools (v1.1), and duplicate reads were removed with picard (v1.130). To identify peaks from ChIP-seq reads, we used the HOMER package makeTagDirectory followed by the findPeaks command with the 'histone' parameter53. Peaks displaying fourfold enrichment and a Poisson P-value of 1 × 10⁻⁴ against background IgG ChIP were considered significant and were used for further analysis. To visualize individual ChIP-seq data on Integrative Genomics Viewer (IGV)54, we converted bam output files from picard into normalized bigwig format using the bamCoverage function in deepTools (v1.6) with the options --fragmentLength 200 --normalizeUsingRPKM (ref. 55). HOMER-generated peak files for H3K27ac were used for the identification of super-enhancers, using the ROSE algorithm described previously21, wherein enhancer peaks are stitched together if they are located within 12.5 kb of each other and if they do not have multiple active promoters in between; enhancers were then ranked according to increasing H3K27ac signal intensity. Heatmaps as in Figure 1d and line plots as in Figure 1b were generated using the program ngs.plot56.
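The peak filtering and stitching steps described above can be illustrated with a minimal Python sketch. It is not the HOMER or ROSE implementation: the `Peak` record, its field names and the function names are hypothetical, the cutoffs are read as "at least fourfold" and "P at most 1 × 10⁻⁴", and the ROSE rule that excludes regions spanning multiple active promoters is deliberately omitted.

```python
from dataclasses import dataclass

STITCH_DISTANCE = 12_500  # bp; the 12.5-kb stitching window quoted in the text
MIN_FOLD = 4.0            # fold enrichment over IgG (assumed to mean ">= 4")
MAX_PVALUE = 1e-4         # Poisson P-value cutoff (assumed to mean "<= 1e-4")


@dataclass
class Peak:
    """Hypothetical peak record; field names are illustrative only."""
    chrom: str
    start: int
    end: int
    signal: float   # H3K27ac signal in arbitrary units
    fold: float     # enrichment over background IgG ChIP
    pvalue: float   # Poisson P-value versus IgG


def significant(peaks):
    """Keep peaks that pass both the fold-enrichment and P-value cutoffs."""
    return [p for p in peaks if p.fold >= MIN_FOLD and p.pvalue <= MAX_PVALUE]


def stitch(peaks, max_gap=STITCH_DISTANCE):
    """Merge peaks on the same chromosome separated by at most max_gap bp.

    The promoter-based exclusion used by ROSE is NOT modeled here.
    """
    stitched = []
    for p in sorted(peaks, key=lambda q: (q.chrom, q.start)):
        last = stitched[-1] if stitched else None
        if last and last["chrom"] == p.chrom and p.start - last["end"] <= max_gap:
            last["end"] = max(last["end"], p.end)
            last["signal"] += p.signal  # accumulate signal over constituents
        else:
            stitched.append(
                {"chrom": p.chrom, "start": p.start, "end": p.end, "signal": p.signal}
            )
    return stitched


def rank_by_signal(regions):
    """Order stitched regions by increasing total signal."""
    return sorted(regions, key=lambda r: r["signal"])


example = [
    Peak("chr1", 1_000, 2_000, 10.0, 5.0, 1e-5),
    Peak("chr1", 10_000, 11_000, 20.0, 6.0, 1e-6),   # 8 kb from the first peak
    Peak("chr1", 50_000, 51_000, 5.0, 8.0, 1e-8),    # too far away to stitch
    Peak("chr2", 100, 200, 3.0, 2.0, 1e-5),          # fails the fold cutoff
    Peak("chr2", 1_000, 2_000, 50.0, 10.0, 1e-3),    # fails the P-value cutoff
]
ranked = rank_by_signal(stitch(significant(example)))
```

In this toy input, the first two chr1 peaks are merged into one region and the two chr2 peaks are discarded before stitching. Summing constituent signal is one simple choice for scoring a stitched region; the actual scoring used by ROSE is not specified in the text above.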

ATAC-seq analysis.

1 × 10⁴ mTECs or fibroblasts from 4–6-week-old female Adig mice were used for preparation of ATAC-seq libraries, adapting published protocols23,57. Briefly, cells were suspended in 100 μl of cold hypotonic lysis buffer [10 mM Tris-HCl (pH 7.5), 10 mM NaCl, 3 mM MgCl2 and 0.1% NP-40], followed by immediate centrifugation at 550 g for 30 min. The pellet was re-suspended in 5 μl of transposition reaction mix [1 μl of Tagment DNA Enzyme and 2.5 μl of Tagment DNA Buffer from Nextera DNA Sample Prep Kit (Illumina), 1.5 μl H2O], and was incubated for 60 min at 37 °C for DNA to be fragmented and tagged. For library preparation, two sequential seven-cycle rounds of PCR were performed to enrich small tagmented DNA fragments. After the first PCR, the libraries were selected for small fragments (less than 600 bp) using SPRI beads, followed by a second round of PCR under the same conditions to obtain the final library. Libraries were sequenced on the NextSeq 500 instrument to generate paired-end short reads (50 bp, forward; 34 bp, reverse). Data were processed essentially as per ChIP-seq analysis, except that reads mapping to mitochondrial DNA (17%) were removed before analysis and peaks were identified using the 'factor' parameter in the findPeaks command of the HOMER package.

Cell culture and transfection.

HEK293T cells were cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% FBS, L-glutamate and penicillin/streptomycin antibiotics, and were maintained in a humidified atmosphere at 37 °C with 5% CO2. For transfection, the cells were seeded in 10-cm tissue-culture plates, and were transfected with the specified plasmids using TransIT reagent (Mirus) according to the manufacturer's instructions. A plasmid driving expression of wild-type mouse Aire FLAG-tagged at the amino terminus (FLAG-Aire) was constructed by in-frame insertion between the BglII and SalI sites of the pCMV-tag1 vector (Clontech), as described previously58.

Gel filtration chromatography, immunoprecipitations and mass spectrometry.

HEK293T cells were transfected with empty or FLAG-Aire-containing pCMV-tag1 vector. 48 h later, the cells were harvested and lysed in a hypotonic lysis buffer [0.05% NP-40, 10 mM HEPES, 1.5 mM MgCl2, 10 mM KCl, 5 mM EDTA (3 × 10⁷ cells/ml)] plus complete protease inhibitor cocktail (Roche), pH 7.4, followed by incubation on ice for 15 min. Nuclei were separated from the cytosolic fraction by centrifugation at 800 g for 10 min, and were incubated at 4 °C for 1 h in a native nuclear extraction buffer [50 mM Bis-Tris, 750 mM 6-aminocaproic acid, 3 mM CaCl2, 10% glycerol, EDTA-free complete protease inhibitor cocktail (Roche) and micrococcal nuclease (Nuclease S7; Roche), pH 7.4 (6 × 10⁷ cells/ml)].

For gel filtration chromatography, 1.5 mg of nuclear extract was injected into a Superose-6 10/300 GL column (GE Healthcare Life Sciences), and was separated by fast-protein liquid chromatography (FPLC) using the elution buffer 20 mM HEPES, 150 mM NaCl, 20 mM KCl, 0.5 mM MgCl2, 3 mM CaCl2, EDTA-free complete protease inhibitor cocktail, pH 7.4. Twenty-seven fractions of 1 ml each were collected. The indicated fractions, used directly or pooled, were concentrated via filter centrifugation (Amicon Ultra, 10 kDa cutoff, Millipore) to 500 μl.

For immunoprecipitations, concentrated, pooled chromatographic fractions (from the above-mentioned chromatography experiment) or nuclear extracts (for standard immunoprecipitation experiments) were incubated with the indicated Abs conjugated to Protein-G Sepharose beads (Life Technologies) overnight with rotation at 4 °C. To identify DNA-dependent protein interactions, nuclear extracts were treated with ethidium bromide (100 μg/ml) for 15 min at 4 °C followed by 10-min centrifugation at 13,000 rpm at 4 °C and supernatants were used for immunoprecipitation as above. Ethidium bromide was also included in all the later washing steps for this experiment. Beads were washed thrice with ice-cold PBS containing 0.05% NP-40, and once with ice-cold PBS. Bound proteins were eluted by boiling the beads in sample buffer for 15 min, separated by SDS-PAGE, electro-transferred to polyvinylidene difluoride (PVDF) membranes (Bio-Rad), blocked for 60 min with 5% non-fat dried milk solution in PBST [PBS (pH 7.4), 0.05% Tween 20], and were probed with primary Abs overnight at 4 °C. After a wash with PBST, membranes were incubated with secondary Abs linked to horseradish peroxidase. The blots were then developed with an enhanced chemiluminescence detection system (Thermo Scientific) as per the manufacturer's instructions. For quantification, the chemiluminescent images were processed with Multi Gauge v2.3 (Fujifilm).

For immunoprecipitation studies, Abs recognizing the following proteins were used: FLAG-tag (M2 mouse mAb, Sigma-Aldrich); DNA-PKcs (MS-423-P1, Thermo Scientific); Ku80 (ab55408, Abcam); PARP-1 (9542, Cell Signaling); TOP1 (ab85038, Abcam); TOP2A (ab12318, Abcam); TOP2B (ab72334, Abcam); RNA-PolII (sc-899, Santa Cruz); SPT5 (sc-28678, Santa Cruz); CDK9 (sc-484, Santa Cruz); BRD4 (ab84776, Abcam); SFRS3 (H00006428-M08, Abnova) and DDX5 (sc-166167, Santa Cruz). Anti-mouse and anti-rabbit IgG secondary Abs conjugated with horseradish peroxidase were purchased from Jackson Immunoresearch.

The mass-spectrometry technique used to analyze FLAG-Aire immunoprecipitates has been described previously15. Briefly, nuclear extracts from FLAG-Aire-HEK293T cells were incubated with 20 μl Protein-G Sepharose beads conjugated to anti-FLAG Abs overnight with rotation at 4 °C. Beads were washed three times with ice-cold PBS containing 0.05% NP-40, and once with ice-cold PBS. Immunoprecipitated proteins were eluted by boiling in sample buffer for 15 min, and were separated by 10% SDS-PAGE. Gels were stained with Coomassie G-250, and tryptic digests of individual lanes were analyzed by LC-MS/MS using an LTQ mass spectrometer. Analysis of the MS/MS data was performed using the SEQUEST algorithm as described previously15.

shRNA-mediated knockdown of Aire partners.

Knockdown of Aire-partners in HEK293T cells was accomplished by expression of cognate shRNAs (four per partner) in the lentiviral vector pLKO.1, procured from the RNAi Consortium of the Broad Institute. shRNAs targeting LacZ served as controls. We transduced HEK293T cells with individual shRNA-containing lentivirus particles, selected them using Puromycin (Gibco), then transfected them with the FLAG-Aire plasmid. 48 h later, we performed immunoprecipitations with anti-FLAG Abs, as described above. The densities of immunoprecipitated protein bands were quantified with Multi Gauge v2.3 (Fujifilm). The densities for all immunoprecipitated protein bands, after transduction of LacZ (two hairpins) or cognate shRNAs (four hairpins), were averaged for two independent experiments and scaled considering immunoprecipitation after LacZ transduction as 100%.
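The scaling of band densities to the LacZ control can be sketched as follows. This is only an illustration of the arithmetic, assuming that densities from all hairpins and both experiments are pooled before averaging; the function name and the example values are hypothetical and are not data from this study.

```python
from statistics import mean


def percent_of_control(target_densities, control_densities):
    """Express the mean target band density as a percentage of the control mean.

    target_densities: densities after cognate shRNA transduction
                      (e.g. four hairpins x two experiments, pooled).
    control_densities: densities after LacZ shRNA transduction
                       (e.g. two hairpins x two experiments, pooled).
    """
    control_mean = mean(control_densities)
    if control_mean == 0:
        raise ValueError("control density must be non-zero")
    return 100.0 * mean(target_densities) / control_mean


# Hypothetical densitometry readings (arbitrary units), for illustration only.
lacz = [200, 180, 220, 200]
knockdown = [50, 40, 60, 50, 55, 45, 50, 50]
remaining = percent_of_control(knockdown, lacz)
```

With these made-up numbers the co-immunoprecipitated band retains a quarter of its control intensity. Averaging ratios per experiment instead of pooling raw densities is an equally plausible reading of the description and would give slightly different values when experiments differ in overall intensity.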

Microarray and quantitative PCR analyses.

RNA was prepared from mTEChi of individual mice treated with vehicle (DMSO) alone, topotecan, etoposide or both drugs followed by amplification and cDNA preparation as previously described58. cDNA was either hybridized to Affymetrix ST1.0 microarrays or used for quantitative PCR analysis of Irbp expression. Quantitative PCR was performed using Power SYBR Green master mix (Thermo Scientific) and the StepOnePlus real-time PCR system (Applied Biosystems). Primer sequences were Irbp-forward, CTACAACCGGCCCAATGACT; Irbp-reverse, AAGTAAATTCCTCGGCGGCA; Hprt-forward, TGCCGAGGATTTGGAAAAAGTG; Hprt-reverse, TGGCCTCCCATCTCCTTCAT.

Microarray data were processed using the robust multiarray average (RMA) algorithm for probe-level normalization and analyzed by the multiplot module of GenePattern (Broad Institute). The feature-level analysis of microarray data was performed as described previously, with slight modifications14. Briefly, we processed the raw probe-level data files (.CEL) from Affymetrix ST1.0 microarrays with the RMA algorithm to generate normalized exon-level and gene-level data files for each sample. The genome-wide locations of microarray probes on “mm10 (mouse) build” were extracted from the Affymetrix website. Exon-level expression values for Aire-induced genes (Aire+/+/Aire−/− > 2) were taken for further analysis if the gene displayed exon1 imbalance (that is, the ratio of the exon Aire+/+/Aire−/− fold change to the transcript Aire+/+/Aire−/− fold change was >2 or <0.5). For Aire-induced genes flagged for exon1 imbalance, expression levels of the exons were plotted against their distance from the TSSs.
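The two thresholds in this selection step can be written out as a short Python sketch. It assumes linear-scale (not log2) expression values, and the function names and example numbers are hypothetical; the actual analysis pipeline is the one cited above.

```python
def fold_change(wt_expr, ko_expr):
    """Aire+/+ over Aire-/- ratio for linear-scale expression values."""
    if ko_expr <= 0:
        raise ValueError("knockout expression must be positive")
    return wt_expr / ko_expr


def is_aire_induced(transcript_fc, threshold=2.0):
    """A gene counts as Aire-induced when its transcript fold change exceeds 2."""
    return transcript_fc > threshold


def has_exon1_imbalance(exon_fc, transcript_fc, upper=2.0, lower=0.5):
    """Flag a gene when the exon/transcript fold-change ratio is >2 or <0.5."""
    ratio = exon_fc / transcript_fc
    return ratio > upper or ratio < lower


# Hypothetical values for one gene: transcript-level and exon-1-level expression.
transcript_fc = fold_change(wt_expr=400.0, ko_expr=100.0)   # 4.0
exon1_fc = fold_change(wt_expr=1000.0, ko_expr=100.0)       # 10.0
selected = is_aire_induced(transcript_fc) and has_exon1_imbalance(exon1_fc, transcript_fc)
```

Here the exon-level fold change is 2.5 times the transcript-level fold change, so the gene would be carried forward for the exon-versus-TSS-distance plot.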

Tetramer analyses.

Inhibitor- or vehicle-treated mice were immunized with 100 μg P2 peptide (IRBP271–290) or P7 peptide (IRBP771–790) emulsified in complete Freund's adjuvant, as described previously31. APC-conjugated Ab:P2 and Ab:P7 tetramers were generated by the National Institutes of Health Tetramer Core Facility. For tetramer staining, 10 d after immunization, peripheral lymph node and spleen cells were pooled and stained for 1 h at 25 °C, followed by magnetic-bead purification using anti-APC beads to enrich for tetramer-positive cells. The selected cells were stained with antibodies to CD3ɛ, CD4, CD8, B220, CD19, F4/80, CD11b and CD11c. Stained cells were analyzed on an LSRII (BD Biosciences), and tetramer-reactive cells were gated as CD3+CD4+CD8−CD11b−CD11c−F4/80−B220−CD19− using FlowJo software (TreeStar). Tetramer-positive cells were enumerated by counting the total number of cells by MACSQuant (Miltenyi Biotech), and determining the fraction of tetramer-reactive cells on FlowJo.
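The enumeration in the last sentence amounts to multiplying a total cell count by a gated fraction. A minimal sketch, with a hypothetical function name and made-up numbers, and assuming the fraction is taken relative to all acquired events:

```python
def tetramer_positive_count(total_cells, acquired_events, tetramer_events):
    """Estimate absolute tetramer-positive cells from a count and a gate fraction.

    total_cells:      absolute cell number in the sample (from the cell counter).
    acquired_events:  events acquired on the cytometer for that sample.
    tetramer_events:  events falling in the tetramer-positive gate.
    """
    if acquired_events <= 0:
        raise ValueError("acquired_events must be positive")
    if not 0 <= tetramer_events <= acquired_events:
        raise ValueError("tetramer_events must lie between 0 and acquired_events")
    return total_cells * tetramer_events / acquired_events


# Hypothetical sample: 2 million cells, 100,000 events acquired, 50 in the gate.
estimate = tetramer_positive_count(2_000_000, 100_000, 50)
```

Which population serves as the denominator (all events, live cells or CD4+ T cells) is not stated in the text, so that choice is an assumption of this sketch.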

Autoimmune disease monitoring.

Inhibitor- or DMSO-treated Aire+/+ mice were sacrificed at 15 weeks of age, while similarly treated Aire−/− were sacrificed at 12 weeks of age or when they had lost 15–20% body weight relative to that of littermates. The designated tissues were removed, fixed in 10% formalin and embedded in paraffin. Tissue sections were stained with hematoxylin and eosin (H+E), and infiltration of various organs was scored. In general, scores of 0, 0.5, 1, 2, 3 and 4 indicate no, trace, mild, moderate, or severe lymphocytic infiltration, and complete destruction, respectively. For retinal degeneration, 0 = lesion present without any photoreceptor layer lost; 1 = lesion present, but less than half of the photoreceptor layer lost; 2 = more than half of the photoreceptor layer lost; 3 = entire photoreceptor layer lost without or with mild outer nuclear layer attack; and 4 = the entire photoreceptor layer and most of the outer nuclear layer destroyed. All samples were scored blindly and independently by two investigators.

Statistical analysis.

Data were routinely presented as mean ± s.d. or s.e.m. Statistical significance was assessed by Student's t test, χ2 test or the Wilcoxon rank-sum test, as specified in individual figure legends.
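For readers who want to reproduce the summary statistics, the following Python sketch computes mean, s.d. and s.e.m., plus the unpaired Student's t statistic with pooled variance. It uses only the standard library, so it returns the t statistic and degrees of freedom rather than a P value; the function names and example values are illustrative and not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev, variance


def summarize(values):
    """Return n, mean, sample s.d. and s.e.m. (s.d. / sqrt(n)) for one group."""
    n = len(values)
    sd = stdev(values)
    return {"n": n, "mean": mean(values), "sd": sd, "sem": sd / sqrt(n)}


def student_t(a, b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Illustrative values only.
vehicle = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
stats_vehicle = summarize(vehicle)
t_value, dof = student_t(vehicle, treated)
```

Converting the t statistic to a P value requires the t distribution, which is available in third-party packages but not in the Python standard library.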

Data availability.

The ATAC-seq, ChIP-seq and microarray datasets reported in this manuscript can be accessed in GEO with accession codes GSE92594, GSE92597 and GSE92509. Other referenced publicly available data sets: B6.Aire+/+ and B6.Aire−/− RNA-seq, SRR2038194, SRR2038195, SRR2038196 and SRR2038197; H3K27ac ChIP-seq for Aire+ and Aire− mTECs, GSE74257; H3K27ac and H3K4me1 ChIP-seq for HEK293T cells, GSE51633. Aire, RNA-PolII and IgG ChIP-seq data in Aire-transfected HEK293T cells came from ref. 14.


References


1. Positive and negative selection of the T cell repertoire: what thymocytes see (and don't see). Nat. Rev. Immunol. 14, 377–391 (2014).
2. Population and single-cell genomics reveal the Aire dependency, relief from Polycomb silencing, and distribution of self-antigen expression in thymic epithelia. Genome Res. 24, 1918–1931 (2014).
3. Aire controls gene expression in the thymic epithelium with ordered stochasticity. Nat. Immunol. 16, 942–949 (2015).
4. Single-cell transcriptome analysis reveals coordinated ectopic gene-expression patterns in medullary thymic epithelial cells. Nat. Immunol. 16, 933–941 (2015).
5. Projection of an immunological self shadow within the thymus by the aire protein. Science 298, 1395–1401 (2002).
6. Transcriptional regulation by AIRE: molecular mechanisms of central tolerance. Nat. Rev. Immunol. 8, 948–957 (2008).
7. Aire. Annu. Rev. Immunol. 27, 287–312 (2009).
8. Transcriptional impact of Aire varies with cell type. Proc. Natl. Acad. Sci. USA 105, 14011–14016 (2008).
9. Promiscuous gene expression patterns in single medullary thymic epithelial cells argue for a stochastic mechanism. Proc. Natl. Acad. Sci. USA 105, 657–662 (2008).
10. Ectopic expression of peripheral-tissue antigens in the thymic epithelium: probabilistic, monoallelic, misinitiated. Proc. Natl. Acad. Sci. USA 105, 15854–15859 (2008).
11. Aire employs a histone-binding module to mediate immunological tolerance, linking chromatin regulation with organ-specific autoimmunity. Proc. Natl. Acad. Sci. USA 105, 15878–15883 (2008).
12. The autoimmune regulator PHD finger binds to non-methylated histone H3K4 to activate gene expression. EMBO Rep. 9, 370–376 (2008).
13. AIRE recruits P-TEFb for transcriptional elongation of target genes in medullary thymic epithelial cells. Mol. Cell. Biol. 27, 8815–8823 (2007).
14. Aire unleashes stalled RNA polymerase to induce ectopic gene expression in thymic epithelial cells. Proc. Natl. Acad. Sci. USA 109, 535–540 (2012).
15. Aire's partners in the molecular control of immunological tolerance. Cell 140, 123–135 (2010).
16. An RNAi screen for Aire cofactors reveals a role for Hnrnpl in polymerase release and Aire-activated ectopic transcription. Proc. Natl. Acad. Sci. USA 111, 1491–1496 (2014).
17. High-throughput chromatin immunoprecipitation for genome-wide mapping of in vivo protein-DNA interactions and epigenomic states. Nat. Protoc. 8, 539–554 (2013).
18. AIRE activated tissue specific genes have histone modifications associated with inactive chromatin. Hum. Mol. Genet. 18, 4699–4710 (2009).
19. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).
20. Super-enhancers: asset management in immune cell genomes. Trends Immunol. 36, 519–526 (2015).
21. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).
22. Identification of a novel cis-regulatory element essential for immune tolerance. J. Exp. Med. 212, 1993–2002 (2015).
23. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
24. Brd4 and JMJD6-associated anti-pause enhancers in regulation of transcriptional pause release. Cell 155, 1581–1595 (2013).
25. APECED-causing mutations in AIRE reveal the functional domains of the protein. Hum. Mutat. 23, 245–257 (2004).
26. Brd4 bridges the transcriptional regulators, Aire and P-TEFb, to promote elongation of peripheral-tissue antigen transcripts in thymic stromal cells. Proc. Natl. Acad. Sci. USA 112, E4448–E4457 (2015).
27. Breaking barriers to transcription elongation. Nat. Rev. Mol. Cell Biol. 7, 557–567 (2006).
28. Topoisomerases facilitate transcription of long genes linked to autism. Nature 501, 58–62 (2013).
29. Dynamics of DNA damage response proteins at DNA breaks: a focus on protein modifications. Genes Dev. 25, 409–433 (2011).
30. Ethidium bromide provides a simple tool for identifying genuine DNA-independent protein associations. Proc. Natl. Acad. Sci. USA 89, 6958–6962 (1992).
31. Detection of an autoreactive T-cell population within the polyclonal repertoire that undergoes distinct autoimmune regulator (Aire)-mediated selection. Proc. Natl. Acad. Sci. USA 109, 7847–7852 (2012).
32. Modifier loci condition autoimmunity provoked by Aire deficiency. J. Exp. Med. 202, 805–815 (2005).
33. Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell 159, 374–387 (2014).
34. Localization of the APECED protein in distinct nuclear structures. Hum. Mol. Genet. 8, 259–266 (1999).
35. Autoimmune regulator is acetylated by transcription coactivator CBP/p300. Exp. Cell Res. 318, 1767–1778 (2012).
36. Discovery and characterization of super-enhancer-associated dependencies in diffuse large B cell lymphoma. Cancer Cell 24, 777–790 (2013).
37. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell 153, 320–334 (2013).
38. Topoisomerase-mediated chromosomal break repair: an emerging player in many games. Nat. Rev. Cancer 15, 137–151 (2015).
39. DNA topoisomerases beyond the standard role. Transcription 4, 232–237 (2013).
40. Conversion of topoisomerase I cleavage complexes on the leading strand of ribosomal DNA into 5′-phosphorylated DNA double-strand breaks by replication runoff. Mol. Cell. Biol. 20, 3977–3987 (2000).
41. Ataxia telangiectasia mutated activation by transcription- and topoisomerase I-induced DNA double-strand breaks. EMBO Rep. 10, 887–893 (2009).
42. Glucocorticoid receptor transcriptional activation via the BRG1-dependent recruitment of TOP2beta and Ku70/86. Mol. Cell. Biol. 35, 2799–2817 (2015).

  43. 43.

    et al. Chromatin remodeller SMARCA4 recruits topoisomerase 1 and suppresses transcription-associated genomic instability. Nat. Commun. 7, 10549 (2016).

  44. 44.

    et al. Ligand-dependent enhancer activation regulated by topoisomerase-I activity. Cell 160, 367–380 (2015).

  45. 45.

    et al. RNA polymerase II regulates topoisomerase 1 activity to favor efficient transcription. Cell 165, 357–371 (2016).

  46. 46.

    et al. Activity-induced DNA breaks govern the expression of neuronal early-response genes. Cell 161, 1592–1605 (2015).

  47. 47.

    & Transcription-generated torsional stress destabilizes nucleosomes. Nat. Struct. Mol. Biol. 21, 88–94 (2014).

  48. 48.

    et al. Transcriptional elongation requires DNA break-induced signalling. Nat. Commun. 6, 10191 (2015).

  49. 49.

    et al. FACT-mediated exchange of histone variant H2AX regulated by phosphorylation of H2AX and ADP-ribosylation of Spt16. Mol. Cell 30, 86–97 (2008).

  50. 50.

    et al. A high-throughput chromatin immunoprecipitation approach reveals principles of dynamic gene regulation in mammals. Mol. Cell 47, 810–822 (2012).

  51. 51.

    et al. An ultra-low-input native ChIP-seq protocol for genome-wide profiling of rare cell populations. Nat. Commun. 6, 6033 (2015).

  52. 52.

    & Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  53. 53.

    et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  54. 54.

    et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  55. 55.

    , , , & deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–91 (2014).

  56. 56.

    , , & ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 15, 284 (2014).

  57. 57.

    et al. Immunogenetics. Chromatin state dynamics during blood formation. Science 345, 943–949 (2014).

  58. 58.

    , , , & Aire's plant homeodomain(PHD)-2 is critical for induction of immunological tolerance. Proc. Natl. Acad. Sci. USA 110, 1833–1838 (2013).

Download references


We thank G. Buruzula, K. Rothamel, A. Rhoads, K. Hattori, A. Lopez, G. Gopalan, K. Waraska and M. Thorsen for experimental assistance, and C. Laplace for help with manuscript preparation. The NIH Tetramer Core Facility (contract HHSN272201300006C) kindly provided tetramers. This work was supported by NIH grants R01 DK060027 and R01 AI088204. K.B. was supported by American Diabetes Association Mentor-Based Postdoctoral Fellowship #7-12-MN-51 to D.M.

Author information


  1. Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts, USA.

    Kushagra Bansal, Hideyuki Yoshida, Christophe Benoist & Diane Mathis



K.B. and H.Y. performed the experiments. K.B., C.B. and D.M. designed the study, analyzed and interpreted the data, and wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Christophe Benoist or Diane Mathis.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–9

  2. Supplementary Table 1: Efficiency and robustness of various ChIP-seq experiments
