Intronic regulation of Aire expression by Jmjd6 for self-tolerance induction in the thymus

The thymus has spatially distinct microenvironments, the cortex and the medulla, where the developing T-cells are selected to mature or die through the interaction with thymic stromal cells. To establish the immunological self in the thymus, medullary thymic epithelial cells (mTECs) express diverse sets of tissue-specific self-antigens (TSAs). This ectopic expression of TSAs largely depends on the transcriptional regulator Aire, yet the mechanism controlling Aire expression itself remains unknown. Here, we show that Jmjd6, a dioxygenase that catalyses lysyl hydroxylation of splicing regulatory proteins, is critical for Aire expression. Although Jmjd6 deficiency does not affect abundance of Aire transcript, the intron 2 of Aire gene is not effectively spliced out in the absence of Jmjd6, resulting in marked reduction of mature Aire protein in mTECs and spontaneous development of multi-organ autoimmunity in mice. These results highlight the importance of intronic regulation in controlling Aire protein expression.

The thymus has spatially distinct microenvironments, the cortex and the medulla, where developing T-cells are selected to mature or die through interaction with thymic stromal cells 1,2. Cortical thymic epithelial cells (cTECs), a major stromal cell type in the cortex, direct differentiation of CD4+CD8+ immature thymocytes that are capable of recognizing self-major histocompatibility complex (MHC) molecules. On the other hand, medullary thymic epithelial cells (mTECs) play an important role in self-tolerance induction by eliminating self-reactive T-cells. A unique property of mTECs is their expression of diverse sets of peripheral tissue-specific self-antigens (TSAs) 3,4. This ectopic expression of TSAs largely depends on the transcriptional regulator Aire 5–8, which is expressed in mature mTECs 9–11. Homozygous mutations of human AIRE cause an autoimmune disease known as autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy 12,13. Similarly, Aire-deficient mice develop multi-organ autoimmunity with a failure to delete self-reactive T-cells 5,14,15. Despite this important role in self-tolerance induction, the mechanism controlling Aire expression itself is poorly understood.
Alternative splicing is a major cellular mechanism in metazoans for generating proteomic diversity 16,17. This is a posttranscriptional process in which premature transcripts are selectively cut and joined in more than one way to generate multiple mRNAs from a single gene. There are three forms of alternative splicing: exon skipping, alternative splice site usage and intron retention 17. Of these, intron retention is the least frequent form 17; it occurs when an intron, having been transcribed as part of a pre-mRNA, is not spliced out. The sequence structure of most introns consists of a short 5′ splice site boundary, a minimal AG dinucleotide 3′ splice site boundary, a catalytic adenosine and a polypyrimidine tract (PPT) 16. Mechanistically, intron retention is considered to be the result of weak splice site sequences that are not properly recognized by the spliceosome 18. Intron retention often inserts a premature termination codon in the mature transcript, which would then be degraded by nonsense-mediated decay 19. Therefore, its physiological significance has so far been overlooked.
Jmjd6 is a member of the JmjC-domain-containing proteins that are involved in a wide range of oxidation reactions 20. Jmjd6 was initially identified as a phosphatidylserine receptor that mediates recognition and engulfment of apoptotic cells 21. However, recent evidence indicates that Jmjd6 is a nuclear protein that catalyses lysyl hydroxylation of multiple substrates, including splicing regulatory proteins, transcription factors and histones, in a manner dependent on Fe(II) and 2-oxoglutarate 22–25. Jmjd6 deficiency in mice causes abnormal development of multiple organs during embryogenesis and leads to perinatal lethality 26–28; yet, its role in the immune system and immune responses remains unclear. Here we show that Jmjd6 plays a key role in induction of central tolerance by controlling Aire expression in mTECs. Although Jmjd6 deficiency did not affect the abundance of Aire transcript, intron 2 of the Aire gene was not effectively spliced out in the absence of Jmjd6, owing to its unique 3′ splice site sequence. As a result, the expression of Aire protein was markedly reduced in Jmjd6-deficient (Jmjd6−/−) mTECs, and T-cells generated in such thymic microenvironments caused multi-organ autoimmunity in mice. Our findings indicate that Aire protein expression is tightly controlled through two discrete steps, intron retention and its relief, the latter of which involves the enzymatic activity of Jmjd6.

Aire expression is induced in mTECs through interaction with cells such as lymphoid tissue inducers, positively selected TCRαβ+ thymocytes, and TCR Vγ5+ dendritic epidermal T-cell progenitors 30–33, which is mainly mediated by signals through tumour necrosis factor receptor family members, such as receptor activator of NF-κB (RANK), CD40 and lymphotoxin β receptor (LtβR) 34–36. When 2-DG-treated WT fetal thymic stroma was stimulated in vitro with RANK-ligand (RANKL), Aire expression was markedly induced in a subset of UEA-1+ mTECs (Fig. 2e and Supplementary Fig. 3c). Although CD40 ligand (CD40L) or anti-LtβR antibody alone was not effective, they acted synergistically with RANKL to augment Aire expression in WT mTECs (Fig. 2e and Supplementary Fig. 3c). However, Aire expression was hardly detected in Jmjd6−/− mTECs even after stimulation with RANKL in combination with CD40L or anti-LtβR antibody (Fig. 2e and Supplementary Fig. 3c). This was again confirmed by flow cytometric analyses (Fig. 2f). These results indicate that Jmjd6 is critical for Aire expression in mTECs.
Intronic regulation of Aire gene by Jmjd6. As Jmjd6 is a nuclear protein that catalyses lysyl hydroxylation of multiple substrates 22–25, it seemed likely that Jmjd6 deficiency affects transcription of genes both upstream and downstream of Aire. To comprehensively identify genes controlled by Jmjd6, we prepared 2-DG-treated fetal thymic stroma with or without RANKL stimulation from WT and Jmjd6−/− mice and analysed their transcriptomes by RNA sequencing (RNA-seq). Although the gene expression of Jmjd6 was unchanged by stimulation (Supplementary Fig. 4), 1,850 genes were induced in response to RANKL stimulation, among which 1,020 genes were expressed in WT fetal thymic stroma at significantly higher levels than in Jmjd6−/− samples (see top 200 genes in Supplementary Table 1). These included 23 genes encoding Aire-dependent TSAs such as insulin 2 and salivary protein 1 (refs 5,6,8) (Fig. 3a). This reduction of Aire-dependent TSAs was further confirmed by quantitative real-time PCR using samples from Jmjd6−/− fetal thymic stroma and Jmjd6−/− E18.5 thymi (Fig. 3b and Supplementary Fig. 5). On the other hand, Jmjd6 deficiency did not affect gene expression of the Aire-independent TSA glutamate decarboxylase 67 (GAD67) 5,6,8 (Fig. 3b and Supplementary Fig. 5). Similarly, gene expression of CD80 and CD40 was unaffected in the absence of Jmjd6 (Fig. 3c). Consistent with this finding, immunohistochemical analysis revealed that RANKL-induced CD80 induction occurred normally even in Jmjd6−/− mTECs (Fig. 3d). Unexpectedly, however, the abundance of Aire transcript was also comparable between RANKL-stimulated WT and Jmjd6−/− samples (Fig. 3b,c), suggesting that Jmjd6 controls Aire expression through a posttranscriptional mechanism.
To explore the underlying mechanism, we next analysed pre-mRNA splicing. Among 84,708 introns of detected genes, 1,051 introns were selected because they were expressed at a relatively high frequency (intronic FPKM > 10). Bioinformatics analysis identified 57 introns preferentially expressed in Jmjd6−/− thymic stroma under RANKL-stimulated conditions (Supplementary Table 2). These included intron 2 of Aire, which was found in RANKL-treated Jmjd6−/− samples at 1.9 times higher frequency than in the WT sample (Fig. 3e). This was further confirmed by RT-PCR followed by Southern blotting (Fig. 3f). Amplification of Aire cDNA with primers specific for the sequences of exons 1 and 10 yielded four bands corresponding to the Aire transcript with or without retention of intron 2, intron 9 and intron 2 plus intron 9 (Fig. 3g). By measuring the intensity of each band, 41.5% of the Aire transcripts expressed in RANKL-stimulated WT fetal thymic stroma were found to be the mature form, whereas this value decreased to 18.6% in Jmjd6−/− samples (Fig. 3g), because of an increase in the frequency of retention of intron 2 and/or intron 9. Similar results were obtained when E18.5 thymi of WT and Jmjd6−/− embryos were analysed with this method (Fig. 3h). It is known that Aire is also expressed in the reproductive organs and embryonic stem (ES) cells 8,37,38. To examine whether a similar mechanism operates in ES cells, we developed Jmjd6−/− ES cells and compared their Aire transcripts with those of WT ES cells. Jmjd6 deficiency in ES cells markedly increased the Aire transcript containing intron 2 (Fig. 4a), indicating that Aire expression in ES cells is also controlled by Jmjd6 through intron retention. Recent structural analysis of the catalytic domain of Jmjd6 indicated amino acid residues critical for binding to Fe(II), 2-oxoglutarate and substrate lysine 39.
When five of these amino acid residues were mutated to alanine (designated 5A mutant), the catalytic activity of Jmjd6 to hydroxylate lysine residues was completely lost (Fig. 4b).
Although the transient expression of WT Jmjd6 in Jmjd6−/− ES cells significantly, albeit incompletely, improved the ratio of mature Aire transcript (without intron 2) to immature transcript (with intron 2), such improvement was not achieved by the 5A mutant (Fig. 4c). These results suggest that Jmjd6 controls splicing events of the Aire gene depending on its enzymatic activity.
The nature of immature Aire protein. The retention of intron 2 results in the appearance of a premature termination codon at the N-terminal portion of Aire (Fig. 5a and Supplementary Fig. 6), leaving only 103 amino acid residues presumably intact. To determine the nature of this immature Aire protein generated by intron 2 retention, we first analysed its subcellular localization. As expected, the GFP-tagged mature Aire protein was localized to the nucleus when expressed alone in MEFs (Fig. 5a). On the other hand, the mCherry-tagged immature Aire protein preferentially accumulated in the cytoplasm, owing to the lack of the nuclear localization signal (Fig. 5a). Interestingly, localization of mature Aire protein changed from the nucleus to the cytoplasm when immature Aire protein was co-expressed (Fig. 5a). Exactly the same results were obtained when the GFP and mCherry tags were exchanged (Supplementary Fig. 7). This finding led us to examine whether immature Aire protein affects the stability of mature Aire protein. For this purpose, we developed HEK293 cells that constitutively express mature Aire protein, but inducibly express immature Aire protein when exposed to doxycycline (Fig. 5b). Doxycycline itself did not affect the stability of mature Aire protein (Fig. 5c). However, the amount of mature Aire protein decreased as the expression of immature Aire protein increased in response to doxycycline treatment (Fig. 5c).
Since this reduction of mature Aire protein level was partially inhibited by treating cells with the proteasome inhibitor MG132 (Fig. 5d), these results suggest that mature Aire protein sequestered in the cytoplasm becomes susceptible to proteasome-dependent protein degradation.
A cis-regulatory element for intron retention of Aire gene. To understand why intron 2 of the Aire gene is susceptible to intron retention, we compared the sequence structure between Aire intron 2 and other introns. Although most introns have the PPT site immediately upstream of the 3′ terminal AG dinucleotide 16, Aire intron 2 encodes GAG instead of the canonical pyrimidine-rich sequence at this position (Fig. 6a), resulting in a 'low' 3′ splice site score, which is calculated based on the similarity to the consensus sequence (Fig. 6a). Comparison of 57 retained introns with 188,151 unretained ones suggested that the degree of intron retention is associated with the 3′ splice site score (Fig. 6b). Indeed, PCR with reverse transcription (RT-PCR) analyses revealed that intron 3 of the S100pbp gene and intron 1 of the Cbr1 gene, both with low 3′ splice site scores, are preferentially retained in the absence of Jmjd6 (Supplementary Fig. 8).
To directly examine the effect of the GAG sequence on intron 2 retention, we created an Aire minigene containing exons 1-5 surrounded by their intronic regulatory sequences with GAG or TTT at the PPT site of intron 2 (designated GAG-type or TTT-type) (Fig. 6c). When the GAG-type minigene was expressed in WT or Jmjd6−/− MEFs, Jmjd6 deficiency markedly increased the transcript containing intron 2 (Fig. 6c), which was consistent with the results on endogenous Aire gene expression in mTECs and ES cells (Figs 3g and 4a). However, the TTT-type minigene yielded only mature transcript without intron 2 retention, irrespective of Jmjd6 expression (Fig. 6c). Thus, the GAG sequence acts as a cis-regulatory element that causes intron 2 retention and inhibits Aire protein expression. Interestingly, this GAG sequence at the PPT site of Aire intron 2 is highly conserved in mammals in the Euarchontoglires clade (Fig. 6d). Therefore, intron retention may have evolved as a mechanism to prevent overexpression of Aire protein in these species.

Discussion
Intron retention is widely accepted as a consequence of mis-splicing, and its significance has been overlooked. However, recent evidence indicates that intron retention has a physiological role in some biological settings such as granulopoiesis 40. Here we have demonstrated that the expression of Aire is controlled by Jmjd6 through intron retention. Although Jmjd6 deficiency did not affect the abundance of Aire transcript, intron 2 of the Aire gene was not effectively spliced out in the absence of Jmjd6, resulting in a marked reduction of mature Aire protein in mTECs. In both Jmjd6−/− E18.5 thymi and RANKL-stimulated Jmjd6−/− fetal thymic stroma, the reduction of Aire protein was more prominent than that of mature Aire transcript. The exact reason for this discrepancy remains unclear. However, reconstitution experiments revealed that immature Aire protein generated by intron 2 retention affects the subcellular localization and stability of mature Aire protein. Therefore, the ratio of mature Aire protein to the immature form might be a critical factor that determines the expression and function of Aire protein in the nucleus.
Ectopic expression of peripheral TSAs by mTECs has been viewed as an essential mechanism for induction of central tolerance 3–8. Consistent with a reduction of mature Aire protein in mTECs, RNA-seq analyses revealed that the expression of 23 Aire-dependent TSAs was markedly reduced in Jmjd6−/− thymic stroma. This was further confirmed by quantitative real-time PCR analyses. In addition, by grafting Jmjd6−/− thymic stroma into athymic C57BL/6 nude mice, we have shown that T-cells selected to mature in Jmjd6−/− thymic microenvironments caused multi-organ autoimmunity. As the number of thymocytes was significantly reduced in the grafted Jmjd6−/− thymus, disease manifestation in nu/nu Jmjd6−/− mice might be exaggerated by homeostatic T-cell proliferation 41. However, it has been reported that, while lymphocytes from Aire-deficient mice cause autoimmune disease when transferred into recombinase-activating gene (Rag)-deficient recipients, adoptive transfer of WT lymphocytes fails to induce disease under the same conditions 5,15. Thus, it is likely that reduction of Aire-dependent TSAs underlies disease development in nu/nu Jmjd6−/− mice.
Mechanistically, our results suggest that Aire expression is tightly controlled via two discrete steps. First, owing to the GAG sequence at the PPT site, intron 2 of the Aire gene is highly susceptible to intron retention in Euarchontoglires, and as a result, Aire protein expression is expected to be kept at a low level. As Aire has been reported to act as a proapoptotic factor 42, this static regulation may be important to avoid a deleterious effect of Aire overexpression on the immune system or reproductive organs. The second important regulation is relief of intron retention. This is a dynamic process involving the enzymatic activity of Jmjd6, thereby raising the possibility that metabolic status and oxygen tension may influence intron retention. Although the direct substrate of Jmjd6 in this context is currently unknown, Jmjd6 interacts with multiple splicing regulatory proteins including U2 small nuclear ribonucleoprotein auxiliary factor 65 kDa (U2AF65) 22,23. Therefore, it seems likely that Jmjd6 could alter the affinity of a given splicing factor for the 3′ splice site of the Aire gene through lysyl hydroxylation. Our findings thus define a previously unknown mechanism controlling expression of Aire protein, which is critical for establishment of immunological self in the thymus.

Methods
Mice. Jmjd6−/− mice have been described previously 27. Mice heterozygous for the mutant allele (Jmjd6+/−) were backcrossed onto a C57BL/6 background for more than 10 generations, and Jmjd6+/− mice were crossed to obtain Jmjd6−/− embryos. The morning of finding the vaginal plug was designated as E0.5. B6.Cg-Foxn1<nu>/Nrs (nude) female mice were purchased from Taconic or provided by RIKEN BRC through the National Bio-Resource Project of the MEXT, Japan, and were used as recipients of thymic grafts at the age of 6-8 weeks. Mice were kept under specific pathogen-free conditions in the animal facility of Kyushu University. The protocol of animal experiments was approved by the committee of Ethics of Animal Experiments, Kyushu University.
Cell preparation and culture. To enrich TECs, thymic lobes were prepared from E18.5 or 6-8-week-old C57BL/6 mice, cut into small pieces, and dispersed further by pipetting to remove the majority of thymocytes. The resulting thymic fragments were digested with 0.125% (w/v) collagenase D/dispase (Roche) and 0.1% (w/v) DNase I (Roche) in RPMI1640 medium for 1 h at 37 °C. The supernatants containing dissociated TECs were centrifuged and washed with PBS. Before cell sorting, TECs were further enriched by depleting CD45+ haematopoietic cells using CD45 MicroBeads (Miltenyi Biotec), and stained with the relevant antibodies and reagents. Then, mTECs and cTECs were sorted as CD45−MHC class II+UEA-1+ cells and CD45−MHC class II+UEA-1− cells, respectively. ES cells were developed from E3.5 blastocysts by using standard procedures. Integrity of ES cells was confirmed by staining them for alkaline phosphatase with a kit (Wako Pure Chemical Industries). ES cells were cultured in knockout D-MEM medium (Life Technologies) supplemented with 15% FCS (Life Technologies), 50 µM 2-mercaptoethanol (Nacalai tesque), 2 mM L-glutamine (Life Technologies), 100 U ml⁻¹ penicillin (Life Technologies), 100 µg ml⁻¹ streptomycin (Life Technologies) and ESGRO leukemia inhibitory factor (LIF) (Millipore) at a final concentration of 1,500 units per ml. Primary MEFs were generated from E13.5 WT and Jmjd6−/− embryos and immortalized by transfection with the plasmid pCX4bsr-SV40ER (provided by T. Akagi, KAN Research Institute, Kobe, Japan). Immortalized MEFs were cultured in D-MEM medium (Wako Pure Chemical Industries) supplemented with 10% FCS (Nichirei Bioscience), 100 U ml⁻¹ penicillin (Life Technologies) and 100 µg ml⁻¹ streptomycin (Life Technologies).
RNA-seq analysis and 3′ splice site scoring. Total RNA was extracted from WT and Jmjd6−/− FTOC samples with (two samples for each category) or without (one sample for each category) RANKL stimulation. One microgram of total RNA was used for library construction with the TruSeq RNA Sample Prep kit v2 (Illumina) according to the manufacturer's protocol. Briefly, poly-A-containing mRNAs were purified using poly-T oligo-attached magnetic beads. The purified mRNAs were fragmented using divalent cations under elevated temperatures and then converted to dsDNA by two rounds of cDNA synthesis using reverse transcriptase and DNA polymerase I. After an end repair process, DNA fragments were ligated with adaptor oligos. The ligated products were amplified by eight cycles of PCR to generate the RNA-seq library. Library integrity was verified by Bioanalyzer DNA1000 assay (Agilent Technologies). Sequencing was performed in 101-bp paired-end mode using an Illumina HiSeq (Illumina). A total of 177,060,020 reads were obtained for six samples. Filtered reads were mapped to the UCSC mm10 using the TopHat program (v2.0.10) 43 with the default parameters. The Cufflinks program (v2.1.1) 44 was then used to assemble 22,448 transcripts and to calculate the fragments per kilobase of exon per million mapped fragments (FPKM) values, which are normalized measurements of gene expression levels, with the non-default parameters: -u --library-type fr-secondstrand. To identify differentially expressed genes, the ratio of the maximum FPKM to the minimum FPKM was compared among the six samples. When the ratio was more than 3, the gene was regarded as being significantly altered in expression level. We added 0.1 to the FPKM value to avoid division by 0. This led us to identify 3,212 genes with differential expression. Among these, the expression levels of 2,536 genes were significantly associated with either RANKL treatment or Jmjd6 expression (P-value < 0.05), and these genes were used for further analyses.
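The max/min FPKM rule described above can be sketched as follows. This is a minimal illustration only, not the analysis script used in the study: the function names and the example FPKM values are hypothetical, and only the pseudocount (0.1) and the ratio cutoff (3) come from the text.

```python
# Minimal sketch of the max/min FPKM ratio rule (illustrative, not the authors' code).
PSEUDOCOUNT = 0.1   # added to each FPKM value to avoid division by 0 (from the text)
RATIO_CUTOFF = 3.0  # a gene is called altered when max/min exceeds this (from the text)


def max_min_ratio(fpkms, pseudocount=PSEUDOCOUNT):
    """Return (max FPKM + pseudocount) / (min FPKM + pseudocount) across samples."""
    shifted = [value + pseudocount for value in fpkms]
    return max(shifted) / min(shifted)


def is_altered(fpkms, cutoff=RATIO_CUTOFF):
    """True when the max/min ratio across the samples is more than the cutoff."""
    return max_min_ratio(fpkms) > cutoff


# Hypothetical FPKM values for six samples; gene names are placeholders.
example = {
    "geneA": [0.0, 0.0, 5.2, 4.8, 0.3, 0.2],
    "geneB": [10.0, 11.0, 12.0, 10.5, 9.8, 10.2],
}
altered = [gene for gene, values in example.items() if is_altered(values)]
print(altered)
```

The pseudocount matters mainly for genes that are silent in at least one sample: without it the ratio for geneA would be undefined, whereas with it the gene is simply reported as strongly altered.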
Analysis of intron retention was performed as follows. According to the current gene annotation ('known genes' in UCSC mm10), there are 188,208 introns in total. As intron retention events should be observed in genes with relatively high expression, we focused only on genes with a maximum FPKM value of more than 10 in at least one of the six samples. As a result, we obtained 84,708 introns. The reads mapped to these intronic regions were counted by the intersectBed program in the BEDTools utilities (v2.17.0) 45 with the -c option, and the counts were converted into FPKM values for each intron (intronic FPKM). There were 1,051 introns with intronic FPKM more than 10 in at least one of the six samples, and the degree of intron retention (IR value) was calculated by dividing the intronic FPKM value by the conventional FPKM value for each gene. By requiring the ratio of the IR value in the Jmjd6−/− sample to that in the WT sample to be more than 1.5, we finally selected 57 introns that are preferentially expressed in Jmjd6−/− samples under RANKL stimulation. The 3′ splice site score was calculated by the 'Splice-Site analyser tool' (http://ibis.tau.ac.il/ssat/SpliceSiteFrame.htm) 46. Briefly, the score expresses to what extent the splice-site sequences match the following consensus sequence: TTTTTTTTTTTCAG/G ('/' indicates the intron/exon junction).
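The IR-value filter described above can be illustrated with a short sketch. It assumes a simplified setting with one WT and one Jmjd6−/− value per intron (how the RANKL-stimulated replicates were combined is not specified here), and the record type, function names and example numbers are hypothetical; the thresholds (FPKM more than 10, IR ratio more than 1.5) are taken from the text.

```python
# Minimal sketch of the intron retention (IR) filter (illustrative, not the authors' code).
from dataclasses import dataclass


@dataclass
class IntronRecord:
    name: str
    intron_fpkm_wt: float  # intronic FPKM in the WT sample
    gene_fpkm_wt: float    # conventional gene FPKM in the WT sample
    intron_fpkm_ko: float  # intronic FPKM in the Jmjd6-/- sample
    gene_fpkm_ko: float    # conventional gene FPKM in the Jmjd6-/- sample


def ir_value(intron_fpkm, gene_fpkm):
    """IR value: intronic FPKM divided by the conventional FPKM of the gene."""
    return intron_fpkm / gene_fpkm


def preferentially_retained(rec, min_fpkm=10.0, ratio_cutoff=1.5):
    """Apply the expression thresholds, then the knockout/WT IR ratio cutoff."""
    # Assumption: introns of genes with zero FPKM in either sample are skipped.
    if rec.gene_fpkm_wt <= 0 or rec.gene_fpkm_ko <= 0:
        return False
    if max(rec.gene_fpkm_wt, rec.gene_fpkm_ko) <= min_fpkm:
        return False
    if max(rec.intron_fpkm_wt, rec.intron_fpkm_ko) <= min_fpkm:
        return False
    ir_wt = ir_value(rec.intron_fpkm_wt, rec.gene_fpkm_wt)
    ir_ko = ir_value(rec.intron_fpkm_ko, rec.gene_fpkm_ko)
    # Written as a product so that an IR value of 0 in WT does not divide by zero.
    return ir_ko > ratio_cutoff * ir_wt


# Hypothetical records: (name, intron WT, gene WT, intron KO, gene KO).
records = [
    IntronRecord("intron_a", 12.0, 60.0, 24.0, 60.0),  # IR 0.20 -> 0.40
    IntronRecord("intron_b", 12.0, 60.0, 13.0, 60.0),  # IR 0.20 -> ~0.22
    IntronRecord("intron_c", 2.0, 5.0, 6.0, 5.0),      # below expression thresholds
]
selected = [rec.name for rec in records if preferentially_retained(rec)]
print(selected)
```

Normalizing the intronic signal by the gene-level FPKM is what separates increased retention from a general increase in transcription: intron_a doubles its IR value at constant gene expression and is therefore selected.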
Real-time PCR was performed on an ABI PRISM 7000 Sequence Detection System using the SYBR Green PCR Master Mix (both from Applied Biosystems). The following PCR primers were used: Gapdh;
Plasmids and transfection. For expression of N-terminally tagged GFP- or mCherry-fusion proteins in mammalian cells, an expression vector was created by subcloning a cDNA encoding EGFP or mCherry into pCI (Promega). The genes encoding the full-length Aire protein (mature Aire) and the truncated Aire protein (immature Aire) containing only exon 1 and exon 2 owing to intron 2 retention were subcloned into these vectors to analyse their subcellular localization in MEFs. Transfection was performed with Lipofectamine 2000 reagent (Life Technologies). To create an inducible expression system, the gene encoding HA-tagged immature Aire protein was subcloned into the pTRE2hyg vector (Clontech) (designated pTRE2hyg-HA-immature Aire). After electroporation of pCI-GFP-mature Aire and pTRE2hyg-HA-immature Aire or pTRE2hyg into HEK293 Tet-On Advanced cells (Clontech), cells were selected with 150 µg ml⁻¹ hygromycin B (Wako Pure Chemical Industries) to develop stable transfectants. The Aire minigene construct (GAG-type; 2.4 kb) was amplified using mouse genomic DNA and subcloned into TArget Clone™ Plus (TOYOBO Co. Ltd.). After digestion with Sal I and Not I, the minigene construct was inserted into the pSI vector (Promega) for transfection. The TTT-type minigene was generated by site-directed mutagenesis. The gene encoding the C-terminally GFP-tagged WT Jmjd6 was subcloned into the pCI vector (designated pCI-Jmjd6-GFP). The 5A mutant encoding alanines instead of H187, D189, K204, T285 and N287 was generated by site-directed mutagenesis. For the splicing assay, Aire minigene plasmids (1 µg) were transfected into immortalized WT or Jmjd6−/− MEFs (2 × 10⁵ cells) in 6-well plates with Lipofectamine 2000 reagent.
To express Jmjd6 and its mutant in Jmjd6−/− ES cells, these ES cells were electroporated with the pCI-Jmjd6-GFP vector (WT or 5A mutant) by using the Mouse ES Cell Nucleofector Kit (Lonza/Amaxa Biosystems). The GFP-positive ES cells were then sorted by FACSAria for RNA extraction 30 h after transfection.
Statistical analysis. For statistical analysis, P values were calculated with a two-tailed unpaired Student's t-test. P values < 0.05 were considered significant. Error bars denote ± s.d.
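For readers who wish to reproduce this type of comparison, a two-tailed unpaired Student's t-test (equal-variance form) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the software used in the study: the function names and example measurements are hypothetical, and the P value is obtained by numerically integrating the t density with Simpson's rule, which is adequate for illustration but not for very small P values.

```python
# Minimal sketch of a two-tailed unpaired Student's t-test (illustrative only).
import math
import statistics


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) using the pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err, df


def _t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value: 1 - 2 * integral of the density from 0 to |t| (Simpson)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


# Hypothetical measurements for two groups of three samples each.
group_wt = [1.0, 2.0, 3.0]
group_ko = [4.0, 5.0, 6.0]
t_stat, dof = student_t(group_wt, group_ko)
p_value = two_tailed_p(t_stat, dof)
print(f"s.d. = {statistics.stdev(group_wt):.2f}, t = {t_stat:.3f}, df = {dof}, P = {p_value:.4f}")
```

With these example values the difference would be called significant at the P < 0.05 threshold used here; the sample standard deviation printed alongside corresponds to the quantity shown by the error bars.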