A clonotypic Vγ4Jγ1/Vδ5Dδ2Jδ1 innate γδ T-cell population restricted to the CCR6+CD27− subset

Here we investigate the TCR repertoire of mouse Vγ4+ γδ T cells in correlation with their developmental origin and homeostasis. By deep sequencing we identify a high frequency of straight Vδ5Dδ2Jδ1 germline rearrangements without P- and N-nucleotides within the otherwise highly diverse Trd repertoire of Vγ4+ cells. This sequence is infrequent in CCR6−CD27+ cells, but abundant among CCR6+CD27− γδ T cells. Using an inducible Rag1 knock-in mouse model, we show that γδ T cells generated in the adult thymus rarely contain this germline-rearranged Vδ5Dδ2Jδ1 sequence, confirming its fetal origin. Single-cell analysis and deep sequencing of the Trg locus reveal a dominant CDR3 junctional motif that completes the TCR repertoire of invariant Vγ4+Vδ5+ cells. In conclusion, this study identifies an innate subset of fetal thymus-derived γδ T cells with an invariant Vγ4+Vδ5+ TCR that is restricted to the CCR6+CD27− subset of γδ T cells. Functional diversity of T cells expressing antigen receptors composed of γ and δ chains is just beginning to be appreciated. Here the authors show that within γδ T cells with a highly diverse Vγ4+ repertoire there is a population with a germline-rearranged, invariant TCR and a distinct phenotype.

γδ T cells use Rag-mediated V(D)J recombination to rearrange T-cell antigen receptors (TCRs) consisting of γ and δ chains. In theory, the potential junctional diversity of γδ TCRs is in the range of 10^18, and thus several orders of magnitude higher than that of αβ TCR or Ig rearrangements 1. However, γδ T cells are often regarded as innate T cells that use their TCR recombination machinery to generate identical γδ TCRs of limited diversity. This perception is based on the occurrence of a few prominent γδ T-cell subsets with little or no TCR junctional diversity, in which anatomical localization and function correlate with invariant γδ TCRs 2,3. Specifically, the mouse skin epidermis contains a specialized γδ T-cell population of dendritic epidermal T cells (DETCs) with a fixed TCR composed of invariant Vγ5Jγ1Cγ1 and germline-rearranged Vδ1Dδ2Jδ2 without P- or N-nucleotides (Tonegawa nomenclature) 4,5. The same canonical Vδ1Dδ2Jδ2 chain is employed in combination with an invariant Vγ6Jγ1Cγ1 TCR chain in interleukin (IL)-17-producing Vγ6/Vδ1 T cells. These Vγ6/Vδ1 cells were initially thought to be restricted to the uterus and the tongue 6, but subsequently were also found in most other tissues including the lung 7, liver 8, dermis 9,10, secondary lymphoid organs 11 and intestinal lamina propria 12. Finally, IL-4-producing Vγ1+ γδ NKT cells with restricted Vδ6Dδ2Jδ1 junctions and semi-invariant Vγ1Jγ4Cγ4 junctions are preferentially localized in the liver and spleen 13,14. In contrast, γδ T cells circulating in the blood and secondary lymphoid organs mostly contain either Vγ1 or Vγ4 rearrangements and are thought to have highly diverse TCR repertoires 3,15-18.
Recently, γδ T-cell populations grouped on the basis of TCR γ-chain usage were included in the ImmGen transcriptome database 19. The data suggested that γδ effector T-cell function correlated with γδ TCR usage also in Vγ4+ T cells. Fetal thymic, and to a lesser extent splenic, Vγ4+ T cells were recognized as γδ effector T cells associated with IL-17 production 19. However, the pool of Vγ4+ T cells is heterogeneous and contains both innate cells with IL-17-producing capacity and cells that are biased to interferon (IFN)-γ production. These two populations can be segregated according to a CCR6+CD27− or CCR6−CD27+ surface marker phenotype, respectively 20-22. In addition, Vγ4+ T cells also comprise CD27+CD45RBhigh cells, a subset that readily produces IFN-γ upon stimulation with IL-18 and IL-12 (ref. 23), similar to NK1.1+ γδ T cells 21. Moreover, the requirements for final differentiation into effector cells may vary between γδ T-cell types depending on their ontogeny 24. For example, it was recently proposed that Vγ4+ T cells but not Vγ6+ T cells require extrathymic maturation for imprinting of skin-homing properties and acquisition of an IL-17-producing phenotype 25.
To address the correlation of Vγ4+ TCR and Vγ4+ T-cell function in more detail, we performed a high-resolution analysis of the mouse γδ TCR repertoire. Focusing on Vγ4+ T cells, rearranged Trd and Trg loci of the respective subsets were sequenced using 454 high-throughput sequencing technology. We found striking differences between functionally different subsets of Vγ4+ T cells. Importantly, this study identifies invariant Vγ4+Vδ5+ T cells as a novel innate T-cell population that is abundant only among CCR6+CD27− γδ T cells, which are known for their IL-17-producing phenotype 10,18,20-22,24,26,27.

Results
Highly diverse Trd repertoire in Vγ1+ and Vγ4+ cells. Most γδ T cells in the blood and secondary lymphoid organs display a TCR that comprises either Vγ1 or Vγ4 rearrangements. In contrast to fetal Vγ5+ DETCs or invariant Vγ6+ γδ T cells, which have a restricted TCR repertoire of very limited diversity, circulating Vγ1+ or Vγ4+ T cells are assumed to be far more polyclonal 18. To investigate their TCR diversity in depth, we sorted these two major subsets of peripheral γδ T cells on the basis of a γδ T-cell-specific reporter fluorescence 28 and co-staining with monoclonal antibodies (mAbs) directed against their Vγ1+ or Vγ4+ TCR. Next, mRNA of these samples was amplified by rapid amplification of cDNA ends (RACE) using a gene-specific primer within the first exon of the constant gene segment of the Trd locus to amplify all rearranged VDJ combinations (Fig. 1a). We observed differentially biased recombination of Trd chains in Vγ1+ and Vγ4+ cells. Approximately 60% of the Trd chains from Vγ1+ samples contained a Vδ6 segment, followed by Vδ2, Vδ1, Vδ7 and Vδ12 (Fig. 1b). Trd chains from Vγ4+ samples contained mainly rearrangements of Vδ7 segments (~55%), followed by Vδ2, Vδ5 and Vδ6 (Fig. 1c). This extends but is largely consistent with prior studies based on Vδ-specific mAbs 29. Further analysis of individual CDR3 sequences revealed a highly diverse repertoire in both Vγ1+ and Vγ4+ cells (Supplementary Fig. 1). A notable exception to broad Trd polyclonality was observed only in Vδ5 rearrangements within the pool of Vγ4+ cells, where one Vδ5+Dδ2+Jδ1+ sequence was remarkably abundant (Supplementary Fig. 1).
Abundant Vδ5Dδ2Jδ1 rearrangements in Vγ4+ cells. To further investigate the occurrence of the abundant Vδ5Dδ2Jδ1 sequence in γδ T cells, we applied primers that specifically amplified CDR3 regions located between Vδ5 and Jδ1 segments (Fig. 2a). With this independent approach, we confirmed the data obtained by RACE analysis. On the amino-acid level, ~30% of Vδ5+ Trd sequences in Vγ4+ cells isolated from secondary lymphoid organs showed a unique arrangement of Vδ5, Dδ2 and Jδ1 segments, whereas this combination was absent in Vγ1+ T cells and represented less than 0.3% of Vδ5+ Trd sequences in intestinal Vγ7+ cells (Fig. 2b). Importantly, the Vδ5, Dδ2 and Jδ1 segments were directly rearranged without P- and N-nucleotides (hereafter called germline rearrangements) in more than 95% of these canonical Trd sequences, while other nucleotide sequences coding for the same CDR3 amino-acid sequence were uncommon (Table 1). Next, we hypothesized that these canonical Vδ5Dδ2Jδ1 sequences originated from Vγ4+ γδ T cells that had developed in the fetal thymus, similar to the invariant Trd chains found in Vγ5+ DETCs and in Vγ6+ γδ T cells, whose TCRs share canonical N-nucleotide-lacking Vδ1Dδ2Jδ2 rearrangements 4-6. To test this, we sequenced the Vδ5 repertoire of Vγ4+ γδ T cells sorted from either wild-type TcrdH2BeGFP control mice (wt) or from Indu-Rag1 mice 30 crossed with TcrdH2BeGFP mice. In the latter, deficient V(D)J recombination had been restored in the adult stage by tamoxifen-induced Cre recombinase expression. Only ~1% of sequences were canonical Vδ5Dδ2Jδ1 rearrangements when only adult γδ T-cell development had been possible in Indu-Rag1 mice, as compared with more than 30% in control Vγ4+ γδ T cells (Fig. 2c). These results suggest that the large majority of canonical Vγ4+/Vδ5Dδ2Jδ1+ γδ T cells are generated early in ontogeny, presumably in the embryonic thymus.
Canonical rearrangements in CCR6+CD27− γδ T cells. Next, we tested whether Vγ4+ cells with canonical Vδ5Dδ2Jδ1 rearrangements were contributing to the pool of IL-17-producing Vγ4+ γδ T cells, as these were recently discovered to develop exclusively before birth and subsequently persist in adult mice as self-renewing, long-lived cells 11. Such IL-17-producing effector γδ T cells are contained within a population of γδ T cells identified by a CCR6+CD27− surface marker phenotype 20-22. Therefore, we sorted Vγ4+ T cells from peripheral lymph nodes (pLNs) and spleen of adult TcrdH2BeGFP mice into CCR6−CD27+ and CCR6+CD27− subsets, sequenced their Trd loci and compared them with each other. Clearly, canonical Vδ5Dδ2Jδ1 rearrangements were highly enriched in the CCR6+CD27− subset (Fig. 2d). The corresponding CDR3 amino-acid motif ASGYIGGIRATDKLV from CCR6+CD27− Vγ4+ T cells was principally but not exclusively encoded by Vδ5Dδ2Jδ1 rearrangements unmodified by P- or N-nucleotides (Fig. 3). Together, these results suggest that CCR6+CD27− Vγ4+ T cells have a distinct TCR repertoire that comprises canonical Vδ5Dδ2Jδ1+ γδ T cells. The data further support the hypothesis that IL-17-producing CCR6+CD27− Vγ4+ T cells are principally generated within a functional wave of embryonic γδ T-cell development 11,18.
Ontogeny and organ distribution of Vδ5Dδ2Jδ1+ T cells. To further explore the potential fetal thymic origin of canonical Vδ5Dδ2Jδ1+ Vγ4+ T cells, we sorted Vγ4+ T cells derived from the thymus at different stages during ontogeny and sequenced the Trd repertoire of Vδ5-Jδ1 amplicons (Fig. 4a). The invariant Vδ5Dδ2Jδ1 sequence was already abundant in the fetal thymic Trd repertoire of Vγ4+ T cells on embryonic day 18 (E18, 15% of all Vδ5 sequences), but also persisted in juvenile (11-17%) or adult thymi (7% at 4 months; Fig. 4a and Supplementary Fig. 2). These data further support the hypothesis that IL-17-producing γδ T cells, in particular those with the invariant Vδ5Dδ2Jδ1 clonotype, are already generated in the fetal thymus, but later persist as resident cells in the juvenile and adult thymus 11,24. Moreover, these findings are consistent with a previous ontogeny study (ref. 31). However, invariant Vδ5Dδ2Jδ1+ T cells were even more abundant in peripheral lymphoid organs than in the thymus, suggesting peripheral homeostatic expansion of the subset. Notably, invariant Vδ5Dδ2Jδ1+ Trd sequences made up 43% of all Vδ5 sequences of Vγ4+ T cells in pLNs and 27% in the spleen (Fig. 4b).
Because the TCR repertoire as well as the developmental requirements of Vγ4+ T cells vary with location 25, we investigated different peripheral anatomical sites separately. By deep sequencing of Vδ5-Jδ1 amplicons we examined the TCR repertoire of Vγ4+ T cells derived from other organs including the lung, skin and liver (Fig. 4b and Supplementary Fig. 3). The invariant Vδ5Dδ2Jδ1 sequence also dominated the Trd repertoire of Vγ4+ T cells in peripheral tissues such as the skin (27%), lung (15%) and liver (25%), albeit to a lesser extent than in pLNs. Across all samples derived from the pLN and spleen, the Vδ5Dδ2Jδ1 clonotype made up ~35% of the Vδ5 repertoire in Vγ4+ T cells. Since ~10% of Vγ4+ T cells used the Vδ5 segment (as shown by RACE analyses in Fig. 1), clonotypic cells constitute ~3.5% of all Vγ4+ T cells. Thus, with Vγ4+ T cells accounting for up to 50% of all γδ T cells derived from the pLN 32, the actual rate of clonotypic invariant Vδ5Dδ2Jδ1+ γδ T cells is 1.5-2% among all γδ T cells. This remarkably high frequency is in the range of invariant IL-17A-producing Vγ6Vδ1+ T cells, which constitute 2-4% of all γδ T cells in secondary lymphoid organs 33,34. Nevertheless, the only moderate expansion of this subset in peripheral tissues as compared with the thymus underlines their genuine innate nature.
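The frequency estimate in the preceding paragraph is a product of three fractions. The short sketch below simply reproduces that arithmetic with the rounded values quoted in the text; the variable names are illustrative and the 50% figure is the upper bound stated above, so the result should be read as an approximation.

```python
# Fractions quoted in the text (rounded values, not raw read counts).
clonotype_among_vd5 = 0.35  # Vδ5Dδ2Jδ1 share of Vδ5 sequences (pLN and spleen)
vd5_among_vg4 = 0.10        # share of Vγ4+ T cells using Vδ5 (RACE, Fig. 1)
vg4_among_gd = 0.50         # upper bound for the Vγ4+ share of pLN γδ T cells

# Clonotypic cells as a fraction of Vγ4+ T cells, then of all γδ T cells.
clonotype_among_vg4 = clonotype_among_vd5 * vd5_among_vg4
clonotype_among_gd = clonotype_among_vg4 * vg4_among_gd

print(f"{clonotype_among_vg4:.1%} of Vγ4+ T cells")
print(f"{clonotype_among_gd:.2%} of all γδ T cells")
```

With these inputs the product falls inside the 1.5-2% range given in the text.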
Next, we asked whether TCR diversity in noninvariant Vγ4+Vδ5+ T cells among the CCR6+CD27− subset was also different from that of CCR6−CD27+ Vγ4+ T cells (Fig. 5). Within an equally sized pool of in-frame amino-acid sequences obtained from both populations, CCR6−CD27+ Vγ4+ T cells had a higher proportion of unique Vδ5 rearrangements that were present only once in the tested sample (singletons), while CCR6+CD27− Vγ4+ T cells showed fewer singletons (Fig. 5a). Consequently, samples from CCR6+CD27− Vγ4+ T cells contained a higher frequency of sequences that were detected several times. In Fig. 5b, in which all sequences are represented according to their actual frequency among all sequences, such clones were considered as expanded (1-1.99%) or highly expanded (≥2%).
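The clone categories used in this paragraph can be illustrated with a small Python sketch. It assumes a plain list of in-frame CDR3 amino-acid sequences as input; the function name, the "low frequency" label for clones seen more than once but below 1%, and the toy reads are illustrative assumptions rather than part of the original analysis.

```python
from collections import Counter

def classify_clones(cdr3_sequences):
    """Label each distinct CDR3 by its frequency among all sequences."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    classes = {}
    for seq, n in counts.items():
        freq = n / total
        if n == 1:
            label = "singleton"        # present only once in the sample
        elif freq >= 0.02:
            label = "highly expanded"  # 2% or more of all sequences
        elif freq >= 0.01:
            label = "expanded"         # 1-1.99% of all sequences
        else:
            label = "low frequency"    # hypothetical label, not from the text
        classes[seq] = (label, freq)
    return classes

# Toy sample of 200 reads; only the first CDR3 is taken from the text.
reads = (["ASGYIGGIRATDKLV"] * 60
         + ["CLONE_B"] * 3
         + [f"UNIQUE_{i}" for i in range(137)])
classes = classify_clones(reads)
print(classes["ASGYIGGIRATDKLV"], classes["CLONE_B"])
```

Checking the singleton condition first keeps a sequence seen once from being called expanded in very small samples.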
To quantify the TCR diversity of the respective two γδ T-cell subsets, we calculated the effective number of sequences using the Shannon index, which considers the number of observed individual sequences as well as their abundance in the repertoire 35,36. The CCR6+CD27− subset was three times less diverse than CCR6−CD27+ cells, and still two times less diverse when the dominant canonical Vδ5Dδ2Jδ1 sequence was excluded from the calculation (Fig. 5c). Qualitatively, we observed essentially no overlapping Vδ5 sequences between two independently analysed samples of CCR6−CD27+ Vγ4+ cells. The only exception was the canonical sequence, which was found repeatedly, albeit at low frequency (Fig. 5d, left panel). In contrast, two independent samples of CCR6+CD27− Vγ4+ T cells showed an overlap of 29 Trd sequences in addition to the canonical Vδ5Dδ2Jδ1 sequence (Fig. 5d, right panel). Accordingly, the similarity of two independent samples, calculated as the Morisita-Horn index, was low (0.003) for CCR6−CD27+ and high (0.889) for CCR6+CD27− Vγ4+ T cells. In conclusion, CCR6+CD27− Vγ4+ T cells, which are known for their IL-17-production capacity, display TCR repertoires of limited diversity and a considerable TCR sequence overlap between independent samples derived from individual mice. This is consistent with the view that these cells develop as a population of prewired innate effector cells with limited TCR diversity in the fetal thymus to persist and expand after birth.

[Figure panel titles: Vδ5 chain from Vγ4+CCR6−CD27+ wt pool; Vδ5 chain from Vγ4+CCR6+CD27− wt pool]
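For readers who want to see what the two summary statistics measure, the following Python sketch computes an effective number of sequences as the exponential of the Shannon entropy, and the Morisita-Horn similarity between two samples. It is an illustration under stated assumptions: the study itself reports using the Vegan R package for Shannon indices, the function names are hypothetical, and the toy counts are not data from the paper.

```python
import math

def effective_number(counts):
    """exp(Shannon entropy) of a {sequence: read count} mapping."""
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total)
                   for n in counts.values() if n > 0)
    return math.exp(entropy)

def morisita_horn(a, b):
    """Morisita-Horn similarity between two {sequence: read count} mappings."""
    na, nb = sum(a.values()), sum(b.values())
    cross = sum(n * b.get(seq, 0) for seq, n in a.items())
    da = sum(n * n for n in a.values()) / na ** 2
    db = sum(n * n for n in b.values()) / nb ** 2
    return 2 * cross / ((da + db) * na * nb)

# Toy repertoires (hypothetical counts).
even = {"s1": 25, "s2": 25, "s3": 25, "s4": 25}
skewed = {"s1": 97, "s2": 1, "s3": 1, "s4": 1}
disjoint = {"t1": 50, "t2": 50}

print(effective_number(even), effective_number(skewed))
print(morisita_horn(even, even), morisita_horn(even, disjoint))
```

An evenly distributed repertoire of four sequences has an effective number of four, whereas a repertoire dominated by one clone scores close to one. The Morisita-Horn index ranges from 0 for samples with no shared sequences to 1 for identical compositions.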
The corresponding Vγ4 repertoire. To complement the Trd repertoires, we next determined the respective Vγ4 repertoires of sorted Vγ4+ T-cell populations. Sequencing was performed using a Vγ4-specific primer in combination with a consensus primer amplifying Jγ1, Jγ2 and Jγ3 (Fig. 6). Consistent with previous findings 2,37, overall diversity of the rearranged Trg loci was lower than at Trd loci, and more than 99% of Vγ4 rearrangements involved Jγ1. Analysis of CDR3 amino-acid sequence length distribution and composition of Vγ4 chain repertoires revealed an expanded Vγ4 motif, SYGXYSSGFHKV. Notably, this dominant motif was abundant in Vγ4+ T cells derived from wt and Indu-Rag1 mice (Fig. 6a) as well as in CCR6−CD27+ and CCR6+CD27− Vγ4+ T cells (Fig. 6b). This is in contrast to our findings for the canonical Vδ5Dδ2Jδ1 Trd locus rearrangement, which was abundant only in fetal thymus-derived CCR6+CD27− γδ T cells. Nevertheless, the Vγ4 repertoire of the IL-17-producing CCR6+CD27− subset was even more focused towards the SYGXYSSGFHKV consensus CDR3 sequence than that of CCR6−CD27+ Vγ4+ T cells (Fig. 6b). Among all Vγ4 chain sequences analysed, a leucine residue was most frequently found at the variable position X of this CDR3 motif, followed by serine, arginine, proline and subsequently all other amino acids (Fig. 6c). While a stop codon within the germline Vγ4 segment precludes true germline rearrangements, the two most frequent nucleotide sequences coding for SYGLYSSGFHKV were still rather germline-like as they both lacked N-nucleotides (Table 2). Together, these two sequences accounted for one-third of all in-frame sequences obtained from the CCR6+CD27− subset of Vγ4+ T cells (Fig. 6d). In conclusion, sequence analysis of the corresponding Vγ4 repertoire is consistent with the view that rearrangement at the γ and δ TCR loci is differentially controlled.
Furthermore, because the semi-invariant Vγ4 sequences are also present in CCR6−CD27+ Vγ4+ T cells and in T cells from Indu-Rag1 mice, they would not alone be sufficient for any potential positive selection of the IL-17-producing CCR6+CD27− subset of Vγ4+ T cells.
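A tally of the residue at the variable position X of the SYGXYSSGFHKV motif can be sketched with a regular expression. This is a minimal illustration, assuming CDR3 amino-acid strings as input and allowing flanking residues around the motif; the function name and the toy sequences are hypothetical and do not reproduce the counts shown in Fig. 6c.

```python
import re
from collections import Counter

# One capture group for the single variable residue X of the motif.
MOTIF = re.compile(r"SYG([A-Z])YSSGFHKV")

def tally_variable_position(cdr3_list):
    """Count which amino acid occupies position X in motif-bearing CDR3s."""
    tally = Counter()
    for cdr3 in cdr3_list:
        match = MOTIF.search(cdr3)
        if match:
            tally[match.group(1)] += 1
    return tally

# Toy input; the fourth entry lacks the X residue and is therefore skipped.
toy_cdr3 = ["SYGLYSSGFHKV", "SYGLYSSGFHKV", "SYGSYSSGFHKV",
            "SYGYSSGFHKV", "SYGRYSSGFHKV"]
tally = tally_variable_position(toy_cdr3)
print(tally.most_common())
```

Sequences one residue shorter than the motif do not match the pattern, so such variants would need to be counted separately.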
Invariant Vδ5Dδ2Jδ1 chains pair with a canonical Vγ4 chain. Finally, we sought to identify the corresponding TCR γ and TCR δ chain pairs that constituted the TCR heterodimer of IL-17-producing CCR6+CD27− Vγ4+ T cells. To this end, we performed single-cell PCR from cDNA of sorted Vγ4+CCR6+CD27− cells. Of 20 individual T cells with a Vδ5Dδ2Jδ1 Trd germline rearrangement, 7 had Trg rearrangements coding for the most frequent Vγ4 CDR3 sequence SYGLYSSGFHKV, 12 displayed other variations of the SYGXYSSGFHKV motif and one clone showed a shorter version of this CDR3, namely SYG_YSSGFHKV (Table 3). Together, these results suggest that the pool of fetal thymus-derived IL-17-producing γδ T cells of the CCR6+ lineage contains a population of invariant Vγ4+ cells with a TCR composed of a germline-rearranged Vδ5Dδ2Jδ1 chain and a canonical Vγ4Jγ1 chain motif.

Discussion
This study focused on the correlation between TCR sequence and effector phenotype in Vγ4+ T cells, which constitute a heterogeneous population of mouse γδ T cells. It comprised high-throughput generation of hundreds of thousands of sequences of rearranged Trg and Trd genes and constitutes, to our knowledge, the first deep-sequencing report on Trd genes in any species as well as on mouse Trg genes. In general, our results clearly support the view that diversity of rearranged Trd loci is much higher than for their Trg counterparts. In particular, the mouse Vγ4 repertoire contained predominant expanded sequences that were shared between individual mice. These findings are consistent with prior deep-sequencing studies of the human TRG repertoire 38. In addition, the novel invariant Vγ4+Vδ5+ T-cell population shares several decisive features with three other prominent invariant γδ T-cell subsets in mice, namely Vγ5+ DETCs, Vγ6+Vδ1+ cells and semi-invariant Vγ1+Vδ6+ NKT cells, and likewise with fetal liver-derived human Vγ9+Vδ2+ cells 42. First, all of these populations contain straight germline rearrangements of invariant canonical TCRs without nontemplated N-nucleotides. Second, they are of fetal or at least perinatal origin. These two features, canonical germline rearrangements and exclusive development in the fetal thymus, define such γδ T-cell populations as genuine innate T cells. Invariant Vγ4+Vδ5+ T cells were remarkably abundant in every sample of Vγ4+ T cells sorted by a CCR6+CD27− surface phenotype associated with IL-17-producing capacity and basically absent in CCR6−CD27+ Vγ4+ T cells. Importantly, these results were highly reproducible and consistent regardless of whether independent samples were derived from genomic DNA or cDNA and whether these were analysed by RACE or with Vδ5-specific primers.
Thus, the canonical Vγ4+Vδ5+ TCR classifies a hitherto unrecognized conserved subset of innate and presumably IL-17-producing Vγ4+ T cells. Notably, invariant Vγ4+Vδ5+ T cells are abundant in peripheral lymphoid organs at a magnitude similar to that of IL-17-producing Vγ6+Vδ1+ cells 33,34. In future studies, it will be interesting to compare the common and discrete functions of these two invariant γδ T-cell subsets that are likely innate 'natural' IL-17A producers.
Interestingly, identical Vδ5Dδ2Jδ1 germline rearrangements were designated 24 years ago as BID, for BALB/c invariant delta 43. These Trd germline rearrangements were described as frequent in lungs and lymph nodes of mice with a BALB/c genetic background, but absent in mice with a C57BL/6 genetic background 43. However, BID chains were later found to be generated also in C57BL/6 fetal thymocytes at levels similar to those detected in BALB/c mice 44. Hence, it was concluded that the presence of BID among resident pulmonary lymphocytes of BALB/c and of BALB/c × C57BL/6 F1 mice, but not of C57BL/6 mice, was due to positive selection and peripheral expansion 44. Yet, in our study, γδ T cells bearing BID, that is, Vδ5Dδ2Jδ1 germline rearrangements, turned out to be an abundant innate Vγ4+ T-cell population in C57BL/6 mice. It is currently unclear, although highly relevant for future work, why the C57BL/6 mice of our study and the C57BL/6 mice of the Basel Institute used as negative controls for the original BID description in 1990 would differ in this respect. It is tempting to speculate that specific genetic variations are responsible for differential selection and peripheral expansion processes. The null hypothesis, however, is that no specific antigen is required for selection and development of canonical innate Vγ4+Vδ5+ T cells. This would be consistent with the hypothesis that the capacity to produce IL-17 cytokines is prewired in a subset of fetal T-cell precursors before, and is independent of, TCR rearrangement 11.

Table footnote: The germline sequences of related gene segments are shown in red. N and P refer to N- and P-nucleotides, respectively.

Table footnote: The germline sequences of related gene segments and the variable AA between G3 and Y5 are shown in red. N and P refer to N- and P-nucleotides, respectively.

NATURE COMMUNICATIONS | DOI: 10.1038/ncomms7477
In addition, there is considerable evidence that positively selecting TCR-triggering in immature fetal γδ T-cell precursors induces differentiation towards the potential to produce IFN-γ while suppressing the IL-17-associated factors Sox13, Sox4 and Rorc 32,53-55. Thus, if TCR-specific selection of invariant Vγ4+Vδ5+ T cells occurred during thymic development, they would, in contrast to our observations, rather be expected to adopt a CCR6−CD27+ phenotype with the potential to produce IFN-γ.
In conclusion, we establish a novel truly innate γδ T-cell subset of invariant Vγ4+Vδ5+ T cells, which is confined to presumably IL-17-producing CCR6+CD27− T cells. Future studies addressing the specific physiological functions of these cells within the pLNs and within tissues such as the lung, skin and liver will advance the understanding of innate lymphocyte and γδ T-cell biology.

Methods
Mice. All mice used throughout the study were on a C57BL/6 genetic background. C57BL/6-TcrdH2BeGFP mice were generated at the Centre d'Immunologie de Marseille-Luminy 28 and C57BL/6-Indu-Rag1 mice were a kind gift from Siggi Wei 30. Indu-Rag1 mice were crossed to TcrdH2BeGFP to obtain tamoxifen-inducible Indu-Rag1 × TcrdH2BeGFP mice 11.

Cell isolation and cell sorting. pLNs and spleens were mashed, filtered through 50-µm nylon meshes and washed with PBS/3% fetal calf serum. Spleen cells were treated with erythrocyte-lysis buffer before mixing with lymph node cells as described previously 56. The liver and lung were cut into small pieces and digested with 0.5 mg ml−1 Collagenase D and 0.025 mg ml−1 DNAse-1. The digestion was stopped by adding EDTA to a final concentration of 20 mM. For the isolation of skin lymphocytes, an area of the back was shaved, the skin was removed, cut into pieces and digested with 0.5 mg ml−1 Liberase and 0.025 mg ml−1 DNAse-1. Digestion was carried out for 45 min at 37°C and was stopped by adding EDTA to a final concentration of 40 mM. Digested organs were meshed through a 40-µm Cellstrainer. Lung lymphocytes were separated using Lympholyte M. Liver and skin lymphocytes were separated by density gradient centrifugation using Percoll gradients. mAbs against TCR Vγ4 (clone UC3-10A6, PE-conjugated, 1:200) were purchased from Biolegend or produced in rat hybridoma cell lines (clone 49.2-9, Cy5-conjugated, 1:100). Antibody against TCR Vγ1 (clone 2.11, PE-conjugated, 1:100) was purchased from Biolegend and antibody against TCR Vγ7 (clone F2.67, Cy5-conjugated, 1:50) was produced in rat hybridoma cell lines. Antibodies against TCR β (clone H57-597, PE-Cy7-conjugated, 1:200) and CD27 (clone LG.3A10, PerCP/Cy5.5-conjugated, 1:200) were obtained from Biolegend. Antibodies against CD196 (CCR6; clone 140706, Alexa Fluor 647-conjugated, 1:100) were purchased from BD Biosciences.
Cell suspensions were treated with FcR block (clone 2.4G2) before 20-min staining with mAbs. Antibody-labelled cell populations were sorted for high-throughput sequencing on a FACSAria IIu flow cytometer (Becton Dickinson). In experiments designed to compare two populations, we always sorted the same number of cells as a starting population to be able to quantify TCR diversities. Likewise, single cells were sorted into 96-well plates for single-cell PCR on a MoFlo or XDP flow cytometer (Beckman-Coulter). The first rows of the plates were left empty as negative controls.
Nucleic acid isolation. For isolation of genomic DNA, sort-purified cell fractions were resuspended in PCR lysis buffer (10 mM Tris (pH 8.4), 50 mM KCl, 2 mM MgCl2, 0.5% Nonidet P-40, 0.5% Tween-20, 400 µg ml−1 proteinase K) at 500 cells µl−1 and incubated overnight at 50°C. The proteinase K was inactivated at 95°C for 10 min. This protocol was adapted from refs 56-58. Up to 20 µl of the DNA samples were used directly for PCR. Total RNA was isolated from sorted cell populations using the RNeasy Mini Kit (QIAGEN) and was reverse-transcribed with Superscript III (Invitrogen) using Random Primers (Invitrogen).

RACE.
To generate unbiased template libraries of rearranged CDR3 regions of the Trd locus, anchor sequence-containing cDNA template was synthesized using the SMARTer RACE cDNA Amplification Kit (Clontech) according to the manufacturer's instructions. RACE PCR was performed with a gene-specific primer located in the Cδ gene segment (5′-CGAATTCCACAATCTTCTTG-3′) and an anchor sequence-specific primer recommended in the kit.
High-throughput sequencing. Forward and reverse PCR primers for deep sequencing contained at their 5′ ends the respective 454 universal adaptor sequences and multiplex identifier (MID) nucleotides. PCR products were purified by gel extraction with the QIAquick Gel Extraction kit (QIAGEN) and were quantified with a Qubit fluorometer (Invitrogen) using the Quant-iT dsDNA HS Assay kit (Invitrogen). Amplicons were processed with the emPCR-Lib-A SV kit (GS FLX Titanium series; Roche) according to the manufacturer's instructions and sequenced on a 454 Genome Sequencer FLX system (Roche). Productive rearrangements and CDR3a regions were defined by comparing nucleotide sequences to the reference sequences from IMGT, the international ImMunoGeneTics information system (http://www.imgt.org) 59. Rearrangements were analysed and CDR3a regions were defined using IMGT/HighV-QUEST 60.
Single-cell PCR. For single-cell PCR, CCR6+CD27− Vγ4+ single cells were sorted directly into 96-well plates. We reverse-transcribed RNA from each sort-purified cell to cDNA, and used it as a template to amplify the corresponding γ and δ TCR chains. Single cells sorted in 6 µl PBS were immediately frozen on dry ice and transferred to −80°C. Frozen cells were lysed by heating to 65°C for 2 min and were cooled to 4°C. Reverse transcription and a first round of PCR were sequentially performed in one reaction using the OneStep RT-PCR kit (QIAGEN) according to the manufacturer's instructions with some modifications. A combination of specific primer pairs for the Vγ4 chain (Vγ4 outer: 5′-ASCAAGAGATGAGACTGCACAAAT-3′ in combination with Jγ1, 2 and 3: 5′-GTTCCTTCTGCAAATACCTTGTGA-3′) and the Vδ5 chain (Vδ5 outer: 5′-TGCGGATTCTCCAAACCCAGATTTA-3′ in combination with Jδ1: 5′-TTGGTTCCACAGTCACTTGGGTTCC-3′) was used for this multiplex reaction. For a 25-µl reaction, the components were as follows: 1× buffer, 400 µM of each dNTP, 0.4 µM of each primer and 1 µl OneStep RT-PCR Enzyme mix. Reverse transcription was performed for 30 min at 50°C and was directly followed by amplification. PCR activation was initiated at 95°C for 15 min, followed by 30 cycles consisting of 30 s at 94°C, 30 s at 65°C and 60 s at 72°C, and finally a single incubation at 72°C for 10 min. Next, second rounds of PCR with seminested primers (Vγ4: 5′-TGCAACCCCTACCCATATTTTCT-3′ and Vδ5 inner: 5′-TAGGGACGACACTAGTTCCCATGAT-3′) were separately performed for Vγ4 and Vδ5 chains. For this, 0.2 µl of the first PCR products were used as template in a 20-µl reaction. The PCR fragments were amplified using 0.5 unit AmpliTaq Gold DNA polymerase (Applied Biosystems) in combination with GeneAmp 1× PCR Buffer II, 2 mM MgCl2, 0.25 mM of each dNTP and 0.3 µM of each primer. After 10 cycles, samples positive for Vγ4 and Vδ5 were detected using agarose gel electrophoresis.
After an additional 20 cycles of PCR, gel-extracted (QIAGEN) PCR products of Vγ4 and Vδ5 chains from each cell were separately cloned into the pCR4-TOPO vector using the TOPO TA Cloning Kit for sequencing (Invitrogen). These plasmid vectors were isolated with the QIAprep kit (QIAGEN) and sequenced by GATC Biotech (Germany).
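The Vγ4 outer primer quoted above contains the letter S, which in IUPAC nucleotide notation stands for C or G. Assuming that the S is indeed an intended degenerate position (the text does not state this explicitly), the short Python sketch below expands a degenerate primer into the concrete oligonucleotides it represents. The function name is illustrative.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

vg4_outer = "ASCAAGAGATGAGACTGCACAAAT"  # Vγ4 outer primer from the text
variants = expand_primer(vg4_outer)
print(variants)
```

Under this reading, the primer is an equimolar mix of two oligonucleotides that differ only at the second position.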
Sequence analysis. First, fna files generated by 454 high-throughput sequencing were converted and partitioned into separate FASTA files using MIDs and gene-specific primer sequence identifiers. Next, these files were uploaded to HighV-QUEST, an online tool available on the IMGT website 59, and compared with the IMGT database. Analysed txt files returned from IMGT were further processed with Excel to segregate productive and unproductive TCR rearrangements, to quantify and merge similar sequences, and for further statistical analysis. By selecting only in-frame productive TCR rearrangements, sequences that contained insertions and deletions introduced by the 454 platform were routinely excluded. Shannon indices were calculated using Vegan R package (2.14.0). Venn diagrams were produced with VennMaster (0.37.5; ref. 61). All raw sequence data were uploaded to the NCBI Sequence Read Archive under the SRP accession number SRP050364.
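The first step of this workflow, partitioning reads by MID and gene-specific primer, can be sketched in Python as follows. This is not the authors' script: the function names, the sample labels and the two MID sequences are placeholders, exact matching is assumed where a production pipeline would probably tolerate sequencing errors, and only the Cδ primer sequence is taken from the Methods.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

def demultiplex(records, mids, primer):
    """Bin reads that start with a known MID directly followed by the primer."""
    bins = {name: [] for name in mids}
    for header, seq in records:
        for name, mid in mids.items():
            if seq.startswith(mid + primer):
                bins[name].append((header, seq[len(mid):]))  # trim the MID
                break
    return bins

CD_PRIMER = "CGAATTCCACAATCTTCTTG"  # Cδ primer quoted in the Methods
MIDS = {"sample_A": "ACGAGTGCGT", "sample_B": "ACGCTCGACA"}  # placeholders

toy_reads = """>read1
ACGAGTGCGTCGAATTCCACAATCTTCTTGAAAC
>read2
ACGCTCGACACGAATTCCACAATCTTCTTGGGGT
>read3
TTTTTTTTTTCGAATTCCACAATCTTCTTGCCCC
"""
bins = demultiplex(parse_fasta(toy_reads), MIDS, CD_PRIMER)
print({name: [h for h, _ in reads] for name, reads in bins.items()})
```

Reads without a recognized MID, such as the third toy read, are dropped. Each bin could then be written to its own FASTA file for upload to IMGT/HighV-QUEST.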