A genome-wide innateness gradient defines the functional state of human innate T cells

Innate T cells (ITCs), including invariant natural killer T (iNKT) cells, mucosal-associated invariant T (MAIT) cells, and γδ T cell populations, use conserved antigen receptors generated by somatic recombination to respond to non-peptide antigens in an innate-like manner. Understanding where these cells fit in the scheme of immunity has been a puzzle since their discovery. Here, immunophenotyping of 101 individuals revealed that these populations account for as much as 25% of peripheral human T cells. To better understand these cells, we generated detailed gene expression profiles using low-input RNA-seq and confirmed key findings through protein-level and functional validation. Unbiased transcriptomic analyses revealed a continuous ‘innateness gradient’ with adaptive T cells at one end followed by MAIT, iNKT, Vδ1+ T, Vδ2+ T, and natural killer cells at the other end. Innateness was characterized by decreased expression of translational machinery genes and reduced proliferative potential, which allowed for prioritization of effector functions, including rapid cytokine and chemokine production, and cytotoxicity. Thus, global transcriptional programs uncovered rapid proliferation and rapid effector functions as competing goals that respectively define adaptive and innate states.

One Sentence Summary
Adaptive and innate T cells align along a continuous innateness gradient, reflecting a trade-off between effector function and proliferative capacity.


Introduction
Within the spectrum of immune defense, "innate" and "adaptive" refer to pre-existing and learned responses, respectively. Mechanistically, innate immunity is largely ascribed to 'hardwired,' germline-encoded immune responses, while adaptive immunity derives from recombination and mutation of germline DNA to generate specific receptors that recognize pathogen-derived molecules, such as occurs in T and B cell receptors. However, the paradigm that somatic recombination leads only to adaptive immunity is incorrect: T cell populations such as invariant natural killer T (iNKT) cells, mucosal-associated invariant T (MAIT) cells, and γδ T cells use recombined but conserved antigen receptors to respond in an innate-like manner (1-3). These "donor unrestricted" T cell populations have been estimated to account for as much as 10-20% of human T cells (4), and have critical roles in host defense and other immune processes. The existence of innate-like T cells suggests that somatic recombination, the machinery of adaptive immunity, is working to generate TCRs that function as innate antigen receptors. We and others now refer to these cells as innate T cells (ITCs).
Most ITC populations share several important features. First, they do not recognize peptides presented by MHC class I and class II. iNKT cells recognize lipids presented by a non-MHC encoded molecule named CD1d (5-7). MAIT cells recognize small molecules including bacterial vitamin B-like metabolites presented by another non-MHC encoded molecule, MR1 (8,9). It is not known whether specific antigen presenting elements drive the development of γδ T cells. One major γδ T cell population bearing Vγ9-Vδ2 TCRs (Vδ2) is activated by self and foreign phospho-antigens in conjunction with a transmembrane butyrophilin-family receptor, BTN3A1 (10-13). The antigens recognized by other human γδ T cell populations are not clear, although a subset of these cells recognizes lipids presented by CD1 family proteins (14), and recent data suggest that some may recognize butyrophilin-family receptors (15). A second shared feature of ITCs is that their responses during inflammation and infection exhibit innate characteristics, such as rapid activation kinetics without prior pathogen exposure, and the capacity for antigen receptor-independent activation. Data in mice demonstrate that during diverse immune responses, including in host defense, cancer, autoimmunity, and allergic disease, a large portion of the iNKT cell pool is rapidly activated and orchestrates ensuing immune responses (1, 16). In contrast, only a low frequency of the adaptive T cell pool responds during any given infection. Inflammatory cytokines such as IL-12, IL-18, and type I interferons can activate ITCs even in the absence of concordant signaling through their TCRs, and such TCR-independent responses have been reported in iNKT cells (17,18), MAIT cells (19,20), and γδ T cells (21-24). These 'cytokine-only' responses may explain how these cell populations contribute to immunity in diverse inflammatory contexts.
Given the similar functions reported among different ITC populations, we hypothesized that their effector capabilities might be driven by shared transcriptional programs. Here, we set out to transcriptionally define the basis of innateness in human ITCs by studying them as a group, focusing on their common features rather than what defined each population individually. We performed RNA-seq on highly purified lymphocyte populations from the peripheral blood of six healthy donors in duplicate to generate robust transcriptional data, and confirmed key findings with protein-level and functional validation. Using unbiased methods to determine global interpopulation relationships, we defined an 'innateness gradient' with adaptive cells on one end and natural killer (NK) cells on the other, in which ITC populations clustered between the prototypical adaptive and innate cells. Within the ITC cluster, populations segregated from adaptive to innate as MAIT, iNKT, Vδ1, and Vδ2. These data suggest that ITCs, with innate-like functionality and antigen receptors produced by the machinery of adaptive immunity, are a distinct family defined both transcriptionally and functionally. Interestingly, we observed decreased transcription of cellular translational machinery and a decreased capacity for proliferation as hallmarks of innate cells. Innate cells instead prioritized effector functions, including cytokine production, chemokine production, cytotoxicity, and reactive oxygen metabolism. Thus, growth potential and rapid effector function are hallmarks of adaptive and innate cells, respectively.

Human immunophenotyping reveals high aggregate ITC frequencies
To characterize the abundance and variability of ITCs in humans, we quantified 4 major populations of innate T cells from 101 healthy individuals aged 20 to 58 years by flow cytometry, directly from peripheral blood mononuclear cells (PBMCs) in the resting state. We assessed the frequencies of iNKT cells, MAIT cells, and the two most abundant peripheral γδ T cell groups, those expressing a Vδ2 TCR chain (Vδ2) and those expressing a Vδ1 TCR chain (Vδ1). MAIT cells contributed from 0.1 to 15% of T cells (mean 2.4%), iNKT cells from undetectable to 1.1% (mean 0.09%), Vδ1 cells from 0.25 to 6.2% (mean 1.25%), and Vδ2 cells from 0.08 to 22% (mean 4.7%). The sum of these 4 cell types accounted for 0.9-25.7% of an individual subject's T cells (mean 8.4%) (Fig. 1A, Supplementary Table 1). Vδ2 cells were more abundant than Vδ1 in 82% of subjects, with the ratio of these two cell types ranging from 0.2 to 67.8 (mean 8.5). Age negatively associated with the total percentage of ITCs (P = 1.4e-05). MAIT (r = -0.42, P = 9.9e-06) and Vδ2 (r = -0.43, P = 4.7e-06) populations drove this association (Fig. S1A,B), even after accounting for the abundances of other cell types (P = 5.9e-04, P = 1.2e-04, respectively), which is consistent with previous findings (25,26). We observed covariance between the frequencies of MAIT/iNKT cells (P = 0.02), corrected for the other cell types and age (Fig. S1C,D). We observed no significant associations between ITC percentage and gender, body mass index, or smoking status after accounting for age. Together, these results show that human ITCs contribute a substantial portion of the peripheral T cell repertoire, are variable between individuals, and decrease with age.

ITC populations rapidly release cytokines
We next tested innate T cell populations for two functional hallmarks of innate effectors, rapid cytokine production and TCR-independent activation. To assess rapid cytokine production potential, we activated healthy donor PBMCs with phorbol 12-myristate 13-acetate (PMA) and ionomycin for 4 hours, followed by intracellular staining for interferon-γ (IFN-γ) production. Between 35 and 85% of MAIT, iNKT, Vδ1, and Vδ2 T cells produced IFN-γ under these conditions, while a smaller percentage of adaptive CD4+ T and CD8+ T cells produced this cytokine (Fig. S2A,B). To test the relative capacity of these cell types to respond to inflammatory cytokines alone, we activated PBMCs with IL-12 + IL-18 or IL-12 + IL-18 + IFN-α for 16 hours, and assessed IFN-γ production during the final 4 hours of stimulation. Between 20 and 80% of iNKT, MAIT, Vδ2, and NK cells produced cytokines under these conditions, while only a tiny portion of adaptive cells responded (Fig. 1B, Fig. S2C,D). Taken together, these studies show that ITC populations rapidly produce cytokines, and can do so in response to inflammatory cytokines even in the absence of TCR signals. Notably, we observed the latter activation mechanism almost exclusively in ITC populations.

RNA-seq profiling of ITCs reveals a continuous innateness gradient
To better understand the biological properties of human ITCs on a genome-wide scale, we profiled their transcriptomes with RNA-seq. Ultra-low input RNA-seq profiling using 1,000 cells per sample enabled high-depth sequencing of even relatively rare human lymphocyte populations. From 6 healthy individuals, we sorted in duplicate four subsets of ITCs: iNKT, MAIT (defined as MR1-5-OP-RU tetramer+), Vδ1 and Vδ2 cells (Supplementary Table 2). From the same individuals, we also sorted CD4+ and CD8+ T cells as comparator adaptive T cells and NK cells as comparator innate cells (Fig. S3). Using SmartSeq2 to create poly(A)-based libraries, we generated 25 base pair, paired-end libraries sequenced at a depth of 4-12 million read pairs (Fig. S4). After sequence mapping, we calculated tpm (transcripts per million) values for each gene. We considered 19,931 genes as expressed (tpm > 3 in ≥10 samples), including 12,730 protein-coding genes, 183 T cell receptor genes, 3,261 long noncoding RNA (lncRNA) genes, and other lowly expressed genes (e.g. pseudogenes, Fig. S4C).
Principal component analysis identified the major axes of variation in gene expression (Fig. 2A). The first principal component separated the subsets by a continuous 'innateness gradient' with CD4+ and CD8+ T cells on one end, and NK cells on the other end (Fig. 2B). Ordered from adaptive to innate along the first principal component, MAIT, iNKT, Vδ1 and Vδ2 clustered in between the adaptive cells and NK cells. We then identified genes associated with the rank order of each lymphocyte population in the innateness gradient (CD4+ T = 1, CD8+ T = 2, MAIT = 3, iNKT = 4, Vδ1 = 5, Vδ2 = 6, NK = 7), using linear mixed models. This analysis revealed 1,884 genes significantly associated with the innateness gradient (P < 2.5e-06 = 0.05/19,931, Bonferroni correction for 19,931 tests), including protein coding and lncRNA genes (Fig. 2C, Supplementary Table 3). Hereafter we refer to positive and negative associations with the ranked gradient as associations with 'innateness' and 'adaptiveness,' respectively. We defined an 'innateness score' as the magnitude of the change in expression level by an increase of one in the gradient (the β of the gradient variable within our linear mixed model).
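The rank encoding and significance threshold described above can be illustrated with a minimal sketch. The function and variable names below are hypothetical and are not taken from the analysis code; the sketch only restates the arithmetic of the Bonferroni threshold and the sign convention for 'innateness' versus 'adaptiveness'.

```python
# Rank of each population along the innateness gradient, as stated in the text.
# The dictionary keys are illustrative labels (assumption), not identifiers from the study.
GRADIENT_RANK = {"CD4": 1, "CD8": 2, "MAIT": 3, "iNKT": 4, "Vd1": 5, "Vd2": 6, "NK": 7}

ALPHA = 0.05
N_TESTS = 19931  # number of expressed genes tested


def bonferroni_threshold(alpha=ALPHA, n_tests=N_TESTS):
    """Per-gene significance threshold after Bonferroni correction."""
    return alpha / n_tests


def classify_association(beta, p_value, threshold):
    """Label a gene by the sign of its gradient coefficient (beta), if significant."""
    if p_value >= threshold:
        return "not significant"
    return "innateness" if beta > 0 else "adaptiveness"
```

With the values reported for G6PD (β = 0.29, P = 3e-14), `classify_association` returns "innateness", whereas a gene with P = 0.036 falls short of the genome-wide threshold.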

Associations with innateness: migration, cytotoxicity, cytokine production, and ROS metabolism
Cytotoxicity and chemokines. The Gene Ontology (GO) terms most associated with innateness included NK cell and lymphocyte chemotaxis, NK cell mediated immunity, cellular defense response, and several additional terms related to leukocyte migration and activation (Fig. 3A-D, specific GO terms indicated in figure legend and Fig. S5A). Using flow cytometry, we validated the expression of key genes, including killer cell lectin-like receptor (KLR) family genes and killer cell immunoglobulin-like receptor (KIR) genes (Fig. S5B). Cytotoxicity proteins such as perforin, granzyme B and granulysin also associated with innateness (Fig. 3E,F, Fig. S5B). Eight chemokines strongly associated with innateness, including CCL3, CCL4, CCL5, XCL1 and XCL2 (P < 9e-12), consistent with a role for innate lymphocytes in recruiting other inflammatory cell types to initiate inflammation. IFNG (the gene coding for IFN-γ) showed a significant association with innateness (P = 1.7e-06, Fig. 3G), and the baseline IFNG levels in each cell population predicted their production of IFN-γ upon stimulation (Fig. S2A,B). Since ITCs produce diverse cytokines and chemokines (1, 2), we quantified the total cytokine and chemokine transcriptome 'mass' in each cell type at baseline. We observed that the aggregate sum of the expression levels of the 37 cytokine and chemokine genes expressed in our dataset followed the innateness gradient (Fig. 3H).
Reactive oxygen species (ROS) metabolism. Metabolic pathways are well-known to vary among immune cell subsets and influence their functions (27). Among metabolic programs, the pentose phosphate pathway was nominally positively associated with innateness (P = 0.036, Fig. 4A). G6PD, the gene that codes for the rate-limiting enzyme in the pentose phosphate pathway, showed the strongest positive association with innateness in this pathway (β = 0.29, P = 3e-14, Fig. 4B,C). This enzyme produces NADPH, which in turn can be used for glutathione biosynthesis, protecting against damage caused by ROS. Two critical enzymes for buffering the damaging effect of ROS, GCLM and GCLC, also nominally associated with innateness (P = 2e-04 and 1e-03, respectively, Fig. 4D,E). We quantified ROS by flow cytometry using CellROX green, and found that total cellular ROS levels were higher in adaptive T cells than in ITCs, suggesting that elevated G6PD might provide a baseline buffer counteracting ROS (Fig. 4F,G). Overall, these results suggest that ITCs are prepared to buffer ROS at baseline, a useful adaptation for effector cells expressing chemokine receptors such as CXCR1, CXCR2, and CCR5 (Fig. S7C) that direct them to the same sites of infection or inflammation as monocytes and neutrophils.

Associations with adaptiveness: regulation of translational machinery
When we applied gene set enrichment to adaptiveness, "cytosolic ribosome" (GO:0022626) emerged as the most-associated term (P = 4.7e-28, Fig. 5A,B, Fig. S6A,B). This enrichment was not driven by a small percentage of genes very strongly overexpressed among adaptive cells (Fig. S6C).
Translation initiation factors were also consistently associated with adaptiveness (Fig. 5C, Fig. S6D), suggesting that the translational machinery, and not just the ribosome complex, was associated with adaptiveness. MYC, which coordinately regulates ribosomal RNA genes (28), was the transcription factor with the highest fold change associated with adaptiveness (P = 3.8e-22, Fig. 5D). As an independent assessment of ribosome synthesis, we used quantitative polymerase chain reaction (qPCR) to assess expression of the earliest uncleaved ribosomal RNA (rRNA) precursor. The expression of precursor 47S rRNA associated with adaptiveness (Spearman rho = -0.57, P = 9e-05, Fig. 5E), suggesting that ITCs have a relative decrease in ribosome biogenesis. Since new ribosome production is necessary for proliferation, and MYC expression is generally associated with proliferative capacity, we hypothesized that proliferation potential might associate with adaptiveness. We found that proliferation in response to anti-CD3/CD28-coated beads, like MYC and ribosome biogenesis, associated with adaptiveness (Spearman rho = -0.73, P = 5.8e-04, Fig. 5F,G). We then assayed ribopuromycylation to quantify total active translation (29). Strikingly, we observed a positive association with innateness, with the innate cell types being engaged in more active translation than the adaptive T cells (Fig. 5H,I). This suggested that despite having lower expression of many major ribosomal genes, innate T cells have a higher number of ribosomes actively involved in translation. These results recall the well-described regulation of ribosomes in prokaryotes, where ribosome biogenesis is a major energetic control point, is suppressed in conditions under which growth and division are deprioritized (30), and can be fine-tuned to ensure maximal occupancy of active ribosomes (31). Taken together, these results suggest that adaptive cells prioritize the production of factors required for cell growth and division, while innate cells may suppress ribosome biogenesis to prioritize the translation of other mRNAs, such as those encoding effector functions including the rapid production of cytokines (Fig. 1, Fig. 3G,H).

Transcriptional regulation of innateness
We identified 142 transcription factors that varied significantly between cell types (F statistic, P < 5.8e-05, Bonferroni threshold). The expression of these transcription factors across cell types clustered into 4 major groups (Fig. 6A). Cluster 1 showed a gradual increase that closely matched the pattern of the innateness gradient. Cluster 2 showed a pattern opposite to that of cluster 1, with an increase in expression toward adaptive cellular populations. Cluster 3 showed high levels of expression in iNKT, MAIT, Vδ2, and NK cells, with relatively lower levels in adaptive T and Vδ1 T cells, and cluster 4 captured transcription factors with the opposite pattern to cluster 3 (Fig. 6A). In PCA of these transcription factors, the second principal component separated iNKT, MAIT, and Vδ2 T cells from adaptive and Vδ1 T cells (Fig. S7A), similar to K-means clusters 3 and 4. These same cell groupings were also captured by PC2 generated using the overall most variable genes (Fig. 2A).
PLZF is a zinc finger transcription factor known to be important for the development and [...] Human γδ T cells have previously been reported to express PLZF (52), but we did not detect elevated PLZF expression in Vδ1 cells (Fig. 6G,H). Differential expression analysis between PLZF+ ITCs and adaptive T cells revealed "cytokine receptor activity" as the most enriched term for upregulation in PLZF+ ITCs (P = 7.9e-05). PLZF expression in T cells was also associated with the aggregate expression of all cytokine and chemokine receptor activity genes (Fig. S7B), and we validated the expression of several of these receptors by flow cytometry (Fig. S7C). For genes differentially expressed between PLZF+ ITCs and adaptive T cells, we found significant enrichment of PLZF target genes identified in mouse thymocytes with ChIP-seq (53) (P = 6.2e-07, Fig. S7D). In addition, PLZF+ ITCs upregulated genes that were associated with the term "circadian regulation of gene expression" (P = 4.2e-04), with major clock transcription factor genes like ARNTL (that codes for BMAL1), RORA, PER1 and CRY1 significantly upregulated in PLZF+ ITCs compared to adaptive T cells (P < 5e-08) (Fig. 6I,J, Fig. S7E). Both BHLHE40 and ID2 also have the capacity to regulate the circadian clock (54-56). Notably, although human NK cells express PLZF (mature mouse NK cells do not express PLZF), many genes upregulated in PLZF+ ITCs and identified as PLZF targets in mouse (53) showed low expression in human NK cells, including CCR2, CCR7, CXCR6, RORC, CCR5, CCR6 and LTK (Fig. S7C,D). These results suggest that PLZF may regulate different sets of genes depending on the cell type, likely working as part of a larger gene network in determining ITC fate.

Innateness in other populations of ITCs and adaptive T cells
We next investigated the innateness gradient in other candidate innate-like human T cell subsets. We chose two additional T cell populations for analysis, Vδ3-expressing γδ T cells and δ/αβ T cells, each of which can constitute up to 1% of human peripheral T cells (57, 58). We sorted Vδ3 T cells and δ/αβ T cells in duplicate from one individual and profiled their transcriptomes with ultra-low input RNA-seq. δ/αβ and Vδ3 clones have been identified that, like iNKT cells, recognize α-galactosylceramide presented by CD1d (57, 58), suggesting that these cells might play a role in immunity similar to that of iNKT cells. However, principal component analysis revealed that δ/αβ T cells were closer to adaptive T cells, and closest to CD8+ T cells, rather than segregating with iNKT cells and other innate T cells (Fig. S8A,B). This suggests that δ/αβ T cells may have an adaptive-like phenotype. Vδ3 T cells, on the other hand, segregated closer to innate T cells by PCA, among the other γδ T cells (Fig. S8A,B). Neither δ/αβ T cells nor Vδ3 T cells expressed PLZF.
Cytotoxicity genes and NK markers are expressed by a subset of adaptive T cells. We found that this class of genes was expressed by CD8+ T cells, and in some cases at higher levels than in ITCs. Interestingly, the development of innate-like Th1 effectors from adaptive cells has also recently been demonstrated in mice (59). To assess expression of innateness gradient genes in human adaptive effector T cells, we re-analyzed a human expression dataset generated using MHC class I tetramer-sorted, HCMV-specific CD8+ T cells (60) (polyclonal human CD8+ T cell datasets would likely be substantially 'contaminated' with ITCs). HCMV-specific effector memory CD8+ T cells expressed innateness gradient genes more highly than HCMV-specific memory CD8+ T cells (P = 1.4e-61, Wilcoxon paired test), which in turn had higher expression of these genes than naive CD8+ T cells (P = 7.9e-99, Fig. 7A). Conversely, genes associated with adaptiveness in our gradient were upregulated in naive CD8+ T cells compared to HCMV-specific memory CD8+ T cells (P = 9.2e-68), and also in memory CD8+ T cells compared to effector memory CD8+ T cells (P = 2.7e-11, Fig. 7A). We also re-analyzed published RNA-seq data for CD4+ T cell subsets (61). CD4+ effector memory T cells had higher expression of innateness-associated genes than CD4+ naive and CD4+ central memory T cells (P < 6.4e-119, Fig. 7B), whereas naive CD4+ T cells had higher expression of adaptiveness-associated genes than CD4+ central memory and effector memory T cells (P < 1.4e-54, Fig. 7B). Overall, these results suggest that the innateness gradient can stratify adaptive effector CD8+ and CD4+ T cell populations following infection, and is thus not limited to ITCs.

Discussion
MAIT, iNKT, γδ, and other innate-like T cells do not fit neatly into traditional paradigms of adaptive or innate immunity. Their nature has been an interesting puzzle for more than 30 years.
Each population has been studied in depth individually, but rarely have they been considered in aggregate. Here, we set out to study human ITCs as a group, addressing two important questions: 1) is there a shared transcriptional basis for their functions in immunity, and 2) how do ITCs maintain their baseline effector state? In quantitative, unbiased analyses, we discovered that ITCs segregate along an innateness gradient between prototypical adaptive and innate populations. We propose that the large transcriptional programs positively and negatively associated with this gradient represent the transcriptional basis of lymphocyte innateness. Our data support that ITCs are indeed a 'family' with a common transcriptional basis for their similar functions in immunity, including rapid cytokine and chemokine production, chemotaxis to areas of inflammation, cytotoxicity, and TCR-independent responses. The functional and transcriptional conservation of innate-like functions in ITCs suggests that they enhance evolutionary fitness.
That humans dedicate such a large part of their T cell repertoire to the generation of innate-like receptors is a testament to the teleological importance of innate immune surveillance even after the evolution of adaptive immunity.
Strikingly, we observed that this innateness program can not only classify ITCs according to their innateness, but can also differentiate adaptive effector populations. For example, naive, memory, and effector adaptive populations can be separated by their innateness. Interestingly, both Th1 and Th2 adaptive T cells have been demonstrated to acquire innate-like characteristics in some settings (62-65). Thus, the study of ITCs highlights important pathways used across innate and adaptive lymphocyte populations.
The shared gene programs associated with innateness included cytokine/chemokine production, cytotoxicity, and cytokine/chemokine receptor expression. For the genes positively associated with the innateness gradient, this is essentially an 'effector gradient,' which strongly supports a role for ITCs in host defense. We found that human ITCs rapidly produced IFN-γ after activation through their TCRs (Fig. S2), as do a smaller fraction of adaptive T cells. However, IFN-γ production in response to IL-12, IL-18, and IFN-α, cytokines generated by myeloid or stromal cells in response to danger signals, was almost exclusively limited to ITCs (Fig. 1B). This is consistent with the role of ITCs as innate responders where prior pathogen experience is not required. Thus, T cell innateness can regulate the response to pathogen-associated molecular patterns. Of note, human Vδ1 cells have been demonstrated to be variable in both TCR repertoire and numbers, and likely respond to specific infections (66). Although they express much of the innateness program, Vδ1 cells may not fit the ITC paradigm as neatly as the more-conserved MAIT, iNKT, and Vδ2 populations. Indeed, Vδ1 cells have greater TCR diversity, exhibit less 'cytokine-only' activation, and do not express PLZF.
Our identification of ribosome subunits and other factors involved in translational activity associated with adaptiveness (Fig. 5) also sheds light on the biology of ITCs. For an adaptive T cell, population expansion is of central importance in both primary and recall immune responses.
ITCs, on the other hand, are likely to function as sentinels early during infection, acting as 'cellular adjuvants' to enhance the larger immune response in response to microbial molecules.
For such a role, rapid effector responses are key, and proliferation may serve only to replenish numbers at a later stage. Taken together, the effector-focused transcriptional programs of ITCs and proliferation-focused programs of adaptive cells are ideally suited to support their respective roles in immunity.
Finally, the innateness gradient reported here could be applied in different scenarios in order to better understand human immunology. A transcriptomic innateness score could be employed as a unified T cell metric to classify individual single cells assayed with single-cell RNA-seq, and could provide a better understanding of patient heterogeneity. We can use our immunoprofiling data and create an 'innateness metric' for each individual based on the abundance of each T cell type weighted by the innateness level of that cell type. This score is remarkably variable between individuals, even after correcting for age (Fig. S9). This single innateness metric in an individual might be associated with genetic differences, human diseases including cancer, infection, and allergy, or therapeutic responses to immunomodulating medications.

Study design
To study the transcriptome of innate T cell populations (MAIT, iNKT, Vδ1, Vδ2), we compared them with adaptive cells (CD4+ T, CD8+ T) as well as NK cells as prototypical innate lymphocytes. Samples used for immunophenotyping and RNA-seq analyses were from healthy individuals. All human sample use was approved by the Brigham and Women's Hospital Institutional Review Board, including direct consent for public deposition of RNA sequencing. A matched set of populations was sorted from each individual to avoid batch effects. All blood draws were performed in the morning, and cells were immediately stained and double-sorted directly into lysis buffer. Based on previous RNA-seq analyses on number of replicates and read depth for optimal differential expression analysis (67), we decided to sort cells from 6 individuals in duplicate (total of 12 samples per cell type) at a read depth of 4-12 million read pairs (8-24 million reads). The goal of this study was to define the shared transcriptional programs between cell populations rather than variability between individuals. To avoid systematic technical error or batch effects, samples were randomized within the plate for library preparation, and all samples were sequenced together. Five samples were removed for low read depth (described below).

Gene expression quantification
We used Kallisto version 0.43.1 (69) to quantify gene expression using the Ensembl 83 annotation. We included protein-coding genes, pseudogenes, and lncRNA genes. As expected, protein coding genes were the most highly expressed, followed by lncRNAs and then pseudogenes (Fig. S4C). We removed 5 outlier samples that had a low proportion of common genes detected (1 MAIT, 1 CD8+ T, 1 NK, and 2 Vδ1 samples; Fig. S4D). We used log-transformed tpm (transcripts per million), specifically log2(tpm+1), as our main expression measure; this accounts for library size and gene size. We considered as expressed those genes with log2(tpm+1) > 2 in at least 10 samples. We further performed quantile normalization on the log2(tpm+1) values for our differential expression analyses. Boxplots were created in R. Boxes show the 1st to 3rd quartile with the median, whiskers encompass 1.5X the interquartile range, and data beyond that threshold are indicated as outliers.
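The expression filter and normalization steps described above can be sketched as follows. This is an illustrative standard-library implementation, not the code used for the study; the function names are hypothetical, and ties in quantile normalization are broken by input order rather than averaged, which is a simplification.

```python
import math


def log_tpm(tpm):
    """Main expression measure: log2(tpm + 1)."""
    return math.log2(tpm + 1)


def is_expressed(tpm_values, min_log=2.0, min_samples=10):
    """True if log2(tpm + 1) > min_log in at least min_samples samples."""
    return sum(1 for t in tpm_values if log_tpm(t) > min_log) >= min_samples


def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of per-gene values, same gene order)."""
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across all samples.
    reference = [sum(s[k] for s in sorted_samples) / len(samples) for k in range(n_genes)]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda g: s[g])  # gene indices, lowest value first
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized
```

Note that log2(tpm+1) > 2 is equivalent to tpm > 3, which matches the threshold quoted in the Results.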

Differential expression analyses
We used linear mixed models for our differential expression and expression association analyses. The dependent variable was quantile normalized log2(tpm+1) expression values. In all cases, we included donor ID as a random effect among our predictor variables. For associations with the innateness gradient, we used one fixed effect composed of integers from 1-7 (for CD4+ T, CD8+ T, MAIT, iNKT, Vδ1, Vδ2 and NK, respectively). In the differential expression between adaptive cells and PLZF+ ITCs we used one fixed effect taking values of 0 or 1, respectively.
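As a rough illustration of how the gradient coefficient (β) relates to the data, the sketch below estimates a slope of expression on gradient rank after centering both variables within each donor. This is a simplified stand-in for a random-intercept mixed model, not the model fit used in the study: it yields only a point estimate, provides no P value, and the function name and input layout are assumptions.

```python
from collections import defaultdict


def within_donor_slope(records):
    """Slope of expression on gradient rank, removing donor-level offsets.

    records: iterable of (donor_id, gradient_rank, expression) tuples.
    Centering within donor absorbs a per-donor intercept, which is the role
    played by the donor random effect in the full model.
    """
    by_donor = defaultdict(list)
    for donor, rank, expr in records:
        by_donor[donor].append((rank, expr))
    sxy = 0.0
    sxx = 0.0
    for observations in by_donor.values():
        mean_rank = sum(r for r, _ in observations) / len(observations)
        mean_expr = sum(e for _, e in observations) / len(observations)
        for r, e in observations:
            sxy += (r - mean_rank) * (e - mean_expr)
            sxx += (r - mean_rank) ** 2
    return sxy / sxx
```

For a balanced design in which every donor contributes the same set of cell types, this within-donor estimate is expected to coincide with the fixed-effect slope of a random-intercept model; for unbalanced data the two can differ.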

Gene ontology term enrichment analyses
We downloaded Ensembl gene IDs linked to Gene Ontology (GO) terms in April 2016 (70,71). This included 9,797 GO terms and 15,693 genes. We tested for GO enrichment after sorting genes by the β (effect size) of our differential expression analysis. We used the minimum hypergeometric test (72) to test for significance. We confirmed significance of enrichment for the top GO terms using an alternative method: the function gsea of the liger package (https://github.com/JEFworks/liger).
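The idea behind a minimum hypergeometric test on a ranked gene list can be sketched as below. This computes only the raw statistic (the smallest hypergeometric tail probability over all list prefixes); it is not a P value, since the published method corrects for having scanned all cutoffs. The function names are hypothetical, and the brute-force computation is intended for small examples rather than genome-scale lists.

```python
from math import comb


def hypergeom_tail(total, annotated, drawn, hits):
    """P(X >= hits) for X ~ Hypergeometric(total, annotated, drawn)."""
    numerator = sum(
        comb(annotated, i) * comb(total - annotated, drawn - i)
        for i in range(hits, min(drawn, annotated) + 1)
    )
    return numerator / comb(total, drawn)


def mhg_statistic(labels):
    """Minimum hypergeometric tail over all prefixes of a ranked 0/1 list.

    labels: genes ordered by effect size (e.g. by beta); 1 marks membership
    in the GO term being tested.
    """
    total = len(labels)
    annotated = sum(labels)
    best = 1.0
    hits = 0
    for drawn, label in enumerate(labels, start=1):
        hits += label
        best = min(best, hypergeom_tail(total, annotated, drawn, hits))
    return best
```

A term whose genes are concentrated at the top of the ranking gives a small statistic, whereas a term whose genes sit at the bottom gives a value of 1.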

Pathway enrichment analysis
We downloaded genes pertaining to 12 KEGG pathways (73) from the Consensus Pathway Database-human (http://cpdb.molgen.mpg.de/) (74) in March 2017. First, we calculated the F statistic per expressed gene in our dataset as a metric of variability between cell types. Then we tested whether the F statistics in genes of a certain pathway were higher than the other expressed genes using a Wilcoxon test. Three pathways had a P-value < 0.05. Since higher expressed genes tend to have higher F statistics, we further tested whether these 3 pathways had significantly higher F statistics than expected by controlling for gene expression. Specifically, we chose a null set of genes with similar expression levels by taking, for each gene in a pathway, 30 random genes with mean level of expression (across all cell types) within 10% of the standard deviation. After this, only the pentose phosphate pathway had genes with F statistics higher than expected (P = 0.018). We further tested enrichment of this pathway in genes associated with the innateness gradient using the gsea function of the liger package (Fig. 4A).
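The expression-matched null set can be sketched as follows. The interpretation of "within 10% of the standard deviation" used here is an assumption: a candidate gene qualifies if its mean expression differs from that of the pathway gene by at most 0.1 times the standard deviation of mean expression across all genes. The function name, the exclusion of pathway genes from the candidate pool, and the fixed seed are also illustrative choices.

```python
import random
import statistics


def matched_null_genes(pathway_genes, mean_expr, n_per_gene=30, frac_sd=0.10, seed=0):
    """Draw expression-matched null genes for each gene in a pathway.

    mean_expr: dict mapping gene -> mean expression across all cell types.
    Returns a flat list of sampled null genes (a gene may appear more than
    once if it matches several pathway genes).
    """
    rng = random.Random(seed)
    tolerance = frac_sd * statistics.pstdev(mean_expr.values())
    pathway = set(pathway_genes)
    null_genes = []
    for gene in pathway_genes:
        candidates = [
            g for g, m in mean_expr.items()
            if g not in pathway and abs(m - mean_expr[gene]) <= tolerance
        ]
        # Fall back to all candidates when fewer than n_per_gene are available.
        null_genes.extend(rng.sample(candidates, min(n_per_gene, len(candidates))))
    return null_genes
```

The F statistics of the pathway genes would then be compared against those of the returned null genes, so that any difference is not explained by expression level alone.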

Immunophenotyping associations
Associations among cell types and clinical traits, when accounting for different covariates, were tested with linear regression using cell type percentages in log scale. For iNKT cell abundance, there were 2 individuals with zero values, and these were converted to the next minimal value of 0.01 before log transformation.

PLZF target analysis
We downloaded PLZF ChIP-seq peaks from the Gene Expression Omnibus (GEO) database from Mao et al (53) (accession number GSE81772). We used genes from the mouse Gencode vM14 annotation. We defined gene targets as mouse genes with a PLZF peak in the gene body or within 2kb from the transcription start site (TSS). We downloaded mouse-human gene homologues from BioMart (75) and selected only genes with 1 to 1 orthologues. We then mapped each mouse PLZF target gene to its human orthologue. Finally, we performed logistic regression to determine whether gene targets were enriched in differentially expressed genes between PLZF+ ITCs and adaptive T cells. Specifically, the response variable was 0 or 1 for non-target or target gene, respectively. The predictor variable was the β of the differential expression analysis of PLZF+ ITCs versus adaptive T cells. We also tested enrichment when defining gene targets only by a peak at the promoter region of a gene (-2kb to +1kb from TSS), and found similar results.
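The target definition above (a peak in the gene body or within 2kb of the TSS) amounts to an interval-overlap check, sketched below. The data layout (dictionaries and tuples with chromosome, start, end and strand) and the function name are assumptions made for illustration; they do not reflect the file formats or code used in the analysis.

```python
def is_plzf_target(gene, peaks, window=2000):
    """True if any peak overlaps the gene body or lies within `window` bp of the TSS.

    gene: dict with keys "chrom", "start", "end", "strand" ("+" or "-").
    peaks: iterable of (chrom, start, end) tuples.
    """
    # The TSS is the 5' end of the gene: start on the + strand, end on the - strand.
    tss = gene["start"] if gene["strand"] == "+" else gene["end"]
    # Gene body and TSS window together form one contiguous interval.
    lower = min(gene["start"], tss - window)
    upper = max(gene["end"], tss + window)
    return any(
        chrom == gene["chrom"] and start <= upper and end >= lower
        for chrom, start, end in peaks
    )
```

For a plus-strand gene spanning 10,000-20,000, a peak at 8,500-8,600 counts as a target hit (within 2kb upstream of the TSS), while a peak at 21,000-21,100 does not.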

Data accessibility
Our RNA-seq data is available at GEO with accession number TBD. Processed expression data and innateness gradient associations can be viewed using an interactive browser at https://immunogenomics.io/itc.

Flow cytometry and cell sorting
For immunophenotyping, Ficoll-isolated (GE Healthcare) PBMCs were prepared within 2 hours of a blood draw performed between 8 and 10 AM after overnight fasting and were stained, with data acquired the same day. For sorting, freshly isolated PBMCs from donors in whom each cell type reached a frequency of at least 0.1% were processed in accordance with the ImmGen standard operating procedure (76,77).

qPCR analysis
For 47S rRNA quantification, cells were sorted directly into RLT buffer (Qiagen) before RNA extraction (Qiagen, RNeasy). Primers were designed to span the first rRNA processing site using the following sequences: forward: GTCAGGCGTTCTCGTCTC, reverse: GCACGACGTCACCACAT. HPRT was used as a housekeeping control (forward: CGAGATGTGATGAAGGAGATGG, reverse: TTGATGTAATCCAGCAGGTCAG). qPCR was performed using the Brilliant III Ultra-Fast SYBR QPCR Master Mix (Agilent), read on a Stratagene MX3000P system.

Ribopuromycylation studies
To assess ribosomal activity, we adapted a microscopic technique, ribopuromycylation (29) for use by flow cytometry. Puromycin was added for 5 min in the presence of emetine (100 µg/ml), followed by fixation with 4% paraformaldehyde, permeabilization with BD Perm/Wash, and staining with an antibody that recognizes puromycin (EMD Millipore).

Supplementary Materials
Fig. S1. Innate T cell population frequency associations with age and covariance.
Fig. S9. Individual innateness metric. Innateness metric calculated per individual by integrating the immunoprofiling data with the innateness gradient rank per cell type. Specifically, we summed the abundance per cell type (proportion of T cells) multiplied by the rank of that cell type in the innateness gradient. We then regressed out the age effects. Plotted are the residuals of this regression.
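The per-individual innateness metric described in this legend can be sketched as a rank-weighted sum followed by a simple regression on age. The set of cell types included, the dictionary keys, and the use of ordinary least squares with age as the only covariate are assumptions made for illustration; the ranks are those of the innateness gradient given in the Results.

```python
# Gradient ranks of the four immunophenotyped ITC populations (labels are illustrative).
ITC_RANK = {"MAIT": 3, "iNKT": 4, "Vd1": 5, "Vd2": 6}


def innateness_metric(proportions, rank=ITC_RANK):
    """Sum over cell types of (proportion of T cells) x (gradient rank)."""
    return sum(proportions[cell] * rank[cell] for cell in proportions)


def residuals_after_age(ages, metrics):
    """Residuals of an ordinary least-squares regression of metric on age."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_metric = sum(metrics) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    slope = sum((a - mean_age) * (m - mean_metric) for a, m in zip(ages, metrics)) / sxx
    intercept = mean_metric - slope * mean_age
    return [m - (intercept + slope * a) for a, m in zip(ages, metrics)]
```

As a usage example, the mean frequencies reported in the Results (MAIT 2.4%, iNKT 0.09%, Vδ1 1.25%, Vδ2 4.7%), expressed as proportions, give a metric of about 0.42.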