Main

The immune system possesses immense individual-to-individual diversity. Immunity is intrinsically variable, because it is controlled by the most polymorphic genes and is shaped by highly sensitive environmental sensors that are capable of pushing immunity into myriad functional configurations. These functional configurations are the intrinsic biases toward particular immune response types. Thus, while most healthy humans have the capacity to turn on type 1 helper T cells (TH1 cells), TH2 cells, TH17 cells, type I interferons, inflammasome activation, and a multitude of other states, individuals differ in the degree to which they are primed for each functional configuration1,2,3. These interindividual differences are relatively low at birth4 but continually expand as we age and encounter new environments1,5, and are both stable and robust to perturbation2,3. The origin of this diversity is rooted in our evolutionary pasts, with genes that control immune traits being among the most divergent in archaic genomes6. In modern humans, a diverse range of immune-associated disorders reflects the clinical consequences of this diversity in immunological states. This diversity also provides challenges for more successful application of immunotherapeutic strategies, initially developed against cancer or autoimmunity, that hold wider potential for other clinical conditions. For example, despite their widespread success in treating 21 types of cancer, checkpoint blockade inhibitors induce adverse immune effects in up to 50% of people who receive combination therapy7. A better understanding of the reasons driving such variability could help define more precise medical strategies.

The immune system is possibly unique in the advantages that variation can confer. The Red Queen hypothesis, an evolutionary arms race between competing species (Fig. 1a), runs into a generational time asymmetry when considering the evolution of pathogens and hosts (Fig. 1b). Rather than unsustainable convergence toward a homogenous state of infection–resistance, evolution has selected for maintenance of immune diversity as a protective mechanism (Fig. 1c). When potential pathogens can rapidly specialize to take advantage of a fixed niche, an evolutionary advantage can be gained from possessing an immune system wired into a functional configuration that is different from that of the prior host (Fig. 1c,d). Beyond the immune system, a single holotype can be considered optimal per environmental condition, with diversity representing a divergence from the optimum. In the immune system, by contrast, diversity is generally a beneficial feature (although some loci, such as the Toll-like receptor (TLR) loci, are subject to intense purifying selection8), with increased immune divergence from the prior host predicted to benefit the newly infected individual.

Fig. 1: Immunological diversity as an evolutionary strategy.

a, In a simplified single-trait Red Queen scenario, a predator–prey relationship based on speed drives ever-increasing speed in both the predator (red density curve) and prey (blue density curve), with the fast end of the predator bell curve requiring an overlap with the slow end of the prey bell curve. b, In a host–pathogen relationship, the asymmetry in population growth potential and generation time would allow the pathogen to rapidly specialize to counter any single-trait immune response. c, Representation of a human population with immune diversity, including individuals biased toward different immune responses. Individual X is immunologically more similar to individual Y than they are to individual Z. These distances also relate to the magnitude of the reoptimization cost that a pathogen will experience on moving from host X to either host Y or host Z. d, Evolutionary trajectories of pathogens in their optimization toward two example immune biases. The green trajectory represents optimization in a host population in which all individuals are biased toward TH17. The purple trajectory represents optimization in a host population in which all individuals are biased toward type I interferons (IFN). The blue trajectory represents optimization in an immunologically diverse host population.
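The distance idea in panel c can be made concrete with a small sketch. It assumes, purely for illustration, that each individual's immune configuration is a vector of bias scores and that the Euclidean distance between two vectors stands in for the pathogen's reoptimization cost. The axes, the scores, and the function name are hypothetical and are not taken from the figure.

```python
import math

# Hypothetical bias scores along three axes (TH1, TH17, type I IFN).
# The values are illustrative only.
HOSTS = {
    "X": (0.8, 0.3, 0.2),
    "Y": (0.7, 0.4, 0.2),
    "Z": (0.1, 0.2, 0.9),
}


def reoptimization_cost(source: str, target: str) -> float:
    """Euclidean distance between two hosts' immune configurations.

    Used here as an assumed proxy for the cost a pathogen pays when it
    moves from a host it is adapted to (source) into a new host (target).
    """
    return math.dist(HOSTS[source], HOSTS[target])


# With these made-up scores, moving from X to Y is cheaper than X to Z.
print(reoptimization_cost("X", "Y"), reoptimization_cost("X", "Z"))
```

Any distance metric could be substituted; the point is only that a more divergent new host imposes a larger adaptation step on the pathogen.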

In this Review, we outline the current state of knowledge on the primary drivers of immunological variation. We concentrate on serological, cellular, and molecular aspects of immunity, as direct measures of immune variation, instead of clinical outcomes and other complex phenotypes, of which immune status is only one aspect. Likewise, this Review focuses on the drivers of immune variation that have a substantial impact at the population level — common genetic variants, intrinsic factors such as age and sex, and common environmental exposures. Of necessity, we omit rare drivers, such as rare genetic variants, except for the lessons they provide for a broader population-level understanding.

Genetic drivers of immune variation

Genetic variation is an important driver of immune variation. Studies of multigenerational families9,10 and twin pairs11,12,13 nevertheless highlight high variability in how much genetic variation contributes to different immune traits. The largest multigenerational study10 demonstrates a median heritability of 37% for immune variation, with a range between 0% and 79%, in line with most previous studies. Cytokine pathways, key drivers of immunity, are particularly heritable12,14,15. Genome-wide association studies (GWASs) allow the dissection of this aggregate genetic contribution to individual causal drivers.
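As a rough illustration of how twin data can be turned into a heritability estimate, the sketch below applies Falconer's formula, h2 = 2(rMZ - rDZ), a classical approximation based on trait correlations in monozygotic and dizygotic twin pairs. It is not necessarily the method used in the cited studies, and the function name and example correlations are hypothetical.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Estimate heritability from twin correlations (Falconer's formula).

    r_mz: trait correlation between monozygotic twin pairs
    r_dz: trait correlation between dizygotic twin pairs
    The raw estimate 2 * (r_mz - r_dz) is clamped to [0, 1], because
    sampling noise can push it outside that range.
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    raw = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, raw))


# Hypothetical correlations for a single immune trait (illustrative only).
h2 = falconer_heritability(r_mz=0.60, r_dz=0.40)
print(f"estimated heritability: {h2:.2f}")
```

Applied trait by trait, an estimate of this kind would yield the sort of wide per-trait spread described above.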

Extremely large GWASs have been performed to investigate basic hematological traits, and have identified >7,000 loci associated with such traits16,17. Colocalization of immune-modifying variants with risk variants for autoimmune diseases has been observed, suggesting clinical consequences16. These very large studies permit trans-ethnic analysis. For example, a missense variant in IL7 is associated with increased lymphocyte counts in South Asians, an effect that has been obscured in other populations17. The other end of the GWAS spectrum trades power for depth: six fine-detail immune-phenotyping GWASs have been performed on relatively small numbers of individuals9,10,11,18,19,20. Together, these GWASs have identified 95 loci associated with immune phenotypes, of which 28% have been reported by more than one study. Factors to consider when comparing studies are the design of the immune-phenotyping platform, the still-limited statistical power, and differences in ancestry. For example, 13% of associations in a Sardinian population10 have a twofold-higher frequency in this population than in the European cohort of the 1000 Genomes Project, where they are mostly at low frequency. Hence, more associations are expected to emerge and, as for hematological traits, analyses of different ancestries will be helpful.

The wealth of genetic data allows broad comparisons to be made. Associations have been predominantly with protein quantitative trait loci (pQTLs), rather than cellular traits. These pQTLs act mostly in trans, via regulatory genes19,21,22,23. This is in line with distal (trans) effects being an important contributor to variation of gene expression among immune cell types and being enriched for disease associations, even though most expression QTL (eQTL) studies so far have focused on cis effects24. Increasingly, physiological perturbations in specific cell types are being mapped and are assisting in the evaluation of the colocalization of immune trait GWASs and eQTL signals25. More recently, splice variation has been identified as an additional source of variation that is increased following immune activation26. Genetic determinants of alternative splicing are largely independent of those of gene regulation and are often context dependent or alter splicing only in the context of immune stimulation26.
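At its simplest, a QTL association of the kind discussed here is a regression of a quantitative trait (a protein level, a transcript level, or a cell frequency) on the number of copies of an allele each individual carries. The sketch below shows that core calculation using only the Python standard library. It is a minimal illustration and not the pipeline of any cited study: real analyses add covariates such as age, sex, and ancestry, and correct for multiple testing. The function name and the data are hypothetical.

```python
from statistics import mean


def qtl_effect(dosages, trait):
    """Ordinary least-squares fit of trait ~ intercept + beta * dosage.

    dosages: per-individual count of the alternate allele (0, 1, or 2)
    trait:   per-individual quantitative measurement (e.g. a protein level)
    Returns (beta, r_squared): the per-allele effect and the fraction of
    trait variance explained by the variant.
    """
    if len(dosages) != len(trait) or len(dosages) < 3:
        raise ValueError("need matched inputs with at least 3 individuals")
    mx, my = mean(dosages), mean(trait)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in trait)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, trait))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and trait must both vary")
    beta = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return beta, r_squared


# Hypothetical genotypes and protein levels for six individuals.
beta, r2 = qtl_effect([0, 0, 1, 1, 2, 2], [1.0, 1.2, 1.9, 2.1, 3.0, 3.2])
print(f"per-allele effect: {beta:.2f}, variance explained: {r2:.2f}")
```

The calculation is the same for cis and trans effects; what differs is whether the tested variant lies near the gene encoding the measured product or elsewhere in the genome.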

The human leukocyte antigen (HLA) locus, the most genetically diverse in the human genome, maintained by balancing selection27, is worth additional attention. This locus has been reproducibly associated with multiple immunological disorders, both infectious and autoimmune, often with the strongest effect size28. These associations may be driven by variation in permission of clonal HLA–peptide–TCR/KIR interactions; however, a role for variation in HLA expression level and stability is being increasingly inferred as a peptide-independent basis for association28. GWASs for healthy, baseline immune phenotypes have revealed several independent associations within the HLA locus, including a known type 1 diabetes HLA-DRB1 risk allele that impacts HLA-DRB1 surface expression in innate immune cells20, and variants correlated with surface expression of costimulatory molecules in adaptive immune cells10. HLA has an even more prominent role in the genetic control of immunoglobulin levels, accounting for 19% of known associations and including autoimmune disease–associated alleles29. On the basis of the biology of HLA variation, it is expected that future GWASs of immune responses at the clonal level will identify stronger effects, accounting for the powerful associations exhibited with disease.

Age and immune variation

Age is one of the most potent drivers of immune variation, driving a shift toward systemic inflammation and away from naive lymphocyte phenotypes1,3,5. Age correlates with multiple immunological parameters and also amplifies the degree of variation present, a result of direct effects of age as well as increased environmental exposures over time1,12. Age-related immune variation can be due to cumulative effects of immunosenescence, such as lower activity of hematopoietic stem cells30,31, altered lineage differentiation32, thymic involution33, attenuated antiviral responses34, leukocyte attrition, or mutation accumulation. Key genetic mutations with increased age include both somatic mutations and mosaic chromosomal alterations (mCAs), such as deletions, duplications, and loss of heterozygosity. A recent large study described how expanded mCA clones increased with age, were associated with altered leukocyte numbers, and showed significant associations with infections35. Intriguingly, this process may act synergistically with genetic variation, as GWASs have identified multiple variants in DNA-damage-repair pathways associated with mCAs36. Likewise, postzygotic somatic mutations increase with age and can impact immune responses, with somatic variants frequently observed in the CD8+ T cells of people with multiple sclerosis or rheumatoid arthritis, and noncoding somatic variants that act as eQTLs have been described in cancer37. The relative contributions of these factors largely remain to be dissected, and it is likely that many age-associated immune changes are driven by combinations of factors or through secondary changes that occur during aging, such as increased inflammation or pathogen exposure — for example, cytomegalovirus infection, which impacts multiple immune phenotypes12. It is also important to consider cumulative environmental exposures as potential confounders. 
Few detailed immune-phenotyping studies have been run outside of countries with historic economic privilege, and even within such countries, large-scale changes in environmental exposure separate older participants from younger participants. Well-established age-associated changes may therefore reflect infectious and environmental history, rather than purely reporting on the direct effect of age itself.

Sex and immune variation

The effect of sex on immune variation is most apparent at the clinical level. There are differences between women and men in risks for immune-related diseases, including those linked to autoimmunity and viral infection. At the level of immune parameters, the most consistently observed associations with sex are altered baseline cellular traits and differential responses to vaccinations38. A recent study also found that 47% of HLA class I and II genes showed differences between sexes in expression following stimulation with lipopolysaccharide (LPS), a far greater rate than the genome average39. While much of the effect of sex is unexplained, the best-studied drivers for this variation are sex chromosomes and sex hormones.

With an abundance of immune-related genes present on the X chromosome, the differential allosome allocation can explain many sex-associated immune differences. Several studies have shown biallelic expression of the X-encoded TLR7 in females, which results in higher levels of type I interferon induction by plasmacytoid dendritic cells and a greater propensity for immunoglobulin G (IgG) class switching in B cells40,41. By contrast, analyses of whole-blood transcriptional responses to LPS found that the vast majority of X-linked genes are commonly induced in both males and females, including the majority of genes known to escape X inactivation39. TLR7 may thus represent a unique case, rather than a general rule. While the Y chromosome is gene poor, there is evidence that mosaic loss of chromosome Y (LOY) contributes to immune variation. LOY is the most common postzygotic mCA in leukocytes and is associated with earlier mortality and morbidity in men. LOY is higher in innate cells than in adaptive cells and is associated with large-scale transcriptional dysregulation42. Whether these associations are direct and causal or reflect a common underlying mechanism requires further study.

Sex hormones are the other major source of potential sex-associated immune differences. Antibody responses to influenza vaccination positively correlate with plasma estradiol concentrations in females43 and negatively correlate with plasma testosterone in young males43,44. Studies of sex hormones are complicated by the intersection with age, as estradiol, progesterone, and testosterone levels fluctuate throughout life. It is therefore important to consider that immune variability may be specific to certain developmental periods in life, and that age effects may often be nonlinear, as was reported for differential immune response specifically to influenza H1N1 stimulation45 and vaccination46.

While substantial immune variation is associated with sex, almost no data are available to discriminate between the impact of sex and that of gender within these studies, and effects attributed to sex may, at least in part, be causally driven by gender. A large-scale blood transcriptomics analysis found sharply divergent effects of sex in an urban environment versus a rural environment in Morocco47. Immune differences between male and female participants were strongly amplified in the rural setting, with traditional gender roles altering environmental exposures, compared with those in the less gender-segregated urban setting47. This result implies that the primary effect observed was based on gender rather than sex, although the topic requires focused research. To definitively assign causation to sex-associated immune variation, future systems-immunology studies will need to be actively inclusive; engagement of immigrant and transgender populations might provide valuable insights into the sex versus gender effects. The strong correlation of sex with immunological clinical outcomes demonstrates the urgent clinical need of understanding the basis for this immune variation and improving the tailoring of medical strategies to these differences.

Environmental drivers of immune variation

Adequate nutrition is essential for a functioning healthy immune response. Severe nutritional deficiencies lead to immune defects, in particular in children48. However, despite many claims, it remains unclear how normal dietary variation directly affects immune responses, or whether diet mediates its effects through indirect complex effects, such as changes in the microbiome (discussed below), body weight, or associated inflammatory effects. As an example, a recent systems-immunology study identified food-derived metabolites as potential drivers for an urban–rural divide in immune configurations49, although environmental exposures might be confounders. Randomized placebo-controlled studies are therefore essential for assessing the direct effect of diet on immune function. A recent meta-analysis of 18 such studies of probiotic supplementation found only limited effects on immunity in healthy adults50, while a small-scale study identified fermented foods as reducing inflammatory markers51. Despite this, there is evidence for immunomodulatory roles of key dietary components. For example, iron deficiency is widespread in infants from lower-income countries, and iron has been shown to be important for the development of T cell and B cell responses. A retrospective study of iron deficiency in infants found that iron supplementation improved responses to measles vaccination52. With supporting evidence from in vitro studies53 and people with mutations in TFRC, which encodes transferrin receptor 1 (ref. 54), it is clear that iron bioavailability is necessary for efficient B cell responses. For other dietary components, evidence exists for the importance of dietary salt, specifically for TH17 immunity. High-salt diets have been associated with increased risk of autoimmunity55 and reduced mitochondrial function in monocytes56. 
In vitro and mouse experiments have identified increased TH17 cells as a potential mechanism linking high salt to autoimmunity57, supported by TH17 deficiencies in people with salt-losing tubulopathies58 and small-scale challenge experiments59. Finally, vitamin D is important for many physiological processes, and the use of placebo-controlled, randomized interventional studies is helping to identify which immune pathways are affected60. Despite these examples, and more, it remains unclear whether the relationships between nutrition and immunity are continuous, with population-level effects driven by different diets, or whether they exist only at the extreme ends of the spectrum, such as in iron deficiency or very-high-salt diets. Similarly, for body weight, while a moderate body-mass-index range has not been observed to be associated with multiple immune traits20, individuals with obesity have been reported to be deficient in natural killer cell numbers and function, driven by the lipid-rich environment61.

Environmental exposures are potential drivers of immune variation, with particulate matter in pollution and industrial chemicals found in food and our domestic and work environments capable of driving immune deviation62. Evidence for the immune-altering capacity of environmental exposure is strongest in the downstream clinical manifestations, where, for example, air pollution and industrial-chemical accidents are linked to inflammatory diseases63,64, and farm-animal exposure protects from asthma65. Combined with animal-exposure models and in vitro studies66, variation in exposure to pollutants and chemicals is likely to drive strong immune divergence. An example of the mechanistic link being directly made is the role of aryl hydrocarbons in promoting TH17 responses. Following the identification of the molecular basis67, positive correlations have been found between levels of particulate-matter air pollution and circulating CCR6+ T cells prone to TH17 polarization68. While the latter population association was performed in people with multiple sclerosis, it provides a clear proof of principle that variable environmental exposures can contribute to the immune configurations observed within a population. Challenge studies can also be used, such as replicating urban diesel-exhaust exposure in a healthy cohort, demonstrating the resulting elevation of IgE-mediated responses69. Exposure-elimination studies70 provide the reverse design. Among the most potent environmental exposures, due to its ubiquity, is cigarette smoking, which drives an inflammatory and possibly autoimmune state, with changes in many leukocyte cell types20. Understanding the effects of smoking is complicated by the over 4,000 toxic substances present. A better understanding of these complex effects may be provided through comparative studies with ex-smokers as well as individuals exposed to passive smoking, e-cigarettes, or nicotine products. 
While the effects of individual environmental exposures need to be explored, the net aggregate is potentially extremely potent, and may account for such findings as a halving of immune variation in cohabitating couples3.

The human microbiome has the potential to be a major contributor to immune variation. The microbiome exhibits an extraordinary degree of interindividual compositional diversity71. The presence of the microbiome influences immune development and function, with infant microbial colonization modifying immune development4,72, and adult broad-spectrum antibiotic treatment truncating vaccine responses73. Among the strongest evidence for microbiome variation influencing immune states is the association with diseases. Shifts in microbiome composition are observed in different inflammatory and immunological diseases. While such associations may be correlative in nature, microbiome ‘transplants’ have demonstrated partial success, consistent with a causative relationship. Fecal microbiota transplantation following Clostridium difficile infection, the most promising condition for microbiome transplantation74, is accompanied by reduced inflammatory parameters75. Isolated gut-microbiome species have been shown to promote immune biases toward TH1 cell76, TH17 cell77, or regulatory T cell (Treg cell) differentiation78 when transferred into mice with limited or no resident microflora. Parallel shifts in microbiome and immune profiles have been observed in infants4,72 and the posthematopoietic stem-cell transplantation setting79, and microbiome supplementation in these settings alters the immune system72,80,81. Together, these data strongly suggest that, at least in individuals with dysbiosis or limited microbiome diversity, introduction of new species does modify the immune status.

Of the diverse mechanisms by which variation in the microbiome may influence host immune variation, several have been experimentally validated. First, commensals provide antigenic targets82. While the impact of such responses will be diluted above the clonal level, the immune response at the interface tissue is likely to be profoundly shaped by these interactions, as has recently been demonstrated in the clonal dominance in tissues observed after vaccine challenge83. Second, the microbiome produces bioactive metabolic products, which can shape the host immune system. For example, short-chain fatty acids are implicated in the ability to induce Treg cells in mice78, and tryptophan metabolites reduce interferon-γ responses by human cells in vitro84. The key unknown is the extent to which these immunomodulatory properties are amplified or negated across the microbiome variation observed in the healthy human population. If independence in microbial colonization efficiency occurs, these effects would drive a spectrum of immune-modifying microbiomes, increasing variance in immune configurations. Conversely, interdependency in microbial colonization may result in niche substitution, limiting the overall impact of microbiome diversity on immune variation (Fig. 2). Notably, the initial immune variation among infants converges as the microbiome stabilizes, independently of gestational age85, suggesting there is some degree of interdependency in the effects of colonization4. Large-scale studies with parallel systems immunology and microbiomics, such as one identifying that up to 10% of the variation in cytokine responses could be accounted for by the microbiome84, are needed to answer such questions.

Fig. 2: Simplified models for interaction of microbiome diversity and immune variation.

a, Within the human microbiome is a large diversity of commensal species, some of which will have the potential to drive immune phenotypes in different directions (illustrated with color). b, A hypothetical principal component analysis of immunome-active commensal species, based on their impact on the human immune system. Clusters illustrate different microbiome species that have similar immune-shaping properties. c, Under a model of colonization independence, individuals will vary in the relative representation of different commensals. When only a single axis of Treg cell–biasing and TH1 cell–biasing commensals is considered, individuals with a microbiome rich in an immune-shaping species set will have more extreme immune variation. The effect is a net expansion of immune variation across the human population. d, Under a model of colonization interdependence, microbiome diversity is driven by species substitution. Loss of one immune-shaping commensal provides a niche that is filled by other commensals with similar immune-shaping potential. The net effect on immune variation across the human population is negligible under this model.
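The contrast between panels c and d can be sketched as a toy simulation. It assumes a deliberately simplified setting in which every commensal species contributes either +1 (TH1-biasing) or -1 (Treg-biasing) to a single immune axis, and species with the same effect are grouped into clusters. Under independence, each species colonizes or fails to colonize on its own; under interdependence, each cluster's niche is always filled by exactly one of its members. The cluster layout, species labels, colonization probability, and function names are all hypothetical.

```python
import random
from statistics import pvariance

# Hypothetical commensal clusters: species within a cluster share the same
# immune-shaping effect (+1 = TH1-biasing, -1 = Treg-biasing).
CLUSTERS = [
    {"effect": +1, "species": ["a1", "a2", "a3"]},
    {"effect": +1, "species": ["b1", "b2", "b3"]},
    {"effect": -1, "species": ["c1", "c2", "c3"]},
    {"effect": -1, "species": ["d1", "d2", "d3"]},
]


def bias_independent(rng, p_present=0.5):
    """Colonization independence: each species is present with probability
    p_present, and the effects of all present species add up."""
    return sum(
        cluster["effect"]
        for cluster in CLUSTERS
        for _ in cluster["species"]
        if rng.random() < p_present
    )


def bias_interdependent(rng):
    """Colonization interdependence: each cluster's niche is filled by
    exactly one of its species (niche substitution)."""
    total = 0
    for cluster in CLUSTERS:
        rng.choice(cluster["species"])  # the occupying species varies...
        total += cluster["effect"]      # ...but its immune effect does not
    return total


def immune_variance(model, n_individuals=1000, seed=0):
    """Population variance of net immune bias under the given model."""
    rng = random.Random(seed)
    return pvariance([model(rng) for _ in range(n_individuals)])


print("independence:   ", immune_variance(bias_independent))
print("interdependence:", immune_variance(bias_interdependent))
```

In this toy setting the interdependent model gives exactly zero variance in immune bias even though individuals carry different species, whereas the independent model spreads individuals along the axis. Real microbiomes would presumably sit somewhere between these two extremes.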

The balance of probabilities suggests that microbiome variation between individuals contributes to the observed immune diversity. The magnitude of this contribution, however, remains to be determined. The responsiveness of the microbiome to diet, environmental exposure, age, and sex86,87 creates problematic confounders. Even in the case of GWASs, in which associations found with the microbiome show some overlap with immunological loci, reproducibility has been problematic, and the direction of causality between immune traits and microbiome traits associated with the same loci has not been established88. In each case, it remains possible that microbiome changes are largely bystander correlations, responding in parallel with the immune system to the causative drivers. Equally, the microbiome may be the nexus that integrates many of the associated variables and provides the direct causative mechanisms underlying immunological variation.

Variation during immune reactions

As an emerging field of research, systems immunology has concentrated largely on understanding the nature of variation in the healthy baseline state. Beneath this baseline variation, however, lies a hidden layer of immune variation, present only during immune responses. Indeed, it is precisely the outcome of immune challenge that has shaped the evolution of our immune system. Systems-vaccinology studies are the key approach to investigating the perturbation of the immune system in a controlled manner, with influenza vaccination being among the most extensively studied. Variation in the production of protective antibodies following vaccination has been correlated with high plasmablast activity within a week after vaccination89,90. The presence of CD38+ B cell subsets at baseline seems to be a strong predictor across diverse cohorts and studies89,91, with specific gene-expression signatures89,91,92,93. Variation in the early activation of the interferon pathway has been reproducibly associated with elevated antibody production in the later response89,90,91,94, a finding consistent across multiple different vaccines91,94,95,96. These positive innate responses can also be found at baseline and are mediated mainly by dendritic cells and plasmacytoid dendritic cells91,97. Intriguingly, a similar interferon signature correlates with prediction of clinical flares of systemic lupus erythematosus91, suggesting a common basis for an immunological variation that may have beneficial or detrimental consequences depending on the activation context.

A strong case study for the utility of systems-vaccinology approaches is the application of these lessons to understanding immune differences with age. The functional correlation between early interferon responses and antibody production in healthy young individuals may suggest the causality of poor immune response after vaccination in the aged population90,92. The aged population, in comparison with younger individuals, has an innate cell compartment skewed toward an inflammatory signature but away from type I interferons90,93,98,99. Indeed, the addition of an interferon-stimulating adjuvant to an influenza vaccine substantially improves the production of antibodies in older people100,101. Studying natural variation between good responders and poor responders therefore identifies target pathways that can be exploited for improving clinical outcomes in the poor responder population.

Although vaccines constitute the best-controlled systems-immunology challenge context, experimental and natural infections provide the most physiologically relevant one. The SARS-CoV-2 pandemic has provided the most intensively studied natural infection context, with a multitude of systems-immunology studies identifying immunological variations associated with protection from severe infection. Clonal cross-reactivity explains a proportion of protection102,103; however, other factors have been linked to differential immunological response configurations. Cytokine bias, driven by an altered myeloid compartment104,105,106, appears to significantly alter the risk of severe disease, with type I interferons being protective and IL-6 or TNF being detrimental104,105,107. In the lymphoid compartment, susceptibility to severe disease is associated with elevated activation and clonal expansion of CD8+ T cells104,108,109. The contribution of this variation in immune responses to clinical outcomes is emphasized by the predictive nature of early immune configurations for later pathology109,110. While some component of the association between immune status and clinical outcome may be shared across infections, especially those sharing infectious modalities111, other associations are likely to differ. An example of the latter is the association of CD4+ T cell responses with parasitemia restraint in controlled malaria infections112. The utility of systems immunology in understanding clinical outcomes in SARS-CoV-2 infections is likely to drive the uptake of this approach to other infectious diseases. We anticipate that different immune configurations will each provide either beneficial effects or detrimental effects, depending on the infectious challenge.

Regulatory QTL mapping during dynamic processes such as the response to immune stimuli can reveal otherwise hidden regulatory variation that may be particularly relevant for disease. As reviewed elsewhere24, immune cell activation reveals a large number of trans-acting response eQTLs, which are highly cell type specific. The response eQTLs identified in the context of pathogen sensing can have large effect sizes, and show a stronger enrichment for immunological disease associations and evidence of recent selection than do standard, steady-state eQTLs24,45,113. Transcriptional variation is to a large extent buffered at the protein level, and the first GWASs for pQTLs after in vitro stimulation are emerging14,114,115. Significantly increased interindividual variability in cytokine response has been observed following stimulation115, and is under strong genetic control and typically pathogen centered rather than cytokine centered14,114,115. As in the previously mentioned baseline example, nearly all pQTLs are trans effects14,114,115, and they are enriched for infectious-disease associations14,114,115. Human genetic determinants following in vivo encounters have also been studied. Several GWASs have been performed for response to vaccination, although further replication is needed before firm conclusions can be drawn, and studies are increasingly being performed on responses to infections116. While associations with clinical outcome have been reviewed elsewhere, the response to SARS-CoV-2 is worth noting, as the COVID-19 pandemic provided a recent opportunity for infectious-outcome GWASs that are orders of magnitude larger than previous studies. These studies have identified 13 genome-wide significant loci with relatively large effect sizes associated with infection or clinical outcome117,118,119. Very few studies, however, have included immunological phenotypes during an in vivo infection-response GWAS. One well-known early example was the association of interferon-λ4 expression with clearance of hepatitis C virus. In a genome-to-genome analysis, this host variation was subsequently demonstrated to drive viral polymorphism and viral load120. Here, again, the translational potential exists to use the correlation between immune variation and infectious outcome as a rational guide to therapeutic intervention.

The continuous evolution of immune variation

The evolutionary advantages of immune variation are imprinted upon our genomes. From human origins to modernity, the signature of selection for immunological genetic variants is evident. Among the most fascinating historic selection events is the archaic introgression of variants from Neanderthal and Denisovan genomes into those of modern humans. Neanderthal ancestry accounts for 2% of the genome of Eurasians, while Denisovan ancestry represents <1% of the genome in East and Southeast Asians and up to 6% in some Oceanian populations. Despite the widespread signature of purifying selection against archaic alleles, selection acting on advantageous archaic introgressed segments may have increased their frequency8. Introgressed Neanderthal and Denisovan loci are enriched for innate and adaptive immunity genes121,122, and include the HLA and immunoglobulin regions123,124. RNA viruses appear to have been an important driving force of such positive selection, with up to 30% of high-frequency Neanderthal introgressions being selected in response to viruses125. Between 46% and 65% of alleles in introgressed haplotypes represent ancient genetic diversity lost in the out-of-Africa bottleneck, and for 70% of these alleles reintroduced in Eurasians, the ancestral form is still present in Africans126, demonstrating the selective pressure for maintaining genetic diversity in immunity.

Evidence for functional consequences of introgressed alleles is accumulating. The example of FCGR2A (which encodes an immunoglobulin receptor) underscores that alternative splicing is also important in the response to infection. An introgressed variant, in linkage disequilibrium with immune-trait-associated signals20, increases protein-coding transcripts and phagocytosis after stimulation by LPS26. The role of introgression in shaping the genetic architecture of COVID-19 is notable. The major risk factor for severe COVID-19, almost doubling the risk, is a haplotype block encompassing the genes encoding the chemokine receptors CXCR6 and CCR9 (refs. 117,118,119). This haplotype was introduced by archaic introgression124 (Fig. 3). The high frequency of this Neanderthal haplotype in Europeans (8%) and South Asians (30%) suggests that the variant previously underwent positive selection124,127, raising the possibility of oscillating selection with changing infectious exposures over time. Alternatively, the striking difference in frequency between South Asia and East Asia, where the variant is largely absent, has been suggested to result from negative selection, perhaps from coronaviruses or other pathogens127. A different Neanderthal haplotype block, at the OAS1–OAS3 cluster, is protective against severe COVID-19 in Eurasians117,118,119, producing an OAS1 splice form with higher enzymatic activity128. The ancestral variant, still present in Africans, appears to confer a similar magnitude of protection against severe COVID-19, underscoring the value of the reintroduction129.
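
As a rough illustration of what the haplotype frequencies quoted above imply at the level of individuals, the sketch below converts a haplotype frequency into the expected fraction of people carrying at least one copy. It assumes Hardy–Weinberg proportions and treats the quoted 8% and 30% as haplotype (allele) frequencies; both are assumptions made here for illustration, and the function name `carrier_fraction` is hypothetical rather than taken from any cited study.

```python
def carrier_fraction(haplotype_freq: float) -> float:
    """Expected fraction of individuals with at least one copy of a haplotype.

    Assumes Hardy-Weinberg proportions (an illustrative assumption):
    non-carriers occur at (1 - f)^2, so carriers occur at 1 - (1 - f)^2.
    """
    if not 0.0 <= haplotype_freq <= 1.0:
        raise ValueError("frequency must lie in [0, 1]")
    return 1.0 - (1.0 - haplotype_freq) ** 2


# Frequencies quoted in the text, interpreted here (assumption) as
# haplotype frequencies rather than carrier frequencies.
quoted = {"Europeans": 0.08, "South Asians": 0.30}
for population, freq in quoted.items():
    print(f"{population}: ~{carrier_fraction(freq):.0%} expected carriers")
```

Under these assumptions, a haplotype frequency of 30% corresponds to roughly half of individuals carrying at least one copy, which helps convey why a variant of modest frequency can matter at the population level.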

Fig. 3: Neanderthal ancestry impacts COVID-19 severity.

The major risk factor for severe COVID-19, which almost doubles the risk, is a Neanderthal haplotype block on chromosome 3 encompassing the genes encoding the chemokine receptors CXCR6 and CCR9 (tagged by rs35044562, red), which is especially common in South Asians. A different Neanderthal haplotype block, at the OAS1–OAS3 cluster on chromosome 12 (tagged by rs1156361, light blue), is protective against severe COVID-19 in Eurasians. This haplotype appears to have reintroduced the ancestral OAS1 splice variant rs10774671 (dark blue), still present as an isolated variant in Africans. Frequencies indicated by bars for 1000 Genomes Project populations: ACB, African Caribbean; ASW, African ancestry in the southwest United States; BEB, Bengali (Bangladesh); CDX, Dai Chinese; CEU, Northern and Western European ancestry in the United States; CHB, Han Chinese; CHS, Southern Han Chinese; CLM, Colombian; ESN, Esan (Nigeria); FIN, Finnish; GBR, British; GIH, Gujarati Indians in Houston (United States); GWD, Gambian Mandinka; IBS, Iberian (Spain); ITU, Indian Telugu in the United Kingdom; JPT, Japanese; KHV, Kinh Vietnamese; LWK, Luhya (Kenya); MSL, Mende (Sierra Leone); MXL, Mexican ancestry in California (United States); PEL, Peruvian; PJL, Punjabi (Pakistan); PUR, Puerto Rican; STU, Sri Lankan Tamil (United Kingdom); TSI, Toscani (Italy); YRI, Yoruba (Nigeria). Adapted population genomics map134: orange arrows indicate the major migrations of Homo sapiens after the out-of-Africa exodus; green arrows indicate some more recent migratory events. Approximate geographic areas of modern human populations presenting Neanderthal or Denisovan ancestry are shaded in light blue and yellow, with the Neanderthal ancestry observed in American populations reflecting their varying levels of European ancestry.

The evolution of immunity has not halted with modernity. While historical immune evolutionary processes were driven primarily by the burden of infectious disease, and in particular childhood mortality from infections, the shifting context of immune challenges still provides selective pressure. The plethora of immune-driven pathologies, each at risk of amplification by particular immune configurations, creates complex environment-dependent trade-offs. Genetic determinants of immune variation overlap known immune disease loci9,10,11,18,19,29,115. As examples, association signals for immunoglobulin levels overlap genes known to be involved in autoimmune diseases and immunodeficiencies29, monocyte-derived cytokine associations overlap infectious diseases, and T cell–derived cytokine associations overlap autoimmune diseases14,115. The nature of immune variation suggests that no variants will be unambiguously beneficial, with instead each immune configuration bias incurring both a benefit and a penalty, in a context-dependent manner. The arrival of an era of lower infectious-disease burdens, in particular through the hygienic control of fecal–oral transmission, childhood vaccination, and antibiotics, has not eliminated the selective pressure on the immune system, but has instead changed the relative benefit and penalty of particular variants. An example of this principle may be seen in the P1104A variant of TYK2, a uniquely broad immune-disease locus. Homozygosity for this variant confers susceptibility to severe mycobacterial disease, most frequently tuberculosis130. At the same time, P1104A homozygosity is associated with fivefold protection against multiple autoimmune diseases131 and changes in leukocyte frequency16,17 (Fig. 4a). The allele frequency declined from ~9% in prehistoric humans to ~3–4% in Europeans, among the strongest purifying selection observed in the human genome132 (Fig. 4b). The recent decline in tuberculosis prevalence has eliminated this association in Europe131,133 and has prevented further negative selection of the allele. However, the recent emergence of SARS-CoV-2 may expose the adverse functional consequences of TYK2 genetic variation once more. An intergenic variant in strong linkage disequilibrium with P1104A is associated with COVID-19 severity117,119. The effect of this allele therefore balances increased risk of certain infections with protection from autoimmunity in a temporally and spatially dependent manner.
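
To give a sense of scale for the TYK2-P1104A numbers, the following sketch computes the expected frequency of homozygotes at a given allele frequency and iterates a textbook single-locus model of selection against a recessive genotype. Hardy–Weinberg proportions, the purely recessive fitness model, the selection coefficient `s`, and the number of generations are all illustrative assumptions and are not values taken from the cited studies; the function names are hypothetical.

```python
def homozygote_fraction(q: float) -> float:
    """Expected fraction of homozygotes at allele frequency q (Hardy-Weinberg assumed)."""
    return q * q


def next_allele_freq(q: float, s: float) -> float:
    """One generation of selection against recessive homozygotes.

    Genotype fitnesses are assumed to be 1, 1 and 1 - s, giving the
    standard recursion q' = q(1 - s*q) / (1 - s*q^2).
    """
    mean_fitness = 1.0 - s * q * q
    return q * (1.0 - s * q) / mean_fitness


def simulate(q0: float, s: float, generations: int) -> list[float]:
    """Allele-frequency trajectory, including the starting frequency."""
    trajectory = [q0]
    for _ in range(generations):
        trajectory.append(next_allele_freq(trajectory[-1], s))
    return trajectory


# Homozygotes are rare even when the allele itself is fairly common.
for q in (0.09, 0.04):
    print(f"allele frequency {q:.0%}: homozygotes ~{homozygote_fraction(q):.2%}")

# Illustrative run only: s = 0.2 and 80 generations are placeholders.
trajectory = simulate(q0=0.09, s=0.2, generations=80)
print(f"start {trajectory[0]:.3f}, end {trajectory[-1]:.3f}")
```

Because selection in this model acts only on homozygotes, which are a small fraction of the population at these allele frequencies, an appreciable decline in allele frequency over a limited number of generations requires a substantial fitness cost to homozygotes. This is consistent in spirit with the strong selection described above, although the sketch is not a reanalysis of the published data.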

Fig. 4: Immune variation exerts context-dependent opposite effects on disease.

a, TYK2-P1104A (rs34536443) is a unique shared protective variant for more than ten autoimmune diseases, whereas it is the main common risk factor for tuberculosis in endemic regions and a variant in high linkage disequilibrium increases the risk of severe COVID-19. The degree of protection or risk is based on the published allele-dosage model130,131, GWAS EBI catalog (June 2021) and GenOMICC Consortium (release 6, 15 June 2021), with a recessive model exerting even larger effects for those diseases for which it has been tested. AITD, autoimmune thyroid disease; AS, ankylosing spondylitis; CD, Crohn’s disease; JIA, juvenile idiopathic arthritis; MS, multiple sclerosis; PBC, primary biliary cirrhosis; PS, psoriasis; RA, rheumatoid arthritis; SC, sarcoidosis; SLE, systemic lupus erythematosus; T1D, type 1 diabetes; TB, tuberculosis; UC, ulcerative colitis. b, The TYK2-P1104A variant originated around 30,000 years ago, with present frequencies in the indicated major 1000 Genomes populations. The variant has undergone strong negative selection in Europeans over the past 2,000 years, and this has been suggested to have co-occurred with the emergence of Mycobacterium tuberculosis as one of the most deadly diseases in Europe. Figure adapted from published results132, listing populations and frequencies from the 1000 Genomes Project. AFR, sub-Saharan African; EUR, European; SAS, South Asian; EAS, East Asian; kya, thousand years ago.

Conclusion

The genetic architecture of immune diversity and capacity for malleability lies in our evolutionary history. While the rate of natural-selection-induced genetic change is reduced, the changing environment is altering the physiological consequences of this archaic genetic variation. Systems-immunology and systems-vaccinology approaches have made great advances in recent years in elucidating the basic structure of this variation. Multiple key challenges still remain in dissecting causal mechanisms (see Box 1), with engagement of neglected populations and the joint application of techniques from genomics, population genetics, microbiomics, and environmental epidemiology being critical for further progress. Beyond the mechanistic understanding of natural variation lies the promise of identifying individuals with potentially pathogenic immune configurations and the use of therapeutics to reroute immunity into a healthy state. Natural variation in the immune system highlights the pathways that are amenable to large functional shifts after modulation. Understanding variation in response to environmental factors in particular, such as diet, microbiome, and environmental exposure, holds the promise of using simple environmental manipulations in a targeted manner to reroute an individual’s immune system toward a less pathogenic configuration. Although the advantages of personalized immune modification are manifold, they first require a baseline knowledge of the source of our individual differences.