Vaccines are at the leading edge of the great success story of evidence-based medicine. Through the twentieth century, medical and basic scientists working in both academia and industry developed the spectrum of familiar products that serve to minimize the toll levied by infections such as whooping cough, poliomyelitis and measles, particularly in the pediatric population. A way to define the key, early correlates of vaccination (the 'molecular' language that sets the initial parameters for later development of optimum protective immune memory) would be an enormously valuable tool, but its realization has thus far proven a considerable challenge. Being able to 'read out' such information could, for example, allow the development of rapid tests for determining whether a person has responded effectively to a vaccine within a short window after the initial dosing. This would be of particular value when priming health care workers or other first responders during an emerging outbreak and would also contribute to the goal of rational vaccine design by allowing the delineation of essential host response components that boost vaccine 'take'. In this issue of Nature Immunology, Pulendran and colleagues use a systems biology approach to build such a tool, a predictive algorithm of yellow fever vaccine immunogenicity1. Developed in 1937 at the old Rockefeller Institute, Max Theiler's 17D yellow fever vaccine is perhaps the most successful immunogen ever made in a laboratory2. Now, more than 70 years later, Pulendran and colleagues find that 17D induces a small number of early transcriptional shifts that have powerful predictive value for the amount of ensuing immune memory in humans.

Using the best available techniques at the time, early 'vaccinologists' were able to pick what might now be considered the 'low-hanging fruit' of immune intervention. These vaccines targeted mainly acute viral infections that induced sterilizing, antibody-based immunity in survivors. Replicating that immunity by priming high serum antibody titers was sufficient for broad protection in diverse populations. The challenges faced today are in some cases more complex: a spectrum of pathogens such as human immunodeficiency virus, hepatitis C virus, malaria-causing Plasmodium species and Mycobacterium tuberculosis that have consistently defeated scientists and constitute extraordinarily difficult targets for priming protective immunity. Focusing efforts only on parameters (such as measurement of serum antibody concentrations) that have proven so informative in the past has not led to the development of effective vaccines against these more complex disease problems. How can this situation be improved?

Pulendran and colleagues study two separate 17D vaccine trials in immunologically naive humans. In the first, samples were collected at intervals immediately after vaccination (days 1, 3, 7 and 21) and at a memory time point (day 60). The authors analyze data obtained with a spectrum of multiplex assays (including cytokine and transcriptome analyses) of peripheral blood mononuclear cells used immediately after isolation or after in vitro restimulation. Overall, their findings confirm work reported before on the diversity of innate responses activated by the yellow fever vaccine3. Like the vaccinia virus that protects against smallpox, the 17D yellow fever vaccine is a live attenuated virus. A greater breadth of 'downstream' responses has consistently been associated with live viruses than with inactivated viruses, as they target many innate pathways, including Toll-like receptors, Nod-like receptors and RIG-I-like receptors. Representatives of these families are found to be activated in 17D-inoculated humans, including Toll-like receptor 7, RIG-I and 'downstream' gene regulators such as interferon-regulatory factor 7.

Many experiments with mice have defined the function of such innate response pathways in determining the subsequent profile of adaptive immunity. What has not been clear is whether the amount of molecular activation shows direct, quantitative correlation with the magnitude of immune memory. Put plainly, does greater innate induction, as assessed by the higher transcription of interferon-regulatory factor 7 (or similar molecules), result in more antibody and T cell memory? To address that question, the authors use a sophisticated algorithm to determine predictive correlations between input (the early response transcriptome data) and output (either high antibody titers or CD8+ T cell numbers; both correlations are tested independently). The algorithm, discriminant analysis via mixed integer programming (DAMIP), 'trains' on a subset of the input data, then tests itself on the remaining data not in the initial subset. This iterative process produces a 'rule': a collection of gene 'signatures' that can be used to predict a particular outcome for CD8+ T cell memory. Through the generation of many random subsets of 'training subjects' and 'testing subjects', the rules are refined until a predetermined degree of success is achieved for each rule. For trial 1 in the study by Pulendran and colleagues, these rules have a 93% success rate for classifying subjects into high or low CD8+ T cell memory categories.
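The 'train on a subset, test on the rest, repeat over many random subsets' loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: a simple nearest-centroid classifier stands in for DAMIP (which is a mixed integer programming method and is not reproduced here), the expression data are synthetic, and every function name and parameter below is hypothetical.

```python
import random
from statistics import mean

def centroid(rows):
    """Per-gene mean of a list of expression vectors."""
    return [mean(col) for col in zip(*rows)]

def fit(samples, labels):
    """Nearest-centroid stand-in for DAMIP (NOT the authors' algorithm)."""
    return {c: centroid([s for s, l in zip(samples, labels) if l == c])
            for c in sorted(set(labels))}

def predict(model, sample):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda c: sqdist(model[c], sample))

def repeated_holdout(samples, labels, n_rounds=100, test_fraction=0.3, seed=0):
    """Average test accuracy over many random training/testing splits."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    n_test = max(1, int(len(idx) * test_fraction))
    accuracies = []
    for _ in range(n_rounds):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        train_labels = [labels[i] for i in train]
        if len(set(train_labels)) < 2:
            continue  # skip splits in which a class is missing from training
        model = fit([samples[i] for i in train], train_labels)
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return mean(accuracies)

def make_synthetic(n_per_class=10, n_genes=5, shift=2.0, seed=1):
    """Synthetic 'early transcriptome' data; no real measurements are used."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, offset in (("low", 0.0), ("high", shift)):
        for _ in range(n_per_class):
            samples.append([rng.gauss(offset, 1.0) for _ in range(n_genes)])
            labels.append(label)
    return samples, labels

samples, labels = make_synthetic()
print(f"mean held-out accuracy: {repeated_holdout(samples, labels):.2f}")
```

The point of the sketch is the evaluation loop rather than the classifier: accuracy is always measured on subjects that the model did not see during training in that round.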

The use of the same set of data in the 'training' and 'testing' phases raises the possibility that the correlations observed may be trial specific and that variation between independent experiments might render such rules useless. Pulendran and colleagues thus analyze a second, entirely independent vaccination trial with new subjects to verify the criteria derived the first time round. Applying the rules from trial 1 to trial 2 results in a success rate of at least 80% for all rules, with some reaching as high as 90%. Success here is determined by 'one-fold validation' that involves a simple review of each person in the trial, followed by application of the rules to determine if the person fits the classification. This high degree of validation confirms the robustness of the DAMIP protocol (Fig. 1). The authors further verify the DAMIP algorithm by reversing the protocol, using the data set from trial 2 as the input to generate new rules for subsequent testing of the data set from trial 1. Here the resultant rules have at least 90% efficiency for trial 2 and 73% efficiency when subsequently applied to trial 1. More importantly, many of the same genes that had formed the basis of the trial 1 rules are identified again for trial 2, which substantiates the fundamental reproducibility of the method.
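The cross-trial check described above (derive rules from one trial, score them on the other, then swap the roles) can be illustrated with a toy example. The sketch below assumes a one-gene threshold 'rule', which is far simpler than a DAMIP signature, and uses made-up expression values; the names learn_rule, apply_rule and accuracy are hypothetical.

```python
from statistics import mean

def learn_rule(expression, labels):
    """Learn a one-gene threshold rule (toy stand-in for a DAMIP rule)."""
    high = [x for x, l in zip(expression, labels) if l == "high"]
    low = [x for x, l in zip(expression, labels) if l == "low"]
    threshold = (mean(high) + mean(low)) / 2
    high_is_above = mean(high) > mean(low)
    return threshold, high_is_above

def apply_rule(rule, value):
    """Classify one subject as a 'high' or 'low' responder."""
    threshold, high_is_above = rule
    return "high" if (value > threshold) == high_is_above else "low"

def accuracy(rule, expression, labels):
    """Fraction of subjects whose predicted class matches the observed one."""
    hits = sum(apply_rule(rule, x) == l for x, l in zip(expression, labels))
    return hits / len(labels)

# Made-up early fold-change values for a single gene (illustration only).
trial1_expr = [0.2, 0.5, 0.4, 1.8, 2.1, 1.6]
trial1_lab = ["low", "low", "low", "high", "high", "high"]
trial2_expr = [0.3, 0.9, 1.4, 1.9, 2.4]
trial2_lab = ["low", "low", "low", "high", "high"]

rule_1 = learn_rule(trial1_expr, trial1_lab)  # trained on trial 1
rule_2 = learn_rule(trial2_expr, trial2_lab)  # trained on trial 2

print("trial 1 rule applied to trial 2:", accuracy(rule_1, trial2_expr, trial2_lab))
print("trial 2 rule applied to trial 1:", accuracy(rule_2, trial1_expr, trial1_lab))
```

Scoring a rule only on subjects from the other trial is what guards against trial-specific correlations; the accuracy of a rule on its own training trial is not informative in the same way.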

Figure 1: Application of DAMIP to yellow fever vaccine trials.

Naive human subjects were vaccinated with the 17D yellow fever vaccine and peripheral blood mononuclear cells were obtained at several time points shortly after vaccination and after memory generation. For the DAMIP analysis, vaccinees are categorized as either 'high' or 'low' responders on the basis of memory CD8+ T cell or antibody responses (Step 1). Next, the DAMIP algorithm is applied to a series of randomly assigned 'training' (no shading) and 'testing' (shaded) groups of vaccinees (Step 2) with derivation of key signatures on the basis of correlations between early transcriptome responses and their eventual memory outcomes (Step 3). These signatures are then tested for the original vaccine group and an independent trial (Step 4). Similarly, the outcomes of the second trial are used to generate signatures for testing of the original trial, with many of the same genes having predictive functions. Red lettering (bottom) indicates genes mismatched for responder and rule status. Hi, high; Lo, low.

As most successful immunization is associated with protective antibody responses, the authors apply the DAMIP algorithm to two vaccine trials, 'training' on one and testing on the other to develop predictive rules for high versus low long-term serum antibody concentrations. Again, the 'rules' successfully predict the outcome of the independent trial (100% for the signatures 'trained' on trial 1 and used to predict trial 2). The most notable result in this antibody analysis is that a single gene, TNFRSF17 (which encodes a receptor for the growth factor BLyS-BAFF), is present in all 15 signatures from either trial 1 or trial 2.
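Identifying a gene that recurs across signatures is a simple counting exercise. The sketch below assumes that each signature is available as a set of gene symbols; the three toy signatures and the placeholder names GENE_A, GENE_B and GENE_C are invented for illustration and are not the published signatures.

```python
from collections import Counter

def gene_recurrence(signatures):
    """Fraction of signatures in which each gene appears."""
    counts = Counter(gene for sig in signatures for gene in set(sig))
    n = len(signatures)
    return {gene: count / n for gene, count in counts.items()}

# Toy signatures (hypothetical); only TNFRSF17 is a name taken from the text.
signatures = [
    {"TNFRSF17", "GENE_A", "GENE_B"},
    {"TNFRSF17", "GENE_B"},
    {"TNFRSF17", "GENE_C"},
]

recurrence = gene_recurrence(signatures)
# Genes present in every signature.
universal = sorted(g for g, frac in recurrence.items() if frac == 1.0)
print(universal)
```

Genes that appear in every signature, or in most of them, are natural first candidates for mechanistic follow-up.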

The consistency of these predictive signatures across two trials for both CD8+ T cell and antibody responses to the 17D yellow fever vaccine raises the possibility that these rules or their components might have broad applicability for different types of immunogens designed to protect against diverse pathogens. Superficially, the experiments to test that hypothesis seem straightforward, and even if the same signatures are not found for other infectious diseases, the application of a similar analysis may identify the underlying mechanisms that promote varied determinants of immunogenicity. The difficulty with such studies in humans would be that there are few attenuated vaccines with which it could be guaranteed that adults have had no prior experience of the pathogen. The obvious exception is of course vaccinia virus, which has not been delivered to the population at large since the successful completion of the smallpox-eradication program in 1979.

In addition to providing a potential tool for the forward assessment of vaccine efficacy, the findings from this 'systems' approach will provide the starting point for the development of new hypotheses aimed at elucidating the parameters that control T cell population expansion and antibody production. It might not be unexpected, for example, that a receptor for BLyS-BAFF is crucial in promoting potent antibody responses. However, the key genes in the predictive signatures for CD8+ T cell responses are not the 'usual suspects' for inflammatory innate signaling pathways. Of 22 signatures, 16 include SLC2A6, which encodes GLUT6, a membrane protein that regulates glucose transport and glycolysis. Both lipid metabolism and glycolysis are known to be essential for T cell proliferation and effector function4,5. Another common component is EIF2AK4, which is associated with regulation of the translation factor eIF2α. Although innate activation is a necessary precondition for adaptive immune expansion, these results suggest that variations in CD8+ T cell quantity are the result of metabolic modulations in responding people.

As systems biology matures, such methods can be expected to expand the 'net' of immunological inquiry to include a broader scope of potential mechanisms and predictive possibilities. The increasing prominence of technologies that generate vast amounts of data on transcriptomes, proteomes and epigenomes requires a parallel advancement of analytic techniques that can sift through these data in an unbiased way, isolating significant correlates for mechanistic evaluation. The bases for successful intervention against a host of important infectious diseases probably already exist in many of these data sets. This new approach by Pulendran and colleagues offers one apparently robust protocol for such analysis. It is hoped that this will both facilitate and inform new vaccine design by providing an expanded spectrum of possible targets.