Model structure

We consider T cells (and CAR-T products) to comprise three functionally distinct cell populations: T memory cells (T_M), capable of long-term self-renewal and immunological memory; T effectors (T_E), responsible for target-mediated cell killing; and exhausted T cells (T_X), lacking both killing potential and proliferative capacity. An antigen-sensing toggle switch coordinately regulates the decision of memory cells to self-renew versus differentiate, the rate of effector proliferation, exhaustion and the rate of memory cell regeneration from effectors (Methods). This represents a conceptually simple yet biologically sound description of T cell function and regulatory control in response to immunological need, as determined by systemic antigen burden (Fig. 1a).

Fig. 1: An antigen toggle switch model of T cell regulation quantitatively describes pharmacokinetic/pharmacodynamic behavior of CR, PR and NR patient population response to Kymriah in CLL. a, Cartoon depiction of the model structure, comprising three populations of T cells: T memory cells (T_M), T effector cells (T_E1 and T_E2) and exhausted T cells (T_X), together with B cell tumors (B). Tumor cells express B cell antigen (B_A), which stimulates T cell proliferation and differentiation and inhibits the formation of T memory cells. b, We fit the model to published pharmacokinetics/pharmacodynamics profiles separated by response category (CR/PR/NR) from Fraietta et al.18 using PSO. Model fits (curves: mean of 12 parameter sets; dark shaded areas: middle 90%) agree with both CAR-T and B cell tumor dynamics over time (dots: mean data; light shaded areas: range of data) for each of the three prototypic populations. c, PCA plot of the logarithm of the best-fitting parameters colored by population. PC1 captures 35.3% of the variability, and PC2 captures 21.7% of the variability. d, Sorted PC1 coefficients suggest that TK50 (highlighted pink bar) and k_kill, μ_M and d_M (highlighted blue bars) are the largest sources of variation between CR and NR populations. These parameters correspond to cytotoxic potency, tumor cell lysis rate, memory cell proliferation and death rates, respectively.

Model parameterization: patients with CLL treated with Kymriah and grouped by response

We first sought to determine whether the mathematical description of T cell regulatory control could quantitatively capture characteristic CAR-T pharmacokinetic and tumor dynamic profiles and whether parameter estimates reveal anything about the biological underpinnings of clinical variability. Fraietta et al.18 reported mean pharmacokinetic and tumor dynamic profiles of patients with chronic lymphocytic leukemia (CLL) treated with Kymriah (CTL019, a CD19-targeted CAR-T), grouped by complete responders (CRs), partial responders (PRs) and non-responders (NRs). We digitized the data (mean ± s.d.) and used particle swarm optimization (PSO) to estimate model parameters characterizing the three population archetypes (Fig. 1b). Parameters were estimated 12 times per patient group. Although parameters are non-identifiable (Supplementary Information), the clinical data were captured with good accuracy (Supplementary Fig. 5).
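As an illustration of the estimation step, the following is a minimal sketch of a global-best particle swarm optimizer in plain Python. It is not the implementation used for the fits reported here: the inertia and acceleration constants, the swarm size, the box bounds and the toy objective `toy_loss` are all assumptions chosen for the example. In practice the objective would be the misfit between simulated and digitized pharmacokinetic and tumor profiles. Repeating the search from 12 seeds mirrors the idea of estimating parameters 12 times per group.

```python
import random


def pso(loss, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `loss` inside a box using a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [loss(p) for p in pos]           # personal best losses
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_loss(p):
    """Hypothetical stand-in for the model-versus-data misfit."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


# Twelve independent searches, one per random seed.
fits = [pso(toy_loss, [(-5.0, 5.0), (-5.0, 5.0)], seed=s) for s in range(12)]
```

Because the parameters are non-identifiable, repeated searches of this kind are expected to return different parameter vectors with similar losses, which is why an ensemble of fits rather than a single optimum is carried forward.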

Biological mechanisms differentiating CR, PR and NR populations

To decipher the biological mechanisms underlying the differing patient response profiles, parameter estimates from the three patient populations were first decomposed into principal components (PCs) (Fig. 1c). Note that the three populations form relatively distinct clusters in parameter space, wherein the x axis depicting PC1 (accounting for 35.3% of the variance) separates virtual patients by response, and the y axis depicting PC2 (accounting for 21.7% of the variance) separates CR and NR groups from PRs. Examining the coefficients of PC1 (Fig. 1d), the lowest value (associated with NR) is TK50 (cytotoxic potency of effectors), and the largest positive contributions (associated with CR) are memory and effector cell turnover (proliferation and death rates; μ_M, d_M and d_E2). That is, in responding patients, CAR-T effectors lyse target tumor cells much more efficiently, and both memory and effector cells cycle at a higher rate. These findings are consistent with local parameter sensitivity analysis (Supplementary Fig. 6).
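The decomposition can be sketched as follows: log-transform the parameter sets, center them, and extract the leading eigenvectors of the covariance matrix. This is an illustrative sketch only. It uses power iteration with deflation so that it runs on the standard library, and the synthetic `param_sets` (three groups of 12 sets with four parameters, only the first of which differs by group) are invented for the example and are not the fitted values.

```python
import math
import random


def pca_log_params(param_sets, n_components=2, n_iter=500, seed=0):
    """PCA of log10-transformed parameter sets via power iteration with deflation."""
    logs = [[math.log10(v) for v in row] for row in param_sets]
    n, d = len(logs), len(logs[0])
    means = [sum(row[j] for row in logs) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in logs]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    total_var = sum(cov[j][j] for j in range(d))
    rng = random.Random(seed)
    loadings, explained = [], []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the variance captured along v.
        eigval = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        loadings.append(v)
        explained.append(eigval / total_var)
        # Remove the component just found before searching for the next one.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return loadings, explained


# Synthetic example: the first parameter shifts by group, the rest are noise.
rng = random.Random(1)
group_shift = {"CR": 1.0, "PR": 0.0, "NR": -1.0}
param_sets = [
    [10 ** rng.gauss(shift, 0.1)] + [10 ** rng.gauss(0.0, 0.1) for _ in range(3)]
    for shift in group_shift.values() for _ in range(12)
]
loadings, explained = pca_log_params(param_sets)
```

In this toy setting the first loading vector is dominated by the one parameter that differs between groups, which is the same reading applied to the sorted PC1 coefficients in Fig. 1d.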

It is established that the frequency of memory cells in CAR-T infusion products, as assessed by standard T cell immunophenotyping, is predictive of clinical response19,20. This was one of the primary conclusions of Fraietta et al.18. However, the PC1 loadings (Fig. 1d) suggest that cell-intrinsic differences in memory cell function (μ_M and d_M) rather than frequency (f_Tm) are more important determinants of response. To discern the importance of memory cell frequency versus function, we performed two experiments. First, we attempted to fit the data under the hypothesis that the only difference between CR/PR/NR populations was the composition of the product (frequency of T_M, T_E and T_X cells), whereas the cell-intrinsic kinetic parameters are conserved (Supplementary Fig. 7). The model does capture differences in pharmacokinetics and tumor dynamics between the populations, and the inferred CAR-T product composition is consistent with that reported by Fraietta et al.18. However, the magnitude of differences between the populations cannot be fully explained by this hypothesis. That is, CAR-T cell composition as defined by memory and exhausted cell frequencies alone is insufficient to explain the variance in clinical activity.

To directly compare the inferred differences in memory cell function among CR/PR/NR groups, we simulated a dose-ranging study using purified memory cell populations from CR/PR/NR archetypes (Supplementary Fig. 8). The CR memory cells produced robust and dose-dependent CAR-T expansion, persistence and tumor reduction, whereas the NR cells showed very little expansion or anti-tumor activity, and the PR memory cells displayed intermediate function. In sum, these results imply that, although memory cell frequency in CAR-T infusion products contributes to exposure and response, cell-intrinsic features, such as proliferative capacity, are necessary to account for the variance in clinical outcomes. We next sought to identify molecular signatures that underlie these cell-intrinsic features and the resultant clinical variance.

Molecular and cellular features differentiating CR, PR and NR populations

To examine the molecular and cellular features underlying these functional differences, we used bulk RNA-seq data from the same trial18 wherein pre-infusion CAR-T products were sequenced and annotated by response category. Differential expression analysis on the CR versus NR populations revealed biological features (gene signatures) consistent with inferred functional differences (Supplementary Figs. 9 and 10). We confirmed findings from the original report and additionally found that the CR population is enriched in CD4+ and CD8+ memory cell gene signatures (defined by single-cell sequencing of thymic tissue21) and displays heightened expression of signatures characterizing T cell proliferation, effector cytokine (interferon) signaling and IL2RB, IL7 and JAK/STAT signaling (defined by curated pathway databases22,23,24). CAR-T cells from NR patients show heightened p53 (ref. 25) and DNA damage26 signaling, pathways that may underlie the proliferative deficit.

Single-sample gene set enrichment analysis (ssGSEA) was subsequently used to examine distribution of the pathway and cell signatures in individual samples. The CR population is significantly enriched in the ‘non-exhausted T cell’ signature (Fig. 2a), consistent with simulations, wherein the fraction of non-exhausted cells at day 60 (peak of anti-tumor effects) is significantly higher in the CR group (Fig. 2b), whereas cells from the NR patients rapidly progress to exhaustion (Supplementary Fig. 11). The simulations also align with clinical reports that CAR-T products that fail to expand in vivo show heightened expression of exhaustion markers LAG3 and PD1 (ref. 27).

Fig. 2: ssGSEA estimates the activity of signaling pathways and enrichment of cell populations in CAR-Ts, separated by response. a,c–f, ssGSEA reveals differences in cell populations and signaling pathways between populations for selected cell signatures and signaling pathways (panel titles). n = 31 independent samples: five CR, five PR and 21 NR. b, Using the 12 best-fitting parameter sets for each population and model simulations, we calculated the percentage of the T cell population at day 60 that is non-exhausted. The median non-exhausted T cell population at day 60 (over the 12 parameter sets) is near 100% for both CR and PR populations, whereas the median is approximately 50% for the NR population. Differences between populations were assessed using an unequal variances two-sided t-test (P values shown). Box plots represent median ± 25th percentiles, with whiskers representing min/max values.

We found that CRs are differentially enriched in both CD8+ and CD4+ memory T cell signatures (Fig. 2c,d), consistent with the necessity of memory cells for mediating sustained responses28. Note, however, that bulk sequencing data cannot resolve cell population frequencies nor discern between transcriptionally similar versus co-varying cell types (Supplementary Fig. 12). That is, CR products may have higher frequencies of CD4+ and CD8+ memory cells or may contain cells with more ‘memory-like’ transcriptomes at similar frequencies. The CR population also shows heightened IL2RB and IL7R signaling (Fig. 2e,f), indicating that the CR cell products may show heightened sensitivity to the correspondent cytokines. Notably, IL2 and IL7 are common components of CAR-T expansion media29, and peak serum IL7 concentration is predictive of CD19 CAR-T exposure and progression-free survival30. Although the results shown in Fig. 2 are statistically significant, the ssGSEA distributions overlap between response categories. Thus, in addition to the limitations of bulk sequencing data, none of the gene signatures assessed could serve as univariable predictors of patient response.

Cell-intrinsic functional differences mediating CAR-T clinical response

To deconvolute the role of cell frequency versus function in mediating response, we leveraged two recently published clinical studies containing scRNA-seq data of pre-infusion, autologous CD19 CAR-T products matched with clinical outcomes. Bai et al.31 reported data for 12 patients with acute lymphoblastic leukemia (ALL) treated with a CD19 CAR-T product analogous to Kymriah—five CRs, two NRs and five patients who relapsed (RL). Haradhvala et al.32 reported data for 32 patients with large B cell lymphoma (LBCL) treated with either Kymriah (n = 13) or Yescarta (n = 19). For the Kymriah-treated group, there were six CRs and seven NRs; for the Yescarta-treated group, there were 11 CRs, one PR and seven NRs.

Examination of uniform manifold approximation and projection (UMAP) projections of the three datasets (Kymriah in ALL, Kymriah in LBCL and Yescarta in LBCL) reveals some separation of response categories in transcriptome space, particularly in ALL (Fig. 3a,d,g). To assess whether response separation is attributable to differences in T cell composition, we assigned cell type labels by mapping expression profiles of the individual cells to annotated tumor-infiltrating lymphocyte populations via ProjecTILs33. Most CD8+ cells in all three datasets are classified as T effector memory (Tem) or T exhausted (Tex), but there are no consistent differences in composition by response category (Supplementary Fig. 13a–c). For example, the frequency of cells annotated as exhausted is significantly higher in the NR/RL categories as compared to CR in the ALL data (P < 0.05, mean 4.4% versus 8.7%, respectively; Fig. 3b,e,h). However, this pattern does not hold for the LBCL data, and the modest effect size is insufficient to account for the vast disparity in clinical outcomes. We used the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) antibody tag data provided by Bai et al.34 to assign early memory (Tmem: CD8+CD45RO−CD27+) and exhausted (CD8+PD1+) cell annotations by immunophenotype, reported to be predictive of response in CLL18. Although exhausted cell annotations by ProjecTILs and immunophenotype were notably concordant (6.7% versus 5.9% of total cells), cell frequencies did not differ by response category in ALL (Supplementary Fig. 13d,e).

Fig. 3: scRNA-seq of pre-infusion CAR-T products reveals cell-intrinsic defects associated with non-durable response. UMAP projections of three datasets representing Kymriah in ALL (a–c), Kymriah in LBCL (d–f) and Yescarta in LBCL (g–i). a,d,g, UMAP projections annotated by response category. b,e,h, UMAP projections annotated as exhausted using ProjecTILs33. c,f,i, UMAP projections annotated for high (above mean) or low (below mean) CAR-T cell dysfunction signature from Good et al.35. j, GSEA for select pathways, comparing both exhausted versus non-exhausted and CR versus PR/RL/NR categories within cells annotated as T effector memory (Tem via ProjecTILs) or early memory (Tmem; CD8+CD45RO−CD27+ via CITE-seq). A positive normalized enrichment score (NES, blue) indicates higher enrichment in CR/non-exhausted cells. *NR = NR/RL or NR/PR. P values were calculated by Kolmogorov–Smirnov tests implemented in GSEA.

To probe cell-intrinsic function, we annotated cells using a ‘CAR-T dysfunction’ signature, characteristic of functionally exhausted CAR-T cells with reduced proliferative and cytotoxic capacity35. Visually, the dysfunction signature is dispersed throughout response categories and not restricted to exhausted regions (Fig. 3g,h,i). Interrogating cell-intrinsic functional differences at a deeper resolution, we performed differential gene expression analysis on T sub-cell populations (annotated both by transcriptome and immunophenotype), followed by pathway enrichment for select gene signatures (Fig. 3j). As a control, we first assessed differences between cells annotated as exhausted versus non-exhausted. Exhausted cells are consistently enriched in the CAR-T dysfunction signature across datasets, whereas the ‘exhausted T cell’ and ‘P53 signaling’ signatures appear specific to the ALL-exhausted cells. Conversely, non-exhausted cells show disparate enrichment for the ‘early memory T cell’ signature as well as cytokine production and inflammatory response signatures, hallmarks of T cell functional potency.

Comparing cell populations from the CR versus NR/PR/RL categories reveals a consistent pattern across datasets. Focusing either on effector memory or early memory (CD8+CD45RO−CD27+) subsets, the NR/PR/RL groups display characteristic features of exhaustion. In particular, the CAR-T dysfunction signature is consistently heightened. The CR cell populations conversely show increased expression of early memory and/or T cell functional signatures (cytokine production and inflammatory response). That is, memory and effector cell populations from CAR-T products resulting in CR appear more functional or ‘memory-like’, whereas the same cell populations from NR/PR/RL categories appear more exhausted. The single-cell data, thus, confirm inferences from the model in separate indications (ALL and LBCL): CAR-T infusion products associated with non-durable response display deficits in proliferative and functional capacity intrinsic to memory and effector cell populations.

Cell-intrinsic attributes predictive of CAR-T response can be inferred from pre-infusion product transcriptomes

If CAR-T response is product-intrinsic rather than host-intrinsic, we reasoned that the differences in pre-infusion product transcriptomes could be predictive of response. Moreover, comparing response classifiers based on cell-intrinsic function (transcriptome) versus cell composition (T cell phenotype) could help elucidate which product-intrinsic feature is more clinically relevant. We used the bulk RNA-seq data from Fraietta et al.18 to develop a multivariate transcriptome classifier. Starting with the 28 pathways that were differentially expressed between the CR versus NR groups (false discovery rate (FDR)-adjusted P < 0.05; Supplementary Information), we trained a logistic regression-based classifier using a genetic algorithm for feature selection (Methods).

The resultant model was able to predictively distinguish CAR-T products from CR versus NR patients, with a median cross-validated accuracy of 90% based on a train:test split of 60:40 (Fig. 4a). As a comparison, we trained and assessed classifiers using the early memory (CD8+CD45RO−CD27+) and exhausted (CD8+PD1+LAG3+) cell frequencies as reported18 (Supplementary Fig. 13d). The resulting accuracies (80% and 83%, respectively) are significantly better than chance but lower than that achieved using functional transcriptomes (P < 10⁻¹⁵ and P = 6 × 10⁻¹¹, respectively). The gene signature panel thus reveals clinical functionality to an extent not apparent from immunophenotyping, implying that transcriptomes yield more value as CAR-T product characterization assays than current best-practice flow cytometry panels.
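The evaluation scheme (many random 60:40 train:test splits, with the distribution of test accuracies summarized by its median) can be sketched as below. Several choices here are assumptions made for the example rather than details taken from the Methods: the split is stratified by class, a nearest-centroid rule stands in for the logistic regression classifier with genetic-algorithm feature selection, and the two-feature `X` matrix is synthetic. Only the class sizes (five CR, 21 NR) follow the CLL cohort.

```python
import random
import statistics


def stratified_split(labels, train_frac, rng):
    """Split sample indices into train/test, keeping every class on both sides."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        k = max(1, min(len(idx) - 1, round(train_frac * len(idx))))
        train += idx[:k]
        test += idx[k:]
    return train, test


def centroid_fit(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    centroids = {}
    for cls in set(y):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def centroid_predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def repeated_split_accuracy(X, y, n_iter=2500, train_frac=0.6, seed=0):
    """Return the test accuracy from each of n_iter random train:test splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_iter):
        train, test = stratified_split(y, train_frac, rng)
        model = centroid_fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(centroid_predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return accuracies


# Synthetic signature scores for 5 CR and 21 NR products (illustrative only).
rng = random.Random(2)
X = ([[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(5)]
     + [[rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)] for _ in range(21)])
y = ["CR"] * 5 + ["NR"] * 21
accuracies = repeated_split_accuracy(X, y)
median_accuracy = statistics.median(accuracies)
```

A null distribution of the kind used as a control can be obtained from the same harness by shuffling `y` before each call, so that any remaining accuracy reflects class imbalance rather than signal.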

Fig. 4: CD19 CAR-T response can be predicted from infusion products using an ssGSEA-based transcriptome classifier with better accuracy than T cell immunophenotypes. Distributions of predictive accuracies are shown for 2,500 iterations using 60:40 train:test split cross-validation. Results from the transcriptome-based ssGSEA classifier are compared to: a, classifiers based on reported T memory (CD8+CD45RO−CD27+) and T exhausted (CD8+PD1+) cell frequencies from Fraietta et al.18; b, a bivariate classifier based on calculated T memory (CD8+CD45RO−CD27+) and T exhausted (CD8+PD1+) cell frequencies from Bai et al.34; c,d, bivariate classifiers based on T effector memory and exhausted cell frequencies from ProjecTILs annotations of Haradhvala et al.32. Accuracy distributions resulting from null models (random classification) are shown as controls. *** indicates P < 10⁻¹⁵, two-sided rank-sum test. e, CAR-T response scorecard, representing the 28 gene signatures fed into the transcriptome classifier, ordered by differential GSEA in Fraietta et al.18. Bubble size indicates frequency of inclusion in the 2,500 trained models after feature selection; color indicates differential enrichment between response groups by dataset, based on pseudo-bulked GSEA (score = −1 × sign(NES) × log10(P value)). Red, CR enriched; blue, NR/PR/RL enriched. Gene signatures are annotated by source. NES, normalized enrichment score.

To assess whether these findings translated across datasets and indications, we applied the same workflow to pseudo-bulked single-cell data from Bai et al.34 (Kymriah in ALL) and Haradhvala et al.32 (Kymriah and Yescarta in LBCL). For the Bai et al.34 data (Kymriah in ALL), we compared accuracy of classifying CR versus NR/RL groups using the 28-gene signature panel to a bivariate classifier trained using the early memory (CD8+CD45RO−CD27+) and exhausted (CD8+PD1+) immunophenotype frequencies calculated from CITE-seq antibody tags (Supplementary Fig. 13d). Median accuracy of the transcriptome classifier was 80%, less (as expected) than before but better than that achieved by T cell immunophenotyping (47%, P < 10⁻¹⁵; Fig. 4b). We similarly assessed predictive accuracy using the LBCL data from Haradhvala et al.32 separately for Kymriah and Yescarta. As no immunophenotype data were provided, we compared the transcriptome classifier to bivariate classifiers based on estimated T effector memory (Tem) and exhausted cell (Tex) frequencies from ProjecTILs33 annotations (Supplementary Fig. 13b,c). Median predictive accuracy of the transcriptome classifier was 80% and 71% for Kymriah and Yescarta, respectively, outperforming T cell phenotype-based classification in both cases (60% and 67%, P < 10⁻¹⁵; Fig. 4c,d). As an additional control, we seeded the classifier with ‘random’ pathways by sampling from the compendium of gene signatures that were not differentially expressed between CR versus NR groups in the CLL data (FDR-adjusted P > 0.05; Methods and Supplementary Fig. 14). The resulting accuracies were either slightly better or indistinguishable from chance (the ‘null’ model), and all were significantly less accurate than predictions arising from the 28-gene signature panel.

Machine learning models are notoriously difficult to interpret. To condense the inner workings of the transcriptome classifier into interpretable patterns, we created a CAR-T response scorecard (Fig. 4e). This summarizes GSEA on the 28 select pathways and frequency of inclusion in the 2,500 trained models across each of the four datasets. There is variance in the directionality and statistical significance of the signatures between datasets, as would be expected. These represent different diseases, CAR-T products and platforms, and the data were generated by independent groups. However, the overlap is far greater than would be expected by chance (P < 10⁻⁵ for all; Methods). Notably, the Yescarta LBCL scorecard is visually distinct from the three Kymriah scorecards, and the resulting model predictions are correspondingly less accurate. This suggests distinct yet overlapping biology underlying response between the two products.
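The signed score used to color the scorecard in Fig. 4e combines the direction and the significance of enrichment: score = −1 × sign(NES) × log10(P value). Because log10 of a P value below 1 is negative, the leading −1 makes the score carry the sign of the NES, with magnitude growing as P shrinks. A minimal sketch follows; the function name and the example inputs are illustrative and not taken from the data.

```python
import math


def scorecard_score(nes, p_value):
    """Signed enrichment score: -1 * sign(NES) * log10(P value)."""
    sign = (nes > 0) - (nes < 0)  # +1, 0 or -1
    return -1 * sign * math.log10(p_value)


# Hypothetical inputs: one positively and one negatively enriched signature.
positive_example = scorecard_score(2.1, 1e-3)
negative_example = scorecard_score(-1.5, 1e-2)
```

Under the NES convention stated for Fig. 3j (positive NES indicates higher enrichment in CR), a positive score corresponds to the red, CR-enriched end of the scorecard and a negative score to the blue, NR/PR/RL-enriched end.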

In summary, response to two separate CD19 CAR-T therapy products (Kymriah and Yescarta) in three indications (CLL, ALL and LBCL) is at least partially predetermined by functional attributes of the CAR-T infusion product. These functional attributes are shared across the four datasets to varying extents, revealed through gene signatures, and not fully apparent from T cell immunophenotyping.

Explaining inter-patient variability in Kymriah pharmacokinetics

The pharmacokinetics of Kymriah and other CAR-T products tested in clinical trials show high inter-patient variability, with AUCs spanning three orders of magnitude4,36,37. Although the transcriptome classifier can predictively distinguish response categories, we assessed whether our mechanism-based model is explanatory of the additional pharmacological variability—specifically, whether a mixture of the three patient archetypes (CR/PR/NR), combined with reported variation in administered dose and initial tumor burden, is sufficient to quantitatively account for the observed variance in exposure.

We first overlaid simulations of the CR/PR/NR pharmacokinetic profiles with registrational data for Kymriah5. Although these are different patient populations (CLL versus B cell ALL (B-ALL)), the pharmacokinetics are highly conserved between these two indications6. Visually, the CR/PR/NR profiles correspond roughly to the top quartile, median and bottom 5% of exposure (Fig. 5a). Thus, the CR/PR/NR population archetypes cover much of the pharmacokinetic variation but do not fully account for individual patient variability as they were fit to population means.

Fig. 5: Clinical variability in dose, tumor burden and CR/PR/NR pharmacological archetype account for population variance in Kymriah exposure and predict clinical covariates of response to Yescarta. a, Shaded areas show the clinical variability of exposure to Kymriah5 with median model simulations overlaid for the CR, PR and NR populations. b, CAR-T AUC distributions. The box plot labeled Kymriah shows the distribution in AUC obtained from 1,000 simulations of the clinical pharmacokinetics model (each dot corresponds to a percentile of the AUC distribution). The group of box plots labeled Model shows the AUC distribution obtained for the 12 best-fitting parameter sets for each population (CR, blue; PR, gray; NR, pink) with the colored background the range of AUCs obtained from the clinical pharmacokinetics data. The group of box plots labeled +Dose shows the AUC distributions for each population when doses are randomized within reported ranges in the virtual population (n = 1,000); +B0 shows the distributions when initial tumor burdens are randomized; and +Dose/B0 shows the distribution when both dose and initial tumor burdens are randomized. Box plots represent median ±25th percentiles and whiskers the min/max value or an additional 1.5-fold quartile distance. c, Cmax distributions plotted as in a. d–f, We defined response to treatment as tumor AUC less than 10,000 cells × day / µl and evaluated whether each patient in the virtual CR population with randomized doses and tumor burdens (+Dose/B0) exhibited a response (black binary data points). Logistic regression with respect to the tumor burden (d), Cmax (e) or the quotient of Cmax and tumor burden (f) reveals how each predicts response (blue curve indicates model estimate with 95% confidence intervals). As a control, uniform random sampling of parameter space (1,000 parameter sets) does not exhibit these response relationships (gray dashed line indicates model estimate with 95% confidence intervals). 
The clinical covariates of response calculated using the virtual population have the same trends as published covariates of response to Yescarta (red dotted curves). Note that the covariates of response for Yescarta have been linearly scaled to match the ranges in the virtual population for plotting.

We next assessed the effect of variability in dose and tumor burden using a virtual population approach9. We created virtual populations (n = 1,000) by Monte Carlo sampling across the parameter sets while randomizing dose and tumor burden within reported ranges, either alone or in combination, by log-uniform sampling.
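The sampling scheme can be sketched as follows: each virtual patient draws one fitted parameter set at random and, optionally, a dose and an initial tumor burden sampled log-uniformly within a range. This is a sketch under stated assumptions. The numeric ranges, the fixed values and the placeholder parameter-set labels below are hypothetical and are not the reported clinical ranges, and the function and argument names are invented for the example.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw a value whose log10 is uniform between log10(low) and log10(high)."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


def make_virtual_population(param_sets, dose_range, b0_range, fixed_dose, fixed_b0,
                            vary_dose=True, vary_b0=True, n=1000, seed=0):
    """Build n virtual patients by Monte Carlo sampling.

    Each patient gets a randomly chosen parameter set plus a dose and an
    initial tumor burden (b0), either randomized or held at the fixed value.
    """
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        patients.append({
            "params": rng.choice(param_sets),
            "dose": log_uniform(rng, *dose_range) if vary_dose else fixed_dose,
            "b0": log_uniform(rng, *b0_range) if vary_b0 else fixed_b0,
        })
    return patients


# Placeholder inputs (hypothetical values, for illustration only).
cr_param_sets = [f"CR_fit_{i}" for i in range(12)]
settings = dict(dose_range=(1e7, 1e9), b0_range=(1e1, 1e4),
                fixed_dose=1e8, fixed_b0=1e3)

pop_dose_only = make_virtual_population(cr_param_sets, vary_b0=False, **settings)
pop_b0_only = make_virtual_population(cr_param_sets, vary_dose=False, **settings)
pop_both = make_virtual_population(cr_param_sets, **settings)
```

The three populations correspond to randomizing dose alone, tumor burden alone, or both together; each patient record would then be passed to the model simulation to obtain exposure and tumor metrics.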

The simulated exposures (AUC) for these virtual populations span the inter-individual variability of Kymriah (10¹–10⁴ cells × day / μl; Fig. 5b). Variance in either dose or tumor burden is sufficient to cover and roughly match the reported variance of exposure within the CR/PR/NR populations. That is, although the model was fit to population mean data assuming fixed tumor burden and dose, relaxing either of these input assumptions is sufficient to account for reported variance. Similar results are produced by examining the Cmax (Fig. 5c). Grid simulations were used to assess how tumor burden and dose drive exposure and tumor response (Supplementary Fig. 15), revealing a non-linear relationship that likely contributes to the clinical variance. Given that the model recapitulates observed variance in exposure, we next assessed whether these simulations predict clinical covariates of tumor response.

Predicted covariates of response: Cmax and tumor burden

We examined whether the virtual populations could predict a priori the reported statistical relationships among cell expansion, tumor burden and clinical response. A thorough analysis of response covariates to Yescarta in large B cell lymphoma (LBCL) identified the ratio of CAR-T expansion to initial tumor burden (that is, Cmax/B0) as the strongest correlate of durable response20. The same result was reported for overall survival in B-ALL38, indicating that this is a conserved feature across indications. The median pharmacokinetics and population variance of Yescarta are similar to those of Kymriah (Supplementary Fig. 16).

Focusing on the virtual CR population, we defined response as a B cell AUC below 10⁴ cells × day / µl (the minimum observed for the virtual PR population). We used a logistic regression model linking response to initial tumor burden (B0), Cmax or the ratio as predictors (Fig. 5d–f). The equivalent logistic curves from Yescarta were digitized and overlaid by normalizing the x axes. The results are qualitatively consistent with the clinical data, in that these covariates are predictive of response.
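To make the analysis concrete, the sketch below labels a virtual patient as a responder when tumor AUC falls below the threshold and fits a univariable logistic regression of response on a predictor such as log10(Cmax/B0). It is illustrative only: the fit uses plain gradient descent rather than any particular statistics package, confidence intervals are omitted, and the predictor values and outcomes are synthetic (generated from an assumed slope of 2) rather than simulated from the model.

```python
import math
import random


def is_responder(tumor_auc, threshold=1e4):
    """Response defined as tumor AUC below the threshold (cells x day / ul)."""
    return tumor_auc < threshold


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.1, n_iter=5000):
    """Fit P(response) = sigmoid(b0 + b1 * x) by gradient descent on the log-loss."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(b0 + b1 * xi) - yi
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Synthetic virtual patients: x plays the role of log10(Cmax / B0).
rng = random.Random(1)
log_ratio = [rng.uniform(-2.0, 2.0) for _ in range(200)]
response = [1 if rng.random() < sigmoid(2.0 * xi) else 0 for xi in log_ratio]

intercept, slope = fit_logistic(log_ratio, response)
# Predicted response probability at a tenfold excess of Cmax over B0.
p_at_ratio_10 = sigmoid(intercept + slope * 1.0)
```

A positive fitted slope corresponds to the qualitative finding that a higher Cmax-to-tumor-burden ratio is associated with a higher probability of response; the same function can be applied with B0 or Cmax alone as the predictor.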

To assess whether these predictions emanate directly from the model structure or necessitate model training, we created a ‘control’ virtual population by random sampling of parameter space (n = 1,000). This control population did not reproduce the same findings, emphasizing the need for appropriate training data to make accurate predictions.

Dose–response implications: patients with multiple myeloma treated with Abecma (BCMA-CAR-T)

To better understand the relationship among dose, Cmax and tumor response, we applied the modeling framework to a phase 1/2 dose-escalation study of Abecma (BB2121, idecabtagene vicleucel), a BCMA-targeted CAR-T approved for the treatment of multiple myeloma39. We again used PSO to estimate model parameters characterizing the pharmacokinetics and tumor dynamics (Fig. 6a,b). Although parameters are non-identifiable, both were captured with good accuracy (Supplementary Fig. 17), and simulations recapitulate the relationship between Cmax/B0 and tumor response identified in Fig. 5f for Kymriah and Yescarta (Supplementary Fig. 18).

Fig. 6: Model extension to Abecma dose response. a,b, Model training: we fit the toggle switch model to phase 1 dose–response data and observed good fits, with Pearson correlation coefficients from the goodness-of-fit plots (Supplementary Fig. 15) of 0.59 for the CAR-T cells and 0.74 for the tumor. c–e, Model analysis: we compared the fraction of the total T cell population across doses in the memory, effector and exhausted groups by plotting the mean across parameter sets. For low doses, the T cell population becomes mostly exhausted, whereas, for high doses, the population of memory and effector cells persists. f,g, Model testing: we compared predictive simulations at two doses with the data reported in the phase 2 study (150–450 million cell doses)40. The tumor dynamics out to 1 year fall within the bounds predicted for the 150–450 million cell doses. M, million.

The simulations yield insight into the effects of CAR-T dose on T cell population dynamics (Fig. 6c–e). The lowest dose (50 million cells) was incapable of tumor reduction and resulted in a predominance of exhausted T cells and gradual loss of memory cells. The highest dose, for which the greatest degree of tumor reduction was observed, produced the opposite response, with minimal exhaustion and a high fraction of memory cells. This is analogous to changes in T cell composition after acute versus chronic infection and provides mechanistic underpinning to the covariates identified above. That is, at an insufficient Cmax:tumor burden ratio, due either to low dose or expansion capacity, the infused CAR-T population will exhaust before clearing tumor.

To assess the predictivity of the model, we compared simulations against data from the phase 2 study, wherein patients were treated at doses of 150, 300 and 450 million cells and tumor dynamics (BCMA levels) were monitored out to 1 year (Fig. 6f,g). Although the pharmacokinetics are moderately under-predicted, the tumor dynamics are predicted with reasonable accuracy. That is, the phase 2 data (150–450 million cell doses) fall between the simulated 150 million and 450 million cell doses with similar dynamics. This is particularly notable, given that the model was trained on data going out to 2 months, whereas predictions are extrapolated out to 1 year.