Selective expansion of myeloid and NK cells in humanized mice yields human-like vaccine responses

Mice engrafted with components of a human immune system have become widely-used models for studying aspects of human immunity and disease. However, a defined methodology to objectively measure and compare the quality of the human immune response in different models is lacking. Here, by taking advantage of the highly immunogenic live-attenuated yellow fever virus vaccine YFV-17D, we provide an in-depth comparison of immune responses in human vaccinees, conventional humanized mice, and second generation humanized mice. We demonstrate that selective expansion of human myeloid and natural killer cells promotes transcriptomic responses akin to those of human vaccinees. These enhanced transcriptomic profiles correlate with the development of an antigen-specific cellular and humoral response to YFV-17D. Altogether, our approach provides a robust scoring of the quality of the human immune response in humanized mice and highlights a rational path towards developing better pre-clinical models for studying the human immune response and disease.

M uch has been learned about how the mammalian immune system functions at steady state and during infection using inbred mouse models. However, it has become increasingly recognized that the mouse and human immune systems differ in numerous important aspects 1 , thus limiting the predictive value of studies in rodents for human biology. Furthermore, the narrow host tropism of many important human-tropic pathogens precludes the use of conventional mouse models for analyzing the interactions of such pathogens with the mammalian immune system 2 . The direct study of human immune responses is challenging as usually only peripheral blood, but not material from lymphoid organs or the site of infection, is readily accessible. Immune responses to many pathogens have been studied in patients, but interpreting such clinical data is difficult as numerous parameters that could influence measured immune response are often unknown. To gain better control of these critical factors, immune responses to live-attenuated vaccines, including yellow fever 3 , flu 4 , and smallpox 5 , have been carefully characterized. These studies have greatly contributed to our understanding of human immunity, but intra-and inter-donor variability, previous and/or current infections, age or microbiotic status still add significant complexity to the data and make analysis challenging.
Humanized mice have emerged as powerful tools for studying a broad range of human(-tropic) pathogens. Mice engrafted with components of a human hematolymphoid system or human immune system (HIS) have been especially useful for dissecting the interactions of human viruses with human immune cells [6][7][8][9][10] . A variety of mouse strains (reviewed in ref. 11 ) well-suited for engraftment of human hematolymphoid cells have been developed. These recipient strains are usually highly immunocompromised to facilitate engraftment of xenogeneic cells. Non-obese diabetic (NOD) mice deficient for both the recombinase activating gene 1 (Rag1 −/− ) and the IL-2 receptor gamma chain (IL2Rγ null ) (NRG mice) are commonly used and do not develop functional murine B, T, or natural killer (NK) cells 12 . NRG mice are also deficient in hemolytic complement 13 and harbor a polymorphism in the gene encoding murine signal regulatory protein α (SIRPα), which reduces phagocytic activity against human cells 14 . Injection of irradiation-conditioned NRG mice with human hematopoietic stem cells (HSCs) leads to de novo hematopoiesis, resulting in stable engraftment of human hematolymphoid system components 6,12,15 .
Although there is evidence that the engrafted HIS in such mice becomes activated upon microbial challenge, the quality of the immune response in conventional models and in other refined models (such as the bone marrow-liver-thymus, or BLT model) remains weak or uncertain 7,9,[16][17][18][19][20] . One of the major reasons is the underrepresentation of critical human immune cell lineages in these models, which are crucial for activating the adaptive immune response. In particular, the scarcity of human dendritic cells (DCs) as well as other myeloid lineages and NK cells, decreases the functionality of the engrafted HIS. The small frequencies of these cell populations can be explained, in part, by the limited biological cross-reactivity of the non-redundant cytokines that promote lineage differentiation 21 . Consequently, several new humanized mice models with significant reconstitution of myeloid and/or NK cell compartments have been recently developed (hereon referred to as second-generation humanized mouse models). Indeed, exogenous administration of human interleukin (IL) 15 or an IL15/IL15 receptor (R) fusion protein significantly increases human NK cell numbers 22 . Similarly, injection of recombinant cytokines, such as granulocyte-macrophage colony stimulating factor (GMCSF), macrophage colony stimulating factor (MCSF), IL3 or FMS-like tyrosine kinase 3 ligand (Flt3LG), or their expression in engineered xenorecipient strains results in increased frequencies of erythro-myeloid lineage cells 15,[23][24][25] . However, knowledge of how the human immune response in any of these novel models compares to those observed in humans remains limited.
To address this need, we devised an experimental pipeline allowing us to quantitatively assess immunity in humanized mice and compare to host responses in humans. By probing the cellular, humoral, and transcriptomic response to a highly immunogenic common standard, the yellow fever virus vaccine YFV-17D 26 , we provide here the first comprehensive comparison of the human immune response in conventional, second-generation humanized mice and human vaccinees. Our results highlight that selective expansion of myeloid and NK cells in humanized mice induces transcriptomic responses to YFV-17D infection akin to those of human vaccinees. The more human-like transcriptomic responses, lacking in conventional models, correlated with the development of antigen-specific cellular and humoral immunity to YFV-17D in humanized mice more robustly engrafted with human NK and various myeloid cells. Altogether, our work demonstrates a robust approach for the quantitative measurement of immunity in humanized mice, for more objective model cross-comparison and consequently, for the rational development of better pre-clinical humanized models.

Results
Conventional humanized mice mount limited immunity to YFV-17D. YFV-17D is one of the most potent vaccines ever developed, and single vaccination usually results in protection for at least 10 years 26 . Existing data on the immune response to YFV-17D in human vaccinees could thus serve as a valuable comparator to systematically assess the functionality of a transplanted HIS. In contrast to the transient or even undetectable viremia observed in human vaccinees 3,5 , YFV-17D RNA rapidly reached a plateau and persisted in the blood of NRG-HIS mice for at least 22 days, suggesting the engrafted HIS cannot effectively clear infection (Fig. 1a) despite an absence of significant mortality (Supplementary Figure 1a). In YFV-17D-infected conventional NRG-HIS mice, we noticed an overall increase in peripheral CD3+ T cells upon YFV-17D infection (Fig. 1b) without any changes in the ratio of CD4+/CD8+ T cells (Supplementary Figure 1b). In contrast to reports in patients, the frequencies of human CD8+ T cells expressing HLA-DR and CD38-two markers to track virus-activated cells within the bulk CD8+ T cell population in the peripheral blood of human vaccinees 5 -did not change in the blood (Fig. 1c) or spleen (Supplementary Figure 1c) of these mice. Downregulation of CCR7 and CD45RA on a subset of CD4+ and CD8+ T cells was detectable in the blood and spleen of NRG-HIS mice upon YFV-17D infection ( Fig. 1c; Supplementary Figure 1c), indicating that the engrafted HIS responded to the infection. However, this activation did not correlate with better control of viral replication in the periphery (Fig. 1a). To enable tracking of antigen-specific T cells responses, we infected humanized NRG mice expressing transgenically HLA-A2*0201 (NRG-A2-HIS) and quantified YFV-specific CD8 + T cells in the blood and spleen of NRG-A2-HIS mice using an HLA-A2:YFV NS4B (amino acids 214-222, LLWNGPMAV) tetramer 3 . Unlike previous studies characterizing virus-specific CD8+ T cell responses to HIV, Epstein-Barr virus, dengue virus, or adenovirus 9,17,19,27 , we did not detect any A2:LLWNGPMAVspecific T cells in either the blood or spleen over the course of infection, indicating that YFV-17D-specific cells are poorly primed in NRG-A2-HIS mice (Supplementary Figure 1d). These data indicate that while NRG-HIS mice have utility as a challenge model 6 , they require further refinements to better model human immune responses.
Next, we examined the expression of four genes (MDA5/IFIH1, STAT1, IRF7, RSAD2), all reported to be upregulated in PBMCs of human vaccinees following vaccination 28 , in the human PBMCs of NRG-HIS mice following YFV-17D infection. Since we expected the anti-viral response in NRG-HIS to be low, we aimed at determining the cumulative median expression of these four anti-viral genes by RT-qPCR using human-specific primers. Our data revealed that the cumulative median expression of these four genes significantly increased at day 11 post infection relative to pre-infection levels (day 0) (Fig. 2a; Supplementary Figure 1e). Using RNA-sequencing (RNA-seq), we then performed an unbiased quantification of differentially expressed genes in human PBMCs of NRG-HIS mice on day 11 post YFV-17D infection. Notably, only four genes were significantly differentially expressed (DE) (HSP90AA1, HIST1H4C, LRRFIP1, and UGDH-AS1) (Fig. 2b- Enhanced human myeloid and NK cell reconstitution in HIS mice. Impaired immune function in conventional NRG-mice (reviewed in ref. 29 ) can likely be attributed in part to low frequencies of critical immune cell subsets, such as myeloid and NK cells, compared to humans (Fig. 3a) where the myeloid compartment represents 50-80% of peripheral leukocytes. Since DCs and NK cells are key effectors of the innate immune response and critical in the activation of an adaptive response, we hypothesized that selective expansion of these cell subsets in humanized mice could promote an enhanced human immune response.
Receptor-type tyrosine-protein kinase FLT3, or fetal liver kinase-2 (Flk2), is a cell surface receptor broadly expressed on early hematopoietic precursors in the bone marrow. Consequently, myeloid cell development is severely impaired in Flk2-deficient mice (Flk2 −/− ) 30,31 . However, in xenorecipient mouse strains commonly used for human hematopoietic engraftment 12,32-34 , murine myelopoiesis is largely unaffected. Thus, the majority of myeloid cells are still of murine origin and putatively interfere with priming of human-specific adaptive immune responses upon infection with (human-tropic) pathogens.
Despite the recent generation of humanized mouse models harboring a Flk2 deletion 25,35 , the influence of this on the human immune response in humanized mice has not yet been described nor directly compared to immunity in humans. Therefore, as a proof-of-concept, we aimed to quantitatively evaluate the impact of this enhancement on the cellular and transcriptomic response to YFV-17D infection through cross-comparison with other humanized mouse models and human clinical data.
We thus generated Flk2 −/− mice on the NRG background, yielding NRGF mice. While previous work relied on repeated injections of recombinant Flt3LG 24,25 , we chose a vectored delivery approach, constructing a replication-incompetent . ****p ≤ 0.0001, ns non-significant (Wilcoxon-Mann-Whitney test). b Percentage of peripheral human CD3+ T cells among the total human CD45+ cell population in the blood of NRG-HIS mice over the course of YFV-17D infection. Bounds of box and whiskers represent the min-to-max fraction of peripheral human CD3+ T cell among total human CD45+ at each time point. Medians are indicated in each box as center line (n = 5) *p ≤ 0.05, **p ≤ 0.01 (Student's t test). c Fraction of peripheral human CD4+ and CD8+ T cells co-expressing HLA-DR and CD38 (red) or lacking expression of both CCR7 and CD45RA (blue) in the blood of NRG-HIS mice over the course of YFV-17D infection. Bounds of box and whiskers represent the min-to-max fraction of human CD4+ or CD8+ T cell for each marker combination and time point. Medians are indicated in each box as center line (n = 5). *p ≤ 0.05, **p ≤ 0.01 (Student's t test) adenoviral system for sustained and stable production of human      Figure 5). In NRGF-HIS/Flt3LG mice, we identified upregulated GO terms related to IFN signaling, antigen presentation, and cytokine production, consistent with previous findings in humans 28,39 . Moreover, we observed an enrichment in GO terms related to the regulation of B and T cell-mediated immunity, providing additional evidence for a more comprehensive immune response to YFV17D in NRGF-HIS/Flt3LG mice. In contrast, 20% of the immune-related GO terms were significantly downregulated in NRG-HIS mice, further underscoring the limited functionality of the HIS in this model. A KEGG pathway 40,41 enrichment analysis also confirmed the ability of NRGF-HIS/Flt3LG mice to mount an enhanced transcriptional response. Among the top five upregulated KEGG pathways identified in each mouse model, immune-related pathways were only found in NRGF-HIS/Flt3LG mice (Fig. 4e). These pathways, related to antigen presentation and NK cell activity, were consistent with our findings as well as with previous ex vivo and in vivo human studies 28,39,42,43 .
Human-like transcriptomic responses in NRGF-HIS/Flt3LG mice. Two studies have previously delineated the transcriptomic response to YFV-17D in human vaccinees 28,44 . Specifically, these studies identified sets of genes differentially regulated in PBMCs upon YFV vaccination. Hence, we utilized these valuable datasets as a reference to quantitatively evaluate how similar transcriptomic responses were in our different humanized mouse models in comparison to humans. We re-analyzed three datasets (hereon referred to as the Lausanne, Montreal, and Emory cohorts) derived from two independent studies 28,44 and sorted the list of differentially regulated genes (p adj ≤ 0.1) at day 7 post vaccination for each human cohort. From these three lists of genes, we then generated a global human dataset of genes differentially expressed upon YFV-17D vaccination using two distinct methods: an unbiased method and a double selection method, i.e., composed of genes found in at least one cohort, or in at least two cohorts respectively. Using these two distinct human reference datasets, we computed a Spearman rho correlation index (see Methods for details) for each humanized mouse model (NRG-HIS, NRGF-HIS/Fluc, and NRGF-HIS/Flt3LG; day 11 post vaccination) relative to human vaccinees, at varying p adj thresholds (from 0.01 to 0.1 with 0.01 increments) for differentially expressed genes. Correlation indexes were referred to as r U and r D for the unbiased and double selection method respectively (see Methods for details), for a given p adj threshold. We chose to use the correlation index value derived from gene sets constructed with an p adj threshold of 0.05 as our definitive r U and r D  Table 1). When limiting the analysis to differentially expressed genes at p adj ≤ 0.01, the r U and r D correlation indexes of the three mouse cohorts displayed the highest differences, with notably r D for NRGF-HIS/Flt3LG reaching up to 0.188 (versus 0.0967 and 0.036 for NRG-HIS and NRGF-HIS/Fluc, respectively). Altogether, this analysis demonstrates the superiority of our NRGF-HIS/Flt3LG mice over conventional humanized mice for modeling the human transcriptomic response to YFV-17D infection. Moreover, our approach provides a valuable and relevant methodology for an accurate and objective scoring of human immunity in humanized mouse models. The pro-inflammatory cytokine CXCL10 (or IP-10) is one of the most significantly induced cytokines following YFV-17D infection in humans 28,46 . Consistent with these observations, IP-10 serum levels did increase in NRGF-HIS/Flt3LG mice following infection (Fig. 6c). Several other pro-inflammatory cytokines, including MCP-1, IL-6, and IL-18, were also elevated in NRG-HIS/Flt3LG versus NRGF-HIS/Flt3LG mice at specific time points (Supplementary Figure 6d). As these increased cytokine levels did not correlate with an enhanced human immune response in the NRG-HIS/Flt3LG mice, they likely reflected the severe GVHD conditions in NRG-HIS/Flt3LG. Other cytokines, such as IFNγ, IL-23, and GM-CSF, were detected at similar levels in both mouse models (Supplementary Figure 6e).
Although an increase in peripheral CD3+ T cells was observed in both NRG-HIS/Flt3LG and NRGF-HIS/Flt3LG mice, Fig. 4 NRGF-HIS/Flt3LG mice display an extensive transcriptomic signature. a Schematic representation of the experimental procedure to characterize the PBMC transcriptomic signature of NRGF-HIS mice following YFV-17D infection. b Number of significantly DE genes (p adj ≤ 0.05) at day 11 post YFV-17D infection (versus day 0, prior infection) in the PBMCs of NRG-HIS (red), NRGF-HIS/Fluc (green), and NRGF-HIS/Flt3LG (blue) mouse PBMCs upon YFV-17D infection. c Protein-protein network of significantly DE (p adj ≤ 0.05) in NRGF-HIS mice following YFV-17D infection. Each gene is colored based on its log 2 FC (1 < x < 2, red; 0.5 < x < 1, orange; −1 < x < −0.5, yellow; −2 < x < −1, green). Areas enriched with genes related to a specific biological process are highlighted by a dotted circle or ellipse. d Frequencies of upregulated (red area) or downregulated (blue area) immune-related GO terms among all statistically significant GO terms (p ≤ 0.05) in NRG-HIS (red), NRGF-HIS/Fluc (green), and NRGF-HIS/Flt3LG (blue) mice following replicate analysis. Total count of immune-related GO-terms out of all significant GO terms (displayed as immune-related/all) are also reported. Dotted lines between the bars symbolize the progressive enhancement of human immune functionality across our humanized mice models. e KEGG pathway enrichment analysis of the transcriptomes of NRG-HIS (red), NRGF-HIS/Fluc (green), and NRGF-HIS/Flt3LG (blue) mouse PBMCs following replicate analysis. YFV-specific immunity in HLA-expressing NRGF-HIS/Flt3LG mice. Since NRGF-HS/Flt3LG mice did control YFV-17D infection, we analyzed the virus-specific CD8+ T cell response to ascertain its similarity to that of human vaccinees 3,5 . We intercrossed NRG-A2 mice with NRGF mice, yielding NRGF mice expressing HLA-A2*0201 (NFA2 mice). NFA2-HIS mice were then injected with either Adv-Fluc or Adv-Flt3LG 5 days prior to YFV-17D infection (Fig. 7a). Consistent with our previous findings in NRG-HIS/Flt3LG and NRGF-HIS/Flt3LG mice (Fig. 6b), viral replication was better controlled in NFA2-HIS/Flt3LG mice compared to NFA2-HIS/Fluc mice over time (Fig. 7b). Serum viremia in NFA2-HIS/Flt3LG mice peaked between days 5 and 10 post infection with the infection ultimately cleared by day 20 (Fig. 7b). These kinetics mimic those observed in human Neither cohort developed GVHD. We also found no correlation between survival and differential human Flt3LG concentration in the serum of surviving versus non-surviving NFA2-HIS/Flt3LG mice (Supplementary Figure 8b). YFV NS4B-tetramer+ CD8+ T cells were readily detectable in the blood of infected NFA2-HIS/Flt3LG mice but not NFA2-HIS/ Fluc mice ( Fig. 7c; Supplementary Figure 8c). Viremia and the frequencies of YFV-specific CD8+ T cells followed similar kinetics in the peripheral blood, suggesting an important role for CD8+ T cells in YFV-17D infection control and clearance as previously demonstrated in human vaccinees 47 . Consistent with previously reported YFV-specific CD8+ T cell phenotypes from human vaccines 3 , a significant fraction of YFV NS4B-tetramer+ CD8+ T cells proliferated, as indicated by their Ki67 expression, and acquired an HLA-DR+/CD38+ effector phenotype (Supplementary Figure 8d, e). In contrast, NS4B-tetramer negative CD8+ T cells did not demonstrate upregulated Ki-67 expression upon YFV-17D infection (Supplementary Figure 8f).
Although the frequency of antigen-specific CD8+ T cells statistically decreased to background levels in the blood of NFA2-HIS/Flt3LG mice, YFV-specific CD8+ T cells were readily detectable in the spleen by day 20 post infection but were not detectable in the spleen of NFA2-HIS/Fluc mice (Fig. 7d, Supplementary Figure 8g). Absolute cell count and phenotyping of splenic YFV-specific CD8+ T cells showed that these cells mostly expressed HLA-DR and CD38 at day 20 post infection ( Supplementary Figure 8g-i). They also did not show preferential expression of Ki67, suggesting a switch toward a memory phenotype. Future studies will be aimed at accurately delineating the different phenotypes of these antigen-specific cells.
Given the important correlation between the induction of a T cell-specific response and the clearance of YFV-17D infection in the periphery, we conducted a T cell depletion experiment. Prior to YFV-17D infection and 5 days post infection, NFA2-HIS/ Flt3LG mice were treated with anti-CD4 (α-CD4) or anti-CD8 (α-CD8) antibodies (n = 4 per group), which have previously been used to deplete CD4+ and CD8+ T cells, respectively, in humanized mice 7,9 (Supplementary Figure 9a). Peripheral CD4+ and CD8+ T cells were efficiently depleted in α-CD4 and α-CD8treated NFA2-HIS/Flt3LG mice, respectively (Supplementary Figure 9b), prior to infection. In the spleen, where T cells are more abundant than in the peripheral blood, we observed a more than ten-fold reduction in the number of CD4+ T cells in α-CD4-treated mice and a more than 1000-fold reduction in the number of CD8+ T cells in α-CD8-treated mice (Supplementary Figure 9c). Importantly, the counts of myeloid cell populations were unaffected in either condition (Supplementary Figure 9d). Upon T cell depletion, only α-CD8-treated mice exhibited significant mortality upon infection (50% survival) (Supplementary Figure 9e). However, both α-CD4-treated and α-CD8-treated mice were unable to clear viral infection in the periphery (Supplementary Figure 9f). Thus, these results suggest that both CD4+ and CD8+ T cells are important for controlling infection in peripheral blood and that CD8+ T cells are likely early regulators of such control. Additionally, these results are further evidence that the clearance of YFV-17D infection in NFA2-HIS/ Flt3LG mice is human-, and not murine-, mediated.
Seroconversion in human vaccinees is the hallmark of YFV-17D potent immunogenicity 48 . Hence, determined whether the enhanced control of infection and T cell-specific response were also associated with an improved humoral immune response. We assessed YFV-specific antibody concentrations in the sera of NFA2-HIS/Fluc (n = 4) and NFA2-HIS/Flt3LG (n = 4) mice over 6-weeks following infection. We detected a significant increase in YFV-specific IgM and IgG in the serum of the four NFA2-HIS/ Flt3LG mice at day 40 post infection (Fig. 7e, f; Supplementary  Figure 10a). In contrast, YFV-specific antibodies were not detected in the serum of NFA2-HIS/Fluc mice during the first 30 days following infection, and none of these mice survived till the final experimental end-point (day 40 post infection). Notably, we found a strong negative correlation between viremia level and YFV-specific IgG concentration in the blood of NFA2-HIS/ Flt3LG mice but not in NFA2-HIS/Fluc mice (Fig. 7g, h). Consistently, the serum of three NFA2-HIS/Flt3LG mice at day 40 post infection neutralized YFV-17D in vitro (median neutralizing titer: 1:33) (Fig. 7i). NFA2-HIS/Flt3LG mice also displayed a significantly enhanced frequency of multiple B cell subpopulations at day 20 post infection in comparison to NFA2-HIS/Fluc mice (Supplementary Figure 10b, c). Specifically, frequencies of follicular B cells, transitional B cells, classswitched memory B cells, or plasmablasts were higher in NFA2-HIS/Flt3LG mice, suggesting these subpopulations proliferate and differentiate better in response to YFV-17D infection in this model.
Enhanced YFV-17D immunity associates with superior HIS complexity. Finally, we employed Seq-Well 49 , a recently developed platform for massively parallel single-cell RNA-Seq (scRNA-Seq), to delineate at the greatest possible resolution the cellular composition of the engrafted HIS that correlated with a superior human immune response to YFV-17D. We isolated splenocytes from two NRG-HIS mice and two NFA2-HIS/Flt3LG mice at 6-weeks post YFV-17D infection and sorted these cells by human CD45 (hCD45) or human CD33 (hCD33) expression (see also Supplementary Note 1).
We ran parallel Seq-Well arrays for each sorted population, enabling both unbiased characterization of the relative abundances of all lymphocytes, as well as a deeper examination of the cellular diversity within the myeloid compartment. First, we examined the cell types identified in hCD45+single cells in NFA2-HIS/Flt3LG (Fig. 8a) compared to NRG-HIS (Fig. 8b) mice. NFA2-HIS/Fl3LG splenic CD45+ cells showed a higher diversity of well-resolved subpopulations of activated and differentiated cytotoxic lymphocytes, T cells expressing known activation and memory markers (CD27, CCR7, STAT1, CD40LG), and regulatory T cells (distinguished by high expression of FOXP3 and CTLA4). The abundance of myeloid and NK cells within the hCD45+ samples was significantly higher in NFA2-HIS/Flt3LG mice, and we were unable to resolve a distinct cluster of myeloid cells from the NRG-HIS populations when clustered alone.
To compare more directly the myeloid compartments between the NRG-HIS mice and the NFA2-HIS mice, we combined both hCD33+ and hCD45+ samples from either the NFA2-HIS mice (Fig. 8c) or NRG-HIS mice (Fig. 8d) and computationally gated out all T cells and B cells by expression of TCR-or BCR-related genes (full list in Methods section). In NFA2-HIS/Flt3LG mice, we identified six distinct clusters of cell types corresponding to subpopulations of NK cells, monocytes, macrophages, cDCs, pDCs, and cross-presenting DCs (full cluster-defining genes in Supplementary Data 2). We also directly compared the expression of the top subpopulation-defining genes in NFA2-HIS/ Flt3LG non-T, non-B single cells (Fig. 8e) with the corresponding single cell gate in NRG-HIS mice (Fig. 8f). This analysis revealed significant up-regulation of a broad range of myeloid and NK cell functional and activation markers (such as HLA-DRA, CD83, and CD40 for cDCs, and CCL5, GNLY, GZMA, PRF1, and CD226 for NK cells) in NFA2-HIS/Flt3LG mice over NRG-HIS mice.
When analyzed alone, hCD45+ NRG-HIS single cells did not yield a distinct subpopulation that could be annotated as a myeloid type cluster. However, when these cells were clustered in concert with abundant myeloid cell types from the NFA2-HIS mice, we could resolve these cells distinctly. Through this analysis, we confirmed superior frequencies of DCs, NK cells and NKT cells in NFA2-HIS/Flt3LG mice in comparison to NRG-HIS mice (Fig. 9a, b).
Finally, we tested differential expression between hCD45+ single cells in both mouse models over a curated list of lineage and cell-type relevant genes (Fig. 9c, d). This analysis revealed significantly higher expression of functional markers of activated NK and T cells (such as NKG7, PRF1, and GZMA), as well as higher abundance of functional mature myeloid cells (notably via ITGAM, ITGAX, CD14, or CD83) in NFA2-HIS/Flt3LG mice.
Altogether, these data provide an in-depth view of the cellular composition of the HIS in conventional and second-generation humanized mice. They also highlight the enhanced engraftment and functionality of the myeloid and NK cell compartment in NFA2-HIS/Flt3LG mice, which is likely critical in promoting an enhanced transcriptomic, cellular, and humoral response to YFV-17D. A more complete description of our experimental design and our scRNA-seq data can be found in Supplementary Note 1, associated with Supplementary Figure 10d

Discussion
Understanding the pathogenesis and immune responses elicited by human-tropic pathogens presents considerable challenges. Over the past decades, humanized mice have proven susceptible to a large number of human-tropic pathogens 6,8,17,27,50,51 and have emerged as valuable platforms to model human-specific infectious processes in vivo. Despite the fact that multiple studies have reported that humanized mice can develop immunological features resembling those of humans, commonly used humanized mouse models still mount an imperfect human immune response 7,9,16,52 , and it remains incompletely defined how specific refinements of xenorecipient strains and/or humanization protocols impact immunity. To guide a more directed approach to improving future generations of humanized mice, objective metrics are needed to facilitate direct comparisons with clinical data. YFV-17D is a highly potent human vaccine that induces a strong, polyfunctional immune response 3,28,44 as well as longlasting protective immunity (reviewed in ref. 26 ). The human immune response to YFV-17D has been intensively studied in the peripheral blood of human vaccinees 3,5,28,44 , and transcriptomic signatures in human PBMCs following YFV-17D immunogenicity have been delineated 28,44 . These features highlight YFV-17D human immunity as a powerful comparator for evaluating immune responses in humanized mice.
Here, we provide a comprehensive and quantitative comparison of human clinical data with equivalent data sets from conventional and exemplary second-generation humanized mouse models. By probing the specific cellular, humoral, and transcriptomic response to YFV-17D in different humanized mouse models, we examined how a correlation index, built to reflect the degree of overlap between the transcriptome of a given humanized mouse model and that of human vaccinees upon YFV-17D infection, associated with the induction of YFV-specific immunity and viral clearance within a given model.
Despite high levels of human engraftment, the human immune response to YFV-17D in NRG-HIS mice was weak and associated with a low correlation index. In contrast, enhancement of the NK and myeloid cell compartments in NRGF-HIS associated with a significantly higher correlation index in comparison to  CD1C   XCR1  NCAM1  CCL5  KLRF1  KLRB1  CD226  NKG7  GNLY  GZMA  IL32   IRF8  CD80  CD40  CD33  CLEC4C  GZMB  IGJ  IL2RB  PRF1   CD83  C5AR1  CD86  CD300E  VCAN  ITGAX  TNFSF10  LYZ  FCGR3A   NCAM1  CCL5  KLRF1  KLRB1  CD226  NKG7  GNLY  GZMA  IL32   CD80  CD40  CD33  CLEC4C  GZMB  IGJ  Hence, our study shows that the transcriptomic correlation index represents a powerful and accurate proxy for assessing the quality of the human immune response in these-and importantly, other -models. Numerous alternative humanization strategies have been realized in e.g., so-called BLT or MISTRG models 17,23 . The experimental pipeline and data sets provided here will undoubtedly be a valuable resource to objectively evaluate these strategies in comparison to others, as well as for the rational development of advanced humanized mice models and improved modeling of human disease in vivo. Refinement strategies could include the expression of additional human orthologs of non-redundant cytokines, which exhibit limited biological cross-reactivity in order to foster development of other underrepresented human immune cell types. Currently, most studies employ naive, i.e., antigen-inexperienced, mice, which does not take into account how immunological history shapes immunity. Thus, exposing mice to multiple other vaccines prior to yellow fever vaccination or infection with another pathogen may be a valuable approach to further enhance immunity. Co-engraftment of NFA2 or other xenorecipient strains with additional HSC donor-matched human tissues, such as liver, thymus, and/or lymph nodes, could also significantly augment the immune response. Such coengraftments could enhance T and B cell selection, intrahepatic T cell priming 53 as well as liver-mediated secretion of key human-immune components 54 . Finally, engraftment of second-generation humanized mice with a human-like microbiome represents another valuable approach to enhance immunity as recently suggested 55 .
Taken together, our study demonstrates that correlation of transcriptomic signatures provide a relevant and valuable path toward the rational refinement of humanized mouse models. Such an approach opens avenues for more accurate characterization of human-pathogen interaction events in vivo, for the development of innovative immunotherapy and vaccine strategies, as well as for the modeling of relevant (patho-) physiological processes and auto-immune diseases.
Generation of recombinant adenoviruses. Adenoviral stocks were generated as previously described 56  Purification of human CD34+ cells were assessed by quantifying by flow cytometry using an anti-human CD34+-FITC antibody (dilution 1/100, clone 581, BD Biosciences, Franklin Lakes, NJ). Expression of human CD90, CD38, CD45RA was assessed among the CD34+ population.
All animal experiments described in this study were performed in accordance with protocols (number 1930) that were reviewed and approved by the Institutional Animal Care and Use and Committee of Princeton University.
Generation of human immune system-engrafted mice. 1-5 days old xenorecipient mice were irradiated with 300 cGy and 1.5-2 × 10 5 human CD34+ HSC were injected intrahepatically 4-6 h after irradiation. Male and female mice transplanted with CD34+ HSC derived from various human donors were used in this study.
Mouse injections and blood collections. 4-8-month-old NRG-HIS, NRGF-HIS, and NFA2-HIS were infected through intravenous injection in the tail with 10 10 recombinant AdV-Flt3LG or 10 11 recombinant AdV-Fluc particles and/or with 10 6 YFV-17D p.f.u., resuspended in 200 µl of PBS. For experiments involving preinjection of AdV-Fluc or AdV-Flt3LG prior YFV-17D infection, YFV-17D infection was always performed 5 days following AdV-injection. 200 µl of blood were collected through submandibular bleeding at the indicated time-points. Serum was separated from blood cells by centrifugation (10 min, 3500 rpm at room temperature) for further quantification of serum viremia. For assessment of immune cell expansion and time course experiments, all the infected humanized mice displayed a level of peripheral humanization ranging from 40 to 60% human CD45+ out of total CD45+ cells. For assessment of the YFV-17D transcriptomic signature, all humanized mice displayed level of peripheral humanization ranging from 50 to 80% human CD45+ out of total CD45+ cells.
Monitoring of clinical symptoms and manifestations. Clinical manifestations of disease were monitored daily and signs of clinical disease progression recorded through weight and clinical scoring. All mice succumbing to YFV-17D infection developed severe weight loss and displayed several signs of disease such as posture hunched, trembling, appearance with ruffled fur and rear leg paralysis (at later stage of disease). GVHD was determined by the presence of the following clinical symptoms: severe hair loss on a significant portion of the body (with visible naked skin), skin rash, and skin inflammation.
Quantification of human FLT3LG concentration in mouse serum. Human Flt3LG concentrations were measured using an in-house sandwich ELISA. A rabbit polyclonal capture (Abcam, Cambridge, MA, USA) antibody was coated overnight into a 96 well-plate (Thermo Scientific, Waltham, MA, USA) at a concentration of 1:500. Following incubation with a blocking-buffer (SuperBlock™ Blocking Buffer; Thermo Scientific, Waltham, MA, USA), mouse serums were incubated at multiple dilution (1:10, 1:100, 1:1000). Captured human Flt3LG was then detected using a biotin-conjugated rabbit polyclonal detection antibody (Abcam, Cambridge, MA, USA) and a streptavidin-conjugated horseradish peroxidase (HRP, Abcam, Cambridge, MA, USA) antibody. A soluble recombinant human Flt3LG (initial concentration: 100 ng/ml) was used to determine a standard concentration curve. Optical density signals at 450 nm were then assessed using a TriStar Multimode Microplate reader (Berthold Technologies GmbH & Co. KG, Bad Wildbad, Germany).
Quantification of YFV-specific antibodies in mouse serum. Relative concentration of human YFV-17D-specific IgM and IgG was measured by homemade sandwich enzyme-linked immuno-sorbent assay (ELISA). 5000 YFV-17D infectious particles were coated overnight in a 96-well plate (Thermo Scientific, Waltham, MA, USA). Following incubation with a blocking-buffer (SuperBlock™ Blocking Buffer; Thermo Scientific, Waltham, MA, USA), mouse serums were incubated at 1:20 and 1:40 dilution. A serum sample from an anonymous human vaccinee was diluted from 1:10 to 1:160 in a two-fold dilution manner, and used as a positive control. Captured IgG and IgM were then detected using anti-human IgG (Thermo Scientific, Waltham, MA, USA; Clone HP6017) or IgM (Thermo Scientific, Waltham, MA, USA; Clone HP6083) horse peroxidase antibodies.
Optical density signals at 450 nm were then assessed using a TriStar Multimode Microplate reader (Berthold Technologies GmbH & Co. KG, Bad Wildbad, Germany).
YFV-17D infectious clone and in vitro transcription. pACNR-YFV-17D lowcopy number backbone (kindly provided by Charles Rice, The Rockefeller University, NY) was transformed and amplified using low recombination NEB 5-alpha high efficiency competent Escherichia coli (New England Biolabs, Ipswich, MA, USA). Transformed bacteria were incubated in LB+ 50 μg/ml Ampicillin (Sigma-Aldrich, Darmstadt, Germany) overnight at 30°C under shaking at 205 rpm. Plasmid cDNA was purified using E.Z.N.A. Endonuclease free Maxiprep Kit (Omega, Norcross, GA, USA), ethanol precipitated and linearized using Afl-II restriction enzyme. Following concentration of linearized DNA by ethanol precipitation, viral RNA was transcribed from 1 µg of linear template using mMES-SAGE mMACHINE SP6 kit (Ambion, Foster City, CA, USA) according to manufacturer's instructions. Organ collection and isolation of immune cells. Blood (200 µl) was collected through submandibular bleeding and transferred into EDTA capillary collection tubes (Microvette 600 K3E, Sarstedt, Nümbrecht, Germany). Cells were separated from plasma through centrifugation, and red blood cells were lysed with 1× lysis buffer (BD Pharm Lyse, BD Biosciences, San Jose, CA, USA) for 15 min at room temperature in the dark. Following lysis and quenching with 10% (v/v) FBS DMEM media, blood cells were then washed twice with a 1% (v/v) FBS-PBS solution before staining. At the indicated endpoints, mice were euthanized via exsanguination under ketamine/xylazine anesthesia. To generate splenocyte single cell suspensions, spleens were collected and individually placed in 15 ml of serum-free DMEM. Spleens were then transferred in a 6 cm dish, dissociated using a razor blade and digested (0.1% collagenase, Sigma-Aldrich, Darmstadt, Germany; 40 mM HEPES; 2 mM CaCl 2 ; 2 U/ml DNase1, HBSS, Life Technologies, Invitrogen, Carlsbad, CA, USA) for 30 min at 37°C. Following quenching with 10% (v/v) FBS-DMEM media, splenocytes were strained through a 100 µm cell strainer and washed with 10% (v/v) FBS-DMEM twice. Splenocytes were then centrifuged and lysed with 1× lysis buffer (BD Pharm Lyse, BD Biosciences, San Jose, CA, USA) for 15 min at room temperature in the dark. Cells were then washed twice with a 1% (v/v) FBS-PBS solution and counted prior to staining. For bone marrow-derived cell isolation, femurs and tibiae of the mice were flushed with PBS (Life Technologies, Invitrogen, Carlsbad, CA, USA) in a 10 cm dish. Cells were then centrifuged and lysed with 1× lysis buffer (BD Pharm Lyse, BD Biosciences, San Jose, CA, USA) for 10 min at room temperature in the dark. Cells were then washed twice with a 1% (v/v) FBS-PBS solution and counted prior to staining.
YFV-17D single-step RT-quantitative real time PCR. Viral RNA was isolated from mouse serum using the ZR Viral RNA Kit (Zymo, Irvine, CA, USA) according to manufacturer's instructions. Viral RNA was quantified using singlestep RT-quantitative real-time PCR (SuperScript® III Platinum® One-Step qRT-PCR Kit, Life Technologies, Invitrogen, Carlsbad, CA, USA) with primers and TaqMan probes targeting a conserved region of the 5′UTR of the 17D genome. Single-step RT-qPCR was accomplished in a StepOnePlus Real-Time PCR System (Applied Biosystems) using the following thermal cycling: 52°C for 15 min, denaturation at 94°C for 2 min, 40 cycles of denaturation at 94°C for 15 s, annealing at 55°C for 20 s, and elongation at 68°C for 20 s. A cDNA sequence coding for the 5′UTR was in vitro transcribed and used as standard for the absolute quantification of viral RNA. The primers used are as follow: YFV-17D sense 1, GCTAATTGAGGTGCATTGGTCTGC; YFV-17D sense 2, GCTAATTGAGG TGTATTGGTCTGC; YFV-17D antisense 1, CTGCTAATCGCTCAACGAACG; YFV-17D antisense 2, CTGCTAATCGCTCAAAGAACG, YFV-17D probe, FAM-ATCGAGTTGCTAGGCAATAAACAC-BHQ.
Detection and characterization of YFV-specific CD8+ T cells. PBMCs and splenocytes were isolated and purified as described above. 2-4 × 10 6 PBMCs or splenocytes were then incubated for 30 min at room temperature with a purified recombinant Fc protein (Human BD Fc block TM ; BD Biosciences San Jose, CA, USA; 1/10 dilution) in order to prevent tetramer non-specific binding to Fc receptor. Cells were then incubated for 1 h at room temperature with a HLA-A*02:01 APC-conjugated tetramer (MBL International, Woburn, MA, USA; 1/10 dilution) specific for the NS4B 214-222 derived epitope (LLWNGPMAV) 44 . Cells were then incubated with the appropriate antibody cocktail optimized for CD8+ T-cell phenotyping and fixed as described above. Flow cytometry fluorophore compensation were performed as described above. Counting beads were added to each sample prior flow-cytometry analysis (AccuCheck Counting Beads, Life Technologies, Invitrogen, Foster City, CA, USA) and were used to determine the absolute number of YFV-specific CD8+ T cell (also referred NS4B/A2+ CD8+ T cells) per 100 µl of blood. Absolute count of splenic YFV-17D-specific CD8+ T cells were determined by reporting the number of total splenocytes and YFV-17D-specific CD8+ T cells processed by flow cytometry with the total count of splenocytes determined following tissue digestion. Number of events collected per mouse and for a given tissue to determine the absolute count of YFV-17D-specific CD8+ T cells ranged from 5 to 10 and from 100 to 700, in the blood and spleen, respectively. YFV-17D-specific CD8+ T cells were analyzed by flow-cytometry for the expression of HLA-DR, CD38, and Ki-67.
T cell depletion experiment. NFA2-HIS/Flt3LG were injected with 100 µg of anti-CD4 or anti-CD8α (clone OKT-4 and OKT-8 respectively, BioXCell, West Lebanon, NH, USA) or not. Injections of 100 µg of antibody per mouse (intraperitoneal route) were performed for three consecutive days prior YFV-17D infection and three days post Adv-Flt3LG injection. A fourth antibody injection was performed 5 days post YFV-17D infection. At the day of infection, effective T-cell depletion was verified in the blood of animals by flow cytometry using antibodies targeting the following marker: human CD45+, mouse CD45+, human CD3+, human CD4+, and human CD8+. Weight and survival of the animals were monitored over a 20-day course of infection. Animals who survived the course of infection were sacrificed at day 20 post infection, and serum was collected to determine presence or absence of viral infection clearance in periphery. At the time of sacrifice, T cell count was determined in the blood and spleen of animals to control T cell depletion in mice treated with anti-CD4 or anti-CD8α. Additionally, the absolute counts of several myeloid cell populations (human cDCs, pDCs, monocytes, macrophages) and NK cells in the spleen were determined to ensure the specificity of the T-cell depletion.
Isolation of human CD45+ and RNA extraction. To isolate human CD45+ cells, humanized mice were bled prior and following infection and total blood were collected. Cells were centrifuged and lysed with 1× lysis buffer (BD Pharm Lyse, BD Biosciences, San Jose, CA, USA) for 10 min at room temperature in the dark. Cells were then washed, counted, and resuspended at a concentration of 1 × 10 8 cells/ml in a 2% (v/v) FBS-1 mM EDTA-PBS solution. Human CD45+ cells were then isolated using a human CD45+ cells enrichment kit (Stem Cell Technologies, Vancouver, British Columbia, Canada) according to the manufacturer's protocol. Purification of human CD45+ cells was assessed by quantifying by flow cytometry human and murine CD45+ frequencies prior and after purification. Following enrichment, human CD45+ were spined and resuspended in RLT lysis buffer (Qiagen, Hilden, Germany). Cellular RNA was then extracted using the RNeasy Mini Kit (Qiagen, Hilden, Germany) following manufacturer's instructions. For the transcriptomic experiments, mice were assembled as cohorts of 4-5 animals displaying humanization levels ranging between 50 and 80%. Following lysis and prior to cell resuspension in a 2% FBS-1 mM EDTA-PBS solution, blood samples of members of each cohort were pooled together, leading to one sample per cohort and time point (day 0 and day 11). Cohorts were assembled so the average humanization across cohorts differed only by ±5% between cohorts.
YFV-17D neutralization assay using mouse serum. Sera isolated from NFA2-NRGF-HIS/Flt3LG mice and identified by ELISA as containing YFV-specific antibodies following YFV-17D infection were employed for a neutralization assay. For each mouse, sera isolated prior (day 0) and after infection (day 40) were diluted at 1:20 and 1:40 in media containing YFV-17D viral particles (produced as described above). The number of viral particles per condition was determined as to reach a final m.o.i of 0.004 (200 p.f.u.). Viral particles and serum mixtures were incubated for 1 h at room temperature, and mixtures were used to infect naive Huh7.5 cells. As controls, we infected cells with cell culture media containing YFV-17D particles but no serum (also pre-incubated for 1 h at room temperature), or with media containing no viral particles and no serum. cDNA library generation and RNA-sequencing. The poly-A-containing RNA transcripts in the humanized mice total RNA samples were converted to cDNA and amplified following the Smart-seq2 method 57 . Sequencing libraries were made from the amplified cDNA samples using the Nextera kit (Illumina, San Diego, CA, USA), assigning a unique barcode to each of the libraries to be sequenced together. The cDNA samples and libraries were examined on the Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) DNA HS chips for size distribution, and quantified by Qubit fluorometer (Life Technologies, Invitrogen, Carlsbad, CA, USA). The majority of the cDNA fragments in each sample were between 1 and 3 kb, indicating good transcript integrity. The RNA-seq libraries from each experiment (i.e., one for each mouse strain, NRG-HIS and NRGF-HIS mice) were pooled together in equal amounts and sequenced on the Illumina HiSeq 2500 Rapid Flowcell as single-end 75nt reads following the standard protocol. Sequencing depth was of 10-30 million reads per sample (7-10 samples were loaded on a single flow cell). Raw sequencing reads were filtered by Illumina HiSeq Control Software and only Pass-Filter (PF) reads were used for further analysis.
RNA-Sequencing data analysis. In the Galaxy platform 58 through Princeton University, single-end short reads were mapped to the human reference (version GRCh38) and the murine reference (version GRCm38) using TopHat and Bowtie 2 with default parameters and read groups specified. Only counts from uniquely mapped reads were used in the DESeq 2 analyses.
Counts were generated by htseq-count 59 version 0.6.1galaxy1 run on a Galaxy installation, downloaded, and read into R 60 version 3.3.1 (2016-06-21) using scripts run in RStudio version 0.99.489 61 . The htseq count data was loaded into R and the DESeq2 (Galaxy Version 2.11.38) utilized 62 , capturing factors for cohort, day, and batch, with the sampling day as the experimental design. We set up the experimental design for the contrast we were interested in (day 0 versus day 11) and normalized log 2 transformed transcript counts within DESeq2. Differential gene expression were determined for each group of mice (NRG-HIS, NRGF-HIS/ Flt3LG or NRGF-HIS/Fluc) using the standard DESeq2 filters and nbinomWaldTest. Results were extracted from the DESeq2 analysis and annotated using Bioconductor's AnnotationDbi (version 1.32.3) and org.Hs.eg.db packages (version 3.2.3). Differential gene expression was considered as significant when p adj ≤ 0.05. Results are available in Supplementary Data 1.
Gene set enrichment analysis was performed in R using the package Generally Applicable Gene-set Enrichment (GAGE) 63 . Briefly, the GAGE method was called with the regularized log 2 fold change values, and with the go.sets.hs 37,38 or kegg. sets.hs 40,41 database. Same.dir was set to true (since we were interested in wholesale up-or down-regulated shifts). The top five resulting KEGG pathways (qvalue ≤ 0.06) were then identified.
Human vaccinees datasets and measure of correlation index. For the Emory cohort (n = 25 vaccinees, two independent trials of respectively 15 and 10 human vaccinees 28 ), micro array data were downloaded from Gene Expression Omnibus under series accession no. GSE13485. For the Lausanne and Montreal cohorts (n = 11 and n = 25 vaccinees respectively 44 ), data were downloaded from Gene Expression Omnibus under series accession no. GSE13699. For the three cohorts, micro array raw data at day 0 and day 7 post vaccination were downloaded and we employed GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) to determine the set of differentially regulated genes at day 7 post vaccination. For the Emory cohort, we identified a set of 193 genes with p adj ≤ 0.1 differentially regulated at day 7 post vaccination. For the Lausanne and Montreal cohort, we identified a set of 2674 and 12930 genes with p adj ≤ 0.1 differentially regulated at day 7 post vaccination, respectively.
To establish our correlation index, we decided to focus on gene expression foldchange (FC) as the compared variable, as it is adimensional and can therefore be compared across experiments using different expression measurement methodologies We first sought to establish a global human dataset of genes that could be used to measure similarity with humanized mice in a relevant fashion. To do so, we used two methods. In the first one (unbiased), the gene set included all genes that had been statistically determined to be differentially expressed (for a defined Benjamini-Holmes-adjusted p-value threshold of q equal or lower than 0.1) in at least one of the three human cohorts. This method has the merit of being the most unbiased, as it uses no other criteria than statistical relevance. However, it is sensitive to false positives (that can occur somewhat frequently in transcriptome analyses, especially concerning low-expression genes). Moreover, when constructing the gene dataset that way, a disproportionate fraction of it was actually determined by the Montreal human cohort, as this cohort displayed the highest number of differentially regulated genes. To compensate for the over-weight of the Montreal cohort in our dataset, we therefore also used a double selection method, where we only selected genes that had been statistically determined to be differentially expressed (for a defined Benjamini-Holmes-adjusted p-value threshold of q equal or lower than 0.1) in two or more of the three human cohorts. This method ensures the biological relevance of the genes selected in the set construction as they have been detected in more than one cohort, but at the cost of omitting potentially relevant candidates that were only detected in only one cohort.
For both methods, after elimination of duplicates, we averaged the fold-changes of the genes in the set across all three human cohorts, therefore building two adjusted-p-value-dependent reference vectors: the "unbiased" vector U q and the "double selection" vector D q .
Finally, the Spearman rho correlation statistics to those reference vectors were computed for each humanized mouse cohort to build our correlation index. For a mouse cohort M, the "unbiased" value vector M U,q was first built as follows: The values in M U,q and U q were ranked to obtain the vectors rgM U,q and rgU q , and the unbiased correlation index r M,U,q (simply referred to as r U in the Results section) was then computed as follows: The double selection correlation index r M,D,q (simply referred to as r D in the Results section) was computed in a similar fashion for a mouse cohort M.
It is important to note that the correlation index is dependent on the adjusted pvalue threshold selected when building the reference vectors U q and D q . As this threshold is raised, more genes will be included in building the index, thus affecting the whole process leading to the computing of our correlation indexes. To address that question, we built several versions of our reference vectors U q and D q using different adjusted p-value thresholds (from 0.01 to 0.1 by 0.01 increments) and computed r M,U,q and r M,D,q for each of them. Based on the shapes of the r M,U,q = f (q) and r M,D,q = f(q) curves, which all show a steady, linear decline with no sharp discontinuity between q = 0.02 and q = 0.08, we chose to use the correlation index value derived from gene sets constructed with an adjusted p-value threshold of 0.05 as our definitive correlation indexes. For a mouse cohort M, the unbiased correlation index to human cohorts r M,U is therefore: scRNA-Seq. Human CD45+ and human CD33+ populations from the spleens of NFA2-HIS/Flt3LG mice and NRG-HIS mice were FACS-sorted and loaded onto prepared Seq-Well arrays, as previously described. Briefly, 10,000 single cells were loaded onto one array containing 86,000 barcoded mRNA capture beads. Cells and beads were co-confined in microwells on the array, and a polycarbonate membrane sealed individual wells to allow for isolated single-cell lysis and transcript hybridization prior to bead recovery for reverse transcription. Next, each library was treated with Exonuclease I to remove excess primers, and PCR amplification was performed using KAPA Hifi PCR Mastermix (Thermo Scientific, Waltham, MA, USA) to generate final cDNA libraries. Libraries were then constructed using the Nextera XT DNA tagmentation method (Illumina, San Diego, CA, USA). Tagmented and amplified libraries were subsequently purified and sequenced using an Illumina 75 cycle NextSeq500/550v2 kit (read 1: 20 12 bp barcode, 8 bp UMI; read 2: 50). Detailed procedures for scRNA-Seq data alignment and analysis are available in the Methods section.
ScRNA-sequencing library generation. Frozen splenocytes from NFA2-HIS/ Flt3LG and NRG-HIS mice were thawed and stained with antibodies for antihuman CD33 or CD45, and calcein green as a positive viability marker. Cell populations that were either anti-human CD33+ or anti-human CD45+ and calcein green positive were sorted into RPMI + 10% FBS. Single cells from each sort gate and each animal were loaded onto prepared Seq-Well arrays, as previously described 49 . Briefly, 10,000 single cells from each sort gate were loaded onto one array containing 86,000 barcoded mRNA capture beads. Cells and beads were coconfined in microwells on the array, and a polycarbonate membrane sealed individual wells to allow for isolated single-cell lysis and transcript hybridization prior to bead recovery for reverse transcription. Next, each library was treated with Exonuclease I to remove excess primers, and PCR amplification with KAPA Hifi PCR Mastermix (Thermo Scientific, Waltham, MA, USA) to generate final cDNA libraries. Libraries were then constructed using the Nextera XT DNA tagmentation method (Illumina, San Diego, CA, USA). Tagmented and amplified libraries were subsequently purified and sequenced using an Illumina 75 cycle NextSeq500/550v2 kit (30 bp PE reads).
ScRNA-Seq alignment and analysis. The reads were aligned as described in ref. 66 . Briefly, for each NextSeq sequencing run, raw sequencing data was converted to demultiplexed FASTQ files using bcl2fastq2 based on Nextera N700 indices corresponding to individual samples/arrays. Reads were then aligned simultaneously to both mm10 and hg19 genomes using the Galaxy portal maintained by the Broad Institute for Drop-Seq alignment using standard settings of the STAR aligner. Individual reads were tagged according to the 12-bp barcode sequenced and the 8-bp UMI contained in Read 1 of each fragment. Following alignment, reads were binned onto 12-bp cell barcodes and collapsed by their 8-bp UMI. Digital gene expression matrices for each sample were obtained from quality filtered and mapped reads. UMI-collapsed data was utilized as input into R for further analysis.
We first compared the alignment quality between NFA2-HIS/Flt3LG mice and NRG-HIS mice in each sorting condition. We confirmed equivalent sequencing depth and sample input quality between NFA2-HIS/Flt3LG and NRG-HIS mice by observing the total transcripts/single cell detected for the mouse single cells that contaminated our sorted populations. However, we observed overall lower quality among single cells that align to human genomes in the NRG-HIS mice compared to NFA2-HIS/Flt3LG mice, which cannot be attributed to differences in sequencing depth. We next merged UMI matrices across all genes detected in any hCD45+ sample (2 from NFA2-HIS/Flt3LG, 2 from NRG-HIS mice), and eliminated cells with fewer than 300 UMI detected and fewer than 300 unique genes detected (n = 1297 across 2 NFA2-HIS/Flt3LG mice, n = 457 across 2 NRG-HIS mice). We next partitioned the matrix into cells originating from NFA2-HIS/Flt3LG mice or NRG-HIS mice, and completed all clustering and cell calling analyses in parallel. To complete dimensionality reduction and data visualization methods, we first identified the top variable genes by including all genes with an average normalized and scaled expression value greater than 0.32 and dispersion greater than 0.6. Principal Components Analysis was performed over the list of variable genes and all cells (for each experimental group individually). We determined the top significant principal components using a permutation method as previously described 67 . Significant principal components were used for tSNE plotting, with perplexity of 40. We used FindClusters, a clustering algorithm in the R package Seurat 1.4.0.1 (https:// github.com/satijalab/seurat) to identify significant clusters. 10 clusters were found for NFA2-HIS/Flt3LG mice, and 6 clusters were found for NRG-HIS mice, using equivalent parameters. No cluster was attributed to any single mouse. To identify genes that defined each cluster, we performed a likelihood ratio test implemented in Seurat. Top marker genes were used to classify cell clusters into cell types based on existing biological knowledge.
To better understand diversity within the non-T cell, non-B cell compartment of each humanized mouse type, we merged both hCD45+ and hCD33+ arrays for NFA2-HIS/Flt3LG mice, and separately merged hCD45+ and hCD33+ arrays from NRG-HIS mice. We eliminated low quality cells with the same parameters as above, and eliminated any cell with expression of the following genes: CD3E, CD3G, CD3D, TRAC*, TRBC*, CD79A, CD79B, IGKC, IGLC*, MSA41, CD19 (* indicates any number), to gate out any T or B cells. We applied the same protocol as above to identify variable genes, identify significant principal components, calculate a tSNE representation, find significant cell clusters, and annotate cell clusters by identifying biologically meaningful defining genes. Genes that define each of these non-T, non-B cell clusters in the NFA2-HIS/Flt3LG mice were compared directly to the expression of these genes in the NRG-HIS non T, non B cells. Gene differential expression was calculated using a likelihood ratio test for zero inflated data, implemented in Seurat, with original description described 68 .
To directly compare the abundance of each cell type by annotation in each humanized mouse dataset, we took all hCD45+ samples and merged their UMI matrices. We analyzed this data as described above, and calculated the frequency of each cell type within either NRG-HIS mice or NFA2-HIS/Flt3LG mice.
Statistical analysis. Statistical analyses were performed using either a nonparametric Wilcoxon-Mann-Whitney test 69 or a parametric Student' t test (GraphPad Prism software V6.0) when appropriate and as indicated in each figure legend. A two-way ANOVA test was performed for multiple comparison (GraphPad Prism software V6.0). *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001. Information related to the statistical analysis performed for RNA-seq and singlecell RNA seq experiments can be found in the related sections above.

Data availability
All the transcriptomic datasets generated for this study are available through the National Center for Biotechnology Information Gene Expression Omnibus (GEO) under the SuperSeries accession no. GSE119751. The bulk RNA-seq data and scRNA-seq data are referenced under two SubSeries that are linked to GSE119751, respectively GSE119749 and GSE119750. All other relevant data, as well as additional information regarding reagents employed in this study, are available from the authors upon request.