Single-cell RNA sequencing reveals evolution of immune landscape during glioblastoma progression

Glioblastoma (GBM) is an incurable primary malignant brain cancer hallmarked with a substantial protumorigenic immune component. Knowledge of the GBM immune microenvironment during tumor evolution and standard of care treatments is limited. Using single-cell transcriptomics and flow cytometry, we unveiled large-scale comprehensive longitudinal changes in immune cell composition throughout tumor progression in an epidermal growth factor receptor-driven genetic mouse GBM model. We identified subsets of proinflammatory microglia in developing GBMs and anti-inflammatory macrophages and protumorigenic myeloid-derived suppressors cells in end-stage tumors, an evolution that parallels breakdown of the blood–brain barrier and extensive growth of epidermal growth factor receptor+ GBM cells. A similar relationship was found between microglia and macrophages in patient biopsies of low-grade glioma and GBM. Temozolomide decreased the accumulation of myeloid-derived suppressor cells, whereas concomitant temozolomide irradiation increased intratumoral GranzymeB+ CD8+T cells but also increased CD4+ regulatory T cells. These results provide a comprehensive and unbiased immune cellular landscape and its evolutionary changes during GBM progression. Single-cell RNAseq during initiation and progression of mouse glioblastoma reveals a dynamic immune microenvironment transitioning from pro-inflammatory microglia in early tumors towards an infiltrating macrophage and suppressor cell-centric immune landscape in late-stage tumors.

G BM-an incurable primary malignant brain cancer-has heightened genomic and cellular heterogeneity, a 14-month median survival and absence of an effective treatment. IDH1 wild-type (WT) GBMs are classified into transcriptomic subtypes (classical, proneural and mesenchymal), with defining genetic mutations 1 . Classical GBMs are characterized by epidermal growth factor receptor (EGFR) amplification and mutations often associated with losses of CDKN2A and PTEN tumor suppressor genes 1 . Building on this clinical information, we genetically engineered mice to faithfully mimic these genomic events and modeled classical GBMs 2-5 , to study GBM development in an immune competent context. GBMs are heavily (>30% of cellular mass) infiltrated by immune cells 6 . Cataloging this diversity using single-cell RNA-sequencing (scRNA-seq) is an ongoing process 7 , and a full representation of the immune composition during GBM initiation, progression and standard of care (SOC) (ionizing radiation (IR) and temozolomide (TMZ)) treatments remains absent.
Interpatient and intratumoral heterogeneity poses a formidable challenge to the successful development of targeted and immunotherapeutic approaches for the treatment of GBM. An in-depth understanding of the interactions between GBM cells and their immune microenvironment is therefore critical and such knowledge is currently lacking. Here, we performed droplet-based scRNA-seq to study the immune and the nonimmune composition of mouse GBMs at single-cell resolution, and performed longitudinal analyses of populations during initiation and progression. We investigated the effects of SOC on the immune composition of GBM in mice by flow cytometry. We observed drastic changes in the innate immunity population during progression from proinflammatory microglia early during GBM development to high infiltration of immunosuppressive macrophages and neutrophils in end-stage GBMs. We validated the clinical relevance of these distinct immunological profiles by analyzing specimens collected from patients with low-grade glioma and GBM. In both patient samples and our mouse GBM model, dynamics of tumor cells growth coincided with infiltration of immunosuppressive cells. In addition, we identified populations of myeloid-and lymphoid-derived cells that are present only in GBM. Together, our results establish a road map of events that leads to the establishment of the immunosuppressive characteristics of GBMs and offer an in-depth, unbiased resource for studies designed toward therapeutic interventions for GBM.

scRNA-seq identifies tumor, stromal and immune cells in GBM.
To determine cellular interactions during GBM progression, we leveraged our de novo GBM mouse model [2][3][4] wherein tumors are initiated in situ from conditional (Cre/Lox) overexpression of human EGFR combined with loss of Cdkn2a and Pten. A conditional luciferase reporter gene 8 was used for monitoring GBM growth by bioluminescent imaging (BLI). Stereotactic intracranial injections of a Cre lentivirus initiated GBMs and progression was monitored by BLI, and corresponded to tumor volume (Fig. 1a,b). Along with tumor-free normal brains as controls, GBMs were harvested for scRNA-seq at two stages of growth (Early <10 8 BLI and Late >10 8 BLI) (Extended Data Fig. 1a and Supplementary Table 1). CD45 − tumor/ nonimmune cells and CD45 + immune cells were sorted by flow cytometry and subjected to scRNA-seq (Extended Data Fig. 1b). Uniform manifold approximation and projection (UMAP) dimension reduction was performed on 27,633 CD45cells and 36,304 CD45 + cells, resulting in 17 and 20 clusters from the CD45and CD45 + samples, respectively ( Fig. 1c and Supplementary Table 2). Expression of Ptprc (Cd45) is exclusively restricted to the CD45 + sorted samples (Fig. 1d). EGFR and iCre clusters were identified as tumor cells (Fig. 1e) and related more to human scRNA-seq categories defined by Johnson-Verhaak 9 than those defined by Neftel-Suvà 10 (Extended Data Fig. 1c). We thus adopted the Johnson-Verhaak nomenclature to label EGFR + tumor clusters (Fig. 1c,f) with cluster 22 proliferating stem-like cells displaying the most proliferating cells (Fig. 1f).

Regionalization of transcriptionally distinct EGFR + cells.
To define the underpinnings of the distribution of EGFR + tumor cells into five transcriptionally distinct clusters, we determined the number and area of EGFR + cells and showed a progression from multiple EGFR + single-cell to small independent clusters to a single observable mass over time ( Fig. 2a and Extended Data Fig. 2a). This oligoclonality in early-and mid-lesions is not unequivocal evidence that late-stage tumors are composed of independently arising clones. Lineage tracing using the multifluorophore Cre-reporter allele R26R-Confetti strain 15 performed on GBMs harvested 10 and >30 days postinjection demonstrated that all four fluorophores were present at comparable levels and distributed evenly with similarly variegated patterns of expression, suggesting multiple clones are in residence in early and end-stage tumors (Fig. 2b,c). This oligoclonal nature, however, is discordant to the scRNA-seq data that identified three transcriptionally distinct populations of tumor cells. In fact, deciphering the transcriptional trajectories of single tumor cells using Monocle3 pseudotime tracing 16 suggest that cells in cluster 10 gives rise to cells composing clusters 0, 22 and 24, whereas cluster 5 cells represent an independent origin (Fig. 2d).
To better understand the molecular basis driving the separation of stem-like (clusters 0 and 10) from differentiated-like cells (clusters 5 and 24), we identified differentially expressed genes (DEGs) (Fig. 2e). Gene ontology (GO) pathway analysis pointed to significantly upregulated pathways pertaining to responses to INFα/β/γ, cell migration and angiogenesis in cluster 5 and oligodendrocyte differentiation, myelination and cell adhesion in cluster 0 (Extended Data Fig. 2b). The high proliferative index of cells in cluster 22 (Fig. 1g) is represented by spindle organization, mitotic cytokinesis, chromosome segregation and cell division (Extended Data Fig. 2c). Surprisingly, the mode of EGFR signaling of clusters 0/10 and 5/24 appears to be distinct. Cells from clusters 0/10 have increased expression of Tgfα while those from clusters 5/24 have increased expression of Hbegf (Fig. 2e,f)-two EGFR ligands that exert differential signaling in cancer models [17][18][19] . Other EGFR ligands were detected in neuroendocrine cells (Egf and Btc), microglia (Btc) and CD4 + T cells (Areg) (Extended Data Fig. 2d).
These differences in signaling from EGFR + GBM cells led us to test whether regional attributes define cells from clusters 0, 10, 5, 24 and 22. We leveraged The IVY Genome Atlas Project (IVY GAP) 20 and superimposed the most highly expressed gene in each cluster onto the expression patterns of patient samples. Genes from clusters 5 and 24 shared expression patterns with GBM cells dissected from pseudopalisading regions and the perinecrotic zone (Extended Data Fig. 3f). Genes from cluster 22 shared patterns with cellular tumor, hyperplastic blood vessels and microvascular proliferation, whereas top genes of cluster 10 were highly upregulated in cellular tumor bulk, leading edge and infiltrating tumor regions (Extended Data Fig. 3f). Similarly, alignment of highly expressed genes from EGFR + PDGFRA + and EGFR + GAL1 + revealed that EGFR + PDGFRA + cells display profiles similar to those from cells of cellular tumor, leading edge and infiltrating tumor, whereas EGFR + GAL1 + cells express genes similar to perinecrotic zones, pseudopalisading regions and hyperplastic blood vessels (Extended Data Fig. 3g). Together, these findings demonstrate a persistent oligoclonal nature of the model and suggest that the distinct gene expression profiles observed in the EGFR + cancer cells are potentially influenced by their ligand usage and localization in the tumor, ultimately shaping considerable heterogeneity in signaling pathway activation patterns.
GBMs harbor a proliferative population of microglia. In the uninjured brain, microglia have limited self-renewal capacities 21 . The presence of three clusters of microglia in GBM may represent a functional categorization. In fact, microglia cluster 19 cells express higher levels of G2M and S phase cell cycle genes than clusters 2 and 4 ( Fig. 3a). Additionally, cluster 19 S phase cells increased considerably in early and late GBM mice compared with normal brain controls (Extended Data Fig. 4a). Validation using systemic Data are presented as mean ± s.e.m. of biologically independent tumors, *P < 0.0001, unpaired t test, two-tailed, n = 9 (three or four sections per tumor), 16 and 11 for 7, 14 and 21 days, respectively. b, Confetti sections of a > 30 days late-stage GBM. Scale bars, left panel 100 μm, right panel 500 μm. NB, normal brain. c, Fluorescent cells (A and B) or clusters (C and D) from biologically independent mice bearing early-stage (10 days) and late-stage (>30 days) post Cre virus injection. Data are presented as mean ± s.e.m. of n = 4-6 fields of view per section and n = 4-5 sections per biologically independent tumors were quantified. d, Monocle3 trajectory inference on EGFR + clusters. e, DeGs between EGFR + clusters 0 and 5. NS, not significant. f, expression of eGFR ligands Hbegf and Tgfa overlaid on UMAPs. g, expression of Lgals1 (GAL1) and Pdgfra overlaid on UMAPs. h,i, Heat map (h) and volcano plot (i) of DeGs from bulk RNAseq of hseGFR + PDGFRA + and hseGFR + GAL1 + flow-sorted cells. j, Reactome analysis of upregulated genes in hseGFR + PDGFRA + and hseGFR + GAL1 + cell populations.
in vivo DNA labeling with 5-ethynyl-2′-deoxyuridine (EdU) and flow cytometry showed a fivefold increase in EdU + microglia from GBM mice compared with control brain (10.07 ± 2.21% versus 2.22 ± 1.92%) (Fig. 3b), demonstrating that increased proliferation of microglia is occurring specifically in GBM tissues. In fact, flow cytometry of tumor bed and contralateral brain over time for total and Ki67 + microglia cells demonstrated increases only in tumoral microglia during GBM progression (Fig. 3c,d).
gender-matched mice, and performed bulk RNAseq (Fig. 3e,f and Extended Data Fig. 4e,f). As expected, EdU + GBM microglia contained upregulated genes associated with mitotic cytokinesis, cell division and cell cycle (Fig. 3g). Microglia sensome 22 gene set enrichment analysis (GSEA) validated the identity of microglia (Extended Data Fig. 4g). EdUmicroglia express genes associated with inflammatory response, innate immune response, cytokine response, cellular response to interleukin-1 (IL-1) and negative regulation of nuclear factor kappa B (NF-kB) transcription factor activity (Fig. 3g). Regressing out cell cycle gene signatures from the EdU + microglia dataset did not reveal additional GOs (Extended Data Fig. 4h). Spearman correlation to embryonic (E17), postnatal and adult Tmem119 positive and negative microglia and LPS-stimulated adult microglia show that the EdU + microglia population is less related to embryonic, postnatal (Tmem119 + and Tmem119 − ) and adult (LPS-stimulated and control) microglia than EdUmicroglia ( Fig. 3f), suggesting that cycling microglia in GBM adopt a transcriptome that is different from noncycling (EdU -) microglia, the latter seemingly adopting a proinflammatory polarization.
Comparing the transcriptomes of contralateral GBM microglia with those of normal brain microglia showed that microglia located far from the GBM tumor bed are activated significantly, upregulating genes involved in major histocompatibility complex II (MHCII) antigen processing and presentation, cellular responses to interferon (IFN)-β,γ and innate immune response, whereas the top gene signatures of normal brain microglia related to phagocytosis, and positive regulation of NF-KB transcription ( Fig. 3h and Extended Data Fig. 4i). This suggests that the presence of a GBM stimulates distant microglia that are otherwise not in direct physical contact with the tumors. Applying pseudotime tracing to microglia clusters determined that cluster 4 gives rise to cluster 2 ( Fig. 3j). Interestingly, the levels of expression of canonical microglia markers P2ry12 and Tmem119 are higher in cells of cluster 4 than cluster 2 ( Fig. 3i), reinforcing the notion that cluster 2 is derived from cluster 4. Reactome analysis of EdU + and EdUtranscriptomes confirms the mitotic nature of EdU + sorted cells and the antigen processing and presentation, and IL-1 signaling of EdUcells (Fig. 3k). Together, these findings suggest that a subpopulation of microglia in the GBM tumor microenvironment (TME) respond to the tumor by proliferating and committing less to polarization programs.

Most GBM cytokines originate from intratumoral immune cells.
Recruitment of immune cells to tumors has been investigated largely from a cancer cell centric perspective. However, a growing number of studies report on the contribution of TME cells to this process 23 . To address this, we analyzed a panel of 32 cytokines by quantitative PCR with reverse transcription (RTqPCR) from CD45 − and CD45 + flow-sorted cells from mouse GBMs (Fig. 4a). We found that expression of most (22/32) cytokines was enriched in CD45 + cells (Fig. 4a) and validated by scRNA-seq results (Fig. 4b). Microglia cluster 2 expresses highest levels of Tnf (Fig. 4b-d and Extended Data Fig. 5a), which was validated independently by flow cytometry (Fig. 4e). Similarly, most cytokines receptors are expressed in CD45 + cell populations (Fig. 4c). Notably, Cxcl12 levels are higher in CD45endothelial cells (Fig. 4a,b,e) whereas its receptor Cxcr4 is expressed on T cells, NK cells, B cells, neutrophils (including polymorphonucler myeloid-derived suppressor cells (PMN-MDSCs)), macrophages and DCs (Fig. 4b,e) perhaps reflecting an unbeknown paracrine circuitry between endothelial and several immune cells. Together, these results reinforce the developing notion that most intratumoral cytokines and their receptors are expressed more from CD45 + cell populations and much less so from cancer cells.

Expression of immune checkpoints in CD45 + cells of GBMs.
The common belief that cancer cells are the sole source of checkpoint ligands and responsible for suppression of T cell immune responses is slowly changing as recent reports from us and others demonstrated the importance of myeloid-derived checkpoint molecules [24][25][26][27] . Reinforcing this notion are our findings that most checkpoint transcripts are expressed in CD45 + cells, mostly in T and myeloid cells (Fig. 4f). Notably, transcripts for PD-1 (Pdcd1) were observed only in T cell clusters ( Fig. 4g and Extended Data Longitudinal evolution of GBM TME during tumor progression. To uncover cellular and molecular changes to the GBM TME over time, normal brain and early and late stages of GBM development (Extended Data Fig. 6a and Supplementary Table 1) were analyzed. As expected, EGFR + cells were absent in normal brain and increased dramatically in early and late GBMs (Fig. 5a,b and Supplementary Table 2). Flow cytometry validated the cell cluster compositions of early and late samples, in which early GBMs were composed mostly of microglia and late GBMs saw increased infiltration of macrophages ( Fig. 5c,d). Flow cytometry of independent cohorts of GBM mice further validated these observations (Fig. 5e,f). Notably, an initial quasi-stagnant accumulation of EGFR + cells for the first 25 days was followed by an explosive growth expansion culminating in a moribund late stage preceded by an accumulation of CD45 + cells (Fig. 5e). The observed increases in macrophages, PMN-MDSCs, lymphocytes, monocytes and NK cells over time were also not detected in contralateral brain tissues, indicating that these unique cellular dynamics are localized to the TME (Fig. 5f).
Infiltration of macrophages, and PMN-MDSCs parallels bloodbrain barrier leakage. The significant increases in macrophages (6,20), PMN-MDSCs (11) and EGFR + cells late during GBM progression may reflect a disruption of the blood-brain barrier (BBB). Intravenous injection of Evans Blue (EB) and sodium fluorescein (NaF) to assess BBB integrity 30 at 10, 20, 25 and 30+ (moribund) days after tumor initiation led to extravasation and rapid accumulation after 25 days (Fig. 5i). Orthogonal validation by contrast-enhanced molecular resonance (MR) imaging over time normalized to contralateral central nervous system, showed increased enhancement between 26 and 32 days post-tumor initiation, corresponding to extensive BBB leakage (Fig. 5j). Together, these results demonstrate that the integrity of the BBB remains intact until later stages Arg 1 S100a8 S100a9 IL1b 10   of tumor development-a time that coincides with the influx of immunosuppressive cells into GBM.

Recruitment of immunosuppressive cells during progression.
To better define the molecular mechanisms by which immune cells are recruited to the GBM TME, we determined DEGs between late and early GBMs. In macrophage (6,20), gains were observed in expression of genes involved in negative regulation of inflammatory response, neutrophil chemotaxis and aggregation (6) and angiogenesis and metabolic process (20) (Fig. 6b and Extended Data Fig. 7c). In DC (1), gains in genes modulating antigen processing and presentation via MHCII and leukocyte chemotaxis were observed (Fig. 6b and Extended Data Fig. 7d). In PMN-MDSCs (11), upregulation of genes in NFAT signaling and cellular response to hypoxia and downregulation of genes that characterize inflammatory responses and regulation of T cell proliferation were observed ( Fig. 6b and Extended Data Fig. 7e). We observed decreases in transcripts of the cytokine:receptor Ccl2 and Ccr2 in macrophage (17) and (13), respectively (Fig. 6c), suggesting a cytokine relationship between perivascular macrophages and other BAMs in normal brain. PMN-MDSCs (11) expressed more Cxcl2 and less Cxcl3 in early than late GBM, and expression of their receptor Cxcr2 is highest in neutrophil (25) (Fig. 6c), perhaps reflecting a switch in Cxcr2 ligand use during GBM progression. The high expression of Il1b in PMN-MDSCs (11) and in neutrophil (25) (Fig. 6c) and its receptor on endothelial cells (18,28) (Fig. 4c), also suggest a functional interaction between granulocytes and endothelial cells in GBM.
GSEA of microglia and macrophage clusters for markers of classical 'M1-like' proinflammatory polarization and 'M2-like' protumorigenic polarization 31 reveal that microglia (2) had similar normalized enrichment scores (NES) in normal brain and early and late GBM samples whereas microglia (4) preferentially displayed protumorigenic NES in early and late GBMs (Fig. 6d). Infiltrating macrophages (6,20) had higher NES for 'M2-like' protumorigenic polarization markers in early and late GBMs compared with clusters (13,17) (Fig. 6d). Taken together, this suggests that microglia and BAM adopt a proinflammatory immune microenvironment early in tumor growth, which is lost during GBM progression.
Bone marrow changes early during gliomagenesis. Bone marrow (BM) myeloid progenitors generate MDSCs during tumor-driven emergency myelopoiesis 24,32 . Using flow cytometry (Extended Data Fig. 8a,b), we show that GBM significantly increased the hematopoietic progenitors LSK and LK, with the granulocyte/monocyte progenitors (GMPs) in the LK population showing the biggest increase (Fig. 7a). GBM mice had splenic myeloid cells increasesevidence for GBM-induced emergency myelopoiesis. In these cells, PMN-MDSC, M-MDSC and DC were increased, whereas macrophages were diminished compared with control mice (Fig. 7b). GBM mice had a significant increase in systemic T cells, characterized by elevated expression of CTLA-4 (Fig. 7a,b). These results indicate that GBM induces robust emergency hematopoiesis with significant systemic immunosuppressive implications.

TMZ depletes BM-derived GMP.
To glean insights into the effects of SOC on the GBM immune microenvironment, we analyzed immune cells from brain, BM and spleen of TMZ-treated mice by flow cytometry. A dose of 25 mg kg −1 daily of TMZ in mice is equivalent to the IR/TMZ SOC dosage (75 mg m −2 ) given to GBM patients, whereas a dose of 66.67 mg kg −1 daily in mice is equivalent to 150-200 mg m −2 postradiation adjuvant therapy 33 . After a 2-week treatment of TMZ, tissues were harvested at 24, 72 or 168 h (Extended Data Fig. 8c) and analyzed by flow cytometry (Extended Data Fig. 8d,e). Microglia numbers were unchanged; however, BAMs showed a significant decrease (control 2.75 ± 0.38 versus treated 1.35 ± 0.25) in the 66.67 mg kg −1 and a trend in the 25 mg kg −1 treated cohorts 24 h after cessation of TMZ (Extended Data Fig. 8e), effects that were not observed at later times, suggesting a temporary sensitivity of BAMs to TMZ. Dramatic depletion of GMPs in low and high doses of TMZ were observed in BM, which was more striking at 24 h, and did not recover fully until 1-week post cessation of treatments (Extended Data Fig. 8e). In contrast to GMP, the levels of common myeloid progenitors (CMP) remained unchanged, consistent with the hypothesis that more rapidly proliferating/differentiating myeloid progenitors are more sensitive to TMZ. Splenic myeloid cells were unaffected by TMZ (Extended Data Fig. 7e). Note that splenic myelotoxicities are likely to occur at later time points based on the kinetics of nadir development in patients treated with TMZ 34 . Together, these results demonstrate that treatment of healthy, nontumor mice with TMZ has little effect on population levels of microglia, BAMs and CMPs, whereas GMPs are sensitive to prolonged TMZ treatment, requiring several days for the myelotoxicity to dissipate.

TMZ effects on PMN-MDSCs, macrophages and microglia.
Treatment of GBM mice with 66.67 mg kg −1 of TMZ (Extended Data Fig. 8f) extended survival by 14 days compared with control (median survivals (days): control 9, TMZ-25 9.5 and TMZ-66.7 23) (Fig. 7c) and decreased BLI output (Extended Data Fig. 8g), whereas low dose (25 mg kg −1 ) TMZ had no effect on survival and GBM growth ( Fig. 7c and Extended Data Fig. 8g), analogous to different glioma models 35 . Flow cytometry analysis revealed that high dose (66.7 mg kg −1 ) TMZ significantly increased microglia in GBM mice but not control animals and resulted in a decrease in macrophages, whereas decreases in PMN-MDSC populations were observed with both TMZ concentrations (Fig. 7d). No statistically significant differences were observed in the relative numbers of CD45 + cells and in splenic MDSCs upon TMZ treatments ( Fig. 7d and Extended Data Fig. 8h), suggesting that decreases in tumoral PMN-MDSCs upon TMZ treatment may stem from recruitment deficiencies to the TME. Our data also imply that high dose TMZ may promote a proinflammatory phenotype by increasing microglia and decreasing suppressive populations of PMN-MDSCs and macrophages in addition to direct anti-cancer cell effects. This TMZ-mediated switch in the immune microenvironment was not sufficient, however, to induce tumor clearance, suggesting a lack of adaptive response and development of resistance mechanisms.

Patient immune heterogeneity of low and high-grade gliomas.
GBM patients are diagnosed with symptomatic late-stage disease due to the asymptomatic nature of early-stage GBMs, which considerably hampers longitudinal studies of evolution of the immunological landscape of GBM. To overcome this limitation, and to correlate immune composition with disease aggressiveness, we analyzed freshly isolated low-grade glioma (n = 5) and GBM (n = 8) patient (seven males and six females) tumor samples by flow cytometry (Extended Data Fig. 9a and Supplementary Table 3 for patient demographics). Relative amounts of CD45 + and CD11b + cells were similar in low-grade and GBM tumors and the median ages at diagnosis were 44 ± 4.9 years old versus 54.3 ± 4.1 years old for low-grade and GBM, respectively (Extended Data Fig. 9b,c). Median survival of GBM patients was 73.8 weeks and undefined for the low-grade patients (Extended Data Fig. 9d). Flow cytometry revealed that GBM immune profiles are distinct from low-grade gliomas. Notably, higher numbers of microglia are observed in low-grade gliomas than in GBMs, the latter having higher numbers of PMN-MDSCs, intermediate monocytes and macrophages (Fig. 8a-d). Neither gender, anatomical location nor IDH1 mutant status correlated with these differences (Fig. 8c,d and Extended Data Fig. 9e-j).

Discussion
GBM is an aggressive and universally fatal primary brain cancer. Clinical progress remains restricted because we lack comprehensive knowledge of the evolution of GBM tumor and immune cells during treatments. Such information is unattainable using patient GBM samples due to the inherent difficulties in performing longitudinal surgical samplings. To overcome these roadblocks, we performed scRNA-seq on a genetically engineered EGFR-driven mouse GBM model and flow cytometric analyses of mouse and patient GBM samples to determine their immunological cellular landscape at various stages of disease evolution and during SOC therapies. The most striking observation of our studies was the changes in immune cell composition as GBM developed. We uncovered an immune microenvironment progression from 'M1'-like proinflammatory microglia early during GBM tumorigenesis and in low-grade glioma tumors from patients, towards an 'M2'-like protumorigenic macrophage-centric infiltration in the advanced stages of GBM in the mouse model and in patient samples. These transitions parallel a breakdown of the BBB and an explosive growth of EGFR + cancer cells.
MDSCs are immature myeloid cells with immunosuppressive and tumor-promoting properties that accumulate in tumors 36 . MDSCs are associated with poor outcomes and correlate negatively with response to chemotherapy, immunotherapy and cancer vaccines [37][38][39][40] . In our studies, we observed that PMN-MDSCs expressed high levels of Csfr3, Ccr1, Cxcr2 and Cxcr4, whereas Cxcl12, the ligand for Cxcr4, was expressed preferentially in CD45 − GBM endothelial cells, unraveling a previously unidentified mechanism for recruitment of immunosuppressive MDSC into the GBM microenvironment. Thus, GBM progression might not simply reflect the development of cancer cell resistance and escape from chemoradiotherapy, but might be the direct consequence of immunosuppressive mechanisms induced by an accumulation of MDSCs.
Another key finding from our work is the identification of the immune cell types expressing many cytokines and their receptors in GBM. In our model, the bulk of cytokines and their receptors are expressed in CD45 + cells with few from EGFR + cancer cells. We also uncovered a potential circuitry between Cxcl12 positive endothelial cells and Cxcr4 positive PMN-MDSCs, perhaps facilitating MDSC recruitment and infiltration. Although Cxcl12 and Cxcr4 expression has been reported previously in GBM [41][42][43][44][45] , our results of Cxcr4 positive intratumoral PMN-MDSCs reveal a potential axis for therapeutic intervention, in particular in the context of checkpoint inhibition 46-48 .
Finally, an important advancement is the discovery of a highly proliferating population of GBM-associated microglia, which may play a decisive role in activating emergency myelopoiesis in GBM patients and recruit BM-derived immunosuppressive myeloid cells to the GBM TME. More in-depth molecular analyses of this population are required. Collectively, our results offer a unique view of the dynamics of immune and nonimmune cells during GBM initiation and progression and point to new areas for therapeutics development.

online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/ s41590-022-01215-0. Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/. © The Author(s) 2022 methods Study design. These were prospective studies in genetically engineered mouse models and GBM patient samples designed to study the longitudinal changes in the cellular composition and transcriptomes of the immune microenvironment of mouse GBM. For the single cell RNA-seq study, normal brain controls (n = 2 samples, each pooled from two mice), Early GBM samples (n = 4 samples each pooled from two independent mice or tumors) and Late GBM samples (n = 3, single tumor per mouse) (see Supplementary Table 1). Patient samples were obtained postsurgery and completely deidentified. For flow cytometry analyses, cohorts of mice were initiated and GBM growth parameters monitored using BLI.
Genetically engineered mouse models. All mouse procedures were carried out in accordance with Beth Israel Deaconess Medical Center recommendations for care and use of animals and were maintained and handled under protocols approved by Institutional Animal Care and Use Committee (IACUC). EGFR conditional transgenic mice (Mus musculus musculus) were generated as described [2][3][4] and crossed with Cdkn2A [−/−49 , PTEN flox/flox 50 and conditional luciferase reporter transgenic mice 8 . Animals are kept on a mixed genetic background composed of C57Bl/6J, 129S4, FVBN/J, SJL and Balb/cJ. Brainbow 2.1 mice 15 were obtained from Jackson Laboratory and crossed to EGFR conditional transgenic mice. A 1:1 ratio of 12-to 16-week-old males and females were used in all experiments. Mice were housed using a ventilated cages system on a 12-h light/12-h dark cycle at an ambient temperature of around 25 °C with 40-60% humidity.
Virus production and protocol. The pTyf TGFa-IRES-iCre and pTyf -iCre lentivirus plasmids were produced at high titer for intracranial injection by transient transfections in HEK293T (ATCC CRL-3216) cells as follows: HEK293T cells were seeded in 4 × 15 cm 2 culture plates at a density of 40-50% confluency and maintained in DMEM supplemented with 10% FBS and antibiotics. For each 15-cm 2 culture plate, lentivirus was produced by cotransfection of 20 μg lentiviral transducing vector, 15 μg of ΔR8.9 packaging vector and 10 μg of VSV-G envelope vector using TransIT-LT1 transfection reagent (Mirus) according to the manufacturer's instructions. After 72 h post-transfection, the virus-containing medium was collected and pooled. Conditioned medium was concentrated using 40 μm Amicon filters (Millipore). The viral stock was further concentrated by ultracentrifugation at 100,000g for 1.5 h. The concentrated virus was resuspended in 0.1 ml PBS, aliquoted and stored at −80 °C. To determine the functional titer, a dilution series of pTyf TGFa-IRES-iCre high titer viral preparations were used to infect a Tdt-tomato expressing cell line using a final concentration of 8 μg ml −1 of polybrene. At 48 h postinfection, cells were harvested, washed with PBS and analyzed via flow cytometry. A dilution series of commercially available adenoviral-Cre was used as a positive control for titration.

Stereotactic injections.
We performed stereotactic injections on adult animals (6-12 weeks of age). Mice of the indicated genotype were anesthetized with an intraperitoneal injection of Ketamine/Xylazine (ketamine 100-125 mg kg -1 , xylazine 10-12.5 mg kg -1 ) and mounted on a stereotaxic frame with nonpuncturing ear bars. The incision site was shaved and sterilized with betadine and ethanol surgical scrubs and a single incision was made from the anterior pole of the skull to the posterior ridge. A 1-mm burr hole was drilled at the stereotactically defined location of the striatum (1 mm rostral to the bregma, 2 mm lateral to the midline and at 2.5 mm depth to the pia surface) and a 1 μl Hamilton syringe mounted onto a Stoelting QSI (Stoelting) was used to inject the indicated Cre virus at a rate of 0.1 μl min -1 . Following retraction of the syringe or pipette, the burr hole was filled with sterile bone wax, the skin was drawn up and sutured and the animal placed in a cage with a padded bottom atop a surgical heat pad until ambulatory. Bioluminescent imaging. Mice were imaged by BLI starting at 18 days postinjection for tumor detection. Isoflurane-anesthetized mice were injected intraperitoneally with 150 mg kg -1 d-luciferin and imaged after 10 min using a Xenogen BLI set up. A region of interest (ROI) was drawn around the head and quantitated on total luminescence (photons per second per cm 2 per seridan (p cm -2 s -1 sr -1 )). Mice were monitored biweekly until tumor development, or weekly after tumor development.
IR/TMZ treatments. Mice were imaged until luminescence reached 10 7 p cm -2 s −1 sr −1 and enrolled in treatment groups. TMZ was resuspended at 2.5 mg ml −1 or 66.67 mg ml −1 in OraPlus oral suspension medium and mice were dosed via oral gavage at 10 μl g −1 for a final dosage of 25 mg kg −1 or 66.67 mg kg −1 , respectively. For survival studies mice were dosed daily until moribund, for timepoint studies mice were dosed for 1 week and harvested at 72 h following the final treatment.
For irradiation, mice were anesthetized with ketamine/xylazine (ketamine 50-100 mg kg −1 , xylazine 5-6 mg kg −1 ) to immobilize them in the apparatus. Mice were placed in the X-RAD x-ray irradiator with only their head in the field of irradiation. Mice were protected from stray radiation to the body with a sheet of lead formed loosely around them. Following radiation treatment, mice were placed on a heated pad and observed until ambulatory. Treatment regime for radiation is 2 Gy for 5 days, 2 days off, followed by 2 Gy for a further 5 days. Mice for moribund harvest are monitored until moribund, timepoint mice are harvested 72 h following the final treatment.
Brainbow immunofluorescence. GBM tumor-bearing mice were perfused with PBS to flush out circulating blood cells, and then perfused with freshly made 4% paraformaldehyde. Brains were harvested and postfixed in 4% paraformaldehyde at 4 °C overnight. Brains were then transferred to 20% sucrose at 4 °C overnight for cryoprotection, followed by 30% sucrose. Sections (50 μm) cut using a microtome were mounted and rehydrated in PBST (PBS 1% Tween20) for 10 min before slides were mounted using ProLong Gold Anti-fade mountant with 4,6-diamidino-2-phenylindole and cured overnight. Slides were imaged on Zeiss 880 upright confocal microscope and analyzed using Fiji software. The fluorescence intensity threshold was set for each image to identify signal above background fluorescence. A pixel count of masks for each color was quantitated using Image J and normalized to total fluorescent pixels per image. A minimum of four fields of view per section was quantitated and a minimum of four sections per tumor were analyzed.
Single-cell tumor harvesting. Tumors were dissociated as described above, and a small aliquot was taken for flow cytometry analysis and stained for innate and adaptive immune cell markers. The remaining sample was stained with CD45 and Zombie Yellow fixable viability dye. Cells were washed with RNase-free PBS with 0.1% BSA and immediately sorted on a fluorescence-activated cell sorting (FACS) Aria Cell Sorter for viable CD45 + and CD45populations into ice-cold RNase-free PBS with 0.1% BSA. A minimum of 1 million CD45 + and CD45cells were harvested when possible, to account for the low capture rate of InDrop system. If fewer cells were available, then two samples were pooled; this occurred for both normal brain samples and for each of the Early samples. Cells were then filtered through a 70 μm filter and brought to the Harvard Medical School Single Cell Core (https://singlecellcore.hms.harvard.edu) where they were run on an InDrop system with the goal of encapsulating 10,000 cells per sample. Following encapsulation, we performed a reverse transcription reaction and prepared libraries 51 . As per core recommendation, four libraries per 10,000 cells were prepared. Following harvest of all samples, libraries were sequenced using the NextSeq and Novaseq systems with 16 and 32 libraries per sequencing run, respectively, so that the depth of sequencing was calculated to be around 25,000 genes per cell.
Brain tumor flow cytometry. Mice were perfused with 10 ml PBS, the brain was harvested and cerebellum removed. Brain tissue was minced and resuspended in 1.5 mg ml −1 collagenase in Hanks' Balanced Salt Solution (HBSS) with calcium and magnesium. Mixture was incubated rotating at 37 °C for 30 min, with gentle dissociation using a P1000 pipette halfway through to break apart large pieces. Dissociated cell mixture was filtered through a 100 μm filter and diluted with HBSS. All washes were pelleted at 400g for 5 min. The mixture was resuspended in 30% Percoll in PBS for myelin removal at 700g for 15 min with the brake on low. The myelin layer was removed and the mixture diluted and pelleted at 400g for 5 min. Red blood cells were lysed using Biolegend red blood cell (RBC) lysis buffer according to the manufacturer's protocol. Cells were blocked with Fc block for 5 min and stained with antibodies and Biolegend Zombie fixable viability dye. Cell surface antigens were stained 30 min at 4 °C in the dark. Cells were washed twice and fixed using eBioscience FoxP3 intracellular staining kit according to the manufacturer's protocol. Cells were then stained with intracellular antibodies for 30 min at 4 °C in the dark. Cell suspensions were washed twice and resuspended in PBS for analysis. A minimum of 20,000 events were collected on a Beckman Coulter Gallios flow cytometer with acquisition software Kaluza (v.1.1.3) or BD LSR Fortessa with acquisition software FACSDiva (v.8.1) and analyzed using FlowJo (v. 10.5.3). We performed compensation using Invitrogen Ultracomp ebeads Compensation Beads, which were stained with appropriate antibody and analyzed on the same voltage and settings. Spleen tissue was harvested by mechanical dissociation and filtered through a 100 μm filter before RBC lysis and stained using the same protocol. BM was flushed from the femur and filtered before staining. A microglia gating strategy was used based on the finding that CD45 high and low are adequate markers in mouse to differentiate between macrophages and microglia, respectively, and P2ry12 was found to be a good secondary marker to confirm microglial identity 52 . MDSCs were identified using Ly6G + Ly6C + and Ly6G -Ly6C + for PMN-MDSCs and M-MDSCs, respectively 53 . The antibodies used in flow cytometry can be found in Supplementary Tables 4 and 6. Antibody combinations against the following target antigens were used to define these cell types: macrophages, CD45 hi CD11b + Ly6C − Ly6G − P2ry12 − ; microglia, CD45 lo CD11b + Ly6C − Ly6G − P2ry12 + ; PMN-MDSCs CD45 + CD11b + Ly6c + Ly6G + ; M-MDSCs, CD45 + CD11b + Ly6c + Ly6G − ; CD8 + T cells, CD45 + CD3 + CD8 + CD4 − ; CD4 + T cells, CD45 + CD3 + CD4 + CD8 − ; Treg cells, CD45 + CD3 + CD4 + CD8 − Foxp3 + , tumor cells, CD45 − hEGFR + .

BM cells and splenocytes isolation.
Preparation and staining from BM and splenic cells were done as described 24 . Briefly, cells from BM were isolated by flushing the femur of the indicated mice with PBS. ACK lysis was applied for 1 min and cells were washed three times with PBS 1× supplemented with 10% FBS followed by staining with the indicated antibodies and flow cytometry. Single cells were isolated from spleen by mashing on 70 μm filters followed by ACK lysis for 2 min and washing with PBS 1× supplemented with 10% FBS. Single-cell suspensions were plated in 96-well plates. We performed surface staining at 4 °C for 15 min with the flow antibodies listed in Supplementary Table 4. For intracellular staining, Cytofix/ Cytoperm and Permwash staining kit (BD Pharmingen) were used according to the manufacturer's instructions. We performed intracellular staining 4 °C for 30 min with flow antibodies listed in Supplementary Table 4.

Cell cycle and 5-ethynyl-2′-deoxyuridineanalysis by flow cytometry.
To determine the rate of proliferation in tumor microglia, 5-ethynyl-2′-deoxyuridine (EdU) at 100 mg kg −1 dose was administered in tumor bearing mice (EGFR WT) intraperitoneally at 16 and 4 h before harvest. After perfusion, tumor-bearing side as well as the contralateral side of brain were harvested, and the cerebellum removed. Brain was minced into small pieces and resuspended in 1.5 mg ml −1 collagenase IV in HBSS with calcium and magnesium. The mixture was incubated rotating at 37 °C for 30 min, with gentle dissociation using a P1000 pipette halfway through to break apart large pieces and enhance cell dissociation. The dissociated cell mixture was filtered through a 100 μm cell strainer, diluted with HBSS and pelleted at 400g for 5 min. The pellet was resuspended with 30% Percoll in PBS and centrifuged at 700g for 15 min with low brake to facilitate myelin removal. After myelin removal, the mixture was washed in PBS and pelleted at 400g for 5 min. RBCs were lysed using Biolegend RBC lysis buffer according to the manufacturer protocol. N = 4 mice were pooled per sample. Age-matched nontumor mice served as controls.
Throughout the experiment, RNase-free reagents were used, and all buffers contained 0.0025% RNasin Plus (Promega). Cells were stained with Zombie Yellow fixable viability dye (Biolegend) in PBS for 20 min in the dark on ice. Cells were washed once with PBS and blocked with Fc block for 5 min. Cell surface antigens (CD45 and P2RY12) were stained with antibodies for 30 min at 4 °C in the dark. Cells were washed twice with FACS buffer (1× RNase-free PBS(Invitrogen), 2% RNase-free BSA (VWR) and 0.0025% RNasin Plus) and fixed for 10 min at 4 °C with 2% paraformaldehyde using fixative provided with the EdU labeling kit (Invitrogen). Cells were permeabilized in saponin based permeabilization buffer (1× permeabilization buffer, 2% RNase-free BSA (VWR) and 0.0025% RNasin Plus). EdU-labeled cells were stained using a Click-IT reaction with Alexa Fluor 488 nm-azide (Click-iT Plus EdU Flow Cytometry Kit, Invitrogen) for 20 min at 4 °C in the dark. Cells were washed twice with 1× permeabilizing buffer containing 0.0025% RNasin Plus and resuspended in FACS buffer for FACS.
Cells were placed on ice at all times until sorting. Cells were sorted in 500 μl FACs buffer (containing 1× RNase-free PBS, 2% RNase-free BSA and 0.0025% RNasin Plus). Beckman Coulter MoFlo AstriosEQ was using for cell sorting using a 120 μm nozzle. A minimum of 50,000 events were collected at Beckman Coulter Gallios flow cytometer and analyzed using FlowJo. Compensation was performed using Invitrogen Ultracomp ebeads Compensation Beads, which were stained with specific antibody and analyzed on the same voltage and settings. All samples were acquired under the same setting and analyzed using same gating strategy as shown in Extended Data Fig. 3c. Total RNA was isolated using miRNeasy FFPE Kit (Qiagen), using manufacturer protocol with some modifications. Briefly, cells were pelleted and resuspended in 240 μl of PKD/Proteinase K solution at RT, mixed, and incubated at 56 °C for 45 min. Samples were centrifuged at 16,000g for 20 min. Supernatant was transferred to a new 1.5 ml Eppendorf tube and 10 μl of DNase Booster Buffer and 10 μl of DNase I stock solution were added for DNase treatment for 15 min. RBC buffer (500 μl) RBC buffer was added and mixed by pipetting several times; 1,200 μl ethanol was then added and mixed. Finally, the samples were passed through MinElute spin columns, washed three times with 500 μl RPE Buffer. RNA was eluted into 15 μl of RNase-free water and processed for bulk RNA-seq analysis as described below.
For cell cycle analysis, cells were stained with extracellular markers CD45, CD11b and P2ry12 and then fixed and kept in permeabilization buffer and stained with FxCycle Far Red for 20 min to label DNA content before flow cytometry analysis. DNA content stain was visualized on a linear axis and cells were read at the lowest flow rate for clear distinction in 2N versus 4N cells.
Single-cell analysis pipeline. Bioinformatic analyses were performed by the Joslin Diabetes Bioinformatics Core. Gene counts were generated by RapMap aligner 54 of bcbio-nextgen pipeline using the mouse GRCm38 transcriptome (Ensembl v.94).
To distinguish between droplets containing cells and ambient RNA, we used Monte Carlo simulations to compute P values for the multinomial sampling transcripts from the ambient pool 55 . First, we assumed that some barcodes correspond to empty droplets if their total unique molecular identifier (UMI) counts are at or below 150. We then called cells at a false discovery rate (FDR) of 0.1%, meaning that no more than 0.1% of our called barcodes should be empty droplets on average. The number of Monte Carlo iterations determines the lower bound for the P values 56 . There are no nonsignificant barcodes that are bounded by iterations, which indicated there was no need to increase the number of iterations to obtain even lower P values. We computed the maximum contribution of the ambient solution to the expression profile for cell-containing droplets. First, we estimated the composition of the ambient pool of RNA based on the barcodes with total UMI counts less than or equal to 150 for each gene in each sample. Second, we computed the mean ambient contribution for each gene by scaling the ambient pool by some factor. Third, we computed a P value for each gene based on the probability of observing an ambient count equal to or below that in cell-containing droplets based on Poisson distribution. Fourth, we combined P values across all genes using Simes' method. We performed this for a range of scaling factors and identified the largest factor that yields a combined P value above threshold 0.1 so that the ambient proportions are the maximum estimations.
To remove ambient contamination, we transformed and normalized the data for each sample using the R package sctransform 57 and performed principal component analysis (PCA) on the integrated data. To cluster the cells, we construct a K nearest neighbor (KNN) network of the cells based on the Euclidean distance in the space defined by the top principal components (PCs) selected by Horn's parallel analysis 58 . We refined the weights of the connection between pairs of cells based on their shared overlap in their local neighborhoods, that is, Jaccard similarity. To cluster the cells, we applied modularity optimization techniques, that is, the Louvain algorithm. We removed ambient RNA contamination from the cluster-level profiles and propagated the effect of those changes back to the individual cells using the removeAmbience function of the R package DropletUtils 55 . Doublets were identified using the R package scDblFinder and then removed from the downstream analysis. Briefly, thousands of doublets were first simulated by adding together two randomly chosen single-cell profiles. For each original cell, the density of simulated doublets in the surrounding neighborhood was computed. The simulated doublet density was then combined with an iterative classification scheme. For each observed cell, an initial score was computed by combining the fraction of simulated doublets in its neighborhood with another score based on coexpression of mutually exclusive gene pairs 59 . A threshold was chosen that best distinguishes between real and simulated cells, allowing us to obtain putative doublet calls among the real cells. Cell clustering and marker gene identification: we visualized quality control (QC) metrics as violin plots 60 that included the number of genes (nFeature), number of UMI (nCount) and percentage of mitochondrial UMI (percent_mt). We used the QC metrics to filter cells that have less than 500 UMI (low-quality cells), less than 500 features (low-quality cells) and more than 20% mitochondrial UMI (dying cells).
Integrating batch 1 and 2 data: we considered batch 1 and 2 as two independent datasets. Seurat includes a set of methods to match shared cell populations across datasets. These methods first identify cross-dataset pairs of cells that are in a matched biological state ('anchors') that can be used both to correct for technical differences between datasets (that is, batch effect correction) and to perform comparative scRNA-seq analysis of across experimental conditions 61 . We split batch 1 and 2 into two datasets. We transformed and normalized the split datasets separately using the R package sctransform 57 . We identified the 'anchors' and use these anchors to integrate the two datasets together. We performed PCA on the integrated data and only the top 3,000 variable genes were used as input. We performed UMAP analysis using the same top PCs as input to the clustering analysis.
We classified the cell cycle phases using the cyclone function of the R package scran 62 . Searching for DEGs (cluster biomarkers), we found markers for every cluster compared with all remaining cells using the Wilcoxon Rank Sum test, and reported those with a log 2 FC threshold of 0.25 and expressed in more than 10% of the cells. To identify cell type, we used an automatic tool, SCSA 63 , to annotate cell types for all cell clusters. It selects cell cluster markers with a P value cutoff of 0.01. Selected markers were then compared with cell type specific markers in the CellMarker database 64 . If the z-score of the first predicted cell type is more than twice as much as the second predicted cell type, or the z-score of the second predicted cell type is negative, the first predicted cell type is considered a 'good' prediction by SCSA. If the z-score of the first predicted cell type is less than twice as much as the second predicted cell type, we manually selected the one that is likely found in brain tissue or the one that belongs to immune cells.
PDGFRA and GAL-1 positive GBM populations. Tumors were dissociated into single-cell suspensions as described above. Single-cell suspensions were stained with antibodies EGFR PerCP Cy5.5 (clone: AY13, Biolegend), with either PDGFR APC (clone: APA5, BD Biosciences) or Galectin-1 PE (R&D systems). Cell surface antigens are stained for 30 min at 4 °C in the dark. Cells were washed with RNase-free PBS with 0.1% BSA and immediately sorted on a FACS Aria Cell Sorter for EGFR + PDGFR -, EGFR + PDGFR + , EGFR + Galectin-1 + and EGFR+Galectin-1 -populations. Sorted cells were resuspended immediately in Qiagen RLT buffer. RNA was isolated using the Qiagen RNeasy Micro kit as per the manufacturer's protocol and eluted using 14 μl RNase-free water.
Bulk RNA-seq analyses. RNA isolated from flow cytometry sorted cells was processed for bulk RNAseq as follows: libraries were prepared using SMARTer Stranded Total RNAseq v.3 Pico Input Mammalian sample preparation kits from 1 ng purified total RNA according to the manufacturer's protocol. The finished dsDNA libraries were quantified by Qubit fluorometer and Agilent TapeStation 4200. Uniquely dual indexed libraries were pooled in an equimolar ratio and shallowly sequenced on an Illumina MiSeq to further evaluate library quality and pool balance. The final pool was sequenced on an Illumina NovaSeq 6000 paired-end 150 bp reads (for the EdU-labeled experiment) or a High Output Illumina NextSeq550 (for the PDGFRA and GAL-1 experiment) at the Dana-Farber Cancer Institute Molecular Biology Core Facilities. The reads were trimmed for adapters and poly(A/T) tails, and then filtered by sequencing Phred quality (≥ Q15) using fastp 65 . Mouse genome sequences and gene annotation were downloaded from the UCSC goldenPath, v.mm39. We then used the genomeGenerate module of the STAR aligner 66 to generate the genome indexes. STAR aligner option sjdbOverhang = 74 for 75 bp reads.
We aligned the adapter-trimmed reads to the genome using STAR aligner with the two-pass option. Reads are mapped across the genome to identify novel splice junctions in the first-pass. These new annotations are then incorporated into the reference indexes and reads are realigned with this new reference in the second pass. While more time-intensive, this step can aid in aligning across these junctions, especially in organisms where the transcriptome is not as well annotated. Alignments were assigned to genomic features (that is, the exons for spliced RNAs) by using featureCounts 67 . Multi-mapping reads are counted as fractions. To filter out low expressing genes, we kept genes that have counts per million (CPM) more than 1 in at least three samples; there were 18,180 genes (PDGFRA/GAL1 experiment) and 14,727 genes (for the EdU experiment) after filtering. We then normalized counts by weighted trimmed mean of M-values (TMM) 68 . If no normalization is needed, all the normalization factors will be 1. We used normalization factors between 0.84 and 1.15. To use linear models in the following analysis, we performed Voom transformation 69 to transform counts into logCPM, where logCPM=log_2 (10 6 × count/(library size × normalization factor)). Voom transformation estimates the mean-variance relationship and uses it to compute appropriate observation-level weights so that more read depth gives more weight. To get an overall view of the similarity and/or difference of the samples, we performed PCA. We then adjusted for the tumor effects and recalculated the PCA analysis using the adjusted data. To discover DEGs, we use limma, an R package that uses linear models to power differential expression analyses 70 . We performed moderated t-tests to detect genes that are deferentially expressed between groups, with the tumors as covariates. We performed GSEA 71 using the microglia sensome gene sets 22 . GSEA for Reactome. We perform GSEA using the Reactome, and the lists of all genes' Z-scores of each comparison.
Mouse microglia RNAseq comparison analysis. Data from Bennett et al 72 were acquired from SRA under accession number SRP068621. The downloaded data are paired-end fastq files and each pair should have the same number of matched reads. However, for these data, there are different numbers of reads in each file (probably because some reads fail QC). Therefore, we synchronized the paired-end fastq files and separated unmatched reads using fastq-pair. RNA-seq raw reads were 75-bp unstranded paired-end reads. The reads were trimmed for adapters and filtered by sequencing Phred quality (≥ Q15) using fastp 65 . A count table was generated by aligning reads to the mouse transcriptome (Ensembl v.104) using kallisto 73 and converting transcript counts to gene counts using tximport 74 . These samples had an average alignment rate of ~57%. To filter out low expressing genes, we kept genes that have CPM more than 0.5 in at least 1 sample; there were 19,050 genes after filtering. We then normalized counts by weighted TMM 68 . If no normalization was needed, all the normalization factors will be 1. Here, the normalization factors were between 0.53 and 1.37. To use linear models in the following analysis, we performed Voom transformation as above. We performed Spearman correlation tests between different ages of mouse microglia, central nervous system myeloid, whole-brain gene expression and our microglia gene expression.

BBB integrity evaluation.
Tumor-bearing mice at the indicated time points and nontumor-bearing mice were anaesthetized using ketamine/xylazine mixture (ketamine 100 mg kg −1 and xylazine 10 mg kg −1 ; intraperitoneally (i.p.)) and mice were administered a 2% solution of Evans Blue (EB; Sigma, catalog no. E2129-10G) and 2% solution NaF (F6377-100G) intravenously at a dose of 4 ml kg −1 of body weight 30 through the retro-orbital plexus. At 30 min postinjection, mice were perfused transcardially with 0.9% NaCl until the perfusate from right atrium ran clear. Mice were decapitated and dissected into coronal sections to visualize the extravasation of tracer markers. Each region was weighed and then homogenized in 0.75 ml PBS and 0.25 ml 100% trichloroacetic acid (Sigma Aldrich, catalog no. T6399-100G). Fluorescence of the extravasated dye was determined by spectrophotofluorometry (excitation at 620 nm and emission at 680 nm for EB and excitation at 440 nm and emission at 525 nm for NaF) and expressed as micrograms per gram of brain tissue using standard curves for EB and NaF.
Gad-enhanced magnetic resonance imaging (MRI) was performed using a small animal MRI device (Bruker Biospec 94/20, Bruker) equipped with an 84 mm quadrature transmit volume coil and a four-element mouse head array. Animals were anesthetized with isoflurane in oxygen and situated on the MRI bed with respiratory monitoring and thermal support provided by a warm air circulator. For tumor volume quantitation, images of the brain were acquired in axial and coronal planes using a RARE (rapid acquisition with refocused echoes) pulse sequence with T1 and T2 weighting. Axial T1-weighted images used TR/TE = 800/23 ms, 0.8 mm slice thickness and 125 μm in-plane resolution. Coronal T1-weighted images had TR/TE = 863/23 ms, 0.8 mm slice thickness and 156 μm in-plane resolution. Axial T2-weighted images had TR/TE = 1800/33 ms, 0.6 mm slice thickness and 125 μm in-plane resolution, while coronal T2-weighted images were acquired with TR/TE = 44/2,246 ms, 0.6 mm slice thickness and 156 μm in-plane resolution. T1-weighted images were acquired with eight signal averages, and T2-weighted images had four signal averages. All images had RARE factor 8. For assessment of tumor permeability, anesthetized animals were given 0.2 mmol kg −1 of gadoteridol (ProHance) i.p., dissolved in approximately 100 μl sterile saline. The mice were then placed promptly in the MRI device, and following acquisition of scout images for localization of the relevant anatomy, T1-weighted RARE images were acquired with RARE factor 4, TR/TE = 500/16 ms, 0.8 mm slice thickness, 125 μm in-plane resolution and four signal averages. For each animal, the first T1-weighted images were acquired at exactly 10 min following contrast administration. To assess contrast uptake and washout, imaging was repeated every 1-2 min for 30 min. Images were analyzed using Horos DICOM viewer. Three independent measurements from a minimum of two MR slices were used to collect Gd-enhanced data. Volumes of T2 weighted images were calculated by establishing ROI on serial slices. EGFR immunohistochemistry. Deeply anesthetized animals were perfused transcardially with cold PBS. Brains were excised, rinsed in PBS and serial coronal sections cut using a brain mold. Thick sections were postfixed in 4% paraformaldehyde overnight. Formalin-fixed tissues were embedded in paraffin, sectioned at 5-10 μm, and stained with hematoxylin and eosin (Sigma) for histopathological analysis. For immunohistochemistry (IHC), cut sections were deparaffinized and rehydrated through a xylene and graded alcohol series and rinsed for 5 min under tap water. Antigen target retrieval solution (Dako, catalog no. S1699) was used to unmask the antigen (microwave for 10 min at low power then cooled down for 30 min) followed by three washes with PBS for 5 min each. Quenching of endogenous peroxidase activity was performed by incubating the sections for 10 min in 0.3% H 2 O 2 in methanol followed by PBS washes. Slides were preincubated in blocking solution (5% (v/v) goat serum (Sigma) in PBS 0.3% (v/v) Triton-X100) for 1 h at room temperature, followed by mouse-on-mouse blocking reagent (Vector Labs, Inc., catalog no. MKB-2213) incubation for 1 h. Primary antibody was incubated for 24 h at 4 °C. Secondary antibodies used were biotinylated anti-rabbit or anti-mouse (1:500, Vector Labs, Inc.) for IHC and were incubated for 1 h at room temperature. All antibodies were diluted in blocking solution. All immunobinding of primary antibodies was detected by biotin-conjugated secondary antibodies and Vectastain ABC kit (Vector Labs, Inc.) using DAB (Vector Labs, Inc.) as a substrate for peroxidase and counterstained with hematoxylin. Anti-human EGFR primary antibodies were used (EGFR; catalog no. 28-0005, 1:200, Zymed). EGFR staining (cluster numbers and diameters in millimeters squared) was quantified from photomicrographs and Image J by two independent observers who were blind to the images. At least three fields of view per image, four images per tumor and a minimum of n = 3 tumors were analyzed per timepoint. GO analysis. To identify genes enriched in Biological Process (BP), Molecular Function (MF) and Cellular Component (CC), we utilized the Database for Annotation, Visualization, and Integrated Discovery (DAVID) v.7.0 (refs. 78,79 ) with GOTERMs BP, MF and CC. All terms with a P value (Benjamini or Benjamini-Hochberg adjusted) less than 0.05 were considered significant and ranked by the number of genes identified in the group.
Statistical analyses. scRNA-seq statistical analysis was completed as described above. All other statistical analyses were performed using GraphPad Prism (GraphPad Software v.9.3.0). Values are given as mean ± s.e.m. or s.d. as indicated. Numbers of experimental replicates are given in the figure legends. When two groups were compared, significance was determined using an unpaired two-tailed t-test. For comparing more than two groups, one-way analysis of variance (ANOVA) was applied. Significance for survival analyses was determined by the log-rank (Mantel-Cox) test. P values < 0.05 are considered as statistically significant. No statistical methods were used to predetermine sample sizes, and our sample sizes were similar to those reported in previous publications [2][3][4] . Mice were assigned randomly to the various experimental groups described. Data collection and analysis were not performed blind to the conditions of the experiments except for analysis shown in Fig. 2b,c. Data distribution was assumed to be normal, but this was not formally tested. No datapoints were excluded from the analyses.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All data and materials used in the analysis are available in some form to any researcher for purposes of reproducing or extending the analysis. In rare instances, a material transfer agreement (MTA) may be required. scRNA-seq and bulk RNAseq data files are publicly accessible in the Gene Expression Omnibus under accession numbers GSE195848, GSE196174, GSE196175 and GSE195813. All analyses and visualizations were performed in R (v.3.6.3). Source data are provided with this paper.