Chemogenomic profiling of breast cancer patient-derived xenografts reveals targetable vulnerabilities for difficult-to-treat tumors

Subsets of breast tumors present major clinical challenges, including triple-negative, metastatic/recurrent disease and rare histologies. Here, we developed 37 patient-derived xenografts (PDX) from these difficult-to-treat cancers to interrogate their molecular composition and functional biology. Whole-genome and transcriptome sequencing and reverse-phase protein arrays revealed that PDXs conserve the molecular landscape of their corresponding patient tumors. Metastatic potential varied between PDXs, where low-penetrance lung micrometastases were most common, though a subset of models displayed high rates of dissemination in organotropic or diffuse patterns consistent with what was observed clinically. Chemosensitivity profiling was performed in vivo with standard-of-care agents, where multi-drug chemoresistance was retained upon xenotransplantation. Consolidating chemogenomic data identified actionable features in the majority of PDXs, and marked regressions were observed in a subset that was evaluated in vivo. Together, this clinically-annotated PDX library with comprehensive molecular and phenotypic profiling serves as a resource for preclinical studies on difficult-to-treat breast tumors.

B reast cancer comprises a heterogeneous collection of malignancies exhibiting distinct disease trajectories 1 . The histological and molecular diversity between tumors have been associated with important clinical phenotypes, namely metastatic potential and therapeutic response, which dictate survival 2,3 . Certain subsets of breast cancer patients represent unique clinical challenges, including triple-negative breast cancers (TNBC), which currently lack targeted therapies, metastatic disease, which is broadly treatment-resistant, and rare histological variants, where evidence-based guidelines are deficient 4,5 .
Translational research relies on preclinical models as approximations of human tumors to address these challenges. Although cell lines and genetically engineered mouse models have contributed to seminal advances in our understanding of breast cancer biology, they have largely failed to account for inter-and intra-tumor heterogeneity, at least partially contributing to the high rate of attrition in oncologic drug development [6][7][8][9] . Due to the human-origin and limited selective pressures of immediate transplantation, patient-derived xenografts (PDX) have emerged as models for preclinical drug testing 6,10,11 . Although previous efforts to develop and characterize breast cancer PDXs have demonstrated the relative molecular fidelity of these models, their ability to recapitulate clinical phenotypes is of equal importance, yet remains unclear [12][13][14][15][16] . Anecdotal evidence supports the retention of therapeutic response and metastatic propensity upon xenotransplantation, but this has still not been systematically evaluated across PDX libraries with extensive clinical and molecular annotation 12,13,17 .
Here we develop a series of PDXs representing breast tumors with unmet clinical needs. Molecular characterization is performed by whole-genome (WGS) and transcriptome (RNA-seq) sequencing and reverse-phase protein array (RPPA), which is complemented by in vivo evaluation of metastatic dissemination and chemosensitivity. We demonstrate that these models recapitulate the biology of parental human tumors and that the PDX platform serves as a tool for discovery and testing of precision therapeutics for those tumors showing poor responses to conventional chemotherapeutics.

Results
Establishment of poor prognosis breast cancer PDX library. To develop preclinical models of breast cancers with unmet clinical needs, patient-derived xenografting of select cases was integrated into an existing biobanking protocol with extensive clinicopathological annotation to allow for molecular and functional interrogation (Fig. 1a). The selection criteria included tumors that were: (1) ER−; (2) HER2+; (3) high-grade ER+; (4) metastatic; or (5) rare histological variants. Orthotopic engraftment into female immunocompromised mice was successful for 36 tumor samples of 81 attempts, for an overall take rate of 44.4% (Fig. 1b). Engraftment success associated with the following clinicopathologic features: high-grade, low-ER expression (≤15%), HER2-negativity, germline BRCA1/2 mutation, previous systemic treatment and presence of axillary lymph node (ALN) metastases ( Supplementary Fig. 1a). Supporting the aggressive biology of the cohort, successful engraftment was significantly associated with shorter progression-free survival (PFS) among patients whose tumors were used to attempt xenotransplantation (Log-rank p = 0.027) (Fig. 1c). Clinicopathologic features of the patients and PDXs are shown in Table 1, which demonstrates the high representation of TNBCs (73.0%) and rare histological variants (18.9%) in the PDX cohort.
Altogether, the Goodman Cancer Research Centre (GCRC) PDX library represents an aggressive breast cancer cohort comprised of 37 novel PDX lines derived from 36 tumors from 34 unique patients. One patient had three PDX models derived from their tumors-two sublines from distinct histological regions from their primary tumor (GCRC1784Xd/c, discussed below) and one from mediastinal lymph node metastasis (GCRC2054X) that developed at a later time point. Another patient had two PDXs developed from their tumors-one from their primary tumor (GCRC1915X) and another from a lung metastasis (GCRC2076X), which was sampled at the time of recurrence.
Tumor growth kinetics were evaluated over serial passages, where the median time to endpoint (10 mm in largest diameter) on first transplant generation was 128 days (range 30-234 days), and significantly decreased over subsequent passages (p < 0.001) (Fig. 1d, Supplementary Fig. 1b). Unlike engraftment rates, the only clinicopathologic parameter that was significantly associated with growth kinetics was previous exposure to therapy, where pre-treated PDX lines grew faster (p = 0.004) ( Supplementary  Fig. 1c). To address the feasibility of prospective drug testing using PDXs, the time to endpoint of passage two (P2) (a timeframe conducive for in vivo drug sensitivity studies) was compared to time-to-progression. Only 12.5% (3/24) of primary tumors reached P2 endpoint prior to patient progression; however, 63.6% (7/11) of patients with recurrent or metastatic disease progressed within this timeframe (Fig. 1d). This highlights a potential barrier for prospective personalized drug sensitivity testing using PDXs as avatars in advanced breast cancer. Together, the engraftment success rates, association with clinicopathologic factors and outcomes as well as growth kinetics described herein are consistent with previously published breast PDX cohorts 12,13,15 .
PDXs retain histopathological features of primary tumor. To assess the histopathological fidelity of PDXs, we evaluated patient tumor-PDX pairs for concordance using a panel of immunohistochemical (IHC) markers. Due to the potential for lymphoproliferative outgrowths arising during xenotransplantation, all xenografts were initially screened for epithelial (pan-cytokeratin, pan-KRT) and lymphoid (CD45) markers to validate the epithelial origin of each PDX line ( Supplementary Fig. 2a) 18 . Two lines were found to contain primary xenografts that were CD45+ (GCRC1924 and GCRC2034), both of which were derived from tumors with florid lymphocytic infiltrate ( Supplementary Fig. 2b). Although one of the four GCRC1924 primary outgrowths established as a pan-KRT+/CD45− tumor and could be serially propagated as such, all P1 transplants of GCRC2034 were pan-KRT−/CD45+ and therefore this line was excluded from the final series.
Further evaluation of breast cancer-associated markers (ER, HER2, Ki67, p53, vimentin, CK5/6, CK8/18) revealed IHC diversity across the PDX library, with striking similarities between parental tumors and PDXs (Fig. 1e, Supplementary Fig. 3). Regarding receptor expression, ER and HER status were concordant in 80.6% and 100%, respectively (Fig. 1f). Of the seven cases with ER discordance, five were from patients whose tumors were low-ER-expressing (1-15%) and corresponding PDXs were ER negative (GCRC1715, 1784(GCRC1715, , 1882(GCRC1715, , 2001(GCRC1715, , 2047. Conversely, re-expression of ER in the PDX occurred in two lines. One was an ER− skin recurrence from a patient who was initially diagnosed with an ER+ primary breast cancer and was sampled for xenografting while the patient was being treated with a selective-estrogen receptor degrader (GCRC1971). The other was an ER− brain metastasis from a patient who initially had an ER+ primary breast cancer (GCRC1944) (Fig. 1e, f; primary tumors not shown).
Architectural features and lineage markers were most notable among rare histological variants. This includes the cylindromatous organization with p63+ basal cells from an adenoid cystic carcinoma (ACC, GCRC1828); synaptophysin expression in a neuroendocrine carcinoma (NE, GCRC1979); mucicarmine+ lakes from a mucinous tumor (GCRC2007); loss of E-cadherin in an invasive lobular carcinoma (ILC, GCRC1971); and vimentin+ mesenchymal-like cells dispersed in a chondroid matrix in a metaplastic breast cancer (GCRC1784) (Fig. 1f,  Supplementary Fig. 4a). In the latter example, both ductal and Fig. 1 Breast cancer PDX collection. a Schematic of live biobanking protocol of primary and metastatic breast cancer (BrCa). Fresh breast tumors were divided for traditional biobanking (FFPE and snap frozen) and generation of PDX. These samples underwent molecular (WGS/RNA-seq and RPPA) and functional (in vivo drug sensitivity, metastasis assays) analyses. FFPE formalin-fixed paraffin-embedded, PDX patient-derived xenograft, WGS wholegenome sequencing, RNA-seq RNA-sequencing, RPPA reverse-phase protein array. b Bar chart of success rate for engraftment attempts per tumor sample. Success (n = 36), failure (n = 45). c Kaplan-Meier analysis of progression-free survival (PFS) for patients whose tumors successfully generated PDXs versus those that failed. d Bar chart of growth kinetics (time to endpoint, defined as growth to 10 mm) across passage 1-3 (P1, P2, P3) of PDX models (P1 median n = 2.5, range 1-11; P2 median n = 6, range 3-29; P3 median n = 4, range 3-15). Mean ± SD. Patient follow-up (Pt F/U) time and time to progression (Pt progression) are overlaid. *Mouse sacrificed before endpoint reached. e Representative H&E, estrogen receptor (ER) and HER2 staining from patient tumor and corresponding PDXs, representing the four major clinical subtypes. Left, whole-image scale bar 200 μm; right higher-magnification scale bar 50 μm. Complete staining panel in Supplementary Fig. 3  chondroid histological components were observed in the GCRC1784 patient tumor, which was recapitulated in one of the P1 mice (P1-8), whereas the other P1 PDX displayed pure ductal morphology (P1-7) ( Supplementary Fig. 4b, c). Serial transplantation resulted in purification of each histological component, which remained stable over subsequent passages, and the resulting sublines were further characterized separately (GCRC1784Xd, ductal; and GCRC1784Xc, chondroid) (Supplementary Fig. 4c). Altogether, this data supports the preservation of histopathologic features upon xenotransplantation.
Examining a panel of genes, which have been causally implicated in breast tumorigenesis revealed a long-tail distribution, where the frequencies of alterations in the PDX cohort were comparable to human breast cancer datasets, particularly ER-tumors ( Supplementary Fig. 6). Hotspot oncogenic drivers, truncating loss-of-function mutations and CNAs were identified in members of critical breast cancer pathways, including P53, receptor tyrosine kinase (RTK), PI3K and MAPK signaling, as well as cell cycle, transcriptional and epigenetic regulators and were highly conserved in PDXs (Fig. 2c). Together, these data show that although there is variability in clonal evolution upon xenotransplantation at the genome-wide scale, PDXs faithfully maintain the diverse landscape of coding mutations and oncogenic drivers displayed in their parental breast tumors.
PDXs preserve the expression landscape. To evaluate the expression landscape, transcriptome sequencing was performed on 37 PDX samples and 29 matched patient tumors. Initial hierarchical clustering using the top 1000 most variable genes separated the PDXs from human tumors, driven by a gene set that is predominantly expressed in human samples (Supplementary Fig. 7a). Differential expression analysis between 29 human tumor-PDX pairs revealed 3037 genes displaying log2fold higher expression in human samples, which were highly enriched for immune-related pathways ( Supplementary Fig. 7b, c). After filtering these genes to perform an epithelial-centric analysis, all tumor-PDX pairs clustered adjacent to one another (Fig. 3a). Samples derived from an individual patient, including PDXs from metastatic lesions, clustered with one another (GCRC1784/2054 and GCRC1915/2076), further supporting retention of the gene expression landscape upon xenotransplantation.
Classification of samples by breast cancer intrinsic subtypes revealed 86.7% concordance for tumor-PDX pairs, where 26 (70.3%) of PDXs were basal-like, 10 (27.0%) were HER2-enriched (HER2E) and one (2.7%) was luminal B (Fig. 3a). Subtype switching from normal-like in human to basal-like in PDX was observed in tumors with high stromal content (GCRC1828 and GCRC1886), which was lost upon engraftment ( Fig. 1f, Supplementary Fig. 3). The other two subtype switches were basal-like in human to HER2E in PDX (GCRC2001 and GCRC2029).
To further interrogate the expression landscape at the protein and signaling levels, 37 PDX lines were subject to RPPA against total (n = 176), phosphorylated (n = 64), methylated (n = 2) and detyrosinated (n = 1) proteins. After removing an outlying sample known for high matrix protein production (GCRC1784 chondroid, correction factor 0.33), supervised differential expression analysis based on receptor status confirmed expression of proteins previously associated with ER (ER, AR, INPP4B), HER2 (pHER2) and TNBC (pATM, pChk2, Notch1, B-catenin, TP53, low PTEN) biology (Fig. 3e). At the pathway level, variability was seen in hormonal, cell cycle, apoptosis, DNA damage, EMT and  other specific signaling pathways (PI3K/Akt, RTK, Ras/MAPK, TSC/mTOR; Fig. 3f). The overall correlation between RNA and RPPA data was high. At the gene level, 63.3% of transcripts/probes were significantly correlated across the PDX collection (B-H p-value < 0.05) ( Supplementary Fig. 7d). This was consistent with TCGA data, where a significant correlation between RPPA/RNA-seq R values was observed in the 108 probes common to our and TCGA datasets (Pearson r = 0.61, p < 0.0001) (Supplementary Fig. 7e). Although several probes measuring post-translational modifications (PTM) were highly correlated with RNA levels (e.g. pHER2, p4E-BP1, pRB), there was an enrichment for PTM RPPA probes in those that did not correlate with RNA expression, further supporting the added-value of RPPA data (Fisher's exact test  Fig. 7f). Together, these data demonstrate that the PDX collection displays diverse expression programs that are representative of their parental tumors.
PDXs model distinct patterns of metastatic dissemination. As space-filling lesions in vital organs, metastatic disease is generally treatment-refractory and the primary cause of breast cancer mortality. To assess the metastatic potential of the breast PDXs, we screened 29 models by gross organ examination at the time of necropsy during routine xenograft passaging (Fig. 4a). Metastatic lesions were observed in 14 lines, the majority of which were micrometastases to the lung at <50% penetrance ( Fig. 4a, b,  Supplementary Fig. 8a).
In PDX models with the highest frequencies of metastasis, corresponding patients displayed significantly worse PFS (Logrank p = 0.004) (Fig. 4c). Two of the highly metastatic PDX lines (GCRC1986 and 1991) appeared to be able to disseminate to multiple sites at high frequency, both of which were derived from patients who developed diffusely metastatic disease early in the course of their disease (Fig. 4a, d, h). To better quantify metastatic capacity, microscopic examination was performed on necropsy specimens in larger cohorts of mice (n = 9) after developing symptomatic metastases following primary resection when mammary fat pad tumor diameter reached 10 mm. The GCRC1986X model was generated from a liver metastasis in a patient initially diagnosed with locally advanced, ER+ breast cancer that disseminated to the liver and lungs within 10 months of diagnosis (Fig. 4d). At experimental endpoint (median 116 days, range 101-143), PDXs from this patient demonstrated high rates of metastasis to the ALN (66.7%), lung (88.9%), liver (88.9%), brain (55.6%) and within the abdomen (77.8%) (Fig. 4e-g). Similarly, the GCRC1991X model demonstrated broad metastatic propensity, and was developed from a patient who presented with synchronous metastatic HER2+ breast cancer (Fig. 4h). In the PDX, similarly high rates of metastasis were seen in the ALN (66.7%), lung (100%), liver (100%), brain (22.2%) and abdomen (55.6%) at experimental endpoint (median 101 days, range 81-136) (Fig. 4i-k).
Contrasting these widespread patterns of dissemination, the GCRC1971X PDX displayed organotropism. This ER+ ILC model was generated from a skin recurrence seven years after diagnosis (Fig. 4l). Following the development of severe twirling behavior in 2/4 (50%) mice from the first transplant generation of our metastasis screen, cranial dissection revealed the presence of large skull-base metastases (Fig. 4o). Clinical follow-up on this patient revealed a metastatic lesion to the cavernous sinus within one year of her local recurrence (Fig. 4p). A larger PDX cohort validated the initial observation, where all mice developed skullbase metastases with low frequency spread to other sites by endpoint (median 152 days, range 124-154) (Fig. 4m, n).
Together these data show that although the majority of PDXs display low-penetrance metastases to the lung and/or axilla, a subset can mimic the widespread or organotropic patterns of dissemination observed in the patient.
PDXs retain the chemosensitivity profile of patients. To systematically evaluate chemosensitivity profiles across the GCRC library, agents from the major classes of chemotherapeutics commonly used in advanced breast cancer were screened in a PDX clinical trial (PCT) using a "one mouse per treatment per model" approach ( Supplementary Fig. 9a) 6 . The in vivo efficacy of doxorubicin, gemcitabine, cisplatin and paclitaxel were evaluated as single agents over a~28-day study across 25-31 PDX lines. Responses were quantified and stringently categorized as complete response (CR), partial response (PR), stable disease (SD) or progressive disease (PD) according to the mRECIST criteria (Supplementary Table 1) 6 . By performing replicates across 13 different model-drug-response combinations, we found the PCT approach to be highly reproducible, where a high correlation between replicates was observed (r = 0.9646, p < 0.0001) ( Supplementary Fig. 9b).
The fidelity of the PDX response was assessed as a clinical cohort and from the perspective of an individual patient. At the population level, variable response rates were observed for each drug compared to an untreated control group, as demonstrated by rightward shifts in the waterfall plots (Fig. 5a). The objective response rates (CR + PR) for each agent ranged from 8.0 to 51.5%, in line with those observed in single-agent studies in patients with metastatic breast cancers (Fig. 5b) [24][25][26] . These response rates translated to significant improvements in tumor volume doubling-free survival for each of the chemotherapytreated cohorts compared to untreated mice (Log-rank p < 0.005 for each treatment arm) (Fig. 5c).
Diverse response profiles were observed by evaluating the chemosensitivity profiles of 30 PDXs tested with multiple singleagent regimens (Fig. 5d). Overall, the average response across the four chemotherapies evaluated in the PDXs correlated with the number of drugs the patient had been exposed to prior to engraftment (Fig. 5e). This multi-agent analysis could be used to further classify PDXs as broadly chemosensitive (≥2 CR/PR) or resistant (≤1 CR/PR). This classification based on PDX response was reflected in patient clinical outcomes, where patients with chemoresistant PDXs displayed worse PFS (Fig. 5f).
To evaluate whether individual tumors retain therapeutic sensitivity profiles following xenotransplantation, PDX response data were retrospectively compared with responses observed in the patient prior to engraftment. Sufficient patient data were available for doxorubicin, cisplatin and paclitaxel (no patients were treated with gemcitabine prior to engraftment) and patient responses were classified as sensitive (CR/PR) or resistant (SD/ PD). PDXs from patients' whose tumors were clinically chemoresistant consistently exhibited decreased sensitivity compared to tumors that were sensitive and/or unexposed to the Fig. 3 Expression landscape of PDX library. a Heatmap of unbiased hierarchical clustering across PDX (n = 37) and patient tumor (n = 29) samples, which underwent RNA-sequencing and filtering of stromal genes. Clustering based on top 1000 most variable genes (interquartile range), with histopathological (histology, ER, HER2) and subtype (absolute intrinsic molecular subtype, AIMS) annotation (top). Heatmap of PAM50 genes (below). b Heatmap of unbiased hierarchical clustering across PDX samples (n = 37) using ssGSEA scores for C2 CGP gene sets from MSigDB from RNA-seq data. c Heatmap of ssGSEA scores and gene expression (RNA-seq) for breast cancer subtype-associated pathways (luminal, ERBB2, basal). Patient tumor samples are ranked by increasing score (top) with corresponding PDXs (bottom) in the same order (n = 29). Correlation plot of ssGSEA pathway scores between human and PDX samples (below heatmap). The same applies for d for proliferation, hypoxia and epithelial-to-mesenchymal transition (EMT) signatures. e Heatmap of differentially expressed genes based on ER/HER2/TNBC status of PDX samples (n = 36), which underwent reverse-phase protein array (RPPA). f Heatmap of unbiased hierarchical clustering of RPPA pathway scores across PDX samples (n = 36). DDR DNA damage response, RTK receptor tyrosine kinase.
given agent (Fig. 5g). Together, of the 34 patient-PDX-drug combinations that could be compared, 64.7% showed concordant responses (Fig. 5h). Half of the discordances were cases in which the patient showed doxorubicin sensitivity but was resistant in the PDX, which may be attributed to dose-limiting toxicity observed in the mouse when doxorubicin was administered above 3 mg/kg. Alternatively, this could be explained by engraftment of an enriched subpopulation of drug resistant subclones from residual disease following a partial response to neoadjuvant chemotherapy 27 . The mechanism by which three cases went from paclitaxel resistant in the patient to sensitive in the PDX remains unclear, though the effect of drug holiday during PDX generation resulting in re-sensitization may play a role 28,29 .
PDXs identify therapeutics for difficult-to-treat tumors. To evaluate the utility of our chemogenomically profiled PDX library in the identification of therapies for difficult-to-treat tumors, actionability was assessed across WGS, RNA-seq, RPPA and chemosensitivity studies using previously published actionable features (OncoKB, ESMO/ESCAT, DEPO, Akbani et al.) (Fig. 6a) 30   E545K in GCRC1971X) and ERBB2 hotspot mutation (L755S in GCRC1715X) (Fig. 6a, Supplementary Table 2 30,31 . Outlier expression analysis of RNA and RPPA data was also evaluated to identify further targets (Fig. 6a). Although this confirmed several of the clinically established targets (ER/PR and HER2/pHER2), it also revealed other expression outliers currently undergoing clinical (CD274/ PD-L1, androgen receptor) and preclinical (pChk1/2/ATM/ATR, FASN/ACC1, FAK) evaluation 32,33 . Proof-of-concept functional experiments were performed for several models representing unique clinical challenges. For the GCRC1971X model, an ER + ILC that displayed skull-base metastases and multi-chemoresistance in vivo, WGS revealed multiple candidates (PIK3CA hotspot, FGFR1 amplification, CDH1 truncating mutation) (Figs. 4l-p and 6a). In addition to being amplified, FGFR1 was highly expressed and was further pursued because of its known roles in ILC biology and endocrine therapy resistance, which the patient displayed (Fig. 6b) 34,35 . A small PCT was initiated to evaluate the efficacy of BGJ398, an orally available FGFR inhibitor, among a subset of PDXs displaying the highest FGFR1 copy number and/or RNA expression across the PDX library (Fig. 6c). Although GCRC1971X achieved a CR within a week of initiating treatment, responses were poor for the four other models with lower levels of FGFR1 amplification/expression (Fig. 6c, d). These findings were upheld in the metastatic setting, where BGJ398 induced regressions in spontaneous skull-base metastases for this PDX model (Fig. 6e).
To address the challenge of treating patients with rare histological variants, GCRC1979X was further investigated. This triple-negative neuroendocrine breast tumor was derived from a locally recurrent lesion one year after undergoing neoadjuvant chemotherapy and breast conserving surgery, at which point a PDX was developed. Consistent with the patient's lack of response to anthracycline/taxane, the PDX did not respond to doxorubicin or paclitaxel in the chemosensitivity screen (Fig. 5d). AKT was found to be an RPPA outlier, and given the success of everolimus in advanced gastrointestinal neuroendocrine tumors, which display frequent PI3K/mTOR activation, this pathway was  further interrogated 36 . The GCRC1979X PDX ranked highest for an RPPA-based PI3K/mTOR activation signature (Fig. 6f). The efficacy of everolimus was evaluated in the GCRC1979X model in the context of a PCT of 19 PDX models to further evaluate the specificity of response (Fig. 6g). Overall, everolimus response correlated with PI3K/mTOR activation score, where the GCRC1979X displayed a strong and durable response to this agent (Fig. 6h, i)   (c.2002C>T) reported as benign, failed to respond to neoadjuvant doxorubicin, cyclophosphamide and paclitaxel (Fig. 6j). A PDX was generated at the time of surgery and chemosensitivty was evaluated as the patient underwent adjuvant chemotherapy (cyclophosphamide, methotrexate and 5-fluorouracil). Similar to the patient, the PDX did not regress with paclitaxel or doxorubicin (Fig. 6k). Seven months after surgery, the patient recurred with multiple liver metastases. Concordant with the response observed in the PDX, the patient was empirically started on cisplatin and demonstrated a marked regression of her large liver lesions (Fig. 6j, k). Eventually, the patient developed toxicity and required switching to carboplatin, and thereafter progressed. These functional studies demonstrate the potential for biomarker discovery and identification of precision therapies for diverse clinical challenges using a library of chemogenomically profiled PDXs.

Discussion
Although tremendous progress has been made in the treatment of breast cancer leading to improved outcomes, there are subsets of patients who unfortunately remain particularly difficult-to-treat. Models that faithfully recapitulate both molecular and phenotypic features of these tumors are critical for the translation of basic discoveries to improve the management of these patients. The comprehensive molecular profiling and screening of metastatic potential and chemosensitivity herein indicate that PDXs largely maintain the biology of the tumors from which they are derived, and therefore represent a valuable preclinical resource. The PDX library described herein fills several voids in the breast cancer modeling space. We have generated novel xenografts for tumor types where no models currently exist to our knowledge, including rare histological variants of breast cancer (e.g. NE, ACC). In addition, we describe the first breast PDX exhibiting highly selective skull-base organotropism (GCRC1971X). We have also contributed to a growing list of TNBC PDXs available to the research community, which will need to continue to expand to capture the full breadth of this heterogeneous subtype [37][38][39][40] .
This collection also represents the largest series of distinct breast cancer PDX models, which have undergone WGS with their corresponding patient tumors. Although the mutational load, copy number profiles and driver alterations were robustly conserved, the strength of VAF correlations was variable across PDX lines (Fig. 2, Supplementary Fig. 5). Eirew et al. 15 have previously examined this using deep and single-cell sequencing to delineate multiple patterns of clonal dynamics upon xenotransplantation. Given that we and others have demonstrated preservation of the transcriptional and protein signaling outputs in PDXs, the functional significance of these subclonal shifts remain unclear [12][13][14]41 . Furthermore, we find PDXs retain signaling activity, which have traditionally been associated with microenvironmental stimuli, including interferon and hypoxia responses. Whether this is secondary to tumor-stromal interactions with the immunocompromised host mouse or cancer cellautonomous properties remain unclear.
PDXs have recently been shown to serve as a robust platform for population-level preclinical drug screening using a highthroughput 1×1×1 approach 6 . Our data support the feasibility and reproducibility of this strategy, whereby PDX response rates to standard-of-care chemotherapeutics were similar to what is observed clinically (Fig. 5). Although measurement accuracy is sacrificed in search for clinically meaningful tumor regressions using hundreds of unique models in the 1×1×1 approach, we have also found value in evaluating drug sensitivity across~30 PDX models with substantial molecular heterogeneity for signal detection in discovery-phase experiments. By screening a portion of this PDX cohort for gefitinib sensitivity, we have identified a subset of TNBCs, which highly express EGFR specifically within the tumorigenic subpopulation of their tumors, which renders them vulnerable to EGFR inhibition, as opposed to uniformly high EGFR expression, which did not correlate with sensitivity 42 .
Defining the chemosensitivity landscape of our PDX library provides important clinical context in the evaluation of novel agents and may guide design for subsequent clinical trials. For example, drugs that cause regression in multi-chemoresistant PDX models may have enhanced success in the late-stage/metastatic setting versus another agent, which only shows activity in untreated/chemosensitive models, where the neoadjuvant setting may be more appropriate for clinical evaluation. Chemosensitivity profiles in conjunction with deep molecular characterization provides a platform for biomarker discovery and mechanistic studies. PDXs also allow for dynamic temporal sampling that can be a challenge in patients, a strategy that has been used to identify resistance mechanisms to PI3K inhibition in TNBC 43 .
Our data are consistent with retention of chemosensitivity upon xenotransplantation, yet the prospect of using PDXs as avatars in a predictive setting to guide therapy will require coclinical studies (e.g. REFLECT study, NCT02732860). Although cases representing clinical challenges were purposefully selected for inclusion in our study, chemogenomic profiling revealed nearly all models displayed actionable features that could be interrogated, several of which demonstrated marked regressions in vivo (Fig. 6). Although the drugs used in our study have been approved in other indications and/or safely used in humans, evaluating experimental agents in PDXs must account for species differences in pharmacokinetics/dynamics so that they are tested using clinically relevant dosing. Our data highlight that the practical application of avatars would be a challenge in the metastatic setting where patients often progress prior to establishment of the PDX, however, the adjuvant setting in high-risk TNBCs could provide a window-of-opportunity where molecular characterization and functional testing could be performed prior to recurrence. Izumchenko et al. 11 have demonstrated that PDXs from multiple tumor types accurately recapitulate the patient's Fig. 6 Chemogenomic profiling of PDXs reveals actionable feature for difficult-to-treat tumors. a Heatmap of actionable features for PDX models (n = 36) derived from clinical data (receptor and germline BRCA1/2 status), whole-genome sequencing (WGS; copy number alterations and mutations), outlier expression from RNA-sequencing and reverse-phase protein array (RPPA) and chemosensitivity from in vivo profiling. b Heatmap of FGFR1 copy number and RNA expression across PDX library (n = 33). c Waterfall plot of BestAvgResponse for PDX clinical trial in mice treated with BGJ398 (n = 5). d Tumor growth curve for mammary fat pad tumors of GCRC1971 PDX either untreated or BGJ398-treated (n = 1 per arm). e Tumor growth curve for spontaneous luciferase-tagged skull-base metastases of GCRC1971 PDX untreated and BGJ398-treated mice (n = 3 per arm). Mean ± SEM. f Heatmap of PI3K/mTOR combined RPPA signature score (above) and expression of individual probes (below) across PDXs (n = 36). Samples ranked on PI3K/mTOR score. g Waterfall plot of BestAvgResponse for PDX clinical trial in mice treated with everolimus (n = 20). h Correlation plot of PI3K/mTOR RPPA score and BestAvgResponse to everolimus for cohort in g. i Tumor growth curve for GCRC1979 PDX either vehicle or everolimus-treated (n = 3 per arm). Mean ± SEM. j Schematic of clinical history for patient GCRC1738, including treatments and tumor sizes. k Tumor growth curves for GCRC1738 PDX either untreated, doxorubicin, paclitaxel, gemcitabine or cisplatin treated (n = 1 per arm).
clinical response even after late relapse and/or intervening lines of therapy, though further studies prospectively demonstrating the predictive capacity of PDXs are warranted. Although the broad applicability of this approach would currently be prohibited by cost and inefficient take rates, personalized avatars could be evaluated in the context of high-risk/difficult-to-treat patients as validation models for select drug candidates identified by higherthroughput explant cultures 16 . . Excess breast tumor tissue was transported to the laboratory in ice-cold transport medium (DMEM/F12, 50 μg/ml gentamicin, 1× penicillin-streptomycin, 2.5 μg/ml fungizone). Samples were cut into 1 mm 3 fragments and transplanted into the fourth mammary fat pad of 5-7-week-old NOD.Cg-Prkdc scid Il2rg tm1Sug /JicTac (NOG) female mice (Taconic) under sterile conditions. Fine needle aspirates were washed in PBS, resuspended in PBS:matrigel (Corning) (1:1) and 50 μl was injected orthotopically. For ER+ tumors, estrogen supplementation was administered as subcutaneous wax pellets as previously described 44 . Mice were palpated weekly and measured using calipers and harvested for in vivo passaging when tumors reached endpoint (>10 mm in the largest dimension). Tumors fragments were cryopreserved by freezing in FBS with 10% DMSO in a CoolCell container (Biocision) at −80°C.

Methods
Immunohistochemistry. Tissues were fixed in 10% neutral-buffered formalin for 24 h, paraffin-embedded and sections were cut at 4 μm. Sections were deparaffinised in xylenes, re-hydrated in ethanol followed by antigen retrieval in boiling 10 mM citrate buffer (pH 6.0). Slides were blocked with Power Block (BioGenex), incubated primary antibody for 1 h at room temperature with the following anti DNA and RNA isolation. Tissue was snap frozen in OCT (Tissue-Tek) and sectioned for histopathological review. Total DNA and RNA was isolated from adjacent sections using the AllPrep DNA/RNA Mini Kit (Qiagen). Germline DNA was derived from buffy coat and extracted using the DNA Blood Maxi Kit (Qiagen). DNA was quantified using the Qubit fluormeter. RNA was quantified by NanoDrop and integrity was evaluated with Bioanalyzer 2100 (Agilent). The following samples were not included in genomic analyses: GCRC1924T (WGS/RNAseq; tumor sample unavailable), GCRC1944T (RNA-seq; low-quality RNA), GCRC1979T (WGS/RNA-seq; tumor sample unavailable), GCRC1986T (WGS/ RNA-seq; tumor and germline samples unavailable), GCRC1986X (WGS; patient germline sample unavailable for alignment), GCRC2054T (WGS/RNA-seq; tumor sample unavailable), GCRC2076T (WGS/RNA-seq; tumor sample unavailable), GCRC2089T (RNA-seq; low-quality RNA).
Whole-genome and RNA-sequencing. WGS libraries were prepared using PCRfree construction and sequenced at 1 lane per sample on HiSeq X with v3 chemistry according to Illumina protocols, generating 150 bp paired-end reads. RNA-seq libraries were prepared using ssRNA-seq construction and sequenced on HiSeq2000 according to Illumina protocols, generating 75 bp paired-end reads.
WGS analysis. Short paired-end reads were trimmed using Trimmomatic (version 0.35) 45 and the resulting reads were aligned to the GRCh38/hg38 human reference genome using BWA-MEM (version 0.7.15) 46 . Alignments were recalibrated by using GATK (version 3.8) 47 and duplicates were marked with Picard (version 2.9; http://broadinstitute.github.io/picard/). To remove possible contaminated reads originating from mouse in xenograft samples, reads were also aligned to the GRCm38/mm10 mouse and the Disambiguate algorithm (version 1.0) 48 was used to assign the reads to individual species based on the highest quality alignment of the read pair. Somatic mutations were identified by GATK's Mutect2 algorithm 49 at default parameters in each tumor and xenograft sample using its matched normal tissue as reference. Structural variants were called using Manta at default parameters (version 1.5) 50 . Copy number variation analysis was performed using SCoNEs (version 2.1; https://bitbucket.org/mugqic/scones/) with a bin size parameter of 10 kb. Variants were annotated with hg38 database using SnpEff (version 4.3) 51 . CRAVAT 5.2.4 was used to identify potential pathogenic and cancer driver variants 52 . Genes previously associated in breast cancer were retrieved from Nik-Zainal et al. 19 . Oncoprints were generated from cBioPortal (https://www. cbioportal.org/).
RNA-seq analysis. Adaptor sequences and low-quality score bases (Phred score < 30) were first trimmed using Trimmomatic 45 . The resulting reads were aligned to the human genome reference sequence (GRCh38/hg38), using STAR 53 . Reads originating from mouse in xenograft samples were removed using the Disambiguate algorithm 48 . Read counts for each sample are obtained using HTSeq 54 . For downstream analyses, lowly-expressed genes with an average read count lower than 10 across all of the samples were excluded. Raw counts were normalized using the TMM algorithm (i.e., weighted trimmed mean of M-values), implemented in edgeR R package (version 3.22.5) 55 . Using the voom function in the limma R package (version 3.36.5) 56 , the data was converted to log-counts per million with associated precision weights. The lmfit function from limma was used to identify differences in expression levels between primary tumor and xenograft models with the following paired design: Expression~Individual + Model. Nominal p-values were corrected for multiple testing using the Benjamini-Hochberg method. Pathway analysis was performed with single-sample gene set enrichment analysis (v4) using GenePattern3.9.10 57,58 . Heatmaps were constructed using Morpheus (https://software.broadinstitute.org/morpheus).
Reverse-phase protein array. Snap frozen tissue was analyzed by RPPA with 244 antibodies as previously described 33 . Briefly, lysates were serially diluted, arrayed onto nitrocellulose-coated slides, and probed with antibodies using a tyramidebased signal amplification approach followed by a DAB colorimetric reaction. Slides were scanned, spots were quantified by Array-Pro Analyzer and relative protein levels were determined by interpolation of each dilution curve from the standard curve (SuperCurve Rx64 3.1.1) of the slide. Data was normalized for loading and transformed to linear values or median-centered log2 values. Signature scores were generated by summing positive regulatory components minus negative regulatory components on median-centered log2 values from previously published RPPA signatures 33 . GCRC1784Xc (derived from a metaplastic tumor with chondroid differentiation) was not included in the analysis because it was an outlier (Correction Factor 1: 0.33) sample for total protein content. The TCGA Breast Invasive Carcinoma, Firehose Legacy RPPA and RNA-seq datasets were downloaded from cBioPortal (https://www.cbioportal.org/) on January 27, 2020.
Statistics and reproducibility. Measurements were taken from distinct samples when applicable. No sample size calculations were performed. Prism 8 (GraphPad) was used for basic statistical analysis. Significance was determined using Student's t-test (two-tailed), Fisher's exact test, Pearson correlation and survival analysis using the Log-rank (Mantel-Cox) test. Benjamini-Hochberg correction was applied where noted in the text. p-value < 0.05 was considered significant.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Whole-genome sequencing data has been deposited in the Sequence Read Archive under the accession PRJNA594000. RNA-sequencing data has been deposited in the Gene Expression Omnibus under the accession GSE142767. Any remaining data pertaining to this manuscript is available from the corresponding author upon reasonable request.