Abstract
Available genetically defined cancer models are limited in genotypic and phenotypic complexity and underrepresent the heterogeneity of human cancer. Here, we describe a combinatorial genetic strategy applied to an organoid transformation assay to rapidly generate diverse, clinically relevant bladder and prostate cancer models. Importantly, the clonal architecture of the resultant tumors can be resolved using single-cell or spatially resolved next-generation sequencing to uncover polygenic drivers of cancer phenotypes.
Main
Most cancers are not driven by a single oncogenic event but are instead the sum of multiple genetic perturbations that occur during tumor evolution1. However, the functional impact of most genomic abnormalities found in cancers remains largely unknown. Wrangling the catalog of recurrent genetic alterations in cancer and deriving meaningful insights into the functional and contextual contributions of these events is a major challenge in the field of cancer genomics. In vitro assays recapitulate only specific aspects of cancer behaviors such as cell proliferation, anchorage-independent colony formation or invasive migration. In vivo strategies such as the transplantation of cancer cell lines or chemical carcinogenesis are not genetically defined. Genetically engineered mouse models are a gold standard to define genetic drivers in cancer, but they are costly and slow and do not allow the facile manipulation of more than a few genes. Dissociated-cell tissue recombination and transplantation assays have also been applied to study the malignant transformation of primary epithelial cells but have been reliant on the introduction of discrete sets of candidate genes and limited by inefficient transgenesis. Collectively, existing cancer models generated through these methods grossly underrepresent the diversity of human cancer. Furthermore, the use of these technologies to systematically investigate the permutations of genetic events associated with a single cancer would be incredibly challenging, if not impossible. To address these limitations of scale, throughput and economy, we developed a methodology in which barcoded lentiviral libraries encoding cancer-associated genetic events are introduced efficiently and at random into primary epithelial cells at a high multiplicity of infection (MOI), and the transduced cells are then engrafted in mice for tumorigenic selection.
This system enables the generation of genotypically and phenotypically diverse tumors and the massively parallel single-cell lentiviral barcode sequencing of tumors to identify cooperative oncogenic drivers of malignant transformation and specific cancer phenotypes.
Organotypic or organoid cultures permit the expansion of primary epithelial cells while maintaining their complex organization and tissue function2. A major barrier to higher-order genetic studies in this context has been inefficient transgenesis using available lentiviral transduction protocols. We proposed that enforced cell–virus contact in a constrained volume of gel matrix could increase lentiviral transduction efficiency. Primary mouse bladder urothelial (mBU) and prostate epithelial (mPE) cells were isolated by fluorescence-activated cell sorting (FACS) on the basis of a lineage-negative (Lin−) (CD45−CD31−Ter119−), EpCAM+CD49fhigh immunophenotype (Extended Data Fig. 1a), as these populations self-renew at high frequencies3 and readily establish organoids in culture (Extended Data Fig. 1b). Cells were mixed into cold Matrigel containing concentrated lentivirus expressing GFP before the seeding and polymerization of organoid droplets4. Near complete transduction of mBU and mPE cells was achieved, delivering up to 10–20 copies per cell (Fig. 1a,b). We next developed a barcoding system to characterize the distribution of unique proviral copies per cell. Lentiviral constructs were barcoded with matching ten-nucleotide sequences distal to the 5′ long terminal repeat (LTR) and proximal to the 3′ LTR and produced as a pool. A custom single-cell amplicon panel was designed on the Mission Bio Tapestri platform to enable the sensitive enumeration of multiple uniquely barcoded lentiviruses per cell (Extended Data Fig. 1c). This approach was validated using a defined population of 3T3 cells engineered with lentiviruses to harbor up to four unique lentiviral barcodes per cell (Extended Data Fig. 1d). mPE were transduced with a diverse barcoded lentiviral pool at varying MOIs, and single-cell amplicon sequencing showed relatively normal distributions of proviral copies per cell (Fig. 1c and Extended Data Fig. 1e).
To determine the utility of this approach in understanding the initiation and progression of bladder and prostate cancer, we selected commonly mutated genes from cancer genome sequencing studies5,6,7 (Extended Data Fig. 2a) and cloned these as open reading frames (ORFs) or short hairpin RNAs (shRNAs) into barcoded lentiviral constructs to mimic gain-of-function or loss-of-function events (Extended Data Fig. 2b and Supplementary Table 1). At least three shRNAs from The RNAi Consortium (TRC) targeting each gene were tested for knockdown in 3T3 cells by quantitative PCR. The shRNA demonstrating the most potent knockdown of target gene expression was incorporated into the lentiviral libraries (Extended Data Fig. 2c). A bladder urothelial lentiviral pool (BU-LVP) of 33 genes and a prostate epithelial lentiviral pool (PE-LVP) of 24 genes were produced in arrayed format to avoid lentiviral barcode recombination and concentrated by ultracentrifugation (Extended Data Fig. 3a). Infectivity (representation) of each lentivirus was evaluated by transducing either mBU or mPE cells with the respective lentiviral pool and performing bulk amplicon sequencing of lentiviral barcodes (Extended Data Fig. 3b). Initial lentivirus pools demonstrated over tenfold overrepresentation of shRNA vectors relative to ORF vectors (Extended Data Fig. 3c), presumably owing to more efficient viral packaging because of the reduced length between LTRs of the transfer plasmid8. These data were applied to adjust the cell surface area of producer cells for subsequent arrayed lentiviral library production, leading to near normalization of the representation of shRNA and ORF vectors (Extended Data Fig. 3d).
We adopted an approach in which primary mBU and mPE cells infected with BU-LVP or PE-LVP at high MOI in organoids were recombined with inductive mouse embryonic day 16 (E16) bladder mesenchyme (EBLM)9 or urogenital sinus mesenchyme (UGSM)10 and subsequently grafted subcutaneously in NOD scid gamma (NSG) mice to enable biological selection for tumorigenic clones (Fig. 1d). No tumors were appreciable from control grafts of untransduced mBU or mPE cells recombined with EBLM or UGSM. The efficiency of tumor formation (tumors formed per graft inoculated) was 80% (16 of 20) for mBU cells infected with BU-LVP and 38% (18 of 47) for mPE cells infected with PE-LVP (Supplementary Table 2). Tumor latency was measured as time from inoculation to achieving a maximal tumor diameter of 1 cm and ranged from 2.3 to 7.4 months (mean 4.2 months) for bladder tumors and 3.2 to 16 months (mean 8.9 months) for prostate tumors (Supplementary Table 2).
A representative tumor derived from primary mBU cells transduced with BU-LVP exhibited three morphologically distinct regions consistent with papillary urothelial carcinoma with an inverted growth pattern, urothelial carcinoma with squamous differentiation and sarcomatoid urothelial carcinoma, all three of which were also supported by GATA3, TP63 and pan-cytokeratin (panCK) immunostaining (Fig. 1e). Single-cell DNA amplicon sequencing was performed to enumerate the lentiviral barcodes for the determination of clonal architecture and deconvolution of lentivirus-delivered genetic events putatively involved in tumorigenesis. Three major clones harboring distinguishable sets of lentiviral barcodes were identified (Fig. 1f), but spatial resolution was lost owing to single-cell dissociation. To associate histology with clonality, we performed laser capture microdissection (LCM) of the three regions on stained tissue sections and performed bulk DNA amplicon sequencing (Fig. 1g). The papillary urothelial carcinoma was uniquely associated with Fgfr3 S243C, shAtm and Zfp703 mutations, in addition to the common Ywhaz, Pik3ca E545K, Pparg and Pvrl4 mutations observed in all three dominant clones. Cancer genomics studies have shown that activating mutations in FGFR3 are highly enriched in papillary urothelial carcinomas5,11. We further validated these findings in the mouse urothelial transformation assay in independent experiments using a defined lentiviral pool of Fgfr3 S243C, Ywhaz, Pik3ca E545K, Pparg and Pvrl4 (Extended Data Fig. 4a), which produced tumors with papillary urothelial carcinoma with an inverted growth pattern by histopathology and based on the endophytic proliferative pattern (Extended Data Fig. 4b). The co-occurrence of these genetic alterations was also evident in the human muscle-invasive bladder cancers from The Cancer Genome Atlas bladder cancer (TCGA-BLCA) cohort5 (Extended Data Fig. 5).
Several tumors called the Fred Hutch Bladder Tumor (FHBT) series have been generated using this methodology, including those with pure urothelial carcinoma and others with mixtures of histologic subtypes (Fig. 2a and Extended Data Fig. 6). The urothelial origin of these tumors was supported by GFP staining (Fig. 2b–d), which was positive even in regions of sarcomatoid carcinoma with low or absent panCK staining (Fig. 2b). We conducted molecular profiling of these tumors and their regional tumor histologies by LCM and RNA-seq analysis. Principal component analysis (PCA) of the gene expression data showed that squamous and sarcomatoid subtypes clustered together and were separate from urothelial and papillary urothelial carcinomas (Fig. 2e). The BASE47 subtype predictor12, a gene expression classifier used to distinguish luminal and basal subtypes of urothelial carcinoma, generally classified the tumors with papillary and papillary squamous subtypes as luminal and the squamous and sarcomatoid histologies as basal, consistent with an established relationship between sarcomatoid differentiation and the basal subtype13 (Fig. 2f). The Consensus Molecular Classifier14 revealed that the non-papillary urothelial histologies showed neuroendocrine-like gene expression with low or absent luminal and basal gene signatures (Fig. 2g and Extended Data Fig. 7a). Gene set enrichment analysis (GSEA) was used to compare these tumor histologies in a pairwise manner and revealed the enrichment of genes associated with epithelial-to-mesenchymal transition in sarcomatoid carcinoma, as expected from prior molecular analyses of human tumors13 (Fig. 2h). We further confirmed the relevance of our FHBT models by projecting their RNA expression patterns onto PCA plots of tumors from the TCGA-BLCA cohort (Fig. 2i) and established N-butyl-N-(4-hydroxybutyl)-nitrosamine (BBN)-induced mouse bladder cancer models15 (Extended Data Fig. 7b) to show that they occupy overlapping space on the basis of histologic classification, indicating that the transcriptional features with the greatest variance between tumor subtypes are also conserved with FHBT models.
mPE cells transduced with PE-LVP and engrafted in mice (Extended Data Fig. 8a) also gave rise to mixed cancer morphologies. One tumor showed high-grade prostate adenocarcinoma with focal pleomorphic giant cells (Extended Data Fig. 8b), a rare histologic subtype associated with poor prognosis16 that may contribute to therapeutic resistance and lethality17. Immunostaining revealed HOXB13 and AR expression in both histologies with pronounced nuclear TP53 expression in the pleomorphic giant cells (Extended Data Fig. 8b). We isolated large (pleomorphic giant cell carcinoma) and small (adenocarcinoma) cells from dissociated tumors using a flow cytometry-based strategy, propagated these cells briefly (one passage) in organoid cultures, then dissociated the cells and stained with the nuclear dye Hoechst 33342 to further isolate cells on the basis of DNA content for downstream single-cell lentiviral barcode enumeration (Extended Data Fig. 8c). This single-cell clonality analysis revealed striking enrichment of shKmt2c in the putative pleomorphic giant cell clones (Extended Data Fig. 8d). Recent studies have established that KMT2C mediates the DNA damage response in cancer18,19, and DNA damage repair alterations are common in human prostate adenocarcinoma with pleomorphic giant cell features20.
In summary, we describe a set of technologies that form a functional in vivo cancer genomics assay with efficient delivery of compound genetic perturbations from barcoded lentiviral libraries and single-cell sequencing to rapidly investigate genotype–phenotype relationships in cancer initiation and progression using primary epithelial cells. We leveraged this strategy to develop a series of mouse bladder cancers that recapitulate the phenotypic diversity of human bladder cancer and a mouse prostate cancer with pleomorphic giant cell carcinoma, representing cancer subtypes that have not previously been modeled in a genetically defined fashion. Importantly, single-cell lentiviral barcode deconvolution associated mutant active Fgfr3 with the luminal papillary differentiation of urothelial carcinoma and the loss of Kmt2c with pleomorphic giant cell carcinoma in prostate cancer. These initial studies provide proof of principle that this approach can be deployed to investigate higher-order genetic interactions to explore complex genotype-to-phenotype relationships in cancer.
Methods
Lentiviral constructs and lentiviral library production
Double-barcoded lentiviral vectors were generated from the parental vector FU-CGW21 by sequentially inserting matched ten-nucleotide barcodes into the PacI site distal to the HIV FLAP using the Quick Ligation Kit (New England Biolabs) and PCR amplification of the WPRE sequence and barcode with insertion into the ClaI sites proximal to the 3′ LTR by HiFi DNA Assembly (New England Biolabs). ORFs were cloned into the EcoRI site of the double-barcoded lentiviral vectors by HiFi DNA Assembly. To generate shRNA lentiviral vectors, the ubiquitin C promoter sequence was excised from the double-barcoded plasmid by digesting with PspXI and EcoRI. U6 promoter and shRNA cassettes were isolated by digesting pLKO.1 TRC shRNA clones with PspXI and EcoRI and were inserted into the digested double-barcoded plasmid using the Quick Ligation Kit. Individual lentiviruses were generated in arrayed format in 293T cells (CRL-11268, ATCC) by co-transfection of each double-barcoded lentiviral ORF or shRNA plasmid with the helper plasmids pVSV-G, pMDL and pRev using FuGENE HD Transfection Reagent (Promega). Lentiviral supernatants were collected 36 h after transfection, pooled and concentrated by ultracentrifugation in V-bottom polypropylene centrifuge tubes on a SW 32 Ti in an Optima XE 90 (Beckman Coulter) at 82,520g at 4 °C for 2 h. Supernatants were aspirated, and lentiviral pellets were resuspended in residual media and cryopreserved.
shRNA screening
The top three to five shRNA sequences from The RNAi Consortium for each target gene were identified from the Genetic Perturbation Platform Web Portal at the Broad Institute. shRNA sequences were cloned into pLKO.1. pLKO.1-TRC control and pLKO.1-shRNA lentiviruses were generated and used to transduce 3T3 cells (gift from V. Vasioukhin, Fred Hutchinson Cancer Center). Seventy-two hours after lentiviral transduction, 3T3 cells were collected, and RNA was extracted using an RNeasy Mini Kit (Qiagen). Reverse transcription of RNA was performed using SuperScript IV Reverse Transcriptase (Invitrogen) as per the manufacturer’s instructions. qPCR was performed on a QuantStudio 6 using SYBR Green qPCR Master Mix (ThermoFisher Scientific) with primers specific to each target gene and to Ubc as a control. All primers used for these studies are listed in Supplementary Table 3. Relative expression was calculated using ddCT analysis.
Embryonic bladder mesenchyme and urogenital sinus mesenchyme preparation
All animal care and studies were performed in accordance with an approved Fred Hutchinson Cancer Center Institutional Animal Care and Use Committee (IACUC) protocol (PROTO000051048) and Comparative Medicine regulations. All animals were housed in an Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC)-accredited facility and subjected to a 12-h light/dark cycle with the temperature maintained between 18 °C and 24 °C and 40–60% humidity. UGSM was isolated and propagated as previously described21. E16 fetal bladders were collected at the same time as the urogenital sinus and subjected to similar steps for preparation of EBLM. UGSM and EBLM were passaged fewer than five times before use in engraftment studies.
Mouse bladder and prostate dissociation and organoid culture
Bladders and prostates from 8- to 12-week-old male C57BL/6 mice (The Jackson Laboratory) were dissected and mechanically and enzymatically dissociated as previously described21. Cells were stained with antibodies for FACS on a Sony SH800 Cell Sorter with collection of Lin−CD49fhighEpCAMhigh cells. Between 1 × 10⁴ and 2 × 10⁴ bladder urothelial and prostate epithelial cells were resuspended in a total of 15 µl of growth factor-reduced Matrigel (Corning) with or without concentrated lentivirus and seeded as droplets in each 48-well tissue culture plate well. Cells were cultured as previously described22. Mouse bladder organoid culture media consisted of Advanced DMEM-F12, 10 mM HEPES, 2 mM GlutaMAX, B27 supplement, 1.25 mM N-acetylcysteine, 50 ng ml⁻¹ hEGF, 100 ng ml⁻¹ hNoggin and 500 ng ml⁻¹ hR-spondin, 200 nM A83-01 and 10 µM Y-27632. Mouse prostate organoid culture media consisted of mouse bladder organoid culture media with the addition of 1 nM dihydrotestosterone.
Organoid transformation assay
After 5–7 days of culture, transduced mouse bladder urothelial or prostate epithelial organoids were liberated by dissociating the Matrigel matrix with 5 U ml⁻¹ dispase (STEMCELL Technologies). Organoids were washed with PBS and resuspended in ice-cold Matrigel with either 10⁵ EBLM or UGSM cells and subcutaneously injected into the flanks of 6- to 8-week-old male NSG (NOD-SCID-IL2Ry-null) mice (The Jackson Laboratory). For prostate epithelial transformation studies, mice were supplemented with testosterone through the subcutaneous implantation of 90-day release testosterone pellets (Innovative Research of America). Tumors were collected when they reached 1 cm in maximal diameter. The maximum tumor size permitted by the Fred Hutchinson Cancer Center IACUC is 2 cm in diameter, which was not exceeded during these studies.
Copy number assay
DNA was extracted from organoids using a GeneJET Genomic DNA Purification Kit (ThermoFisher Scientific). Copy number analysis was performed by TaqMan Real-Time PCR Assay (ThermoFisher Scientific) using the TaqMan Copy Number Reference Assay, mouse, Tfrc (4458366) and EGFP TaqMan Copy Number Assay (Mr00660654_cn) on a QuantStudio 6. Genomic DNA extracted from the tails of transgenic C57BL/6 mice with one or two copies of GFP was used as a calibrator sample. GFP copy number was determined using ddCT analysis, where sample copy number = calibrator copy number × 2^(−ddCT).
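The ddCT copy number calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the function name, argument names and the example Ct values are hypothetical, and the reference assay (Tfrc) is assumed to be present at the same copy number in the sample and the calibrator.

```python
def copy_number_ddct(ct_gfp_sample, ct_ref_sample,
                     ct_gfp_calibrator, ct_ref_calibrator,
                     calibrator_copies):
    """Estimate GFP copies per genome as calibrator copies x 2^(-ddCT).

    All argument names are hypothetical. Ct values come from the GFP assay
    and the Tfrc reference assay run on the same DNA.
    """
    # dCT: GFP Ct relative to the reference assay, per DNA sample
    d_ct_sample = ct_gfp_sample - ct_ref_sample
    d_ct_calibrator = ct_gfp_calibrator - ct_ref_calibrator
    # ddCT: sample dCT relative to the calibrator dCT
    dd_ct = d_ct_sample - d_ct_calibrator
    return calibrator_copies * 2 ** (-dd_ct)


# Made-up Ct values: GFP amplifies 3 cycles earlier (relative to Tfrc)
# in the sample than in a one-copy calibrator.
estimate = copy_number_ddct(22.0, 25.0, 25.0, 25.0, calibrator_copies=1)
print(estimate)
```

A sample whose GFP signal crosses threshold three cycles earlier than a one-copy calibrator, after reference normalization, is estimated to carry 2³ = 8 copies, assuming near-100% amplification efficiency for both assays.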
Single-cell DNA amplicon sequencing library preparation and sequencing
A custom panel was designed for the Mission Bio Tapestri to amplify segments of ten mouse genes at two exons each, the 5′ and 3′ lentiviral barcodes and lentiviral GFP. Libraries were generated either from cryopreserved or freshly dissociated tumor cells using the Mission Bio Tapestri Single-cell DNA Custom Kit according to the manufacturer’s recommendations. Single cells (3,000 to 4,000 cells per μl) were resuspended in Tapestri cell buffer and encapsulated using a Tapestri microfluidics cartridge, lysed and barcoded. Barcoded samples were subjected to targeted PCR amplification, and PCR products were removed from individual droplets, purified with KAPA Pure Beads (Roche Molecular Systems) and used as a template for PCR to incorporate Illumina P7 indices. PCR products were purified by KAPA Pure Beads and quantified by Qubit dsDNA High Sensitivity Assay (ThermoFisher Scientific). Sample quality was assessed by Agilent TapeStation analysis. Libraries were pooled and sequenced on an Illumina MiSeq or HiSeq 2500 with 150 bp paired-end reads in the Fred Hutchinson Cancer Center Genomics Shared Resource.
Laser capture microdissection and DNA and RNA isolation for high-throughput sequencing
Sections 10 µm thick were cut from formalin-fixed paraffin-embedded (FFPE) tumor tissue blocks and mounted onto PEN Membrane Frame Slides (ThermoFisher Scientific). Sections were fixed with 95% ethanol for 1 min, stained with 3% cresyl violet and dehydrated through graded alcohols and xylene. Histology review and annotation were performed by a pathologist. Laser capture microdissection was performed on an Arcturus XT Laser Capture Microdissection System (ThermoFisher Scientific). Microdissected specimens were collected for DNA and RNA extraction. DNA was extracted using a GeneRead DNA FFPE Kit (Qiagen), and RNA was extracted using an RNeasy FFPE Kit (Qiagen) according to the manufacturer’s protocols. Two-step PCR for lentiviral barcode amplification and sequencing library adaptor ligation was performed. The first PCR reaction consisted of 2× KAPA HiFi HotStart ReadyMix, 100 nM of 1° FWD primer (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCAAAATTTTCGGGTTTATTACAGG-3′), 100 nM of 1° REV primer (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCCGCTCGAGGACTATTAAG-3′) and 80 ng of genomic DNA. Thermal cycling conditions were 95 °C for 3 min; (95 °C for 30 s, 64 °C for 30 s, 72 °C for 30 s) × 25 cycles; 72 °C for 5 min; and hold at 4 °C. PCR cleanup was conducted using the Wizard SV Gel and PCR Clean-Up System (Promega), with elution in 30 µl of double distilled water. The second PCR reaction consisted of 2× KAPA HiFi HotStart ReadyMix, 140 nM of 2° i7 primer, 140 nM of 2° i5 primer and 5 µl of elution from the PCR cleanup of the 1° PCR. Thermal cycling conditions were 95 °C for 3 min; (95 °C for 30 s, 61 °C for 30 s, 72 °C for 30 s) × 8 cycles; 72 °C for 5 min; and hold at 4 °C. The sequences of 2° primers used to incorporate dual-indexed Illumina sequencing adaptors are displayed in Supplementary Table 4. PCR cleanup was conducted using the Wizard SV Gel and PCR Clean-Up System, with elution in 30 µl of double distilled water.
Sample quality was assessed by Agilent TapeStation analysis. Sequencing was performed on an Illumina MiSeq or HiSeq 2500 instrument using 150 bp single-end reads. PhiX sequences were excluded from the sequencing reads by Bowtie 2 v2.4.4 (ref. 23). Cutadapt v4.1 (ref. 24) was used to trim the reads to the barcode region. Then the trimmed reads were aligned to custom DNA references containing all barcodes using Bowtie 2. Samtools v1.11 (ref. 25) was used to extract read counts for each barcode. The RNA-seq libraries were prepared using a SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian (Takara Bio) and sequenced on an Illumina NovaSeq 6000 using a NovaSeq S4 flow cell with 100 bp paired-end reads by MedGenome. Sequencing reads were mapped to mouse genome reference GRCm39, and gene expression was quantified and normalized using the UC Santa Cruz Computational Genomics Lab Toil RNA-seq pipeline v4.1.2 (ref. 26).
Transcriptional subtype analysis and PCA projections
All computational analyses were carried out in RStudio v4.1.0. Mouse Ensembl genes were converted to Mouse Genome Informatics (MGI) gene symbols using the biomaRt package v2.24.1 (https://bioconductor.org/packages/release/bioc/html/biomaRt.html). MGI gene symbols were then converted to their human orthologs by referencing the mouse–human ortholog database available from The Jackson Laboratory (http://www.informatics.jax.org/downloads/reports/HOM_MouseHumanSequence.rpt). The human ortholog matrix was used for downstream transcriptional subtype analysis. FHBT samples were classified using the BASE47 subtype predictor gene list and the ConsensusMIBC package v1.1 (https://github.com/cit-bioinfo/consensusMIBC). Z-score means of genes and signature scores were calculated for each sample. Heatmaps of both the BASE47 and ConsensusMIBC results were generated using the pheatmap package v1.0.12 (https://www.rdocumentation.org/packages/pheatmap/versions/1.0.12/topics/pheatmap). For PCA, the FPKM human ortholog matrix was normalized by log2(FPKM + 1) transformation before performing mean-centered PCA using the prcomp function from the stats package v3.6.2 (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/prcomp). Visualization of the PCA plot was performed using the factoextra package v1.0.7 (https://cran.r-project.org/web/packages/factoextra/index.html) and ggpubr package v0.6.0 (https://www.rdocumentation.org/packages/ggpubr/versions/0.6.0).
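To make the signature scoring step concrete, the following Python sketch shows one plausible reading of "Z-score means": expression values are log2(x + 1) transformed, each gene is z-scored across samples, and a signature score is the mean z-score of the signature genes in each sample. The analyses in this study were run in R; the function names, the use of the sample standard deviation and the toy gene names below are illustrative assumptions only.

```python
import math
from statistics import mean, stdev


def log2_plus_one(values):
    """log2(x + 1) transform of a list of expression values (e.g. FPKM)."""
    return [math.log2(v + 1.0) for v in values]


def zscore(values):
    """Z-score one gene across samples; a constant gene maps to all zeros."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (an assumption)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def signature_scores(expression, signature_genes):
    """Mean z-score of the signature genes in each sample.

    expression: dict mapping gene -> list of values, one per sample,
    with every list in the same sample order. Genes missing from the
    matrix are skipped.
    """
    z_rows = [zscore(log2_plus_one(expression[g]))
              for g in signature_genes if g in expression]
    if not z_rows:
        raise ValueError("no signature genes found in expression matrix")
    n_samples = len(z_rows[0])
    return [mean(row[i] for row in z_rows) for i in range(n_samples)]


# Toy matrix with hypothetical gene names and three samples
toy = {"GENE_A": [0.0, 1.0, 3.0], "GENE_B": [3.0, 1.0, 0.0]}
print(signature_scores(toy, ["GENE_A"]))
```

Averaging z-scores rather than raw expression keeps highly expressed genes from dominating a signature, which is one reason this form of scoring is commonly paired with heatmap display.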
For PCA projections, RNA-seq count data from the FHBT, GSE220999 and TCGA-BLCA datasets were transformed to counts per million, normalized and batch corrected using ComBat-seq27 to compare across each dataset using the DGEobj.utils package v1.0.6 (https://rdrr.io/cran/DGEobj.utils). PCA projection of the FHBT data onto the TCGA-BLCA space was done by first generating a PCA of the TCGA-BLCA samples from the common genes between the FHBT and GSE220999 data. A PCA for both the FHBT and GSE220999 samples was then scaled by the eigenvalues of the TCGA-BLCA using the scale function from the base package v3.6.2 (https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/scale). A plot was constructed overlaying the reference TCGA-BLCA samples with either FHBT or GSE220999 tumor projections using ggplot2 v3.4.1 (https://cran.r-project.org/web/packages/ggplot2/index.html). TCGA-BLCA samples were colored by their Consensus Molecular Classifier subtype. Differential gene expression analysis was performed pairwise between FHBT histologies using the DESeq2 package v1.38.3 (ref. 28). P values were generated via the Wald test and adjusted using the Benjamini–Hochberg correction. Pre-ranked GSEA (Broad Institute) was conducted by inputting a ranked list of differentially expressed genes based on log10-transformed P values from the DESeq2 analysis for each pairwise comparison. Dot plots were generated by plotting the normalized enrichment score and log-transformed false discovery rate for each pre-ranked GSEA output using ggplot2.
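As an illustration of the multiple-testing and ranking steps, the sketch below implements the Benjamini–Hochberg adjustment in plain Python and builds a ranking metric from log10-transformed P values. DESeq2 performs the adjustment internally, so this is only a didactic re-implementation. Signing the metric by the direction of the fold change is an assumption made here for illustration; the text states only that the ranking was based on log10-transformed P values. All function names are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_metric(pvalue, log2_fold_change):
    """-log10(P) carrying the sign of the fold change (assumed convention)."""
    return -math.log10(pvalue) * math.copysign(1.0, log2_fold_change)


def preranked_list(results):
    """Sort (gene, pvalue, log2fc) tuples by descending rank metric."""
    scored = [(gene, rank_metric(p, lfc)) for gene, p, lfc in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical genes and values
example = [("GENE_A", 0.001, 2.5), ("GENE_B", 0.01, -1.2), ("GENE_C", 0.5, 0.3)]
print(preranked_list(example))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With a signed metric, strongly upregulated genes sit at the top of the list and strongly downregulated genes at the bottom, which is the input shape that pre-ranked GSEA expects.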
Single-cell lentiviral barcode enumeration and clonality analysis
Raw sequencing reads were trimmed to the amplicon regions using the awk command. Barcode sequences in the reads were filtered and extracted using UMI-tools v1.0.0 (ref. 29). Processed reads were aligned to custom references containing all amplicon sequences using bwa-mem v0.7.17-r1188 (ref. 30). Samtools was used to extract amplicon counts for each barcode. Mouse cells with no GFP amplicon counts were removed. Counts per cell were normalized to total counts for each barcode. A minimum threshold normalized count of 1% of total counts was used to define the presence of a barcode in a cell. The clonal architecture of cells was determined by enumerating all cells containing each distinct combination of barcodes.
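The filtering and enumeration steps above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it assumes that per-cell amplicon counts have already been extracted into dictionaries, and it interprets the normalization as each barcode's share of the cell's total barcode counts, which is one possible reading of the description. The function names, the `GFP` key and the toy counts are hypothetical.

```python
from collections import Counter


def call_barcodes(barcode_counts, min_fraction=0.01):
    """Barcodes whose share of the cell's barcode reads is >= min_fraction."""
    total = sum(barcode_counts.values())
    if total == 0:
        return frozenset()
    return frozenset(bc for bc, n in barcode_counts.items()
                     if n / total >= min_fraction)


def clonal_architecture(cells, gfp_key="GFP", min_fraction=0.01):
    """Count cells carrying each distinct combination of barcodes.

    cells: dict mapping cell id -> dict of amplicon name -> read count.
    Cells with no GFP amplicon counts are dropped, mirroring the removal
    of untransduced mouse cells.
    """
    clones = Counter()
    for counts in cells.values():
        if counts.get(gfp_key, 0) == 0:
            continue
        barcode_counts = {k: v for k, v in counts.items() if k != gfp_key}
        combination = call_barcodes(barcode_counts, min_fraction)
        if combination:
            clones[combination] += 1
    return clones


# Toy data with hypothetical barcode names
toy_cells = {
    "cell1": {"GFP": 10, "bc1": 500, "bc2": 495, "bc3": 5},  # bc3 below 1%
    "cell2": {"GFP": 5, "bc1": 60, "bc2": 40},
    "cell3": {"GFP": 0, "bc1": 100},                         # no GFP, removed
    "cell4": {"GFP": 3, "bc3": 50},
}
for combination, n_cells in clonal_architecture(toy_cells).most_common():
    print(sorted(combination), n_cells)
```

Using a `frozenset` as the clone key makes the barcode combination order-independent, so cells carrying the same set of barcodes are always counted as the same clone.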
Immunohistochemistry
Tumor samples were formalin-fixed and paraffin-embedded, sectioned to a 5-µm thickness and placed on positively charged glass slides. For each tumor, slides were stained with a standard hematoxylin and eosin protocol. Immunohistochemical staining was performed according to an established protocol31. Stained slides were digitally scanned on a VENTANA DP 200 (Roche) and analyzed using QuPath 0.2.3 (ref. 32).
Antibodies
Antibodies used for FACS: Human/mouse/bovine integrin alpha 6/CD49f PE-conjugated antibody (FAB13501P, R&D Systems, 1:40); PE/Cyanine7 anti-mouse CD326 (Ep-CAM) antibody (118216, BioLegend, 1:40); CD31 (PECAM-1) monoclonal antibody (390), FITC (11-0311-82, eBioscience, 1:100); CD45 monoclonal antibody (30-F11), FITC (11-0451-85, eBioscience, 1:100); TER-119 monoclonal antibody (TER-119), FITC (11-5921-82, eBioscience, 1:100). Antibodies used for immunohistochemistry: Anti-wide spectrum Cytokeratin antibody (ab9377, Abcam, 1:100); rabbit monoclonal GFP antibody (clone D5.1, Cell Signaling, 1:100); rabbit polyclonal p63 antibody (12143-1-AP, Proteintech, 1:200); mouse monoclonal p53 antibody (clone 1C12, Cell Signaling, 1:500); rabbit monoclonal HOXB13 antibody (clone D7N8O, Cell Signaling, 1:50); rabbit polyclonal AR antibody (06-680, Millipore, 1:2,000); rabbit monoclonal GATA3 antibody (clone D13C9, Cell Signaling, 1:200); rabbit monoclonal CD44 antibody (clone E7K2Y, Cell Signaling, 1:100).
Statistical analyses
Data analysis was performed on GraphPad Prism 9 (GraphPad Software). qPCR results were analyzed in Excel. Statistical significance was determined using the unpaired two-tailed Student’s t-test. Results are depicted as mean + s.d. unless stated otherwise. For all statistical tests, P values of <0.05 were considered significant.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Sequencing data pertaining to this study are available from the Gene Expression Omnibus (GEO) as SuperSeries GSE229783. RNA-seq data from FHBT models are available from accession number GSE229780. Bulk DNA amplicon sequencing data from lentiviral library representation studies and from FHBT models are available from accession numbers GSE231542 and GSE229781, respectively. Single-cell DNA amplicon sequencing data related to determining the unique proviral copies per cell after lentiviral transduction across a range of MOIs are available from accession number GSE231543. Single-cell DNA amplicon sequencing data from FHBT models and enriched cells from the tumor model with prostate adenocarcinoma and focal pleomorphic giant cell carcinoma are available from accession number GSE229782.
Code availability
The study did not use any custom code or software. All code or software used for all data processing and analysis has been described in the Methods section. No custom code was used to generate figures.
References
1. Loeb, L. A., Loeb, K. R. & Anderson, J. P. Multiple mutations and cancer. Proc. Natl Acad. Sci. USA 100, 776–781 (2003).
2. Kretzschmar, K. & Clevers, H. Organoids: modeling development and the stem cell niche in a dish. Dev. Cell 38, 590–600 (2016).
3. Santos, C. P. et al. Urothelial organoids originating from Cd49fhigh mouse stem cells display Notch-dependent differentiation capacity. Nat. Commun. 10, 4407 (2019).
4. Drost, J. et al. Organoid culture systems for prostate epithelial and cancer tissue. Nat. Protoc. 11, 347–358 (2016).
5. Robertson, A. G. et al. Comprehensive molecular characterization of muscle-invasive bladder cancer. Cell 171, 540–556.e525 (2017).
6. Winters, B. R. et al. Genomic distinctions between metastatic lower and upper tract urothelial carcinoma revealed through rapid autopsy. JCI Insight 5, e128728 (2019).
7. Robinson, D. et al. Integrative clinical genomics of advanced prostate cancer. Cell 161, 1215–1228 (2015).
8. Kumar, M., Keller, B., Makalou, N. & Sutton, R. E. Systematic determination of the packaging limit of lentiviral vectors. Hum. Gene Ther. 12, 1893–1905 (2001).
9. Oottamasathien, S. et al. Bladder tissue formation from cultured bladder urothelium. Dev. Dyn. 235, 2795–2801 (2006).
10. Zong, Y., Goldstein, A. S. & Witte, O. N. Preparation of urogenital sinus mesenchymal cells for prostate tissue recombination models. Cold Spring Harb. Protoc. 2015, 988–990 (2015).
11. Marzouka, N. A. et al. The Lund molecular taxonomy applied to non-muscle-invasive urothelial carcinoma. J. Mol. Diagn. 24, 992–1008 (2022).
12. Damrauer, J. S. et al. Intrinsic subtypes of high-grade bladder cancer reflect the hallmarks of breast cancer biology. Proc. Natl Acad. Sci. USA 111, 3110–3115 (2014).
13. Guo, C. C. et al. Dysregulation of EMT drives the progression to clinically aggressive sarcomatoid bladder cancer. Cell Rep. 27, 1781–1793.e1784 (2019).
14. Kamoun, A. et al. A consensus molecular classification of muscle-invasive bladder cancer. Eur. Urol. 77, 420–433 (2020).
15. Shah, S. D. et al. Syngeneic model of carcinogen-induced tumor mimics basal/squamous, stromal-rich, and neuroendocrine molecular and immunological features of muscle-invasive bladder cancer. Front Oncol. 13, 1120329 (2023).
16. Alharbi, A. M., De Marzo, A. M., Hicks, J. L., Lotan, T. L. & Epstein, J. I. Prostatic adenocarcinoma with focal pleomorphic giant cell features: a series of 30 cases. Am. J. Surg. Pathol. 42, 1286–1296 (2018).
17. Pienta, K. J., Hammarlund, E. U., Brown, J. S., Amend, S. R. & Axelrod, R. M. Cancer recurrence and lethality are enabled by enhanced survival and reversible cell cycle arrest of polyaneuploid cells. Proc. Natl Acad. Sci. USA 118, e2020838118 (2021).
18. Rampias, T. et al. The lysine-specific methyltransferase KMT2C/MLL3 regulates DNA repair components in cancer. EMBO Rep. 20, e46821 (2019).
19. Chang, A. et al. Recruitment of KMT2C/MLL3 to DNA damage sites mediates DNA damage responses and regulates PARP inhibitor sensitivity in cancer. Cancer Res. 81, 3358–3373 (2021).
20. Lotan, T. L., Kaur, H. B., Alharbi, A. M., Pritchard, C. C. & Epstein, J. I. DNA damage repair alterations are frequent in prostatic adenocarcinomas with focal pleomorphic giant-cell features. Histopathology 74, 836–843 (2019).
21. Xin, L., Ide, H., Kim, Y., Dubey, P. & Witte, O. N. In vivo regeneration of murine prostate from dissociated cell populations of postnatal epithelia and urogenital sinus mesenchyme. Proc. Natl Acad. Sci. USA 100, 11896–11903 (2003).
22. Karthaus, W. R. et al. Identification of multipotent luminal progenitor cells in human prostate organoid cultures. Cell 159, 163–175 (2014).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Vivian, J. et al. Toil enables reproducible, open source, big biomedical data analyses. Nat. Biotechnol. 35, 314–316 (2017).
Zhang, Y., Parmigiani, G. & Johnson, W. E. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2, lqaa078 (2020).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
DeLucia, D. C. et al. Regulation of CEACAM5 and therapeutic efficacy of an anti-CEACAM5-SN38 antibody-drug conjugate in neuroendocrine prostate cancer. Clin. Cancer Res. 27, 759–774 (2021).
Bankhead, P. et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
Acknowledgements
We thank C. Morrissey, L. Xin and P. Nelson for critical discussion and review of this work. This work was supported by National Institutes of Health (NIH) grants DP2 CA271301 (J.K.L.), R01 CA276308 (A.C.H.), R37 CA230617 (A.C.H.), a Bladder Cancer Advocacy Network Research Innovator Award (J.K.L.), a Department of Defense Peer Reviewed Cancer Research Program Career Development Award W81XWH-19-1-0569 (J.K.L.), a Department of Defense Prostate Cancer Research Program Early Investigator Research Award W81XWH-20-1-0083 (S.L.) and a Department of Defense Peer Reviewed Cancer Research Program Horizon Award W81XWH-19-1-0658 (S.J.). We acknowledge support from the Seattle Translational Tumor Research Program in Bladder Cancer and the University of Washington Medicine Urethral Cancer Research Fund provided by donor D.L. Rich. This research was also supported by the Flow Cytometry, Experimental Histopathology and Genomics Shared Resources of the Fred Hutch/University of Washington Cancer Consortium funded by NIH grant P30 CA015704.
Author information
Contributions
S.L., A.W., H.S. and J.K.L. designed experiments. S.L., A.W., H.S., V.B., G.J. and S.J. performed the experiments. H.S. and G.J. analyzed the RNA-seq and bulk and single-cell DNA amplicon sequencing data. Q.W. provided statistical oversight. R.B.M., J.L.W., H.-M.L., A.C.H., B.M.F., M.C.H. and J.K.L. supervised the study. S.L. and J.K.L. wrote the manuscript with contributions from all authors.
Ethics declarations
Competing interests
J.K.L. served on the Speaker’s Bureau for Mission Bio. B.M.F. has consulting or advisory roles with QED Therapeutics, Boston Gene, Astrin Biosciences, Merck, Immunomedics/Gilead, Guardant and Janssen and receives patent royalties from Immunomedics/Gilead and research support from Eli Lilly. All other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks David Fernandez-Antoran and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Isolation of mouse bladder urothelial and prostate epithelial cells for organoid culture and design/validation of a custom Mission Bio Tapestri single-cell DNA amplicon sequencing panel.
(a) Representative flow cytometry plot for the isolation of mouse bladder urothelial (top) and prostate epithelial (bottom) cells from dissociated tissues based on a Lin-(CD45-CD31-Ter119-) EpCAM+CD49fhigh immunophenotype. (b) Images of organoid cultures of mouse bladder urothelial and prostate epithelial cells on day 1 and day 5 after seeding. (c) Table showing the amplicons represented in a custom Mission Bio Tapestri single-cell DNA amplicon sequencing panel. (d) Table showing results of a validation study in which a defined mixture of 3T3 cells, comprising an unlabeled population and populations labeled with combinations of lentiviruses encoding distinct barcodes, was analyzed using the Mission Bio Tapestri single-cell DNA amplicon sequencing panel to determine clonality. Approximately 2,000 cells were analyzed. (e) Overview of experiments with infection of mouse prostate epithelial (mPE) cells with a diverse barcoded lentiviral library in organoid culture across a range of multiplicities of infection (MOIs) and quantification of viral copy number per cell across the population by single-cell amplicon sequencing. Created with BioRender.com.
Extended Data Fig. 2 Recurrent genetic alterations associated with bladder and prostate cancer encoded in barcoded lentiviral libraries.
(a) Tables showing gain-of-function and loss-of-function genetic alterations associated with bladder and prostate cancer selected for representation in cancer-specific barcoded lentiviral libraries. (b) Schematics of barcoded lentiviral vectors expressing open reading frames (ORF) or short-hairpin RNA (shRNA). LTR = long terminal repeat; BC = barcode; UBC = Ubiquitin C; CMV = cytomegalovirus; GFP = green fluorescent protein; WHV8 = Woodchuck hepatitis virus 8 post-transcriptional regulatory element. (c) Plot showing relative expression of target genes as determined by quantitative polymerase chain reaction (qPCR) in 3T3 cells 72 hours after lentiviral transduction with pLKO.1-TRC control or pLKO.1 expressing select shRNA previously screened and selected for inclusion in the barcoded lentiviral libraries based on the extent of gene knockdown. qPCR reactions were performed on four biologically independent replicates. Statistical analysis was performed by two-tailed, unpaired t-test with p-values shown. (d) Plot showing relative overexpression of gain-of-function genes as determined by qPCR in 3T3 cells 72 hours after lentiviral transduction with barcoded ORF vectors. qPCR reactions were performed on three biologically independent replicates. Statistical analysis was performed by two-tailed, unpaired t-test with p-values shown.
Extended Data Fig. 3 Generation of barcoded lentiviral libraries and normalization of library representation.
Schema showing the (a) generation of individual lentiviruses from the library in arrayed format with subsequent pooling and concentration by ultracentrifugation and (b) transduction of respective mouse bladder urothelial (mBU) or prostate epithelial (mPE) cells in organoid culture with concentrated lentiviral libraries to determine lentiviral barcode representation by bulk amplicon sequencing of genomic DNA. Created with BioRender.com. (c) Representative distribution of barcoded lentiviruses within a library with skewed enrichment of shRNA relative to ORF lentiviruses. (d) Representative distribution of barcoded lentiviruses within a library after applying information from panel c to adjust the producer cell surface area used in panel a for the generation of the lentiviral library.
Extended Data Fig. 4 Active mutant Fgfr3 S243C cooperates with other oncogenic factors in mouse bladder urothelial cells to drive papillary urothelial carcinoma with inverted growth pattern.
(a) Scheme of the mBU organoid transformation assay using a defined lentiviral library to confirm functional genotype-phenotype associations. Created with BioRender.com. (b) High-magnification images of H&E- and IHC-stained sections of a tumor resulting from the experiment in panel a, with histologic features consistent with papillary urothelial carcinoma with inverted growth pattern.
Extended Data Fig. 5 Frequencies of gene alterations represented in the bladder urothelial LV pool in human muscle-invasive bladder cancer.
Oncoprint from cBioPortal analysis of human muscle-invasive bladder cancers from The Cancer Genome Atlas bladder cancer (TCGA-BLCA) cohort showing select genes for which gain- or loss-of-function events were incorporated into the bladder urothelial LV pool.
Extended Data Fig. 6 FHBT models demonstrate diverse cancer histologies.
(a–e) Low- and high-magnification images of H&E-stained sections and high-magnification images of IHC-stained sections for GFP and pan-cytokeratin (panCK) expression depicting characteristic histologies. Scale bars = 50 µm.
Extended Data Fig. 7 Phenotypic diversity and relevance of FHBT models.
(a) Heatmap showing the histologies of the Fred Hutch Bladder Tumor (FHBT) series relative to the expression of genes that constitute basal and luminal signatures for the BASE47 subtype predictor. (b) Principal component analysis (PCA) projection plot of FHBT samples over BBN carcinogen-induced mouse bladder tumors color-coded based on histology or histology and Consensus Molecular Classification (UC = urothelial carcinoma, Sq = squamous, Src = sarcomatoid, NE = neuroendocrine) with 90% confidence ellipses shown.
Extended Data Fig. 8 Association of adenocarcinoma with pleomorphic giant cell carcinoma of the prostate with perturbation of Kmt2c.
(a) Scheme of the mouse prostate epithelial (mPE) organoid transformation assay to uncover functional genotype-phenotype associations in prostate cancer. Created with BioRender.com. (b) Left, gross image of a tumor arising from mPE cells transformed with a prostate epithelial lentiviral pool (PE-LVp). Right, high-magnification images of H&E- and IHC-stained sections of regions with high-grade adenocarcinoma and pleomorphic giant cell carcinoma. Scale bar = 50 µm. (c) Overview of the experimental approach to enrich for prostate adenocarcinoma and pleomorphic giant cell carcinoma based on cell size and nuclear DNA content followed by single-cell lentiviral barcode enumeration. (d) Tables showing single-cell clonality analysis of: Top, tumor cells enriched for ‘small cells/nuclei.’ Bottom, tumor cells enriched for ‘large cells/nuclei.’ Highlighted in red is shKmt2c based on the enumeration of the associated lentiviral barcode. Created with BioRender.com.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, S., Wong, A., Sun, H. et al. A combinatorial genetic strategy for exploring complex genotype–phenotype associations in cancer. Nat Genet 56, 371–376 (2024). https://doi.org/10.1038/s41588-024-01674-1
DOI: https://doi.org/10.1038/s41588-024-01674-1