Genetic modification of primary human B cells to model high-grade lymphoma

Sequencing studies of diffuse large B cell lymphoma (DLBCL) have identified hundreds of recurrently altered genes. However, it remains largely unknown whether and how these mutations may contribute to lymphomagenesis, either individually or in combination. Existing strategies to address this problem predominantly utilize cell lines, which are limited by their initial characteristics and subsequent adaptations to prolonged in vitro culture. Here, we describe a co-culture system that enables the ex vivo expansion and viral transduction of primary human germinal center B cells. Incorporation of CRISPR/Cas9 technology enables high-throughput functional interrogation of genes recurrently mutated in DLBCL. Using a backbone of BCL2 with either BCL6 or MYC, we identify co-operating genetic alterations that promote growth or even full transformation into synthetically engineered DLBCL models. The resulting tumors can be expanded and sequentially transplanted in vivo, providing a scalable platform to test putative cancer genes and to create mutation-directed, bespoke lymphoma models.

Diffuse large B cell lymphoma (DLBCL) is the most common form of non-Hodgkin lymphoma. Although potentially curable with immunochemotherapy, up to 40% of patients succumb to their disease 1. In an attempt to unravel the biological basis of DLBCL and to identify new therapeutic opportunities, several groups have recently reported large genomic studies [2][3][4]. These highlight the considerable genetic heterogeneity of DLBCL and identify hundreds of recurrently mutated genes, copy number alterations, and structural variants. Clusters of co-mutated genes suggest the existence of genetic subtypes of DLBCL that may behave differently when exposed to therapeutic agents. While the functional and mechanistic consequences of some of these genetic alterations have been established, for the majority we have little to no understanding of their contribution to lymphomagenesis. To translate these genomic findings into therapeutic progress, it is critical to understand the functional importance and therapeutic relevance of these genetic alterations, both individually and in combination.
Existing model systems used for the functional interrogation of lymphoma genetics consist predominantly of lymphoma cell lines and genetically modified mice. However, both have limitations; cell lines were often established from patients with end-stage, non-nodal or even leukemic phase lymphoma and carry an extensive and biased mutational repertoire, further selected over years or even decades of in vitro growth. Genetically engineered mice, on the other hand, are costly, time-consuming to generate, and therefore unsuitable for high-throughput or combinatorial experiments. Furthermore, the genetic requirements for tumorigenesis in mice do not always accurately reflect those in humans 5 . As such, the development of new, preclinical models of lymphoma that can capture its considerable genetic diversity has been identified as a priority area for lymphoma research 6 .
In common with many of the mature B cell malignancies, DLBCL is thought to arise from the germinal center (GC) stage of B cell differentiation 7,8 . An attractive solution would therefore be to use primary human GC B cells as a platform for ex vivo genetic manipulation. Equivalent approaches have proved fruitful for epithelial malignancies. However, technical difficulties associated with the ex vivo culture and genetic manipulation of human GC B cells, including high manipulation-associated cell toxicity and low transduction efficiency, have obstructed the exploitation of such models to study lymphoma.
Here, we describe an optimized strategy that facilitates proliferation and highly efficient transduction of non-malignant, primary, human GC B cells ex vivo. We show that combinations of oncogenes permit long-term culture in vitro, allowing the system to be used for high-throughput screening of oncogenes and tumor suppressors, and for the creation of genetically customized human lymphoma models that can be studied in immunodeficient mice.

Results
Ex vivo growth and transduction of primary human GC B cells. GC B cells are programmed to undergo apoptosis in the absence of survival signals from T follicular helper cells and follicular dendritic cells (FDC). Consistent with this, it is well-established that GC B cells perish rapidly if cultured unsupported ex vivo 9 . Previous attempts to support ex vivo growth of human GC B cells employed CD40 ligand (CD40lg)-transfected fibroblasts in combination with soluble cytokines including interleukin2 (IL2), IL4, and IL10 9,10 . Related strategies have used an FDC-like feeder cell, termed HK, that supported GC survival and allowed short-term proliferation when combined with CD40lg 11 . With the increasing appreciation of the importance of IL21 to GC B cell biology 12,13 , later systems have used HK feeder cells combined with CD40lg and IL21 (ref. 14 ). However, proliferation of GC B cells in all these systems was typically limited to a period of up to 10 days [9][10][11]14 .
We employed a similar system based upon a freshly established culture of modified HK cells, termed YK6, that were immortalized with TERT, P53dd, and CDK4 (Supplementary Fig. 1a). Initial experiments suggested that membrane-expressed CD40lg in combination with IL21 facilitated robust stimulation of GC B cells (Supplementary Fig. 1b). We therefore engineered our immortalized YK6 cells to express membrane human CD40lg and to secrete soluble IL21, termed YK6-CD40lg-IL21 (Supplementary Fig. 1c). We isolated primary GC B cells (CD38 + CD20 + CD19 + CD10 + ) from pediatric tonsil tissue (Fig. 1a), which when grown in co-culture with YK6-CD40lg-IL21 survived and proliferated vigorously for up to 10 days without a requirement for any additional cytokines (Fig. 1b, c, Supplementary . In line with previous observations in human B cells 15,16, we were unable to transduce human GC B cells with amphotropic or VSV-G pseudotyped virus. Peripheral blood B cells have previously been transduced using virus pseudotyped with a Gibbon Ape Leukemia Virus (GaLV) envelope 17, the receptor for which is SLC20A1 (ref. 18). RNA-Seq showed that human GC B cells express high levels of SLC20A1, but very low levels of the VSV-G receptor LDLR (Fig. 1d). Thus, we proceeded to test the GaLV viral envelope to transduce primary GC B cells. To permit lentiviral transduction, we generated a series of GaLV-MuLV fusion constructs based on previous reports 17,19 (Fig. 1e) and identified a fusion construct that permitted high efficiency transduction with both retroviral (Fig. 1f) and lentiviral (Fig. 1g) constructs of human primary GC B cells cultured on YK6-CD40lg-IL21 feeders. Interestingly, the GaLV envelopes also enabled the transduction of primary human DLBCL cells supported on YK6-CD40lg-IL21 cells (Supplementary Fig. 1d).
Long-term expansion of human GC B cells ex vivo. We proceeded to use this culture-transduction system to introduce into human GC B cells oncogenes that are commonly deregulated in human lymphoma. Out of five genes tested, no single gene was able to prolong the survival of primary GC B cells cultured in our system (Fig. 2a, b). However, co-expression of BCL2 with either MYC or BCL6 did lead to long-term expansion and survival of transduced GC B cells in culture. These cells continued to expand and proliferate vigorously in culture beyond 100 days. We also tested other transcription factors associated with the GC reaction, and their lymphoma-associated mutants, in combination with BCL2 in a pooled, competitive culture. This showed initial expansion of cells transduced with MEF2B Y69H, a mutation commonly found in DLBCL and follicular lymphoma 20. However, by day 59, cultures were dominated by BCL6-transduced cells, suggesting this as the transcription factor best able to promote long-term growth of GC B cells ex vivo (Fig. 2c, Supplementary Data 1). Flow cytometry after 10 weeks of culture showed that cells transduced with BCL2 and BCL6 maintained expression of surface markers reminiscent of GC B cells including CD19, CD20, CD22, CD38, CD80, and CD95 (Fig. 2d). Cells expressed both CD86 and CXCR4 markers, an immunophenotype intermediate between light and dark zone GC B cells (Fig. 2d). Cells transduced with BCL2 and MYC remained viable and proliferated but downregulated CD20 and CD19, consistent with differentiation towards plasmablasts (Supplementary Fig. 1e). The plasma cell marker CD138 was not expressed by either BCL2/MYC or BCL2/BCL6 transduced cells (Supplementary Fig. 1f). We compared gene expression profiles of freshly isolated and transduced GC B cells cultured ex vivo at early (5 days) and late (10 weeks) time points (Fig. 2e, Supplementary Table 1). 
As anticipated, this showed enrichment of a STAT3 signature in cultured cells consistent with ongoing IL21 stimulation. While freshly isolated GC B cells were enriched for expression of centroblast genes, the cultured and transduced cells adopted a gene expression profile more similar to that of centrocytes, consistent with ongoing CD40 stimulation. Importantly, the centrocyte is the stage of GC differentiation most similar to DLBCL 21. Transcriptome analysis was also compared with that of six cell lines commonly used as models of GC-derived lymphomas, including the main subtypes of DLBCL and Burkitt lymphoma. When compared to a signature of GC-expressed genes (GCB-1) 22, long-term BCL6-transduced cells clustered more closely with GC B cells than did the cell lines (Fig. 2f, Supplementary Fig. 1g).
Overall, these results suggest that transduced primary human GC B cells can be cultured long-term ex vivo, retaining characteristics of the initial GC B cell that are shared with DLBCL cells. This represents a valuable model system for the functional interrogation of genes involved in GC lymphomagenesis.
Screening putative tumor suppressor genes in GC B cells. We wished to use the system for the high-throughput study of putative tumor suppressor genes (TSGs) in lymphoma. We hypothesized that many tumor suppressor pathways are already inactivated in lymphoma cell lines, and as such, primary GC B cells should be a more sensitive platform to identify a competitive growth or survival advantage following TSG inactivation. Robust expression of Cas9 was achieved using a stable Cas9 retroviral packaging line (Supplementary Fig. 2a) and initial experiments confirmed efficient gRNA-directed targeting in primary, human, GC B cells ex vivo (Supplementary Fig. 2b, c). We therefore created a lymphoma-focused CRISPR gRNA library composed of 6000 gRNAs targeting a total of 692 genes reported to be mutated or deleted in human lymphoma, along with 250 non-targeting control guides. Each gene was targeted by up to nine gRNAs (Supplementary Fig. 2d) and deep sequencing revealed that 99% of gRNAs were within four times of the mean frequency (Fig. 3a). The library was transduced into primary GC cells shortly after their transduction with BCL2, BCL6, and Cas9 cDNAs (experimental scheme of the CRISPR screening shown in Fig. 3b). Cas9 and gRNA constructs were marked with fluorescent proteins to allow selection to be visualized by FACS. While Cas9 and gRNA dual infected cells comprised only 10% of all cells at day 4, this population expanded to 90% by day 88 of culture (Supplementary Fig. 2e), suggesting strong selection for one or more of the library gRNAs. Genomic DNA was sequenced at intervals and a CRISPR gene score was generated for each gene (Fig. 3b). 
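The library-uniformity statement above (99% of gRNAs within four times of the mean frequency) can be illustrated with a minimal sketch. It assumes that "within four times" means a read count between one quarter of and four times the mean count; the function name and the toy counts are hypothetical and are not taken from this study.

```python
from statistics import mean


def fraction_within_fold(counts, fold=4.0):
    """Return the fraction of gRNAs whose read count lies within
    `fold`-fold (above or below) of the mean read count.

    Assumption: 'within four times of the mean' is read as
    mean / 4 <= count <= mean * 4.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    mu = mean(counts)
    lower, upper = mu / fold, mu * fold
    within = sum(1 for c in counts if lower <= c <= upper)
    return within / len(counts)


# Hypothetical toy read counts for ten gRNAs (not data from the paper).
toy_counts = [100, 120, 80, 95, 110, 105, 90, 1000, 5, 100]
uniformity = fraction_within_fold(toy_counts)
print(f"{uniformity:.0%} of gRNAs are within 4-fold of the mean")
```

In this toy example the over-represented guide (1000 reads) and the under-represented guide (5 reads) fall outside the window, so a real library reporting 99% would be considerably more even.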
Genes that showed the greatest enrichment during culture over 10 weeks included well-established tumor suppressors such as TP53, CDKN2A, and PTEN (Fig. 3c), thus validating the ability of our system to detect bona fide TSGs. However, the greatest enrichment was seen for GNA13 (Fig. 3c), which encodes the G    Fig. 3a). Enrichment for GNA13 was also seen in two further replicate screens performed in separate tonsil donors (Supplementary Data 2). We saw remarkable consistency of enriched genes across the three replicate screens, across which the twelve most enriched genes were GNA13, TP53, CDKN2A, ATRX, NFKBIA, ZFP36L1, ZNF281, PTEN, FBXO11, FUBP1, S1PR2, and NFKBIE (Fig. 3d).
To determine whether the oncogenic backbone used would influence the co-operating TSGs enriched, we repeated the CRISPR screen using primary GC B cells transduced with BCL2 and MYC. Interestingly, a different profile of TSGs was enriched on this genetic background with much weaker enrichment of GNA13 ( Supplementary Fig. 3b, Supplementary Data 2). Instead, some of the most enriched gRNAs in the context of MYC overexpression targeted members of the ZFP36 family of RNAbinding proteins, previously demonstrated to oppose cellular transformation in a mouse model of MYC-induced lymphoma 23 .
To compare the ability of our culture system to identify TSGs to that of established cell lines, we performed a parallel screen using the lymphoma cell line HBL1 ( Supplementary Fig. 3c, Supplementary Data 2) and compared data from recent published CRISPR screens 24 (Supplementary Fig. 3d). In these cell line experiments, enrichment of gRNAs targeting TSGs was much more modest. This highlights the unique potential of our primary GC culture system to identify genetic changes associated with enhanced growth and survival; a phenotype that is difficult to identify using heavily mutated cell lines, already optimized for in vitro growth.
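The text does not specify how the CRISPR gene score was computed. Purely as an illustration of how gRNA enrichment between two time points might be summarized per gene, the sketch below uses the median log2 fold change of depth-normalized gRNA frequencies with a pseudocount. This scoring scheme, the function names, and the toy guide and gene labels are all assumptions for illustration and are not the authors' method.

```python
import math
from statistics import median


def log2_fold_changes(early, late, pseudocount=1.0):
    """Per-gRNA log2 fold change of normalized frequency (late vs early).

    `early` and `late` map gRNA id -> raw read count. Counts are
    normalized by total reads per sample; a pseudocount avoids log(0).
    """
    total_early = sum(early.values())
    total_late = sum(late.values())
    lfc = {}
    for guide, early_count in early.items():
        late_count = late.get(guide, 0)
        freq_early = (early_count + pseudocount) / total_early
        freq_late = (late_count + pseudocount) / total_late
        lfc[guide] = math.log2(freq_late / freq_early)
    return lfc


def gene_scores(lfc, guide_to_gene):
    """Summarize each gene as the median log2 fold change of its gRNAs."""
    per_gene = {}
    for guide, value in lfc.items():
        per_gene.setdefault(guide_to_gene[guide], []).append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical toy data: two guides for an enriched gene, two controls.
early = {"gA_1": 100, "gA_2": 100, "ctrl_1": 100, "ctrl_2": 100}
late = {"gA_1": 300, "gA_2": 300, "ctrl_1": 50, "ctrl_2": 50}
guide_to_gene = {"gA_1": "GENE_A", "gA_2": "GENE_A",
                 "ctrl_1": "NON_TARGETING", "ctrl_2": "NON_TARGETING"}

scores = gene_scores(log2_fold_changes(early, late), guide_to_gene)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 2))
```

Using the median rather than the mean makes the per-gene summary less sensitive to a single outlier guide, such as the one GNA13 gRNA with presumed off-target toxicity described in the following section.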
GNA13 depletion enhances survival of GC B cells. The striking enrichment of GNA13 in all three BCL6-based screens prompted us to examine this pathway further. Inactivating mutations of GNA13 are common in DLBCL and BL 2,25 but rare in other forms of cancer, where, in contrast, amplification may be more common (Supplementary Fig. 3e) 26,27. Progressive enrichment was seen for eight out of nine gRNAs targeting GNA13 over different timepoints, with equivalent or greater enrichment to that seen for TP53 and CDKN2A (Fig. 4a). All GNA13 gRNAs led to effective depletion of GNA13 (Supplementary Fig. 3f), apart from one which was associated with presumed off-target toxicity and further confirmed in a cell line (Supplementary Fig. 3g). GNA13 acts downstream of the G-protein coupled receptors S1PR2 and P2RY8 and enrichment for both genes was observed in our screens (Fig. 3c). Mouse knockout studies have suggested that suppressed activity of this pathway in lymphoma may allow egress from the GC and increase cell survival secondary to enhanced AKT activity 25,28. In contrast, other studies suggest a pro-survival effect in DLBCL that is independent of AKT activity 29. We therefore quantified pAKT levels in ex vivo GC B cells transduced with gRNAs targeting GNA13, PTEN, or non-targeting controls, and co-cultured on YK6-CD40lg-IL21 feeder cells (Fig. 4b). Although pAKT was increased in PTEN-depleted cells, no increase was seen in GNA13-depleted cells. However, GNA13 depletion did lead to a marked reduction in apoptosis in cultured primary GC cells (Fig. 4c), but no change in cell proliferation (Supplementary Fig. 3h). This confirms AKT-independent, enhanced cell survival as the likely explanation for the competitive advantage seen following GNA13 depletion in this culture system.
Mutation-directed, in vivo models of human lymphoma. To examine the ability of the culture-transduction system to recapitulate lymphomagenesis in vivo, we transduced primary, human GC B cells with combinations of oncogenic alterations commonly found in DLBCL ( Supplementary Fig. 4a) and injected them in Matrigel into immunodeficient mice (Fig. 5a). Although sufficient for long-term, feeder-dependent growth in vitro, transduction with BCL2 and BCL6, with or without the addition of a dominant negative TP53 (P53dd) 30 , was insufficient for tumor formation in vivo. However, the addition of a fourth oncogene (BCL6, BCL2, P53dd, and CCND3) led to tumor formation with a median of 112 days (Fig. 5a). The combination of MYC, BCL2, and P53dd led to tumor formation with a median of 111 days and the combination of MYC, BCL2, and BCL6 resulted in tumor formation with a median of 108 days. The most potent combination tested: MYC, BCL2, P53dd, and CCND3 resulted in tumor formation in all mice within 38 days. Notably, tumors engrafted with a 100% penetrance and could be derived from multiple donors, excluding the possibility of donor-derived occult mutations contributing to transformation. Flow cytometry showed cells to be strongly positive for markers of all transduced oncogenes, suggesting potent selection during tumorigenesis (Fig. 5b).
Histological examination revealed diffuse sheets of medium to large, atypical lymphoid cells with frequent mitoses, closely mimicking the appearances of human high-grade B cell lymphoma (Fig. 5c, Supplementary Figs. 4b and 5). Immunoblastic and Burkitt-like appearances were seen in some tumors. Immunohistochemistry showed expression of the B cell markers CD19, CD20, CD79A, and PAX5 in the majority of tumors (Fig. 5c, Supplementary Figs. 4b and 5). In contrast to our in vitro observations where cells transduced with MYC but not BCL6 downregulated expression of CD20, most MYC-driven tumors expressed strong surface CD20. The GC marker CD10 was expressed in approximately half of tumors (Fig. 5c, Supplementary Fig. 4b). Importantly, all tumors were negative for EBER (ISH), confirming that latent EBV genes did not contribute to lymphomagenesis in these tumors. Western blot and RNA-Sequencing confirmed continued expression of the oncogenic backbone (Supplementary Fig. 6a, b). Harvested tumor cells could be expanded in vitro and serially retransplanted back into immunodeficient mice (Supplementary Fig. 6c), thus functioning as a robust, scalable model system.
To establish the similarity of our synthetic tumors to subsets of bona fide human lymphomas, we sequenced the transcriptome of   Fig. 7a).
To establish the clonality of tumors, we performed deep sequencing of PCR-amplified immunoglobulin heavy chain variable gene regions to assess the percentage of unique BCR sequences in each sample. This revealed that clonality was increased in primary tumor samples compared to the original donor cells and was also increased in retransplants compared to primary tumors (Fig. 7a). BCR network plots showed that tumors with four oncogenic hits were polyclonal (Fig. 7b). In contrast, cells transduced with just three oncogenic hits, which formed tumors with a longer latency, were clonal (Fig. 7b). This suggests that the combination of four oncogenic events (MYC, BCL2, CCND3, and P53dd) is by itself sufficient for transformation of human GC B cell. In contrast, further oncogenic events are required for lymphomagenesis in cells transduced with just three of the above constructs. To identify these co-operating oncogenic events, we performed targeted sequencing using a hematological malignancy panel of 292 genes. A subclonal NRAS G13A mutation (VAF 0.03) was detected in the oligoclonal tumor arising from MYC, BCL2, P53dd transduced cells (Fig. 7c). This mutation became clonal when retransplanted into secondary recipients confirming its role in the pathogenesis of those tumors (Fig. 7c). Mutation at this codon has been reported previously in DLBCL 35 as have other activating mutations of NRAS 2 . We observed copy number increase for the experimentally transduced gene BCL6 (Supplementary Fig. 7b) but saw no evidence of any significant aneuploidy in any tumor (Supplementary Fig. 7c). In the polyclonal tumors, subclonal mutations with VAF < 0.05 were detected in several genes commonly mutated in DLBCL including a frameshift variant in S1PR2 and missense mutations in GNA13, NOTCH2, CREBBP, EP300, SOCS1, and BCL6 (Fig. 7c) (Supplementary Data 3). 
The significance of these mutations to tumor formation is uncertain; however, some of these genes are typical targets of aberrant somatic hypermutation, suggesting the possibility of ongoing somatic hypermutation in these lymphomas. To investigate this possibility, we analyzed the variable region sequence of dominant clones detected in the IgH clonality assay. As expected, given their GC origin, almost all clones showed evidence of diversification from the germline V gene sequence (Fig. 7d, e). Importantly, however, clones also showed evidence of ongoing diversification of the hypervariable regions (Supplementary Table 2). This suggests that AID-mediated somatic hypermutation remained active during the process of tumor formation. In addition, analysis of synthetic tumors showed varied expression of the IgH constant region genes across different tumors, with strong expression in many tumors of IgG and IgA transcripts, suggesting that in these tumors, class-switching had occurred before or during tumor development (Supplementary Fig. 7d, e).
Overall, the ability of these tumors to closely recapitulate the appearances of high-grade B cell lymphoma further validates the biological relevance of this system to the study of human lymphoma and provides the opportunity to generate mutation-directed, bespoke in vivo lymphoma models.
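The clonality measure described above, the percentage of unique BCR sequences among sequenced IgH reads, can be sketched as follows. This is a minimal illustration that assumes each read has already been reduced to a single clonotype string; the function name, the added dominant-clone fraction, and the toy sequences are hypothetical and do not represent the analysis pipeline used in the study.

```python
from collections import Counter


def bcr_clonality_summary(clonotypes):
    """Summarize clonality from a list of per-read clonotype strings.

    Returns the percentage of unique sequences (lower values indicate a
    more clonal sample) and the fraction of reads in the largest clone.
    """
    if not clonotypes:
        raise ValueError("clonotypes must not be empty")
    counts = Counter(clonotypes)
    total = len(clonotypes)
    _, top_count = counts.most_common(1)[0]
    return {
        "percent_unique": 100.0 * len(counts) / total,
        "dominant_fraction": top_count / total,
    }


# Hypothetical toy samples (not data from the paper).
donor_like = ["CARDYW", "CARGGW", "CASSLW", "CTRDFW"]   # every read distinct
tumor_like = ["CARDYW"] * 9 + ["CARGGW"]                # one dominant clone

print(bcr_clonality_summary(donor_like))
print(bcr_clonality_summary(tumor_like))
```

Under this definition a fall in the percentage of unique sequences from donor cells to primary tumors to retransplants corresponds to the increasing clonality reported in Fig. 7a.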

Discussion
The plethora of genomic information generated from next-generation sequencing studies has left us with a need for new experimental systems in which to study the genetics of human lymphoma and to decipher these rich data resources. The availability and suitability of current preclinical models is recognized as a rate-limiting step in translating genomic knowledge into patient benefit 6. The cell of origin of most aggressive B cell lymphomas, including DLBCL and BL, is the GC B cell 7,8. We therefore reasoned that non-malignant, human GC B cells should be the input for a system to create genetically defined models of human lymphoma. We describe an optimized system for the culture and transduction of primary, human GC B cells ex vivo. This relies on the provision of microenvironmental survival signals common to that of the GC, as well as the overexpression of combinations of oncogenes common to the pathogenesis of human lymphoma. In particular, this includes BCL6, a transcription factor central to the GC reaction as well as an established oncogene in GC-derived lymphoma. A related strategy has been employed previously to expand peripheral blood memory B cells for the purposes of monoclonal antibody engineering 36. Here, we use genetically altered human, primary, GC B cells for the functional investigation of lymphoma genetics to generate synthetic, in vivo, human models of lymphoma.
A major advantage of using primary GC B cells over established lymphoma cell lines is the ability to investigate defined genetic alterations on a genetically normal background. In particular, this provides a sensitive platform for investigating the ability of specific genetic alterations to increase survival and proliferation. An enhanced oncogenic phenotype is much harder to discern in cell lines where the mutational repertoire is likely to have evolved extensively for optimal in vitro growth. The superior sensitivity of this system, compared to cell lines, to detect alterations associated with increased growth or survival is evidenced by the strong enrichment for TSGs in our CRISPR screen when compared to conventional cell lines. Of the 12 most enriched genes (TP53, GNA13, CDKN2A, ATRX, NFKBIA, ZFP36L1, ZNF281, PTEN, FBXO11, FUBP1, S1PR2, and NFKBIE), the majority are associated with a tumor suppressor function in lymphoma, established in the literature either from evidence of recurrent genetic inactivation or from their ability to inhibit cancer-promoting pathways 2,7,25,37 . The next 24 most enriched genes included TET2, TSC1, GSK3B, RB1, CDKN2B, P2RY8, and SOCS1, also implicated as TSGs. Thus, the most enriched genes contained a predominance of recognized TSGs. Although our experiment was not designed for detection of drop-outs, it is notable that the two most depleted genes were those targeting POU2AF1 and MYC, both well-established oncogenes in GC lymphomas 38,39 .
Notable absentees from the genes enriching in our CRISPR screens were the histone modifiers CREBBP, EP300, and KMT2D. These genes show very frequent inactivating mutations in DLBCL and follicular lymphoma [40][41][42][43]. These mutations are almost always clonal, suggesting that they arise at an early stage of lymphomagenesis, potentially before the GC stage. Interestingly, mouse [...] [44][45][46][47][48]. Therefore, one potential explanation for why these genes do not enrich in our screens is that during lymphomagenesis, the predominant biological effect of these mutations is exerted prior to the GC stage. In contrast, the mutant genes enriched in our screens may reflect those that have the greatest effect in a GC B cell. Interestingly, some of the top hits from our screen show similarity to those associated with transformation of follicular lymphoma into high-grade lymphoma (GNA13, CDKN2A, TP53, P2RY8, S1PR2) 42. We speculate that this might be consistent with our screen detecting those mutations that provide proliferation or survival advantage to an already "corrupted" GC B cell. In developing lymphoma cells, this corruption might reflect pre-GC mutations of CREBBP, while in our screen this oncogenic corruption could be provided by the forced expression of BCL2, BCL6, or MYC. The most striking finding from our CRISPR screens was the potent enrichment of gRNAs targeting GNA13, as well as its upstream receptors S1PR2 and P2RY8. Inactivating mutations of GNA13 are common in lymphoma, but rarely seen in other forms of malignancy. 
Indeed, amplification is more common in solid organ cancers, where GNA13 is generally considered to act as an oncogene 49. Thus, its enrichment in our screens reinforces the specificity of this system to the pathogenesis of GC lymphomas. Previous mouse studies have proposed roles for GNA13 in the migration of GC B cells but reached differing conclusions in its ability to regulate AKT 25,28,29. Our data reveal an AKT-independent effect in the regulation of survival of human GC B cells, a finding consistent with the greater enrichment of gRNAs targeting GNA13 over those targeting PTEN in our screens.
The enrichment of GNA13 across experiments performed in BCL2/BCL6-transduced GC B cells from three separate human tonsil donors was remarkably consistent. However, when BCL6 was removed from the oncogenic backbone and replaced with MYC, we no longer saw strong enrichment for depletion of GNA13. This fits with the distribution of GNA13 mutations in human DLBCL, which are found predominantly in the EZB subtype described by Schmitz   proteins. This finding is consistent with existing biological knowledge of these proteins, which negatively regulate cell cycle 37 and have been demonstrated to oppose cellular transformation in a mouse model of MYC-induced lymphoma 23 . These findings highlight the potential to introduce further changes to the backbone combination in order to study synergy between different sets of cancer genes. We envisage future studies may also remove or replace components of the feeder-based stimulation, for instance to identify factors promoting cytokine-independent growth. The selective pressure imposed could be further altered by the use of pharmacological inhibitors of specific pathways. Future studies might also employ mutant open reading frame (mORF) screens or targeted CRISPR gene editing to introduce specific mutations into endogenous loci.
The relevance of this culture system to the pathogenesis of human lymphoma is underscored most strongly by the ability to recapitulate the appearances of human high-grade B cell lymphoma when cells are engrafted into immunodeficient mice. Notably, our data reveal that the transformation of a human GC B cell appears to require a minimum of four oncogenic hits. The oncogenic backbones used in these experiments employed combinations of BCL2, BCL6, MYC, TP53, and CCND3, widely accepted as common lymphoma driver genes. RNA-Seq revealed how the oncogenic backbone affected the transcriptional profile, with tumors appearing more GCB-like when BCL6 was included in the oncogenic backbone. Consistent with the enforced expression of BCL2 and MYC, we saw strong enrichment for signatures of double hit lymphoma 33,34 , a subtype of lymphoma characterized by translocation of both MYC and BCL2 and associated with a particularly poor clinical outcome. We anticipate that future studies will further alter the oncogenic backbone to model specific disease subtypes for functional analysis and preclinical drug testing.
The complex genetic heterogeneity of human lymphoma is becoming increasingly evident [2][3][4]. It is clear that the repertoire of available cell lines does not adequately represent each of the many molecular subtypes predicted from the analysis of sequencing studies. Therefore, the ability to generate mutation-directed tumors in vivo provides an attractive route for patient-personalized preclinical models. A particular advantage over tumor-derived xenograft models is the ability to create paired, syngeneic controls; tumors that are genetically identical other than the presence or absence of a specific mutation. Similar approaches to culture and manipulate human primary cells are proving successful for some solid organ malignancies. However, technical limitations have precluded this in B cell lymphoma. We present an extensively optimized, yet inexpensive strategy to employ primary, human, GC B cells for the investigation of lymphoma genomics and to generate bespoke, in vivo models of human lymphoma. This addresses an important bottleneck in translating lymphoma genomic findings into functional understanding that can drive improved patient outcomes and personalized therapy.
The custom CRISPR library and single gRNAs were cloned into pKLV2-U6gRNA-Bbsi-PGK-GFP, which was modified from pKLV2-U6gRNA5(Empty)-PGKBFP2AGFP-W. pKLV2-U6gRNA5(Empty)-PGKBFP2AGFP-W was a gift from Kosuke Yusa (Addgene plasmid # 67979; http://n2t.net/addgene:67979; RRID:Addgene_67979) 53. Pooled oligos for construction of the lymphoma-focused CRISPR library were obtained from TWIST Bioscience and oligos for single gRNAs were obtained from IDT. To make the GaLV-MuLV fusion envelope constructs, pHIT123 54 (kind gift of Prof Markus Muschen, City of Hope, Los Angeles, CA) containing the retroviral ecotropic envelope, human cytomegalovirus immediate-early promoter and the origin of replication from simian virus 40 was used as the backbone. The viral envelopes GaLV_WT, GaLV_MTR, and GaLV_TR were based on the SEATO strain of GaLV (NP_056791). GaLV_MTR and GaLV_TR contain the 3′ GaLV envelope sequence replaced by the MuLV transmembrane region, cytoplasmic region, and R peptide region and the MuLV cytoplasmic region and R peptide region, respectively 19. All sequences were purchased from IDT as synthetic double-stranded DNA and inserted by Gibson assembly. All plasmids were verified by capillary sequencing. The Lenti-X 293T Cell Line (Clontech Laboratories, 632180) was cultured in Dulbecco's modified Eagle's medium (DMEM, Invitrogen, Carlsbad, CA) containing 10% FBS, 100 IU/ml penicillin, and 100 µg/ml streptomycin and kept at 37°C in a humidified incubator (5% CO2 and 95% atmosphere). All cell lines used in this study were confirmed to be free from mycoplasma contamination and identity was verified using a 16-amplicon multiplexed copy number variant fingerprinting assay 24.
Construction of YK6-CD40Lg-IL21 feeder line. Discarded human tonsil tissue was obtained after a routine tonsillectomy and handled in accordance with an IRB-approved protocol (2013-0864) at the Asian Medical Center, Seoul, South Korea. The requirement for informed consent was waived by the institutional review board because there was no additional risk to the subjects and all identities were anonymized and completely delinked from unique identifiers. FDC were extracted from tonsils following an established protocol for the creation of HK FDC-like feeder cells 11 . Following mechanical disruption and enzymatic digestion, the released cells were collected and subjected to Ficoll gradient centrifugation for 20 min at 2200 r.p.m. The interface layer that contains FDC was then collected. The cells were resuspended in RPMI 1640 medium and centrifuged at 200 r.p.m. for 10 min at 4°C over a discontinuous gradient of 7.5% and 3% bovine serum albumin (BSA; A9418; Sigma-Aldrich, St. Louis, MO, US). FDC-enriched fractions were collected from the interface. Cells were washed with HBSS and cultured on tissue culture dishes. Cells isolated and cultured after these procedures initially contained large adherent cells with attached lymphocytes. Non-adherent cells were removed and adherent cells were replenished with fresh medium every 3-4 days. Adherent cells were trypsinized when confluence was attained. Because of their limited growth in culture, FDC-like cells were immortalized (now termed YK6) through retroviral transduction with pBABE_TERT.Hygro, P53DD_Thy1.1 and CDK4_R24C_Thy1.1. Immortalized YK6 cells were further transduced with hCD40Lg-Puro and IL21-LyT2.
Retroviral and lentiviral production. Retroviral packaging plasmids pHIT60 (kind gift of Dr. Louis Staudt, National Cancer Institute, USA) and GaLV WT were used as follows: 1 μg pHIT60 (gag-pol), 1 μg GaLV WT (envelope), and 4 μg of a retroviral construct were used to transfect each 10 cm² dish of HEK-293T, after mixing with 1 ml of Opti-MEM media (Invitrogen) and 18 μl

Generation of lymphoma-focused CRISPR guide RNA library. gRNA sequences were based upon two recent genome-wide libraries 53,57 . Position one of the 20-nt sequence was set to G for all gRNAs. Appropriate overlapping sequences (underlined) for Gibson Assembly into the gRNA expression plasmid pKLV2_U6gRNA_Bbsi_PGK_GFP (modified from Addgene #67979) were appended to all 6000 gRNAs. A 70-mer oligo pool was purchased from TWIST Bioscience as follows:
5′-TATCTTGTGGAAAGGACGAAACACCG-N19-GTTTAAGAGCTATGCTGGAAACAGC-3′
where N19 represents each of the 6000 gRNA sequences. The single-stranded oligo pool was converted to double-stranded DNA by PCR amplification using Q5 Hot Start High-Fidelity 2X Master Mix (NEB) with 3 ng of the oligo pool as a template and primers Zhang_F and Zhang_R_modified. The following PCR conditions were used: 95°C for 2 min; 10 cycles of 95°C for 20 s, 60°C for 20 s, and 72°C for 30 s; and a final extension at 72°C for 3 min. Primer sequences are as follows:
Zhang_F 5′-GTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACC-3′
Zhang_R_modified 5′-ACTTTTTCAAGTTGATAACGGACTAGCCTTATTTAAACTTGCTATGCTGTTTCCAGCATAGCTCTTAAAC-3′
The 150 bp PCR product was gel purified from a 2% agarose gel using the Gel Extraction kit (Qiagen) and eluted in 20 μl EB Buffer. Four Gibson Assembly reactions were performed using 14.4 ng of the purified 150 bp fragment and 200 ng of the BbsI-digested pKLV2_U6gRNA_Bbsi_PGK_GFP with Gibson HiFi DNA Assembly Master Mix (NEB). Gibson Assembly reactions were pooled and column-purified using the MinElute PCR purification kit (Qiagen).
Eight electroporations were performed, each using 1 μl of the purified Gibson reaction and 20 μl of Endura Competent Cells (Lucigen). The mixture was transferred to a 0.1 cm cuvette and electroporated at 1.8 kV. Immediately afterwards, 2 ml of prewarmed SOC media was added to each reaction, which was then placed on a shaker at 37°C for 1 h. The electroporated cells were combined and plated onto sixteen 24.5 cm² LB + ampicillin agar plates using ColiRollers Plating Beads (Merck Millipore). Plates were left at 30°C overnight and plasmid DNA was purified using a Plasmid Maxi kit (Qiagen).
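The oligo design described above can be illustrated with a minimal Python sketch. It assumes that the final G of the 5′ flanking sequence serves as position one of the gRNA and that N19 supplies positions 2 to 20; this reading, the function name, and the example guide sequence are illustrative assumptions and are not taken from the library itself.

```python
# Flanking sequences copied from the 70-mer design given in the text.
PREFIX = "TATCTTGTGGAAAGGACGAAACACCG"  # 26 nt; assumed to end with the fixed G (position one)
SUFFIX = "GTTTAAGAGCTATGCTGGAAACAGC"   # 25 nt

def build_library_oligo(guide20: str) -> str:
    """Return the 70-mer oligo for a 20-nt guide (hypothetical helper).

    Position one of the guide is dropped because the prefix already
    supplies a G there (an assumption about the design); the remaining
    19 nt form the N19 segment.
    """
    guide = guide20.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("guide must be 20 nt of A/C/G/T")
    return PREFIX + guide[1:] + SUFFIX

# Hypothetical guide sequence, for illustration only.
example_guide = "ACGTACGTACGTACGTACGT"
oligo = build_library_oligo(example_guide)
print(len(oligo), oligo)
```

The length check (26 + 19 + 25 = 70 nt) is a quick way to confirm that the flanking sequences were copied without stray characters.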
Transduction of CRISPR library and generation of gRNA sequencing libraries. GC B cells were transduced with the backbone oncogene cocktail and Cas9-BFP retrovirus until the Cas9-BFP-positive fraction reached between 50% and 80%. The number of cells transduced with the gRNA library was adjusted to take account of the percentage of Cas9-expressing cells and a target MOI of 0.3, in order to maintain representation of >1000× the size of the library. BFP and GFP expression was analyzed by flow cytometry four days after transduction and at each subsequent harvest timepoint. A minimum of 1000× representation was maintained at each passaging step. Cells were harvested every 14 days. Genomic DNA extraction was conducted as described previously 58 : cells were re-suspended in 600 μl Lysis Buffer and 15 μl Proteinase K and left at 65°C for 15 min. Depending on the size of the cell pellet, the Lysis Buffer and Proteinase K volumes were scaled up and the incubation was continued until the pellet was completely lysed. Genomic DNA was precipitated with isopropanol and re-suspended in TE Buffer. Illumina sequencing was performed as described previously 53,59 . For sequencing of all gRNAs in the CRISPR library, primers (gLibrary-HiSeq_50bp-SE-U1 and −L1) were used to amplify the region containing the gRNA. Primer sequences are as follows:

For this PCR, it is crucial to use sufficient genomic DNA to capture every gRNA in the cell population. This depends on the complexity of the populations to be analyzed. For human cells, 10^6 cells contain around 6.6 μg of genomic DNA (assuming normal copy number). Therefore, the number of cells harvested (in millions) × 6.6 μg corresponds to the amount of genomic DNA needed in the first PCR. For example, if a cell population was 30% double positive for Cas9-BFP and CRISPR library-GFP, then 20 × 10^6 cells were harvested to achieve 1000× coverage (6000 guides × 1000 coverage/0.3). In this case, approximately 132 μg (20 × 6.6) of genomic DNA was used in the first PCR, with a maximum of 10 μg per 50 μl reaction.
Therefore, 13 independent PCR reactions were performed using 10 μg of genomic DNA per reaction with Q5 Hot Start High-Fidelity 2x Master Mix. The following PCR conditions were used: 98°C for 30 s; 20-24 cycles of 95°C for 10 s, 61°C for 15 s, and 72°C for 20 s; and a final extension at 72°C for 2 min. Five microliters from each PCR reaction were run on a 2% agarose gel, and the PCR was run for a few more cycles if there was no PCR product or the bands were still faint. Next, 10 μl from each individual PCR reaction per sample was taken, pooled, and purified using the QIAquick PCR Purification Kit (Qiagen). DNA was eluted in 50 μl EB buffer (Qiagen) and the concentration was quantified on a NanoDrop.

In the second PCR, next-generation sequencing adaptors (P5, P7) compatible with Illumina's HiSeq4000 and a barcode were added. One nanogram of the purified PCR product was used with NEBNext Q5 Hot Start HiFi PCR Master Mix under the following conditions: 98°C for 30 s; 9-12 cycles of 98°C for 10 s and 65°C for 75 s; and a final extension at 65°C for 5 min. The forward primer Indexing Adapter PE 1.0 and different reverse indexing primers (iPCRtagT1-56) were used in this second PCR, with a different reverse indexing primer for each sample. Primer sequences are as follows:

Five microliters from each PCR reaction were run on a 2% agarose gel and checked for visible PCR bands. The PCR products were purified with Agencourt AMPure XP beads at a PCR-product-to-bead ratio of 1:0.7 and eluted in 30 μl EB Buffer (Qiagen). The purified libraries were quantified, pooled, and sequenced on Illumina HiSeq4000 by 50-bp single-end sequencing. Two custom sequencing primers were used: iPCRtagseq, which reads through the indices, and U6-Illumina-seq2, which reads through the gRNA sequence. Enriched gRNAs were defined based upon enrichment relative to the plasmid pool counts.
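The coverage arithmetic used to size the first PCR can be sketched in Python as follows. The function and parameter names are hypothetical; the numerical inputs (6000 guides, 1000× coverage, 30% double-positive cells, 6.6 μg of genomic DNA per 10^6 cells, and 10 μg per reaction) are those quoted in the text. The sketch counts only reactions that can be filled with the full 10 μg, which is an assumption about how the number of reactions was chosen.

```python
import math

def screen_requirements(n_guides, coverage, double_positive_fraction,
                        ug_per_million_cells=6.6, max_ug_per_reaction=10.0):
    """Estimate cells to harvest and gDNA input for the first PCR.

    Returns (cells, gdna_ug, full_reactions), where full_reactions is the
    number of reactions that can each receive the maximum gDNA input.
    """
    # Cells needed so that the double-positive subset covers the library.
    cells = math.ceil(n_guides * coverage / double_positive_fraction)
    # Genomic DNA contained in that many cells (normal copy number assumed).
    gdna_ug = cells / 1e6 * ug_per_million_cells
    # Reactions that can be loaded with the full per-reaction maximum.
    full_reactions = int(gdna_ug // max_ug_per_reaction)
    return cells, gdna_ug, full_reactions

cells, gdna_ug, full_reactions = screen_requirements(6000, 1000, 0.3)
print(cells, round(gdna_ug, 1), full_reactions)
```

With these inputs the function reproduces the 20 × 10^6 cells and 13 reactions of 10 μg described above; any genomic DNA beyond the full reactions (about 2 μg here) is not accounted for in this sketch.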
The sequences of the two custom sequencing primers are as follows:
iPCRtagseq 5′-AAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTC-3′
U6-Illumina-seq2 5′-TCTTCCGATCTCTTGTGGAAAGGACGAAACACCG-3′

Computational analysis of CRISPR screens. Raw reads were normalized to the total number of reads in each sample, with raw sequencing reads indexed by gRNA i of gene g at time t in replicate r. For each gRNA, the Z-score of the log2 fold change between the plasmid library and the late sample, Z_igr, was then computed. Finally, the CRISPR score of gene g, which represents the magnitude and direction of the fitness effect of gene g between the two time points, was derived from these Z-scores using L_g, the number of sgRNAs of gene g in replicate r, and R, the number of available replicates.
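The following Python sketch shows one plausible reading of this scoring scheme. It is not the authors' formula: it assumes normalization to reads per million with a pseudocount of one, a Z-score taken across all gRNAs within a replicate, and a CRISPR score equal to the mean Z-score over the gRNAs of a gene and then over replicates. All function names and the toy counts are hypothetical.

```python
import math
from statistics import mean, pstdev

def normalize(counts, scale=1e6, pseudocount=1.0):
    """Scale raw counts to reads per `scale` reads and add a pseudocount (assumed)."""
    total = sum(counts.values())
    return {g: c / total * scale + pseudocount for g, c in counts.items()}

def guide_z_scores(plasmid_counts, late_counts):
    """Z-score of the log2 fold change (late vs plasmid) for every gRNA."""
    plasmid, late = normalize(plasmid_counts), normalize(late_counts)
    lfc = {g: math.log2(late[g] / plasmid[g]) for g in plasmid}
    mu, sd = mean(lfc.values()), pstdev(lfc.values())
    return {g: (v - mu) / sd for g, v in lfc.items()}

def crispr_scores(z_by_replicate, guide_to_gene):
    """Average Z over the gRNAs of each gene, then over replicates."""
    per_gene = {}
    for z in z_by_replicate:
        by_gene = {}
        for guide, score in z.items():
            by_gene.setdefault(guide_to_gene[guide], []).append(score)
        for gene, scores in by_gene.items():
            per_gene.setdefault(gene, []).append(mean(scores))
    return {gene: mean(vals) for gene, vals in per_gene.items()}

# Toy counts for a single replicate (illustrative only).
plasmid = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
late = {"g1": 400, "g2": 300, "g3": 50, "g4": 50}
guide_to_gene = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}

z = guide_z_scores(plasmid, late)
scores = crispr_scores([z], guide_to_gene)
print(scores)
```

Under these assumptions a positive score indicates enrichment of a gene's gRNAs relative to the plasmid pool and a negative score indicates depletion.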
RNA-sequencing. Total RNA from cells was extracted using NucleoSpin RNA from Macherey-Nagel and cDNA was produced from 500 ng of total RNA using qScript™ cDNA SuperMix (Quanta Biosciences

Computational analysis of barcoded overexpression experiments. Relative abundances F_ictr of each construct in a pooled competitive culture were computed from N_ictr, the raw sequencing counts of clone i of construct c at time t in replicate r. The average relative abundance of construct c at time t was then obtained using L_ct, the number of clones of construct c at time t, and R, the number of available replicates.

BCR amplification. PCR amplification of DNA from synthetic lymphoma tumors (100 ng input) was performed with 1 μl of JH reverse primer (10 μM) and 1 μl of FR1 forward primer set pools ( for Illumina MiSeq platforms following the manufacturer's instructions. MiSeq reads were filtered for base quality (median Phred score > 32) using the QUASR program (http://sourceforge.net/projects/quaasi/) and for length (300 bp paired-end) 64 . The computational pipeline MRD Assessment and Retrieval Code in Python (MRDARCY) was then used to analyze BCRs, followed by secondary rearrangement analysis in which the relative frequencies of each IgHV gene were determined by BLAST using the ImMunoGeneTics (IMGT) reference gene database. The following primers were used:

High-throughput sequencing and analysis of heavy chain immunoglobulin.
Deep sequencing of PCR-amplified immunoglobulin heavy chain variable gene regions, BCR network generation, and analysis of network properties were performed as described previously 63 . Each vertex represents a unique sequence, and relative vertex size is proportional to the number of identical reads. Edges join vertices that differ by a single-nucleotide, non-indel difference, and clusters are collections of related, connected vertices. Ig gene usage and sequence annotation were determined with IMGT V-QUEST, and repertoire differences were analyzed using custom Python scripts.
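The network definition above can be sketched in Python as follows. This is a minimal illustration rather than the published pipeline: it interprets a "single-nucleotide, non-indel difference" as a Hamming distance of one between sequences of equal length, uses an all-pairs comparison that would be too slow for real repertoires, and relies on hypothetical function names and toy reads.

```python
from collections import Counter
from itertools import combinations

def differs_by_one_substitution(a: str, b: str) -> bool:
    """True if a and b have equal length and exactly one mismatching base."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def build_bcr_network(reads):
    """Vertices are unique sequences sized by read count; edges link 1-substitution neighbours."""
    sizes = Counter(reads)
    edges = [(a, b) for a, b in combinations(list(sizes), 2)
             if differs_by_one_substitution(a, b)]
    return sizes, edges

def find_clusters(vertices, edges):
    """Connected components of the network, found by depth-first search."""
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, clusters = set(), []
    for v in vertices:
        if v in seen:
            continue
        stack, component = [v], set()
        while stack:
            u = stack.pop()
            if u not in component:
                component.add(u)
                stack.extend(adjacency[u] - component)
        seen |= component
        clusters.append(component)
    return clusters

# Toy reads (illustrative only; real BCR sequences are far longer).
reads = ["ACGT", "ACGT", "ACGA", "TCGA", "GGGG"]
sizes, edges = build_bcr_network(reads)
clusters = find_clusters(list(sizes), edges)
print(sizes, edges, clusters)
```

In this toy example the three related sequences form one cluster and the unrelated sequence forms a singleton, mirroring how clonal expansions appear as large connected clusters.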
For the visual representations of the BCR repertoires, BCR network subsampling was performed using the cluster-enforced linkage sampling (CC) method to preserve the overall clonal structure. Briefly, the CC algorithm employs three steps to account for loss of connectivity between vertices in clusters during sampling:
Vertex selection: Vertices were reselected until the desired number of clusters in the original network G is represented.
Cluster-vertex migration: For each cluster in the original network which contains more than one vertex that was sampled, vertices were reselected such that the cluster connectivity is retained in the sampled network.
Induced graph formation: Graph induction selects the set of edges (Es) to be included in the sampled graph. Total graph induction is used in CC, so that all edges incident on the sampled vertices are included in the sampled graph.
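The induced graph formation step can be illustrated with a short Python sketch. It assumes that "edges incident on the sampled vertices" means edges whose two endpoints were both sampled, which is the usual definition of an induced subgraph; the function name and the toy graph are hypothetical.

```python
def induce_subgraph(edges, sampled_vertices):
    """Keep every original edge whose two endpoints were both sampled."""
    sampled = set(sampled_vertices)
    return [(a, b) for a, b in edges if a in sampled and b in sampled]

# Toy network: a chain a-b-c-d (illustrative only).
original_edges = [("a", "b"), ("b", "c"), ("c", "d")]
sampled_edges = induce_subgraph(original_edges, {"a", "b", "d"})
print(sampled_edges)
```

Because no edges are sampled independently of the vertices, the connectivity of the subsample is determined entirely by the vertex selection and cluster-vertex migration steps.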
This process was repeated 20 times, and the subsample that most closely represented the true (unsampled) maximum cluster size was retained and plotted.

IGHV gene editing analyses were performed in a similar manner to. For all BCRs, stem regions were identified (defined as N-IgHD-N-IgHJ regions starting 3 bp downstream of the IgHV gene boundary). The number of unique BCR sequences sharing stem regions but with different IgHV gene usage (>95% difference in sequence identity in the IgHV region) and with a different 5′ end of the junctional region (defined as IgHV(last 3 bp)-N-IgHD-N-IgHJ) was determined and compared to the total number of unique BCRs to give the percentage IgHV replacement. Sequences with joining regions (N-IgHD-N-IgHJ regions) shorter than eight nucleotides were excluded from this percentage because such receptors are potentially germline encoded.
Mutation analysis. To identify somatic mutations across synthetic lymphoma tumors a hybrid-capture platform was used with a bait set 58 (SureSelect, Agilent, UK, ELID # 0731661) of 292 genes frequently mutated in hematological malignancies. After hybridization-based sequence enrichment (SureSelect HSXT , Agilent), high-throughput sequencing was performed on the Illumina HiSeq 4000 platform.
Sequencing data alignment. DNA sequencing reads were aligned to GRCh37d5 according to the workflow described on the Samtools webpage (http://www.htslib.org/workflow/) as follows: reads were mapped to the reference genome using BWA-MEM 65 0.7.17, followed by Samtools 66 1.9 to clean up read pairing information and flags on SAM records. To improve the mapped data, Broad's GATK 67 Realigner 3.8.1 was used to reduce the number of miscalled indels, followed by Picard 2.18.25 (http://broadinstitute.github.io/picard) to identify duplicates.
Variant calling for substitutions and indels. Single base substitutions and short insertions and deletions were called using GATK 67 4.1 Mutect2 based on the tutorials available at the Broad Institute website (https://gatkforums.broadinstitute.org/gatk/discussion/11136/how-to-call-somatic-mutations-using-gatk4-mutect2). The mutant variants were annotated using Variant Effect Predictor 68 from ENSEMBL version 95.
FACS (fluorescence-activated cell sorting). Cells were stained with fluorophore-labeled antibodies in 2% BSA in PBS according to the manufacturer's instructions. Stained or unstained cells were analyzed on the LSRII (BD). For cell counting, CountBright Absolute Counting Beads (ThermoFisher) were used according to the manufacturer's instructions and analyzed on the LSRII (BD). For apoptosis analysis, the APC-conjugated Annexin V/Dead Cell Apoptosis Kit (BioLegend 640930) was used to detect apoptotic cells according to the manufacturer's instructions. Externalization of phosphatidylserine (Annexin V, APC Conjugate; BioLegend, 1:20) and DNA content (7-AAD; BioLegend, 1:20) were measured, and gating on all cells was used for further analysis. Cell cycle analysis was performed using the Vybrant® DyeCycle™ Ruby Stain (ThermoFisher V10309, 1:500, final stain concentration 5 μM) according to the manufacturer's instructions. Cells were treated with Nocodazole (1 μg/ml) 24 h prior to staining. Stained cells were analyzed by gating on cells in the G2 phase using FlowJo software.
Intracellular staining of phosphorylated AKT was performed as follows: the cell suspension and pre-warmed Fixation Buffer (BD Cytofix) were gently mixed in a 1:1 ratio and incubated at 37°C for 15 min. Cells were pelleted and washed twice with PBS at 350g for 5 min. Ice-cold True-Phos perm buffer (BD Cytofix) was added dropwise to the cell pellet while vortexing, followed by incubation at −20°C for at least 60 min. Cells were then washed twice and resuspended in FACS buffer (PBS + 2% FBS) containing the appropriate antibody at a dilution of 1:50 (Phospho-Akt Ser473, Cell Signaling, #11962). After staining for 30 min, cells were washed and resuspended in FACS buffer, followed by analysis on the LSRII (BD).