Abstract

Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.

Accessions

Primary accessions

ArrayExpress

Referenced accessions

Gene Expression Omnibus

Acknowledgements

We thank members of the Genentech cell line bank (gCell) and the compound screening group (gCSI) for contributing cell lines and results to this paper. We thank A. Bruce for graphical assistance.

Author information

Affiliations

  1. Department of Bioinformatics and Computational Biology, Genentech Inc., South San Francisco, California, USA.

    • Christiaan Klijn
    • Steffen Durinck
    • Eric W Stawiski
    • Peter M Haverty
    • Zhaoshi Jiang
    • Hanbin Liu
    • Jeremiah Degenhardt
    • Oleg Mayba
    • Florian Gnad
    • Jinfeng Liu
    • Gregoire Pau
    • Jens Reeder
    • Yi Cao
    • Kiran Mukhyala
    • Gregory J Zynda
    • Matthew J Brauer
    • Thomas D Wu
    • Robert C Gentleman
    • Gerard Manning
    • Richard Bourgon
    • Zemin Zhang
  2. Department of Molecular Biology, Genentech Inc., South San Francisco, California, USA.

    • Steffen Durinck
    • Eric W Stawiski
    • Zora Modrusan
    • Frederic J de Sauvage
    • Somasekar Seshagiri
  3. Department of Discovery Oncology, Genentech Inc., South San Francisco, California, USA.

    • Yi Cao
    • Suresh K Selvaraj
    • Mamie Yu
    • Robert L Yauch
    • David Stokoe
    • Richard M Neve
    • Jeffrey Settleman

Contributions

C.K., F.J.d.S., J.S., S.S. and Z.Z. conceived the project. C.K., J.S., S.S. and Z.Z. wrote the manuscript. C.K., S.D., E.W.S., P.M.H., Z.J., H.L., J.D., O.M., F.G., J.L., G.P., J.R., K.M., G.J.Z., M.J.B., T.D.W., R.C.G., G.M. and R.B. performed bioinformatics data analysis or provided computational infrastructure. Y.C., S.K.S., M.Y., R.L.Y., D.S., Z.M. and R.M.N. prepared cell lines and performed biochemical experiments including drug treatments and sequencing.

Competing interests

The majority of authors are employees of Genentech Inc. and/or hold stock in Roche.

Corresponding authors

Correspondence to Jeffrey Settleman or Somasekar Seshagiri or Zemin Zhang.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–18, Supplementary Tables 3, 5, 9 and 12 and Supplementary Note

Excel files

  1. Supplementary Table 1

    Overview of cell lines included in this study

  2. Supplementary Table 2

    Sequencing statistics for RNA sequencing of cancer cell lines

  3. Supplementary Table 4

    Results for GISTIC analysis run on 610 cell lines

  4. Supplementary Table 6

    Viral integration sites detected by human-viral chimeric RNA

  5. Supplementary Table 7

    Viral integration sites detected by human-viral chimeric RNA (murine viruses)

  6. Supplementary Table 8

    Gene-gene fusions identified in cancer cell lines

  7. Supplementary Table 10

    Fusions found in TCGA for which at least one gene was also found in a fusion in cell lines

  8. Supplementary Table 11

    Crizotinib response in cancer cell lines

  9. Supplementary Table 13

    IC50 values for five drugs determined in 351 cell lines

Zip files

  1. Supplementary Data 1

    Gene expression read counts for all coding genes

  2. Supplementary Data 2

    Gene expression read counts for all non-coding genes

  3. Supplementary Data 3

    All single-nucleotide mutations found in cell lines in this study

  4. Supplementary Data 4

    Per-gene ploidy-corrected copy number values