Large panels of human cancer cell lines have been profiled at the DNA, RNA and pharmacological levels to accelerate the search for cancer therapies. But two of those large data sets show only partial concordance. See Analysis p.389
Despite obvious limitations in their ability to model clinical disease, cultured cell lines remain central to research on cancer. But a study by Haibe-Kains et al. (ref. 1) on page 389 of this issue reveals apparent inconsistencies between two large studies of the sensitivity of hundreds of cell lines to dozens of drugs. The findings sound a note of caution about the interpretation of data from such projects, but do not undermine their value.
The first cell-line panel used for large-scale screening of compounds for anticancer activity was the NCI-60, a diverse set of 60 human lines that has been used to screen more than 100,000 compounds since 1988 (ref. 2). Because the same lines have been profiled at the DNA, RNA, protein and chromosomal levels, molecular aberrations in the cells can be correlated with their sensitivity to drugs (ref. 3). But 60 is a relatively small number. So it was exciting when, in March 2012, the Cancer Cell Line Encyclopedia (CCLE; ref. 4) and Cancer Genome Project (CGP; ref. 5) were published, presenting gene-expression profiles and drug-sensitivity assays for 1,036 cell lines and 24 drugs, and 727 cell lines and 138 drugs, respectively. The publications also contained information on gene-copy number and genome sequence for some of the cell lines, and protein-level data have since been added to the mix. Those extensive databases are being used in numerous laboratories to guide research on the molecular mechanisms of cancer, to generate hypotheses for the development of new therapies, and in conjunction with clinical studies (ref. 6).
Haibe-Kains et al. analysed the relationships between gene expression and drug sensitivity for 471 cell lines, 15 drugs and 12,187 genes that were in both the CCLE and CGP data sets. They found a rather low correlation between the two. Both original studies were carefully done and carefully documented; there is no implication that the apparent discrepancy involved error on the part of either team. So what is the source of the difference? In principle, it could be due to differences in the gene-expression profiles, pharmacological assays, computational methods or any combination thereof.
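The core of such a comparison can be sketched in a few lines: for each drug shared by the two projects, take its sensitivity measure (for example, log-IC50) across the shared cell lines and compute a rank correlation between the two studies' measurements. The sketch below uses simulated data under stated noise assumptions; the variable names, sample sizes and noise levels are illustrative only, not values taken from either study.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    (This simple rank transform does not handle ties, which is fine
    for the continuous simulated values used here.)
    """
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)

# Hypothetical log-IC50 values for one drug across 471 shared cell
# lines: both studies observe the same underlying sensitivity, but
# through different assays with different noise levels.
true_sensitivity = rng.normal(size=471)
study_a = true_sensitivity + rng.normal(scale=0.5, size=471)
study_b = true_sensitivity + rng.normal(scale=1.5, size=471)

# Even with a shared biological signal, assay noise alone can pull
# the between-study correlation well below 1.
rho = spearman(study_a, study_b)
```

Repeating this per drug, as Haibe-Kains et al. did, yields a distribution of between-study correlations rather than a single number, which is why the choice of correlation statistic and of sensitivity measure matters for the headline conclusion.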
The authors found that the gene-expression profiles, which were obtained from microarray studies, showed quite good concordance between the two projects, whereas the pharmacological assays did not (Fig. 1). But that should come as no surprise. The pharmacological assay used by the CGP (the CellTiter 96 AQueous One Solution Cell Proliferation Assay from Promega) measures metabolic activity in terms of a reductase-enzyme product after a 72-hour incubation of cells with a drug; that used by the CCLE (the CellTiter-Glo assay from Promega) measures metabolic activity by assessing levels of the energy-transfer molecule ATP, after 72–84 hours of incubation. Both assays provide indices of the drug's activity against the cells, but they would not be expected to mirror each other across all cell and drug types, even if run in parallel (and neither may be the best indicator of cell viability).
Furthermore, many variables can affect the quantitative results obtained in such assays. For example, drug sensitivities can diverge if different batches of fetal bovine serum (an ingredient of cell-culture medium that varies in its content of cytokines and other biologically active molecules) are used. The time and conditions of the cells' incubation before the drug is added, the coating on the plastic culture wells, intra-study batch or trend effects and other such arcane factors can all be influential. In this case, the intrinsic sensitivities of the two assays as analysed were also different: for 12 of the 15 drugs in question, the CCLE assay was not sensitive enough to reach its endpoint for a large fraction of the cell types; in the CGP study, a mathematical extrapolation was used to obtain quantitative results in such cases. Haibe-Kains et al. performed extensive analyses to take account of such issues, but more experimental data would be required to pin down the true reasons for the discrepancies they highlight. Overall, if there is any surprise about the discordance between the two pharmacological data sets, it is quantitative, rather than qualitative.
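The extrapolation issue can be made concrete. When the highest tested drug concentration leaves cell viability above 50%, any reported IC50 comes from extending a fitted dose-response curve beyond the data. The following is a minimal sketch with simulated data, a simple two-parameter logistic (Hill) model and a grid-search fit; it is not the actual fitting procedure of either project, and all concentrations and parameter values are invented for illustration.

```python
import numpy as np

def viability(conc, ic50, hill):
    # Two-parameter logistic (Hill) dose-response model.
    return 1.0 / (1.0 + (conc / ic50) ** hill)

# Hypothetical assay: concentrations tested only up to 8 uM, but the
# cell line is insensitive (true IC50 = 30 uM), so viability never
# drops below 50% within the tested range.
conc = np.array([0.03, 0.1, 0.3, 1.0, 3.0, 8.0])
rng = np.random.default_rng(1)
obs = viability(conc, ic50=30.0, hill=1.0) + rng.normal(scale=0.02, size=conc.size)

# Grid-search least-squares fit over (IC50, Hill slope). The best-fit
# IC50 lies beyond the highest tested dose -- an extrapolated value,
# more model-dependent and less reproducible than an interpolated one.
ic50_grid = np.logspace(-2, 3, 500)
hill_grid = np.linspace(0.5, 3.0, 50)
best = min(
    ((np.sum((obs - viability(conc, i, h)) ** 2), i, h)
     for i in ic50_grid for h in hill_grid),
    key=lambda t: t[0],
)
_, ic50_fit, hill_fit = best
```

When many cell lines for a drug fall into this extrapolated regime, as the text notes for 12 of the 15 CCLE drugs, quantitative agreement between studies becomes hard to expect even in principle.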
An interesting question not directly addressed by Haibe-Kains and colleagues is whether the drugs would cluster into two separate groups on the basis of response data from the two different projects or whether they would intermingle — in other words, whether the between-assay differences were greater than the within-assay differences among drugs.
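One way to frame that question computationally is to treat each drug, in each study, as a response vector over the shared cell lines, and then compare the cross-assay distance between the two measurements of the same drug with the distances between different drugs: if same-drug distances are smaller, the drugs would intermingle rather than cluster by assay. The toy simulation below makes the optimistic assumption that assay noise is smaller than between-drug differences; all sizes and noise scales are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n_lines, n_drugs = 471, 15

# Hypothetical data: each drug has a true sensitivity profile across
# the cell lines; each study observes it with its own assay noise.
true_profiles = rng.normal(size=(n_drugs, n_lines))
study_a = true_profiles + rng.normal(scale=0.8, size=(n_drugs, n_lines))
study_b = true_profiles + rng.normal(scale=0.8, size=(n_drugs, n_lines))

def corr_dist(u, v):
    # Correlation distance: 1 - Pearson correlation.
    return 1.0 - np.corrcoef(u, v)[0, 1]

# Mean distance between the two measurements of the SAME drug
# (across assays) versus between DIFFERENT drugs.
same = np.mean([corr_dist(study_a[d], study_b[d]) for d in range(n_drugs)])
diff = np.mean([corr_dist(study_a[i], study_b[j])
                for i in range(n_drugs) for j in range(n_drugs) if i != j])

# True here by construction; on the real data it is an open question.
drugs_intermingle = same < diff
```

Running the same comparison on the real CCLE and CGP response matrices would answer the between- versus within-assay question directly.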
Given the differences, which pharmacological assay represents the 'truth'? The probable answer is either both or neither, depending on one's purpose. If the aim is to predict clinical efficacy, then neither assay will be 'correct' in most cases. The well-worn dictum “all models are wrong, some models are useful” (ref. 7) applies with a vengeance in this context; there are too many differences between cultured cells and patients, particularly in terms of the delicate balance between beneficial and toxic effects of anticancer drugs.
In our view, the more appropriate uses of cell-line pharmacological data are for hypothesis generation and for elaborating on existing hypotheses, rather than for formal statistical prediction. Clues obtained by correlating drug-sensitivity patterns with molecular profiles will sometimes illuminate cellular mechanisms and pathways that advance our basic understanding, even if they are not directly predictive for the clinic. The CCLE and CGP authors each use their drug-response data to tell several provocative molecular stories, and there is considerable overlap between the two sets of stories. Indeed, the statistical tests presented in both studies suggest that the patterns of response they observe are quite robust, even if the individual drug-response measurements are not. The patterns can be highly instructive about mechanisms of action.
Haibe-Kains et al. make a plea for standardization of pharmacological assays among researchers. Standardization is without doubt useful for comparison of studies and for quality assurance, once a research community has decided what the standards should be. In this particular case, however, the more immediately productive enterprise would be a joint effort by the teams to pin down the reason(s) for the differences between assays. We strongly suspect that the two teams have already generated a great deal of the relevant information, and additional experiments could be done in parallel between the two — to support the activities of the many researchers who are using, or will use, these rich data resources.
1. Haibe-Kains, B. et al. Nature 504, 389–393 (2013).
2. Shoemaker, R. H. et al. Prog. Clin. Biol. Res. 276, 265–286 (1988).
3. Weinstein, J. N. et al. Science 275, 343–349 (1997).
4. Barretina, J. et al. Nature 483, 603–607 (2012).
5. Garnett, M. J. et al. Nature 483, 570–575 (2012).
6. The Cancer Genome Atlas Research Network et al. Nature Genet. 45, 1113–1120 (2013).
7. Box, G. E. P. in Robustness in Statistics: Proceedings of a Workshop (eds Launer, R. L. & Wilkinson, G. N.) 201–236 (Academic, 1979).
Weinstein, J., Lorenzi, P. Discrepancies in drug sensitivity. Nature 504, 381–383 (2013). https://doi.org/10.1038/nature12839