Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients


Cell-line screens create expansive datasets for learning predictive markers of drug response, but these models do not readily translate to the clinic with its diverse contexts and limited data. In the present study, we apply a recently developed technique, few-shot machine learning, to train a versatile neural network model in cell lines that can be tuned to new contexts using few additional samples. The model quickly adapts when switching among different tissue types and in moving from cell-line models to clinical contexts, including patient-derived tumor cells and patient-derived xenografts. It can also be interpreted to identify the molecular features most important to a drug response, highlighting critical roles for RB1 and SMAD4 in the response to CDK inhibition and RNF8 and CHD4 in the response to ATM inhibition. The few-shot learning framework provides a bridge from the many samples surveyed in high-throughput screens (n-of-many) to the distinctive contexts of individual patients (n-of-one).
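The pretrain-then-adapt idea described above can be illustrated with a minimal sketch. It is not the TCRP implementation: it swaps the neural network and meta-learning procedure for plain fine-tuning of a linear model on synthetic data, and every name in it (`sgd`, `make_data`, the source and target contexts) is hypothetical. It shows only how a model fitted on many source samples might be tuned to a shifted context with five labeled samples.

```python
import random

def predict(w, b, x):
    """Linear prediction for one sample with feature list x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mse(w, b, X, y):
    """Mean squared error of the linear model over a dataset."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def sgd(w, b, X, y, lr, steps):
    """Run `steps` full-batch gradient steps on the MSE; return the new (w, b)."""
    w = list(w)
    n = len(y)
    for _ in range(steps):
        errs = [predict(w, b, x) - t for x, t in zip(X, y)]
        grad_w = [2 * sum(e * x[j] for e, x in zip(errs, X)) / n
                  for j in range(len(w))]
        grad_b = 2 * sum(errs) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def make_data(rng, n, w_true, b_true):
    """Synthetic, noise-free responses from a known linear rule (assumption)."""
    X = [[rng.uniform(-1, 1) for _ in w_true] for _ in range(n)]
    y = [predict(w_true, b_true, x) for x in X]
    return X, y

rng = random.Random(0)
# Source context: many samples (stand-in for a cell-line screen).
X_src, y_src = make_data(rng, 200, [1.0, -2.0], 0.0)
# Target context: same feature effects, shifted baseline, few labeled samples.
X_few, y_few = make_data(rng, 5, [1.0, -2.0], 1.5)
X_test, y_test = make_data(rng, 100, [1.0, -2.0], 1.5)

# Pretrain on the source context, then adapt with a few steps on 5 samples.
w0, b0 = sgd([0.0, 0.0], 0.0, X_src, y_src, lr=0.1, steps=500)
w1, b1 = sgd(w0, b0, X_few, y_few, lr=0.1, steps=20)

err_before = mse(w0, b0, X_test, y_test)  # error of the unadapted model
err_after = mse(w1, b1, X_test, y_test)   # error after few-shot adaptation
print(err_before, err_after)
```

In this toy setting the adapted model has a lower held-out error in the target context than the unadapted one, because the few target samples are enough to correct the baseline shift.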

Fig. 1: Study design.
Fig. 2: Transfer of predictive models across tissue types.
Fig. 3: Transfer of cell-line models to PDTC lines.
Fig. 4: Transfer of cell-line models to PDXs.
Fig. 5: Model interpretation to identify predictive markers.
Fig. 6: Model predictions and interpretation for the BRAF inhibitor dabrafenib.

Data availability

The datasets generated and/or analyzed during the current study are all public: CCLE, CERES-corrected CRISPR gene disruption scores, the GDSC1000 dataset, the PDTC dataset and the PDX dataset. Other miscellaneous datasets that support the findings of the present study are also available online. Source data are provided with this paper.

Code availability

The software implementation of TCRP, along with all supporting code, is available online. Other supporting software: scikit-learn v.0.20.2 and PyTorch 1.0.



Acknowledgements

We thank the following for their support for the present study: the National Cancer Institute for grants (nos. U54CA209891 to T.I., R01CA204173 to C.B. and K22CA234406 to J.S.), the National Institute of General Medical Sciences for a grant (no. P41GM103504 to T.I.) and the National Human Genome Research Institute for a grant (no. R01HG009979 to T.I.). R.S. was supported by a research grant from the Israel Science Foundation (grant no. 715/18). J.P. was supported by a grant from the National Science Foundation (grant no. 1652815). L.W. and S.M. were supported by the ZonMw TOP grant COMPUTE CANCER (40-00812-98-16012). J.S. was supported by the Cancer Prevention and Research Institute of Texas (CPRIT RR180035).

Author information


J.M. and T.I. designed the study and developed the conceptual ideas. J.M. and Y.L. implemented the main algorithms. J.M. and S.H.F. collected all the input sources. J.M., S.M., L.F.A.W. and M.H. developed the strategy for alignment of in vitro and in vivo drug responses. J.M., C.J.B. and T.I. interpreted the results. J.M., S.H.F., R.S., C.J.B., J.P., J.P.S. and T.I. wrote the manuscript.

Corresponding author

Correspondence to Trey Ideker.

Ethics declarations

Competing interests

T.I. is co-founder of Data4Cure, Inc., is on the Scientific Advisory Board and has an equity interest. T.I. is on the Scientific Advisory Board of Ideaya BioSciences, Inc. and has an equity interest. The terms of these arrangements have been reviewed and approved by the University of California San Diego in accordance with its conflict-of-interest policies. L.W. received project funding from Genmab BV. The other authors declare no competing interests.

Additional information

Peer review information Nature Cancer thanks Cyril Benes, Roland Eils and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Analysis of fitness versus predictive performance for the panel of gene knockouts in our study.

a, Distribution of relative growth values after CRISPR gene knockout, taken as the median across all n = 341 cell lines. Blue: pooling knockouts of all n = 17,670 genes; pink: pooling n = 469 knockouts of genes selected in our study. Fitness is corrected for copy number variation using the CERES algorithm. b, For each knockout of a selected gene, predictive performance (y axis) is computed as the Pearson correlation between predicted and actual growth measurements over all n = 341 cell lines. This performance is displayed as a function of the median growth fitness of that knockout (x axis). Growth fitness is binned according to percentiles; for example, the first bin (0-10%) represents the top 10% of selected genes with the strongest median effects on growth. The distribution of predictive performance for each bin is shown with a violin plot. Error bars represent 95% confidence intervals.
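The percentile binning in panel b can be sketched as follows. This is an illustrative helper, not the authors' code: the function name `decile_bins` and the toy values are hypothetical, and it assumes that a more negative median fitness means a stronger effect on growth, so that the first bin holds the strongest knockouts.

```python
def decile_bins(fitness, performance, n_bins=10):
    """Group per-gene performance values into percentile bins of fitness.

    Genes are ranked by median fitness, most negative first (assumed to be
    the strongest growth effect), so bins[0] corresponds to the 0-10% bin.
    """
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    bins = [[] for _ in range(n_bins)]
    for rank, i in enumerate(order):
        bins[rank * n_bins // len(order)].append(performance[i])
    return bins

# Toy input: 20 hypothetical genes with made-up fitness and performance values.
fitness = [-i / 10 for i in range(20)]
performance = [i / 20 for i in range(20)]
bins = decile_bins(fitness, performance)
print([len(b) for b in bins])
```

Each inner list could then be passed to a plotting routine to draw one violin per bin.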

Extended Data Fig. 2 Training accuracy of TCRP and other baseline models for all challenges.

Extended Data Fig. 3 Alternative calculation of model performance using Spearman correlation.

While Pearson correlation is used to calculate model performance in the main text, this supplemental figure provides equivalent performance calculations using the non-parametric rank-based Spearman correlation. a, Related to Fig. 3b on n = 83 PDTC models. b, Related to Fig. 4a on n = 228 PDX models.
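The difference between the two measures can be seen in a small standard-library sketch. The data are made up and the function names are illustrative. Spearman correlation is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship between prediction and observation.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.0, 4.0, 9.0, 16.0, 25.0]
r_pearson = pearson(predicted, observed)
r_spearman = spearman(predicted, observed)
print(r_pearson, r_spearman)
```

Because the relationship is perfectly monotonic, the rank-based measure is 1 while the Pearson value is slightly lower, which is why reporting both can be informative.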

Extended Data Fig. 4 Comparison of transferability of different machine learning models to patient-derived xenografts.

Predictive models were pre-trained using responses of cancer cell lines to perturbations with drugs, one model per drug. Few-shot learning was then performed on 0-10 PDX breast tumor samples exposed to that drug (x axis), and model accuracy (y axis) was measured by a, Pearson correlation or b, Spearman correlation on the remaining held-out PDX samples. Results are averaged across five drugs (see main text). This experiment considers n = 228 PDX models.
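The evaluation protocol in this caption, adapting on k samples and scoring on the rest, can be sketched as a splitting routine. The cohort size of 12, the number of repeats and the name `k_shot_splits` are assumptions made for illustration; the caption does not state how many random draws were used.

```python
import random

def k_shot_splits(n_samples, k, n_repeats, seed=0):
    """Yield (support, held_out) index lists.

    `support` holds the k samples used for few-shot adaptation and
    `held_out` holds every remaining sample, used only for scoring.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        support = sorted(rng.sample(indices, k))
        chosen = set(support)
        held_out = [i for i in indices if i not in chosen]
        yield support, held_out

# Sweep k = 0..10 over a hypothetical cohort of 12 samples for one drug.
sizes = []
for k in range(11):
    for support, held_out in k_shot_splits(12, k, n_repeats=3):
        # A real run would adapt the pre-trained model on `support` here and
        # compute a correlation between predictions and responses on `held_out`.
        sizes.append((k, len(support), len(held_out)))
print(sizes[:3], sizes[-3:])
```

Keeping the two index sets disjoint means that the samples used for adaptation never contribute to the reported accuracy.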

Extended Data Fig. 5 Interpreting the TCRP model with the framework of Local Interpretable Model-Agnostic Explanations (LIME).

See Methods.

Cite this article

Ma, J., Fong, S.H., Luo, Y. et al. Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients. Nat Cancer 2, 233–244 (2021).
