Data availability
The data used to obtain the results is available on GitHub at https://github.com/PigeonMark/PanPep-Shuffled-Negatives and on Zenodo at https://doi.org/10.5281/zenodo.7798691.
Code availability
All scripts used to obtain the results are available on GitHub at https://github.com/PigeonMark/PanPep-Shuffled-Negatives and on Zenodo at https://doi.org/10.5281/zenodo.7798691.
Author information
Contributions
C.D. performed the study. C.D. and P.M. wrote the manuscript. W.B., K.L. and P.M. conceived and supervised the study. W.B., P.M. and K.L. revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
K.L. and P.M. hold shares in ImmuneWatch, an immunoinformatics company.
Peer review
Peer review information
Nature Machine Intelligence thanks Geir Kjetil Sandve for their contribution to the peer review of this work. Primary Handling Editor: Dr Liesbeth Venema, in collaboration with the Nature Machine Intelligence Editorial Team.
Supplementary information
Supplementary Information: Methods and data.
About this article
Cite this article
Dens, C., Laukens, K., Bittremieux, W. et al. The pitfalls of negative data bias for the T-cell epitope specificity challenge. Nat Mach Intell 5, 1060–1062 (2023). https://doi.org/10.1038/s42256-023-00727-0
DOI: https://doi.org/10.1038/s42256-023-00727-0
This article is cited by
- Adaptive immune receptor repertoire analysis. Nature Reviews Methods Primers (2024)
- Reply to: The pitfalls of negative data bias for the T-cell epitope specificity challenge. Nature Machine Intelligence (2023)