Mapping the space of chemical reactions using attention-based neural networks


Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction centre and the distinction between reactants and reagents. Here, we show that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions. Our best model reaches a classification accuracy of 98.2%. We also show that the learned representations can be used as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints. The insights into chemical reaction space enabled by our learned fingerprints are illustrated by an interactive reaction atlas providing visual clustering and similarity searching.
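The similarity searching mentioned above can be illustrated with a minimal sketch. It assumes each reaction has already been turned into a fixed-length numeric fingerprint; the toy vectors, the function names and the choice of cosine similarity are all illustrative assumptions, not the authors' implementation or the rxnfp API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbours(fingerprints, query, k=3):
    """Return (index, similarity) pairs for the k fingerprints closest to query."""
    scored = [(cosine_similarity(fp, query), i) for i, fp in enumerate(fingerprints)]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, sim) for sim, i in scored[:k]]

# Hypothetical 4-dimensional vectors standing in for learned reaction fingerprints.
toy_fingerprints = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
hits = nearest_neighbours(toy_fingerprints, [1.0, 0.0, 0.0, 0.0], k=2)
```

A brute-force scan like this is only practical for small collections; for large reaction sets an approximate or GPU-accelerated index would likely be needed.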

A preprint version of the article is available at arXiv.

Fig. 1: Data representation and task.
Fig. 2: BERT reaction classification model.
Fig. 3: Attention weights interpretation.
Fig. 4: Reaction atlases.
Fig. 5: Nearest-neighbour queries.

Data availability

The Schneider 50k dataset is publicly available (ref. 25). We provide a new reaction dataset (USPTO 1k TPL), derived from the work of Lowe (ref. 50), containing the 1,000 most common reaction templates as classes. It can be accessed through The commercial Pistachio (version 191118) dataset can be obtained from NextMove Software (ref. 38). Pistachio relies on LeadMine (ref. 58) to text-mine patent data. The dataset comes with reaction classes assigned using NameRxn (
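The idea of using the most common reaction templates as classes can be sketched as follows. This is a minimal illustration that assumes each reaction already carries a template identifier; the template strings and the function name are hypothetical, and the actual template extraction used to build USPTO 1k TPL is not shown here.

```python
from collections import Counter

def label_by_top_templates(templates, n_classes):
    """Map the n_classes most frequent templates to integer class labels.

    Returns the template-to-class mapping and a list of
    (reaction_index, class_label) pairs for reactions whose template was kept.
    """
    counts = Counter(templates)
    top = [tpl for tpl, _ in counts.most_common(n_classes)]
    class_of = {tpl: idx for idx, tpl in enumerate(top)}
    # Reactions with a rarer template are dropped from the labelled set.
    kept = [(i, class_of[t]) for i, t in enumerate(templates) if t in class_of]
    return class_of, kept

# Hypothetical template identifiers, one per reaction.
toy_templates = ["T_amide", "T_suzuki", "T_amide", "T_boc", "T_amide", "T_suzuki"]
class_of, kept = label_by_top_templates(toy_templates, n_classes=2)
```

With `n_classes=1000` on a real template column, the same function would yield one label per retained reaction.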

Code availability

The rxnfp code and the experiments on the public datasets, as well as an interactive TMAP, are provided at


References

  1. Grzybowski, B. A., Bishop, K. J. M., Kowalczyk, B. & Wilmer, C. E. The ‘wired’ universe of organic chemistry. Nat. Chem. 1, 31–36 (2009).

  2. Coley, C. W., Rogers, L., Green, W. H. & Jensen, K. F. Computer-assisted retrosynthesis based on molecular similarity. ACS Cent. Sci. 3, 1237–1245 (2017).

  3. IBM RXN for Chemistry (IBM);

  4. Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H. & Jensen, K. F. Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci. 3, 434–443 (2017).

  5. Schwaller, P., Gaudin, T., Lanyi, D., Bekas, C. & Laino, T. ‘Found in translation’: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem. Sci. 9, 6091–6098 (2018).

  6. Schwaller, P. et al. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent. Sci. 5, 1572–1583 (2019).

  7. Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).

  8. Thakkar, A., Kogej, T., Reymond, J.-L., Engkvist, O. & Bjerrum, E. J. Datasets and their influence on the development of computer assisted synthesis planning tools in the pharmaceutical domain. Chem. Sci. 11, 154–168 (2020).

  9. Schwaller, P. et al. Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy. Chem. Sci. 11, 3316–3325 (2020).

  10. Vaucher, A. C. et al. Automated extraction of chemical synthesis actions from experimental procedures. Nat. Commun. 11, 3601 (2020).

  11. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 5998–6008 (NIPS, 2017).

  12. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference on North American Chapter of the Association for Computational Linguistics 4171–4186 (Association for Computational Linguistics, 2019).

  13. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).

  14. Weininger, D., Weininger, A. & Weininger, J. L. SMILES. 2. Algorithm for generation of unique SMILES notation. J. Chem. Inf. Comput. Sci. 29, 97–101 (1989).

  15. Schwaller, P., Hoover, B., Reymond, J.-L., Strobelt, H. & Laino, T. Unsupervised attention-guided atom-mapping. Preprint at (2020).

  16. Toniato, A., Schwaller, P., Cardinale, A., Geluykens, J. & Laino, T. Unassisted noise-reduction of chemical reactions data sets. Preprint at (2020).

  17. Miyaura, N. & Suzuki, A. Palladium-catalyzed cross-coupling reactions of organoboron compounds. Chem. Rev. 95, 2457–2483 (1995).

  18. NameRXN (Nextmove Software);

  19. Kraut, H. et al. Algorithm for reaction classification. J. Chem. Inf. Model. 53, 2884–2895 (2013).

  20. Daylight Theory Manual Ch. 5 (Daylight Chemical Information Systems);

  21. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).

  22. Chen, L. & Gasteiger, J. Organic reactions classified by neural networks: Michael additions, Friedel–Crafts alkylations by alkenes, and related reactions. Angew. Chem. Int. Ed. 35, 763–765 (1996).

  23. Chen, L. & Gasteiger, J. Knowledge discovery in reaction databases: landscaping organic reactions by a self-organizing neural network. J. Am. Chem. Soc. 119, 4033–4042 (1997).

  24. Satoh, H. et al. Classification of organic reactions: similarity of reactions based on changes in the electronic features of oxygen atoms at the reaction sites. J. Chem. Inf. Comput. Sci. 38, 210–219 (1998).

  25. Schneider, N., Lowe, D. M., Sayle, R. A. & Landrum, G. A. Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity. J. Chem. Inf. Model. 55, 39–53 (2015).

  26. Gao, H. et al. Using machine learning to predict suitable conditions for organic reactions. ACS Cent. Sci. 4, 1465–1476 (2018).

  27. Ghiandoni, G. M. et al. Development and application of a data-driven reaction classification model: comparison of an electronic lab notebook and medicinal chemistry literature. J. Chem. Inf. Model. 59, 4167–4187 (2019).

  28. Schneider, N., Stiefl, N. & Landrum, G. A. What’s what: the (nearly) definitive guide to reaction role assignment. J. Chem. Inf. Model. 56, 2336–2346 (2016).

  29. ChemAxon (ChemAxon);

  30. Duvenaud, D. K. et al. Convolutional networks on graphs for learning molecular fingerprints. In Proc. 28th International Conference on Neural Information Processing Systems Vol. 2, 2224–2232 (NIPS, 2015).

  31. Wei, J. N., Duvenaud, D. & Aspuru-Guzik, A. Neural networks for the prediction of organic chemistry reactions. ACS Cent. Sci. 2, 725–732 (2016).

  32. Sandfort, F., Strieth-Kalthoff, F., Kühnemund, M., Beecks, C. & Glorius, F. A structure-based platform for predicting chemical reactivity. Chem 6, 1379–1390 (2020).

  33. Probst, D. & Reymond, J.-L. Visualization of very large high-dimensional data sets as minimum spanning trees. J. Cheminf. 12, 1–13 (2020).

  34. Jorner, K., Brinck, T., Norrby, P.-O. & Buttar, D. Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies. Chem. Sci. (2020).

  35. Schwaller, P., Vaucher, A. C., Laino, T. & Reymond, J.-L. Prediction of chemical reaction yields using deep learning. Preprint at (2020).

  36. Socher, R. et al. Recursive deep models for semantic compositionality over a sentiment treebank. In Proc. 2013 Empirical Methods in Natural Language Processing 1631–1642 (Association for Computational Linguistics, 2013).

  37. Warstadt, A., Singh, A. & Bowman, S. R. Neural network acceptability judgments. Trans. Assoc. Comput. Linguist. 7, 625–641 (2019).

  38. Pistachio (Nextmove Software);

  39. Johnson, J., Douze, M. & Jégou, H. Billion-scale similarity search with GPUs. IEEE Trans. Big Data (2019);

  40. Landrum, G. et al. rdkit/rdkit: 2019_03_4 (q1 2019) release (Zenodo, 2019);

  41. Wei, J.-M., Yuan, X.-J., Hu, Q.-H. & Wang, S.-Q. A novel measure for evaluating classifiers. Exp. Syst. Appl. 37, 3799–3809 (2010).

  42. Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta Protein Struct. 405, 442–451 (1975).

  43. Gorodkin, J. Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 28, 367–374 (2004).

  44. Willighagen, E. L. et al. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas and substructure searching. J. Cheminf. 9, 33 (2017).

  45. Capecchi, A., Probst, D. & Reymond, J.-L. One molecular fingerprint to rule them all: drugs, biomolecules and the metabolome. J. Cheminf. 12, 1–15 (2020).

  46. Probst, D. & Reymond, J.-L. FUn: a framework for interactive visualizations of large, high-dimensional datasets on the web. Bioinformatics 34, 1433–1435 (2017).

  47. Carey, J. S., Laffan, D., Thomson, C. & Williams, M. T. Analysis of the reactions used for the preparation of drug candidate molecules. Org. Biomol. Chem. 4, 2337–2347 (2006).

  48. RXNO Ontology (RSC);

  49. Schneider, N., Lowe, D. M., Sayle, R. A., Tarselli, M. A. & Landrum, G. A. Big data from pharmaceutical patents: a computational analysis of medicinal chemists’ bread and butter. J. Med. Chem. 59, 4385–4402 (2016).

  50. Lowe, D. Chemical reactions from US patents (1976–Sep2016) (Figshare, 2017);

  51. Coley, C. W., Green, W. H. & Jensen, K. F. RDChiral: an RDKit wrapper for handling stereochemistry in retrosynthetic template extraction and application. J. Chem. Inf. Model. 59, 2529–2537 (2019).

  52. Klein, G., Kim, Y., Deng, Y., Senellart, J. & Rush, A. M. OpenNMT: Open-Source Toolkit for Neural Machine Translation (Association for Computational Linguistics, 2017).

  53. BERT code (GitHub);

  54. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 8024–8035 (Curran Associates, 2019).

  55. Wolf, T. et al. Transformers: state-of-the-art natural language processing. Preprint at (2019).

  56. Probst, D. & Reymond, J.-L. SmilesDrawer: parsing and drawing SMILES-encoded molecular structures using client-side javascript. J. Chem. Inf. Model. 58, 1–7 (2018).

  57. Haghighi, S., Jasemi, M., Hessabi, S. & Zolanvari, A. PyCM: multiclass confusion matrix library in Python. J. Open Source Softw. 3, 729 (2018).

  58. Lowe, D. M. & Sayle, R. A. LeadMine: a grammar and dictionary driven approach to entity recognition. J. Cheminf. 7, 1–9 (2015).

  59. RXNFP Repository (v0.0.7) (Zenodo, accessed 17 November 2020);



Acknowledgements

D.P. and J.-L.R. acknowledge financial support by the Swiss National Science Foundation (NCCR TransCure). We thank L. Rudin for the careful proofreading of our manuscript.

Author information




Contributions

P.S. and A.C.V. conceived the initial idea for the project. P.S., D.P., A.C.V. and V.H.N. trained models, performed the classification experiments and analysed the results. P.S. investigated the reaction fingerprints and wrote the code base. P.S., D.P. and D.K. worked on the reaction atlases. The project was supervised by T.L. and J.-L.R. All authors took part in discussions and contributed to the writing of the manuscript.

Corresponding author

Correspondence to Philippe Schwaller.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1–3, Figs. 1–3 and Tables 1–5.

Supplementary Data

Interactive reaction atlas.

About this article

Cite this article

Schwaller, P., Probst, D., Vaucher, A.C. et al. Mapping the space of chemical reactions using attention-based neural networks. Nat Mach Intell 3, 144–152 (2021).
