In addition to the papers on generative models mentioned by Schneider et al.1 and by Walters and Murcko of Relay Therapeutics2, we would like to point out two other relevant reports: the first description of generative adversarial networks (GAN) published by A.A. & A.Z. and colleagues3 of Insilico Medicine and the variational autoencoder (VAE) model from A.A.-G.4 of the University of Toronto/Vector Institute for Artificial Intelligence/Canadian Institute for Advanced Research, Toronto, Ontario, Canada. Hongming Chen’s group at AstraZeneca has also been instrumental in contributing innovative papers to this field5,6.

In 2018, following the proposal of the first objective-reinforced generative model (A.A.-G.)7, our team came together to develop generative tensorial reinforcement learning (GENTRL), an acceleration workflow for designing drugs against a nominated kinase using a defined set of criteria. Our objective was to design, synthesize and test small-molecule inhibitors produced by a generative model in less time than was previously possible with traditional drug discovery. The project resulted in the work submitted to Nature Biotechnology in November 20188 and published last September.

A comprehensive commentary on the paper was published by AstraZeneca scientists6. The critique of Murcko and Walters, and many similar online commentaries, fails to recognize that, as we state in our paper, our goal was to provide the first demonstration of the effectiveness of a novel generative approach; as such, in-depth validation of the molecules produced was not the main goal of our paper. We readily acknowledge that the compounds require further optimization.

The way generative models work when there is a given template molecule is similar to the way they work with images. If the training images include a picture of an individual, even if generation conditions such as age and sex are changed, the generated images will look similar to that original image. A similar issue arises with respect to how GENTRL-like systems generate molecules with the desired generation conditions. Unlike pictures, however, small molecules are discrete structures, in which small changes can lead to dramatic differences in function. Compound 1 in our GENTRL paper8 is a unique and unpatented molecule. Murcko and Walters highlight the fact that compound 1 is similar to ponatinib and consequently is likely to have a similar selectivity profile. But similarity is in the eye of the beholder.

We agree that the selectivity of compound 1 may be a challenge and that it should be tested against the additional kinases proposed by Murcko and Walters. However, compound 1 demonstrated a rather good selectivity index versus discoidin domain receptor 2 (DDR2; IC50(DDR2)/IC50(DDR1) > 20), whereas ponatinib possesses nearly the same inhibitory activity for both DDR1 and DDR2 (9 and 9.4 nM)9. This illustrates that compounds that differ by isosterically equivalent fragments can demonstrate rather different profiles.
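As a rough illustration of the arithmetic behind this comparison, the selectivity index is simply the ratio of the two IC50 values. The Python sketch below is illustrative only: the function name `selectivity_index` is hypothetical, and it assumes that the ponatinib values quoted above are listed in the order DDR1 (9 nM), DDR2 (9.4 nM). For compound 1, only the lower bound stated in the text (> 20) is used; no individual IC50 values are assumed.

```python
def selectivity_index(ic50_off_target_nm: float, ic50_target_nm: float) -> float:
    """Ratio of off-target to on-target IC50 (same units).

    Values well above 1 indicate selectivity for the target.
    """
    if ic50_off_target_nm <= 0 or ic50_target_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target_nm / ic50_target_nm


# Ponatinib values quoted in the text.
# Assumption: the order is DDR1 = 9 nM, DDR2 = 9.4 nM.
ponatinib_si = selectivity_index(ic50_off_target_nm=9.4, ic50_target_nm=9.0)

# Compound 1: only the lower bound reported in the text is used here.
COMPOUND_1_SI_LOWER_BOUND = 20.0

print(f"ponatinib IC50(DDR2)/IC50(DDR1): {ponatinib_si:.2f}")  # prints 1.04
print(f"compound 1 IC50(DDR2)/IC50(DDR1): > {COMPOUND_1_SI_LOWER_BOUND:.0f}")
```

Under these assumptions, ponatinib is essentially non-selective between DDR1 and DDR2 (ratio close to 1), whereas compound 1 shows at least a 20-fold preference for DDR1.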

Regarding the statement that “compound 1 is selective,” it should be emphasized that selectivity versus DDR2, as well as against the small panel of kinases provided by Eurofins, is exactly what was claimed in our paper. There were many structures generated by GENTRL that were substantially different and likely to be more selective, but these were more difficult to synthesize in the short self-imposed ‘race’ mode of our original work.

To help the community and establish a range of standards in generative chemistry, the Alliance for Artificial Intelligence in Healthcare (AAIH), co-founded by Insilico Medicine, has proposed the Molecular Sets (MOSES)—a benchmarking initiative10 addressing many concerns outlined by Murcko and Walters. A similar effort, led by BenevolentAI, is called GuacaMol.

An extension of our model has recently been accepted for the NeurIPS conference11, in which we demonstrate the performance of the model on images of people’s faces to allow rapid human validation. We look forward to having further discussions so that together, as a community, we can find a consensus on ways to assess all the exciting methods being developed at a rapid pace.

In summary, we agree with the main message of Murcko and Walters. We hope that a set of guidelines can be developed as a community effort that allows the comparison and assessment of the power of published generative models for drug discovery. We are ready to join forces with all of those interested and to make this happen as part of the MOSES initiative or other efforts like GuacaMol.