The organic chemist’s toolbox is vast, with technologies to accelerate the synthesis of novel chemical matter. The field of asymmetric catalysis is one approach to accessing new areas of chemical space and computational power is today sufficient to assist in this exploration. Unfortunately, existing techniques generally require computational expertise and are therefore underutilized in synthetic chemistry. Here we present our platform Virtual Chemist, which allows bench chemists to predict outcomes of asymmetric chemical reactions ahead of testing in the laboratory, in just a few clicks. Modular workflows facilitate the simulation of various sets of experiments, including the four realistic scenarios discussed: one-by-one design, library screening, hit optimization and substrate-scope evaluation. Catalyst candidates are screened within hours and the enantioselectivity predictions provide substantial enrichments compared to random testing. The achieved accuracies within ~1 kcal mol–1 provide opportunities for computational chemistry in the field of asymmetric catalyst design, allowing bench chemists to guide the design and discovery of asymmetric catalysts.
Organic chemistry research is vital to the discovery, optimization and large-scale production of numerous small molecules, such as novel drugs that treat life-threatening diseases. It contributes to the design of innovative materials used in modern electronics and low-power OLEDs and to the development of novel agricultural practices, cosmetics, textiles, inks and paints, to name a few1. Unfortunately, a major hurdle in the production of these complex small molecules is the challenging syntheses they often require. Although several research groups are focused on the development and optimization of new methodologies, these methodologies are often reaction-specific and generalizing them for mainstream wet laboratory chemistry requires substantial work.
For the design of novel organic synthetic methodologies to access novel compounds, chemists draw on the vast organic chemistry toolbox at their disposal; they routinely use nuclear magnetic resonance (NMR), mass spectrometry (MS) and chromatography. These complex scientific technologies are largely accessible without expert knowledge of their inner workings. For example, synthetic chemists run standard 1H NMR, 13C NMR and various 2D NMR experiments without necessarily understanding and/or manipulating the magnetic pulse sequences. By contrast, computational chemistry remains largely inaccessible to the experimental chemistry community; complex theoretical calculations are neglected since coding/programming knowledge, sometimes advanced experience, is often a prerequisite. The omission of computational techniques from the larger toolbox is regrettable, since interpreting unexpected observations2 and proposing new reaction mechanisms3,4 have been attributed, in part, to computations. With the rise of quantum mechanics (QM) methods (Hartree–Fock (HF) and density functional theory (DFT)) and molecular mechanics (MM) methods (docking and molecular dynamics), organic chemists have become aware of the power and utility of such computations. Computational experts frequently collaborate with experimentalists to rationalize the observations of the organic chemists. However, rather than only offering post facto theories, computational chemistry could prospectively hypothesize and screen organic chemistry transformations. We remain sanguine about such a possibility given the similar successful implementation of computer simulations in drug discovery5,6. After the pioneering development in 1982 of DOCK7, a structure-based drug discovery tool, an entire field of research emerged.
In fact, many computational techniques including machine learning8, molecular dynamics9, molecular docking10 and pharmacophore modelling11 are now commonplace, addressing research challenges in drug discovery. Theoretically, analogous computational techniques could tackle synthetic chemistry challenges; already, robotics12 and synthetic planning computational tools13,14 have been reported and will likely be incorporated into many chemistry laboratories soon.
Among synthetic methodologies are asymmetric transformations. While biocatalysis and the use of the chiral pool are common approaches for the synthesis of chiral molecules (for example, chiral drugs and chiral materials), their application is limited (by the substrate specificity and stability of biocatalysts and by the limited number of available chiral molecules). Asymmetric synthesis is an attractive alternative for generating chiral molecules in high quantity and purity. In practical terms, cheap, selective, synthetically accessible and green asymmetric catalysts are highly desired to shorten synthetic routes to complex small molecules. The vastness of the chemical space suggests that many organocatalysts or transition metal catalysts exist, but their discovery is challenging, tedious and physically intractable using solely traditional experimental techniques15. The exploration of the chemical space can, however, be more efficient when performed computationally. Furthermore, virtually applying identified and selected catalysts to predict stereoselectivity for a specific reaction is within reach16.
Several groups have focused on the prediction of stereoselectivity of asymmetric transformations17,18,19. Among the proposed approaches are statistical models20,21, neural networks (NNs)22,23, DFT17,18,24,25,26,27,28,29,30,31, QM/MM32 and MM-based methods (Q2MM33,34 and ACE35,36), with DFT being the most widely used. However, despite the demonstration of its feasibility, it was not until 2009 that the first use of DFT for screening a small set of asymmetric catalysts and substrates was reported37, with little work communicated since. In addition, Bootsma and Wheeler recently revealed a major potential pitfall when DFT free energy calculations are used to predict enantioselectivity38. NNs are newer on the scene, but require a plethora of data and may not be appropriate for discovering novel catalysts. Our program ACE (Asymmetric Catalyst Evaluation) combines ground state parameters of reactants and products to predict transition state (TS) geometries and enantioselectivities. Alternatively, Q2MM can be used to derive a reaction-specific TS force field (TSFF).
Generally, most software in this domain has been plagued by poor usability and time inefficiencies, although the Wiest/Norrby research group developed CatVS to begin addressing those concerns39. We suggest that organic chemists should be able to screen for potential asymmetric catalysts using computational methods. More broadly, we aim to continue to advocate for the use of virtual asymmetric catalyst discovery and design as a complement to traditional and automated asymmetric catalysis.
Here we present our efforts to develop a platform (VIRTUAL CHEMIST) that integrates all the tools, accessories and automation required for organic chemistry laboratories to design experiments, rather than rationalize data. Its application to the simulation of four catalyst discovery scenarios (discovery of catalysis through trial and error, screening of potential catalysts, catalyst optimization and investigation of catalyst scope) has demonstrated its accuracy and usefulness. Moreover, VIRTUAL CHEMIST improves on CatVS by providing access to easily customizable workflows within a graphical user interface. It allows the application of a range of methods (ACE and Q2MM), from more approximate tools that can be applied almost immediately to methods that can be fine-tuned for each specific reaction type, as illustrated for two examples39.
For use by organic chemists, the accessibility aspect of this technology must be addressed without sacrificing accuracy. Regarding accessibility: this technology should not require large computational resources, should ideally be usable on a standard desktop computer (Windows, Linux, macOS) and should be substantially faster than the experiments being simulated. We believe that this software should bring knowledge complementary to that of chemists, taking advantage of complex calculations (machine time) and years of expertise (human time). For example, chemists should be able to interact with this technology by instructing the software to account for specific desired properties (for example, protecting groups, water solubility and commercial availability of chemicals). Regarding accuracy: a difference of only 1 kcal mol–1 between diastereomeric TSs can distinguish weakly stereoselective catalysts from more selective ones. To put this margin of error in context, in the drug discovery process one often investigates molecules hitting a target with reasonable binding affinity. In this case, an accuracy of a few kcal mol–1 can differentiate between strong, weak and non-binders (for example, 4 kcal mol–1 would differentiate between a nanomolar and a micromolar enzyme inhibitor). As such, accuracy is a major challenge in asymmetric catalyst screening.
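The nanomolar-versus-micromolar comparison follows from the standard relation between a free-energy difference and a ratio of equilibrium constants, ΔΔG = RT ln(K1/K2). The short Python sketch below works through that arithmetic; the function names and the choice of 298.15 K are assumptions made for illustration and are not part of the platform.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def free_energy_gap(ratio: float, temperature_k: float = 298.15) -> float:
    """Free-energy difference (kcal/mol) matching a ratio of two equilibrium constants."""
    return R_KCAL * temperature_k * math.log(ratio)


def affinity_ratio(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Ratio of two equilibrium constants separated by delta_g_kcal (kcal/mol)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Micromolar versus nanomolar inhibitor: a 1,000-fold difference in affinity.
    print(f"1000-fold gap: {free_energy_gap(1000.0):.2f} kcal/mol")
    # Conversely, the fold-difference that corresponds to 4 kcal/mol.
    print(f"4 kcal/mol: {affinity_ratio(4.0):.0f}-fold")
```

At room temperature a 1,000-fold affinity difference corresponds to roughly 4 kcal mol–1, whereas an error of 1 kcal mol–1 already changes a ratio by a factor of about five, which is why the same absolute error is far more consequential when ranking stereoselective catalysts.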
The ultimate objective of this research programme is to deliver software that can simulate an entire organic chemistry project from beginning to end. As an example, we demonstrate the development of a Diels–Alder organocatalyst (Fig. 1).
In this scenario, we would need software to virtually prepare libraries of potential catalysts (which requires an understanding of chemistry concepts such as chirality, functional group compatibility (chemoselectivity) and similarity), to evaluate the catalytic activity of the potential catalysts and to evaluate the enantioselectivity induced by these catalysts. Ideally, a common platform would seamlessly execute all three actions without user intervention. Chemists should also be able to instruct the software through sketches using a program they are familiar with (for example, ChemDraw, IsisDraw, ChemWindow and so on).
Automation and accessory programs
To run the gamut of simulations mentioned above, several transformations and computations must be automated and concealed from the user; this is an often forgotten, yet major, challenge of this research.
We built on and expanded the user interface (UI) of our drug discovery platform FORECASTER to create a novel platform, VIRTUAL CHEMIST. This UI contains a 2D sketcher for drawing input catalysts and substrates and an easy-to-use three-dimensional graphical interface for visualizing the calculated output TS structures (Fig. 2). Additionally, resulting data is summarized in the UI (for example, the potential energies of TS structures and predicted enantioselectivities). Finally, we have made strides towards universal application by enabling the creation of modular workflows.
Preparing libraries of potential catalysts may be the first step. Previously reported programs SELECT (searches for analogues or dissimilar compounds and optimizes library diversity) and REDUCE (filters a chemical library for the presence of functional groups, such as secondary amines for organocatalysed Diels–Alder cycloaddition)40 are accessible in modular workflows. A library of synthetic analogues can also be generated using our previously reported searching and combinatorial tools FINDERS and REACT2D41. In contrast to other virtual combinatorial library tools42, these programs consider stereochemistry change during a reaction (for example, in a Mitsunobu reaction), ensuring that the asymmetric catalysts virtually screened are truly synthetically accessible.
Predicting enantioselectivity would be the next step. Generally, for each catalyst candidate, the software must compile a TS, parameterize that system and then compute energies. First, where does a TS come from? As an example, consider the diethylzinc addition to aldehydes previously investigated with Q2MM43. In this work, TS structures were provided as Cartesian coordinates using a common ‘xyz’ format. These structures (text files) could be used as a starting point for screening asymmetric catalysts without any graphical user interface or QM methods. As shown in Fig. 2, provided Cartesian coordinates yield TS templates that are subsequently used to assemble realistic TS structures for a series of catalysts and substrates.
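For readers unfamiliar with the 'xyz' format mentioned above, the following Python sketch parses such a text file into a list of atoms. It illustrates only the generic, widely used layout (atom count, comment line, then one element symbol and three Cartesian coordinates per line); it is not a description of how CONSTRUCTS reads its templates, and the water geometry is a placeholder rather than a TS from this work.

```python
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    element: str
    x: float
    y: float
    z: float


def parse_xyz(text: str) -> Tuple[str, List[Atom]]:
    """Parse a single-structure xyz file and return (comment, atoms)."""
    lines = text.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("an xyz file needs an atom count and a comment line")
    n_atoms = int(lines[0].split()[0])   # line 1: number of atoms
    comment = lines[1].strip()           # line 2: free-text comment
    atom_lines = lines[2:2 + n_atoms]    # remaining lines: element x y z
    if len(atom_lines) != n_atoms:
        raise ValueError("fewer atom lines than the declared atom count")
    atoms = []
    for line in atom_lines:
        parts = line.split()
        atoms.append(Atom(parts[0], float(parts[1]), float(parts[2]), float(parts[3])))
    return comment, atoms


# Placeholder geometry for illustration only (coordinates in angstroms).
EXAMPLE_XYZ = """3
water, illustrative geometry only
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""

if __name__ == "__main__":
    comment, atoms = parse_xyz(EXAMPLE_XYZ)
    print(comment, [atom.element for atom in atoms])
```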
All of these steps were successfully integrated into a single program, CONSTRUCTS (Converting and Orienting Native Structures on Templates of Rotatable and Unoptimized Chemical Transition States). In short, CONSTRUCTS assembles TS structures with reasonable geometry (later optimized by ACE) from simple text files (TSs of simple models) and 2D sketches of catalysts and substrates (Fig. 2a). For simplicity, we have also included built-in templates for several reactions, removing the need to acquire TS coordinates for them.
Our software ACE, which predicts stereoselectivity of reactions, relies on MM3 force field (FF) parameters. However, in the MM3 FF, metals are not parameterized, precluding the use of previous versions of ACE for metal-catalysed additions. Parameters can now be fully developed in a user-friendly manner using our program QUEMIST (QUantum Energy of Molecules Inducing Structural Transformations). QUEMIST enables single point energy calculations, geometry optimizations and Hessian calculations using HF methods to automate the generation of FF parameters using the method developed by Seminario44 and improved by Allen et al.45. We acknowledge that using HF methods for metal-containing systems is far from ideal. However, the results presented below are encouraging and support their usage here. Importantly, the parameters only need to be developed once and can subsequently be used to screen libraries of catalysts.
Q2MM FFs were reported for some reactions and made available to the community (diethylzinc addition to aldehydes43, asymmetric dihydroxylation46 and rhodium-catalysed hydrogenation of activated alkenes47). However, these TSFFs require access to an external MM package for TS optimization. We therefore decided to take advantage of ACE, which includes all of these MM routines and added the option to use Q2MM-derived TSFFs, thus improving the usability of Q2MM.
Finally, we need to evaluate the catalytic activity. While some molecules may be predicted to be stereoselective, they may not be reactive (that is, they would not catalyse the reaction). To evaluate the catalytic activity of a set of molecules, various reactivity parameters, including a nucleophilicity index, may now be computed using our program QUEMIST, mentioned above, embedded into SMART. SMART had been developed to compute a number of molecular properties and descriptors, such as molecular weight and the presence of some functional groups40. The pseudocode of all the programs used in this study is available in the Supplementary Sections I.2–I.7.
To assess the applicability of these tools, we envisioned four different realistic scenarios. First, a chemist may draw catalysts one by one and test the potential stereoinduction. Second, a chemist may screen a large database of chiral molecules to identify novel chemical series. Third, a chemist may search for analogues as part of the lead optimization of a hit molecule (with analogy to drug discovery). Finally, a chemist may assess the substrate-scope of a specific catalyst.
Application of the software tools to scenario 1
A chemist may want to test one catalyst at a time and virtually identify the most promising one, interacting directly with the platform. In this scenario, each catalyst may be drawn using the provided sketcher; TS templates are either available directly or may be built from literature data (see the tutorial provided as Supplementary Data 1 for examples). We tested this scenario on over 350 reactions from seven reaction classes (Fig. 3, complete set given in Supplementary Tables 1–8) and compared the results with random predictions to assess the accuracy of the methodology (Fig. 4).
To evaluate accuracy, we first visualized the proposed TS structures (Fig. 3). As observed with previous versions, ACE-generated TS structures resemble those previously proposed35,36. We then investigated whether the stereoselectivity predictions were accurate. The error of the prediction of ΔΔG‡ between the major diastereomeric TSs was computed and compared to a random assignment (Fig. 4). We note that none of the FFs used by ACE in these tests have been trained specifically on these reactions. Since Q2MM TSFFs have been derived to complement MM3* and ACE is using MM3, the accuracy presented here may underestimate the accuracy of the TSFFs.
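The comparison described above can be sketched in a few lines of Python: a mean unsigned error between predicted and experimental ΔΔG‡ values, set against a baseline in which the predictions are randomly reassigned among the reactions. The shuffling baseline is an assumption made for illustration; the exact random-assignment protocol used for Fig. 4 is not specified here, and the function names are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence


def mean_unsigned_error(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Mean absolute difference between predicted and experimental values."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - e) for p, e in zip(predicted, experimental))


def shuffled_baseline(predicted: Sequence[float], experimental: Sequence[float],
                      n_trials: int = 1000, seed: int = 0) -> float:
    """Average error obtained when predictions are randomly reassigned to reactions.

    Assumed baseline for illustration only, not the protocol used in this work.
    """
    rng = random.Random(seed)
    shuffled = list(predicted)
    errors = []
    for _ in range(n_trials):
        rng.shuffle(shuffled)
        errors.append(mean_unsigned_error(shuffled, experimental))
    return mean(errors)


if __name__ == "__main__":
    # Made-up values in kcal/mol, for illustration only.
    experimental = [0.2, 0.9, 1.5, 2.1, 2.8]
    predicted = [0.5, 0.7, 1.9, 1.6, 2.5]
    print("error of predictions:", mean_unsigned_error(predicted, experimental))
    print("shuffled baseline:", shuffled_baseline(predicted, experimental))
```

A method is informative to the extent that its error falls clearly below the baseline obtained on the same set of reactions.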
As can be seen in Fig. 4, the overall average error ranges between 0.94 and 0.97 kcal mol–1 (over five runs). This ~1.0 kcal mol–1 value, often referred to as chemical accuracy, is the gold standard in quantum chemistry and catalysis48. With this accuracy, the platform can distinguish poor asymmetric catalysts (0% e.e., ΔΔG‡ ~0 kcal mol–1) from good asymmetric catalysts (90% e.e., ΔΔG‡ ~1.4 kcal mol–1) and good asymmetric catalysts from excellent asymmetric catalysts (99% e.e., ΔΔG‡ ~2.8 kcal mol–1). It is noteworthy that some of the catalysts used in this set have been reported to produce various enantioselectivities depending on conditions (such as acid cocatalyst, solvent and temperature, see for example ref. 49). Although ACE considers solvent (implicit model) and temperature (Boltzmann population), manipulating the two parameters did not improve accuracy. The nature of the acid cocatalyst in the Diels–Alder reaction was not considered. ACE produces a similar average mean unsigned error (within 0.2 kcal mol–1) whether using the original MM3 implementation or the Q2MM-generated TSFF.
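The link between ΔΔG‡ and e.e. can be made explicit with a minimal Python sketch. It assumes kinetic control with a single pair of competing diastereomeric TSs, so that the product ratio is exp(ΔΔG‡/RT) and e.e. = tanh(ΔΔG‡/2RT); this is a textbook simplification rather than the Boltzmann-population treatment used by ACE, and the function names are hypothetical. The result depends on the temperature chosen, so computed values can differ somewhat from the rounded figures quoted above.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ee_from_ddg(ddg_kcal: float, temperature_k: float = 298.15) -> float:
    """Enantiomeric excess (fraction, 0 to 1) for a given ddG in kcal/mol.

    Two-TS model: ratio = exp(ddG/RT), and (ratio - 1)/(ratio + 1) = tanh(ddG/2RT).
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature_k))


def ddg_from_ee(ee: float, temperature_k: float = 298.15) -> float:
    """ddG (kcal/mol) corresponding to an enantiomeric excess given as a fraction."""
    if not 0.0 <= ee < 1.0:
        raise ValueError("ee must satisfy 0 <= ee < 1")
    return R_KCAL * temperature_k * math.log((1.0 + ee) / (1.0 - ee))


if __name__ == "__main__":
    for temperature in (195.15, 273.15, 298.15):  # assumed example temperatures (K)
        print(f"T = {temperature} K: 90% e.e. <-> {ddg_from_ee(0.90, temperature):.2f} kcal/mol")
```

Because e.e. saturates towards 100% as ΔΔG‡ grows, an error of about 1 kcal mol–1 matters most for catalysts of low to moderate selectivity.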
A closer look at the epoxidation reaction (Fig. 4b) shows that the most weakly stereoselective catalysts (for example, ΔΔG‡ < 1.0 kcal mol–1) were predicted to induce weak stereoselectivity, while the most strongly stereoselective (for example, ΔΔG‡ > 2.0 kcal mol–1) were predicted to induce strong stereoselectivity.
We investigated the false positives and false negatives that, in large part, result from poor parameters in the MM3 force field rather than intrinsic problems in the methodologies. For example, sugar derivatives such as 6, conjugated systems (possessing an aniline nitrogen and axial chirality) such as 3, sulfonamides (5), silylethers (4), polycyclic compounds (8, 9) and complex phosphine ligands (10) are not well-parameterized in MM3 (Fig. 5). In particular, phenyl sulfonamides have a very particular torsional energy profile (although MM3 parameters for alkyl sulfonamides have been reported50), while phosphines can adopt different cone angles51. Efforts to develop a FF with large applicability domain are ongoing in our laboratory to address this issue52,53,54,55.
Overall, the data demonstrated that this platform can be used to retrospectively evaluate asymmetric catalysts through interaction with chemists and prompted us to start a larger virtual screening study.
Application of the software tools to scenario 2
A chemist may be looking for a new chemical series to serve as catalysts for a known reaction. As examples, the Shi epoxidation and organocatalysed Diels–Alder reaction were used. These two well-characterized reactions were chosen because their known, highly selective catalysts are few in number. As a result, we expected library filtering to generate decoys and attempted to recover the known molecules embedded in this list.
A library of approximately 140,000 chiral amines was assembled from the ZINC database56 for the Diels–Alder reaction and the workflow shown in Fig. 6a was assembled. Molecular descriptors were computed for these molecules and used to extract only those of interest (molecular mass < 500, uncharged compounds, only secondary amines, aldehydes and other reactive functional groups removed). Then, any molecules too similar to known catalysts (for example, proline methyl ester in organocatalysed reactions) were removed since the objective was to discover new chemical series. At this stage, nearly 10,000 potential catalysts were selected. SELECT was used to remove analogues and pick the most diverse molecules (for optimal computing time). To ensure that no duplicates were left, our program DIVERSE was applied. A total of 1,307 candidate catalysts remained for screening.
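The descriptor-based filtering step can be pictured with the Python sketch below. The `Candidate` record, its field names and the similarity cutoff of 0.7 are hypothetical and chosen for illustration; they do not reflect the interfaces of the platform's programs, and only the mass, charge, secondary-amine and reactive-group criteria are taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    molecular_mass: float
    formal_charge: int
    is_secondary_amine: bool
    has_reactive_group: bool         # for example, an aldehyde
    max_similarity_to_known: float   # 0 to 1, highest similarity to any known catalyst


def filter_library(candidates: Iterable[Candidate],
                   max_mass: float = 500.0,
                   similarity_cutoff: float = 0.7) -> List[Candidate]:
    """Keep uncharged secondary amines below the mass limit that are free of
    reactive groups and not too similar to known catalysts.

    similarity_cutoff is an assumed value; the text does not state one.
    """
    kept = []
    for c in candidates:
        if c.molecular_mass >= max_mass:
            continue
        if c.formal_charge != 0:
            continue
        if not c.is_secondary_amine or c.has_reactive_group:
            continue
        if c.max_similarity_to_known >= similarity_cutoff:
            continue
        kept.append(c)
    return kept


if __name__ == "__main__":
    # Made-up entries for illustration only.
    library = [
        Candidate("amine_A", 212.3, 0, True, False, 0.35),
        Candidate("amine_B", 640.8, 0, True, False, 0.20),   # too heavy
        Candidate("amine_C", 180.2, 1, True, False, 0.10),   # charged
        Candidate("amine_D", 150.2, 0, True, True, 0.15),    # reactive group
        Candidate("amine_E", 129.2, 0, True, False, 0.95),   # too close to a known catalyst
    ]
    print([c.name for c in filter_library(library)])
```

Applying the cheap descriptor filters before any TS calculation keeps the expensive steps for the candidates that remain.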
The evaluation of the 1,307 chiral secondary amines was carried out in two steps. A second workflow (Fig. 6b) filtered molecules for their reactivity. It is well established that some amines are more reactive (basic and/or nucleophilic) than others57. In this workflow, various reactivity parameters, including a nucleophilicity index, were computed using our program QUEMIST. Subsequently, REDUCE filtered out molecules predicted to be less reactive than proline methyl ester, a known catalyst for the Diels–Alder reaction. CONSTRUCTS processed the remaining 798 molecules to assemble the TS structures that were finally used by ACE to compute stereoselectivity. These calculations were completed in ten days using a single core. Six known catalysts were added to the library to assess the ability of ACE to recover them (Fig. 6c).
The same overall process was applied to the search for Shi epoxidation catalysts starting from chiral ketones (very few in available chiral chemical databases) complemented with chiral secondary alcohols converted into chiral ketones using our program REACT2D. Eighteen known stereoselective catalysts were added to the library (Fig. 6c).
Most of the known stereoselective catalysts are ranked high, as shown in Fig. 6c (area under the receiver operating characteristic curve (AUROC): 0.79 for Shi epoxidation and 0.92 for Diels–Alder). The evaluation of the program in this second scenario suggests that our platform can virtually screen numerous chemicals and discover novel chemical series of asymmetric catalysts. In addition, the options to use these programs in workflows enable chemists to guide the platform towards novel chemical series with specific features and to reduce chemical compatibility issues.
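For readers less familiar with this metric, the AUROC can be computed directly from a ranked list as the probability that a randomly chosen known catalyst is scored above a randomly chosen decoy, with ties counted as one half. The Python sketch below uses this pairwise definition; taking the predicted selectivity as the ranking score is an assumption made for illustration, and the data are invented.

```python
from typing import Sequence


def auroc(scores: Sequence[float], is_known: Sequence[bool]) -> float:
    """Area under the ROC curve from pairwise comparisons of known catalysts and decoys."""
    known = [s for s, flag in zip(scores, is_known) if flag]
    decoys = [s for s, flag in zip(scores, is_known) if not flag]
    if not known or not decoys:
        raise ValueError("need at least one known catalyst and one decoy")
    wins = 0.0
    for k in known:
        for d in decoys:
            if k > d:
                wins += 1.0      # known catalyst ranked above the decoy
            elif k == d:
                wins += 0.5      # ties count as half
    return wins / (len(known) * len(decoys))


if __name__ == "__main__":
    # Made-up scores (for example, predicted ddG in kcal/mol); True marks a known catalyst.
    scores = [2.4, 2.1, 1.9, 1.2, 0.8, 0.3]
    flags = [True, False, True, False, False, False]
    print(f"AUROC = {auroc(scores, flags):.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to all known catalysts ranked above every decoy.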
Application of the software tools to scenario 3
A chemist may have a hit molecule (for example, from scenario 2) and will look for analogues with improved selectivity. We used a detailed study by Gerosa et al.58 to simulate this scenario. In that work, chiral pyrrolidine derivatives were synthesized and tested as organocatalysts for the Diels–Alder cycloaddition after the core scaffold was identified as a promising candidate. As shown in Fig. 7, this research project can be simulated within a single workflow. Imines are synthesized and subsequently reacted with a chiral dipolarophile to make three potential diastereomers. These pyrrolidines are then assessed in both endo-Diels–Alder and exo-Diels–Alder cycloadditions. In practice, each reaction step requires extensive experimental work to isolate and characterize the stereoisomers.
As seen in Fig. 7, this virtual lead optimization had a mean unsigned error as low as 0.33 kcal mol–1. The most stereoselective catalysts predicted by ACE were the best (endo) and second best (exo) experimentally. This study was completed in just a few days on a standard Windows PC and could be extended to hundreds of analogues.
Application of the software tools to scenario 4
A chemist may evaluate the potential substrate-scope of a given catalyst. This last set of calculations was done using (DHQD)2PHAL, a now commercially available catalyst (in AD-mix α) for asymmetric dihydroxylation. The catalyst was applied virtually to 25 substrates that had previously been tested experimentally, and the predictions were compared with the experimental data (Fig. 8).
Overall, this last simulation suggests that the catalyst would be highly enantioselective (≥97% e.e.) on approximately 25% of the substrates and on approximately 20% it would be poorly selective (≤40% e.e.), in excellent agreement with the experiments. However, we observed a poorer reproducibility for dihydroxylation (large standard deviation over multiple runs) than with the other reactions (Supplementary Table 18). This can be explained by the significantly larger size and flexibility of the catalysts used in dihydroxylation (Fig. 5) and suggests a limitation of the approach. More time and computational resources may be required to search the conformational space of such systems adequately.
Three substrates are consistently poorly predicted (over five runs). One of these failures can be attributed to the poor parameterization of sulfur-containing groups (tosylate in this case). The other two are a cis olefin (the FF parameters were developed using a trans olefin) and a naphthalene derivative which contributes significant π–π interactions with the catalyst. Interestingly, the predicted average enantioselectivity varies from 67 to 76% e.e. over five runs, compared with 73.6% e.e. experimentally, and the overall correlation with experiments is good (r² varies from 0.51 to 0.64).
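The two summary statistics quoted above, the average e.e. and the squared correlation coefficient, can be reproduced for any set of predictions with the short Python sketch below. It assumes that the reported value is the square of the Pearson correlation coefficient, which is not stated explicitly; the function name and the numbers are illustrative only.

```python
from statistics import mean
from typing import Sequence


def pearson_r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Square of the Pearson correlation coefficient between x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant data")
    return (s_xy * s_xy) / (s_xx * s_yy)


if __name__ == "__main__":
    # Made-up e.e. values (%), for illustration only.
    predicted_ee = [95.0, 60.0, 88.0, 30.0, 99.0]
    experimental_ee = [97.0, 72.0, 80.0, 40.0, 94.0]
    print(f"mean predicted e.e.: {mean(predicted_ee):.1f}%")
    print(f"r^2: {pearson_r_squared(predicted_ee, experimental_ee):.2f}")
```

Reporting both numbers is useful because a mean e.e. close to experiment can hide compensating errors on individual substrates, which the correlation exposes.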
Our efforts to interface computational and organic chemistry have led to the creation of the VIRTUAL CHEMIST platform, which aims to aid experimental chemists in the pursuit of asymmetric synthesis projects. This platform is user friendly (designed for organic chemists) and highly customizable through the introduction of modular workflows. The power of these modular workflows and of the individual programs making up the VIRTUAL CHEMIST software suite (free for academic use) has been demonstrated through the in-depth analysis of four realistic scenarios which could comprise various asymmetric synthesis projects.
We believe that every computational approach carries its own caveats. Here, we acknowledge that the methodology presented requires a mechanism-based TS to study, much like docking potential drug molecules requires a target structure. Additionally, the MM-based computations suffer from current FF limitations, although efforts are ongoing to overcome this obstacle. Finally, large catalytic systems provide a challenge for conformational searching in TS optimization and can lead to simulations trapped in local energy minima.
Notwithstanding these obstacles, we have demonstrated the reliability and accuracy of our platform, which is able to distinguish weak asymmetric catalysts from good asymmetric catalysts and good asymmetric catalysts from great asymmetric catalysts with chemical accuracy in most cases. With this platform, chemists could now test ideas in a matter of hours, a fraction of the time needed to synthesize and test novel catalysts.
We believe that our computational approach will lead to a more efficient catalyst discovery, as our simulated experiments allow a broader exploration of the chemical space than experiments allow, in a shorter amount of time. Moving forward, we hope that computational chemistry will have the same impact on organic chemistry as NMR, MS and chromatography had at the time of their incorporation into the chemist’s toolbox, or as structure-based design software has had in medicinal chemistry.
The VIRTUAL CHEMIST platform subversion 5679 was used throughout this work. Pseudocode for all the tools is available in the Supplementary Information (Section I). The workflow module was used to generate all the parameters and a batch script was used to run calculations. These were then ported to supercomputers for more time-efficient calculations. For scenario 1, scenario 3 and scenario 4, catalyst and substrate structures were drawn using the sketcher provided in VIRTUAL CHEMIST and/or ChemDraw and provided as input to CONSTRUCTS. The templates needed to assemble the TS structures in CONSTRUCTS are either available in VIRTUAL CHEMIST35,36 or derived from reported TSs43,47,59. For metal-containing reactions, reactant and product structures were optimized using QUEMIST (DFT, HF/def2-SVP/D2 dispersion correction) and Hessians were computed to generate FF parameters ready to be used with ACE. For scenario 2, a library of chemicals was extracted from the ZINC database for both Diels–Alder and Shi epoxidation reactions. In the case of Diels–Alder reactions, we extracted a library containing only chiral amines, after which we manipulated it as described in the main text. For the Shi epoxidation, we extracted a library of cyclic ketones but, as this proved to contain an insufficient number of molecules, we supplemented it with secondary alcohols that were further manipulated as described in the main text.
The sets of molecules used in this study (Supplementary Tables 1–8) and representative computed data (Supplementary Tables 9–18) are available as Supplementary Information. A tutorial for the use of this platform is provided as Supplementary Data 1. The programs are available (free of charge for academic research) at www.molecularforecaster.com. All the data, parameter files and structures are available on moitessier-group.mcgill.ca/software.html.
All other data is available from the authors upon reasonable request.
ACS. Where Is Organic Chemistry Used? ACS Chemistry for life www.acs.org/content/acs/en/careers/college-to-career/areas-of-chemistry/organic-chemistry.html (accessed 30 June 2020).
Wang, A. et al. Unraveling the mysterious failure of Cu/SAPO-34 selective catalytic reduction catalysts. Nat. Commun. 10, 1137 (2019).
Wang, X.-G. et al. Three-component ruthenium-catalyzed direct meta-selective C–H activation of arenes: a new approach to the alkylarylation of alkenes. J. Am. Chem. Soc. 141, 13914–13922 (2019).
Meucci, E. A. et al. Nickel(IV)-catalyzed C–H trifluoromethylation of (hetero)arenes. J. Am. Chem. Soc. 141, 12872–12879 (2019).
Durrant, J. D. & McCammon, J. A. Molecular dynamics simulations and drug discovery. BMC Biol. 9, 71 (2011).
Borhani, D. W. & Shaw, D. E. The future of molecular dynamics simulations in drug discovery. J. Comput.-Aided Mol. Des. 26, 15–26 (2012).
Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R. & Ferrin, T. E. A geometric approach to macromolecule–ligand interactions. J. Mol. Biol. 161, 269–288 (1982).
Vamathevan, J. et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18, 463–477 (2019).
Liu, X. et al. Molecular dynamics simulations and novel drug discovery. Expert Opin. Drug Discov. 13, 23–37 (2018).
Wang, G. & Zhu, W. Molecular docking for drug discovery and development: a widely used approach but far from perfect. Future Med. Chem. 8, 1707–1710 (2016).
Santosh, A. K., Alpeshkumar, K. M., Evans, C. C. & Sudha, S. Pharmacophore modeling in drug discovery and development: an overview. Med. Chem. 3, 187–197 (2007).
Sanderson, K. Automation: chemistry shoots for the moon. Nature 568, 577–579 (2019).
Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).
Zheng, S., Rao, J., Zhang, Z., Xu, J. & Yang, Y. Predicting retrosynthetic reactions using self-corrected transformer neural networks. J. Chem. Inf. Model. 60, 47–55 (2020).
Marques-Lopez, E., Herrera, R. P. & Christmann, M. Asymmetric organocatalysis in total synthesis–a trial by fire. Nat. Prod. Rep. 27, 1138–1167 (2010).
Maldonado, A. G. & Rothenberg, G. Predictive modeling in homogeneous catalysis: a tutorial. Chem. Soc. Rev. 39, 1891–1902 (2010).
Brown, J. M. & Deeth, R. J. Is enantioselectivity predictable in asymmetric catalysis? Angew. Chem. Int. Ed. 48, 4476–4479 (2009).
Houk, K. N. & Cheong, P. H. Y. Computational prediction of small-molecule catalysts. Nature 455, 309–313 (2008).
Harper, K. C. & Sigman, M. S. Predicting and optimizing asymmetric catalyst performance using the principles of experimental design and steric parameters. Proc. Natl Acad. Sci. USA 108, 2179–2183 (2011).
Reid, J. P. & Sigman, M. S. Holistic prediction of enantioselectivity in asymmetric catalysis. Nature 571, 343–348 (2019).
Norrby, P. O. Holistic models of reaction selectivity. Nature 571, 332–333 (2019).
Beker, W., Gajewska, E. P., Badowski, T. & Grzybowski, B. A. Prediction of major regio-, site-, and diastereoisomers in Diels–Alder reactions by using machine-learning: the importance of physically meaningful descriptors. Angew. Chem. Int. Ed. 58, 4515–4519 (2019).
Zahrt, A. F. et al. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 363, eaau5631 (2019).
Reid, J. P. & Sigman, M. S. Comparing quantitative prediction methods for the discovery of small-molecule chiral catalysts. Nat. Rev. Chem. 2, 290–305 (2018).
Bahmanyar, S. & Houk, K. N. The origin of stereoselectivity in proline-catalyzed intramolecular aldol reactions. J. Am. Chem. Soc. 123, 12911–12912 (2001).
Gordillo, R. & Houk, K. N. Origins of stereoselectivity in Diels-Alder cycloadditions catalyzed by chiral imidazolidinones. J. Am. Chem. Soc. 128, 3543–3553 (2006).
Ford, D. D., Nielsen, L. P. C., Zuend, S. J., Musgrave, C. B. & Jacobsen, E. N. Mechanistic basis for high stereoselectivity and broad substrate scope in the (salen)Co(III)-catalyzed hydrolytic kinetic resolution. J. Am. Chem. Soc. 135, 15595–15608 (2013).
Lin, H., Pei, W., Wang, H., Houk, K. N. & Krauss, I. J. Enantioselective homocrotylboration of aliphatic aldehydes. J. Am. Chem. Soc. 135, 82–85 (2013).
Wolf, L. M. & Denmark, S. E. A theoretical investigation on the mechanism and stereochemical course of the addition of (E)-2-butenyltrimethylsilane to acetaldehyde by electrophilic and nucleophilic activation. J. Am. Chem. Soc. 135, 4743–4756 (2013).
Lam, Y.-h. & Houk, K. N. How cinchona alkaloid-derived primary amines control asymmetric electrophilic fluorination of cyclic ketones. J. Am. Chem. Soc. 136, 9556–9559 (2014).
Lam, Y.-h. & Houk, K. N. Origins of stereoselectivity in intramolecular aldol reactions catalyzed by cinchona amines. J. Am. Chem. Soc. 137, 2116–2127 (2015).
Reid, J. P., Simón, L. & Goodman, J. M. A practical guide for predicting the stereochemistry of bifunctional phosphoric acid catalyzed reactions of imines. Acc. Chem. Res. 49, 1029–1041 (2016).
Rosales, A. R. et al. Application of Q2MM to predictions in stereoselective synthesis. Chem. Commun. 54, 8294–8311 (2018).
Hansen, E., Rosales, A. R., Tutkowski, B., Norrby, P.-O. & Wiest, O. Prediction of stereochemistry using Q2MM. Acc. Chem. Res. 49, 996–1005 (2016).
Corbeil, C. R., Thielges, S., Schwartzentruber, J. A. & Moitessier, N. Toward a computational tool predicting the stereochemical outcome of asymmetric reactions: development and application of a rapid and accurate program based on organic principles. Angew. Chem., Int. Ed. 47, 2635–2638 (2008).
Weill, N., Corbeil, C. R., De Schutter, J. W. & Moitessier, N. Toward a computational tool predicting the stereochemical outcome of asymmetric reactions: development of the molecular mechanics-based program ACE and application to asymmetric epoxidation reactions. J. Comput. Chem. 32, 2878–2889 (2011).
Schneebeli, S. T., Hall, M. L., Breslow, R. & Friesner, R. A. Quantitative DFT modeling of the enantiomeric excess for dioxirane-catalyzed epoxidations. J. Am. Chem. Soc. 131, 3965–3973 (2009).
Bootsma, A. N. & Wheeler, S. Popular integration grids can result in large errors in DFT-computed free energies. Preprint at https://doi.org/10.26434/chemrxiv.8864204.v5 (2019).
Rosales, A. R. et al. Rapid virtual screening of enantioselective catalysts using CatVS. Nat. Catal. 2, 41–45 (2019).
Therrien, E. et al. Integrating medicinal chemistry, organic/combinatorial chemistry, and computational chemistry for the discovery of selective estrogen receptor modulators with FORECASTER, a novel platform for drug discovery. J. Chem. Inf. Model. 52, 210–224 (2012).
Pottel, J. & Moitessier, N. Customizable generation of synthetically accessible, local chemical subspaces. J. Chem. Inf. Model. 57, 454–467 (2017).
van Hilten, N., Chevillard, F. & Kolb, P. Virtual compound libraries in computer-assisted drug discovery. J. Chem. Inf. Model. 59, 644–651 (2019).
Rasmussen, T. & Norrby, P.-O. Modeling the stereoselectivity of the beta-amino alcohol-promoted addition of dialkylzinc to aldehydes. J. Am. Chem. Soc. 125, 5130–5138 (2003).
Seminario, J. M. Calculation of intramolecular force fields from second-derivative tensors. Int. J. Quantum Chem. 60, 1271–1277 (1996).
Allen, A. E. A., Payne, M. C. & Cole, D. J. Harmonic force constants for molecular mechanics force fields via hessian matrix projection. J. Chem. Theory Comput. 14, 274–281 (2018).
Norrby, P.-O., Rasmussen, T., Haller, J., Strassner, T. & Houk, K. N. Rationalizing the stereoselectivity of osmium tetroxide asymmetric dihydroxylations with transition state modeling using quantum mechanics-guided molecular mechanics. J. Am. Chem. Soc. 121, 10186–10192 (1999).
Donoghue, P. J., Helquist, P., Norrby, P.-O. & Wiest, O. Prediction of enantioselectivity in rhodium catalyzed hydrogenations. J. Am. Chem. Soc. 131, 410–411 (2009).
Harvey, J. N., Himo, F., Maseras, F. & Perrin, L. Scope and challenge of computational methods for studying mechanism and reactivity in homogeneous catalysis. ACS Catal. 9, 6803–6813 (2019).
Yang, X. et al. Chiral pyrrolidine derivatives as catalysts in the enantioselective addition of diethylzinc to aldehydes. Tetrahedron: Asymmetry 10, 133–138 (1999).
Liang, G., Bays, J. P. & Bowen, J. P. Ab initio calculations and molecular mechanics (MM3) force field development for sulfonamide and its alkyl derivatives. J. Mol. Struct. THEOCHEM 401, 165–179 (1997).
Immirzi, A. & Musco, A. A method to measure the size of phosphorus ligands in coordination complexes. Inorg. Chim. Acta 25, L41–L42 (1977).
Liu, Z. et al. Elucidating hyperconjugation from electronegativity to predict drug conformational energy in a high throughput manner. J. Chem. Inf. Model. 56, 788–801 (2016).
Liu, Z., Barigye, S. J., Shahamat, M., Labute, P. & Moitessier, N. Atom types independent molecular mechanics method for predicting the conformational energy of small molecules. J. Chem. Inf. Model. 58, 194–205 (2018).
Champion, C. et al. Atom type independent modeling of the conformational energy of benzylic, allylic, and other bonds adjacent to conjugated systems. J. Chem. Inf. Model. 59, 4750–4763 (2019).
Wei, W. et al. Torsional energy barriers of biaryls could be predicted by electron-richness/deficiency of aromatic rings; advancement of molecular mechanics towards atom-type independence. J. Chem. Inf. Model. 59, 4764–4777 (2019).
Sterling, T. & Irwin, J. J. ZINC 15—Ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Hall, H. K. Correlation of the base strengths of amines. J. Am. Chem. Soc. 79, 5441–5444 (1957).
Gerosa, G. G., Spanevello, R. A., Suárez, A. G. & Sarotti, A. M. Joint experimental, in silico, and NMR studies toward the rational design of iminium-based organocatalyst derived from renewable sources. J. Org. Chem. 80, 7626–7634 (2015).
DelMonte, A. J. et al. Experimental and theoretical kinetic isotope effects for asymmetric dihydroxylation. Evidence supporting a rate-limiting ‘(3 + 2)’ cycloaddition. J. Am. Chem. Soc. 119, 9907–9908 (1997).
We thank NSERC (Discovery programme) for financial support. Calcul Québec and Compute Canada are acknowledged for generous CPU allocations.
Virtual Chemist is distributed, free of charge for academic research, by Molecular Forecaster, a company co-founded by N.M. (CEO: J.P.).
Burai Patrascu, M., Pottel, J., Pinus, S. et al. From desktop to benchtop with automated computational workflows for computer-aided design in asymmetric catalysis. Nat Catal 3, 574–584 (2020). https://doi.org/10.1038/s41929-020-0468-3