A human-machine interface for automatic exploration of chemical reaction networks

Steiner, Miguel; Reiher, Markus

doi:10.1038/s41467-024-47997-9

Download PDF

Article
Open access
Published: 01 May 2024

A human-machine interface for automatic exploration of chemical reaction networks

Nature Communications volume 15, Article number: 3680 (2024) Cite this article

1678 Accesses
20 Altmetric
Metrics details

Subjects

Abstract

Autonomous reaction network exploration algorithms offer a systematic approach to explore mechanisms of complex chemical processes. However, the resulting reaction networks are so vast that an exploration of all potentially accessible intermediates is computationally too demanding. This renders brute-force explorations unfeasible, while explorations with completely pre-defined intermediates or hard-wired chemical constraints, such as element-specific coordination numbers, are not flexible enough for complex chemical systems. Here, we introduce a STEERING WHEEL to guide an otherwise unbiased automated exploration. The STEERING WHEEL algorithm is intuitive, generally applicable, and enables one to focus on specific regions of an emerging network. It also allows for guiding automated data generation in the context of mechanism exploration, catalyst design, and other chemical optimization challenges. The algorithm is demonstrated for reaction mechanism elucidation of transition metal catalysts. We highlight how to explore catalytic cycles in a systematic and reproducible way. The exploration objectives are fully adjustable, allowing one to harness the STEERING WHEEL for both structure-specific (accurate) calculations as well as for broad high-throughput screening of possible reaction intermediates.

A computer algorithm to discover iterative sequences of organic reactions

Article 12 January 2022

Computational planning of the synthesis of complex natural products

Article 13 October 2020

Automation and computer-assisted planning for chemical synthesis

Article 18 March 2021

Introduction

An exhaustive exploration of mechanisms of chemical processes requires the automated generation of chemical reaction networks (CRNs)^{1,2,3,4,5,6,7,8,9}. CRNs typically map chemical reactions into a graph of compound and reaction nodes^10,11,12. This graph can be constructed based on automated calculations that locate transition states of reactions assumed to take place, for which various strategies exist^{13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42}.

First-principles investigations of reaction intermediates and transition states provide valuable insights into reaction mechanisms, as demonstrated, for instance, by numerous studies in the field of catalysis^{43,44,45,46,47,48,49,50,51,52,53,54,55,56,57}. However, no universal, efficient, and reliable theoretical approach toward computational catalysis with generally applicable algorithms is available so that the study of a catalytic reaction mechanism of a single catalyst can require considerable time and expertise. Understanding catalysis in terms of CRNs can be a starting point for the design of cheaper, greener, and more selective catalysts^58,59, because automated procedures can analyze orders of magnitude more structures than manual approaches, leading to a far more complete understanding of relevant reaction steps (including side and decomposition reactions) and conformations. This results in a more accurate formalization of catalytic processes and in silico predictions that cover the whole spectrum of catalyst and substrate reactivity.

The increased number of structures leads, however, to a combinatorial explosion in a brute-force exploration of all possible reactive site combinations, which prohibits the exhaustive exploration the reactivity of even moderately sized structures⁵. Based on their determination of reactive sites, automated exploration programs can be classified into two main categories. On the one hand, there are fully automated approaches^{16,31,33,39,60,61} that are feasible for complex chemical systems only, if they rely on either a restrictive reactive site logic and/or computationally inexpensive calculations⁵. These conditions can, however, limit their applicability and accuracy for a particular system of interest; transition metal complexes are good examples owing to their variability in valency and generally intricate electronic structures. On the other hand, a class of approaches^22,37,62,63 requires a manual setup of reactivity trials through an algorithmic interface, which can save time compared to individual structure and calculation setup, although it still relies on human decision making to determine the reactive site logic and lacks general applicability and scalability.

However, to carry out mechanism elucidations routinely, catalyst design, and other chemical optimization challenges, acceleration protocols are needed that do not corrupt any key feature of an otherwise autonomous reaction mechanism exploration algorithm. For instance, one does not want to limit mechanistic studies by pre-selecting all reaction intermediates, which brings inherent biases and constraints.

Here, we present a new algorithm driving our automated first-principles exploration approach^17,39,60,64 that allows for intuitive on-the-fly interference of an operator with an otherwise autonomous exploration, which we denote as the STEERING WHEEL. Our algorithm is able to cover all ground-state molecular compound and reaction space and can explore a CRN either in a depth-first or in a breadth-first fashion. By virtue of an integration into a graphical user interface the steering of a running exploration is straightforward and intuitive.

In the following sections, we outline the general concept of the STEERING WHEEL, discuss its implementation and its integration into our graphical interface HERON⁶⁵. Afterwards, we demonstrate functionality and efficiency by application to several well-studied reactive systems from transition metal catalysis.

Results and Discussion

Conceptual design and implementation of the steering wheel

Within our modular program package SCINE⁶⁶, we have developed the automated exploration software CHEMOTON^17,39,60,64, which allows one to explore chemical reaction space based on the first principles of quantum mechanics in a single-ended manner without being constrained to specific compound or reaction types. This is achieved by defining local sites in molecular structures that are reacted with one another by pushing/pulling these potentially reactive sites together/apart and then locating a transition state. Compared to traditional (typically double-ended) transition state search algorithms, which aim at a single reaction step, our approach launches an exhaustive search for elementary steps which make no assumption on potential products. This is achieved by batch-wise writing instructions for multiple reaction trials into a database, which are then executed by processes on high-performance computing infrastructure or in the cloud^67,68. The results of the calculations are then written back to the database and aggregated and sorted by Chemoton to construct the emerging reaction network that can then be subjected to kinetic modeling. Kinetic modeling can even be exploited for taming the combinatorial explosion of reactive events^12,69. The number of reactive sites may also be controlled by various heuristic rules, such as first-principles heuristics that exploit properties of the wavefunction or electron density^17,70,71, graph-based rules in combination with known reactivity⁷², or electronegativity-based polarization rules, where, for example, hydrogen is considered active when bound to oxygen^60,69.

However, all these approaches to restrict the combinatorial explosion of potentially reactive events are either not directed or not coarse-grained to a degree that would allow for a quick tour to potentially relevant intermediates of a reactive system. Therefore, we propose the STEERING WHEEL to allow for efficient interactive control of an otherwise autonomous mechanism exploration. Its execution is linear and the automated exploration is split into sequential exploration steps based on an on-the-fly constructed steering protocol. In a complex system, one may want to change the reactive-site determination rules based on the actual state of an exploration to assemble a flexible steering protocol that establishes key parts of a CRN first (before the exploration can dive deeper into the reactive propensity of the system). The heuristic rules can be selected from several existing rules mentioned above (one or more of which can be based on machine learning, first principles, or graph-based rules). To enable such a workflow, we base the STEERING WHEEL on shell-like explorations. Each shell is a procedure to grow a CRN. That is, the STEERING WHEEL sets up and runs new calculations, waits for all of them to finish, and then classifies the results before further exploration steps are initiated. Reactions are, however, not limited to a specific shell, but later-found compounds can still react with the starting compounds.

The steering protocol therefore consists of two alternating exploration steps: Network Expansion Step and Selection Step. A Network Expansion Step is defined as an exploration step that adds new calculations and their results, i.e., structures, compounds, flasks, elementary steps, and reactions, to a growing CRN. Selection Step is defined as an exploration step that chooses a subset of structures (or compounds) and corresponding reactive sites from the reaction network, which limits the explored chemical space and avoids a combinatorial explosion in the subsequent expansion. For both, Network Expansion Step and Selection Step, we have developed implementations discussed in section 3.3 below. From these implementations, the operator can build the steering protocol in such a way that the desired chemical space is covered, as illustrated in Fig. 1. This steering protocol is assembled in terms of keywords – such as ’Dissociation’ to initiate the search for specific dissociation reactions – by a human operator on the fly. This protocol therefore supports easy processing and may easily be generated from written form into spoken language (cf.^{73,74,75,76,77,78}).

**Fig. 1: The steering protocol (the center of the figure) is built by steps that describe network expansion and selection in an alternating fashion.**

To ensure broad applicability across chemical space, the individual steps are defined in a general way, although they can be fine-tuned for each reactive system. For example, a ‘Dissociation’ expansion step is rather general in its definition: only dissociative reaction coordinates within a single compound are probed, but applying the step on a previous selection step can reduce the number of calculations set up from millions to hundreds or dozens. This high specificity can be achieved by combining multiple selection steps into one step, as shown for step three in Fig. 1, or by defining additional compound filters and reactive site filters – concepts available in CHEMOTON³⁹: Because CHEMOTON considers a priori every structure in a network as reactive with each of its atoms as a potentially reactive site, a huge number of possible reactions arises from the combinatorial explosion of reactive atom pairings. Filters reduce the number of potential reactions by eliminating certain structures or reactive-atom combinations from the search space. Similar to a Selection Step, the filters can be based on various rationales, such as graph rules or properties derived from first principles. An example for a compound filter is the Catalyst Filter, which allows one to define a combination of chemical elements as a catalyst and then carry out reaction trials that involve only this catalyst.

The explicit protocol for starting an exploration is not fixed, but it will evolve sequentially. The reason for this dynamic nature of the STEERING WHEEL protocol is that it cannot be known from the start what structures and reactions will be discovered, which then determines what next steps are to be enhanced or handled more restrictive. In this interactive rolling procedure, the current exploration status must be easily understandable and the potential effect of planned steps on the exploration must be foreseeable by an operator. To facilitate this immediate grasp of operator interference, the STEERING WHEEL can be executed concurrently in a Python environment and is integrated into our graphical user interface SCINE HERON⁶⁵. The integration into HERON allows one to build exploration steps directly in the graphical user interface and then carry out the steered exploration in an intuitive problem-focused fashion. The graphical user interface displays how a potential next Network Expansion Step would affect the exploration by presenting the number of calculations set up for the expansion, alongside with the constructed reactive complexes and their reactive sites. Together with the existing average runtime information available in HERON, the computing time for the step can be estimated. This enables one to refine the chosen selection step to be more inclusive or exclusive based on the targeted chemical space and available resources. One such example of an expansion preview alongside the protocol is shown in Fig. 2.

**Fig. 2: A schematic representation of the STEERING WHEEL interface in HERON.**

While such intuitive interactions with a running exploration allow for flexible workflows, they harbor the danger of producing non-reproducible mechanism exploration campaigns. Every set of generated calculations depends on the existing results in the network and if only a random subset of calculations in the previous step were finished, it would render the exploration irreproducible. Therefore, we designed our framework in such a way that it ensures reproducibility by requiring every step of the created exploration protocol to be completed, i.e., every calculation set up must be finished, before any further manipulations of the network are permitted. The linear protocol might lead one to believe that Network Expansion Steps taken early in the exploration impose strong constraints on the remaining exploration. However, any Expansion Step can be applied on the whole CRN at any point in the protocol, meaning that any part of an explored mechanism can be studied in more detail later on with additional calculations.

The linear protocol evolution of the network is advantageous, because it allows the exploration to be completely reproducible, given the steering protocol is published. Naturally, this also requires the same versions of the applied electronic structure programs and SCINE software stack, which can be ensured by containerization that we support out-of-the-box for Apptainer^79,80. Therefore, all explorations presented in this study are easily reproducible with the provided container and protocols deposited on Zenodo⁸¹, where also the resulting calculations (in total 76,000 reaction trials yielding 78,000 chemical structures) are stored in a MongoDB framework⁸².

In the following sections, we demonstrate how our steered automated exploration approach can be applied to study various catalytic systems. We selected three homogeneous transition-metal catalyst, which have been studied for several decades, and one heterogeneous single-site catalyst, which was recently explored with a different automated approach. A complete literature review and discussion will be impossible to achieve in the context of this work. Instead, the main focus of this work is on how the STEERING WHEEL can be applied to study complex reaction mechanisms. Although we do not achieve sufficient accuracy (because of the limited accuracy of the DFT models employed) to revoke or confirm any existing mechanistic hypothesis, each exploration includes some aspects of the mechanism where our automated search produced new results that have not been considered before.

Propylene hydrogenation by Wilkinson catalyst

We first apply the STEERING WHEEL to the reduction of propylene by the well-known transition metal catalyst [RhCl(PPh₃)₃], typically referred to as Wilkinson’s catalyst⁸³. The two most-widely accepted mechanisms of this catalytic reaction are the Halpern mechanism⁸⁴ and the Brown mechanism⁸⁵, which are shown in Fig. 3. Both mechanisms involve catalyst activation by ligand dissociation, oxidative addition of H₂, olefin coordination, olefin insertion into the metal hydride bond, and reductive elimination of the alkane.

**Fig. 3: Halpern⁸⁴ (left) and Brown⁸⁵ (right) mechanisms of the hydrogenation of propylene catalyzed by Wilkinson’s catalyst.**

Despite the long history of research on the mechanism of this catalyst^86,87,88,89, not all intermediates of the proposed mechanism have been observed experimentally yet. The Halpern⁸⁴ and Brown mechanisms⁸⁵ diverge at the intermediate w2, which shows the phosphine ligands in trans-position (w2a) in the Halpern mechanism and in cis-position (w2b) in the Brown mechanism. They then differ in their rate-determining step, which is the olefin insertion in the Halpern and the product elimination in the Brown mechanism. For further details, we refer to ref. ⁸⁹ and references cited therein.

Staub et al.⁹⁰ have recently explored the hydrogenation of ethylene by a simplified model of the catalyst with PH₃ ligands in an automated fashion based on the Artificial Force Induced Reaction (AFIR) approach³⁴. Their most favored mechanism included an initial olefin insertion reaction prior to ligand dissociation and subsequent dihydrogen association. This finding is in disagreement with experimental findings⁹¹ and can most likely be attributed to the simplification of the triphenyl phosphine ligand as PH₃ ligands, which has led to inconclusive theoretical results in the past⁹² and was shown to be relevant for the evaluation of different accessible isomers and the energetically most favorable path⁹³. We therefore included the full triphenylphosphine ligands in our exploration. Since this increased the computational cost, we limited the explored chemical space strictly to the two literature mechanisms. The steering protocol, which guided the exploration by CHEMOTON and which has been deposited on Zenodo⁸¹, reads [File_Input_Selection, Simple_Optimization, Central_Metal_Selection, Dissociation, All_Compounds_Selection, Association, All_Compounds_Selection, Association, Products_Selection, Rearrangement, Products_Selection, Rearrangement, Products_Selection, Dissociation]. By omitting the Selection Steps and the initial structure optimization, the list of Network Expansion Steps reads [Dissociation, Association, Association, Rearrangement, Rearrangement, Dissociation]. The clear connection of our algorithmic interface and the literature mechanism is already apparent based on the strong alignment of the reaction types and the schemes in Fig. 3. The only difference between the mechanism in Fig. 3 and our steering protocol is the split of the product elimination into a Rearrangement and Dissociation, because the formed propane was still weakly bound to the catalyst. This weak coordination is due to the semi-classical dispersion corrections, which favor non-covalent bonding in isolated species where no explicit solvent molecules stabilize the dissociated products. Even though our protocol was strictly based on the standard literature mechanism without any focus on finding diverging reaction intermediates, our selection steps and automated reaction search methods were able to find numerous isomers of the intermediates of the Halpern and Brown mechanisms during the exploration, some of which are displayed in Fig. 4. If specific isomers of intermediates are of interest or expected ones are still missing in the network, they can be searched for in a targeted manner, possibly with more accurate electronic structure methods should the electronic structure model be considered insufficient to localize them.

**Fig. 4: Scheme of the on-cycle intermediates discussed in the literature and all additional intermediates that were found during exploration.**

Already the first intermediate, the activated catalyst after ligand dissociation, features two possible isomers; that is, planar T-shaped conformations with either a phosphine ligand or the chlorine ligand in trans position to the vacant site. These isomers are known to interconvert via a trigonal planar structure⁸⁵. Moreover, we found for many intermediates in our reaction network that the expected vacant site is partially occupied by a weakly bound hydrogen atom of one of the phenyl groups, stabilising the conformation. This agnostic interaction originates from the attractive semi-classical dispersion correction in our electronic structure description. Its relevance is difficult to assess in the present structural model due to the lack of other potential bonding partners, such as solvent molecules. The agostic bond is most pronounced in the intermediates w2 resulting from dihydrogen association, which we found as a single concerted reaction step of hydrogen association with simultaneous breaking of the dihydrogen bond. For intermediates w2, all possible variations of the five-fold coordinated complex were shown to be accessible in NMR experiments of Brown et al.⁸⁵. This observation increases the complexity of mechanisms significantly due to the numerous possible combinations of reactive intermediates. However, with our approach, we were able to find all possible isomers and variants in the ligand sphere (w2c - w2h) at once.

The agostic bond between one hydrogen atom of a phenyl group and the rhodium central ion distorts the ligand sphere from a five-fold geometry to an octahedral complex, which is most likely caused by the the semi-classical dispersion corrections in GFN2-xTB on which we relied for this exploration. We have carried out structure refinements by optimizing some minimum structures with more reliable density functional theory (DFT) methods. This converted the octahedral complex to a five-fold coordination structure as expected. The agostic interaction remained intact only in intermediates w1c and w1d due to the strong under-coordination at the rhodium ion. However, the weak hydrogen-rhodium bond did not hinder the exploration progress to find the catalytic cycle with the GFN2-xTB method. The bonding of propylene leading to intermediates w3a–w3d was possible and replaced the rhodium-hydrogen bond.

Also for intermediate w3 we found cis- and trans-isomers. The only two possible intermediate configurations that were not found in our exploration were the two cis-phosphine conformers with either H or Cl in trans position to the η²-bound propylene. This can be attributed to steric hindrance, because it requires the three largest ligands (i.e., the two phosphine ligands and the olefin) to be all in cis-orientation to one another, which is unlikely to be energetically favorable. We manually constructed one such isomer and optimized its structure to investigate whether it would be stable for the electronic structure model employed in the exploration. Upon structure optimization, the propylene ligand is moved further away from the ruthenium ion, featuring an elongated and weak bond between the terminal propylene carbon atom and the rhodium central ion (Mayer bond order⁹⁴ of 0.22 and bond length of 2.6 Å). Both the Mayer bond order and length exceed the detection thresholds for a stable bond in our framework, which is why the automated reaction trials have not considered this to be a successful association reaction. Given that a full association of the propylene molecule is thermodynamically disfavored, as shown by the optimization, and that this potential reaction competes with association reactions leading to the other, energetically favored stereoisomers, we deem this unsuccessful reaction trial as correct and have not carried out further reaction explorations starting at this stereoisomer.

For the last reaction intermediate, the bound alkyl complex prior to reductive elimination, our steered exploration discovered not only the proposed terminally bound alkyl group w4a⁸⁹, but also intermediates with a 2-propyl ligand, for which we also found the trans- (w4c) and cis-phosphine (w4d) isomers. Furthermore, CHEMOTON located the isomers w4e to w4h, which again incorporated the weakly coordinating hydrogen atom. This hydrogen atom can either originate from phenyl or alkyl groups.

As a side remark, we note that the focus in the aforementioned work of Staub et al.⁹⁰ on the Wilkinson catalyst was on the training of a neural network potential based on the explored structures in the reaction network. This is a promising route for automated exploration algorithms, which depend on fast electronic structure methods. We chose the density functional tight-binding model GFN2-xTB, which, however, may suffer from inaccuracies in energies and sometimes also structures. The former can be easily corrected by DFT single-point calculations, whereas the latter are difficult to correct as they may lead to wrong intermediate and transition state structures and even to a wrong topology of the emerging CRN. By contrast, system-specific neural network potentials are almost as fast to evaluate as classical force fields, but achieve the accuracy of the reference data (typically DFT)^{95,96,97,98,99,100,101,102,103,104}. However, to achieve this accuracy, a huge number of reference data point (i.e., DFT single points) is required, which introduces a significant overhead before an exploration can start. Moreover, visiting new structures during the exploration may show the limitations of a parametrized neural network potential as its accuracy may deteriorate for them. These issues may be tackled by generalized neural network potentials^{105,106,107,108,109}, but one needs to be prepared for no generalist simple model achieving sufficient overall accuracy close to that of DFT. For this reason, we proposed a different scheme, called life-long machine learning potentials that can adjust in an exploration in a system-focused fashion¹¹⁰.

Ziegler–Natta propylene polymerization

Multiple polymerization reaction steps are a challenge for automated explorations due to the required number of exploration steps required to reach long-chained polymers. Therefore, as a second example, we present STEERING WHEEL results for the polymerization of propylene catalyzed by a Ziegler–Natta zirconium catalyst. The catalytic polymerization reaction is shown in Fig. 5. After activation of the stable catalyst to an active, cationic form¹¹¹, possibly facilitated by a co-catalyst, not shown in the figure and also not included in the reaction network, the polymerization is a two-step process. The to-be-inserted olefin monomer binds to a vacant coordination site of the catalyst by η² coordination, while the existing polymer chain is covalently bound to the zirconium in intermediate z1. The monomer is then inserted in a single step at the zirconium site, which generates a vacant site in intermediate z2. This vacant site is only weakly coordinated by an agostic C-H bond from the β position in the polymer chain. However, this site is still available for coordination of the next monomer and does not block the polymerization. For more details, we refer to ref. ¹¹² and references cited therein. The agostic bond increases the probability of the most common polymerization termination reaction for Ziegler–Natta-type catalysts¹¹², also shown in Fig. 5. The catalyst is inactivated by β-hydride transfer, in which compound z3 is formed, and concerted release of the polymer chain with a terminal carbon double bond.

**Fig. 5: Cossee–Arlman mechanism^124,125,126 for the polymerization of propylene by a Ziegler–Natta catalyst and a possible degradation reaction via β-hydride elimination.**

Since the termination reaction can occur in each polymerization cycle, the resulting polymerization products are of varying length. The lengths distribution can be narrowed by designing a catalyst such that the termination reaction is unfavored and only induced by the addition of a termination reagent. An automated reaction exploration can aid such catalyst design challenges as it allows one to identify rather easily all possible reaction products and study varying catalyst degradation reactions at multiple stages of the polymerization. Besides modulating the termination process, Ziegler–Natta-type catalysts allow for an elaborate ligand design to improve the stereoselectivity of the propylene insertion and, hence, to control the tacticity of the produced polymer^113,114,115, apart from general activity improvements based on the co-catalyst or solvent^{116,117,118,119,120}.

However, we focus on the distribution of the polymerization products and the termination reaction. We explored the polymerization with a short steering protocol with two addition reactions of the propylene monomer to the activated catalyst [Zr(Cp)₂CH₃]⁺. The Network Expansion Steps of our steering protocol read [Association, Rearrangement, Association, Conformer_Creation, Rearrangement, Rearrangement, Dissociation]. The catalytic polymerization cycle in Fig. 5 mapped well to a simple Association, Rearrangement protocol. However, we split up the second polymerization cycle with an intermittent conformer creation step, as we noticed during the exploration that the sampling of different conformers is required in later stages of the polymerization due to the increased number of degrees of freedom in the polymer chain. We then sampled the termination reaction with these Network Expansion Steps: Rearrangement and Dissociation. The additional Dissociation step was required, because the formed product was still weakly coordinated to the catalyst, again due to the attractive semi-classical dispersion correction in GFN2-xTB and the lack of explicit solvent molecules that could replace the product by coordinating to the zirconium central ion.

We analyzed the reaction network explored in terms of the products obtained and extracted the three-dimensional structure of all compounds that do not contain zirconium. Then, we inferred the Lewis structure of each compound with XYZ2MOL^121,122 and RDKIT¹²³; the result is shown in Fig. 6. After the addition reaction of propylene to [Zr(Cp)₂CH₃]⁺ and termination by β-hydride elimination, the expected main products are 2-methyl-propene (single addition reaction) and 2,4-methyl-pentene (two addition reactions of propylene). Both were found in the reaction network and the shortest paths for their creation determined by SCINE PATHFINDER¹² were identical to the paths established in the literature^124,125,126. Additionally, 18 hydrocarbon side products were found by CHEMOTON, which are shown in Fig. 6 together with the reactant propylene and the two expected products. However, their occurrence and distribution can only be considered a qualitative result, because we did not refine the GFN2-xTB reaction network with a more reliable electronic structure model such as DFT. However, the broad variety of the explored compound space highlights the capabilities of our STEERING WHEEL approach to broadly cover reaction space adjacent to that of the catalytic cycle while keeping the exploration direction aligned with the elucidation of the catalytic mechanism in question.

Fig. 6: Lewis structures of all hydrocarbons found in the reaction network starting from propylene and the zirconocene catalyst [Zr(Cp)₂CH₃]⁺ with only two allowed addition reactions of a propylene molecule with an alkyl-bearing Zr-catalyst.

Monsanto process for carbonylation of methanol

As a third example presenting a challenge for automated reaction mechanism exploration, we selected a process that involves two intertwined catalytic cycles, the production of acetic acid from methanol and carbon monoxide catalyzed by a rhodium catalyst, typically referred to as Monsanto process. The carbonylation takes place via multiple activated iodide species that are formed in solution from hydrogen iodide and are regenerated by hydrolysis of the acid iodide, which simultaneously forms the desired product. Following previous work^127,128,129, the intertwined catalytic cycles are depicted in Fig. 7. The reaction mechanism involves multiple insertion reactions at the transition metal complex and multiple substitution reactions without the presence of the catalyst.

**Fig. 7: Catalytic cycle of the Monsanto process for the carbonylation of methanol to acetic acid.**

The variety of compounds involved in other (non-Rh-catalyzed) reactions imposes a challenge for existing reaction filters of automated approaches, as outlined in section 2.1. The reason for this challenge is that a set of graph-based rules that define which compounds are reactive commonly activate either the organometallic (outer cycle) or solution-phase (inner cycle) reactions. A set of rules that enables the reaction exploration for both types of reactions during the whole exploration process is, however, prone to cause a combinatorial explosion due to the large chemical space spanned by such a super-set. In combination with the many subsequent reaction steps, this prohibits an exploration of the full catalytic cycle with unsupervised automated, i.e., fully autonomous, explorations. However, the reaction mechanism is also difficult to elucidate with semi-supervised explorations that rely on the specification of individual intermediates, due to multiple possible routes, stereoisomers, and bonding patterns. Because, one is interested only in the overlap of the two reaction spaces to explore the Monsanto process, our steered approach that can switch the focus of the exploration on the fly can tackle such mechanisms. By virtue of our STEERING WHEEL the required number of exploration steps and exploration flexibility is achieved easily and the intertwined catalytic cycles were found starting from methanol, hydrogen iodide (HI), carbon monoxide (CO), and [RhI₂(CO)₂]^– (m1) only.

The Network Expansion Steps required for the exploration of the Monsanto process were [Association, Association, Rearrangement, Rearrangement, Association, Rearrangement, Association], which again closely resembles the literature mechanism^127,128,129 shown in Fig. 7 with the only difference in the association reaction of methyl iodide to the catalyst m1 forming intermediate m2. This reaction was formulated as a single elementary step in Fig. 7, but required two steps in the steering protocol described by an Association and Rearrangement.

After exploration of the reaction network with the semi-empirical electronic structure model GFN2-xTB, we refined the reaction network by carrying out DFT single-point calculations for all minimum structures and transition states (see section 3.1 for details on the computational methodology). Such a refinement is an efficient approach to improve on the accuracy of the activation and reaction energies in a CRN. Applying DFT as the next more accurate electronic structure approach is the first step of a series of available refinement approaches of increasing accuracy in SCINE CHEMOTON (such a sequence of increasingly accurate, but also more costly and hence fewer ab initio calculations can be exploited in Bayesian approaches for systematic uncertainty quantification^130,131).

Based on the DFT activation energies and the chosen starting compounds, we searched the network for the energetically lowest paths from methanol to acetic acid with SCINE PATHFINDER¹². This search yielded the expected catalytic path, schematically shown in Fig. 8 (A), but also a stoichiometric reaction of methanol to acetic acid that consumes the catalyst by forming compound m9 shown in Fig. 8 (B). The two identified paths diverge at intermediate m5 with the association reaction of a second methanol molecule in path B instead of CO in path A. The second methanol molecule hydroxylates the catalyst, undergoes carbonylation, and forms acetic acid by a concerted elimination of the methyl and acid groups, which appears to be a path not discussed in the literature so far. Because the resulting rhodium species m9 is lacking only a CO ligand to be transformed into compound m5, we searched the CRN for a path from m9 to m5, which would close the cycle and, hence, also classify path B as catalytic. This connection was not present in the CRN after the exploration with the steering protocol described above. However, after adding another Association Network Expansion Step to the protocol, which reacted CO with four-fold coordinated rhodium complexes that contain only a single CO ligand, we could find the missing reaction. This path is therefore an excellent example to demonstrate on how CHEMOTON can uncover new reaction mechanisms, enhanced by the intuitive reaction network analysis with PATHFINDER in HERON. Additionally, the software allows one to export a graph similar to the one depicted in Fig. 8. Fig. 8 was generated by pressing a button in the graphical user interface, which generates the level diagram, and second manually augmenting the exported SVG plot with Lewis structures and arrows. Full automatism for figure generation has not been possible, because no software exists that can generate Lewis structures of organometallic complexes reliably. However, the latter is straightforward to achieve by hand within our framework, because HERON directly provides interactive three-dimensional views of all compounds along the path.

**Fig. 8: Two competing reaction paths found in the steered exploration of the Monsanto process and identified by SCINE PATHFINDER.**

However, the energetically lowest mechanism in our reaction network found by SCINE PATHFINDER shown in Fig. 8 (A) differs from the literature mechanism depicted in Fig. 7 in the sequence of carbonyl insertion and CO addition. In fact, the path shown in Fig. 8 is also not the only catalytic path we found in our exploration. All explored catalytic paths differed at the reaction step of methyl iodide addition to the planar quadratic catalyst m1, which we summarize in Fig. 9. The literature path involves rhodium insertion into the methyl iodide bond to form the octahedral complex m2, then methyl migration to form the acetyl ligand in intermediate m3, and subsequent CO addition to regenerate the octahedral complex m4^127,128,129. This path was present in our reaction network and we could find multiple stereoisomers of the reaction intermediates, which differ in their ligand sphere with the methyl group binding either trans to a CO m2a or iodide ligand m2.

**Fig. 9: Competing catalytic paths starting from the catalyst and methyl iodide.**

In addition, we found a pericyclic reaction, in which the methyl-iodide bond was broken and the methyl group was bound to an existing CO ligand instead of to the rhodium center, directly forming intermediate m3a, which is a stereoisomer of the compound m3, shown in Fig. 7.

The energy differences between all catalytic paths found were small and well within the uncertainty of our electronic structure description, hence further studies are necessary to discriminate the two paths. Furthermore, the activation energy of the reaction of methanol with HI was higher than expected, because this reaction is catalyzed by water¹³² and we did not include explicit solvent molecules in our reaction exploration.

Moreover, we note that a path hypothesized in the literature^133,134,135, where the methyl iodide oxidative addition to rhodium is a two-step process with an initial S_N2-like nucleophilic attack by the rhodium complex on methyl iodide with iodide acting as a leaving group and only associating to the rhodium center in a second elementary step, could not be found by CHEMOTON in our initial exploration. Since Feliz et al. observed a strong effect of the electronic structure model and solvent description on the initial transition state¹³⁵, we suspected that this elementary step is highly unfavored in the tight binding method that we relied on for the initial structure exploration. A manual study of the elementary step also failed to locate a transition state for the nucleophilic attack. We could confirm that this failure is due to the approximate electronic structure model employed and that it is not a failure of our exploration strategy by launching another CRN exploration with the identical steering protocol but replacing the tight-binding electronic structure model with a pure DFT model (PBE-D3/def2-SVP). Furthermore, we adjusted the Selection Step before the third Expansion Step to be more restrictive, so that it considered fewer reaction trials, in order to cope with the increased computational cost per calculation. Although the algorithm carried out fewer reaction trials, it was able to locate a transition state for the nucleophilic attack of the rhodium complex on methyl iodide in the DFT-based exploration. We then added an additional Association step that reacted an iodide ion with five-fold coordinated rhodium complexes as this was not considered in the initial GFN2-xTB-based exploration. This step completed the S_N2 mechanism. The CRN of the initial steps of the Monsanto process with DFT-based reaction trials was stored in a separate database on Zenodo⁸¹. Hence, we could show that an existing exploration protocol can be adapted directly to another electronic structure model while allowing one to adjust the scale of the exploration depending on the costs of the electronic structure model employed.

Silica-supported single-site catalyst

As the last challenge for the STEERING WHEEL, we selected a gallium single-site silica-supported catalyst for olefin polymerization¹³⁶. The diverse bonding patterns in the catalytic reaction mechanism, the flexible environment of the silica support, and different possible reaction paths for various gaseous hydrocarbons that can re-adsorb to the solid-state catalyst are a challenge for automated approaches. Because of their size, periodic systems lead to calculations with high computational costs, which can prohibit extensive explorations of complex systems^5,9. Therefore, established automated approaches in heterogeneous catalysis leverage existing literature data, group additivity, and linear scaling relations. Approaches by Goldsmith, Green, Nørskov, Reuter, Ulissi, and West have been demonstrated to be successful for pyrolysis^{137,138,139,140,141,142,143}, electrochemistry^{144,145,146,147}, and small molecule activation^148,149.

They delivered novel catalyst candidates and kinetic models close to experiment by incorporation of existing data^61,150,151. High-throughput calculations have leveraged algorithms that can generate any miller index surface¹⁵² and determine adsorbate positions^{153,154,155,156,157}. However, the study of novel chemistry with these approaches requires either prior large scale data generation^158,159 or an extension of the incorporated reaction rules by expert developers¹⁶⁰.

In contrast to such data-driven approaches, there exist strategies to carry out brute-force enumerations of all species based on single-ended reaction trials without biasing the calculations to known mechanisms or energies as presented by Maeda^{161,162,163,164} and Zimmerman^165,166. However, such first-principles-based approaches require limitations in the structural model, such as exploring a single potential energy surface, constraining the nuclei of the metal surface, or restricting the studied reactions to small molecules dissociating on low-Miller-index surfaces, such as a (111) surface.

To study catalytic reactions without being constrained by existing data, exploration strategies can make a compromise between these two extrema. On the one hand, the Liu group studied highly complex systems^{167,168,169,170} by decreasing the computational costs with a machine learning potential that is tailored to the specific system based on preceding first-principles-based molecular dynamics simulations. On the other hand, the Savoie group¹⁷¹ decreased the computational costs by studying a cluster model after validating it with periodic calculations, and predicting the products of each exploration shell with graph-based rules, subsequently leveraging double-ended transition state searches with constrained surface atoms, and a barrier limit to grow the CRN in a deep instead of broad manner. Hence, they showed that it is possible to study a deep CRN featuring highly complex heterogeneous species based on first-principles by restricting the search space.

Here, we show that we can reproduce and enhance their results further without any constrained atoms and solely with single-ended exploration methods by guiding the automated exploration with the STEERING WHEEL according to their mechanism hypothesis. We started our exploration with the identical cluster model of ref. ¹⁷². A gallium-ethyl species, labeled H1 in Fig. 10, was probed for ethylene association reactions, producing species H2 and so on. The labels up to H17 are identical to previous work¹⁷¹, all higher numbers are newly found gallium species by our protocol. The exploration required an increased number of exploration steps compared to the other CRNs due to high number of consecutive elementary steps of the mechanism. In total, our steering protocol consisted of 19 Network Expansion Steps.

**Fig. 10: Competing catalytic paths and degradation reactions starting from the gallium-ethyl species H1 and ethene.**

In addition to the known reaction pathways yielding 1-butene, ethane, cis-butene, isobutylene, propylene, and polymerization (up to C₅- and C₆-species), our protocol could locate pathways to 1,3-butadiene, trans-butene, and alternative pathways to the known products. In view of the new path to trans-butene, we can dismiss the speculated¹⁷¹ enantioselective preference in 2-butene production as the two reaction pathways are energetically identical within the uncertainty of our electronic structure model. Furthermore, the path to 1,3-butadiene agrees with other mechanistic studies¹⁷³.

Although our exploration exploited knowledge of the reaction mechanisms by Savoie¹⁷¹, we stress that our approach of steered automated explorations can also be applied in the case of vague or even conflicting ideas about a reaction mechanism. Our approach works in these cases as well, because pathways close in reaction space are sampled together with intended ones, and the exploration strategy can be adapted to the failure of finding certain species or pathways. The latter occurred in this study. Ref. ¹⁷¹ presented the shift of the methyl group in the reaction from H7 to H8 as a shift of the methyl group in β-position to the gallium ion. We first failed to find reaction paths to species H8. Then, we studied the three-dimensional structure and were certain that species H8 is accessible mainly by a shift of the methyl group in γ-position to the gallium ion. Therefore, we included this additional option into our steering protocol and could drive the exploration successfully to H8 and beyond.

In terms of computational resources and extensiveness of the explored chemical space, we note that our exploration required slightly more computational time (1272 compared to roughly 900¹⁷¹ days in serial computing time, cf. section 1 in the Supporting Information for the definition). However, we did not constrain any nuclei and could therefore explore more degradation reactions featuring interactions with the silica support, and also found new reaction paths by mapping out more of the chemical reaction space. The exact scale of CRNs produced with different automated approaches is difficult to compare, however, because the different approaches have varying definitions of elementary steps and structures, and different de-duplication algorithms, i.e., algorithms that identify two independently found structures to be identical.

Our CRN of the gallium single-site catalysis encompasses 1795 compounds and 1948 flasks that aggregate a total of 37,053 structures, which were connected by 14,118 elementary steps that were aggregated into 4,533 reactions (for a definition of these terms, see section 1 in the Supporting Information and ref. ⁴). The aggregation of structures into compounds or flasks (that is, a collection of non-covalently bound molecules) is based on identical molecular charge, spin multiplicity, and molecular graph as determined by MOLASSEMBLER^174,175. Ref. ¹⁷¹ does not specify the total size of the CRN. The supporting information¹⁷⁶ contains in total 134 transition states, meaning that even if all of them belong to unique reactions, i.e., no two transition states connect the same set of compounds, our CRN is about ten-fold in size in terms of unique reactions (4533).

However, we also note that comparisons between automated reaction network explorations, even if applied to the same chemical system, are generally difficult and still an open challenge in this field¹⁷⁷, due to numerous options in many programs, varying reaction exploration strategies, different storage of the reaction network, and the challenge to compare highly complex data structures that represent chemical reaction networks.

Since we did not constrain the silica support during our exploration, we witnessed our de-duplication algorithm, which is based on a molecular graph isomorphism¹⁷⁴, struggle in some cases in this reaction network because it could not distinguish two compounds that differ by slight variations in the silica support. However, this is not a drawback of the algorithm, but a consequence of the actual feature of the surface, presenting a variable support for the reaction to take place.

Conclusions

We have developed a general framework, the STEERING WHEEL, for intuitively guiding automated reaction mechanism exploration campaigns, while ensuring the creation of reproducible and transferable workflows. The design of our algorithm allows for straightforward monitoring of the exploration progress as well as for inquiring subsequent Network Expansion Steps, which allows one to target wanted regions of chemical reaction space without the need to specify individual intermediate compounds or even structures. This improves the feedback on exploration decisions and facilitates the planning of subsequent exploration steps.

While the framework allows for efficient explorations based on human input, each exploration step and therefore the complete network can be pushed towards exhaustive exploration (i.e., considering any pair of atoms of any nodes of the reaction network as reactive) at any point in the workflow. This allows one to study catalytic mechanisms routinely in a rather complete fashion with minimal human work or domain knowledge in the setup of electronic structure calculations and automated explorations. We emphasize, however, that our framework is not limited to catalytic mechanisms, but can be applied to explore chemical reactivity in general.

We have applied the STEERING WHEEL to three well-known homogeneous transition metal catalysts and one heterogeneous single-site catalyst. For the Wilkinson catalyst, our exploration covered both literature mechanisms as well as additional potentially relevant reaction intermediates. The effect of the triphenyl phosphine ligands upon the optimized reaction intermediates and transition states due to their strong steric effect suggests that their explicit inclusion in the structural model of theoretical studies of this mechanism is important.

For the Ziegler–Natta metallocene catalyst, our exploration covered the literature reaction path to the expected polymerization product including the correct termination reaction. Additionally, we found other homo- and heterolytic termination reactions that allowed us to cover the reaction space in such a way to find 18 other hydrocarbon polymerization (by-)products after only two addition reactions of propylene with the catalyst.

We note that no quantitative evaluation is possible from our Wilkinson and Ziegler–Natta reaction networks as this would require a refinement of the networks with more accurate electronic structure methods (such as DFT). We have, however, carried out a DFT refinement of the reaction and activation energies of the reaction network containing the most compounds in this study, namely the exploration of the Monsanto process. In this network, we could find the intertwined catalytic cycles spanning six subsequent reactions from the reactant to the product as well as an unreported mechanism that produces acetic acid in an additional catalytic cycle initialized by a second association reaction of methanol with a pentavalent rhodium species. Additionally, we found multiple alternative paths for the reaction with the highest activation energy in the catalytic mechanism, the oxidative addition of methyl iodide to the catalyst.

The two findings, the catalyst degrading mechanism and the alternative paths in the catalytic cycle, highlight how an automated and systematic, yet guidable, algorithm allows one to study complex reaction mechanisms in great detail. However, we also note that the exploration of the Monsanto process overestimated at least one activation energy due to missing explicit solvent molecules in the exploration, which therefore lacked catalytic solvent effects, and missed one previously reported reaction path for the addition reaction of methyl iodide due to the selection of the fast GFN2-xTB model of limited accuracy for initial structure explorations, which we confirmed with a second, but limited automated exploration of the mechanism with DFT-based reaction trials. Hence, it can be advantageous to apply our STEERING WHEEL algorithm to restrict reaction network explorations in such a way to reduce the required number of calculations such that initial DFT structure explorations are feasible and introduce systematic solvation correction protocols, which are currently under development in our group.

For the single-site gallium silicate catalyst, we could push the boundaries of accessible deep reaction mechanisms by exploring a mechanism spanning twelve subsequent elementary steps. We could recover the known reaction network completely and found additional reaction paths resulting in other (by-)products and alternative paths to already known products. We achieved this with single-ended exploration methods without explicit assumption of the products and without constrained nuclei. This demonstrates that our approach is general and applicable to a broad range of systems. It quickly maps out the relevant chemical reaction space and systematically improves on existing data and hypotheses.

The modular infrastructure of SCINE in general, SCINE CHEMOTON in particular, and of our STEERING WHEEL algorithm form a suitable basis for further extensions of individual parts of automated workflows, such as more advanced Network Expansion Steps (e.g., reaction trials featuring multiple electronic structure models³¹, systematic network refinement with more accurate electronic structure methods⁶⁸ or with automated microsolvation approaches^{178,179,180,181}, or more exhaustive conformer generation^{174,182,183,184}). Moreover, inclusion of Selection Steps that do not rely on human input, such as general heuristics derived from first principles⁷¹, results from existing explorations⁷², machine learning¹⁸⁵, path information¹², or kinetic simulations^{69,186,187,188,189,190,191,192} is straightforward and will further enhance the capabilities of the STEERING WHEEL, which has been implemented into our graphical user interface SCINE HERON, which is available free of charge and open source.

Methods

Computational methodology

All data management, quantum chemical calculations, and structure manipulations were conducted within our general software framework SCINE⁶⁶, which is available open source and free of charge. The STEERING WHEEL was implemented in our graphical user interface SCINE HERON⁶⁵, with SCINE CHEMOTON^39,64 as the underlying engine to drive the mechanistic explorations. All reaction trials were carried out with the Newton Trajectory 2 (NT2) algorithm^39,193,194, the reactive sites were determined by the selection steps made and filters chosen, which are stored within the provided protocol files deposited on Zenodo⁸¹. All reaction trials were carried out with the SCINE CHEMOTON default settings (also provided in the protocol files) and all calculations were carried out with a SCINE PUFFIN⁶⁷ Apptainer container. The molecular graphs required for sorting all chemical structures into compounds and flasks were constructed by SCINE MOLASSEMBLER^174,175, which also enabled the generation of conformers of the Ziegler–Natta zirconium catalyst based on distance geometry.

All electronic structure calculations were carried out by external programs, which can be controlled by the SCINE interface¹⁹⁵ that allows to freely select and substitute the underlying electronic-structure model, including hybrid models¹⁹⁶. All explorations were initially carried out with GFN2-xTB¹⁹⁷ as implemented in xtb 6.5.1¹⁹⁸ supported by our interface¹⁹⁹. Further refinement of the Monsanto network was carried out with the (pure) generalized-gradient-approximation Perdew–Burke–Ernzerhof^200,201 (PBE) exchange-correlation functional with 25 % exact exchange (PBE0)²⁰². These PBE0 calculations were carried out with TURBOMOLE 7.4.1²⁰³ with the def2-TZVP basis set²⁰⁴. The refinement of reaction and activation energies in the CRN of the gallium single-site catalyst was carried out with Becke-3–Lee–Yang–Parr (B3LYP) exchange-correlation functional^{205,206,207,208} and the 6-311G** basis set^209,210 in order to compare to the network published in ref. ¹⁷¹. The exploration of the first steps of the Monsanto network with DFT validates the reaction mechanism obtained with the more approximate electronic structure model. It was carried out with the PBE functional and the def2-SVP basis set²⁰⁴ with ORCA 5.0.3²¹¹. All DFT calculations included the D3 dispersion correction²¹² with Becke–Johnson damping²¹³ and density-fitting resolution of the identity through an auxiliary basis set²¹⁴.

The exploration of the Monsanto process was considered with an implicit solvation description applying the dielectric constant of water. As solvation models we applied (i) the Conductor-like screening model²¹⁵ for the PBE0/def2-TZVP single-point calculations, (ii) the Gaussian charge scheme^216,217 for the exploration with PBE/def2-SVP, and (iii) the generalized Born solvation area model^{218,219,220,221} for the GFN2-xTB tight-binding calculations. All calculations and their results were stored in our MongoDB-based database format⁸² on Zenodo⁸¹.

Methodological developments

We have improved the NT2 algorithm regarding the transition state guess. Previously, the highest local maximum along a search trajectory had been selected as transition state guess, which could cause problems for atoms that get too close at one end of the search trajectory as those configurations do not represent transition states, but some arbitrary high-energy structures. We improved the selection based on the observed bond order changes during the trajectory. If the desired bond order change has occurred during the trajectory, the last local energy maximum before the event will be selected as a transition state guess. If the desired bond order change occurs, but our algorithm has not observed any local energy maximum up to this point, as determined by a screening window after smoothing the curve with a Savitzky–Golay filter²²², the first local maximum after the event will be selected. If the bond order change is not observed during the trajectory, e.g., due to a failing calculation before the required bond order threshold could be reached, the highest local energy maximum structure will be selected as a transition state guess.

Moreover, we improved the NT2 stop criteria and forces for haptic bond formations, which are crucial for many transition metal catalyzed reactions, due to possible ηⁿ coordination modi. The NT2 algorithm is based on the construction of a reaction coordinate based on pairs of nuclei, which in general allows it to probe more complex reaction coordinates than fragment-based approaches such as NT1³⁹ and AFIR³⁴. However, the combination of pairs of single nuclei may lead to problematic force additions for highly complex reaction coordinates involving a haptic bond formation and another association reaction or multiple dissociation reactions involving the same reactive site multiple times. One such example is a S_N2-like reaction with one substituent forming a haptic bond. We have improved on the NT2 algorithm in such a way that the involvement of haptic bond formation or breaking is detected based on calculated bond orders, which allows the software to deduce whether pairs of nuclei belong to the same molecule or not. Based on this information, the applied forces on the reacting nuclei are scaled such that the intended η bond formation or breaking is possible without eliminating other concerted reaction coordinates.

Another notable improvement to CHEMOTON’s reaction exploration capabilities is the addition to carry out a fast screening of potential dissociation energies, which allows the software to skip the more expensive NT2-based algorithm. For this, the to-be-dissociated structure can be split into multiple molecules determined by CHEMOTON’s reactive-site logic. The two or more separate molecules can then be optimized separately to obtain the products of the dissociation and reaction energy. The optimizations are carried out for multiple possible charge combinations, to consider that the bond(s) can be cut in a homo- or heterolytic fashion, as well as for different spin multiplicities to account for different spin distributions in the fragments. For the combination that yields the lowest dissociation energy, the software then probes the optimized fragments for a barrierless reaction by placing the fragments alongside the cut bonds with the distance elongated to the sum of van der Waals radii of the reactive sites. If the optimization of this super-system then yields the initially dissociated structure, a barrierless elementary step is added to the CRN. Hence, barrierless reactions are found with minimal computational cost. Such barrierless reactions can be crucial for catalyst activation (see e.g., the activation of Wilkinson’s catalyst).

In the explorations reported in this work, all dissociation reactions with a reaction energy below 200 kJ/mol were additionally sampled with our conventional NT2-based algorithm afterwards to verify the results from the faster algorithm.

Implementation details of exploration steps and exploration workflows

The steered explorations are carried out by the Steering_Wheel Python data structure, which receives an exploration protocol as a list of exploration steps that are either a Selection Step or Network Expansion Step. The management of technical details such as individually forked processes, database information forwarding, and database querying are handled by this data structure. The implementation in HERON provides further abstractions and the operator can generally operate the steered exploration by selecting options from the existing implementations which are sufficiently general so that all explorations presented in this work can be carried out in the graphical user interface.

We have developed a number of STEERING WHEEL exploration steps that we expect to be well-suited for most molecular chemistry including homogeneous transition-metal and single-site catalysis. The implemented Network Expansion Steps allow one to generate conformers with SCINE MOLASSEMBLER^174,175 and to extend the CRN by steps that are encoded in basic chemical language; for example, Association for the association reaction of two molecules, Dissociation for the dissociation of a bond in a single molecule, and Rearrangement for rearrangements within a molecule by intramolecular reaction. A complete list of the implemented Network Expansion Steps is given in Supplementary Table 2 in the Supporting Information. Generally, the various Expansion Steps can be chosen such that either bi- or unimolecular reactions are sampled. This guides the CRN in the general direction of either aggregating reactants or progressing the reactivity of already activated compounds. Additional settings include the number of sampled reaction coordinates, e.g., a straightforward ligand association reaction can be sampled with a single reaction coordinate, while complex rearrangements of haptic bonds can involve multiple associative and dissociative coordinates.

The specificity of the Expansion Steps is achieved by the preceding Selection Step. The currently implemented Selection Steps allow one to continue with the products found based on different structural or energy criteria and are listed in Supplementary Table 3 in the Supporting Information. Relevant conformers can be selected based on their relative energies and/or based on maximum structural diversity in order to cover the phase space of reactants as much as possible enabled by clustering structures (e.g., according to their root mean square deviation).

Reactive sites of compounds and structures can be limited by various heuristic rules. To this set of rules, we have added a reactive site filter to carry out reaction trials only in the vicinity of a chemical element. For easy usability, this new filter was combined with a suitable compound filter in a Selection Step, the Central_Metal_Selection, that allows one to focus reaction trials strongly on a central ion and its vicinity, a concept that is particularly relevant for transition metal chemistry with the central ion orchestrating the chemical transformations. This and other Selection Steps may also be specialized up to the point of choosing a single pair of atoms within two specific structures as the sole reactive sites in the whole CRN with the integration of the existing filtering logic from CHEMOTON, as discussed in section 2.1, which has been enhanced with a general framework to build sets of reaction rules within HERON⁶⁵ as shown in Figure 1 in the Supporting Information. This enables one to apply different Selection Steps in a very flexible way. If a system can be described well by a general set of reaction rules, e.g., as is often the case in organic chemistry, the filters can be set for multiple steps in the exploration and the general reactivity is guided based on the available resources. However, if highly diverse chemical reactivity shall be explored within one CRN, such as in the Monsanto process that involves both hydrolysis and condensation reactions of small molecules and reactions with an organometallic catalyst, frequent changes of the applied reaction rules can efficiently shift the focus of the steered exploration.

In the unlikely case that none of the current implementations is sufficient to explore a particular system, further additions to our framework are straightforward. A new aggregate filter can be generated by defining a single method that takes either one or two aggregates and specifies if these are to be considered as reactive or not. Within that method the two aggregates can also be queried for more detailed information such as their molecular graph, charge, and more. A new reactive site filter is defined by methods that take potential reaction coordinates of a given molecular structure and returns a list of valid reaction coordinates.

Additionally, new Network Expansion and Selection Steps can be implemented. The linear steering protocol is expected to consist of alternating Network Expansion and Selection Steps. Therefore, each exploration step must be able to process the output of the step before and produce an output that can serve as an input to the next one. These input and outputs are encoded in specific data structures in CHEMOTON. A Selection Step produces a result that specifies to-be-applied filters and / or specific individual structures. A Network Expansion Step produces a list of all compounds, flasks, structures, and reactions that it has modified. A new Selection Step is implemented by defining a method that takes the result of a Network Expansion and constructs the list of valid structures. A new Network Expansion defines the specific jobs it must execute, the different CHEMOTON gears it must execute, and lastly, how to execute them and then collect the results by a database query.

We would like to stress that most applications will not require such development work, but can directly apply the existing exploration framework by selecting from the existing implementations.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The reaction network data generated in this study have been deposited in the Zenodo database under accession code 10611686⁸¹. The networks are stored in our MongoDB framework alongside with the exploration protocols, a description on how to load the data and reproduce it with our software, and the Apptainer container of SCINE PUFFIN⁶⁷ that carried out all calculations. The additional DFT-based exploration of the first steps in the Monsanto process is also deposited as a separate database in the same repository. The Supplementary Information includes metadata of the reaction networks, such as number of compounds and required computing time, and a summary of all implementations of Network Expansion Steps, Selection Steps, and filters. Source data are provided with this paper for Figs. 8 and 9 and the tables in the Supporting Information. Source data are provided with this paper.

Code availability

The underlying SCINE software stack as well as the new graphical user interface are freely available and open-source⁶⁶. The STEERING WHEEL software framework within SCINE CHEMOTON has already been released in version 3.1. The explorations of Wilkinson’s catalyst, Ziegler–Natta catalyst, and the Monsanto process were carried out with this version. The exploration of the gallium single-site catalyst requires the generation of reaction trials of non-covalently bound reactive complexes, which are added to SCINE CHEMOTON in version 3.2. A description on how to install a pre-release version of these features and the graphical user interface is given alongside the data archive on Zenodo⁸¹. In addition to the publicly available release, we note that HERON and CHEMOTON have been included into the AutoRXN workflow⁶⁸ on Microsoft Azure and Azure Quantum Elements^223,224.

References

Sameera, W. M. C., Maeda, S. & Morokuma, K. Computational Catalysis Using the Artificial Force Induced Reaction Method. Acc. Chem. Res. 49, 763–773 (2016).
Article CAS PubMed Google Scholar
Dewyer, A. L. & Zimmerman, P. M. Finding Reaction Mechanisms, Intuitive or Otherwise. Org. Biomol. Chem. 15, 501–504 (2017).
Article CAS PubMed Google Scholar
Simm, G. N., Vaucher, A. C. & Reiher, M. Exploration of Reaction Pathways and Chemical Transformation Networks. J. Phys. Chem. A 123, 385–399 (2019).
Article CAS PubMed Google Scholar
Unsleber, J. P. & Reiher, M. The Exploration of Chemical Reaction Networks. Annu. Rev. Phys. Chem. 71, 121–142 (2020).
Article CAS PubMed ADS Google Scholar
Steiner, M. & Reiher, M. Autonomous Reaction Network Exploration in Homogeneous and Heterogeneous Catalysis. Top. Catal. 65, 6–39 (2022).
Article CAS PubMed PubMed Central Google Scholar
Baiardi, A. et al. Expansive Quantum Mechanical Exploration of Chemical Reaction Paths. Acc. Chem. Res. 55, 35–43 (2022).
Article CAS PubMed Google Scholar
Ismail, I., Majerus, R. C. & Habershon, S. Graph-Driven Reaction Discovery: Progress, Challenges, and Future Opportunities. J. Phys. Chem. A 126, 7051–7069 (2022).
Article CAS PubMed PubMed Central Google Scholar
Wen, M. et al. Chemical reaction networks and opportunities for machine learning. Nat. Comput. Sci. 3, 12–24 (2023).
Article PubMed Google Scholar
Margraf, J. T., Jung, H., Scheurer, C. & Reuter, K. Exploring Catalytic Reaction Networks with Machine Learning. Nat. Catal. 6, 112–121 (2023).
Article Google Scholar
Feinberg, M. Foundations of Chemical Reaction Network Theory, vol. 202 of Applied Mathematical Sciences (Springer International Publishing, 2019).
Blau, S. M. et al. A Chemically Consistent Graph Architecture for Massive Reaction Networks Applied to Solid-Electrolyte Interphase Formation. Chem. Sci. 12, 4931–4939 (2021).
Article CAS PubMed PubMed Central Google Scholar
Türtscher, P. L. & Reiher, M. Pathfinder-Navigating and Analyzing Chemical Reaction Networks with an Efficient Graph-Based Approach. J. Chem. Inf. Model. 63, 147–160 (2023).
Article PubMed Google Scholar
Maeda, S., Ohno, K. & Morokuma, K. Systematic Exploration of the Mechanism of Chemical Reactions: The Global Reaction Route Mapping (GRRM) Strategy Using the ADDF and AFIR Methods. Phys. Chem. Chem. Phys. 15, 3683–3701 (2013).
Article CAS PubMed Google Scholar
Shang, C. & Liu, Z.-P. Stochastic Surface Walking Method for Structure Prediction and Pathway Searching. J. Chem. Theory Comput. 9, 1838–1845 (2013).
Article CAS PubMed Google Scholar
Kim, Y., Choi, S. & Kim, W. Y. Efficient Basin-Hopping Sampling of Reaction Intermediates through Molecular Fragmentation and Graph Theory. J. Chem. Theory Comput. 10, 2419–2426 (2014).
Article CAS PubMed Google Scholar
Wang, L.-P. et al. Discovering Chemistry with an Ab Initio Nanoreactor. Nat. Chem. 6, 1044 (2014).
Article CAS PubMed PubMed Central Google Scholar
Bergeler, M., Simm, G. N., Proppe, J. & Reiher, M. Heuristics-Guided Exploration of Reaction Mechanisms. J. Chem. Theory Comput. 11, 5712–5722 (2015).
Article CAS PubMed Google Scholar
Zimmerman, P. M. Single-Ended Transition State Finding with the Growing String Method. J. Comput. Chem. 36, 601–611 (2015).
Article CAS PubMed Google Scholar
Martínez-Núñez, E. An automated method to find transition states using chemical dynamics simulations. J. Comput. Chem. 36, 222–234 (2015).
Article PubMed Google Scholar
Gao, C. W., Allen, J. W., Green, W. H. & West, R. H. Reaction Mechanism Generator: Automatic Construction of Chemical Kinetic Mechanisms. Comput. Phys. Commun. 203, 212–225 (2016).
Article CAS ADS Google Scholar
Habershon, S. Automated Prediction of Catalytic Mechanism and Rate Law Using Graph-Based Reaction Path Sampling. J. Chem. Theory Comput. 12, 1786–1798 (2016).
Article CAS PubMed Google Scholar
Guan, Y., Ingman, V. M., Rooks, B. J. & Wheeler, S. E. AARON: An Automated Reaction Optimizer for New Catalysts. J. Chem. Theory Comput. 14, 5249–5261 (2018).
Article CAS PubMed Google Scholar
Kim, Y., Kim, J. W., Kim, Z. & Kim, W. Y. Efficient Prediction of Reaction Paths through Molecular Graph and Reaction Network Analysis. Chem. Sci. 9, 825–835 (2018).
Article CAS PubMed Google Scholar
Rodríguez, A. et al. tsscds2018: A code for automated discovery of chemical reaction mechanisms and solving the kinetics. J. Comput. Chem. 39, 1922–1930 (2018).
Article PubMed Google Scholar
Grimme, S. Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations. J. Chem. Theory Comput. 15, 2847–2862 (2019).
Article CAS PubMed Google Scholar
Rizzi, V., Mendels, D., Sicilia, E. & Parrinello, M. Blind Search for Complex Chemical Pathways Using Harmonic Linear Discriminant Analysis. J. Chem. Theory Comput. 15, 4507–4515 (2019).
Article CAS PubMed Google Scholar
Kang, P.-L., Shang, C. & Liu, Z.-P. Glucose to 5-Hydroxymethylfurfural: Origin of Site-Selectivity Resolved by Machine Learning Based Reaction Sampling. J. Am. Chem. Soc. 141, 20525–20536 (2019).
Article CAS PubMed Google Scholar
Huang, S.-D., Shang, C., Kang, P.-L., Zhang, X.-J. & Liu, Z.-P. LASP: Fast Global Potential Energy Surface Exploration. WIREs Comput. Mol. Sci. 9, e1415 (2019).
Article CAS Google Scholar
Jara-Toro, R. A., Pino, G. A., Glowacki, D. R., Shannon, R. J. & Martínez-Núñez, E. Enhancing Automated Reaction Discovery with Boxed Molecular Dynamics in Energy Space. Chem. Syst. Chem. 2, e1900024 (2020).
CAS Google Scholar
Gu, T., Wang, B., Chen, S. & Yang, B. Automated Generation and Analysis of the Complex Catalytic Reaction Network of Ethanol Synthesis from Syngas on Rh(111). ACS Catal. 10, 6346–6355 (2020).
Article CAS Google Scholar
Zhao, Q. & Savoie, B. M. Simultaneously improving reaction coverage and computational cost in automated reaction prediction tasks. Nat. Comput. Sci. 1, 479–490 (2021).
Article PubMed Google Scholar
Kang, P.-L. & Liu, Z.-P. Reaction Prediction via Atomistic Simulation: From Quantum Mechanics to Machine Learning. iScience 24, 102013 (2021).
Article PubMed ADS Google Scholar
Martínez-Núñez, E. et al. AutoMeKin2021: An open-source program for automated reaction discovery. J. Comput. Chem. 42, 2036–2048 (2021).
Article PubMed Google Scholar
Maeda, S. & Harabuchi, Y. Exploring Paths of Chemical Transformations in Molecular and Periodic Systems: An Approach Utilizing Force. WIREs Comput. Mol. Sci. 11, e1538 (2021).
Article CAS Google Scholar
Liu, Y., Mo, Y. & Cheng, Y. Uncertainty-calibrated deep learning for rapid identification of reaction mechanisms https://doi.org/10.26434/chemrxiv-2022-gg647 (2022).
Xie, X. et al. Data-driven prediction of formation mechanisms of Lithium Ethylene Monocarbonate with an Automated Reaction Network. J. Am. Chem. Soc. 143, 13245–13258 (2021).
Article CAS PubMed Google Scholar
Young, T. A., Silcock, J. J., Sterling, A. J. & Duarte, F. autodE: Automated Calculation of Reaction Energy Profiles- Application to Organic and Organometallic Reactions. Angew. Chem. Int. Ed. 60, 4266–4274 (2021).
Article CAS Google Scholar
Raucci, U., Rizzi, V. & Parrinello, M. Discover, Sample, and Refine: Exploring Chemistry with Enhanced Sampling Techniques. J. Phys. Chem. Lett. 13, 1424–1430 (2022).
Article CAS PubMed Google Scholar
Unsleber, J. P., Grimmel, S. A. & Reiher, M. Chemoton 2.0: Autonomous Exploration of Chemical Reaction Networks. J. Chem. Theory Comput. 18, 5393–5409 (2022).
Article CAS PubMed Google Scholar
Xu, R., Meisner, J., Chang, A. M., Thompson, K. C. & Martínez, T. J. First principles reaction discovery: from the Schrodinger equation to experimental prediction for methane pyrolysis. Chem. Sci. 14, 7447–7464 (2023).
Article CAS PubMed PubMed Central Google Scholar
Zádor, J. et al. Automated Reaction Kinetics of Gas-Phase Organic Species over Multiwell Potential Energy Surfaces. J. Phys. Chem. A 127, 565–588 (2023).
Article PubMed Google Scholar
Medasani, B., Kasiraju, S. & Vlachos, D. G. OpenMKM: An Open-Source C++ Multiscale Modeling Simulator for Homogeneous and Heterogeneous Catalytic Reactions. J. Chem. Inf. Model. 63, 3377–3391 (2023).
Article CAS PubMed Google Scholar
Balcells, D., Clot, E. & Eisenstein, O. C–H Bond Activation in Transition Metal Species from a Computational Perspective. Chem. Rev. 110, 749–823 (2010).
Article CAS PubMed Google Scholar
Lin, Z. Interplay between Theory and Experiment: Computational Organometallic and Transition Metal Chemistry. Acc. Chem. Res. 43, 602–611 (2010).
Article CAS PubMed Google Scholar
Thiel, W. Computational Catalysis — Past, Present, and Future. Angew. Chem. Int. Ed. 53, 8605–8613 (2014).
Article CAS Google Scholar
Jover, J. & Fey, N. The Computational Road to Better Catalysts. Chem. Asian J. 9, 1714–1723 (2014).
Article CAS PubMed Google Scholar
Hong Lam, Y., Grayson, M. N., Holland, M. C., Simon, A. & Houk, K. N. Theory and Modeling of Asymmetric Catalytic Reactions. Acc. Chem. Res. 49, 750–762 (2016).
Article Google Scholar
Vidossich, P., Lledós, A. & Ujaque, G. First-Principles Molecular Dynamics Studies of Organometallic Complexes and Homogeneous Catalytic Processes. Acc. Chem. Res. 49, 1271–1278 (2016).
Article CAS PubMed Google Scholar
Zhang, X., Chung, L. W. & Wu, Y.-D. New Mechanistic Insights on the Selectivity of Transition-Metal-Catalyzed Organic Reactions: The Role of Computational Chemistry. Acc. Chem. Res. 49, 1302–1310 (2016).
Article CAS PubMed Google Scholar
Harvey, J. N., Himo, F., Maseras, F. & Perrin, L. Scope and Challenge of Computational Methods for Studying Mechanism and Reactivity in Homogeneous Catalysis. ACS Catal. 9, 6803–6813 (2019).
Article CAS Google Scholar
Vogiatzis, K. D. et al. Computational Approach to Molecular Catalysis by 3d Transition Metals: Challenges and Opportunities. Chem. Rev. 119, 2453–2523 (2019).
Article CAS PubMed Google Scholar
Funes-Ardoiz, I. & Schoenebeck, F. Established and Emerging Computational Tools to Study Homogeneous Catalysis—From Quantum Mechanics to Machine Learning. Chem 6, 1904–1913 (2020).
Article CAS Google Scholar
Chen, H. et al. On the Mechanism of Homogeneous Pt-Catalysis: A Theoretical View. Coord. Chem. Rev. 437, 213863 (2021).
Article CAS ADS Google Scholar
Durand, D. J. & Fey, N. Building a Toolbox for the Analysis and Prediction of Ligand and Catalyst Effects in Organometallic Catalysis. Acc. Chem. Res. 54, 837–848 (2021).
Article CAS PubMed Google Scholar
Wodrich, M. D., Sawatlon, B., Busch, M. & Corminboeuf, C. The Genesis of Molecular Volcano Plots. Acc. Chem. Res. 54, 1107–1117 (2021).
Article CAS PubMed Google Scholar
Catlow, C. R. A. Concluding Remarks: Reaction Mechanisms in Catalysis: Perspectives and Prospects. Faraday Discuss. 229, 502–513 (2021).
Article CAS PubMed ADS Google Scholar
Lledós, A. Computational Organometallic Catalysis: Where We Are, Where We Are Going. Eur. J. Inorg. Chem. 2021, 2547–2555 (2021).
Article Google Scholar
Laplaza, R., Sobez, J.-G., D. Wodrich, M., Reiher, M. & Corminboeuf, C. The (Not so) Simple Prediction of Enantioselectivity – a Pipeline for High-Fidelity Computations. Chem. Sci. 13, 6858–6864 (2022).
Article CAS PubMed PubMed Central Google Scholar
Wodrich, M. D., Laplaza, R., Cramer, N., Reiher, M. & Corminboeuf, C. Toward in Silico Catalyst Optimization. CHIMIA 77, 139–139 (2023).
Article CAS PubMed Google Scholar
Simm, G. N. & Reiher, M. Context-Driven Exploration of Complex Chemical Reaction Networks. J. Chem. Theory Comput. 13, 6108–6119 (2017).
Article CAS PubMed Google Scholar
Liu, M. et al. Reaction Mechanism Generator v3.0: Advances in Automatic Mechanism Generation. J. Chem. Inf. Model. 61, 2686–2696 (2021).
Article CAS PubMed Google Scholar
Rasmussen, M. H. & Jensen, J. H. Fast and automatic estimation of transition state structures using tight binding quantum chemical calculations. PeerJ Phys. Chem. 2, e15 (2020).
Article Google Scholar
Ingman, V. M., Schaefer, A. J., Andreola, L. R. & Wheeler, S. E. QChASM: Quantum Chemistry Automation and Structure Manipulation. WIREs Comput. Mol. Sci. 11, e1510 (2021).
Article CAS Google Scholar
Bensberg, M. et al. qcscine/Chemoton: Release 3.0.0 https://zenodo.org/record/7928104 (2023).
Bensberg, M. et al. qcscine/Heron: Release 1.0.0 https://zenodo.org/record/7038388 (2022).
Software for Chemical Interaction and Networks (SCINE). https://scine.ethz.ch/. accessed June 2023.
Bensberg, M. et al. qcscine/Puffin: Release 1.2.0 https://zenodo.org/record/7928099 (2023).
Unsleber, J. P. et al. High-Throughput Ab Initio Reaction Mechanism Exploration in the Cloud with Automated Multi-Reference Validation. J. Chem. Phys. 158, 084803 (2023).
Article CAS PubMed ADS Google Scholar
Bensberg, M. & Reiher, M. Concentration-Flux-Steered Mechanism Exploration with an Organocatalysis Application. Isr. J. Chem. 63, e202200123 (2023).
Article CAS Google Scholar
Grimmel, S. A. & Reiher, M. The Electrostatic Potential as a Descriptor for the Protonation Propensity in Automated Exploration of Reaction Mechanisms. Faraday Discuss. 220, 443–463 (2019).
Article CAS PubMed ADS Google Scholar
Grimmel, S. A. & Reiher, M. On the Predictive Power of Chemical Concepts. CHIMIA 75, 311–318 (2021).
Article CAS PubMed Google Scholar
Unsleber, J. P. Accelerating Reaction Network Explorations with Automated Reaction Template Extraction and Application. J. Chem. Inf. Model. 63, 3392–3403 (2023).
Article CAS PubMed PubMed Central Google Scholar
Aspuru-Guzik, A., Lindh, R. & Reiher, M. The Matter Simulation (R)Evolution. ACS Cent. Sci. 4, 144–152 (2018).
Article CAS PubMed PubMed Central Google Scholar
Schwaller, P. et al. Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction. ACS Cent. Sci. 5, 1572–1583 (2019).
Article CAS PubMed PubMed Central Google Scholar
Hocky, G. M. & White, A. D. Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching. Digital Discov. 1, 79–83 (2022).
Article Google Scholar
Bran, A. M. et al. Augmenting large language models with chemistry tools. In NeurIPS 2023 AI for Science Workshop https://openreview.net/forum?id=wdGIL6lx3l (2023).
Choudhary, K. & Kelley, M. L. ChemNLP: a natural language-processing-based library for materials chemistry text data. J. Phys. Chem. C. 127, 17545–17555 (2023).
Article CAS Google Scholar
Copilot in Azure Quantum. https://quantum.microsoft.com/en-us/experience/quantum-elements. accessed July 2023.
Kurtzer, G. M., Sochat, V. & Bauer, M. W. Singularity: Scientific Containers for Mobility of Compute. PLoS One 12, e0177459 (2017).
Article PubMed PubMed Central Google Scholar
Kurtzer, G. M. et al. hpcng/Singularity: Singularity 3.7.3 https://zenodo.org/record/4667718 (2021).
Steiner, M. & Reiher, M. Data Set for the Journal Article ’Navigating chemical reaction space with a steering wheel’ https://doi.org/10.5281/zenodo.8010372 (2024).
Bensberg, M. et al. qcscine/Database: Release 1.2.0 https://zenodo.org/record/7928096 (2023).
Young, J. F., Osborn, J. A., Jardine, F. H. & Wilkinson, G. Hydride Intermediates in Homogeneous Hydrogenation Reactions of Olefins and Acetylenes Using Rhodium Catalysts. Chem. Commun. (London) 131–132 https://doi.org/10.1039/C19650000131 (1965).
Halpern, J. Mechanistic Aspects of Homogeneous Catalytic Hydrogenation and Related Processes. Inorg. Chim. Acta 50, 11–19 (1981).
Article CAS Google Scholar
Brown, J. M., Chaloner, P. A. & Morris, G. A. The Catalytic Resting State of Asymmetric Homogeneous Hydrogenation. Exchange Processes Delineated by Nuclear Magnetic Resonance Saturation-Transfer (DANTE) Techniques. J. Chem. Soc., Perkin Trans. 2, 1583–1588 (1987).
Article Google Scholar
Dedieu, A. Hydrogenation of Olefins Catalyzed by the Chlorotris (Triphenylphosphine) Rhodium (I) Complex. A Theoretical Study of the Structural Aspects. Inorg. Chem. 19, 375–383 (1980).
Article CAS Google Scholar
Koga, N., Daniel, C., Han, J., Fu, X. Y. & Morokuma, K. Potential Energy Profile of a Full Catalytic Cycle of Olefin Hydrogenation by the Wilkinson Catalyst. J. Am. Chem. Soc. 109, 3455–3456 (1987).
Article CAS Google Scholar
Daniel, C., Koga, N., Han, J., Fu, X. Y. & Morokuma, K. Ab initio MO study of the full catalytic cycle of olefin hydrogenation by the Wilkinson catalyst RhCl(PR₃)₃. J. Am. Chem. Soc. 110, 3773–3787 (1988).
Article CAS Google Scholar
Torrent, M., Solà, M. & Frenking, G. Theoretical Studies of Some Transition-Metal-Mediated Reactions of Industrial and Synthetic Importance. Chem. Rev. 100, 439–494 (2000).
Article CAS PubMed Google Scholar
Staub, R., Gantzer, P., Harabuchi, Y., Maeda, S. & Varnek, A. Challenges for Kinetics Predictions via Neural Network Potentials: A Wilkinson’s Catalyst Case. Molecules 28, 4477 (2023).
Article CAS PubMed PubMed Central Google Scholar
Wink, D. A. & Ford, P. C. Reaction Dynamics of the Tricoordinate Intermediates MCl(PPh₃)₂ (M = Rh or Ir) as Probed by the Flash Photolysis of the Carbonyls MCl(CO)(PPh₃)₂. J. Am. Chem. Soc. 109, 436–442 (1987).
Article CAS Google Scholar
Dedieu, A. & Strich, A. A Molecular Orbital Analysis of the Oxidative Addition of Hydrogen to the Chlorotris(Triphenylphosphine)Rhodium(I) Complex. Inorg. Chem. 18, 2940–2943 (1979).
Article CAS Google Scholar
Matsubara, T., Takahashi, R. & Asai, S. ONIOM Study of the Mechanism of Olefin Hydrogenation by the Wilkinson’s Catalyst: Reaction Paths and Energy Surfaces of trans- and cis-Forms. Bull. Chem. Soc. Jpn. 86, 243–254 (2013).
Article CAS Google Scholar
Mayer, I. Charge, Bond Order and Valence in the AB Initio SCF Theory. Chem. Phys. Lett. 97, 270–274 (1983).
Article CAS ADS Google Scholar
Behler, J. Constructing high-dimensional neural network potentials: A tutorial review. Int. J. Quantum Chem. 115, 1032–1050 (2015).
Article CAS Google Scholar
Botu, V., Batra, R., Chapman, J. & Ramprasad, R. Machine learning force fields: Construction, validation, and outlook. J. Phys. Chem. C. 121, 511–522 (2017).
Article CAS Google Scholar
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).
Article PubMed PubMed Central ADS Google Scholar
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
Article CAS PubMed PubMed Central Google Scholar
Glielmo, A., Zeni, C. & De Vita, A. Efficient nonparametric n-body force fields from machine learning. Phys. Rev. B 97, 184307 (2018).
Article CAS ADS Google Scholar
Behler, J. & Csányi, G. Machine Learning Potentials for Extended Systems: A Perspective. Eur. Phys. J. B 94, 142 (2021).
Article CAS ADS Google Scholar
Friederich, P., Häse, F., Proppe, J. & Aspuru-Guzik, A. Machine-Learned Potentials for next-Generation Matter Simulations. Nat. Mater. 20, 750–761 (2021).
Article CAS PubMed ADS Google Scholar
Unke, O. T. et al. Machine Learning Force Fields. Chem. Rev. 121, 10142–10186 (2021).
Article CAS PubMed PubMed Central Google Scholar
Deringer, V. L. et al. Gaussian Process Regression for Materials and Molecules. Chem. Rev. 121, 10073–10141 (2021).
Article CAS PubMed PubMed Central Google Scholar
Musil, F. et al. Physics-Inspired Structural Representations for Molecules and Materials. Chem. Rev. 121, 9759–9815 (2021).
Article CAS PubMed Google Scholar
Chen, C. & Ong, S. P. A Universal Graph Deep Learning Interatomic Potential for the Periodic Table. Nat. Comput. Sci. 2, 718–728 (2022).
Article PubMed Google Scholar
Takamoto, S. et al. Towards Universal Neural Network Potential for Material Discovery Applicable to Arbitrary Combination of 45 Elements. Nat. Commun. 13, 2991 (2022).
Article CAS PubMed PubMed Central ADS Google Scholar
Takamoto, S., Okanohara, D., Li, Q.-J. & Li, J. Towards Universal Neural Network Interatomic Potential. J. Materiomics 9, 447–454 (2023).
Article Google Scholar
Choudhary, K. et al. Unified Graph Neural Network Force-Field for the Periodic Table: Solid State Applications. Digital Discov. 2, 346–355 (2023).
Article CAS Google Scholar
Deng, B. et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 5, 1031–1041 (2023).
Article Google Scholar
Eckhoff, M. & Reiher, M. Lifelong Machine Learning Potentials. J. Chem. Theory Comput. 19, 3509–3525 (2023).
Article CAS PubMed PubMed Central Google Scholar
Jordan, R. F., Bajgur, C. S., Willett, Roger & Scott, Brian Ethylene Polymerization by a Cationic Dicyclopentadienyl Zirconium(IV) Alkyl Complex. J. Am. Chem. Soc. 108, 7410–7411 (1986).
Article CAS Google Scholar
Jordan, R. F. Chemistry of Cationic Dicyclopentadienyl Group 4 Metal-Alky I Complexes. In Advances in Organometallic Chemistry, vol. 32, 325–387 (Elsevier, 1991).
Brintzinger, H. H., Fischer, D., Mülhaupt, R., Rieger, B. & Waymouth, R. M. Stereospecific Olefin Polymerization with Chiral Metallocene Catalysts. Angew. Chem. Int. Ed. 34, 1143–1170 (1995).
Article CAS Google Scholar
Coates, G. W. Precise Control of Polyolefin Stereochemistry Using Single-Site Metal Catalysts. Chem. Rev. 100, 1223–1252 (2000).
Article CAS PubMed Google Scholar
Resconi, L., Cavallo, L., Fait, A. & Piemontesi, F. Selectivity in Propene Polymerization with Metallocene Catalysts. Chem. Rev. 100, 1253–1346 (2000).
Article CAS PubMed Google Scholar
Alt, H. G. & Köppl, A. Effect of the Nature of Metallocene Complexes of Group IV Metals on Their Performance in Catalytic Ethylene and Propylene Polymerization. Chem. Rev. 100, 1205–1222 (2000).
Article CAS PubMed Google Scholar
Rappé, A. K., Skiff, W. M. & Casewit, C. J. Modeling Metal-Catalyzed Olefin Polymerization. Chem. Rev. 100, 1435–1456 (2000).
Article PubMed Google Scholar
Chen, E. Y.-X. & Marks, T. J. Cocatalysts for Metal-Catalyzed Olefin Polymerization: Activators, Activation Processes, and Structure-Activity Relationships. Chem. Rev. 100, 1391–1434 (2000).
Article CAS PubMed Google Scholar
Möhring, P. C. & Coville, N. J. Group 4 Metallocene Polymerisation Catalysts: Quantification of Ring Substituent Steric Effects. Coord. Chem. Rev. 250, 18–35 (2006).
Article Google Scholar
Parveen, R., Cundari, T. R., Younker, J. M., Rodriguez, G. & McCullough, L. DFT and QSAR Studies of Ethylene Polymerization by Zirconocene Catalysts. ACS Catal. 9, 9339–9349 (2019).
Article CAS Google Scholar
xyz2mol. https://github.com/jensengroup/xyz2mol. accessed June 2023.
Kim, Y. & Kim, W. Y. Universal Structure Conversion Method for Organic Molecules: From Atomic Connectivity to Three-Dimensional Geometry. Bull. Korean Chem. Soc. 36, 1769–1777 (2015).
Article CAS Google Scholar
Landrum, G. et al. Rdkit/Rdkit: 2023_03_1 (Q1 2023) Release https://zenodo.org/record/7880616 (2023).
Cossee, P. Ziegler-Natta Catalysis I. Mechanism of Polymerization of α-Olefins with Ziegler-Natta Catalysts. J. Catal. 3, 80–88 (1964).
Article CAS Google Scholar
Arlman, E. J. Ziegler-Natta Catalysis II. Surface Structure of Layer-Lattice Transition Metal Chlorides. J. Catal. 3, 89–98 (1964).
Article CAS Google Scholar
Arlman, E. J. & Cossee, P. Ziegler-Natta Catalysis III. Stereospecific Polymerization of Propene with the Catalyst System TiCl₃-AlEt₃. J. Catal. 3, 99–104 (1964).
Article CAS Google Scholar
Forster, D. On the Mechanism of a Rhodium-Complex-Catalyzed Carbonylation of Methanol to Acetic Acid. J. Am. Chem. Soc. 98, 846–848 (1976).
Article CAS Google Scholar
Dekleva, T. W. & Forster, D. Mechanistic Aspects of Transition-Metal-Catalyzed Alcohol Carbonylations. In Advances in Catalysis, vol. 34, 81–130 (Academic Press, 1986).
Haynes, A., Mann, B. E., Gulliver, D. J., Morris, G. E. & Maitlis, P. M. Direct Observation of MeRh(CO)₂I₃^–, the Key Intermediate in Rhodium-Catalyzed Methanol Carbonylation. J. Am. Chem. Soc. 113, 8567–8569 (1991).
Article CAS Google Scholar
Simm, G. N. & Reiher, M. Error-Controlled Exploration of Chemical Reaction Networks with Gaussian Processes. J. Chem. Theory Comput. 14, 5238–5248 (2018).
Article CAS PubMed Google Scholar
Reiher, M. Molecule-Specific Uncertainty Quantification in Quantum Chemical Studies. Isr. J. Chem. 62, e202100101 (2022).
Article CAS Google Scholar
Coumbarides, G. S., Eames, J. & Weerasooriya, N. A Practical Laboratory Route to the Synthesis of Trideuteriomethyl-[¹³C] Iodide. J. Label. Cpd. Radiopharm. 46, 291–296 (2003).
Article CAS Google Scholar
Griffin, T. R. et al. Theoretical and Experimental Evidence for S_N2 Transition States in Oxidative Addition of Methyl Iodide to Cis-[M(CO)₂I₂]^– (M = Rh, Ir). J. Am. Chem. Soc. 118, 3029–3030 (1996).
Article CAS Google Scholar
Ivanova, E. A., Gisdakis, P., Nasluzov, V. A., Rubailo, A. I. & Rösch, N. Methanol Carbonylation Catalyzed by the Anion of the Complex Dicarbonyldiiodorhodium(I). A Density Functional Study of the Catalytic Cycle. Organometallics 20, 1161–1174 (2001).
Article CAS Google Scholar
Feliz, M., Freixa, Z., van Leeuwen, P. W. N. M. & Bo, C. Revisiting the Methyl Iodide Oxidative Addition to Rhodium Complexes: A DFT Study of the Activation Parameters. Organometallics 24, 5718–5723 (2005).
Article CAS Google Scholar
LiBretto, N. J. et al. Olefin Oligomerization by Main Group Ga³⁺ and Zn²⁺ Single Site Catalysts on SiO₂. Nat. Commun. 12, 2322 (2021).
Article CAS PubMed PubMed Central ADS Google Scholar
Van de Vijver, R. et al. Automatic Mechanism and Kinetic Model Generation for Gas- and Solution-Phase Processes: A Perspective on Best Practices, Recent Advances, and Future Challenges. Int. J. Chem. Kinet. 47, 199–231 (2015).
Article Google Scholar
A. Class, C., Liu, M., G. Vandeputte, A. & H. Green, W. Automatic mechanism generation for pyrolysis of di-tert-butyl sulfide. Phys. Chem. Chem. Phys. 18, 21651–21658 (2016).
Article PubMed Google Scholar
Dana, A. G., Buesser, B., Merchant, S. S. & Green, W. H. Automated Reaction Mechanism Generation Including Nitrogen as a Heteroatom. Int. J. Chem. Kinet. 50, 243–258 (2018).
Article CAS Google Scholar
Chu, T.-C. et al. Modeling of aromatics formation in fuel-rich methane oxy-combustion with an automatically generated pressure-dependent mechanism. Phys. Chem. Chem. Phys. 21, 813–832 (2019).
Article CAS PubMed Google Scholar
Blondal, K. et al. Computer-Generated Kinetics for Coupled Heterogeneous/Homogeneous Systems: A Case Study in Catalytic Combustion of Methane on Platinum. Ind. Eng. Chem. Res. 58, 17682–17691 (2019).
Article CAS Google Scholar
Miller, J. A. et al. Combustion chemistry in the twenty-first century: Developing theory-informed chemical kinetics models. Prog. Energy Combust. Sci. 83, 100886 (2021).
Article Google Scholar
Kreitz, B. et al. Detailed Microkinetics for the Oxidation of Exhaust Gas Emissions through Automated Mechanism Generation. ACS Catal. 12, 11137–11151 (2022).
Article CAS Google Scholar
Ulissi, Z. W. et al. Machine-Learning Methods Enable Exhaustive Searches for Active Bimetallic Facets and Reveal Active Site Motifs for CO₂ Reduction. ACS Catal. 7, 6600–6608 (2017).
Article CAS Google Scholar
Tran, K. & Ulissi, Z. W. Active Learning across Intermetallics to Guide Discovery of Electrocatalysts for CO₂ Reduction and H₂ Evolution. Nat. Catal. 1, 696–703 (2018).
Article CAS Google Scholar
Back, S., Tran, K. & Ulissi, Z. W. Toward a Design of Active Oxygen Evolution Catalysts: Insights from Automated Density Functional Theory Calculations and Machine Learning. ACS Catal. 9, 7651–7659 (2019).
Article CAS Google Scholar
Back, S., Na, J., Tran, K. & Ulissi, Z. W. In Silico Discovery of Active, Stable, CO-Tolerant and Cost-Effective Electrocatalysts for Hydrogen Evolution and Oxidation. Phys. Chem. Chem. Phys. 22, 19454–19458 (2020).
Article CAS PubMed Google Scholar
Ulissi, Z. W., Medford, A. J., Bligaard, T. & Nørskov, J. K. To Address Surface Reaction Network Complexity Using Scaling Relations Machine Learning and DFT Calculations. Nat. Commun. 8, 14621 (2017).
Article PubMed PubMed Central ADS Google Scholar
Goldsmith, C. F. & West, R. H. Automatic Generation of Microkinetic Mechanisms for Heterogeneous Catalysis. J. Phys. Chem. C. 121, 9970–9981 (2017).
Article CAS Google Scholar
Dana, A. G. et al. Automated reaction kinetics and network exploration (Arkane): A statistical mechanics, thermodynamics, transition state theory, and master equation software. Int. J. Chem. Kinet. 55, 300–323 (2023).
Article CAS Google Scholar
Kreitz, B. et al. Automated Generation of Microkinetics for Heterogeneously Catalyzed Reactions Considering Correlated Uncertainties**. Angew. Chem. Int. Ed. 62, e202306514 (2023).
Article CAS Google Scholar
Tran, R. et al. Surface Energies of Elemental Crystals. Sci. Data 3, 160080 (2016).
Article CAS PubMed PubMed Central Google Scholar
Montoya, J. H. & Persson, K. A. A High-Throughput Framework for Determining Adsorption Energies on Solid Surfaces. npj Comput. Mater. 3, 1–4 (2017).
Article CAS Google Scholar
Boes, J. R., Mamun, O., Winther, K. & Bligaard, T. Graph Theory Approach to High-Throughput Surface Adsorption Structure Generation. J. Phys. Chem. A 123, 2281–2285 (2019).
Article CAS PubMed Google Scholar
Deshpande, S., Maxson, T. & Greeley, J. Graph Theory Approach to Determine Configurations of Multidentate and High Coverage Adsorbates for Heterogeneous Catalysis. npj Comput. Mater. 6, 1–6 (2020).
Article Google Scholar
Andriuc, O., Siron, M., Montoya, J. H., Horton, M. & Persson, K. A. Automated Adsorption Workflow for Semiconductor Surfaces and the Application to Zinc Telluride. J. Chem. Inf. Model. 61, 3908–3916 (2021).
Article CAS PubMed Google Scholar
Martí, C. et al. DockOnSurf: A Python Code for the High-Throughput Screening of Flexible Molecules Adsorbed on Surfaces. J. Chem. Inf. Model. 61, 3386–3396 (2021).
Article PubMed Google Scholar
Chanussot, L. et al. Open Catalyst 2020 (OC20) Dataset and Community Challenges. ACS Catal. 11, 6059–6072 (2021).
Article CAS Google Scholar
Tran, R. et al. The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts. ACS Catal. 13, 3066–3084 (2023).
Article CAS Google Scholar
Kreitz, B., Blöndal, K., Badger, K., H. West, R. & Franklin Goldsmith, C. Automatic mechanism generation involving kinetics of surface reactions with bidentate adsorbates. Digital Discov. 3, 173–185 (2024).
Article CAS Google Scholar
Iwasa, T. et al. Combined Automated Reaction Pathway Searches and Sparse Modeling Analysis for Catalytic Properties of Lowest Energy Twins of Cu₁₃. J. Phys. Chem. A 123, 210–217 (2018).
Article PubMed Google Scholar
Maeda, S., Sugiyama, K., Sumiya, Y., Takagi, M. & Saita, K. Global Reaction Route Mapping for Surface Adsorbed Molecules: A Case Study for H₂O on Cu(111) Surface. Chem. Lett. 47, 396–399 (2018).
Article CAS Google Scholar
Sugiyama, K., Sumiya, Y., Takagi, M., Saita, K. & Maeda, S. Understanding CO Oxidation on the Pt(111) Surface Based on a Reaction Route Network. Phys. Chem. Chem. Phys. 21, 14366–14375 (2019).
Article CAS PubMed Google Scholar
Sugiyama, K., Saita, K. & Maeda, S. A reaction route network for methanol decomposition on a Pt(111) surface. J. Comput. Chem. 42, 2163–2169 (2021).
Article CAS PubMed Google Scholar
Jafari, M. & Zimmerman, P. M. Reliable and Efficient Reaction Path and Transition State Finding for Surface Reactions with the Growing String Method. J. Comput. Chem. 38, 645–658 (2017).
Article CAS PubMed Google Scholar
Jafari, M. & Zimmerman, P. M. Uncovering Reaction Sequences on Surfaces through Graphical Methods. Phys. Chem. Chem. Phys. 20, 7721–7729 (2018).
Article CAS PubMed Google Scholar
Ma, S., Huang, S.-D. & Liu, Z.-P. Dynamic Coordination of Cations and Catalytic Selectivity on Zinc-Chromium Oxide Alloys during Syngas Conversion. Nat. Catal. 2, 671–677 (2019).
Article CAS Google Scholar
Ma, S., Shang, C. & Liu, Z.-P. Heterogeneous Catalysis from Structure to Activity via SSW-NN Method. J. Chem. Phys. 151, 050901 (2019).
Article ADS Google Scholar
lin Kang, P., Shang, C. & pan Liu, Z. Recent implementations in LASP 3.0: Global neural network potential with multiple elements and better long-range description. Chin. J. Chem. Phys. 34, 583–590 (2021).
Article Google Scholar
Chen, D., Shang, C. & Liu, Z.-P. Machine-learning atomic simulation for heterogeneous catalysis. npj Comput. Mater. 9, 1–9 (2023).
Article CAS ADS Google Scholar
Zhao, Q., Xu, Y., Greeley, J. & Savoie, B. M. Deep Reaction Network Exploration at a Heterogeneous Catalytic Interface. Nat. Commun. 13, 4860 (2022).
Article CAS PubMed PubMed Central ADS Google Scholar
Roggero, I., Civalleri, B. & Ugliengo, P. Modeling Physisorption with the ONIOM Method: The Case of NH₃ at the Isolated Hydroxyl Group of the Silica Surface. Chem. Phys. Lett. 341, 625–632 (2001).
Article CAS ADS Google Scholar
Xu, Y., LiBretto, N. J., Zhang, G., Miller, J. T. & Greeley, J. First-Principles Analysis of Ethylene Oligomerization on Single-Site Ga³⁺ Catalysts Supported on Amorphous Silica. ACS Catal. 12, 5416–5424 (2022).
Article CAS Google Scholar
Sobez, J.-G. & Reiher, M. Molassembler: Molecular Graph Construction, Modification, and Conformer Generation for Inorganic and Organic Molecules. J. Chem. Inf. Model. 60, 3884–3900 (2020).
Article CAS PubMed Google Scholar
Bensberg, M. et al. qcscine/Molassembler: Release 2.0.0 https://zenodo.org/record/7928074 (2023).
Zhao, Q. YARP reaction database https://doi.org/10.6084/m9.figshare.14766624.v7 (2021).
Maeda, S. & Harabuchi, Y. On Benchmarking of Automated Methods for Performing Exhaustive Reaction Path Search. J. Chem. Theory Comput. 15, 2111–2115 (2019).
Article CAS PubMed Google Scholar
Simm, G. N., Türtscher, P. L. & Reiher, M. Systematic Microsolvation Approach with a Cluster-Continuum Scheme and Conformational Sampling. J. Comput. Chem. 41, 1144–1155 (2020).
Article CAS PubMed Google Scholar
Steiner, M., Holzknecht, T., Schauperl, M. & Podewitz, M. Quantum Chemical Microsolvation by Automated Water Placement. Molecules 26, 1793 (2021).
Article CAS PubMed PubMed Central Google Scholar
Spicher, S., Plett, C., Pracht, P., Hansen, A. & Grimme, S. Automated Molecular Cluster Growing for Explicit Solvation by Efficient Force Field and Tight Binding Methods. J. Chem. Theory Comput. 18, 3174–3189 (2022).
Article CAS PubMed Google Scholar
Bensberg, M., Türtscher, P. L., Unsleber, J. P., Reiher, M. & Neugebauer, J. Solvation Free Energies in Subsystem Density Functional Theory. J. Chem. Theory Comput. 18, 723–740 (2022).
Article CAS PubMed Google Scholar
Friedrich, N.-O. et al. Conformator: A Novel Method for the Generation of Conformer Ensembles. J. Chem. Inf. Model. 59, 731–742 (2019).
Article CAS PubMed Google Scholar
Pracht, P., Bohle, F. & Grimme, S. Automated Exploration of the Low-Energy Chemical Space with Fast Quantum Chemical Methods. Phys. Chem. Chem. Phys. 22, 7169–7192 (2020).
Article CAS PubMed Google Scholar
Talmazan, R. A. & Podewitz, M. PyConSolv: A Python Package for Conformer Generation of (Metal-Containing) Systems in Explicit Solvent. J. Chem. Inf. Model. 63, 5400–5407 (2023).
Article CAS PubMed PubMed Central Google Scholar
Toniato, A. et al. Quantum Chemical Data Generation as Fill-in for Reliability Enhancement of Machine-Learning Reaction and Retrosynthesis Planning. Digital Discov. 2, 663–673 (2023).
Article CAS Google Scholar
Proppe, J., Husch, T., Simm, G. N. & Reiher, M. Uncertainty Quantification for Quantum Chemical Models of Complex Reaction Networks. Faraday Discuss. 195, 497–520 (2017).
Article ADS Google Scholar
Proppe, J. & Reiher, M. Mechanism Deduction from Noisy Chemical Reaction Networks. J. Chem. Theory Comput. 15, 357–370 (2019).
Article CAS PubMed Google Scholar
Motagamwala, A. H. & Dumesic, J. A. Microkinetic Modeling: A Tool for Rational Catalyst Design. Chem. Rev. 121, 1049–1076 (2021).
Article CAS PubMed Google Scholar
Johnson, M. S., Pang, H.-W., Liu, M. & Green, W. H. Species Selection for Automatic Chemical Kinetic Mechanism Generation https://doi.org/10.26434/chemrxiv-2023-wwrqf (2023).
Johnson, M. S., Pang, H.-W., Payne, A. M. & Green, W. H. ReactionMechanismSimulator.jl: A Modern Approach to Chemical Kinetic Mechanism Simulation and Analysis https://doi.org/10.26434/chemrxiv-2023-tj34t (2023).
Rappoport, D. Statistics and Bias-Free Sampling of Reaction Mechanisms from Reaction Network Models https://doi.org/10.26434/chemrxiv-2023-wltcr-v2 (2023).
Bensberg, M. & Reiher, M. Uncertainty-Aware First-principles Exploration of Chemical Reaction Networks https://arxiv.org/abs/2312.15477v1 (2023).
Baiardi, A. et al. qcscine/Utilities: Release 8.0.0 https://zenodo.org/record/7928050 (2023).
Bensberg, M. et al. qcscine/Readuct: Release 5.0.0 https://zenodo.org/record/7928089 (2023).
Bosia, F. et al. qcscine/Core: Release 5.0.0 https://zenodo.org/record/7928043 (2023).
Csizi, K.-S., Steiner, M. & Reiher, M. Quantum Magnifying Glass for Chemistry at the Nanoscale https://doi.org/10.26434/chemrxiv-2023-t10sc (2023).
Bannwarth, C., Ehlert, S. & Grimme, S. GFN2-xTB—An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput. 15, 1652–1671 (2019).
Article CAS PubMed Google Scholar
Bannwarth, C. et al. Extended Tight-Binding Quantum Chemistry Methods. WIREs Comput. Mol. Sci. 11, e1493 (2021).
Article CAS Google Scholar
Grimmel, S. A., Sobez, J.-G., Steiner, M., Unsleber, J. P. & Reiher, M. qcscine/Xtb_wrapper: Release 2.0.0 https://zenodo.org/record/7928082 (2023).
Perdew, J. P., Ernzerhof, M. & Burke, K. Rationale for Mixing Exact Exchange with Density Functional Approximations. J. Chem. Phys. 105, 9982–9985 (1996).
Article CAS ADS Google Scholar
Perdew, J. P., Burke, K. & Wang, Y. Generalized Gradient Approximation for the Exchange-Correlation Hole of a Many-Electron System. Phys. Rev. B 54, 16533–16539 (1996).
Article CAS ADS Google Scholar
Adamo, C. & Barone, V. Toward Reliable Density Functional Methods without Adjustable Parameters: The PBE0 Model. J. Chem. Phys. 110, 6158–6170 (1999).
Article CAS ADS Google Scholar
Balasubramani, S. G. et al. TURBOMOLE: Modular Program Suite for Ab Initio Quantum-Chemical and Condensed-Matter Simulations. J. Chem. Phys. 152, 184107 (2020).
Article CAS PubMed PubMed Central ADS Google Scholar
Weigend, F. & Ahlrichs, R. Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy. Phys. Chem. Chem. Phys. 7, 3297–3305 (2005).
Article CAS PubMed Google Scholar
Vosko, S. H., Wilk, L. & Nusair, M. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 58, 1200–1211 (1980).
Article CAS ADS Google Scholar
Lee, C., Yang, W. & Parr, R. G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 37, 785–789 (1988).
Article CAS ADS Google Scholar
Becke, A. D. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648–5652 (1993).
Article CAS ADS Google Scholar
Stephens, P. J., Devlin, F. J., Chabalowski, C. F. & Frisch, M. J. Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem. 98, 11623–11627 (1994).
Article CAS Google Scholar
Krishnan, R., Binkley, J. S., Seeger, R. & Pople, J. A. Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions. J. Chem. Phys. 72, 650–654 (1980).
Article CAS ADS Google Scholar
Curtiss, L. A. et al. Extension of Gaussian-2 theory to molecules containing third-row atoms Ga–Kr. J. Chem. Phys. 103, 6104–6113 (1995).
Article CAS ADS Google Scholar
Neese, F. Software update: The ORCA program system—Version 5.0. WIREs Comput. Mol. Sci. 12, e1606 (2022).
Article Google Scholar
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion Correction (DFT-D) for the 94 Elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Article PubMed ADS Google Scholar
Grimme, S., Ehrlich, S. & Goerigk, L. Effect of the Damping Function in Dispersion Corrected Density Functional Theory. J. Comput. Chem. 32, 1456–1465 (2011).
Article CAS PubMed Google Scholar
Weigend, F. Accurate Coulomb-Fitting Basis Sets for H to Rn. Phys. Chem. Chem. Phys. 8, 1057–1065 (2006).
Article CAS PubMed Google Scholar
Klamt, A. & Schüürmann, G. COSMO: A New Approach to Dielectric Screening in Solvents with Explicit Expressions for the Screening Energy and Its Gradient. J. Chem. Soc., Perkin Trans. 2, 799–805 (1993).
Article Google Scholar
Garcia-Ratés, M. & Neese, F. Efficient Implementation of the Analytical Second Derivatives of Hartree–Fock and Hybrid DFT Energies within the Framework of the Conductor-like Polarizable Continuum Model. J. Comput. Chem. 40, 1816–1828 (2019).
Article PubMed Google Scholar
Garcia-Ratés, M. & Neese, F. Effect of the Solute Cavity on the Solvation Energy and Its Derivatives within the Framework of the Gaussian Charge Scheme. J. Comput. Chem. 41, 922–939 (2020).
Article PubMed Google Scholar
Onufriev, A., Bashford, D. & Case, D. A. Exploring Protein Native States and Large-Scale Conformational Changes with a Modified Generalized Born Model. Proteins Struct. Funct. Bioinf. 55, 383–394 (2004).
Article CAS Google Scholar
Sigalov, G., Fenley, A. & Onufriev, A. Analytical Electrostatics for Biomolecules: Beyond the Generalized Born Approximation. J. Chem. Phys. 124, 124902 (2006).
Article PubMed ADS Google Scholar
Lange, A. W. & Herbert, J. M. Improving Generalized Born Models by Exploiting Connections to Polarizable Continuum Models. I. An Improved Effective Coulomb Operator. J. Chem. Theory Comput. 8, 1999–2011 (2012).
Article CAS PubMed Google Scholar
Ehlert, S., Stahn, M., Spicher, S. & Grimme, S. Robust and Efficient Implicit Solvation Model for Fast Semiempirical Methods. J. Chem. Theory Comput. 17, 4250–4261 (2021).
Article CAS PubMed Google Scholar
Savitzky, A. & Golay, M. J. E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Anal. Chem. 36, 1627–1639 (1964).
Article CAS ADS Google Scholar
Azure Quantum Elements. https://quantum.microsoft.com/en-us/our-story/quantum-elements-overview. accessed July 2023.
Azure Quantum June Event: Accelerating scientific discovery. https://news.microsoft.com/azure-quantum-june-event/. accessed July 2023.

Download references

Acknowledgements

This publication was created as part of NCCR Catalysis, a National Centre of Competence in Research funded by the Swiss National Science Foundation (grant number 180544). MS gratefully acknowledges a Swiss Government Excellence Scholarship for Foreign Scholars and Artists (2020.0047).

Author information

Authors and Affiliations

ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
Miguel Steiner & Markus Reiher
ETH Zurich, NCCR Catalysis, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
Miguel Steiner & Markus Reiher

Authors

Miguel Steiner
View author publications
You can also search for this author in PubMed Google Scholar
Markus Reiher
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

The project was conceived by both authors. M.S. wrote the software and carried out the calculations. Both authors analyzed the results and prepared the manuscript. Both authors acquired funding. M.R. acquired the computing resources and supervised the project. M.R. is the corresponding author.

Corresponding author

Correspondence to Markus Reiher.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Robert Pollice, Wenbin Xu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Steiner, M., Reiher, M. A human-machine interface for automatic exploration of chemical reaction networks. Nat Commun 15, 3680 (2024). https://doi.org/10.1038/s41467-024-47997-9

Download citation

Received: 31 August 2023
Accepted: 15 April 2024
Published: 01 May 2024
DOI: https://doi.org/10.1038/s41467-024-47997-9

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.