Machine-learning approach mines unpublished 'dark' reactions that don't work, as well as ones that do.
Did your experiment fail? Don’t bin the data just yet — they could be useful. Chemists in the United States say that they have created a machine-learning algorithm that beats humans at predicting ways to make crystals, by training it on data both from successful experiments and from trials that didn’t work. The team terms these failures ‘dark reactions’, because they are either never written down or are recorded only privately in laboratory notebooks.
“Failed reactions contain a vast amount of unreported and unextracted information,” says Alex Norquist, a materials-synthesis researcher at Haverford College in Pennsylvania, who is part of the team that has reported the work in Nature¹. “There are far more failures than successes, but only the successes generally get published.”
The work “shows a great example of what can be done by mining scientific experience — to start unravelling the ‘dark magic’ of synthesis”, says Kristin Persson, a materials chemist at the Lawrence Berkeley National Laboratory in California. She leads an initiative called the Materials Project that gathers information on known materials to aid the design and synthesis of new ones.
Learning from dark data
Several researchers are creating algorithms that learn from past experiments how to make new molecules, with the idea that computers might be able to glean patterns from reaction data more effectively than a human can². The idea has been pursued for years in drug synthesis to find the most efficient route for making a complex molecule in multiple reaction steps.
The Haverford team of materials scientists, co-led by Norquist, Sorelle Friedler and Joshua Schrier, set themselves a slightly simpler goal: to predict whether a particular set of reagents will, when mixed in a solvent and heated, produce a crystalline material.
To narrow down their task further, they looked only at materials called templated vanadium selenites: compounds of vanadium, selenium and oxygen, in which small organic molecules, such as amines, guide (or 'template') the arrangement of the elements. (These particular crystals don't currently have a commercial use, but are being studied for their unusual interactions with light.)
The researchers adopted a standard machine-learning approach. They trained an algorithm on data from almost 4,000 attempts to make the crystals under different reaction conditions (such as temperature, concentration, reactant quantity and acidity). That work included transcribing information on dark, failed reactions from the team’s archived lab notebooks into a format that a machine could analyse. Then they asked the computer to pick out principles that separated successful experiments from failures.
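The value of including failures can be seen in a toy version of that last step. The sketch below is not the team's algorithm (the article does not say which model they used); it is a minimal one-feature threshold learner, assuming hypothetical field names and invented numbers that carry no chemical meaning. It looks for the single cut-off on one reaction condition that best separates successes from failures, which is only possible because the failed runs are in the data set.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    # Hypothetical record layout; the field names are assumptions.
    temperature_c: float
    ph: float
    crystallised: bool  # False marks a 'dark', failed reaction


def best_threshold(reactions, feature):
    """Find the cut on `feature` that best separates successes from failures.

    Returns (threshold, accuracy, above_means_success).
    """
    values = sorted({getattr(r, feature) for r in reactions})
    # Candidate cuts are the midpoints between consecutive observed values.
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for cut in cuts:
        for above_means_success in (True, False):
            correct = sum(
                ((getattr(r, feature) > cut) == above_means_success)
                == r.crystallised
                for r in reactions
            )
            accuracy = correct / len(reactions)
            if best is None or accuracy > best[1]:
                best = (cut, accuracy, above_means_success)
    return best


# Invented data for illustration only.
reactions = [
    Reaction(90, 2.0, False),
    Reaction(110, 3.0, False),
    Reaction(110, 3.5, False),
    Reaction(90, 4.5, True),
    Reaction(110, 5.0, True),
    Reaction(90, 6.0, True),
]

for feature in ("ph", "temperature_c"):
    print(feature, best_threshold(reactions, feature))
```

In this invented data set the acidity column separates the outcomes perfectly and the temperature column does not. A real model would weigh many conditions at once and would be checked on reactions it had not seen.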
To test the algorithm, the team picked out previously untried combinations of reactants and tried to guess the best processing conditions for making selenite materials. The reaction conditions suggested by the algorithm generated a crystalline product in 89% of around 500 cases. By comparison, using intuition and rules of thumb developed from more than ten accumulated years of experience with the materials, the researchers' own best guesses were successful only 78% of the time.
The algorithm doesn’t make clear its reasoning, so the chemists converted the results into handy rules of thumb, which they rendered as an intuitive ‘decision tree’ that scientists can use for guidance in the lab. It involves questions such as 'Is sodium present?' and 'Is the pH greater or less than 3?'.
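A decision tree of this kind is a chain of yes/no questions that ends in a recommendation. The sketch below shows only the data structure, using the two questions quoted above. The order of the questions and the outcomes at the leaves are placeholders, not the published tree, and the dictionary keys `sodium` and `ph` are assumed names.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    advice: str


@dataclass
class Question:
    text: str
    test: Callable[[dict], bool]
    if_yes: "Node"
    if_no: "Node"


Node = Union[Leaf, Question]


def walk(node, conditions):
    """Follow the tree for one set of conditions.

    Returns the advice at the leaf and the list of (question, answer) pairs.
    """
    path = []
    while isinstance(node, Question):
        answer = node.test(conditions)
        path.append((node.text, answer))
        node = node.if_yes if answer else node.if_no
    return node.advice, path


# Placeholder tree: the leaf outcomes are NOT taken from the paper.
tree = Question(
    "Is sodium present?",
    lambda c: c["sodium"],
    Leaf("placeholder outcome A"),
    Question(
        "Is the pH greater than 3?",
        lambda c: c["ph"] > 3,
        Leaf("placeholder outcome B"),
        Leaf("placeholder outcome C"),
    ),
)

advice, path = walk(tree, {"sodium": False, "ph": 2.0})
print(advice)
for question, answer in path:
    print(question, "yes" if answer else "no")
```

Recording the path as well as the final advice keeps the reasoning visible, which is the point of turning an opaque model into a tree that a chemist can read.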
The team has set up a website, called the Dark Reactions Project, to encourage others to share their own failed attempts to make new crystals in a machine-readable format. One barrier to sharing is that other chemists' data might not take the same form as the team's own, Norquist says, but the researchers hope to be able to adjust the interface of their site to “accommodate the idiosyncrasies of others’ data”.
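What 'machine-readable' might mean in practice can be sketched with a small JSON record and a loader that rejects incomplete entries. This is an illustration only: the field names, the allowed outcome labels and the sample values are assumptions, not the format used by the Dark Reactions Project.

```python
import json

# Hypothetical minimum set of fields for one reaction attempt.
REQUIRED = {"reagents", "temperature_c", "ph", "outcome"}
OUTCOMES = {"crystal", "no_crystal"}


def load_reaction(text):
    """Parse one JSON reaction record and check that it is complete."""
    record = json.loads(text)
    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if record["outcome"] not in OUTCOMES:
        raise ValueError(f"unknown outcome: {record['outcome']!r}")
    return record


# Invented example of a failed attempt, the kind that usually stays in a notebook.
sample = json.dumps({
    "reagents": [
        {"role": "vanadium source", "amount_mmol": 1.0},
        {"role": "selenium source", "amount_mmol": 2.0},
        {"role": "amine template", "amount_mmol": 1.5},
    ],
    "temperature_c": 110,
    "ph": 2.5,
    "outcome": "no_crystal",
})

record = load_reaction(sample)
print(record["outcome"], len(record["reagents"]))
```

Making the outcome a required field is the design choice that matters here: a failed run is recorded with the same completeness as a successful one.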
"The planning and development of such tools is essential if we are to eventually make full use of our 'failed' experiments," adds Richard Cooper, a crystallographer at Oxford University, UK.
1. Raccuglia, P. et al. Nature http://dx.doi.org/10.1038/nature17439 (2016).
2. Fialkowski, M. et al. Angew. Chem. Int. Ed. 44, 7263-7269 (2005).
Cite this article
Ball, P. Computer gleans chemical insight from lab notebook failures. Nature (2016). https://doi.org/10.1038/nature.2016.19866