Cheminformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Accurate prediction of solubility is a challenge for traditional computational approaches because of the complex nature of the phenomena involved. Here the authors report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning and computational chemistry.

    • Samuel Boobier, David R. J. Hose & Bao N. Nguyen
  • Article
    | Open Access

    Developing algorithms that predict reactants and reagents for a given target molecule is key to accelerating retrosynthesis approaches. Here the authors demonstrate that applying augmentation techniques to the SMILES representation of target data significantly improves the quality of the reaction predictions.

    • Igor V. Tetko, Pavel Karpov & Guillaume Godin
  • Article
    | Open Access

    Organic reactions can readily be learned by deep learning models; however, stereochemistry is still a challenge. Here, the authors fine-tune a general model using a small dataset, then predict and experimentally validate regio- and stereoselectivity for various carbohydrate transformations.

    • Giorgio Pesciullesi, Philippe Schwaller & Jean-Louis Reymond
  • Article
    | Open Access

    Extracting experimental operations for chemical synthesis from procedures reported in prose is a tedious task. Here the authors develop a deep-learning model based on the transformer architecture to translate experimental procedures from the field of organic chemistry into synthesis actions.

    • Alain C. Vaucher, Federico Zipoli & Teodoro Laino
  • Article
    | Open Access

    The choice of molecular representation can severely impact the performance of machine-learning methods. Here the authors demonstrate a persistent homology-based molecular representation through an active-learning approach for predicting CO2/N2 interaction energies at the density functional theory (DFT) level.

    • Jacob Townsend, Cassie Putman Micucci & Konstantinos D. Vogiatzis
  • Article
    | Open Access

    Bond dissociation enthalpies are key quantities in determining chemical reactivity, but computing them with quantum mechanical methods is highly demanding. Here the authors develop a machine learning approach to calculate accurate dissociation enthalpies for organic molecules with sub-second computational cost.

    • Peter C. St. John, Yanfei Guan & Robert S. Paton
  • Article
    | Open Access

    Identifying kinases responsible for specific phosphorylation events remains challenging. Here, the authors leverage kinase inhibitor profiles for the identification of kinase-substrate site pairs in cell extracts, developing a method that can identify the enzymes responsible for unassigned phosphorylation events.

    • Nikolaus A. Watson, Tyrell N. Cartwright & Jonathan M. G. Higgins
  • Article
    | Open Access

    Using machine learning to identify small molecules by predicting their retention times has so far been challenging. Here the authors combine a large database of liquid chromatography retention times with a deep learning approach to enable accurate metabolite identification.

    • Xavier Domingo-Almenara, Carlos Guijas & Gary Siuzdak
  • Article
    | Open Access

    Derivatization of natural products is a powerful approach to generating new molecules for biological screening. Here, the authors employ C-H oxidation and ring expansion methods to prepare a library of medium-sized ring skeleta, which occupy a unique chemical space according to chemoinformatic analysis.

    • Changgui Zhao, Zhengqing Ye & Weiping Tang
  • Article
    | Open Access

    Mapping atoms across chemical reactions is a challenging computational task. Here the authors show that combining graph theory and combinatorics with expert chemical knowledge makes it possible to map very complex organic reactions.

    • Wojciech Jaworski, Sara Szymkuć & Bartosz A. Grzybowski
  • Article
    | Open Access

    Synthetic chemists develop a "chemical intuition" over years of experience in the lab. Here the authors combine machine learning of (partially) failed experiments with robotic synthesis to capture this intuition used in searching for the optimal synthesis conditions of metal-organic frameworks.

    • Seyed Mohamad Moosavi, Arunraj Chidambaram & Berend Smit
  • Article
    | Open Access

    The incomplete nature and undefined structure of existing catalysis research data have prevented comprehensive knowledge extraction. Here, the authors report a novel meta-analysis method that identifies correlations between a catalyst’s physico-chemical properties and its performance in a particular reaction.

    • Roman Schmack, Alexandra Friedrich & Ralph Kraehnert
  • Article
    | Open Access

    Parasitic nematodes causing onchocerciasis and lymphatic filariasis rely on a bacterial endosymbiont, Wolbachia, which is a validated therapeutic target. Here, Clare et al. perform a high-throughput screen of 1.3 million compounds and identify 5 chemotypes with faster kill rates than existing anti-Wolbachia drugs.

    • Rachel H. Clare, Catherine Bardelle & Stephen A. Ward
  • Article
    | Open Access

    The fast and accurate determination of molecular properties is particularly crucial in drug discovery. Here, the authors employ supervised machine learning to treat differential mobility spectrometry – mass spectrometry data for ten classes of drug candidates and predict several condensed-phase properties.

    • Stephen W. C. Walker, Ahdia Anwar & W. Scott Hopkins
  • Article
    | Open Access

    It is now possible to predict what a chemical smells like from its chemical structure; however, to date this has only been done for a small number of odor descriptors. Here, using natural-language semantic representations, the authors demonstrate prediction of a much wider range of descriptors.

    • E. Darío Gutiérrez, Amit Dhurandhar & Guillermo A. Cecchi
  • Article
    | Open Access

    Sequence-defined macromolecules have a defined chain length and topology and can be used in applications such as antibiotics and data storage. Here the authors develop two algorithms to encode text fragments and QR codes as a collection of oligomers and to reconstruct the original data.

    • Steven Martens, Annelies Landuyt & Filip Du Prez
  • Article
    | Open Access

    Distributing a reaction workload across laboratories can solve chemical problems more efficiently, but it is challenging to develop viable hardware and software. Here, the authors present an internet-connected network of cheap robots that can perform chemical reactions and share outcomes in real time, demonstrating a digitized approach to chemical collaboration.

    • Dario Caramelli, Daniel Salley & Leroy Cronin
  • Article
    | Open Access

    The success of a fluorescent dye as a molecular probe to monitor the intracellular activity of biomolecules depends on its physicochemical characteristics. Here, the authors use a predictive model to identify key features that allow them to design cell-permeable, background-free fluorescent probes.

    • Samira Husen Alamudi, Rudrakanta Satapathy & Young-Tae Chang