Cheminformatics is the use of computational and informational techniques to address problems in chemistry, for instance the in silico mapping of chemical space – the theoretical space occupied by all possible chemicals and molecules. Cheminformatics strategies are useful in drug discovery and other efforts where large numbers of compounds are evaluated for specific properties.

    The number of chemical compounds and associated experimental data in public databases is growing, but presently there is no simple way to access these data in a quick and synoptic manner. Instead, data are fragmented across different resources, and interested parties need to invest valuable time and effort to navigate these systems.

    Scientists have combined functional and computational analysis to predict the substrate specificity of a family of glycosyltransferases from Arabidopsis thaliana, creating a tool that enables researchers to classify the donor and acceptor specificity of glycosyltransferase enzymes.

    Selecting compounds for the chemical library is the foundation of high-throughput screening (HTS). Even after years of use in multiple HTS campaigns, many molecules in the Novartis and NIH Molecular Libraries Program screening collections have never been found to be active. An in-depth exploration of the bioactivity of this 'dark matter' does in fact reveal some compounds of interest.

    How complex is it to synthesize a given molecular target? Can this be answered by a computer? Now, a model of synthetic complexity that factors in methodology developments has resulted in a complexity index that evolves alongside them.

