Meeting aims to discuss tools that allow the safe exchange of chemical information.
Can researchers share relevant information on chemical compounds so they can test drug-discovery models and toxicity-prediction programmes without revealing structures to rivals? A meeting this month in San Diego of two divisions of the American Chemical Society, Chemical Information (CINF) and Computers in Chemistry (COMP), aims to address this controversial question (http://oasys2.confex.com/acs/229nm/techprogram/).
“The pharmaceutical industry and academia want to share information, but proprietary and legal considerations mean that this cannot be done easily if there is a risk that chemical-structure information might be released,” says Christopher Lipinski, Adjunct Senior Research Fellow at Pfizer Global R&D and co-chairman of the meeting. “We need an uncrackable system that lets information-poor academia gain information for testing its models and techniques, and allows industry to tap into academic expertise without compromising its intellectual property.”
Quantitative structure–property relationship (QSPR) models rely on the quality and quantity of the experimental data they use, but the proprietary nature of industry data means that public-domain QSPR models rarely achieve high-quality predictions. “This communication gap has created a cultural division between academic science and the industrial sector,” says co-chairman Tudor Oprea, Professor and Director, Biocomputing at the University of New Mexico School of Medicine.
Some studies suggest that safe sharing would be impossible. For instance, Robert Pearlman and colleagues at the University of Texas have shown how software can deduce a chemical structure from a compound's descriptors, such as molecular mass.
Jean-Loup Faulon and colleagues at Sandia National Laboratories in California have shown that descriptors of molecular fragments might be used to 'reverse engineer' chemical structures. Dave Weininger and John Bradshaw from Daylight, a California-based chemical informatics company, use genetic algorithms that can 'guess' structures from chemical fingerprints in their databases.
Nevertheless, many researchers believe safe sharing is possible. Ruben Abagyan and colleagues at The Scripps Research Institute, La Jolla, have shown that adding artificial noise to data can mask structures. For example, knowing a compound's molecular mass to four decimal places could be enough to obtain a molecular formula, but coarsening the value so that it is known only to within ten daltons precludes this.
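The molecular-mass example can be made concrete with a small sketch. It is not the method used by Abagyan's group; it is an illustrative brute-force search, with hypothetical function names and atom-count limits, that counts the C/H/N/O formulas compatible with a mass at two different tolerances. It ignores valence rules and other elements, so the absolute counts mean little; only the contrast between the two tolerances matters.

```python
from itertools import product

# Monoisotopic masses in daltons for the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def formula_mass(c, h, n, o):
    """Monoisotopic mass of the formula C_c H_h N_n O_o."""
    return (c * MONOISOTOPIC["C"] + h * MONOISOTOPIC["H"]
            + n * MONOISOTOPIC["N"] + o * MONOISOTOPIC["O"])


def candidate_formulas(target, tol, max_c=15, max_h=32, max_n=6, max_o=8):
    """List every (c, h, n, o) whose mass lies within tol of target.

    The atom-count limits are arbitrary assumptions for this sketch,
    and no chemical-validity (valence) check is applied.
    """
    hits = []
    for c, n, o in product(range(max_c + 1), range(max_n + 1), range(max_o + 1)):
        for h in range(max_h + 1):
            if abs(formula_mass(c, h, n, o) - target) <= tol:
                hits.append((c, h, n, o))
    return hits


# Example compound: C9H8O4 (the formula of aspirin).
exact = formula_mass(9, 8, 0, 4)
tight = candidate_formulas(exact, tol=0.0005)  # mass known to ~4 decimals
loose = candidate_formulas(exact, tol=5.0)     # mass known only to ~10 Da
print(len(tight), "candidate formulas at high precision")
print(len(loose), "candidate formulas at low precision")
```

At the tight tolerance the true formula sits in a short candidate list, whereas at the loose tolerance it is buried among many alternatives, which is the masking effect described above.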
Alexandru Balaban of Texas A&M University, Galveston, has used a similar approach based on topology to produce chemical identifiers that contain less information about structure. Even at lower accuracy, some druggability tools, such as Lipinski's 'rule of five', still work. Oprea warns, however, that software could 'model out' such seemingly random rounding of data.
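To illustrate why a coarse filter such as the rule of five tolerates reduced precision, here is a minimal sketch. The thresholds are the commonly quoted ones (molecular mass at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); the helper names and the descriptor values in the example are illustrative assumptions, not data from any of the groups mentioned.

```python
def coarsen(value, step):
    """Round value to the nearest multiple of step (for example, 10 Da)."""
    return round(value / step) * step


def rule_of_five_violations(mol_mass, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        mol_mass > 500,    # molecular mass above 500 Da
        logp > 5,          # calculated logP above 5
        h_donors > 5,      # more than 5 hydrogen-bond donors
        h_acceptors > 10,  # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


# Illustrative (assumed) descriptor values for a small drug-like molecule.
mass, logp, donors, acceptors = 180.16, 1.2, 1, 4

precise = rule_of_five_violations(mass, logp, donors, acceptors)
masked = rule_of_five_violations(coarsen(mass, 10), logp, donors, acceptors)
print(precise, masked)
```

For a compound well inside the thresholds, the verdict is the same whether the mass is given exactly or rounded to the nearest ten daltons. The caveat is that values close to a cut-off can flip: a mass of 503 counts as a violation, but it rounds to 500, which does not.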
An approach called Screens, developed by Nikolay Osadchiy and colleagues at ChemDiv, a chemical compound supplier, describes structural fragments but hides the manner in which they are connected. This can provide molecular diversity information and fill voids in chemical space while keeping structures secret.
Tripos, a supplier of products for chemistry research, suggests that topomers might be the answer. Company CSO Richard Cramer says that these topologically equivalent isomers look and behave similarly but not uniquely, so they could be used for druggability tests.
A related approach from Anthony Nicholls of Santa Fe company OpenEye Scientific Software, a producer of software for structure-based drug design, and Andrew Grant of AstraZeneca relies on the fact that different compounds can have a similar shape and electrostatic properties. These are key but not unique descriptors for druggability studies.
Oprea suggests that VolSurf — a descriptor system for pharmacokinetics and toxicity developed by Molecular Discovery, a UK-based software company — cannot be reverse engineered. Key molecular properties, such as the hydrophobic and polar surface areas, are reduced from thousands to 92 descriptors, but these still encode chemical information relevant to QSPR models.
The success of chemical masking techniques hinges on their integrity; any leaks and the system will inevitably fail. But Lipinski argues that success offers tremendous value to companies. “Software developers and academics would like to get their hands on the information,” he says. “If they can share information without revealing structures, then useful research can be done.”
Bradley, D. Share and share alike. Nat Rev Drug Discov 4, 180 (2005). https://doi.org/10.1038/nrd1683