Dark chemical matter as a promising starting point for drug lead discovery

Wassermann, Anne Mai; Lounkine, Eugen; Hoepfner, Dominic; Le Goff, Gaelle; King, Frederick J; Studer, Christian; Peltier, John M; Grippo, Melissa L; Prindle, Vivian; Tao, Jianshi; Schuffenhauer, Ansgar; Wallace, Iain M; Chen, Shanni; Krastel, Philipp; Cobos-Correa, Amanda; Parker, Christian N; Davies, John W; Glick, Meir

doi:10.1038/nchembio.1936

Article
Published: 19 October 2015

Anne Mai Wassermann¹,⁴,
Eugen Lounkine¹,
Dominic Hoepfner²,
Gaelle Le Goff¹,
Frederick J King¹,³,
Christian Studer²,
John M Peltier¹,
Melissa L Grippo¹,
Vivian Prindle³,
Jianshi Tao³,
Ansgar Schuffenhauer²,
Iain M Wallace¹,
Shanni Chen¹,
Philipp Krastel²,
Amanda Cobos-Correa²,
Christian N Parker²,
John W Davies¹ &
Meir Glick¹,⁴

Nature Chemical Biology volume 11, pages 958–966 (2015)

Subjects

Cheminformatics

Abstract

High-throughput screening (HTS) is an integral part of early drug discovery. Herein, we focused on those small molecules in a screening collection that have never shown biological activity despite having been exhaustively tested in HTS assays. These compounds are referred to as 'dark chemical matter' (DCM). We quantified DCM, validated it in quality control experiments, described its physicochemical properties and mapped it into chemical space. Through analysis of prospective reporter-gene assay, gene expression and yeast chemogenomics experiments, we evaluated the potential of DCM to show biological activity in future screens. We demonstrated that, despite the apparent lack of activity, occasionally these compounds can result in potent hits with unique activity and clean safety profiles, which makes them valuable starting points for lead optimization efforts. Among the identified DCM hits was a new antifungal chemotype with strong activity against the pathogen Cryptococcus neoformans but little activity at targets relevant to human safety.
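The definition of dark chemical matter given above (compounds tested exhaustively but never active) can be illustrated with a minimal sketch. The record layout, the function name `find_dark_compounds` and the `min_assays` cutoff are assumptions made for illustration; the abstract does not state the threshold the authors used for "exhaustively tested".

```python
from collections import defaultdict


def find_dark_compounds(records, min_assays):
    """Return compounds tested in at least `min_assays` distinct assays
    that were never flagged active.

    `records` is an iterable of (compound_id, assay_id, is_active) tuples.
    Both the layout and the cutoff are hypothetical, not the paper's.
    """
    tested = defaultdict(set)  # compound -> set of assays it was tested in
    ever_active = set()        # compounds with at least one active readout
    for compound_id, assay_id, is_active in records:
        tested[compound_id].add(assay_id)
        if is_active:
            ever_active.add(compound_id)
    return sorted(
        compound
        for compound, assays in tested.items()
        if len(assays) >= min_assays and compound not in ever_active
    )


# Toy data: A is dark, B was active once, C was not tested often enough.
toy_records = [
    ("A", "assay1", False), ("A", "assay2", False), ("A", "assay3", False),
    ("B", "assay1", False), ("B", "assay2", True), ("B", "assay3", False),
    ("C", "assay1", False),
]
dark = find_dark_compounds(toy_records, min_assays=3)
```

The cutoff is a required argument on purpose: how many inactive results count as "dark" is a study-design decision, not something the code should default.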

**Figure 1: Dark matter definition and characterization.**

**Figure 2: Dark matter in chemical space.**

**Figure 3: Hit rates and selectivity.**

**Figure 5: Identification of Hem14 (protoporphyrinogen oxidase) as target for compound 1 and derivatives.**

Accession codes

Accessions

Protein Data Bank

3NKS

Acknowledgements

A.M.W. and G.L.G. were presidential postdoctoral fellows supported by the Education Office of the Novartis Institutes for BioMedical Research. The authors thank M. Schirle, R. Nutiu, S. Reiling and E. Gregori-Puigjané for valuable discussions; T. Aust, O. Galuba and R. Riedl for support with the HIP and follow-up experiments; M. Popov and F. Nigsch for help with data mining; P. Selzer for the cell permeability model; G. Wendel, B. Burakowska and L. Koppes for help with compound management; and R. Guha, J. Bittker and J. Braisted for help with BARD.

Author information

Anne Mai Wassermann & Meir Glick
Present addresses: Pfizer Inc., Cambridge, Massachusetts, USA (A.M.W.); Merck Research Laboratories, Boston, Massachusetts, USA (M.G.).

Authors and Affiliations

Novartis Institutes for BioMedical Research Inc., Cambridge, Massachusetts, USA
Anne Mai Wassermann, Eugen Lounkine, Gaelle Le Goff, Frederick J King, John M Peltier, Melissa L Grippo, Iain M Wallace, Shanni Chen, John W Davies & Meir Glick
Novartis Institutes for BioMedical Research Inc., Basel, Switzerland
Dominic Hoepfner, Christian Studer, Ansgar Schuffenhauer, Philipp Krastel, Amanda Cobos-Correa & Christian N Parker
The Genomics Institute of the Novartis Research Foundation, San Diego, California, USA
Frederick J King, Vivian Prindle & Jianshi Tao

Contributions

A.M.W., E.L., J.W.D. and M.G. conceived the study with contributions from A.S., I.M.W. and C.N.P. A.M.W. carried out the large-scale computational analyses of the Novartis and PubChem HTS assay results. G.L.G. performed the gene expression experiments. F.J.K. directed and analyzed the reporter-gene assay experiments. D.H. directed and analyzed the S. cerevisiae growth inhibition and chemogenomics experiments. C.S. performed S. cerevisiae experiments. J.M.P. and M.L.G. conducted the quality control experiments. J.T. and V.P. designed and performed the antifungal panel experiments. S.C. did safety profiling experiments. P.K. and A.C.-C. supervised the profiling of natural products against the cancer cell line panel. A.M.W., E.L., D.H., J.W.D. and M.G. wrote the manuscript with contributions from all authors that read and discussed the manuscript.

Corresponding authors

Correspondence to Anne Mai Wassermann or Meir Glick.

Ethics declarations

Competing interests

As employees of Novartis, the authors do have a perceived financial conflict of interest.

Supplementary information

Supplementary Text and Figures

Supplementary Results, Supplementary Tables 1–12, Supplementary Note 1 and Supplementary Figures 1–14. (PDF 1946 kb)

Supplementary Data Set 1

PubChem assay identifiers. All PubChem bioassays used in the analysis are reported. If two assay identifiers are listed in the same row, the corresponding PubChem bioassays have been combined because they reported different readouts from the same experiment. (XLS 101 kb)

Supplementary Data Set 2

Compound structures. The file reports InChI keys and SMILES strings for all dark compounds identified in the PubChem data set and a subset (10,355 structures) of the dark compounds in the Novartis data set (due to intellectual property reasons not all structures can be made available). For each compound, the field “set” reports whether the compound was identified as dark chemical matter for the PubChem, Novartis or both data sets. (XLSX 7000 kb)

Supplementary Data Set 3

Quality control results. For 623 compound structures identified as dark chemical matter in the Novartis data set, results from our quality control experiments are reported. Purity, identity, concentration, and comments about how to interpret the observed data for special cases (e.g. highly fluorinated compounds) are given. Compounds are represented by InChI keys and SMILES strings. (XLSX 54 kb)

Supplementary Data Set 4

DCM scaffolds. The data set lists 95 scaffolds that were significantly enriched in the PubChem DCM set. Scaffolds are reported as SMILES strings. For each scaffold, numbers of PubChem DCM and ACT compounds that it represents are reported. (XLSX 12 kb)
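One common way to ask whether a scaffold is over-represented among DCM compounds is a one-sided hypergeometric (Fisher-type) test on the DCM and ACT counts reported per scaffold. The sketch below is an assumption for illustration only: the text here does not say which statistical test or significance threshold the authors applied, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(dcm_with_scaffold, act_with_scaffold, dcm_total, act_total):
    """One-sided hypergeometric tail probability.

    Probability of seeing at least `dcm_with_scaffold` DCM compounds among
    all compounds carrying the scaffold, if scaffold membership were
    independent of the DCM/ACT label. Illustrative only.
    """
    population = dcm_total + act_total
    carriers = dcm_with_scaffold + act_with_scaffold
    upper = min(carriers, dcm_total)
    # math.comb returns 0 when k > n, so impossible splits contribute nothing.
    tail = sum(
        comb(dcm_total, k) * comb(act_total, carriers - k)
        for k in range(dcm_with_scaffold, upper + 1)
    )
    return tail / comb(population, carriers)


# Toy counts: both carriers of the scaffold are dark, out of 2 DCM and 2 ACT.
p_small = enrichment_p_value(2, 0, dcm_total=2, act_total=2)
```

With many scaffolds tested at once, a multiple-testing correction would normally be applied on top of the per-scaffold p-values.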

Supplementary Data Set 5

Dark chemical matter Bayes classifier. We attach the naive Bayes model trained on the PubChem data set as Pipeline Pilot component (xml file). This component returns a dark matter score for each molecular data record sent to it. (XML 2227 kb)
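For readers without access to Pipeline Pilot, the general idea of a naive Bayes dark matter score can be sketched as follows. This is a generic Bernoulli naive Bayes with add-one smoothing over binary structural features, written from scratch; it is not the attached component, and the feature representation, smoothing and function names are assumptions.

```python
import math
from collections import Counter


def train_dark_matter_scorer(dark_examples, active_examples):
    """Train a Bernoulli naive Bayes log-odds scorer (illustrative only).

    Each example is a set of feature identifiers, e.g. fingerprint bits.
    Returns a function mapping a feature set to log P(dark)/P(active);
    positive scores lean towards "dark".
    """
    n_dark, n_active = len(dark_examples), len(active_examples)
    dark_counts = Counter(f for example in dark_examples for f in example)
    active_counts = Counter(f for example in active_examples for f in example)

    # Add-one (Laplace) smoothing keeps every probability strictly in (0, 1).
    probabilities = {
        f: ((dark_counts[f] + 1) / (n_dark + 2),
            (active_counts[f] + 1) / (n_active + 2))
        for f in set(dark_counts) | set(active_counts)
    }

    def score(features):
        total = math.log(n_dark / n_active)  # class prior log-odds
        for f, (p_dark, p_active) in probabilities.items():
            if f in features:
                total += math.log(p_dark / p_active)
            else:
                total += math.log((1 - p_dark) / (1 - p_active))
        return total

    return score


# Toy training data with hypothetical feature labels.
score = train_dark_matter_scorer(
    dark_examples=[{"a", "b"}, {"a"}],
    active_examples=[{"c"}, {"b", "c"}],
)
```

Features never seen in training are ignored at scoring time, which is one of several reasonable conventions.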

Supplementary Data Set 6

Reporter gene assay results. For 322 active (“ACT”) and 337 dark (“DCM”) compounds, we make activity readouts from the reporter gene assay panel available. Each row in the data table reports normalized activities for one compound across the 41 RGAs given in Supplementary Table 10. Activities were obtained 24 hours after compound treatment. If a compound has been tested in replicates, the reported activity value is the average of the normalized activities obtained for the different replicates. For details on compound activity normalization see the main text and references provided therein. (XLSX 274 kb)
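The replicate handling described above (one value per compound and assay, averaged over replicates) can be sketched as below. The tuple layout and the function name are hypothetical; the actual file is a spreadsheet with one row per compound and one column per RGA.

```python
from collections import defaultdict
from statistics import fmean


def average_replicates(measurements):
    """Average normalized activities over replicates.

    `measurements` is an iterable of (compound_id, assay_id, activity)
    tuples; the return value maps (compound_id, assay_id) to the mean.
    """
    grouped = defaultdict(list)
    for compound_id, assay_id, normalized_activity in measurements:
        grouped[(compound_id, assay_id)].append(normalized_activity)
    return {key: fmean(values) for key, values in grouped.items()}


# Toy example: one compound measured twice in one assay, once in another.
averaged = average_replicates([
    ("cmpd1", "rga1", 0.2),
    ("cmpd1", "rga1", 0.4),
    ("cmpd1", "rga2", 1.0),
])
```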

Supplementary Data Set 7

Gene expression profiles. For 89 active (“ACT”) and 111 dark (“DCM”) compounds, we report measured fold changes and calculated R-scores for the 61 genes in our transcriptional profiling panel. Supplementary Data Set 7 reports gene expression changes after compound treatment with a final compound concentration of 1 μM. Genes are represented by EntrezGene identifiers, as listed in Supplementary Table 11. (XLSX 516 kb)

Supplementary Data Set 8

Gene expression profiles. For 89 active (“ACT”) and 111 dark (“DCM”) compounds, we report measured fold changes and calculated R-scores for the 61 genes in our transcriptional profiling panel. Supplementary Data Set 8 reports gene expression changes after compound treatment with a final compound concentration of 10 μM. Genes are represented by EntrezGene identifiers, as listed in Supplementary Table 11. (XLSX 518 kb)

Supplementary Data Set 9

Yeast growth inhibition compound list. The data set lists 178 dark compounds that were tested in yeast growth inhibition experiments. Only compound 1 reported in the manuscript showed activity in confirmation experiments, i.e., all other compounds are considered as inactive. Compounds are reported as InChI keys and SMILES strings. (XLSX 18 kb)

About this article

Cite this article

Wassermann, A., Lounkine, E., Hoepfner, D. et al. Dark chemical matter as a promising starting point for drug lead discovery. Nat Chem Biol 11, 958–966 (2015). https://doi.org/10.1038/nchembio.1936

Received: 18 February 2015
Accepted: 10 September 2015
Published: 19 October 2015
Issue Date: December 2015
DOI: https://doi.org/10.1038/nchembio.1936
