Extending the small-molecule similarity principle to all levels of biology with the Chemical Checker

A Publisher Correction to this article was published on 21 May 2020

This article has been updated


Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on ~800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.
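To illustrate what comparing compounds by bioactivity signature could look like in practice, here is a minimal sketch in plain Python. It assumes each compound is represented by a fixed-length numeric vector and uses cosine similarity as the comparison measure; the abstract does not state which measure the CC uses, so that choice is an assumption. The compound names and vector values are hypothetical toy data, not CC signatures.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("signatures must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero signatures
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, signatures):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in signatures.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy signatures (illustrative values only, not real CC data).
signatures = {
    "compound_A": [0.9, 0.1, -0.3, 0.5],
    "compound_B": [0.8, 0.2, -0.2, 0.4],
    "compound_C": [-0.7, 0.6, 0.4, -0.1],
}

query = signatures["compound_A"]
ranking = rank_by_similarity(query, signatures)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting, compounds whose signatures point in a similar direction rank near the query, while a compound with an opposing signature receives a negative score, which is the kind of relationship one would look for when searching for signature reversal rather than mimicry.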

Fig. 1: CC statistics.
Fig. 2: CC signatures visualized.
Fig. 3: Characterization of compound collections with the CC.
Fig. 4: Signature reversion of Alzheimer’s disease-specific transcriptional profiles.
Fig. 5: Discovery of chemical analogs of biologics.
Fig. 6: Representation of the CCweb resource.

Data availability

All gene expression signatures have been deposited in the GEO (GSE137202).

Code availability

To facilitate access to our data, we built a web-based resource, which includes all the bioactivity signatures in HDF5 format and the full code of the CC resource.



Acknowledgements

We thank the SB&NB laboratory members for their support and helpful discussions. We are grateful to the Broad Institute and National Center for Advancing Translational Sciences (NCATS-NIH) for providing compounds on request, and J. Duran-Frigola for the website design. We also thank the IRB Barcelona Biostatistics and Bioinformatics Unit and the IRB Functional Genomics Facility. P.A. acknowledges the support of the Spanish Ministerio de Economía y Competitividad (grant no. BIO2016-77038-R), the INB/ELIXIR-ES (grant no. PT17/0009/0007), the European Research Council (SysPharmAD, grant no. 614944) and ‘La Caixa’ BioMedTec (grant no. CTEC_15).

Author information

Contributions

M.D.-F., E.P. and P.A. designed the study, analyzed the results and wrote the manuscript. M.D.-F. did the computational analysis, together with M.B., T.J.-B. and D.A. O.G.-P. implemented the web server. E.P. and V.A. carried out the experimental validations. All authors have read and approved the manuscript.

Corresponding authors

Correspondence to Miquel Duran-Frigola or Patrick Aloy.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Data 1 and 2 legends, Supplementary Figs. 1–17 and Supplementary Tables 1–3.

Reporting Summary

Supplementary Data 1

Reversion of transcriptional signatures of fAD mutations.

Supplementary Data 2

Small-molecule analogs of biologics.

About this article

Cite this article

Duran-Frigola, M., Pauls, E., Guitart-Pla, O. et al. Extending the small-molecule similarity principle to all levels of biology with the Chemical Checker. Nat Biotechnol 38, 1087–1096 (2020).
