Review Article

Computational advances in combating colloidal aggregation in drug discovery


Small molecule effectors are essential for drug discovery. Specific molecular recognition, reversible binding and dose dependency are usually key requirements to ensure the utility of a novel chemical entity. However, artefactual frequent-hitter and assay-interference compounds may divert lead optimization and screening programmes towards attrition-prone chemical matter. Colloidal aggregates are the prime source of false-positive readouts, either through protein sequestration or protein-scaffold mimicry. Nevertheless, the assessment of colloidal aggregation remains somewhat overlooked and under-appreciated. In this Review, we discuss the impact of aggregation in drug discovery by analysing select examples from the literature and publicly available datasets. We also examine and comment on the technologies used to identify these potentially problematic entities experimentally. We focus on evidence-based computational filters and machine learning algorithms that may be swiftly deployed to flag chemical matter and mitigate the impact of aggregates in discovery programmes. We highlight the tools that can be used to scrutinize libraries and to identify and eliminate these problematic compounds.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Acknowledgements

D.R. is a Swiss National Science Foundation Fellow (Grants P2EZP3_168827 and P300P2_177833). G.J.L.B. is a Royal Society URF (UF110046 and URF/R/180019), an iFCT Investigator (IF/00624/2015), and the recipient of an ERC StG (TagIt, Grant Agreement 676832). T.R. and G.J.L.B. acknowledge Marie Sklodowska-Curie ITN Protein Conjugates (Grant Agreement 675007) for funding. T.R. is a Marie Curie Fellow (Grant Agreement 743640). T.R. acknowledges the H2020 (TWINN-2017 ACORN, Grant Agreement 807281) and POR Lisboa 2020/FEDER (02/SAICT/2017, Grant Agreement Lisboa-01-0145-FEDER-028333) for funding. D.R. acknowledges the MIT-IBM Watson AI Lab and the MIT SenseTime coalition for funding.

Author information

D.R. and T.R. conceived the study and performed the literature and data analyses. D.R., G.J.L.B. and T.R. wrote the manuscript. All authors approved the submitted version of the manuscript.

Competing interests

The authors of this article declare that V. Cantrill is employed by G. Bernardes as a Research Coordinator at the University of Cambridge. V. Cantrill was not involved in the preparation, writing or editing of this Review, but is married to S. Cantrill, who is the Chief Editor of Nature Chemistry. The editorial team of Nature Chemistry declares that S. Cantrill has not been involved in the editorial handling of this Review.

Correspondence to Daniel Reker or Tiago Rodrigues.

Fig. 1: Mechanisms of unspecific protein inhibition by colloidal aggregates.
Fig. 2: SCAMs are ubiquitous in drug discovery and impact network pharmacology.
Fig. 3: Historical evolution of confirmed SCAMs and their underlying scaffolds.
Fig. 4: Common filters do not detect SCAMs.
Fig. 5: Comparison of three different in silico methods for the automated identification of SCAMs.