As defined by the World Health Organization, an endocrine disruptor is an exogenous substance or mixture that alters function(s) of the endocrine system and consequently causes adverse health effects in an intact organism, its progeny, or (sub)populations. Traditional experimental testing regimens to identify toxicants that induce endocrine disruption can be expensive and time-consuming. Computational modeling has emerged as a promising and cost-effective alternative method for screening and prioritizing potentially endocrine-active compounds. The efficient identification of suitable chemical descriptors and machine-learning algorithms, including deep learning, is a considerable challenge for computational toxicology studies. Here, we sought to apply classic machine-learning algorithms and deep-learning approaches to a panel of over 7500 compounds tested against 18 Toxicity Forecaster assays related to nuclear estrogen receptor (ERα and ERβ) activity. Three binary fingerprints (Extended Connectivity FingerPrints, Functional Connectivity FingerPrints, and Molecular ACCess System) were used as chemical descriptors in this study. Each descriptor was combined with four machine-learning and two deep-learning (normal and multitask neural networks) approaches to construct models for all 18 ER assays. The resulting model performance was evaluated using the area under the receiver operating characteristic curve (AUC) values obtained from a fivefold cross-validation procedure. Individual models had AUC values ranging from 0.56 to 0.86. External validation was conducted using two additional sets of compounds (n = 592 and n = 966) with experimentally established interactions with nuclear ER. An agonist, antagonist, or binding score was determined for each of these compounds by averaging its predicted probabilities across the relevant assay models, yielding AUC values ranging from 0.63 to 0.91.
The results suggest that multitask neural networks offer advantages when modeling mechanistically related endpoints. Consensus predictions based on the average values of individual models remain the best modeling strategy for computational toxicity evaluations.
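The consensus scoring described above (averaging a compound's predicted probabilities across the relevant assay models, then evaluating the averaged score by AUC) can be illustrated with a minimal sketch. This is not the authors' code: the model names, compound identifiers, probabilities, and activity labels below are all hypothetical, and the AUC is computed by simple pairwise ranking rather than with any particular library.

```python
from statistics import mean


def consensus_scores(prob_by_model):
    """Average each compound's predicted probabilities across all models."""
    compounds = next(iter(prob_by_model.values())).keys()
    return {c: mean(p[c] for p in prob_by_model.values()) for c in compounds}


def roc_auc(labels, scores):
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive compound count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities from three agonist-related assay models.
agonist_models = {
    "assay_model_1": {"cmpd_A": 0.90, "cmpd_B": 0.20, "cmpd_C": 0.30, "cmpd_D": 0.10},
    "assay_model_2": {"cmpd_A": 0.70, "cmpd_B": 0.40, "cmpd_C": 0.20, "cmpd_D": 0.20},
    "assay_model_3": {"cmpd_A": 0.80, "cmpd_B": 0.30, "cmpd_C": 0.25, "cmpd_D": 0.30},
}
# Hypothetical experimental labels (1 = active agonist, 0 = inactive).
known_activity = {"cmpd_A": 1, "cmpd_B": 0, "cmpd_C": 1, "cmpd_D": 0}

scores = consensus_scores(agonist_models)
names = sorted(scores)
auc = roc_auc([known_activity[n] for n in names], [scores[n] for n in names])
print(scores)
print(auc)
```

In this toy data one active compound (cmpd_C) receives a lower consensus score than one inactive compound (cmpd_B), so the AUC falls below 1. The same averaging would apply to antagonist or binding scores by swapping in the corresponding set of assay models.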
This project was partially supported by the National Institute of Environmental Health Sciences (Grant numbers R01ES029275, R01ES031080, R15ES023148, and P30ES005022) and an ExxonMobil research grant for Rutgers University.
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ciallella, H.L., Russo, D.P., Aleksunes, L.M. et al. Predictive modeling of estrogen receptor agonism, antagonism, and binding activities using machine- and deep-learning approaches. Lab Invest (2020). https://doi.org/10.1038/s41374-020-00477-2