High-Throughput Screening to Predict Chemical-Assay Interference

The U.S. federal consortium on toxicology in the 21st century (Tox21) produces quantitative, high-throughput screening (HTS) data on thousands of chemicals across a wide range of assays covering critical biological targets and cellular pathways. Many of these assays, and those used in other in vitro screening programs, rely on luciferase and fluorescence-based readouts that can be susceptible to signal interference by certain chemical structures resulting in false positive outcomes. Included in the Tox21 portfolio are assays specifically designed to measure interference in the form of luciferase inhibition and autofluorescence via multiple wavelengths (red, blue, and green) and under various conditions (cell-free and cell-based, two cell types). Out of 8,305 chemicals tested in the Tox21 interference assays, percent actives ranged from 0.5% (red autofluorescence) to 9.9% (luciferase inhibition). Self-organizing maps and hierarchical clustering were used to relate chemical structural clusters to interference activity profiles. Multiple machine learning algorithms were applied to predict assay interference based on molecular descriptors and chemical properties. The best performing predictive models (accuracies of ~80%) have been included in a web-based tool called InterPred that will allow users to predict the likelihood of assay interference for any new chemical structure and thus increase confidence in HTS data by decreasing false positive testing results.

oxidation of the luciferin substrate 9 . These phenomena are not isolated to a few chemicals; rather, more than 5% of PubChem chemical libraries (from over 70,000 tested samples) may have autofluorescence properties 10 , and 12% of active chemicals from the NIH Molecular Libraries Small Molecule Repository give paradoxical luminescence changes 11 . These phenomena also impact many drug discovery campaigns and have been reported and documented for various targets and methods, e.g. kinases or proteases 12 .
Considering the importance of HTS platforms in drug discovery and chemical toxicity screening, and the potential impact of false signals derived from these two major interference mechanisms, standardized in silico tools are needed to predict and limit interferent compounds being misinterpreted in fluorescent assay technologies. Adequately trained interference prediction models based on chemical structure will help avoid artefacts and false signals, conserve time and resources, and allow for optimized screening approaches. Correlations between chemical structure and fluorescence/interference have been studied previously, and most approaches have relied upon identifying chemicals with particular substructures as problematic, e.g. thiol or quinone substructures 13 , or rule-based alerts on ring structures/properties 14 . A popular example in drug discovery to guide HTS library conception is the Pan Assay Interference Compounds (PAINS) approach, where a set of chemicals/substructures were identified as interferent due to non-selective activity 15 . However, the success of existing methods to define interferents is still limited due to the complexity and multifactorial nature of the platforms, such as cell culture, range of absorbance in the light spectrum, detection technology, fluorophore used, etc. which necessitate specific studies to directly measure chemical-assay interference rather than inferring interference patterns using datasets measuring biological activity.
The Tox21 library includes 10K chemicals (8,305 unique structures) tested in a variety of assay technologies, many of which rely upon fluorescence and luciferase-based readouts to indicate biological activity 16,17 . Here we present the results from a subset of Tox21 assays specifically designed to measure interference in the form of luciferase inhibition and autofluorescence. A single assay was used to measure luciferase inhibition in a cell-free format. An additional 12 assay endpoints were derived by screening two cell types (HEK-293 and HepG2) at three different fluorescent wavelengths (red, blue, green) under cell culture conditions with and without cells. The Tox21 library, focused on environmentally relevant and potentially toxic chemicals, was screened in these assays using a quantitative HTS protocol 18 , representing one of the largest screening efforts for chemical-assay interference to date. Using a set of 1D and 2D molecular descriptors covering both physicochemical and topological chemical properties, Self-Organizing Maps (SOM) and hierarchical clustering were applied to characterize interferent chemicals. Machine learning approaches were leveraged to build statistical quantitative structure-activity relationships (QSAR) models that use selected molecular descriptors to predict the probability of a chemical to interfere with fluorescent intensity or luciferase assays; these open-source models are freely accessible for new chemical prediction via a web-based interface called InterPred (https://sandbox.ntp.niehs.nih.gov/interferences/).

Material and Methods
Assays. Three assay platforms were applied to analyze luciferase and fluorescence interference patterns using the Tox21 chemical screening library. The raw data are freely available on the NCATS Tox21 browser (https://tripod.nih.gov/tox21/assays/) under the names "tox21-luc-biochem-p1" for the luciferase inhibition assay, and "tox21-spec-hepg2-p1" and "tox21-spec-hek293-p1" for autofluorescence assays using HepG2 and HEK-293 cell cultures, respectively, measuring red, green and blue wavelengths using cell-based and cell-free culture-medium-only conditions. The Tox21 chemical library (8,305 unique substances) was screened in triplicate concentration response in all assays, with concurrent cytotoxicity measurements where applicable.
Luciferase assays. Reagents. The substrate D-Luciferin and the enzyme firefly-Luciferase were purchased from Sigma-Aldrich (St. Louis, MO). The positive control compound (PTC-124) was purchased from Santa Cruz Biotechnology, Inc. (Dallas, TX).
Luciferase (biochemical) qHTS assay. Three µL of the substrate (mixture containing 50 mM Tris-acetate pH 7.6, 13.3 mM magnesium acetate, 0.01 mM D-luciferin, 0.01 mM ATP, 0.01% Tween, 0.05% BSA and distilled H2O) was dispensed into medium-binding white/solid 1,536-well plates (Greiner Bio-One North America Inc., Monroe, NC) using a Flying Reagent Dispenser, FRD (Aurora Discovery, Carlsbad, CA). Then 23 nL of the test compounds (into columns 5-48), PTC-124 and/or DMSO (into columns 1-4) were transferred to the assay plates using a Pintool station (Wako, San Diego, CA). The positive control plate format was as follows: column 1, sixteen-concentration two-fold titration of PTC-124 from 0.035 nM to 1.15 µM with duplicate points; columns 2 and 3, 0.58 and 1.15 µM PTC-124, respectively, in the top half and DMSO in the rest. Compound addition was followed by the addition of 1 µL of 10 nM enzyme (mixture containing 50 mM Tris-acetate, 0.04 µM firefly-Luciferase and distilled H2O) using an FRD to all the wells of the assay plate except the 4th column, which received the buffer instead. After 5 min incubation at room temperature, luminescence intensity was measured by a Viewlux plate reader (Perkin Elmer, Waltham, MA). Data were expressed as relative luminescence units. Each test compound was measured at 15 concentrations ranging from 1.5 nM to 115 µM and in triplicate.
Data analysis. For primary data analysis, raw plate reads for each titration point were first normalized relative to DMSO-only wells that received firefly-Luciferase (basal, 0%) and PTC-124 control wells that received firefly-Luciferase (0.58 µM, −100%) and then corrected by applying a pattern correction algorithm using compound-free control plates (DMSO plates). Concentration-response titration points for each compound replicate were fitted to the Hill equation and concentrations of half-maximal inhibition activity (IC 50 ) and maximal response (efficacy) values were calculated 19 ; the mean of the IC 50 s was used as the representative value. Concentration response curves were designated as class 1-4 based on the quality of the fit, the number of points above background, and the response efficacy, as previously described (summarized in Supporting Information Table S1) 18,20 .
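The control-based normalization described above can be illustrated with a minimal sketch. It assumes a simple linear rescaling in which the median of the DMSO-only wells maps to 0% and the median of the PTC-124 control wells maps to −100%; the function name and the example well values are hypothetical, and the pattern-correction step is not reproduced here.

```python
from statistics import median


def normalize_to_controls(raw, dmso_wells, control_wells):
    """Rescale a raw luminescence read to percent activity.

    Assumed linear scaling: the DMSO-only median is 0% (basal) and the
    inhibitor-control median is -100% (full inhibition).
    """
    basal = median(dmso_wells)
    inhibited = median(control_wells)
    if basal == inhibited:
        raise ValueError("control wells are indistinguishable")
    return -100.0 * (basal - raw) / (basal - inhibited)


# Hypothetical plate reads (relative luminescence units).
dmso = [1000.0, 1020.0, 980.0]
ptc124 = [100.0, 90.0, 110.0]
half_inhibited = normalize_to_controls(550.0, dmso, ptc124)
```

With these illustrative values, a well reading halfway between the two control medians is scored as 50% inhibition.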
Data analysis. Analysis of compound concentration-response data was performed as previously described 21 . Briefly, raw plate reads for each titration point were first normalized and then corrected by applying a pattern correction algorithm using compound-free control plates (DMSO plates) 22 . The responses from the positive control compounds were much higher than those of many of the fluorescent compounds in the library, some by several orders of magnitude. In order to see responses from weaker fluorescent compounds, the data were normalized in two different ways: (1) using the respective positive control compound and (2) using negative controls (DMSO-only wells). When normalizing to the positive control, the positive control response was set to 100% and the negative control to 0% as follows:

$$\%\,\text{Activity} = \frac{V_c - V_{DMSO}}{V_{pos} - V_{DMSO}} \times 100$$

where $V_c$ denotes the compound well values, $V_{pos}$ denotes the median value of the positive control wells, and $V_{DMSO}$ denotes the median value of the DMSO-only wells. Fluorescein (0.32 µM for HepG2 and 0.6 µM for HEK-293) was used as the positive control for green fluorescence, triamterene (20 µM) for blue fluorescence and rose bengal (50 µM) for red fluorescence. The following formula was used to normalize data to the negative control:

Concentration-response titration points for each compound were fitted to the Hill equation and concentrations of half-maximal activity (AC 50 ) and maximal response (efficacy) values were calculated 19 . Concentration response curves were designated as class 1-4 based on the quality of the fit, the number of points above background, and the response efficacy, as previously described (summarized in Supporting Information Table S1) 18,20 .

Active chemical identification. Chemicals were first filtered based on potency by only considering AC 50 (autofluorescence)/IC 50 (luciferase inhibition) values below 150 µM, and then based on curve class (i.e. chemicals were only considered active with curve class 1.x and 2.x for AC 50 and −1.x and −2.x for IC 50 ; see Supporting Information Table S1 for details). To increase the confidence of the measurements, only AC 50 /IC 50 values with efficacy above 30% were considered. Finally, for autofluorescence assays, to avoid indications of generalized cell stress without specific biomolecular interactions, only AC 50 values below the burst cutoff defined in 23 were considered. Indeed, activity above this limit may result from triggering cell stress pathways, chemical reactivity, physico-chemical disruption of proteins or membranes, or broad low-affinity non-covalent interactions, and not from a specific technology interference reaction. This filter was applied to distinguish chemical-structure driven autofluorescence from the increase in signal that has been observed due to cell stress 24 . The distribution of the curve classes for active chemicals is presented in Supporting Information Fig. S1.
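The sequence of activity filters described above can be sketched as a single predicate. This is an illustration under stated assumptions, not the authors' implementation: the function and parameter names are hypothetical, the curve class is assumed to be encoded as a signed decimal (e.g. 1.1 or −2.3), efficacy is compared by absolute value so that inhibition (negative efficacy) is handled, and the burst cutoff is passed in as a number because its derivation is defined elsewhere (reference 23).

```python
def is_active(potency_um, curve_class, efficacy_pct, inhibition=False,
              burst_cutoff_um=None, max_potency_um=150.0,
              min_efficacy_pct=30.0):
    """Return True if a chemical passes all activity filters.

    potency_um      -- AC50 (activation) or IC50 (inhibition) in µM, or None
    curve_class     -- signed curve class, e.g. 1.1, 2.4, -1.1 (assumed encoding)
    efficacy_pct    -- maximal response in percent
    burst_cutoff_um -- optional cytotoxicity burst cutoff (autofluorescence only)
    """
    # 1) Potency filter: AC50/IC50 must be below 150 µM.
    if potency_um is None or potency_um >= max_potency_um:
        return False
    # 2) Curve class filter: 1.x/2.x for AC50, -1.x/-2.x for IC50.
    sign_ok = curve_class < 0 if inhibition else curve_class > 0
    if not sign_ok or int(abs(curve_class)) not in (1, 2):
        return False
    # 3) Efficacy filter: magnitude must exceed 30%.
    if abs(efficacy_pct) <= min_efficacy_pct:
        return False
    # 4) Burst filter: potency must fall below the burst cutoff when given.
    if burst_cutoff_um is not None and potency_um >= burst_cutoff_um:
        return False
    return True
```

For example, under these assumptions `is_active(12.0, 1.1, 85.0)` passes every filter, whereas the same curve with a potency of 200 µM, a curve class of 3, or an efficacy of 20% is rejected.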

Molecular modeling. Each chemical was encoded using a unique SMILES string and Chemical Abstracts Service Registry Number (CASRN). Chemical data curation and descriptor selection followed the best practices protocol available in 25 . Using the MolVS Python library (http://molvs.readthedocs.io/en/latest/guide/intro.html), each chemical was prepared from its original SMILES with the following steps: hydrogen removal, sanitization, metal disconnection, stereochemistry processing, desolvation, and filtering of salt fragments. Mixtures were not considered and were excluded in an early step. From the Tox21 library (8,305 unique substances), 8,065 chemicals passed through the structure curation and were used for clustering and QSAR modeling. From each curated structure, a set of 677 2D descriptors was computed. A set of 647 descriptors including topological and physicochemical descriptors was computed using the RDKit toolkit 26 as implemented in the PyDPI Python library 27 , and an additional set of 30 physicochemical descriptors was computed using the OPERA models 28 .
Only informative and non-correlated descriptors were selected for subsequent analyses. Descriptors with null variance or little discriminating power, i.e. the same value for more than 90% of the chemicals, were removed. Pairwise Pearson's correlation coefficient (ρ) was computed for each combination of descriptors; those with ρ > 0.9 were clustered, and one descriptor per cluster was randomly chosen. Molecular descriptor computation was performed in Python 2.7 using standard libraries. Statistical descriptor selection and clustering were performed in R 3.4.4 29 .

Clustering. Chemicals were characterized using selected descriptors and clustered using Self-Organizing Maps (SOM) 30 and hierarchical clustering based on a Euclidean distance matrix and Ward linkage 31,32 . SOM was used to identify structural clusters in the entire Tox21 library that were enriched for interference activity, and hierarchical clustering was performed on only the active chemical set for each assay to identify structural features related to potency or interference activity specific to cell types/culture conditions. SOM plots were developed using the R library kohonen.

QSAR classification. Machine learning. The QSAR modeling workflow was conducted according to best practices 33-35 . Classification models to predict active versus inactive chemicals for each of the interference assay endpoints were built using five machine learning approaches: (i) Linear Discriminant Analysis (LDA) based on Fisher's linear discriminant methods 36 , (ii) Random Forests (RF) 37 , (iii) Support Vector Machine (SVM) with a linear, radial and sigmoid kernel 38 , (iv) Classification and Regression Tree (CART) 39 , and (v) a neural network (NN) 40,41 . QSAR models were built using the R packages pls, randomForest, rpart, e1071, nnet and caret, available for R > 3.6 in the standard package repository.
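The descriptor selection step described in this section (removing near-constant descriptors, then keeping one descriptor per group of highly correlated descriptors) can be sketched in a few lines of standard-library Python. The source does not specify how correlated descriptors were grouped, so this sketch assumes connected components of the ρ > 0.9 graph; the function names, the seeded random choice, and the toy descriptor table are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def select_descriptors(table, max_same=0.9, max_corr=0.9, seed=0):
    """table maps descriptor name -> list of values (one per chemical)."""
    # 1) Drop descriptors whose most frequent value covers > 90% of chemicals
    #    (this also removes zero-variance descriptors).
    kept = {}
    for name, values in table.items():
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) <= max_same:
            kept[name] = values
    # 2) Group descriptors linked by a correlation above the threshold
    #    (assumption: connected components via a small union-find).
    names = sorted(kept)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(kept[a], kept[b]) > max_corr:
                parent[find(b)] = find(a)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    # 3) Keep one randomly chosen descriptor per group.
    rng = random.Random(seed)
    return sorted(rng.choice(members) for members in groups.values())


# Toy table: "b" duplicates "a", "const" never varies.
toy = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "b": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
    "c": [5, 3, 8, 1, 9, 2, 7, 4, 10, 6],
    "const": [0] * 10,
}
selected = select_descriptors(toy)
```

On the toy table, the constant descriptor is discarded, exactly one of the two perfectly correlated descriptors is retained, and the weakly correlated descriptor is kept.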
Each model was tuned via a grid optimization, when appropriate for the machine learning approach, and parameters/models were chosen to maximize performance on a ten-fold cross validation using the Matthews correlation coefficient (MCC). Grids were implemented using the caret R package, where for SVM models the gamma and cost parameters were optimized as well as the classification methods, for NN the network size and the decay parameters were optimized, and for RF the number of trees and number of iterations. Considering the unbalanced dataset, i.e. far more inactive chemicals, under-sampling methods were applied via random selection of inactive chemicals to yield a ratio of 70% inactive and 30% active chemicals. Each model was built ten times with a different inactive set to cover the full set of chemicals. Model performance was reported as a mean with associated standard deviation on the ten repetitions for the training set, the cross-validation, and the external test set.
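The under-sampling scheme can be sketched as follows. This is a simplified illustration: it keeps all actives and draws an independent random sample of inactives for each of the ten repetitions so that actives make up 30% of each set, which does not by itself guarantee that every inactive chemical is used at least once. The function name and the chemical identifiers are hypothetical.

```python
import random


def undersample(actives, inactives, n_repeats=10, active_fraction=0.3, seed=0):
    """Build n_repeats data sets with a fixed active/inactive ratio.

    Each set contains every active plus a random subset of inactives sized
    so that actives represent `active_fraction` of the set.
    """
    n_inactive = round(len(actives) * (1 - active_fraction) / active_fraction)
    n_inactive = min(n_inactive, len(inactives))
    rng = random.Random(seed)
    return [actives + rng.sample(inactives, n_inactive)
            for _ in range(n_repeats)]


# Hypothetical identifiers: 30 actives and 500 inactives.
actives = ["A%d" % i for i in range(30)]
inactives = ["I%d" % i for i in range(500)]
datasets = undersample(actives, inactives)
```

Each resulting set here holds 30 actives and 70 inactives; a model would be trained and evaluated on each set and the performance criteria averaged over the ten repetitions.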
All InterPred models developed here are available in R environment format on the website: https://sandbox.ntp.niehs.nih.gov/interferences.
Performance criteria. The model quality was estimated using five statistical criteria: the accuracy (Acc) or correct classification rate; the sensitivity (Se) and specificity (Sp), i.e. the ability of the model to correctly predict active and inactive chemicals, respectively; the balanced accuracy (acc b , average of sensitivity and specificity); and the Matthews correlation coefficient (MCC). MCC is an overall prediction metric (ranging from −1, worse than random, to 1, perfect prediction, where 0 corresponds to random) that accounts for the imbalance between active and inactive chemicals in the data set. Scripts used and developed for this study are available on GitHub at https://github.com/ABorrel/interferences.
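For reference, the five criteria can be computed from a binary confusion matrix as in the sketch below. It uses the standard definitions, in which sensitivity is the fraction of active chemicals correctly predicted and specificity the fraction of inactive chemicals correctly predicted; the function name and the example counts are hypothetical and are not results from this study.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Compute Acc, Se, Sp, balanced accuracy and MCC from confusion counts.

    tp/fn -- actives predicted active/inactive
    tn/fp -- inactives predicted inactive/active
    """
    se = tp / (tp + fn)                      # sensitivity (actives)
    sp = tn / (tn + fp)                      # specificity (inactives)
    acc = (tp + tn) / (tp + tn + fp + fn)    # correct classification rate
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "Se": se, "Sp": sp, "bAcc": (se + sp) / 2, "MCC": mcc}


# Illustrative counts for a 30% active / 70% inactive set of 100 chemicals.
metrics = classification_metrics(tp=24, tn=63, fp=7, fn=6)
```

With these illustrative counts the sensitivity is 0.8, the specificity 0.9, and the accuracy 0.87, while the MCC is lower than the accuracy because it penalizes errors in the smaller active class more heavily.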

Results
To quantify chemical interference in luminescence and autofluorescence assays, five assays with thirteen readouts in total were run. One assay measured firefly luciferase (Fluc) inhibition under cell-free conditions and four assays, consisting of two cell types (HepG2 and HEK-293) and two culture conditions (cell-based and cell-free using culture medium only), measured autofluorescence. Each autofluorescence assay had three fluorescent channel readouts (blue, green and red, with excitation/emission filters of 405/460 nm, 485/535 nm, and 540/590 nm, respectively). The Tox21 chemical library of 8,305 unique structures was screened in quantitative high-throughput format; see Methods for details.

Tox21 chemical library. The Tox21 chemical library is composed of 9,667 substances and 8,305 unique structures excluding mixtures and ions. The Tox21 library was built from a federal cross-agency effort covering a large chemical space as discussed in 16,42,43 , and includes diverse chemical groups, e.g. pesticides, antimicrobials, water contaminants, industrial chemicals, high production volume chemicals, endocrine disruptors, FDA food additives, fragrances, plasticizers and drugs. The library contains many industrial chemicals prioritized based on potential for human exposure, which are not designed to be bioactive and therefore have a higher potential to interfere with the assays because they have not been filtered out by medicinal chemistry principles used in developing small molecule libraries.
To examine the chemical coverage of the Tox21 library, chemicals were projected on a principal component analysis (PCA) defined using the Distributed Structure-Searchable Toxicity (DSSTox) database, which covers the largest curated environmental chemical library with more than 800,000 chemicals. Figure 1 shows that the Tox21 chemicals are well distributed on the PCA defined from DSSTox using 1D and 2D molecular descriptors (see Methods) and provide broad coverage of the structural landscape.

Reference chemical response curves. Figure 2A shows the response curve for Ataluren (PTC-124; structure shown in insert panel 1), the reference positive control chemical for the luciferase inhibition assay. PTC-124 has a 3,5-diaryl oxadiazole scaffold ramified with a m-carboxylate which binds Fluc in the ATP active site. The interaction modifies the α-phosphate of ATP through an SN2 displacement reaction and inhibits Fluc 9,44 . The response curve shows rapid signal reduction with an estimated IC 50 (based on triplicate measurements) below 20 nM. The response curve class was −1.1, which corresponds to a complete inhibition response, see Methods and Supporting Information Table S1 18 .
For autofluorescence assays, reference chemicals were the fluorophores triamterene (insert panel 2), fluorescein (3) and rose bengal (4) for blue, green and red channels, respectively. Response curves are presented in Fig. 2B-D for each of the four assay conditions (HEK-293 cell-based, HEK-293 cell-free, HepG2 cell-based, and HepG2 cell-free). All reference chemical response curves showed a significant concentration-dependent fluorescence increase. The signal was stronger for the green channel with AC 50 s <4 µM (curve class 1.1, complete activation response). For the blue and red channels, the AC 50 s were below 40 µM and 12 µM, respectively, with response curve classes equal to 2.1, corresponding to an incomplete response (due to lack of second asymptote) but with >80% efficacy.
Assay activity summary. The Tox21 chemical library of 8,305 different structures was screened using autofluorescence and luciferase inhibition assays to directly measure technology interference. Table 1 summarizes the number of active chemicals by assay. A chemical was defined as active if it passed all of the filters, i.e. AC 50 /IC 50 cutoff, curve class, and efficacy for luciferase and autofluorescence, and cytotoxicity cutoffs for autofluorescence only. Supporting Information Table S2 contains the dataset before and after filtering. For luciferase inhibition assays 6.6% of the chemicals were found to be active, with an average IC 50 of 28.3 +/− 19.1 µM. For autofluorescence assays, the blue channel had the highest number of active chemicals, with between 2.5 and 2.7% of the library depending on cell types and culture conditions. In the green channel between 0.5 and 0.9% of the chemicals were found to be active, and between 0.4 and 0.5% in the red channel. The average AC 50 ranged from 16.5 to 22.3 µM depending on the channel and conditions for autofluorescence, and the average IC 50 was 28.3 µM for luciferase.
To characterize chemical interference activity by substance type and use case, the Tox21 chemical library was classified using the consumer products database 45 , the Toxic Substances Control Act chemical list (TSCA), and approved drugs lists available in the EPA chemical dashboard (https://comptox.epa.gov/dashboard). Based on these resources, 4950 of the 8305 unique chemicals included in the Tox21 chemical library could be classified into 80 classes. The most populated classes of active chemicals are represented in Fig. 3. Interferent chemicals are found independently of technology or condition in various chemical classes including chemicals with light absorption properties i.e. UV absorber (enriched for luciferase inhibition), hair dye (enriched for autofluorescence), photo initiator and colorants. Drugs and TSCA classes were also found to be enriched in interference chemicals. Specifically, luciferase interferents were found mostly in preservative, pesticide and antioxidant chemical classes. Chemicals causing autofluorescence in the blue channel were found in skin conditioner, antioxidant and fragrance classes, while red and green channel interferents were enriched for drugs, TSCA and hair dye classes.
The distribution of activity of selected active chemicals for each assay is shown in Fig. 4 as a density plot. For luciferase inhibition assays, the IC 50 distribution (Fig. 4A) shows a density peak around 1.5 log(µM) and specific chemical inhibitory concentrations distributed below. The distributions of active chemicals in autofluorescence assays (Fig. 4B-D) appear to be unaffected by any combination of cell type and culture conditions. Interestingly, the activity density plots for all three color wavelengths demonstrate bimodal potency distributions, with two peaks around 1 and 1.6 log(µM), and the majority of AC 50 values falling below 2 log(µM). The bimodal potency distribution was found for the most populated curve classes, as shown in Supporting Information Fig. S1. For compounds with incomplete curves, e.g. curves that did not reach a response plateau, the actual AC 50 could be higher than the highest test concentration, but to avoid extrapolating to unreliable AC 50 values, our curve fitting algorithm tries to restrict the AC 50 to below the highest test concentration, thus forming the second peak in the distribution.

Structural activity patterns: luciferase inhibition. From the Tox21 library, 8,065 unique chemicals were extracted following structure curation, and were clustered using a SOM approach and a set of 165 non-correlated and informative 1D-2D descriptors, see Methods. Chemicals were clustered in a SOM set with 225 clusters, allowing good segregation of the chemicals with only two clusters empty, one singleton, and two clusters with fewer than ten chemicals. On average each cluster is composed of 36 +/− 14 chemicals sharing similar structural properties (Supplementary Information Fig. S2).
The SOM was colored using the percentage of active chemicals found in each cluster, Fig. 5A. Active chemicals were distributed across 154 clusters with 2 +/− 3 active chemicals per cluster, and an average percentage of active chemicals equal to 11 +/− 11% with a maximum equal to 47%. Three clusters that are enriched for chemicals active in the luciferase inhibition assay are highlighted and example structures shown. Cluster 163 had 33% (n = 6) active chemicals, and included three-ring structures ramified with at least one phosphate group and some ketone and alcohol groups, see example structures 5, 6 and 7. Phosphate group substructures can bind the ATP binding site of Fluc and block its activity 9 . Cluster 144 had 43% (n = 4) active chemicals, and included structures with at least five non-ramified rings, see example structures 8, 9 and 10. Cluster 129, in close proximity on the SOM map, had 44% (n = 10) active chemicals and included at least five connected rings ramified with, for example, alcohol, ketone or chlorine (structures not shown). Cluster 99 had 47% (n = 5) active chemicals, where structural scaffolds included biphenyl groups (example structures 11, 12) or a diphenyldiazene (13) ramified with primary amine groups (11, 12 and 13).
Structural activity patterns: autofluorescence. The same structure-based SOM run on the entire Tox21 library was then colored according to the percentage of active chemicals by cluster on all of the autofluorescence assays merged together, Fig. 5B. The set of active autofluorescent chemicals was composed of 379 chemicals distributed across 134 clusters with 3 +/− 2 active chemicals per cluster and an average percentage of active chemicals of 9.2 +/− 8.8%. Three clusters that are enriched for active chemicals in the autofluorescence assays are highlighted and example structures shown.
The SOM was then colored by percentage of active chemicals by color channel corresponding to blue, green, and red wavelengths, Fig. 6A-C, respectively. The 379 active chemicals in autofluorescence assays were divided into 315 actives for the blue, 95 for the green and 40 for the red channel, see Table 1.
Of the three clusters discussed above, none appears to be specific for one color, and only cluster 203 was found to also be enriched in the blue channel (56% active chemicals, n = 10). Additional example structures are shown (23, 24 and 25) which have a large scaffold including more than six rings ramified mostly with oxygen groups, e.g. alcohol or ketone.
Structural clusters with specific enrichment for the blue channel include 147 and 129 with 44% and 40% (8 and 10 chemicals) active in each cluster, respectively. Cluster 147 is composed of chemicals with sulfonic acids (example structures 26, 27) or a guanidium substructure (28) connected to an aromatic ring such as naphthalene.

For the green channel, cluster 209 is the cluster with the highest percentage of active chemicals, with 27% actives representing eight chemicals. Active chemicals are composed of at least one unsaturated ring ramified with oxygen groups such as alcohol, ketone or carboxylic acid (example structures 32, 33 and 34), and do not include heteroatoms other than oxygen. Cluster 173 is also displayed in Fig. 6, with only 8% actives on the green channel but representing five chemicals, all of which include a diazatetracyclohexadeca pentaene and a terminal chain composed of nitrogen and oxygen groups (example structures 35, 36 and 37). Cluster 117 is also enriched for autofluorescence on the green wavelength with seven actives representing 23% of the full cluster, covering aromatic rings ramified with only alcohol or ketone groups (example structures 38, 39, and 40). These chemicals are structural analogues of the fluorescein molecule with a high degree of scaffold similarity.
In the red channel cluster 209 was again enriched with 27% actives, corresponding to eight chemicals. All of the active chemicals in this cluster were autofluorescent in both the green and red channel and included at least one unsaturated ring ramified with oxygen groups such as alcohol, ketone or carboxylic acid (additional example structures 41, 42 and 43).
Cluster 194 included two chemicals representing 18% (n = 4) of the cluster actives, with structures composed of three aromatic rings ramified with iodine (44), bromine (45) and also chlorine (not shown). Finally, cluster 143 also included three active chemicals representing 17% of the cluster. Structurally, these chemicals include a benzothiazole at both extremities connected by an unsaturated chain (example structures 46 and 47), and no oxygens.
Autofluorescence patterns among channels and culture conditions. For each color channel (blue, green, and red), autofluorescence activity was measured using two cell types (HepG2 and HEK-293) and two culture conditions, cell-free (culture medium only) and cell-based. As shown with the SOM, some chemicals demonstrated activity across color channels. Figure 7 quantifies the overlap between autofluorescent chemicals by color channel (Fig. 7A) and within each wavelength by cell line/culture condition (Fig. 7B-D).
Only 3.5% of the 404 autofluorescent chemicals were active across all cell culture conditions and color wavelengths, Fig. 7A. The 14 chemicals are shown in Supporting Information Table S3. Structurally, these chemicals usually included a large scaffold with a complex aromatic ring arrangement composed of more than 3 rings.

Figure 7B-D defines the overlap of active chemicals among cell culture conditions, within a color channel. A chemical that is autofluorescent in every combination of cell culture condition can be considered strongly interferent with the color channel with high confidence; however, these cases represented only a portion of the active chemicals: 40% of actives in the blue, 36% in the green channel and 53% for the red channel. The overlap among autofluorescence assays and the luciferase inhibition assay is shown in Supplemental Fig. S3.
Hierarchical clustering on active chemicals. The influence of four different cell culture conditions (HepG2 and HEK-293 cell lines with cell-based and cell-free conditions) on chemical activity in the autofluorescence assays was investigated across the three color channels. To identify clusters of chemicals specifically active for one cell culture or condition for different color channels, hierarchical clustering was performed where active chemicals sharing similar physico-chemical or topological properties were clustered together. The clustering is presented using a circular dendrogram, for each color channel and cell culture condition (Fig. 8A-C) and colored by potency. Chemicals specifically active for certain experimental conditions are highlighted and structures shown. Hierarchical clustering of autofluorescent chemicals across all twelve combinations of cell culture conditions and color channels is shown via circular dendrogram in Supporting Information Fig. S4, and for all active chemicals in the luciferase inhibition assay in Fig. S5.
Among chemicals active within wavelengths, certain chemical features were found to be associated with activity for one cell culture condition. Chemicals 62, 63, 64, 65 and 79 were found active specifically on the HepG2 culture. These chemicals have small structural scaffolds, with fewer than 20 atoms and no more than three conjugated rings such as benzoindole or naphthalene. Similarly, chemicals 49, 57, 66, 67, 68, 69 and 77 are active only on HEK-293 and share one benzene ramified with primary amine or alcohol groups. These structural scaffolds may drive interference with the culture conditions rather than the color channels.
Specific to the blue channel, chemicals 50, 51, 52 and 53, containing unsaturated chains ramified with several methyl groups, interact specifically with the HEK-293 cell-based and cell-free condition. Chemicals 58 and 59 composed of an imidazole ring connected with an aliphatic chain, were active only in the HEK-293 cell-free condition and chemicals 60 and 61 composed of a pyrimidine and a phosphate group are also found specifically active on HEK-293 cell-free condition.
Finally, some of the chemicals represented in Fig. 8 (structures 48, 54, 55, 56, 72, 73, 74, 75, 79, 80 and 81) included chemical substructures important for fluorescence, e.g. a high number of aromatic rings, and are close structural analogues of chemicals already discussed in the SOM results.
QSAR interference classification models. As shown using unsupervised statistical approaches, chemicals that actively inhibit luciferase share similar structural chemical properties and, separately, those that interfere with autofluorescence assays have common features, in some cases specific to interference patterns driven by assay platform, color channel, cell or condition. Based on structural and physico-chemical properties, quantitative structure-activity relationship (QSAR) models were developed to predict the likelihood of chemical-assay interference. Multiple machine learning approaches (see Methods) were used to predict luciferase inhibition, autofluorescence in any color channel across cell culture conditions, autofluorescence individually by color channel regardless of conditions, and autofluorescence by color channel and cell culture conditions uniquely. All model building steps, including undersampling from the over-represented set of inactives in the Tox21 dataset, training, cross-validation, and testing, were repeated ten times to ensure that all chemicals were incorporated in the process, and the mean and standard deviation of each performance criterion across all ten iterations were reported for each machine learning approach.
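The repeated undersampling procedure can be sketched as follows. This is a minimal illustration, assuming that every active is kept in each iteration and that inactives are drawn at random to reach a 30/70 active/inactive ratio; the function names, the seed, and the toy identifiers are hypothetical, and a purely random draw does not by itself guarantee that every inactive is eventually selected.

```python
import random
import statistics


def undersampled_sets(actives, inactives, active_fraction=0.3, n_repeats=10, seed=0):
    """Build n_repeats modeling sets from an imbalanced collection.

    Each set keeps all actives and a fresh random draw of inactives sized so
    that actives make up active_fraction of the set.
    """
    rng = random.Random(seed)
    # Number of inactives needed to reach the requested ratio.
    n_inactive = round(len(actives) * (1 - active_fraction) / active_fraction)
    n_inactive = min(n_inactive, len(inactives))
    sets = []
    for _ in range(n_repeats):
        drawn = rng.sample(inactives, n_inactive)  # sampling without replacement
        sets.append((list(actives), drawn))
    return sets


def summarize(scores):
    """Mean and sample standard deviation of one metric across the repeats."""
    return statistics.mean(scores), statistics.stdev(scores)


# Toy identifiers only; these are not Tox21 chemicals.
toy_actives = [f"A{i}" for i in range(30)]
toy_inactives = [f"I{i}" for i in range(500)]
toy_sets = undersampled_sets(toy_actives, toy_inactives)
```

Each element of `toy_sets` would then be split into training and test portions, a model fitted and scored, and the per-iteration scores passed to `summarize` to obtain the mean and standard deviation for each performance criterion.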
First, QSAR models were built to predict luciferase inhibition. After data processing, each full set of chemicals for model building was composed of 1,724 chemicals (30% actives, 70% inactives; Table 2). The RF model
For autofluorescence assays, QSAR models were first developed to predict activity without distinguishing by color channel or cell culture conditions; performance metrics are reported in Table 3. Only chemicals that were active across all conditions for each color (independently) were used for the active training set. For each modeling iteration, the full set included 507 chemicals (30% actives, 70% inactives). As with the QSAR models developed for luciferase inhibition, RF gave the best performance, with a cross-validation MCC of 0.564 ± 0.026 and a test set MCC of 0.568 ± 0.094; the CART, SVM, NN and LDA QSAR models had lower performance.
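For readers less familiar with the Matthews correlation coefficient (MCC) reported here, the short sketch below computes it from confusion-matrix counts using the standard definition. Returning 0 when the denominator vanishes is a common convention adopted here as an assumption, and the function name is illustrative.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: an empty row or column gives MCC = 0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because it uses all four cells of the confusion matrix, MCC stays informative on a 30/70 active/inactive set, where a model that predicts every chemical as inactive would still reach 70% accuracy but an MCC of 0.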
Specific QSAR models were then developed for each color channel. A chemical was considered active if it was active in at least one cell culture condition in a particular color channel. Performance metrics are presented in Table 4. The modeling dataset for the blue channel consisted of 1,045 chemicals, for the green channel 339 chemicals, and for the red channel 148 chemicals. As with the previous QSAR models, RF
concurrent blue channel activity on other cell culture conditions (Fig. 7). For the other culture-specific QSAR models, performance was close to that of the color-specific QSAR models presented in Table 4, due to a high degree of consistency among the red and green actives within each channel across conditions.
QSAR model descriptors. The variable importance scores of the RF QSAR model descriptors are presented in Fig. 9 for luciferase and Fig. 10 for autofluorescence assays and are summarized in Supporting Information Table S8. For all models, the most important descriptors include at least one feature characterizing the polarizability of the chemical (CombDipolPolariz or bcutp) and one of the following physicochemical properties (UI, logP_pred, BP_pred, MP_pred or BioDegHL_pred). Specifically, for luciferase inhibition the most important descriptors characterize the ratio between unsaturated/saturated bonds (Sp3Sp2HybRatio), followed by the E-state of a methyl connected to an aromatic (S12). On average, active chemicals have a lower ratio between unsaturated/saturated bonds (0.21 for actives and 0.48 for inactives) and a higher E-state value for methyl groups connected to an aromatic (S12; 11.15 for actives and 5.74 for inactives, Supporting Information Table S9).
For autofluorescence QSARs, unsaturated index (UI) descriptors, which characterize the ratio between unsaturated and saturated bonds in a chemical, were found to be important in almost every model. The unsaturated index is higher for active chemicals (~3.5) than for inactive chemicals (~2.6) because active chemicals are mostly enriched in aromatic groups. Burden descriptors (bcutp), which characterize mass by atom type and bond, are also present in every model and are higher for active chemicals than for inactive chemicals. The red channel model includes three descriptors characterizing the charges (QNmin, Qmin and QOmin), and active chemicals are less charged than inactives. The QSAR models built individually on the blue and green color channels shared six of the top 10 important descriptors, and the blue channel and luciferase inhibition models shared four important descriptors.
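The comparison of top-ranked descriptors between models can be expressed compactly. The sketch below assumes that each of the repeated models yields one importance score per descriptor, averages those scores across repeats, and intersects the resulting top-k lists. The importance values are invented for illustration and are not the values plotted in Figs. 9 and 10; only the descriptor names are taken from the text.

```python
from statistics import mean


def top_descriptors(importance_runs, k=10):
    """Rank descriptors by mean importance across repeated models.

    importance_runs is a list of dicts (descriptor name -> importance),
    one dict per repeat. A descriptor missing from a run counts as 0.
    """
    names = set().union(*importance_runs)
    avg = {n: mean(run.get(n, 0.0) for run in importance_runs) for n in names}
    # Sort by decreasing mean importance, breaking ties alphabetically.
    ranked = sorted(avg, key=lambda n: (-avg[n], n))
    return ranked[:k]


def shared(top_a, top_b):
    """Descriptors that appear in both top-k lists."""
    return sorted(set(top_a) & set(top_b))


# Invented importance values for two hypothetical models, two repeats each.
toy_blue = [
    {"UI": 0.9, "bcutp": 0.8, "logP_pred": 0.2},
    {"UI": 0.7, "bcutp": 0.9, "logP_pred": 0.1},
]
toy_green = [
    {"UI": 0.8, "MP_pred": 0.6, "bcutp": 0.1},
    {"UI": 0.8, "MP_pred": 0.6, "bcutp": 0.1},
]
toy_shared = shared(top_descriptors(toy_blue, k=2), top_descriptors(toy_green, k=2))
```

Averaging over the repeats before ranking keeps a descriptor that is important in only one undersampled set from dominating the comparison.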

Discussion
Out of the large, diverse chemical set represented by the Tox21 library, approximately 10% of chemicals demonstrated some degree of assay interference potential depending on the endpoint (luciferase inhibition or autofluorescence via blue, green, and/or red channels). These chemicals cover a wide range of commercial uses and regulatory lists of concern, with varying degrees of associated environmental occurrence and toxicity data 42. In most cases, data generated under the Tox21 program represent the bulk of bioactivity information available for a particular chemical, highlighting the importance of identifying false signals to allow for true characterization of chemical-target activity driven by biological and toxicological perturbations. Luciferase inhibition was the most prevalent interference activity observed among the Tox21 chemicals (Table 1). This assay relies on loss of signal, which would potentially be interpreted as a false positive for the antagonist or inhibitory mode of a luciferase-based reporter gene assay measuring biological activity. For autofluorescence, there were many more chemicals active in the blue channel (~3%) than in the green (~1%) or red (~0.5%) channels. These results are in agreement with other studies showing that around 5% of the chemicals in a diverse library can fluoresce in the blue spectrum versus less than 1% for the other parts of the spectrum 10,46. Interestingly, the activity density plots for all three color wavelengths demonstrated bimodal potency distributions, with two peaks around 10 and 40 µM, and the majority of AC50 values falling below 50 µM (Fig. 4).
Multiple supervised and unsupervised machine learning and clustering approaches were harnessed to analyze the interference activity data and characterize structural patterns associated with luciferase inhibition or autofluorescence among different wavelengths. For all of the clusters discussed in the SOM for luciferase inhibition (Fig. 5A), the structural scaffolds found were consistent with those that have been previously characterized in the literature as Fluc inhibitors 9,11. When examining the SOM for autofluorescence (Fig. 5B), it is clear that a high number of rings in a scaffold is associated with fluorescence activity, as discussed in Su et al. 14, where the authors stated that the presence of more than six rings in a chemical fulfilled the criteria for a strong fluorophore. This observation also held for chemicals that were strongly interferent, i.e. active across all wavelengths and culture conditions, where a high number of rings (and the presence of at least one oxygen) increases the ability to absorb light. Further known and novel associations were identified between various structural scaffolds and autofluorescence activity (Figs. 4A-C and 5B), and channel/condition-specific potency/activity (Fig. 8A-C), prompting the development of structure-based prediction models.
QSAR models were developed using four machine learning algorithms. Across models, CART and nonlinear SVM were not able to discriminate active and inactive chemicals: SVM models tended to overfit the training set, with accuracies close to 100%, and CART models poorly predicted the test sets (negative MCC). The lack of performance of these models is explained by the unbalanced set and the difficulty of linearly separating active and inactive chemicals due to the large diversity of inactive chemicals (see PCA plot, Supporting Information Fig. S6). Overall, RF models exhibited the best predictive performance, as discussed below.
The QSAR models for luciferase inhibition demonstrated high predictive performance, with ~83% external test set accuracy values (Table 2) and balanced accuracy equal to 76%. For each color channel in the autofluorescence dataset, there were varying degrees of chemical activity in specific cell culture conditions, but the largest number of active chemicals was always found in the intersection among all conditions (Fig. 7), representing those compounds that could be considered "strongly interferent" for a particular wavelength. These strong actives were used as "true positives" in the machine learning approaches applied to build models for general autofluorescence, which demonstrated over 82% accuracy on external validation sets (Table 3) with balanced accuracy of 77%. There was no significant difference between percent actives in individual cell culture conditions within or across wavelengths, indicating that one particular cell line or media choice did not play a large role in driving autofluorescent interference. However, structure-based analysis did reveal clusters of active chemicals specific to certain cell culture conditions, and the QSAR models that were built for individual channel/culture combinations demonstrated high performance (76-92% external set accuracy, Supporting Information Tables S4-S7). In the case of the red channel models, this is due to the high degree of consistency where active chemicals were generally active across culture conditions (Fig. 8C), meaning that the individual culture condition models are essentially equivalent to the red channel model overall. For the blue channel models there were higher numbers of active chemicals in individual culture conditions (Fig. 8A), and the respective structure-based prediction models therefore provide unique insight.
As would be expected based on the number of chemicals and patterns of activity across culture conditions within each channel, the highest performing color-specific model was on the red channel, with 86-87% external set accuracies and 80% balanced accuracy, followed by the green channel with 85-86% external set accuracy and 81% balanced accuracy. Performance was lower on the blue channel, with external set accuracies of 77-78% and a balanced accuracy of 68%, due to the larger number of diverse chemical structures with disparate activity across cell culture conditions. For every model, loss of performance can be explained by low sensitivity, i.e. the difficulty in predicting active chemicals. This phenomenon is explained by the structural diversity of the large inactive set, and the coverage of the active set by the inactive set, where inactive chemicals share many structural features with active chemicals, as shown in Supporting Information Fig. S6 by the high density of the PCA map. The low purity of clusters in the SOM clustering (Figs. 5 and 7), below 70% actives, also shows the proximity between active and inactive chemicals.
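The gap between accuracy and balanced accuracy noted above follows directly from low sensitivity on an imbalanced set. The sketch below uses the standard definitions; the confusion-matrix counts are invented for illustration (a 30/70 active/inactive test set) and do not correspond to any table in this study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all chemicals that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true actives predicted as active."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true inactives predicted as inactive."""
    return tn / (tn + fp)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity, insensitive to class ratio."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Invented counts: 30 actives (15 found, 15 missed), 70 inactives (66 correct).
toy_counts = dict(tp=15, tn=66, fp=4, fn=15)
toy_accuracy = accuracy(**toy_counts)            # 81 of 100 correct
toy_balanced = balanced_accuracy(**toy_counts)   # mean of 15/30 and 66/70
```

With these toy counts, only half of the actives are recovered, yet overall accuracy remains above 80% because the correctly predicted inactives dominate the total; balanced accuracy is lower because it weights the two classes equally.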
The developed QSAR models were applied to predict interference for 4632 small molecules included in the DrugBank database with structures available on the EPA CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/chemical_lists/DRUGBANK). From this list of chemicals, 1495 were found in the Tox21 chemical library, and performance on this subset of chemicals with assay data was consistent with performance on the test set, with an accuracy of 0.94 and an MCC of 0.63 for luciferase models, and an accuracy of 0.88 and an MCC of 0.33 for autofluorescence models (merging conditions, cell lines and wavelengths). The interference probability distribution (Supplemental Fig. S7 panel A) shows that most chemicals were predicted not to interfere with assay technologies, while the distribution of those predicted as interferent compounds (Supplemental Fig. S7 panel B) was consistent with the Tox21 chemical library (Fig. 7). For example, the models identified lymecycline (CAS: 992-21-2) as an interferent chemical for all of the autofluorescence models in the blue and green channels. Lymecycline belongs to the tetracycline family, whose members are known to absorb light at low wavelengths (Conover et al. 1953), which is consistent with interference in low absorbance color ranges (blue and green). The QSAR predictions for this list of drugs/small molecules have been included in supporting information.
Figure 9. Variable importance plots for the top 10 descriptors from the random forest QSAR models predicting luciferase inhibition. Ten values for each descriptor are reported, corresponding to each of the ten models developed with different data set segregations. Descriptors are defined in Supporting Information.
Figure 10. Variable importance plots for the top 10 descriptors from the random forest QSAR models predicting autofluorescence in: (A) any color channel, (B) the blue channel, (C) the green channel and (D) the red channel. The importance of each descriptor is normalized, and ten values for each descriptor are reported corresponding to each of the ten models developed with different data set segregations. Descriptors are defined in Supporting Information Table S8.
The application of RF models to predict luciferase inhibition and autofluorescence allows for interrogation of important structural features and physicochemical properties driving the model performance (variable importance, Fig. 9). Over 50% of the important descriptors found were dependent on or connected to aromatic properties, confirming that aromatic rings play an important role in the absorption/emission of light and, as previously discussed, aromatic rings are known to influence fluorescence chemical properties 14. This is apparent when examining the reference chemicals for each channel. The scaffold of triamterene is composed of a pteridine substituted with three primary amines and one benzene, and includes only nitrogen and carbon as heavy atoms. Fluorescein is composed of a dioxaspirane substituted with one benzene and two phenol groups, and includes only carbon and oxygen. Finally, rose bengal includes a xanthene group substituted with four iodine atoms, two ketones and one tetrachlorobenzoic acid group, and has more diverse atom types than triamterene and fluorescein, with chlorine and iodine atoms. The red channel model includes three descriptors characterizing the charges, which can be explained by the more diverse composition of heteroatoms in active chemicals on the red channel, as shown for example with rose bengal (chemical 4 in Fig. 4), which includes iodine and chlorine. It is also worth noting that the blue channel model and the luciferase inhibition model shared four of the most important descriptors, which can be explained by the number of chemicals (26) active in both assays (Supporting Information Fig. S3).
The work discussed here represents one of the largest screening efforts to date specifically intended to identify and characterize chemical-assay interference via luciferase inhibition and autofluorescence, and to interrogate the influence of cell types and culture conditions. The resulting predictive models (https://sandbox.ntp.niehs.nih.gov/interferences) can be used to predict interference potential of new chemicals, and to provide insight into structural features that may influence activity and inform molecular design and assay selection.