Optimization of cell viability assays to improve replicability and reproducibility of cancer drug sensitivity screens

Cancer drug development has been riddled with high attrition rates, in part, due to poor reproducibility of preclinical models for drug discovery. Poor experimental design and lack of scientific transparency may cause experimental biases that in turn affect data quality, robustness and reproducibility. Here, we pinpoint sources of experimental variability in conventional 2D cell-based cancer drug screens to determine the effect of confounders on cell viability for MCF7 and HCC38 breast cancer cell lines treated with platinum agents (cisplatin and carboplatin) and a proteasome inhibitor (bortezomib). Variance component analysis demonstrated that variations in cell viability were primarily associated with the choice of pharmaceutical drug and cell line, and less likely to be due to the type of growth medium or assay incubation time. Furthermore, careful consideration should be given to different methods of storing diluted pharmaceutical drugs and use of DMSO controls due to the potential risk of evaporation and the subsequent effect on dose-response curves. Optimization of experimental parameters not only improved data quality substantially but also resulted in reproducible results for bortezomib- and cisplatin-treated HCC38, MCF7, MCF-10A, and MDA-MB-436 cells. Taken together, these findings indicate that replicability (the same analyst re-performs the same experiment multiple times) and reproducibility (different analysts perform the same experiment using different experimental conditions) for cell-based drug screens can be improved by identifying potential confounders and subsequent optimization of experimental parameters for each cell line.


Results
Replicability was affected by suboptimal resazurin reduction assay and cell culture protocols. To be able to compare drug potency estimates between different studies, we used breast cancer cell lines (HCC38 and MCF7) and pharmaceutical agents (bortezomib, cisplatin, and carboplatin) that are long established in cancer research and drug screening. To investigate the effect of these chemotherapeutic agents on cell viability in MCF7 and HCC38 cells, we initially performed the resazurin reduction assay using standard experimental protocols and calculated dose-response curves and the half-maximal inhibitory concentration (IC50) after 24-hour drug exposure. Cells were plated at a density of 1.0 × 10⁴ cells per well (96-well plate) in 100 µl HuMEC basal serum-free medium. Previous studies have shown that complete growth medium supplemented with FBS reduces the effect of the proteasome inhibitor bortezomib on proteasome activity 27, thereby warranting the use of HuMEC medium for all drug treatments. Due to poor aqueous solubility, the pharmaceutical drugs were dissolved in DMSO (final DMSO concentration ranging from 0.0002-10%) according to the manufacturer's instructions, further diluted with 1xPBS to the desired working concentration (1-10000 nM bortezomib and 2-1024 µM cisplatin/carboplatin), and stored at 4 °C for up to one week in Falcon TC-treated flat-bottom culture microplates (VWR). However, use of these experimental parameters (cell density of 1.0 × 10⁴ cells per well, serum-free medium, and a single DMSO control) resulted in intra- and interexperimental inconsistencies, including differences between replicates and dose-response curves that started at viability estimates higher than 100% (Fig. 1).

Cell viability was negatively affected by evaporation and DMSO solvent.
To evaluate the effect of suboptimal experimental design on cell viability (IC50) and area under the dose-response curve (AUC) values, we tested several experimental parameters that may affect replicability and reproducibility (e.g. storage of the pharmaceutical drugs, cell culture conditions, and resazurin assay conditions; Table 1 and Fig. 2). We first examined whether evaporation had a negative impact on cell viability and should, therefore, be minimized. Evaporation rates and/or cell viability were measured for a) diluted drugs stored at 4 °C or −20 °C for 48 hours in 96-well flat-bottom culture microplates sealed with Parafilm around the lid, b) diluted drugs stored at −20 °C for 72 hours in either PCR plates (sealed with aluminum tape) or culture microplates (sealed with Parafilm around the lid), and c) HCC38 cells treated with DMSO, PBS or diluted drugs incubated at 37 °C in a humidified 5% CO2 environment. After as little as 48 hours of storage at 4 °C or −20 °C, evaporation and the resulting concentration of the diluted drugs had a significant effect on cell viability (Fig. 3a). Consequently, IC50 and AUC values decreased with time, but whether the drugs were stored in the refrigerator or the freezer (for the same length of time) had no effect on cell viability. However, evaporation of the drugs occurred at a faster rate in flat-bottom culture microplates than in PCR plates (Fig. 3b). Despite using culture microplates that are designed to minimize evaporation, an edge effect was observed in wells around the perimeter of microplates incubated at 37 °C (Supplementary Figure 1). Consequently, elevated resazurin-based absorbance values were measured for cells in the perimeter wells, with similar effects found for cells treated with DMSO and bortezomib (Fig. 3c).
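The link between evaporation and drug dose can be illustrated with simple arithmetic. The sketch below assumes that only solvent is lost and that all of the drug remains in the well; the function name and the example volumes are hypothetical and are not measurements from this study.

```python
def concentration_after_evaporation(c0, v0_ul, v_lost_ul):
    """Return the drug concentration after solvent loss.

    Assumption (not from the paper): only solvent evaporates, so the
    amount of drug (c0 * v0) is conserved in the remaining volume.
    """
    if not 0 <= v_lost_ul < v0_ul:
        raise ValueError("lost volume must be >= 0 and smaller than the starting volume")
    return c0 * v0_ul / (v0_ul - v_lost_ul)


# Hypothetical example: a nominal 10 nM dilution in 100 µl that loses 20 µl.
nominal_nm = 10.0
effective_nm = concentration_after_evaporation(nominal_nm, 100.0, 20.0)
print(f"nominal {nominal_nm} nM -> effective {effective_nm} nM")
```

Under this assumption, a 20% volume loss raises every dose in the dilution series by a factor of 1.25, which shifts the apparent dose-response curve towards lower nominal concentrations.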
We then evaluated experimental conditions (e.g. sensitivity to DMSO, medium type, resazurin incubation time) that could potentially have a detrimental effect on the performance of the resazurin assay or cell viability (Table 1). Major DMSO cytotoxic effects on MCF7 cells were observed after 24 hours of exposure to as little as 1% (v/v) DMSO, with a substantial decrease in cell viability with increasing DMSO concentration (Fig. 3d). In addition, use of a single DMSO vehicle control containing 1% (v/v) DMSO resulted in dose-response curves starting at cell viability higher than 100% and relatively large error bars, which was corrected by using matched DMSO concentration controls for each drug dose. All other tested experimental conditions were frequently cell line-specific and had minimal effects on the dose-response curves. Nevertheless, stable dose-response curves with small error bars were produced using 7.5 × 10³ cells per well (96-well plate) in 100 µl growth medium containing 10% FBS, without daily renewal of the medium/drug or supplementing the growth medium with antibiotics. These culture conditions allowed the cells to be cultured for at least 72 hours without reaching the plateau phase of growth, without detectable production of the non-fluorescent dihydroresorufin, and without affecting the proteasome-inhibitory properties of bortezomib. Furthermore, absorbance and fluorescence were comparable methods of resorufin detection.
Experimental parameters should be optimized due to the influence on cell viability. To identify factors influencing cell viability, one-way ANOVA was then employed using cell viability as the dependent variable and all other experimental parameters (e.g. cell line, drug, drug dose) as covariates. As expected, the vast majority (approximately 90%) of the variation in cell viability was associated with the choice of pharmaceutical drugs (i.e. drug type, dose, and treatment time) and cell line (5.4%; Supplementary Table 1).
Though to a lesser extent, growth medium (type of medium/addition of FBS and medium volume) and resazurin incubation time also had an impact on cell viability (P < 0.05; Fig. 4). These findings highlight the importance of assay optimization for each cell line to minimize the effect of confounders.
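The percentage of variation explained by a factor (η²) is the between-group sum of squares divided by the total sum of squares. The following is a minimal Python sketch for a single factor, with made-up viability values; the original analysis was performed in R, so the function name and the data here are illustrative assumptions only.

```python
from statistics import fmean


def eta_squared(groups):
    """Return eta squared (SS_between / SS_total) for one factor.

    groups: dict mapping a factor level (e.g. a drug name) to a list of
    cell viability values measured at that level.
    """
    values = [v for group in groups.values() for v in group]
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("all values are identical; eta squared is undefined")
    ss_between = sum(
        len(group) * (fmean(group) - grand_mean) ** 2 for group in groups.values()
    )
    return ss_between / ss_total


# Hypothetical viability values (%), not data from this study.
viability_by_drug = {
    "drug_A": [90.0, 92.0, 88.0],
    "drug_B": [40.0, 42.0, 38.0],
}
print(f"eta squared = {eta_squared(viability_by_drug):.3f}")
```

A value close to 1 means that almost all of the variation in viability lies between the levels of the factor, while a value close to 0 means the factor explains little of it.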
Optimized assay conditions facilitated reproducible cancer drug cytotoxicity testing. To validate the optimized resazurin reduction assay for drug cytotoxicity testing, we determined IC50, GR50, and GRmax values for MCF7, HCC38, MCF-10A, and MDA-MB-436 cells exposed for 24 hours to bortezomib or cisplatin (cell seeding density of 7.5 × 10³ cells per well of a 96-well plate in 100 µl growth medium supplemented with 10% FBS). All experiments were performed at least three times in triplicate. Intra- and interexperimental variation in resazurin-based absorbance values was also examined using assay quality control metrics (QCM: Z-factor, signal window (SW), and coefficient of variation (CV)) with thresholds set to Z-factor > 0.4, SW > 2, and CV < 20%. An ideal assay would have a large signal dynamic range and/or small signal variation (standard deviation (SD) or CV), with a Z-factor close to or equal to 1 28. In addition, the signal window is recommended to be at least 2 SD of the largest detected assay signal to differentiate biologically active and inactive compounds 29. Here, we show that the optimized assay performed well in all four cell lines using bortezomib and cisplatin. All data points were within the recommended ranges, with the majority of data points showing Z-factor > 0.75, SW > 10, and CV < 5% (Fig. 5 and Supplementary Table 2). The 50% inhibitory concentration (IC50) is conventionally used to determine drug potency with cell-based cytotoxicity tests. According to the IC50 values, HCC38 cells were more sensitive to bortezomib than MCF7, MCF-10A, and MDA-MB-436 cells 24 hours after treatment (Fig. 6a). In line with Hafner et al., IC50 values were shown to be dependent on cell growth rates and drug treatment time 6. For the HCC38 and MCF7 cell lines, IC50 values were found to decrease with increasing bortezomib treatment times and resulted in relatively similar IC50 values 72 hours post-treatment (Supplementary data; MCF7, IC50 = 37 nM and HCC38, IC50 = 2.5 nM).
These findings prompted us to investigate whether bortezomib was an effective proteasome inhibitor. At least 50% inhibition of proteasome activity was detected in all four cell lines two hours after exposure to 10 nM bortezomib and close to complete proteasome inhibition with 100 nM bortezomib (Fig. 6b). As expected, bortezomib blocked cell cycle progression by inducing G2/M phase arrest in HCC38 cells 24 h after treatment (Fig. 6c).
The pharmacoDB database was then used to compare the IC50 values for MCF7, HCC38, MCF-10A, and MDA-MB-436 cells treated with bortezomib and cisplatin (Fig. 7). On average, the pharmacoDB datasets showed
Figure 2. Schematic workflow of the experimental design. The resazurin viability assay was used to evaluate drug response using two breast cancer cell lines (MCF7 and HCC38) and three pharmaceutical drugs (bortezomib, carboplatin and cisplatin). The assay was optimized using Analysis of Variance (ANOVA) to identify potential experimental confounders and Z-factor (Z), Signal window (SW) and coefficient of variation (CV) to evaluate the signal dynamic range. Thereafter, the optimized parameters were implemented using four breast cancer cell lines (MCF7, HCC38, MCF-10A, and MDA-MB-436) and three pharmaceutical drugs (bortezomib, carboplatin and cisplatin). Finally, our data were compared to published data (pharmacoDB 42 and Hafner et al. 31).
(2020) 10:5798 | https://doi.org/10.1038/s41598-020-62848-5 www.nature.com/scientificreports
NCI-60 cancer cell line panel (26 hours) 30. Due to the difference in population doubling times, growth inhibition metrics (GR50, GRmax, and GR_AOC) were used to assess drug potency and efficiency after adjusting for cell growth rates. Using these metrics, we found that cisplatin was comparably cytotoxic to HCC38 (GR50 = 21 ± 7 µM, GRmax = −0.93 ± 0.05, and GR_AOC = 1.   (Fig. 8a). Interestingly, in agreement with Hafner et al., HCC38 cells were shown to be more sensitive to bortezomib than MCF7 and MCF-10A cells, and GRmax values revealed that bortezomib and cisplatin were cytotoxic to all cell lines except for MDA-MB-436 31. However, the data of Hafner et al. indicated that HCC38 cells were more sensitive to cisplatin than MCF7 cells, whereas our data showed no difference in sensitivity to cisplatin.
In addition, we found a statistically significant difference between the two studies in GR50 values for MCF-10A cells treated with bortezomib and cisplatin, as well as in GRmax values for cisplatin-treated MCF-10A, MCF7, and MDA-MB-436 cells (Fig. 8b,c).

Discussion
In the current study, we describe a number of biological (cell type, medium composition and volume, and seeding density) and technical (edge effect, drug type, dose, storage, and treatment time, assay duration time, dose-response metric) factors that should be taken into consideration to achieve more reliable and reproducible drug-dose sensitivity screens in 2D models. Although many of these factors are known to affect drug response, few studies have proposed strategies to quantify and correct sources of experimental variability in cell-based drug screens using QCM 6–8,16,18,19,32. The optimization strategy used here can be adapted to other cell-based and animal model systems by first identifying confounding factors in the experimental setup, followed by optimization of critical experimental parameters and assessment of data quality. However, it is recommended that each lab evaluates whether the proposed optimal conditions are suitable for the cells (e.g. derived from other tissue types) and assay (e.g. signal dynamic range) used in their experiments. Here, we first optimized the resazurin viability assay using two breast cancer cell lines (HCC38 and MCF7) and then implemented the optimized assay on two additional cell lines derived from breast tissue (MCF-10A and MDA-MB-436). We used the resazurin substrate as a marker of cell viability because it is considered to be a relatively quick, non-toxic, cost-effective, and flexible method that can be utilized with a variety of different culture conditions 8,33,34. To be able to compare drug potency estimates between different studies, long-established breast cancer cell lines (HCC38, MCF7, MCF-10A, and MDA-MB-436) and pharmaceutical agents (bortezomib, cisplatin, and carboplatin) in cancer research and drug screening were utilized 35–37.
There has been considerable controversy over whether high-throughput screens (HTS) produce consistent results between different centers 7,13. Despite differences in experimental protocols, two large-scale HTS pharmacogenomics studies (Cancer Genome Project (CGP) and the Cancer Cell Line Encyclopedia (CCLE)) were shown to generate highly concordant transcriptional profiles, but discordant drug response. These findings indicate a need to standardize protocols for cell-based drug sensitivity screens by e.g. establishing a consensus on appropriate positive controls (genomically verified cancer cell lines treated with "reference" chemotherapeutic agents), cell culture conditions, etc. 8. To address this issue, long-established breast cancer cell lines and pharmaceutical agents were used in the present study. However, HTS are frequently performed using automated liquid handling robots. Although robotic systems are time effective and can facilitate high precision pipetting, challenges associated with this method include e.g. the inability to adjust drug doses for different cell types, evaporation (edge effect), clogging, air bubbles, use of small volumes and pipetting of viscous liquids 38. In order to develop common standards for drug screening, it is also important to investigate the effect of confounding factors on cell viability and identify drugs that have a cytotoxic effect by evaluating assay suitability with QCM 28.
Drug screening assays with a low CV (little to no signal variation), high SW (above 2 SD, large signal dynamic range) and a Z-factor close to or equal to 1 should be able to identify biologically active pharmaceutical compounds. Here, CV, SW, and Z-factor were within recommended ranges (CV < 20%, SW > 2 SD, and Z-factor > 0.4) in the three independent experiments, with CV below 5%, SW exceeding 10 SD and Z-factor close to 1. Furthermore, variance component analysis can potentially identify confounders in an assay by estimating the contribution of covariates (e.g. cell type, drug, drug dose) to the variance of a dependent variable (e.g. cell viability). In agreement with previous studies, we demonstrated that cell line, drug type, dose and treatment time had a substantial effect on cell viability 8,16 . Though we used cell lines derived from different breast cancer subtypes with differing biological features, the impact of cell type on drug response was likely underestimated due to the use of cell lines from the same tissue. Seeding density, medium type and volume, and resazurin assay time were also shown to have an impact on cell viability, though to a lesser extent 6,8 . In the present study, a narrow range of seeding densities (5000 to 10000 cells/well) was used. However, Hafner et al. showed that seeding density can have a major effect on GR values, especially when using a broad range of seeding numbers 6 . In contrast, the effect of medium type and volume is less studied. Interestingly, we show that different FBS concentrations (0-20% FBS) have a minimal effect on drug sensitivity after 24 hours exposure. As FBS is expected to have an effect on cell proliferation, it would be interesting to assess the impact of FBS concentration on cell viability over time 39,40 .
Interexperimental variability was reduced by optimizing critical experimental parameters (e.g. controls, edge effect, evaporation, and cell seeding density) for each cell line, thereby improving replicability of the drug-dose response curves. Pharmaceutical drugs are typically dissolved in various solvents (e.g. DMSO, DMF, saline, PBS) for use in in vitro studies. However, solvents such as DMSO can have a profound effect on cell viability, even at concentrations as low as 0.33%. By using a single DMSO vehicle control, we not only observed both over- and underestimation of viability, but also dose-response curves starting at levels above 100% viability. In contrast, matched DMSO concentration controls are highly recommended as they reduce the risk of dose-response curves starting at >100% viability. Due to the risk of evaporation, matched controls and drugs should be plated in the same location on the 96-well plate. However, using matched concentration controls is more time consuming because a dilution series of the solvent needs to be prepared, which takes up more space on the 96-well plate and drives up costs. In the edge effect experiments, we concluded that evaporation was higher around the perimeter of the 96-well plates, thereby affecting the concentration of added components and cell viability measurements. To minimize the edge effect, we excluded perimeter wells and instead filled these wells with PBS. Since the cell viability assay is usually performed over several days, each well should be seeded with an appropriate cell number so that the cell population will be in the exponential phase during the course of the experiment and not affected by high confluency (stationary phase) that in turn could affect drug efficiency 31,41. Lastly, an evaluation of evaporation on plates containing diluted drugs (stored at −20 °C) demonstrated that 96-well flat-bottom plates sealed with Parafilm had higher evaporation than 96-well PCR plates sealed with aluminum tape.
Consequently, evaporation of diluted drugs and medium components can lead to fluctuations in drug response, thereby making it difficult to estimate accurate IC50, GR50 and GRmax values and achieve reproducible results.
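The perimeter-well exclusion described above can be expressed as a simple plate map. The sketch below assumes a standard 8 × 12 (rows A-H, columns 1-12) 96-well layout; the labels "PBS" and "assay" and the function name are illustrative choices rather than the plate map used in this study.

```python
import string

ROWS = string.ascii_uppercase[:8]  # rows A-H of a 96-well plate
COLUMNS = range(1, 13)             # columns 1-12


def plate_layout():
    """Map each well to 'PBS' (perimeter) or 'assay' (inner wells)."""
    layout = {}
    for row in ROWS:
        for column in COLUMNS:
            on_edge = row in (ROWS[0], ROWS[-1]) or column in (1, 12)
            layout[f"{row}{column}"] = "PBS" if on_edge else "assay"
    return layout


layout = plate_layout()
assay_wells = [well for well, use in layout.items() if use == "assay"]
print(f"{len(assay_wells)} inner wells available for cells, drugs and matched controls")
```

Excluding the perimeter leaves 60 of the 96 wells for cells, drug doses and matched solvent controls, which is part of the space cost noted above.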
To evaluate the reproducibility of our results, we compared our data for four breast cancer cell lines (HCC38, MCF7, MCF-10A, and MDA-MB-436) treated with two anti-cancer drugs (bortezomib and cisplatin) with data from pharmacoDB and Hafner et al. 31,42. Although we frequently report slightly higher IC50 values due to differences in drug treatment time (24 hours vs 48-72 hours), drug response was relatively comparable. With the exception of MCF-10A cells (treated with bortezomib and cisplatin) and MCF7/MDA-MB-436 cells treated with cisplatin, we show GR values similar to those of Hafner and colleagues. Drug treatment time differs frequently between studies, ranging from minutes to days. Here, we conducted drug sensitivity screening for 24, 48, and 72 hours and observed similar IC50 values between different cell lines after 48 hours despite differences in population doubling time. In addition, medium and drug replacement every 24 hours had no effect on cell viability, indicating that treatment was most crucial during the first 24 hours. In the case of bortezomib, proteasome activity was inhibited after a 2-hour treatment time and G2/M phase arrest was induced within 24 hours. For reference, the half-life of the tested drugs ranges from 25 minutes to about 20-30 hours in patients (depending on the drug type) 36,43,44.
In summary, optimization is crucial for data reliability regardless of the model type (e.g. 2D monolayers, 3D spheroids or in vivo models) used to evaluate drug sensitivity 45. We identified new critical experimental parameters (e.g. matched solvent concentration controls and drug storage) that need to be optimized to develop high precision, robust and reproducible cell viability assays. IC50 is commonly used by researchers to determine the potency of a drug on a certain cell line. To be able to compare drug potency estimates between different studies, we recommend using both IC50 and GR50. Human error or instrument inconsistencies can result in variations in data measurements that need to be overcome to identify biologically active hits (potent drugs) 28. However, using QCM increases the chance of identifying hits within experiments. Ultimately, careful consideration of assay optimization and estimation of drug potency during the preclinical drug screening process may help to improve the success rates of cancer drug candidates that reach clinical trials.
Pharmaceutical compounds. Stock solutions for platinum-based agents (10 mM cisplatin and carboplatin) and a proteasome inhibitor (1 mM bortezomib) were prepared using DMSO (Sigma-Aldrich; stored at −80 °C), further diluted in 1xPBS to the appropriate concentration, and plated in 96-well PCR plates (VWR; stored at −20 °C). The pharmaceutical compounds were screened at nine concentrations (2-1024 µM cisplatin/carboplatin and 1-10,000 nM bortezomib) using a 2-fold dilution series with matched DMSO concentration vehicle controls. The pharmaceutical compounds were at room temperature (18-25 °C) when added to cells.
Proteasome activity was assessed using the Proteasome-Glo Chymotrypsin-like assay (Promega) with bortezomib-treated cells seeded in 96-well clear, flat-bottom microplates (Corning Life Sciences) at a density of 7.5 × 10³ cells per well in 100 µl culture medium (RPMI or DMEM basal medium supplemented with 5%, 10% or 15% FBS or without FBS, and HuMEC Basal Serum-Free medium supplemented with epidermal growth factor, hydrocortisone, isoproterenol, transferrin, insulin, and bovine pituitary extract (Life Technologies)).
Resazurin-based cell viability assay. Cells were plated in 96-well clear, flat-bottom microplates (Corning Life Sciences), at a density of 7.5 × 10³ cells per well in 100 µl culture medium supplemented with 10% FBS and cultured for 24 hours. Cell viability was assessed after drug treatment for 24, 48 or 72 hours using 0.2 mg/ml resazurin solutions prepared from resazurin sodium salt (Fisher Scientific) dissolved in sterile 1xPBS (ThermoFisher Scientific). In brief, the resazurin solution was at room temperature when added to the cells in each well. The cells were incubated with 10 µl resazurin solution (10% of cell culture volume) for four hours at 37 °C. The absorbance was measured with a 560 nm excitation filter and a 615 nm emission filter in a Wallac 1420 VICTOR2 microplate reader (Perkin Elmer). For five days, the metabolic activity of each cell line (untreated cells) was measured daily to monitor potential resorufin (fluorescent pink) conversion to dihydroresorufin (non-fluorescent and colorless). Optimization of the resazurin assay parameters is detailed in the Supplementary data.
Percentage cell viability was calculated as 100% × (absorbance of treated cells − absorbance of background controls) / (absorbance of matched DMSO concentration controls − absorbance of background controls). Normalized growth rate inhibition (GR) was assessed using the absorbance of untreated cells at the time of treatment (t = 0; 24 hours after plating). The half-maximal inhibitory concentration (IC50), drug potency (GR50), and drug efficiency (GRmax) were determined for each compound using the GRmetrics (version 1.0.0) package 46 in R/Bioconductor (version 3.3.2).
Cell population doubling time. HCC38 and MCF7 cells were plated in 96-well flat-bottom plates (Corning Life Science) at a density of 7.5 × 10³ cells per well in 100 µl culture medium supplemented with 10% FBS and cultured for five days. The resazurin assay was performed daily to determine the cell population doubling time (Td) during the exponential growth phase. The doubling time was calculated in hours using linear regression analysis.
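As an illustration of the viability and GR calculations described above, the following Python sketch computes both quantities for a single well. It is not the GRmetrics implementation used in this study: the GR expression follows the published definition by Hafner et al., the inputs to `gr_value` are assumed to be background-corrected, and all absorbance values are made up.

```python
from math import log2


def percent_viability(treated, matched_control, background):
    """Viability (%) relative to the matched DMSO concentration control."""
    return 100.0 * (treated - background) / (matched_control - background)


def gr_value(treated, control, time_zero):
    """Normalized growth rate inhibition for one drug dose.

    GR = 2 ** (log2(treated / time_zero) / log2(control / time_zero)) - 1,
    where time_zero is the signal of untreated cells at the time of treatment.
    Assumption: all three signals are already background-corrected.
    """
    return 2.0 ** (log2(treated / time_zero) / log2(control / time_zero)) - 1.0


# Hypothetical absorbance values, not data from this study.
print(percent_viability(treated=0.55, matched_control=1.00, background=0.10))
print(gr_value(treated=2.0, control=4.0, time_zero=1.0))  # partial growth inhibition
print(gr_value(treated=1.0, control=4.0, time_zero=1.0))  # no net growth (cytostatic)
print(gr_value(treated=0.5, control=4.0, time_zero=1.0))  # net cell loss (cytotoxic)
```

GR values lie between −1 and 1: a value of 1 indicates no effect on growth, 0 indicates complete cytostasis, and negative values indicate cell death, which is why GRmax below zero is read as a cytotoxic response.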
Flow cytometry-based cell cycle distribution analysis. HCC38 cells were cultured in 10-cm dishes with either different drug concentrations or DMSO (vehicle control) for 24 hours. Cell cycle distribution analyses were performed using harvested cells, fixed with 70% ethanol and stained with propidium iodide/RNase staining solution (Cell Signaling Technology). Apoptosis was assessed using the sub-G1 cell cycle fraction. Data analysis for cell cycle distribution was performed using the FACScalibur system (BD Biosciences) and Flowing software (version 2.5.1).

Statistical analysis.
A variance component analysis was used to determine the influence of experimental factors (e.g. cell line, drug, drug dose) in the resazurin-based cell viability assay. Analysis of variance (ANOVA), Z-factor (Z), signal window (SW), and coefficient of variation (CV) were calculated as described elsewhere 27,47, with cutoffs set at Z > 0.4, SW > 2, and CV < 20%. In brief, one-way ANOVA (base stats package in R/Bioconductor version 3.3.2) was performed using cell viability as the dependent variable and all other variables (e.g. cell line, drug, drug dose) as covariates. The ANOVA percentage of variation explained (η²) was calculated by dividing the sum of squares (Sum Sq) between groups by the Total Sum Sq (Supplementary Table 1). The Z, SW, and CV scores were calculated using resazurin-based absorbance values for treated cells and background controls, as previously described 47. To evaluate the edge effect, i.e. differences in resazurin-based absorbance values between edge and middle wells on a 96-well plate, the Wilcoxon rank-sum test (P < 0.05) was used. The Shapiro-Wilk normality test was used in R to determine whether the data were normally distributed; the parametric t-test was used if P > 0.05 (normally distributed) and the non-parametric Wilcoxon test was used if P < 0.05 (not normally distributed). Bar plots were constructed using the ggpubr (version 0.2.1.999) 48
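The quality control metrics and cutoffs listed above can be sketched as follows. The formulas are the commonly used definitions of the Z-factor, signal window and coefficient of variation; whether they match the cited implementation term for term is an assumption. The function names, the choice to report CV for the signal wells, and the absorbance values are all hypothetical.

```python
from statistics import fmean, stdev


def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / fmean(values)


def z_factor(signal, background):
    """Z-factor: 1 - 3 * (SD_signal + SD_background) / |mean difference|."""
    spread = 3.0 * (stdev(signal) + stdev(background))
    return 1.0 - spread / abs(fmean(signal) - fmean(background))


def signal_window(signal, background):
    """Signal window, expressed in SD units of the higher-signal group."""
    high, low = sorted((signal, background), key=fmean, reverse=True)
    separation = fmean(high) - fmean(low) - 3.0 * (stdev(high) + stdev(low))
    return separation / stdev(high)


def passes_cutoffs(signal, background, z_min=0.4, sw_min=2.0, cv_max=20.0):
    """Apply the cutoffs stated in the text: Z > 0.4, SW > 2, CV < 20%."""
    return (
        z_factor(signal, background) > z_min
        and signal_window(signal, background) > sw_min
        and cv_percent(signal) < cv_max
    )


# Hypothetical absorbance values, not data from this study.
cells = [1.00, 1.02, 0.98]
blank = [0.10, 0.11, 0.09]
print(f"Z = {z_factor(cells, blank):.2f}, SW = {signal_window(cells, blank):.1f}, "
      f"CV = {cv_percent(cells):.1f}%, pass = {passes_cutoffs(cells, blank)}")
```

All three metrics reward a wide separation between signal and background relative to well-to-well noise, so a plate with noisy replicates fails the cutoffs even when the mean signal is high.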

Data availability
The dataset and source codes for this work are publicly available as a capsule on CodeOcean 50 .