Intracellular optical Doppler phenotypes of chemosensitivity in human epithelial ovarian cancer

Development of an assay to predict response to chemotherapy has remained an elusive goal in cancer research. We report a phenotypic chemosensitivity assay for epithelial ovarian cancer based on Doppler spectroscopy of infrared light scattered from intracellular motions in living three-dimensional tumor biopsy tissue measured in vitro. The study analyzed biospecimens from 20 human patients with epithelial ovarian cancer. Matched primary and metastatic tumor tissues were collected for 3 patients, and an additional 3 patients provided only metastatic tissues. Doppler fluctuation spectra were obtained using full-field optical coherence tomography through off-axis digital holography. Frequencies in the range from 10 mHz to 10 Hz are sensitive to changes in intracellular dynamics caused by platinum-based chemotherapy. Metastatic tumor tissues were found to display a biodynamic phenotype that was similar to primary tissue from patients who had poor clinical outcomes. The biodynamic phenotypic profile correctly classified 90% [88–91% c.i.] of the patients when the metastatic samples were characterized as having a chemoresistant phenotype. This work suggests that Doppler profiling of tissue response to chemotherapy has the potential to predict patient clinical outcomes based on primary, but not metastatic, tumor tissue.


Immobilization
Two different sample immobilization methods were used. The first 8 samples were immobilized using agar, and the remaining 15 samples were immobilized using poly-lysine. The change was made because poly-lysine was found to provide better sample stabilization. However, it created systematic differences in some BDI features. A two-sample t-test shows a difference in the means of baseline biomarkers such as NSD between samples immobilized with agarose and samples immobilized with poly-lysine. The difference in drug response is significant for paclitaxel and for its combination with carboplatin, but not for carboplatin alone. D'Agostino-Pearson normality tests were used to validate the normality assumption of the t-test (p-values are given in Table S2). The low NSD values found in poly-lysine-immobilized samples indicate that poly-lysine is more effective for sample attachment. Treatments containing paclitaxel have lower SDIP0 values, which may indicate that the paclitaxel mechanism of action, i.e. targeting tubulin and stabilizing the microtubule polymer, interacts with the mechanical properties of the agarose, creating trends that appear as part of the drug responses. The comparison of agar to poly-lysine is shown in Fig. S1. Because of this systematic immobilization effect, the primary analysis trains exclusively on the 15 samples immobilized by poly-lysine, then uses the trained model to test the agar samples and the metastatic samples (of either immobilization). This approach reduces the influence that the agar mechanical properties might have on the feature selection.
Table S3 is a comprehensive description of all 40 metrics, or features, associated with a patient and drug.
The first 9 are the global biomarkers, and the next 9 are the local biomarkers, discussed in the main text.
These 18 are all based on the time-frequency format of the drug-response spectrogram. The frequency bands are described in Table 2 in the main text. The time dependence is a simple polynomial: 0 is constant, 1 is linear, and 2 is quadratic.
The biomarkers 19-27 are drug-induced changes in the preconditions 28-36. NSD is the normalized standard deviation, also known as temporal speckle contrast. BSB is the brightness of the sample. NCNT is the number of pixels in the cross-sectional image of a target. DR is the "vertical" dynamic range of the spectral density of a power spectrum. NY is the value of the spectral density at the Nyquist frequency. KNEE is the knee frequency of the fluctuation spectrum, at which the power falls to half of its low-frequency value. HW is the half-width of the spectrum, closely related to KNEE. S is the spectral slope (linear on log-log) of the power spectrum for frequencies above the knee frequency and is closely related to SF, which uses a nonlinear fitting method to measure the slope. The final metrics in Table S3 include three measures of the baseline, B0, B1 and B2, which represent constant, linear and quadratic frequency dependence, respectively. The final metric DQ is the data quality assigned to each well or to each patient and drug. The biomarkers in Table S3 have a covariance matrix with off-diagonal values that measure the correlations among them. Therefore, we use principal component analysis (PCA) based on singular value decomposition (SVD) to pool the biomarkers into a smaller number of independent biomarkers. The feature selection is described in detail in the main text. The four features selected in this study are shown in Table S4 along with the coefficients for each of the raw features in Table S3.
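The pooling of correlated biomarkers into independent components can be illustrated with a minimal sketch. It is not the analysis code used in this study: the two-column data set is invented (standing in for two closely related raw features), the function names are hypothetical, and power iteration on the covariance matrix is used in place of a full SVD so that the example needs only the standard library. Whether features were standardized before PCA is not stated here, so the sketch only centers them.

```python
import math

def center(rows):
    """Subtract each feature's mean (columns are features, rows are samples)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def leading_component(cov, iters=200):
    """Power iteration: largest eigenvalue and its unit eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return eigenvalue, v

# Invented values for two strongly correlated raw features (illustration only).
raw = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.0]]

centered = center(raw)
cov = covariance(centered)
eigenvalue, loadings = leading_component(cov)

# Fraction of total variance carried by the first principal component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Pooled feature: projection of each sample onto the loadings.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

print("loadings:", loadings)
print("explained variance fraction:", explained)
```

The loadings play the role of the coefficients listed for each raw feature in Table S4: each pooled feature is a weighted sum of the raw features.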

Spectral Response to Refreshed Medium
The growth medium is RPMI-1640, and the carrier for adding drugs to the medium is 0.1% DMSO.
Therefore, 17 replicates of the negative control (0.1% DMSO in RPMI-1640 medium) are applied for each patient to measure the response of the living tissue to the refreshed medium that contains fresh nutrients and oxygen. The spectrogram of the negative control is shown in Fig. S2. This background response is subtracted from each drug response.
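The background subtraction can be sketched as follows, assuming each spectrogram is stored as a time-by-frequency grid and that the negative-control replicates are averaged element-wise before subtraction. The averaging step, the function names and all numerical values are assumptions made for illustration, not details taken from the study.

```python
def mean_spectrogram(replicates):
    """Element-wise mean over replicate spectrograms (each is time x frequency)."""
    n = len(replicates)
    n_time = len(replicates[0])
    n_freq = len(replicates[0][0])
    return [[sum(rep[t][f] for rep in replicates) / n for f in range(n_freq)]
            for t in range(n_time)]

def subtract_background(drug, control):
    """Remove the medium-refresh response from a drug-response spectrogram."""
    return [[d - c for d, c in zip(drug_row, control_row)]
            for drug_row, control_row in zip(drug, control)]

# Invented negative-control replicates: 2 time points x 3 frequency bins.
control_replicates = [
    [[0.10, 0.00, -0.05], [0.20, 0.05, -0.10]],
    [[0.14, 0.02, -0.03], [0.24, 0.03, -0.06]],
]
# Invented drug-response spectrogram on the same grid.
drug_response = [[0.32, 0.01, -0.24], [0.52, 0.04, -0.38]]

background = mean_spectrogram(control_replicates)
corrected = subtract_background(drug_response, background)
print(corrected)
```

After subtraction, what remains is the part of the spectral change attributable to the drug rather than to fresh nutrients and oxygen.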
Sample-to-sample variance is a key aspect of live-tissue measurements caused by sample heterogeneity.
An important distinction is between the well-to-well variability among spectrograms for a given patient and the patient-to-patient variability. The first is analogous to homogeneous broadening, and the second is analogous to inhomogeneous broadening. Standard deviations of the spectrograms are shown in Fig. S3.
The average patient spectrogram standard deviation is shown in Fig. S3a, and the average standard deviation across all patients is shown in Fig. S3b. The peak standard deviation in the latter case is 0.4, about 8 hours after the medium refresh, and in the former is 0.27. Therefore, there is more variance patient-to-patient than well-to-well for a given patient. The standard deviations for the R-class and the S-class are given in Figs. S3c and S3d. The S-class shows larger variability among patients than the R-class. It is important to note that with 18 well-replicates per treatment, the maximum standard error on a drug-response spectrogram is approximately 0.1, or a ±10% change in spectral density at low frequencies. The standard deviation at the Nyquist floor is much smaller, which suggests that mid and high frequencies may be more reliable as biomarkers than lower frequencies. The maximum standard deviation for a given patient is 0.27 at low frequencies approximately 8 hours after treatment. With 18 replicates per treatment per patient, this represents ±7% spectral density uncertainty at that time and frequency. The average standard deviation over the entire time-frequency plane is 0.15, and the average standard error on 18 replicates is then about 4%. This 4% change in spectral content is then the detection limit of a drug effect for a given patient.
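The uncertainty estimates above follow from the usual relation between standard deviation and standard error of the mean, SE = SD / sqrt(N). A short sketch, assuming N = 18 independent replicates and using the three standard deviations quoted in the text (the dictionary labels are descriptive only):

```python
import math

def standard_error(sd, n_replicates):
    """Standard error of the mean for n independent replicates."""
    return sd / math.sqrt(n_replicates)

N_REPLICATES = 18  # well-replicates per treatment, as quoted in the text

quoted_sd = {
    "peak, across all patients": 0.40,
    "peak, single patient": 0.27,
    "average over time-frequency plane": 0.15,
}

for label, sd in quoted_sd.items():
    se = standard_error(sd, N_REPLICATES)
    print(f"{label}: SD = {sd:.2f}, SE = {se:.3f}")
```

The three results are about 0.094, 0.064 and 0.035, close to the approximately 10%, 7% and 4% figures given above.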

Training-Set Stability for Predicting Chemosensitivity
The chemosensitivity values for each patient in the study, presented in Fig. 3a in the main text, are based on a training set for only poly-lysine immobilization of sensitive and resistant primary tumor biopsies. The metastatic samples (hov8b, 9, 11, 12, 18b, 20b, 26), as well as the agar-immobilized samples (hov5, 7, 8, 10), were then predicted using the trained algorithm. Furthermore, the poly-lysine-immobilized training set was predicted using one-hold-out.
However, other training subsets are possible. For instance, one could train on all the samples and predict chemosensitivity using one-hold-out for each one. The results are shown in Fig. S4 as the red bars (error bars are the standard error on the average of the ensembles). Alternatively, the training set can be the agar- and poly-lysine-immobilized samples, predicted using one-hold-out, with the metastatic samples predicted using the trained algorithm. The results are shown as the green bars in Fig. S4. These are reasonably disparate choices for the training set, and most of the patients share similar chemosensitivity values among all three training methods. Notable exceptions are patients hov17, hov11, hov18b, hov10 and hov20. Therefore, 84% of the patients are predicted consistently among the different training subset methods. The statistical analysis of the poly-lysine-immobilized predictions is given in Table S5 based on the decision point that optimizes the sensitivity and specificity.
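One common way to choose a decision point that balances sensitivity and specificity is to maximize Youden's J statistic (sensitivity + specificity - 1) over candidate thresholds. The exact criterion used for Table S5 is not stated here, so the sketch below is an assumption, and the scores and labels are invented rather than taken from the patient data.

```python
def sensitivity_specificity(scores, labels, threshold):
    """labels: True = clinically sensitive. A score >= threshold is called sensitive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(scores, labels):
    """Scan the observed scores as thresholds and keep the one with the largest J."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best

# Invented chemosensitivity scores and clinical outcomes (illustration only).
scores = [0.9, 0.7, 0.6, 0.4, 0.3, -0.2, -0.5, -0.8]
labels = [True, True, True, False, True, True, False, False]

j, threshold, sens, spec = best_threshold(scores, labels)
print(f"threshold = {threshold}, sensitivity = {sens:.2f}, "
      f"specificity = {spec:.2f}, J = {j:.2f}")
```

With small cohorts, several thresholds can give similar J values, which is one reason the stability of the predictions across training subsets is worth checking.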

Comparison of Human/Mouse/Cell-Line Drug Responses
Our previous work (Ref. 23) studied human ovarian cell lines grown as spheroids or as mouse xenografts.
The responses to carboplatin for sensitive (A2780) and resistant (CP70) cell lines are shown in Fig. S7 compared to the results from the current work on the human biopsies. The biopsies share similarities with the spheroids, but not with the mouse explants. The spheroids and biopsies have only human ovarian constituents, while the explants have constituents from the mouse host (stroma, fibroblasts and possibly immune cells), which may contribute to the differences.