Label-free, rapid and quantitative phenotyping of stress response in E. coli via ramanome

Rapid profiling of stress response at single-cell resolution in a label-free, non-disruptive and mechanism-specific manner can lead to many new applications. We propose a single-cell-level biochemical fingerprinting approach named “ramanome”, the collection of single-cell Raman spectra (SCRS) from a number of cells randomly selected from an isogenic population at a given time and condition, to rapidly and quantitatively detect and characterize the stress responses of a cellular population. SCRS of Escherichia coli cells are sensitive to both the exposure time (eight time points) and the dosage (six doses) of ethanol, with detection as early as 5 min and a discrimination rate for either factor of over 80%. Moreover, the ramanomes under exposure to six chemical compounds from three categories (the antibiotics ampicillin and kanamycin, the alcohols ethanol and n-butanol, and the heavy metals Cu²⁺ and Cr⁶⁺) were analyzed, revealing 31 marker Raman bands that distinguish stress responses by cytotoxicity mechanism and by variation of inter-cellular heterogeneity. Furthermore, the specificity, reproducibility and mechanistic basis of the ramanome were validated by tracking stress-induced dynamics of metabolites and by contrasting cells with and without genes that confer stress resistance. Thus the ramanome enables rapid prediction and mechanism-based screening of cytotoxicity and stress-response programs at single-cell resolution.


Newton EMCCD (Andor, UK) utilizing a 1600×200 array of 16 µm pixels with thermoelectric cooling down to −70 °C for negligible dark current. Acquisition of each SCRS was completed within 10 s with a spectral resolution of 1 cm⁻¹. Twenty cells were measured for SCRS in each biological replicate of cell culture.
Preprocessing of SCRS data. Pre-processing of raw SCRS data was performed with LabSpec 5 (HORIBA Scientific, France) [3]. The averaged background (three to five spectra acquired from the area of the slide surrounding the cell) was subtracted from each Raman spectrum, and the intensities of each spectrum were then normalized to the total area under the curve [5]. Spectra were cropped to the spectral region of interest, 600 to 1800 cm⁻¹, for chemometrics analysis.
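The three preprocessing steps (background subtraction, area normalization, cropping) can be illustrated with a minimal Python sketch. It is not the LabSpec 5 procedure itself: the function name `preprocess_spectrum`, the trapezoidal rule used for the area, and the assumption that all spectra share one ascending Raman-shift axis are illustrative choices.

```python
from statistics import fmean


def preprocess_spectrum(shifts, cell, backgrounds, lo=600.0, hi=1800.0):
    """Background-subtract, area-normalize and crop one spectrum.

    shifts      -- Raman shifts in cm^-1, ascending (assumed shared axis)
    cell        -- raw intensities measured on the cell
    backgrounds -- background spectra acquired from the slide near the cell
    """
    # Average the background spectra point by point, then subtract.
    mean_bg = [fmean(column) for column in zip(*backgrounds)]
    corrected = [c - b for c, b in zip(cell, mean_bg)]

    # Total area under the corrected curve (trapezoidal rule, an assumption).
    area = sum(
        0.5 * (corrected[i] + corrected[i + 1]) * (shifts[i + 1] - shifts[i])
        for i in range(len(shifts) - 1)
    )
    if area == 0:
        raise ValueError("spectrum has zero total area; cannot normalize")
    normalized = [v / area for v in corrected]

    # Crop to the region of interest after normalization, as in the text.
    kept = [(s, v) for s, v in zip(shifts, normalized) if lo <= s <= hi]
    return [s for s, _ in kept], [v for _, v in kept]
```

Normalizing before cropping follows the order in which the steps are described; normalizing the cropped region instead would give different scaling.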

Determination of lipid and DNA content of individual cells
Total lipid of a 100 ml culture of E. coli cells was extracted via biphasic chloroform-methanol-water extraction and then weighed on a precision electronic balance.
Total DNA content of a 1 ml culture of E. coli cells was estimated by extraction with the DNeasy Blood & Tissue kit (Qiagen, Germany) followed by quantification with Qubit® 2.0. The number of cells in the corresponding culture was counted with a blood cell counting plate (average of three counts per sample). Single-cell lipid or DNA content was then estimated by dividing the total lipid or DNA content by the total number of cells.
The density of lipids (or DNA) within a single cell was estimated by dividing the single-cell total lipid (or DNA) content by the area of a cell. To derive the area of individual cells, fluorescence images of cells stained with 0.1% AO for 5 min were captured within two seconds with an exposure time of 15 µs (OLYMPUS, Japan). ImageJ was then used to estimate the area of individual cells in the images.
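The arithmetic behind the single-cell estimates can be written out as a short Python sketch. The function names are hypothetical, units are left to the caller, and using the mean area of the imaged cells as "the area of a cell" is an assumption.

```python
from statistics import fmean


def single_cell_content(total_mass, cell_counts):
    """Bulk lipid (or DNA) mass divided by the number of cells in the culture.

    cell_counts -- repeated counts of the total cell number (three per sample
                   in the protocol above); their average is used.
    """
    return total_mass / fmean(cell_counts)


def single_cell_density(total_mass, cell_counts, cell_areas):
    """Single-cell content divided by the cell area.

    cell_areas -- areas of individual cells measured on the images; the mean
                  is used here, which is an assumption of this sketch.
    """
    return single_cell_content(total_mass, cell_counts) / fmean(cell_areas)
```

For example, a bulk mass of 6.0e6 units with counts averaging 2.0e9 cells gives 3.0e-3 units per cell, and a mean cell area of 2.0 area units then gives a density of 1.5e-3 units per area unit.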

Chemometrics analysis
PLSR model building. A PLSR model was constructed from the ramanome data and the experimental data (e.g., lipid and DNA density) in Matlab R2010a [4]. By relating the two datasets X (intensities of the relevant bands of SCRS) and y (single-cell lipid and DNA density estimated via conventional approaches) by means of regression, PLSR performs a multivariate calibration that establishes a linear model for predicting y from the measured X. In the regression, X is decomposed with y taken into account, in a simultaneous analysis of the two datasets [6]. Specifically, the lipid- or nucleic-acid-related Raman bands of the SCRS of the 20 cells from each triplicate at 0.5, 1, 3 and 5 h were averaged separately, generating 24 combined Raman spectra (12 from the control groups and 12 from the treatment groups) as a matrix (designated as X). Among them, two of the triplicates at each time point were randomly selected to form a training dataset for calibration of the model (Xc, n=16), and the rest were used as a test dataset for validation (Xv, n=8). Correspondingly, the single-cell lipid and DNA density of each triplicate culture at each time point was measured, generating another 24 values as a vector (designated as y), likewise divided into the training dataset (yc, n=16) and the test dataset (yv, n=8).
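The calibration idea described above, decomposing X under the guidance of y and then predicting y for unseen spectra, can be sketched in plain Python. This is a single-response PLS fit with the NIPALS algorithm, not the Matlab code used for the analysis; the names `pls1_fit` and `pls1_predict` and the choice of the number of components are assumptions of this sketch.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS) on calibration data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weights: direction in band space with maximal covariance with y.
        w = [_dot([row[j] for row in E], f) for j in range(p)]
        norm = _dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # no covariance left between X and y
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]  # scores of the calibration samples
        tt = _dot(t, t)
        load = [_dot([row[j] for row in E], t) / tt for j in range(p)]
        q = _dot(f, t) / tt              # regression of y on the scores
        # Deflate X and y before extracting the next component.
        E = [[row[j] - ti * load[j] for j in range(p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one new spectrum x with a fitted model."""
    x_mean, y_mean, components = model
    e = [xi - m for xi, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = _dot(e, w)
        y_hat += q * t
        e = [ei - t * lj for ei, lj in zip(e, load)]
    return y_hat
```

In the setting above, `pls1_fit` would be called with Xc and yc, and `pls1_predict` applied to each row of Xv to obtain predicted values for comparison with the measured yv.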
First, the PLSR model was established using the Xc and yc data. Second, the Xv data and the established model were used to predict the yv values, and the predicted yv values were compared with the measured yv values. The reliability of the PLSR model was

Random Forest analysis. A Random Forest model was used to classify SCRS under the different stress treatments, with ntree=5,000 and otherwise default parameters (R package "randomForest"; default mtry of sqrt(p), where p is the number of Raman bands) [9]. Rank lists of Raman bands in order of "band importance" from Random Forests were determined over 50 iterations of the algorithm. The Raman datasets were reordered according to the rank list and then used as input for calculating the minimum number (Nmin) of Raman bands needed to discriminate between the control and the stressed cells, via ROC (receiver operating characteristic) analysis based on the largest AUC (area under the ROC curve) [10]. The top Nmin ranking bands that differed significantly between the control and the stressed cells were designated as the marker bands for each stressor (Wilcoxon rank sum test; p<0.001).
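The AUC-based choice of Nmin can be illustrated with a small Python sketch. It computes the AUC by pairwise comparison of scores, which is equivalent to the Mann-Whitney U statistic divided by the number of stressed-control pairs, and then picks the smallest band count that reaches the largest AUC. How the per-cell scores are produced from the top-N bands is left to the caller, and the function names are hypothetical.

```python
def auc(stressed_scores, control_scores):
    """Area under the ROC curve from two groups of classifier scores.

    Counts the fraction of (stressed, control) pairs in which the stressed
    cell scores higher; ties count as half a win.
    """
    wins = 0.0
    for s in stressed_scores:
        for c in control_scores:
            if s > c:
                wins += 1.0
            elif s == c:
                wins += 0.5
    return wins / (len(stressed_scores) * len(control_scores))


def minimum_band_count(auc_by_n):
    """Smallest number of top-ranked bands that attains the largest AUC.

    auc_by_n -- mapping from N (number of top-ranked bands used) to the AUC
                obtained with those N bands.
    """
    best = max(auc_by_n.values())
    return min(n for n, value in auc_by_n.items() if value == best)
```

With made-up values, `minimum_band_count({1: 0.80, 2: 0.95, 3: 0.95})` returns 2, since adding a third band does not raise the AUC further.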