Linking the morphological and metabolomic response of Lactuca sativa L exposed to emerging contaminants using GC × GC-MS and chemometric tools

The occurrence of contaminants of emerging concern (CECs) in irrigation waters (up to low μg L−1) and irrigated crops (ng g−1 in dry weight) has been reported, but the linkage between plant morphological changes and plant metabolomic response has not yet been addressed. In this study, a non-targeted metabolomic analysis was performed on lettuce (Lactuca sativa L) exposed to 11 CECs (pharmaceuticals, personal care products, anticorrosive agents and surfactants) by irrigation. The plants were watered with different CEC concentrations (0–50 µg L−1) for 34 days under controlled conditions and then harvested, extracted, derivatised and analysed by comprehensive two-dimensional gas chromatography coupled to a time-of-flight mass spectrometer (GC × GC-TOFMS). The resulting raw data were analysed using multivariate curve resolution (MCR) and partial least squares (PLS) methods. The metabolic response indicates that exposure to CECs at environmentally relevant concentrations (0.05 µg L−1) can cause significant metabolic alterations in plants (carbohydrate metabolism, the citric acid cycle, pentose phosphate pathway and glutathione pathway) linked to changes in morphological parameters (leaf height, stem width) and chlorophyll content.


CEC extraction from the lettuce leaves
Leaf tissue spiked with a mixture of surrogates was extracted with acetone:hexane (1:1, v/v) using a pressurized solvent extraction (PSE) system (Applied Separations, PA, USA). Neutral-basic and acid fractions were obtained by solvent partitioning at neutral and acid pH, respectively. After cleanup with Florisil and sodium sulfate, TPhA was added as the internal standard and TMSH as the derivatization agent.
All the CECs were analyzed by GC-MS/MS. For plant tissue, methylation of the acidic carboxyl groups and of the hydroxyl groups of BPA was performed in the programmed temperature vaporizing (PTV) injector of the gas chromatograph by adding 10 µL of TMSH to a 50 µL sample aliquot before injection. A volume of 5 µL was injected into a Bruker 450-GC gas chromatograph coupled to a Bruker 320-MS triple-stage quadrupole mass spectrometer (Bruker Daltonics, Billerica, MA) fitted with a 20 m × 0.18 mm ID, 0.18 µm film thickness Sapiens X5-MS capillary column coated with 5 % diphenyl 95 % dimethyl polysiloxane from Teknokroma (Sant Cugat del Vallès, Spain). The PTV was set at 60 °C for 0.5 min, rapidly heated to 300 °C at 200 °C min−1 and held for 7 min. The injector was then cooled to the initial 60 °C at 200 °C min−1. The oven temperature was held at 60 °C for 3.5 min, then programmed at 30 °C min−1 to 150 °C and finally at 8 °C min−1 to 320 °C, holding the final temperature for 6 min. The gas flow rate was set at 0.6 mL min−1. The ion source and the transfer line were both held at 250 °C. A solvent delay of 8 min was applied. Argon gas was used for CID at a pressure of 1.8 mTorr and the optimum collision energy (CE) was selected for each transition.
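As a quick check of the oven programme described above, the total oven time can be computed from the hold times and ramp rates. The following is a minimal sketch; the function name and the tuple layout of the ramps are illustrative assumptions and not part of the instrument method.

```python
def oven_run_time(initial_temp, initial_hold, ramps):
    """Return the total oven programme time in minutes.

    ramps is a list of (rate in deg C per min, target temp in deg C,
    hold in min) tuples; this layout is an assumption for illustration.
    """
    total = initial_hold
    temp = initial_temp
    for rate, target, hold in ramps:
        total += (target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total


# Oven programme as stated in the text: 60 C held for 3.5 min,
# 30 C/min to 150 C, then 8 C/min to 320 C held for 6 min.
run_time = oven_run_time(60.0, 3.5, [(30.0, 150.0, 0.0), (8.0, 320.0, 6.0)])
print(f"Total oven time: {run_time:.2f} min")
```

Under these settings the two ramps take 3 min and 21.25 min, so the oven programme lasts 33.75 min in total.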
Qualitative and quantitative analysis was based on the retention time, on two product ions acquired in selected reaction monitoring (SRM) mode, and on the ratio between the product ions (Table S5). The limit of detection (LOD) and the limit of quantitation (LOQ) for plant tissue were defined as the mean background noise in a blank triplicate plus three and ten times, respectively, the standard deviation of the background noise from three blanks. LODs and LOQs were compound dependent and ranged from 0.3 to 4.5 ng g−1 dry weight (Table S6).
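The LOD and LOQ definitions above can be sketched as follows. The blank noise values are hypothetical and serve only to illustrate the arithmetic; the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def lod_loq(blank_noise):
    """Return (LOD, LOQ) from the background noise of replicate blanks.

    LOD = mean + 3 * SD and LOQ = mean + 10 * SD, following the
    definition in the text. The sample SD is an assumption.
    """
    m = mean(blank_noise)
    s = stdev(blank_noise)
    return m + 3 * s, m + 10 * s


# Hypothetical background noise of a blank triplicate (ng g-1 dry weight).
blank_noise = [0.10, 0.12, 0.14]
lod, loq = lod_loq(blank_noise)
print(f"LOD = {lod:.2f}, LOQ = {loq:.2f}")
```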
The recoveries of the surrogates added can be seen in Table S7.

Data arrangement, compression and MCR-ALS analysis
Because of the large size of the GC×GC-MS data sets collected for the 20 lettuce samples, a data segmentation strategy was used. The GC×GC-TOFMS data for the 20 samples were segmented into four parts (A-D) by visual inspection of the chromatograms. Figure S3 shows these chromatographic segments, using the GC×GC-TIC of one of the control samples as an example.
Wavelet decomposition and compression were applied to every column (m/z) of Xaug independently. Compression reduces the size of the data by a factor of 2^n, where n is the compression level 1,2 . The compressed matrix (Xcompr) contains the same information as Xaug but requires much less computer storage. For the data sets studied in this paper, level-3 wavelet compression was sufficient without loss of relevant information in the elution time direction, and the spectral domain remained unchanged.
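The column-wise compression can be illustrated with a minimal sketch in which only the approximation coefficients are kept at each level. The Haar wavelet is assumed here purely for simplicity, since the wavelet family is not stated in the text; the function names and the toy matrix are likewise hypothetical.

```python
import math


def haar_approx(signal, level):
    """Keep only the Haar approximation coefficients after `level` steps.

    Each step halves the number of points, so the length is reduced
    by a factor of 2**level. Odd lengths are padded with the last value.
    """
    out = list(signal)
    for _ in range(level):
        if len(out) % 2:
            out.append(out[-1])
        out = [(out[i] + out[i + 1]) / math.sqrt(2)
               for i in range(0, len(out), 2)]
    return out


def compress_columns(X, level):
    """Compress every column (m/z channel) of X independently.

    Rows are elution times and columns are m/z channels, so only the
    elution time direction is shortened.
    """
    columns = [haar_approx(col, level) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]


# Toy matrix: 16 elution times x 3 m/z channels (hypothetical values).
X_aug = [[float(i + j) for j in range(3)] for i in range(16)]
X_compr = compress_columns(X_aug, level=3)
print(len(X_aug), "rows ->", len(X_compr), "rows;", len(X_compr[0]), "columns")
```

With level-3 compression the 16 rows are reduced to 2 (a factor of 2^3 = 8), while the number of m/z columns is unchanged.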
Before starting the MCR-ALS analysis, some prior knowledge is required. One of the main difficulties in MCR analysis is determining the number of chemical components present in the data matrix. In this work, the magnitude of the singular values and the change in the lack of fit (LOF) of the MCR-ALS solutions when components were added to or removed from the model were used as criteria to estimate the number of significant components.
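To start the ALS optimization, simple-to-use interactive self-modeling mixture analysis [...] in the factor matrices. MCR-ALS solves Eq. S4 for C and S^T using an iterative algorithm based on two constrained linear least-squares steps 6,7 . The coefficient of determination (R^2) and the LOF were used to evaluate the MCR-ALS model and can be defined as follows:

\[ R^2\,(\%) = 100 \times \frac{\sum_{i,j} x_{ij}^2 - \sum_{i,j} \left(x_{ij} - \hat{x}_{ij}\right)^2}{\sum_{i,j} x_{ij}^2} \]

\[ \mathrm{LOF}\,(\%) = 100 \times \sqrt{\frac{\sum_{i,j} \left(x_{ij} - \hat{x}_{ij}\right)^2}{\sum_{i,j} x_{ij}^2}} \]

where $x_{ij}$ is the element of the original data matrix and $\hat{x}_{ij}$ is the recovered value using the MCR-ALS method.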
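The two figures of merit can be computed as in the sketch below, assuming the definitions commonly used for MCR-ALS: LOF (%) = 100·sqrt(Σe²/Σx²) and R² (%) = 100·(1 − Σe²/Σx²), where e is the difference between an original and a recovered element. The function name and the toy matrices are hypothetical.

```python
import math


def lof_and_r2(X, X_hat):
    """Return (LOF %, R2 %) for a data matrix X and its reconstruction X_hat.

    Both matrices are given as lists of rows with identical shape.
    """
    ss_x = sum(x * x for row in X for x in row)
    ss_e = sum((x - xh) ** 2
               for row, row_hat in zip(X, X_hat)
               for x, xh in zip(row, row_hat))
    lof = 100.0 * math.sqrt(ss_e / ss_x)
    r2 = 100.0 * (1.0 - ss_e / ss_x)
    return lof, r2


# Toy example (hypothetical values): one element is recovered with error 1.
X = [[3.0, 4.0], [0.0, 0.0]]
X_hat = [[3.0, 4.0], [0.0, 1.0]]
lof, r2 = lof_and_r2(X, X_hat)
print(f"LOF = {lof:.1f} %, R2 = {r2:.1f} %")
```

Under these definitions the two quantities are linked by R² = 100·(1 − (LOF/100)²), so a LOF of 4.6% corresponds to an R² of about 99.8%, in line with the values reported in the caption of Figure S3.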

Metabolite detection and NIST identification
MCR-ALS resolved profiles (S^T) were assigned to metabolites and identified by comparing the mass fragmentation patterns of the MCR-ALS resolved mass spectra with the standard mass spectral databases of the National Institute of Standards and Technology (NIST) and GOLM. For each mass spectrum, 100 hits were retrieved by the NIST Mass Spectral Search 2.2 software distributed with the NIST 2014 library. A reverse match factor (RMF), based on the correlation coefficient between the MCR-ALS resolved and experimental mass spectra and reported by the NIST software, was used to select the best identified compound for each MCR-ALS resolved mass spectrum. This match factor ranges from 0 (no match) to 1000 (perfect match). As a general guide, a value of 900 or greater was considered a very good match; between 800 and 900, a good match; between ...

PLS modeling
Root mean square error in leave-one-out cross-validation (RMSECV-LOO) was used to choose the significant number of latent variables (LV) in the PLS model 8 .

1) Pathways with fewer than two metabolites detected are not included. Pathway dataset from Arabidopsis thaliana.

Figure S1. Linear correlation between the leaf concentration (ng g−1 dw) and the irrigation concentration (ng mL−1) of CBZ.

Figure S2. Contour plots of lettuce extracts analyzed by GC×GC-TOFMS. Lettuces exposed at (a) 0 µg L−1, (b) 0.05 µg L−1, (c) 0.5 µg L−1, and (d) 50 µg L−1 of 11 CECs.

Figure S3. MCR-ALS analysis of segment 1 of 4 segments in 20 samples (control and exposed samples). The number of components was 20 in this case, which was confirmed using singular value decomposition (SVD). The lack of fit (LOF) and R^2 of the developed model were 4.6% and 99.8%, respectively, which were acceptable given the noise level of the data.