Ultrasound-assisted extraction optimization and validation of an HPLC-DAD method for the quantification of polyphenols in leaf extracts of Cecropia species

Cecropia species are traditionally used in Latin American folk medicine and are available as food supplements, but little information is available to warrant their quality. The optimum conditions for the extraction of chlorogenic acid (CA), total flavonoids (TF) and flavonolignans (FL) from leaves of Cecropia species were determined using a fractional factorial design (FFD) and a central composite design (CCD). A reversed-phase high-performance liquid chromatographic method coupled to a diode array detector (HPLC-DAD) was validated for the quantification of CA, TF and FL, following the ICH guidelines. Quantitative analysis and Principal Component Analysis (PCA) were also performed. The extraction-optimization methodology enabled us to develop an appropriate extraction process with a time-efficient execution of experiments. The experimental values agreed with those predicted, indicating the suitability of the proposed model. The validation parameters of the quantification method were satisfactory for all chemical markers. The results revealed that the method had excellent selectivity, linearity, precision (repeatability and intermediate precision were below 2 and 5%, respectively) and accuracy (98–102%). The limits of detection and quantification were at the nanogram per milliliter (ng/mL) level. In conclusion, the simultaneous quantification of chemical markers using the proposed method is an appropriate approach for species discrimination and quality evaluation of Cecropia sp.


Results and Discussion
HPLC-DAD analysis of CA, TF and FL. Liquid chromatography coupled with mass spectrometry (LC-MS) offers better sensitivity and selectivity for compound identification than UV detection. However, the high cost of this equipment, together with the absence of primary standards, may be a limiting factor for low-budget analytical laboratories. DAD, on the other hand, is inexpensive, broadly applied in natural product analysis, reliable and reproducible, and allows online collection of UV spectra, which adds value to quantitative analysis 23 . Hence, we present an HPLC-DAD method for the routine analysis of CA, TF and FL in samples of Cecropia species.
A representative HPLC-DAD profile of a leaf extract from a Cecropia species mixture is given in Fig. 1. Analysis of the online UV spectra obtained from peaks between 25 and 46 min allowed us to recognize the typical UV absorption bands of flavonoids (Band I, λmax around 300-354 nm, and Band II, λmax around 240-285 nm) 24 . As previously reported by our research group, flavone C-glycosides were predominantly detected in C. obtusifolia, C. peltata and C. insignis, while quercetin O-glycosides were the main flavonoids described in C. hispidissima (Rivera-Mondragón et al., 2018. Submitted). Considering this, we decided to express O-glycosyl flavonols and C-glycosyl flavones as rutin (RU) and vitexin (VX) equivalents, respectively. FL such as mururin A and vaccinin A are not commercially available as reference standards. Therefore, VX was chosen as a secondary analytical standard for the quantification of these compounds.
Extraction optimization. Variables with major effects on the extraction yields were optimized. First, ultrasound-assisted extraction (UAE) was selected as the most appropriate extraction method because it is effective, greatly reduces the extraction time and has a low environmental impact 25–27 . Next, water:methanol mixtures and acetone were chosen as extraction solvents, since these solvents have been widely described as the most suitable systems for polyphenol extraction from plant material 28,29 .
The screening and optimization steps for the extraction of TF, CA and FL from Cecropia leaves were conducted using fractional factorial (FFD) and central composite (CCD) designs, respectively.
Screening (FFD). Seven factors were chosen to be screened: methanol fraction (%, v/v) (A), extraction time (min) (B), number of extractions with methanol (C), extraction temperature (°C) (D), mass/solvent ratio (w/v) (E), number of extractions with acetone (F) and particle size of the plant material (µm) (G). The FFD, including the factor levels and the experimental values of TF, CA and FL, is presented in the supplementary material (Table S1). The estimated effect of each factor on the responses and the corresponding critical effects are shown in Table 1. Factors with effects below the critical effect (E_critical) were considered to have no statistically significant influence on the yield of phenolic extraction.
TF, CA and FL extraction yields were significantly affected by the methanol fraction (A) and the extraction temperature (D). CA and FL contents were also significantly affected by the number of extractions with methanol (C). Additionally, CA content was significantly affected by the mass/solvent ratio (E) and particle size (G). For the optimization step, we considered the three factors with the largest effects on TF, CA and FL yields: methanol fraction (A), number of extractions with methanol (C) and extraction temperature (D). The number of extractions (C), a discrete variable, was fixed at three (3) extractions. The less significant factors, mass/solvent ratio (E) and particle size (G), were set at 1:50 (m/v) and ≤125 µm, respectively, since these were the tested levels that gave the highest CA yield. In addition, one (1) acetone extraction (F) was included, because it had a significant positive effect on the FL peak area. Extraction time (B) was set at 30 min, since it had no significant effect on CA, TF or FL.
Optimization (CCD). A response surface methodology (RSM), using a CCD, was employed to optimize the effects of the methanol fraction (X₁) and extraction temperature (X₂) on the extraction yields of TF (Y₁), CA (Y₂) and FL (Y₃). The other variables were set as mentioned in the previous section (Screening FFD). Supplementary Tables S2 and S3 show the independent variables and their levels used for this experiment, and the actual design and the experimental data for the response variables, respectively. Second-order polynomial equations were built to describe the relationship between X₁ and X₂ (Table 2). The analysis of variance (ANOVA) for the quadratic polynomial models developed for the response variables indicated that the linear effects of X₁ and X₂ were significant (p < 0.05) for TF, CA and FL extraction (Table 3).
In addition, X₁² was significant (p < 0.05) for all response variables, while X₂² was not significant (p > 0.05) for any response. The interaction effect X₁X₂ was significant (p < 0.05) only for TF.
In order to visualize the factor and interaction effects on the extraction efficiency, three-dimensional response surface plots were generated as a function of X₁ and X₂, as shown in Fig. 2. In general, the peak areas of TF, CA and FL were highest within methanol fraction ranges of 70-75% (v/v), 55-72% (v/v) and 70-80% (v/v), respectively; a methanol fraction below or above these ranges appeared to decrease the extraction yields of these compounds. Furthermore, the highest TF, CA and FL extraction yields were observed at extraction temperatures of 70-75 °C, 65-75 °C and 55-65 °C, respectively. FL extraction efficiency was negatively affected when the extraction temperature was above 65 °C. For TF and CA, the extraction yields were affected by the temperature only to a limited extent.
Verification of the predicted optimal extraction conditions. The optimal X₁ and X₂ values were determined by maximizing the responses using MODDE Pro Software. In order to confirm the reliability of the mathematical model, experimental extractions were performed under the conditions selected as optimal: a methanol fraction of 70% (v/v), an extraction temperature of 64 °C, an extraction time of 30 min, three (3) repeated extractions with methanol and one (1) with acetone, a mass/solvent ratio of 1:50 (w/v), and a particle size of ≤125 µm. The predicted and experimental responses showed no significant differences (t-test, p > 0.05), and the differences between the predicted and experimental values were less than 2.0%, indicating an appropriate fit of the model (see Table 4).
In order to develop a less time- and solvent-consuming method, the robustness of the yield was determined for different combinations of mass/solvent ratio (1:30, 1:15 and 1:10) and number of acetone extractions (1 and 0) (Fig. 3). One-way ANOVA followed by a Bonferroni post hoc test revealed no significant differences in CA, TF and FL contents (p > 0.05) in comparison with the yield at the optimized conditions. These results show that the mass/solvent ratio and the number of acetone extractions may be reduced without compromising the extraction of the analytes. For the validation of the method, the mass/solvent ratio was set at 1:30 and the acetone extraction was eliminated from the procedure.
Validation of the HPLC-DAD method. Specificity. A blank solution, authentic standards and compounds isolated from Cecropia obtusifolia (FL1, FL2 and FL3) were analyzed under the analytical conditions established here (Fig. 4). Authentic and commercial sample extracts from different species were analyzed as well in order to check for possible interferences (Figs 5 and 6). Inspection of representative UV and MS spectra of each analyte (Fig. 1) showed no relevant interferences at the retention times of interest. Although some overlapping peaks were found (Fig. 5c), they were identified as flavonoids, so the quantification of total flavonoids was not affected.
Linearity. Four curves were plotted at different levels (n ≥ 5) within an appropriate concentration range of CA, VX and RU. The residuals were homoscedastic, with % RSD values below 5% (Supplementary Fig. S2). As shown in Table 5, all curves had a linear response with r² > 0.999. t-tests (N − 2 degrees of freedom, 95% confidence level) demonstrated that the intercepts were not significantly different from zero, while the slopes were significantly different from zero. Additionally, the 95% CI of the intercepts included zero, which means that quantification can be based on a single-point calibration approach. Moreover, Mandel's fitting test (α = 0.01) supported the first-order calibration function (linear equation) over the second-order calibration function (quadratic equation).
Precision. Repeatability (intra-day, within each concentration level, n = 6) and intermediate precision (across four days and three concentration levels, n = 6) were evaluated in order to assess the precision of the method. Overall, Cochran's C test for homogeneity of variances (95% confidence level) indicated that the variances between days and concentration levels were homogeneous (Supplementary Table S4). Values of % RSD for all parameters were below 2.0% and 5.0% for repeatability and intermediate precision, respectively. In addition, all values [...] Table 6.
Table 4. Experimental and predicted values of TF, CA and FL at optimal conditions. All the values are means ± standard deviations and those sharing the same superscript letter in the same row are not significantly different from each other (p > 0.05).
Accuracy. The accuracy was determined by means of a recovery experiment, adding known quantities of CA, VX and RU at three concentration levels (75, 100 and 125%). Fortified solutions were analyzed and the results were reported as percent recovery (%). Table 7 shows that the accuracy data met the acceptance criteria at all three concentration levels: recovery values varied between 98 and 102%, the 95% CI of the mean included 100%, and the % RSD for each concentration level was lower than or equal to % RSD_r 30 .
Limit of detection (LoD) and limit of quantification (LoQ). LoD and LoQ were estimated based on the calibration curves of CA, VX and RU, constructed in a 1.5-50 µg/mL range (data not shown). These estimates were subsequently validated by the independent analysis of six quantified samples (prepared by serial dilutions) in order to find concentration levels around LoD and LoQ. All compounds showed LoD and LoQ in the ng/mL order of magnitude. The experimentally verified LoD and LoQ of CA, VX and RU were lower than those calculated by the mathematical method, except for VX detected at 390 nm, which displayed a higher LoD and a similar LoQ compared with those determined from its calibration curve (see supplementary material Table S5). This method is able to detect and quantify CA, TF (expressed as vitexin or rutin equivalents) and FL (expressed as vitexin equivalents) of Cecropia species at concentrations below 0.5 µg/mL and 1.0 µg/mL, respectively (Table 5).
Quantitative and multivariate analysis. Subsequent to the optimization of the extraction and the validation of the proposed analytical method, the contents of CA, TF and FL in fourteen authentic Cecropia leaf samples and three commercial products were assessed. The concentrations of phenolics found in Cecropia are shown in Table 8. CA was present in all samples, ranging from 78.7 ± 4.3 µg/g (CP-C) to 14724.5 µg/g (CO-7). Among the samples from Cerro Azul, Camino de Cruces and Cerro Campana, those collected during the rainy season (October, 2015) showed higher CA concentrations than those collected in the second dry period during the rainy season (known as First Canicula, July 2016). The exception was CO-7, which presented the highest CA content and was collected in Chiriqui (July, 2017) (see Supplementary Fig. S3).
Although a broad variety of flavonoid glycosides was detected, most flavonoids belonged to the flavone or flavonol classes, and the main skeletons were derivatives of luteolin, apigenin or quercetin. Flavone C-glycosides were the main flavonoid subclass in C. obtusifolia, C. peltata, C. insignis and C. hololeuca, whereas flavonol O-glycosides predominated in C. hispidissima. These results accord with those of Costa et al. 21 , Ortmann et al. 31 , and da Silva Mathias and Rodrigues de Oliveira 22 , who also found that CA and flavone C-glycosides derived from apigenin and luteolin were the most abundant compounds detected in C. glaziovii, C. pachystachya and C. hololeuca.
The highest total flavonoid content was detected in C. hispidissima CH-1 (14899.2 µg/g), followed by the C. obtusifolia samples CO-7 (14071.7 µg/g), CO-4 (12455.4 µg/g) and CO-C (12387.1 µg/g). Regarding the seasonal variation of flavonoid content, no general pattern of increasing or decreasing levels between the two periods could be observed (see Supplementary Fig. S4). Similar findings were reported by Costa et al. 4 , who observed no correlation between the values of pluviosity and the production of C-glycosylflavonoids. [...] Supplementary Table S6).
Association between variables (chemical composition) was determined by the analysis of the sample correlation coefficient (See Supplementary Fig. S5). The CA level showed positive correlations with the content of LG (0.76), LMG (0.54) and AMG (0.36). Similarly, LG revealed a high association with LMG (0.72) and AMG (0.47); and FL with AG (0.40). In contrast, a negative correlation (−0.57) between AG and QG was observed. These results indicated that high concentrations of O-glycosides correlated with a low content of apigenin C-glycosides and vice-versa.
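The pairwise association described above can be illustrated with a small sketch that computes sample (Pearson) correlation coefficients between compound classes. This is only an illustration: the function names and the numbers in `example` are hypothetical and are not taken from Table 8, and the source does not state which software or correlation estimator was actually used.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(table):
    """Pairwise correlations for a dict mapping variable name -> list of values."""
    names = list(table)
    return {p: {q: pearson(table[p], table[q]) for q in names} for p in names}

# Hypothetical contents (arbitrary units) for four samples; NOT data from the study.
example = {
    "CA": [1.0, 2.0, 3.0, 4.0],
    "AG": [4.0, 3.0, 2.5, 1.0],
    "QG": [0.5, 1.5, 2.0, 4.5],
}
matrix = correlation_matrix(example)
print(round(matrix["AG"]["QG"], 2))  # negative when AG falls as QG rises
```

A negative coefficient between two classes, as reported for AG and QG, simply means that samples rich in one tend to be poor in the other.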
The two-dimensional PCA score and loading plots in Fig. 7 show that the first three components (PC1, PC2 and PC3) accounted for 79.0% of the cumulative variability of the original variables. The PCA results showed that C. hispidissima individuals (CH-1 and CH-2) were characterized by a strongly negative score on PC2. A low PC2 score corresponds to higher values of QG and lower levels of AG. This result made it possible to distinguish C. hispidissima from the other Cecropia species. PC3 separated the C. obtusifolia samples (except for CO-4) from the other species. High PC3 scores corresponded to a relatively high flavonolignan content.

Conclusions
The present study was designed to describe the optimal experimental conditions for maximizing the extraction efficiency of leaves from plants of the genus Cecropia based on an experimental design, and to validate an appropriate method to simultaneously quantify the polyphenolic compounds, including CA, TF and FL, in authentic and commercial samples by high-performance liquid chromatography with diode-array detection (HPLC-DAD). Although there are some reports on analytical work on Cecropia species, previous investigations have not comprehensively considered process optimization for improved phenolic compound recovery or the quantitative determination of flavonolignans in leaves from different species of this genus. The methodology described in this work therefore includes, for the first time, several parameters, such as extraction solvent, temperature and time, among others, combined with statistical tools in order to determine the optimal extraction conditions and to quantify the main phenolic constituents of Cecropia. Response surface methodology (RSM) was selected because it is broadly applicable and has advantages over classical approaches (the one-factor-at-a-time method), such as the capability of collecting information on many quantitative variables at once, fitting an adequate mathematical function to the experimental data and assessing the interaction effects between the parameters on the response. The results of this investigation show that solvent concentration and temperature had a significant influence on extraction, so optimization of these parameters was essential to obtain accurate and efficient recovery of the constituents of Cecropia leaves. The optimized parameters were a temperature of 64 °C, a methanol fraction of 70% (v/v), an extraction time of 30 min, three extractions, a mass/solvent ratio of 1:30 (w/v) and a particle size of ≤125 µm.
Variations between predicted and measured recoveries below 2.0% were observed.
The HPLC-DAD method showed adequate validation parameters, namely specificity, linearity, precision, accuracy, and limits of detection and quantification at the ng/mL scale. It can be concluded that the HPLC-DAD method is reliable and adequate for the determination of the chemical composition of Cecropia samples, and is an important tool for the quality control of derived commercial products. Our investigation may be useful for the discrimination of C. obtusifolia and C. hispidissima from the other species, based on their content of flavonolignans, apigenin C-glycosides, and quercetin O-glycosides.
Plant material. The leaves of four Cecropia species were collected in West Panama, Panama and Chiriqui Provinces, Republic of Panama (see Supplementary Fig. S1). The species were identified by the botanist Orlando [...]
Table 8. Concentrations of CA, TF and FL (µg/g) in authentic and commercial Cecropia leaf samples. CO, CP, CI and CH correspond to authentic leaves of C. obtusifolia, C. peltata, C. insignis and C. hispidissima samples (see Supplementary Fig. S1). CO-C, CP-C and CHO-C correspond to commercial products of C. obtusifolia, C. peltata and C. hololeuca. Contents of analytes are reported as mean ± standard deviation (n = 3). Content below the limit of quantification: <LOQ.
[...] A flow rate of 0.7 mL/min was used. The column temperature was maintained at 26 °C. The DAD signal was recorded between 190 and 500 nm. TF and CA were monitored at 340 nm, while FL was detected at 390 nm. The analyte peak areas were used as responses for the optimization of the extraction process, while the mass fractions (m/m) of CA, TF and FL in the plant material were used for the validation of the analytical method. TF was reported as the sum of all responses corresponding to flavonoids (Fig. 1). An LC-DAD-MS system was used for the evaluation of the specificity of the HPLC-UV method and to confirm the chemical composition of the Cecropia species. Mass spectra were recorded using a Finnigan LXQ mass spectrometer (Thermo Fisher Scientific, CA, USA) coupled with a Finnigan Surveyor LC system (LC Pump Plus, Autosampler Plus and PDA Plus detector). The same chromatographic conditions as described above were applied. During analysis, full scan data were recorded in ESI (−) mode from m/z 100 to 1800. The source, capillary and tube lens voltages; sheath and auxiliary gas flows; and the capillary temperature were set as follows: [...] Supplementary Table S1.

HPLC-DAD and HPLC
The statistical interpretation of the estimated effect of each factor (E_x) was performed as described by Klein-Junier et al. 34 . Briefly, a t-test was used to evaluate which factors have significant effects. The standard error of the effect, (SE)_e, was estimated based on Dong's algorithm using the 75% lowest absolute factor effects. Unimportant factors are selected by an initial error estimation (S₀). Finally, the critical effect (E_critical) for the response was estimated based on (SE)_e.
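As a rough illustration of how a critical effect can be derived from screening effects, the sketch below follows the commonly described form of Dong's algorithm: an initial error estimate s0 = 1.5 × median|E|, retention of the effects with |E| ≤ 2.5 × s0 as "unimportant", and a standard error computed from those retained effects. This may differ in detail from the variant used here (the text mentions the 75% lowest absolute effects), so it should be read as an assumption-laden sketch. The function name and the effect values are hypothetical, and the t value must be supplied by the user (for example from a table) for m degrees of freedom.

```python
from math import sqrt
from statistics import median

def dong_standard_error(effects, cutoff=2.5):
    """Return (s0, se_e, m) following the usual description of Dong's algorithm.

    s0   : initial error estimate, 1.5 * median of the absolute effects
    se_e : standard error from the m effects with |E| <= cutoff * s0
    m    : number of effects retained as unimportant (degrees of freedom)
    """
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)
    retained = [e for e in abs_effects if e <= cutoff * s0]
    m = len(retained)
    se_e = sqrt(sum(e * e for e in retained) / m)
    return s0, se_e, m

# Hypothetical effects for seven screened factors (A-G); NOT values from Table 1.
effects = [12.0, -0.8, 1.1, 9.5, 0.6, -1.3, 0.9]
s0, se_e, m = dong_standard_error(effects)

t_crit = 2.571  # two-sided t value for alpha = 0.05 and m = 5 degrees of freedom
e_critical = t_crit * se_e
significant = [e for e in effects if abs(e) > e_critical]
print(m, round(e_critical, 2), significant)
```

Effects whose absolute value exceeds the critical effect are then treated as significant, which mirrors the decision rule applied to Table 1.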
Central Composite Design (CCD): After the most important factors for extraction were determined in the screening step, the variables were further optimized. Two factors at five levels were varied according to a CCD in order to optimize the extraction process. The two factors were methanol concentration (%, v/v) (X₁) and extraction temperature (°C) (X₂). The factor levels with α = 1.414 are shown in Supplementary Table S2, while the design and the responses are given in Supplementary Table S3. A quadratic polynomial model used in the response surface analysis was established for each response using the following equation:
$$y = b_0 + \sum_{i=1}^{k} b_i X_i + \sum_{i=1}^{k} b_{ii} X_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} b_{ij} X_i X_j$$
where y is the response variable; b_0, b_i, b_ii and b_ij are the regression coefficients of the model representing the intercept, linear, quadratic and interaction terms, respectively; X_i and X_j are the independent variables; and k is the number of variables (k = 2).
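The structure of a two-factor CCD and of the quadratic model can be sketched as follows. The design is expressed in coded units with α = √2 ≈ 1.414, which yields five levels per factor. The number of centre points, the centre and step values used for decoding, and the coefficients passed to the model are all hypothetical placeholders; the actual levels and fitted coefficients are those reported in the supplementary tables and in Table 2.

```python
from itertools import product
from math import sqrt

def ccd_two_factor(alpha=sqrt(2), n_center=3):
    """Coded design points of a two-factor central composite design.

    n_center is an assumption; the source does not state the number of
    centre-point replicates in this section.
    """
    factorial = [(float(a), float(b)) for a, b in product((-1, 1), repeat=2)]
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    center = [(0.0, 0.0)] * n_center
    return factorial + axial + center

def decode(coded, center, step):
    """Convert a coded level to real units (center and step are user supplied)."""
    return center + coded * step

def quadratic_response(x1, x2, coeffs):
    """Second-order model y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2."""
    b0, b1, b2, b11, b22, b12 = coeffs
    return b0 + b1 * x1 + b2 * x2 + b11 * x1 ** 2 + b22 * x2 ** 2 + b12 * x1 * x2

design = ccd_two_factor()
levels_x1 = sorted({x1 for x1, _ in design})  # five distinct coded levels

# Hypothetical coefficients, for illustration only.
coeffs = (100.0, 5.0, 3.0, -4.0, -1.0, 2.0)
for x1, x2 in design:
    print(round(x1, 3), round(x2, 3), round(quadratic_response(x1, x2, coeffs), 2))
```

In practice the coefficients would be obtained by least-squares regression of each measured response on these six model terms, after which the fitted surface can be maximized over the experimental region.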

Validation of the analytical method.
Validation studies were carried out in accordance with the guidelines of the International Conference on the Harmonization (ICH) of Technical Requirements for the Registration of Pharmaceuticals for Human Use, about the validation of analytical procedures 35 . Validation was performed in terms of specificity, linearity, precision, accuracy, limit of detection and limit of quantification.
Specificity. Specificity was evaluated in order to identify potential interferences by other analytes or other compounds in the chromatographic region of interest. Analytical standards (CA, VX, IV, OT, IO and RU) and sample extracts from different plant collections were analyzed to assess the selectivity of the method at the retention times of CA, flavonoids and flavonolignans under the established analytical conditions. Mass spectrometry was used to identify co-eluting substances and to determine their relevance under routine conditions. Additionally, as an indirect approach, specificity was evaluated by demonstrating acceptable accuracy (see Accuracy).
Linearity. [...] working standard solutions were prepared daily. Three determinations (n = 3) were carried out for each solution and the calibration points were fitted by linear regression. The most appropriate concentration range was determined. First-order calibration curve equations (y = a + bx), coefficients of determination (r²), residual values, 95% confidence intervals (CI) of the intercept, and t-tests (N − 2 degrees of freedom, α = 0.05) for slope and intercept were calculated. Additionally, Mandel's fitting test was used for mathematical verification of the first-order regression model 36 . See Supplementary Methods.
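The first-order calibration statistics listed above can be sketched with ordinary least squares. The function below returns the intercept a, slope b, r², and the t statistics for testing whether slope and intercept differ from zero; the critical t value for N − 2 degrees of freedom is left to the user. The function name and the calibration data are hypothetical, and Mandel's test is not implemented in this sketch.

```python
from math import sqrt

def linear_calibration(x, y):
    """Ordinary least-squares fit of y = a + b*x with basic regression statistics."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    s_res = sqrt(ss_res / (n - 2))                 # residual standard deviation
    se_b = s_res / sqrt(sxx)                       # standard error of the slope
    se_a = s_res * sqrt(1 / n + mx ** 2 / sxx)     # standard error of the intercept
    return {
        "intercept": a, "slope": b, "r2": 1 - ss_res / ss_tot,
        "se_intercept": se_a, "se_slope": se_b,
        "t_intercept": a / se_a, "t_slope": b / se_b,
    }

# Hypothetical calibration points (concentration, peak area); NOT data from Table 5.
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
area = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = linear_calibration(conc, area)

t_crit = 3.182  # two-sided t value for alpha = 0.05 and N - 2 = 3 degrees of freedom
print(abs(fit["t_intercept"]) < t_crit, abs(fit["t_slope"]) > t_crit)
```

An intercept whose |t| stays below the critical value, together with a slope whose |t| exceeds it, corresponds to the situation described in the results, where single-point calibration was considered acceptable.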
Precision. The precision of the method was assessed for each analyte as repeatability (intra-day precision) and intermediate precision (between days and concentration levels) on four different days and at three concentration levels (50, 100 and 150%). Six replicates (n = 6) were analyzed for each working solution. The results were expressed as percentage relative standard deviation (% RSD).
The acceptability of the results was evaluated by the Horwitz equation 37,38 [...], where RSD_r is the expected coefficient of variation under repeatability conditions; RSD_R is the expected coefficient of variation under intermediate precision conditions; and C is the concentration of analyte expressed as a dimensionless mass fraction (m/m, where numerator and denominator have the same units).
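The equations themselves did not survive in this text. As a hedged illustration, the widely cited Horwitz function predicts RSD_R (%) = 2^(1 − 0.5·log10 C) for a mass fraction C, and the repeatability limit is often taken as a fraction of RSD_R. The factor of 2/3 used below is an assumption and not necessarily the value applied in this study; the function names are likewise hypothetical.

```python
from math import log10

def horwitz_rsd_R(mass_fraction):
    """Predicted RSD_R (%) from the Horwitz function for a dimensionless mass fraction."""
    return 2 ** (1 - 0.5 * log10(mass_fraction))

def horwitz_rsd_r(mass_fraction, factor=2 / 3):
    """Predicted repeatability RSD_r (%); `factor` is an assumed ratio RSD_r / RSD_R."""
    return factor * horwitz_rsd_R(mass_fraction)

# Example: an analyte present at 1 mg/g corresponds to C = 1e-3 (m/m).
c = 1e-3
print(round(horwitz_rsd_R(c), 2), round(horwitz_rsd_r(c), 2))
```

Measured % RSD values at or below these predicted limits would be considered acceptable under this criterion.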
Accuracy. Accuracy was determined by recovery (%) at three concentration levels (75, 100 and 125%) in triplicate (n = 3). The experiment was performed by adding known quantities of analytical standards (CA, VX and RU) to plant material extracts (mixture of Cecropia species) prepared at 50% of the center point of the linear range. The recovery and its 95% confidence interval (CI) were estimated. The % RSD should be of the order of the coefficient of variation under intermediate precision conditions (RSD_R), and the recovery is considered not significantly different from 100% when the 95% CI includes 100%. Additionally, a nominal range of 98 to 102% was set as the acceptance range for recovery.
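The acceptance checks for a single spiking level can be sketched as below: percent recoveries, their mean, % RSD, and a t-based 95% CI of the mean. The function name and the triplicate values are hypothetical; `found` denotes the amount of standard recovered from the spiked extract, and the t value (4.303 for n = 3, i.e. 2 degrees of freedom) is supplied by the user.

```python
from math import sqrt
from statistics import mean, stdev

def recovery_summary(found, added, t_crit):
    """Mean recovery (%), % RSD and t-based confidence interval of the mean."""
    recoveries = [100.0 * f / a for f, a in zip(found, added)]
    m = mean(recoveries)
    s = stdev(recoveries)
    half_width = t_crit * s / sqrt(len(recoveries))
    return {"mean": m, "rsd": 100.0 * s / m, "ci": (m - half_width, m + half_width)}

# Hypothetical triplicate at one level (amounts in µg/mL); NOT data from Table 7.
found = [9.9, 10.1, 10.0]
added = [10.0, 10.0, 10.0]
summary = recovery_summary(found, added, t_crit=4.303)

within_nominal = 98.0 <= summary["mean"] <= 102.0
ci_includes_100 = summary["ci"][0] <= 100.0 <= summary["ci"][1]
print(within_nominal, ci_includes_100)
```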
Limit of detection (LoD) and limit of quantification (LoQ). The limits of detection and quantification were estimated based on analytical calibration curves of the analytes (CA, VX and RU) spiked into the sample extracts. The standard deviation of the intercept was used to express the experimental error.
These mathematical estimates of LoD and LoQ were subsequently validated by analyzing samples known to be near the detection and quantification limits. Six determinations (n = 6) were performed for each concentration. The lowest concentrations with an appropriate signal/noise ratio (S/N > 2-3.5) and precision (% RSD ≤ 5.0) were then adopted as LoD and LoQ for each analyte, respectively.
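The calibration-based estimate can be illustrated with the expressions given in the ICH Q2 guideline, LoD = 3.3·σ/S and LoQ = 10·σ/S, where σ is here taken as the standard deviation of the intercept and S is the slope of the calibration curve. The text does not state the multipliers explicitly, so their use is an assumption based on the guideline being followed; the function name and the numerical inputs are hypothetical.

```python
def lod_loq(sd_intercept, slope):
    """ICH-style estimates: LoD = 3.3 * sigma / S and LoQ = 10 * sigma / S.

    The result carries the concentration units of the calibration curve.
    """
    lod = 3.3 * sd_intercept / slope
    loq = 10.0 * sd_intercept / slope
    return lod, loq

# Hypothetical regression output (area units and area per µg/mL); NOT data from Table 5.
lod, loq = lod_loq(sd_intercept=0.5, slope=25.0)
print(round(lod, 3), round(loq, 3))  # both in µg/mL for this example
```

Such estimates are only a starting point; as described above, they were checked experimentally against signal/noise and precision criteria.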
Application of the HPLC-DAD method. The proposed HPLC-DAD method was used to assess the contents of CA, TF and FL in fourteen authentic Cecropia leaf samples and three commercial products. CA, TF and FL were determined by an external standard method, using CA (20.12 µg/mL), VX (20.32 µg/mL) and RU (18.69 µg/mL) as references. Samples were prepared in triplicate. Analytical results were reported as x ± SD, where x is the mean of the results and SD is the standard deviation of the measurements. Results were expressed in µg/g of dried leaf weight.
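A single-point external standard calculation of this kind can be sketched as follows. It assumes a calibration line through the origin, so the sample concentration is the area ratio multiplied by the standard concentration, which is then converted to µg/g using the extract volume and the sample mass. The function name, peak areas, extract volume and sample mass below are hypothetical; only the CA standard concentration (20.12 µg/mL) comes from the text.

```python
from statistics import mean, stdev

def content_ug_per_g(area_sample, area_std, conc_std_ug_ml,
                     extract_volume_ml, sample_mass_g, dilution=1.0):
    """Analyte content (µg/g) from a single-point external standard.

    Assumes a response proportional to concentration (zero intercept).
    """
    conc_sample = area_sample / area_std * conc_std_ug_ml  # µg/mL in the extract
    return conc_sample * extract_volume_ml * dilution / sample_mass_g

# Hypothetical triplicate peak areas, extract volume (mL) and sample mass (g).
areas = [1000.0, 1010.0, 990.0]
contents = [content_ug_per_g(a, 500.0, 20.12, 10.0, 0.1) for a in areas]
print(f"{mean(contents):.1f} ± {stdev(contents):.1f} µg/g")
```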
The identity of the chemical constituents was determined by comparison of their retention times with those of authentic reference standards or isolated compounds, and by their UV/DAD spectral data. Additionally, the identity of individual flavonoids was confirmed by analysis of their mass spectral data.