A strategy for qualitative and quantitative profiling of glycyrrhiza extract and discovery of potential markers by fingerprint-activity relationship modeling

This study aimed to evaluate the quality consistency of glycyrrhiza extract and to explore its possible antioxidant components by combining chromatographic fingerprinting with bioactivity evaluation. Characteristic fingerprints of glycyrrhiza extract samples from different sources were generated by high-performance liquid chromatography with a diode array detector (HPLC-DAD) and evaluated using hierarchical clustering and similarity analysis. Compared with the conventional qualitative similarity evaluation method, the averagely linear quantified fingerprint method provides an important quantitative similarity parameter supported by quantitative analysis, and is therefore recommended for fingerprint evaluation. Antioxidant activities of the glycyrrhiza extract samples were determined by DPPH (2,2-diphenyl-1-picrylhydrazyl) radical scavenging assays. In addition, the fingerprint-efficacy relationship between the chemical fingerprints and the antioxidant activities was investigated using a partial least squares model, which was capable of exploring and discovering the bioactive components of glycyrrhiza extracts. The present study therefore provides a powerful strategy for evaluating the holistic quality consistency of medicinal plants.

Chromatographic separation was performed on an Agilent 1100 series system equipped with a quaternary gradient pump, thermostatic column compartment, DAD, autosampler and a CAPCELL PAK C18 MG column (250 × 4.6 mm, 5.0 μm, Shiseido, Japan). The mobile phase consisted of aqueous solution containing 0.1% (v/v) phosphoric acid and 5 mM sodium 1-heptanesulfonate (A) and acetonitrile containing 10% (v/v) anhydrous ethyl alcohol and aqueous solution and 2.4% (v/v) phosphoric acid (B). Separation was achieved using the following linear gradient program: 6-18% B at 0-10 min, 18-33% B at 10-20 min, 33-46% B at 20-32 min, 46-60% B at 32-45 min, 60-78% B at 45-60 min, 78-80% B at 60-65 min. The column temperature was maintained at 35 °C. A 10 μL aliquot of each sample was injected into the HPLC-DAD system. The flow rate was set at 1.0 mL/min. The detection wavelengths were set at 344 nm, 280 nm, 250 nm, 220 nm and 203 nm.
Similarity analysis. The averagely linear quantified fingerprint method (ALQFM) is a mathematical data processing method based on fingerprint vectors that covers both the qualitative and the quantitative analysis of fingerprints. In addition, the quality grades defined by the evaluation criteria make the evaluation results easier and more intuitive to read. The visualization analysis process for ALQFM is shown in Fig. 2A. Here, x_i and y_i are the ith peak areas of the sample and reference fingerprint vectors, respectively. The average linear qualitative similarity (S_L, Eq. 1) reflects the degree of similarity between the chemical compositions of the sample fingerprint and the reference fingerprint in terms of distribution ratio. To describe the calculated and apparent similarity, the slope of the linear equation (b, Eq. 2) and the apparent content similarity (R%, Eq. 3) are calculated after weight correction by m_S (the weight of each glycyrrhiza extract (GE) sample) and m_R (the average weight of the 30 GE samples). The closer R% is to 100% and b is to 1, the more similar the sample fingerprint vector is to the reference fingerprint vector.
The fingerprint variation coefficient (α, Eq. 4), a statistical error term calculated from R% and b, reflects the accuracy of the model. The average linear quantitative similarity (P_L, Eq. 5), an important quantitative parameter, is used to monitor the overall content of all fingerprint components in the sample fingerprint. Accordingly, α, S_L and P_L are combined in the averagely linear quantified fingerprint method (ALQFM) to evaluate the quality of traditional Chinese medicine (TCM) (8 grades listed in Supporting Information Table S1). In general, samples of grade ≤ 5 are recommended as qualified.

Results and Discussion
Methodology validation of HPLC analysis. The calibration data, including linear ranges, limits of quantification (LOQ), limits of detection (LOD), regression equations and correlation coefficients (R²), are shown in Supporting Information Table S2. All the analytes showed excellent linearity (R² ≥ 0.9999) over the tested concentration ranges. The LOQ and LOD were in the ranges of 0.500-5.000 μg/mL and 0.025-0.125 μg/mL, respectively. The accuracy of the HPLC method was evaluated by the standard addition method; the average recoveries for the six investigated compounds ranged from 97.65% to 100.71%, with relative standard deviation (RSD) values less than 1.92%. The RSD values of the co-possessing fingerprint peak areas were less than 3.40%, 1.65% and 2.78% for the stability, precision and repeatability tests, respectively. These results indicate that the method was sufficiently accurate and valid.
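The RSD and recovery figures above follow the usual definitions: sample standard deviation divided by the mean, and the fraction of a spiked amount that is found again, both expressed as percentages. A minimal Python sketch of these two calculations is given below. The function names and all numerical inputs are hypothetical illustrations, not data from this study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) = sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def recovery_percent(found, original, spiked):
    """Standard-addition recovery (%) = (found - original) / spiked * 100."""
    return (found - original) / spiked * 100.0


# Hypothetical replicate peak areas for one co-possessing peak (illustrative only).
peak_areas = [1502.1, 1498.7, 1510.3, 1495.2, 1505.8, 1500.4]
print(f"RSD = {rsd_percent(peak_areas):.2f}%")

# Hypothetical spiking experiment: amounts in the same unit (e.g. micrograms).
print(f"Recovery = {recovery_percent(found=19.8, original=10.0, spiked=10.0):.2f}%")
```

An RSD computed this way can be compared directly with the acceptance limits quoted for the stability, precision and repeatability tests.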
Compounds content analysis. The contents of the active ingredients in the 30 GE samples were simultaneously determined using the established calibration curves (Supporting Information Table S2). Based on quantitative results (Table 1) […] Fig. S1), respectively, corresponding to the flavonoids and triterpene saponins 29. To perform quality control of herbal medicine (HM), it is necessary to establish a chromatogram with comprehensive composition information. However, because of the complexity of HM compositions, this is difficult to accomplish at a single wavelength. Therefore, a fusion fingerprint 43, which combines enhanced signal response with rich fingerprint information, was necessary. As shown in Fig. 3, the typical fusion HPLC fingerprints of the 30 GE samples had similar chemical profiles with some common chemical compositions. Next, the GE sample fingerprints and the reference fingerprint (RFP, constructed by averaging the GE sample chromatograms) were imported into the in-house software "Digitized Evaluation System for Super-Information Characteristics of TCM Chromatographic Fingerprints 4.0" (software certificate No. 0407573, China) to calculate the quality evaluation results presented in Table 2. Table 2 shows that, except for S6 (grade 5), the fusion fingerprint grades fluctuated somewhat compared with the results at each wavelength. These fluctuations may be caused by differences in response intensity and in the number of fingerprint peaks at each wavelength. Consequently, the fusion fingerprint evaluation strategy is comprehensive and indispensable for avoiding potential deviations arising from a single wavelength. Furthermore, except for S18, the remaining 29 GE samples had α < 0.30, and all 30 GE samples had S_L > 0.80, indicating that the samples were similar in the distribution and number of chemical compositions. Based on the qualitative parameters S_L and α alone, the quality grades of these 29 samples should be within 1-5.
In fact, at 203 nm, S9 and S13 were judged to be outliers (grades 8 and 6, respectively) once the quantitative parameters (48.3% and 139.2%, respectively) were taken into account. For fingerprint analysis, a qualitative assessment (acceptable α < 0.30 and S_L > 0.70) should be performed first, followed by a further quantitative evaluation (acceptable 70.0% ≤ P_L ≤ 130.0%). In this study, the fusion quality of the 30 GE samples was successfully evaluated against these criteria (Table 2). The quality grades of S16, S17, S25, S26 and S30 were best (Grade 1); those of S8, S19-S21, S23 and S27 were better (Grade 2); those of S4, S5, S7, S11, S14, S22, S24, S28 and S29 were good (Grade 3); and those of S15 and S18 were fine (Grade 4). The remaining samples, S1-S3, S6, S9, S10, S12 and S13, were moderate (Grade 5) because of their much lower or higher contents. ALQFM thus provides a reliable and feasible strategy for assessing TCM/HM fingerprints qualitatively and quantitatively at the same time.
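The two-stage acceptance rule described above (qualitative check first, quantitative check second) can be sketched in Python as follows. This is only an illustration of the stated thresholds, not the in-house evaluation software and not the 8-grade scale of Table S1. The function names are hypothetical, the quantitative parameter is assumed to be the average linear quantitative similarity in percent, and the α and S_L inputs in the example are invented; only the values 48.3% and 139.2% come from the text.

```python
def passes_qualitative(alpha, s_l):
    """Stage 1: acceptable when alpha < 0.30 and S_L > 0.70."""
    return alpha < 0.30 and s_l > 0.70


def passes_quantitative(p_l_percent):
    """Stage 2: acceptable when 70.0% <= quantitative similarity <= 130.0%.

    Assumption: the quantitative parameter is P_L expressed in percent.
    """
    return 70.0 <= p_l_percent <= 130.0


def screen(alpha, s_l, p_l_percent):
    """Apply the qualitative check first, then the quantitative check."""
    if not passes_qualitative(alpha, s_l):
        return "fails qualitative check"
    if not passes_quantitative(p_l_percent):
        return "fails quantitative check"
    return "acceptable"


# alpha and S_L below are hypothetical; 48.3 and 139.2 are the quantitative
# values reported for S9 and S13 at 203 nm.
print("S9 :", screen(alpha=0.10, s_l=0.95, p_l_percent=48.3))
print("S13:", screen(alpha=0.10, s_l=0.95, p_l_percent=139.2))
```

Ordering the checks this way mirrors the point made for S9 and S13: a sample can look similar in composition pattern and still be rejected once overall content is considered.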
Correlation between average linear quantitative similarity and quantitative analysis. In the "Compounds content analysis" section, the compound contents in the GE samples were accurately quantitated using chemical reference substances. However, quantitative analysis requires available chemical reference substances and more time. Even if quantitation is feasible when the chemical reference substances […] As discussed in the "HPLC fingerprint analysis" section, P_L is an important parameter for quantifying all fingerprint components in the sample fingerprint. The relationship between P_L and the quantitative content of the six compounds was therefore explored further. The P_6C values (y, average of the six component percentages, Table 1) were plotted against each set of P_L values (x; P_L-203 nm, P_L-220 nm, P_L-250 nm, P_L-280 nm, P_L-344 nm and P_L-fusion, Table 2), as shown in Supporting Information Fig. S2. Linear regression showed that the correlation coefficients (r) were all above 0.5435; in particular, the correlation coefficient between P_6C and P_L-fusion reached 0.8487. This indicated that P_L was broadly consistent with the quantitative results for the six investigated compounds. Consequently, given their economy, reliability, feasibility and simplicity for HM quality control, quantifiable fingerprints have the potential to replace multi-component quantification.
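The regression of P_6C on P_L described above amounts to an ordinary least-squares line and a Pearson correlation coefficient. The sketch below shows both calculations in plain Python. The function names are hypothetical and the paired values are invented for illustration; they are not the values of Tables 1 and 2.

```python
from math import sqrt


def _sums(x, y):
    """Return means and centered sums of squares/products for paired data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    _, _, sxx, syy, sxy = _sums(x, y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my, sxx, _, sxy = _sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical paired values in percent (illustrative only).
p_l = [62.0, 85.5, 98.2, 104.7, 121.3, 136.0]   # x: quantitative similarity
p_6c = [70.1, 88.0, 95.4, 108.9, 115.2, 129.8]  # y: mean of six compound percentages

slope, intercept = fit_line(p_l, p_6c)
print(f"P_6C = {slope:.3f} * P_L + {intercept:.2f}, r = {pearson_r(p_l, p_6c):.4f}")
```

Repeating the calculation once per wavelength and once for the fusion fingerprint would give one r value per panel of the kind summarized above.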

Hierarchical clustering analysis.
To investigate the internal characteristics of GE samples from different sources, hierarchical clustering analysis (HCA) 44, a powerful multivariate analysis technique, was performed on the HPLC fingerprints using HemI statistics software (Heatmap Illustrator, Version 1.0). For the 30 GE samples, the differences in quality grade between the fusion and 220 nm results were all within two grades (Supporting Information Fig. S3). Therefore, the 220 nm fingerprint was the closest to the fusion fingerprint and reflected more compound information than any other wavelength in both qualitative and quantitative terms. In this study, HCA was performed on the peak areas of the 39 co-possessing peaks of the 30 GE samples at 220 nm. The between-groups (average) linkage method was used with squared Euclidean distance as the metric to evaluate the similarity of the 30 GE samples. A hierarchical cluster heat-map (Fig. 4) was obtained to represent the dissimilarity (between different groups) or similarity (within the same group) among the 30 GE samples. Two main clusters were produced (samples S1-S10, collected from Manufacturer A, Shanxi province, and samples S11-S20 and S21-S30, collected from Manufacturers B and C, Xinjiang province). The distance between samples in different clusters was greater than the distance between samples in the same cluster, suggesting that the internal quality characterized by HPLC […] 17,41,42. These results are consistent with those of the quantitative fingerprint analysis, indicating that HCA performs only a simple classification of samples, whereas quantitative fingerprint analysis can provide a more accurate, feasible and reliable evaluation of TCM/HM. Subsequently, a simple and intuitive comparison of the chemical fingerprints (Fig. 3) was made by comparing the color changes among the 39 co-possessing peaks in Fig. 4, which might be related to their different biological activities. In particular, the six investigated compounds showed pronounced color differences. Hence, the antioxidant activity of the GE samples was studied in the next part.
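The clustering step described above, between-groups (average) linkage with squared Euclidean distance, can be illustrated with a small pure-Python sketch. This is not the HemI implementation; it is a naive agglomerative procedure written for clarity, the function names are hypothetical, and the four rows of peak areas are invented stand-ins for the 30 × 39 peak-area matrix.

```python
from itertools import combinations


def sq_euclidean(a, b):
    """Squared Euclidean distance between two peak-area vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def average_linkage(rows, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Cluster distance is the mean of all pairwise sample distances
    (between-groups linkage). Returns lists of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def dist(c1, c2):
        total = sum(sq_euclidean(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: dist(clusters[ab[0]], clusters[ab[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical peak areas: rows are samples, columns are co-possessing peaks.
peak_areas = [
    [10.0, 20.0, 30.0],
    [11.0, 19.0, 31.0],
    [50.0, 60.0, 70.0],
    [52.0, 61.0, 69.0],
]
print(average_linkage(peak_areas, n_clusters=2))
```

Because raw peak areas enter the distance directly, large peaks dominate the grouping; whether and how the areas were scaled before clustering is not stated here, so no scaling is applied in the sketch.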
Antioxidant activity. Antioxidant activity assay based on FIA. Antioxidant activity has been demonstrated to be an effective means of evaluating the biological activity of GE and related products in vitro 45,46. To determine the total antioxidant capacity of the GE samples, DPPH assays were performed using the method introduced by Mrazek et al. 36 with slight modifications. In this study, the total antioxidant activities of the GE samples were assessed by the on-line UV spectroscopic DPPH method, and the measured ascorbic acid equivalent (ASAE) values are shown in Table 1. The results indicated that the majority of the GE samples possessed excellent antioxidant activities, with ASAE values above 0.2000 mM; however, S6, S9, S10 and S19 showed lower ASAE values (<0.2000 mM), indicating poorer antioxidant activities. In addition, the ASAE values ranged from 0.1365 mM (S6) to 0.2809 mM (S12), a 2.06-fold difference in antioxidant activity. Comparison of the chemical fingerprints and evaluation results showed that the difference in antioxidant activity was not caused by a simple change in the content of any single component. Therefore, it is necessary to establish a model to investigate the relationship between antioxidant activity and chemical fingerprint.
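The fold difference and the 0.2000 mM cut-off quoted above can be checked with a few lines of Python. Only the two ASAE values given in the text (S6 and S12) are used. The function names are hypothetical, and treating a value exactly equal to the cut-off as "lower" is an assumption, since the text does not specify that case.

```python
ASAE_CUTOFF_MM = 0.2000  # cut-off quoted in the text, in mM


def fold_difference(highest, lowest):
    """Ratio of the highest to the lowest ASAE value."""
    return highest / lowest


def activity_label(asae_mm):
    """Label a sample relative to the cut-off (equality treated as lower: assumption)."""
    return "above cut-off" if asae_mm > ASAE_CUTOFF_MM else "lower"


asae_mm = {"S6": 0.1365, "S12": 0.2809}  # values reported in the text
fold = fold_difference(asae_mm["S12"], asae_mm["S6"])
print(f"fold difference = {fold:.2f}")
for sample, value in asae_mm.items():
    print(sample, activity_label(value))
```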

Relationship between HPLC fingerprints and antioxidant activities.
A correlation model between the antioxidant activities and HPLC fingerprints was established to discover potential constituents with antioxidant capacity. A partial least squares (PLS) model 47-49 was constructed using SIMCA-P 13.0 software (Umetrics, Sweden), with the ASAE values as the response matrix Y and the HPLC fingerprints at 220 nm as the descriptor matrix X, to investigate the spectrum-effect relationship. Based on the score plot centered on the data mean (Fig. 5A), outliers (S6 and S18) were identified and eliminated before constructing the final mathematical model. After excluding the outliers, the remaining 28 samples were divided randomly into training and test sets ( […] , the more relevant for variable classification. Therefore, the six investigated compounds showed strong antioxidant activity. In particular, the three most important variables (VIP value > 1.5) with positive coefficients for antioxidant activity were the concentrations of peak 7 (VIP value = 1.6928), peak 21 (VIP value = 1.6148) and LQA (VIP value = 1.5607) (Fig. 5D).
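To make the VIP values above more concrete, the sketch below computes variable importance in projection for the simplest possible case: a single-response PLS model with one latent component. This is a deliberate simplification and not the SIMCA-P model; the number of components and the scaling used in the study are not stated here, so the sketch only mean-centers the data. With one component and a unit-length weight vector w, VIP_j reduces to sqrt(p) × |w_j|, where p is the number of variables. The function name and the data are hypothetical.

```python
from math import sqrt


def pls1_one_component_vip(X, y):
    """Return (weights, VIP values) for a one-component PLS1 model.

    X: list of rows (samples x variables); y: list of responses.
    Data are mean-centered only (no scaling: assumption for illustration).
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: proportional to the covariance of each variable with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))  # zero only if no variable covaries with y
    w = [v / norm for v in w]

    # One component: VIP_j = sqrt(p) * |w_j|, so the VIP squares sum to p.
    vip = [sqrt(p) * abs(v) for v in w]
    return w, vip


# Hypothetical peak areas (columns) and responses (illustrative only).
X = [[1.0, 5.0, 2.0], [2.0, 4.0, 2.1], [3.0, 3.0, 1.9], [4.0, 2.0, 2.0]]
y = [0.1, 0.2, 0.3, 0.4]
weights, vip = pls1_one_component_vip(X, y)
for j, (wj, vj) in enumerate(zip(weights, vip), start=1):
    print(f"variable {j}: weight sign {'+' if wj > 0 else '-'}, VIP = {vj:.3f}")
```

Since the squared VIP values average to 1, a VIP above 1 marks a variable that contributes more than average to the model, and the sign of the weight indicates whether that contribution is positive or negative in this one-component case.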

Conclusion
To evaluate the quality consistency of GE samples, this study established a multi-pronged approach comprising quantitative fingerprint evaluation (ALQFM), a chemometric method (HCA) and an antioxidant activity assay. In the quantitative fingerprint evaluation, all the GE samples showed similar S_L and α; however, P_L was able to identify differences among the samples arising from variations in chemical content. Moreover, P_L was well correlated with the content of the six investigated compounds (r = 0.8487 for the fusion fingerprint), indicating that ALQFM has the potential to replace quantitative analysis for complex HM systems. HCA was able to distinguish GE samples from different sources; however, it could not clearly indicate the active ingredients in the GE samples. Therefore, with the help of a chemometric approach (PLS), a fingerprint-efficacy relationship based on the rapid spectroscopic DPPH method was established to explore the possible antioxidant components in GE samples. The strategy proposed in this study provides a powerful, effective and practical method for evaluating the quality consistency of GE samples.