A self-feedback network based on liquid chromatography-quadrupole-time of flight mass spectrometry for system identification of β-carboline alkaloids in Picrasma quassioides

Profiling the chemical components of herbs by mass spectrometry is challenging because standard compounds are often unavailable, especially for position isomers. This paper presents a strategy based on a self-feedback network of mass spectral (MS) data for identifying chemical constituents in herbs by liquid chromatography-quadrupole-time of flight mass spectrometry without compound standards. Components sharing the same skeleton were screened and all of their ions were compiled into a database. All candidates were then connected through selected bridging ions to establish a primary MS network. With such a network, once a single component has been identified de novo, the structures of all diagnostic ions and candidates can be characterized sequentially. Taking Picrasma quassioides as an example, the primary network of β-carbolines was established with 65 ions (selected from 76 β-carbolines), each of which appeared in at least four compounds. Once an alkaloid has been identified, its logical ions can be fed back into the primary network to build pathways to other unknown compounds. Moreover, the positions of the substituent groups can be deduced from the secondary metabolic pathways of the alkaloids (plant secondary metabolism). The network can therefore be used to identify unknown compounds and even their position isomers.

substructures. In general, structural analogues in an herb share a similar core substructure and are distinguished by various chemical groups, including hydroxyls, formyls, methoxys, methyls, and glycosyls. The core substructure, which has a significantly lower molecular mass, can therefore be used as the MDF template compound, while the chemical groups define the mass changes and their limited ranges. Once the preset filter is applied, the characteristic compounds can be extracted and the heterogeneous ions removed [5,13]. An approach combining liquid chromatography-hybrid ion trap time-of-flight mass spectrometry (LC-IT-TOF-MS) with the MDF technique was developed to extract and classify peaks according to structure type. Using this approach, 50 ophiopogonins and 27 ophiopogonones were structurally classified and determined from the extract of Ophiopogon japonicus [5]. Another study developed an improved strategy using MDF in combination with LC-IT-TOF-MS analysis and theoretical calculations for the identification and structural characterisation of dihydroindole-type alkaloids in processed Semen Strychni extracts. Twenty-four dihydroindole-type alkaloids, including four not previously described, were tentatively identified [14]. Ion fragmentation pathways have also been applied for identification. Based on LC-IT-TOF-MS, a novel and generally applicable approach to identifying nontarget components in herbal preparations was developed. By classifying peaks into families based on identical fragment ions (mass error <5 mDa) and connecting the families through peaks present in two or more of them, this method was successfully applied to identify 43 compounds from an herbal preparation [6]. In our laboratory, 18 tetracyclic monoterpenoid oxindole alkaloids in Uncaria rhynchophylla were determined by analyzing their LC-MS2 cleavage pathways and metabolic relationships [7].
However, all of the above approaches are limited by their need for predefined fragment ions of known compounds, summarized from standards and/or the literature. Because standard compounds are scarce for the majority of herbal medicines and natural products, a new strategy for identifying compounds without standards is urgently needed.
Picrasma quassioides (D. Don) Benn. (Fam. Simaroubaceae), called Kumu in Chinese, is one of the TCMs used for the treatment of swollen sore throat, diarrhea and dysentery, eczema, sores and deep-rooted boils, and insect or snake bites, and as a gastrointestinal vermifuge [8,9]. It has been listed in all versions of the Chinese Pharmacopeia and has been used alone (Kumu Injection) or as an ingredient in many Chinese herbal preparations. Previous investigations showed that alkaloids, including β-carbolines, canthinones and their dimers, are the principal active components of P. quassioides [10,11]. A high performance liquid chromatography (HPLC) fingerprint has been constructed from the extract of P. quassioides, but only 7 alkaloids were identified among its 27 peaks [12]. In our laboratory, a method combining total ion chromatograms with chemometrics and MDF was also established for predicting the antitumor active ingredients in P. quassioides samples. A total of 17 constituents were predicted to be potential antitumor active compounds, of which only 12 were identified [13]. Because there are many isomers and too few standards are available, it is difficult to identify the alkaloids in P. quassioides, let alone predict their druggability. Developing a methodology for the comprehensive identification of such components, with numerous isomers and no standards, is therefore a challenging task.
Herein, we describe an efficient and practicable strategy for identifying alkaloids, including structural analogues such as isomers, in P. quassioides, based on a self-feedback enhanced network built from liquid chromatography-quadrupole-time of flight mass spectrometry (LC-Q-TOF MS) data. The strategy first establishes a primary network from the most common fragment ions of the alkaloids extracted by the MDF technique (ions that appear at least four times). Once a single alkaloid has been identified, its fragment ions that are present in the primary network can be connected by cleavage pathways, while logical ions that are not present in the primary network (mass error <5 mDa with a rational predicted molecular formula) can be fed back to establish an enhanced network. The identifying power of the network grows as more ions are fed back to it from the identified compounds. Using chemical information obtained from the literature, the structures of a particular type of constituent can be identified by this strategy without standards. The strategy is useful for the quality control and component identification of various natural products, such as herbal medicines, preparations and other biological samples.

Results and Discussion
The workflow of this strategy is described in the Methods section and shown in Fig. 1.

Extraction of alkaloid candidates by MDF.
The extract of P. quassioides was analyzed by LC-Q-TOF MS, and the total ion chromatogram (TIC) is shown in Fig. 2A. Under the selected conditions, most of the peaks were well separated with high resolution and good sensitivity. Because the alkaloids are the major active components of P. quassioides, the potential alkaloids, divided into single alkaloids (β-carbolines and canthinones) and dimer alkaloids, were extracted by the MDF technique to reduce interference from other ions in the matrix.
For the single alkaloids, the mass defects of 9H-pyrido[3,4-b]indole (C11H8N2; the filter reference) and all of its derivatives with various chemical substituents were summarized (Supplementary Table S1). Based on Supplementary Table S1, the minimum and maximum mass defects were calculated as 0.0378 and 0.1790 Da, corresponding to the formulas C11H6N2O3 and C18H23N3O2, respectively. The filter was therefore set as C14.5H14.5N2.5O2.5 ± 70.6 mDa over the mass range of 160-400 Da. The filtered chromatogram is shown in Fig. 2B; its noise level (2.5 × 10^6 counts per second, cps) was 36% of that of the original chromatogram (7 × 10^6 cps) (Fig. 2A). After the irrelevant ions were excluded by MDF, a total of 76 single alkaloid candidates (Table 1) were detected in the filtered TIC profile.
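The filter window quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the vendor software's implementation: the function names are hypothetical, the monoisotopic masses are standard reference values, and the sketch applies the window to whatever mass it is given (the text does not state whether the software filters on neutral or protonated masses).

```python
import re
from math import floor

# Monoisotopic masses (Da) of the most abundant isotopes (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}

def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C11H6N2O3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total

def mass_defect(mass: float) -> float:
    """Fractional part of the mass, i.e. exact mass minus its integer part."""
    return mass - floor(mass)

def mdf_window(low_formula: str, high_formula: str):
    """Return (center, half_width) of the mass defect range spanned by two formulas."""
    lo = mass_defect(monoisotopic_mass(low_formula))
    hi = mass_defect(monoisotopic_mass(high_formula))
    return (lo + hi) / 2, (hi - lo) / 2

def passes_mdf(mass, center, half_width, mass_range=(160.0, 400.0)):
    """True if the mass lies in the mass range and its defect lies in the window."""
    in_range = mass_range[0] <= mass <= mass_range[1]
    return in_range and abs(mass_defect(mass) - center) <= half_width

# Formulas with the minimum and maximum mass defects, as given in the text.
center, half_width = mdf_window("C11H6N2O3", "C18H23N3O2")
print(f"window: {center:.4f} Da +/- {half_width * 1000:.1f} mDa")
```

With the two boundary formulas from the text, the half width evaluates to about 70.6 mDa, in agreement with the stated filter setting.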
Establishment of the primary network. Diagnostic ion pathways (cleavage pathways connected by diagnostic fragment ions, DFIs) are useful in LC-MSn analysis and have been widely applied for the rapid identification of compounds in various studies [2][3][4]. Ion cleavage pathways are generally constructed from chemical standards that contain the same carbon skeleton or substructure and therefore produce the same fragment ions (i.e., DFIs). The lack of sufficient chemical standards for selecting reasonable DFIs limits the use of ion cleavage pathways for identifying the chemical constituents of herbal products. Herein, we propose a strategy that selects reasonable DFIs from a primary network of fragment ions rather than from chemical standards.
Based on the Q-TOF MS analysis, the 76 single alkaloid candidates gave a total of 231 MS2 fragment ions with a mass error of less than 5 mDa. These ions were divided into two groups according to the number of times each ion appeared (NTA): Group-I with 79 ions (NTA ≥4) and Group-II with 152 ions (NTA <4). From Group-I, 65 ions that had logical losses of molecular weight and corresponded to a definite chemical formula were selected to establish the primary network (Supplementary Table S2). This NTA threshold allowed the network to cover most of the alkaloid candidates with the fewest fragment ions, which facilitates compound identification.
Identification of alkaloid candidates by primary network. The single alkaloid candidates captured by MDF were rearranged according to their number of isomers (both detected and reported) (Table 2). Candidates without isomers can be identified easily from their MS and/or MS2 ions together with reported information. For example, compounds 8 and 32, two single alkaloids with [M + H]+ at m/z 259 and 289, respectively, were unambiguously identified from their precise relative molecular masses. In some cases, a compound with more than two reported isomers can also be identified easily from its precise relative molecular mass and the MS2 fragment ions inferred from the reported isomers, as for compounds 10, 52 and 57 ([M + H]+ at m/z 229, 185 and 273). Once these compounds have been identified, their MS2 fragment ions and cleavage pathways can be used to extrapolate to other compounds that give the same fragment ions. In general, the more fragment ions two compounds share, the more similar their structures. For example, there were 6 isomers at [M + H]+ m/z 257, but only compound 27 shared 4 fragment ions with compound 10, at m/z 128.0495, 155.0604, 156.0444 and 183.0553, in the primary network (Supplementary Table S2).
Thus, compound 27 can be easily identified through the diagnostic ion pathways built from these four fragment ions (Fig. 3). The metabolic relationship between two compounds in the plant is also useful for structure identification. For example, compound 32, which has one more methoxyl at C-8 than compound 8, can be considered a metabolite of compound 8. Accordingly, the cleavage patterns of compounds 8 and 32 were very similar, with seven identical neutral fragments (-C2H5O2, -CH4O2, -C2H6O2, -CH, -CH2, -CH3O and -CO) lost at the corresponding positions (Fig. 4). This suggests that the structures of compounds showing similar cleavage patterns (i.e., the same neutral losses), and especially the positions of their substituent groups, can be deduced through their metabolic pathways.
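The comparison of shared fragment ions described above can be sketched as follows. This is an illustrative sketch only: the function names and the spectra are hypothetical, and only the four shared m/z values and the 5 mDa tolerance are taken from the text.

```python
TOL_DA = 0.005  # 5 mDa mass error tolerance, as used in the text

def shared_fragments(spec_a, spec_b, tol=TOL_DA):
    """Return the fragments of spec_a that have a match in spec_b within tol."""
    return [mz for mz in sorted(spec_a)
            if any(abs(mz - other) < tol for other in spec_b)]

def rank_by_shared(reference, candidates, tol=TOL_DA):
    """Rank candidate spectra by how many fragments they share with the reference."""
    scores = {name: len(shared_fragments(reference, spec, tol))
              for name, spec in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical MS2 fragment lists. The first four values of compound_10 are the
# shared ions quoted in the text; every other value is made up for illustration.
compound_10 = [128.0495, 155.0604, 156.0444, 183.0553, 201.0660]
candidates = {
    "compound_27": [128.0497, 155.0601, 156.0446, 183.0550, 242.0690],
    "other_isomer": [128.0493, 170.0600, 214.0740],
}

print(shared_fragments(compound_10, candidates["compound_27"]))
print(rank_by_shared(compound_10, candidates))
```

A simple count is used here as the similarity score; weighting by relative abundance would be another reasonable choice, but the text does not specify one.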
As a result, 31 of the 76 single alkaloids (Table 1) were identified from the fragment ions in the primary network, combined with the structural or metabolic correlations between these compounds. The fragment ion information of the determined compounds (including precise molecular mass, logical loss, relative abundance and tolerance) can be continually fed back to the primary network to correlate and analyze more unidentified compounds.
Enhancement of the network with feedback DFIs from the identified compounds. To identify the remaining candidate alkaloids, the fragment ions in Group-II (NTA <4) that were involved in the cleavage pathways of the identified compounds were selected and fed back to the primary network. The network can thus be continually enhanced as more compounds are identified. The enhanced network was effective at discriminating the unidentified alkaloid analogues, especially the position isomers.
Compounds 6, 19, 31, 55, 67 and 69 are six isomers with the same molecular weight ([M + H]+ at m/z 241). As an example, their structures were distinguished by the enhanced network as follows. In the primary network, there were 10 fragment ions for compound 6, 15 for 19, 15 for 31, 2 for 55, 29 for 67 and 11 for 69. Because they are isomers, they shared many fragment ions. Among them, four isomers have been reported with a precise relative molecular mass of 240.2615 and a formula of C14H12N2O2 (Supplementary Table S1). By comparing the MS2 ion fragments, the structures of compounds 6, 19, 31, 55 could be assigned ( , loss of CH4 ) in the primary network and its structure could be deduced as 1-ethenyl-4-methoxyl-8-hydroxyl-β-carboline. Compound 55 provided only two fragment ions, at m/z 167.0604 (C11H7N2+, loss of C3H6O2, 70.26%) and 140.0495 (C10H6N+, loss of CHN from C11H7N2+, 29.74%), both with high relative abundance, indicating that the structure readily loses all of its substituents to yield a structurally stable residue. Compound 55 was therefore determined to be 1-ethoxymethenyl-β-carboline. All the fragment ions and cleavage pathways of the four compounds were fed back to the enhanced network in order to identify the remaining isomers and other compounds. Compound 67 shared 13, 7 and 4 fragment ions with compounds 31 (1-ethenyl-4-methoxyl-8-hydroxyl-β-carboline), 40 (1-ethenyl-4-methoxyl-β-carboline) and 24 (1-ethenyl-β-carboline), respectively (Supplementary Table S2). By comparison with these three compounds, compound 67 was determined to be 1-ethenyl-4-methoxyl-6-hydroxyl-β-carboline. It is a known compound but was found in P. quassioides for the first time, and its structure was confirmed by the MS2 fragment ions listed in Table 1.
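The neutral-loss arithmetic quoted for compound 55 can be reproduced from monoisotopic masses. The sketch below is a consistency check under stated assumptions, not the authors' procedure: the helper name is hypothetical, the precursor is taken to be the protonated molecule of C14H12N2O2, and the atomic and proton masses are standard reference values.

```python
import re

# Monoisotopic masses (Da); standard reference values.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
PROTON = 1.00727646688  # mass of H+ (hydrogen atom minus one electron)

def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C14H12N2O2'."""
    return sum(MONO[el] * (int(n) if n else 1)
               for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula))

# [M + H]+ of the isomers at nominal m/z 241 (formula from the text).
precursor = monoisotopic_mass("C14H12N2O2") + PROTON

# Sequential neutral losses reported for compound 55.
fragment_1 = precursor - monoisotopic_mass("C3H6O2")  # expected C11H7N2+
fragment_2 = fragment_1 - monoisotopic_mass("CHN")    # expected C10H6N+

print(f"[M + H]+   = {precursor:.4f}")
print(f"- C3H6O2   = {fragment_1:.4f}")
print(f"- CHN      = {fragment_2:.4f}")
```

Both computed fragment masses agree with the reported values of 167.0604 and 140.0495 to four decimal places, well inside the 5 mDa tolerance.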
In the same way, compound 69 was identified as 1-propenyl-4,8-dihydroxyl-β-carboline; it shared 8 and 5 fragment ions with compounds 47 and 13, respectively (Supplementary Table S2). Using the enhanced network, 39 of the 45 unidentified single alkaloids were determined. As shown in Table 2, a total of 70 single alkaloids were identified by the primary and enhanced networks. Among them, 20 compounds were reported for the first time and 19 known compounds were detected in P. quassioides for the first time.

Structure validation by chemical standards and literature.
In total, 79 single alkaloids were screened by the MDF technique and the enhanced network, and 73 of them were identified by the enhanced network (Table 1). Among them, the structures of 25, 28, 41, 56, 70 and kx3 were confirmed by comparison with the corresponding standards. The rationality of the strategy was also verified by comparing the fragment ions and cleavage pathways of the standards with those of the corresponding identified compounds. With reference to the literature, the structures of 28 known alkaloids (Table 1) were also confirmed by analyzing their reasonable cleavage pathways and/or plant metabolic pathways. Among the identified compounds, 20 single alkaloids have not been reported before. Although we have proposed reasonable cleavage pathways to support their possible structures, more information is needed to confirm the results, because a single fragment ion may correspond to several structures, which could lead to a wrong combination of fragments. However, wrong combinations are unlikely with the enhanced network, because a mismatched combination is easily detected and excluded by the cleavage pathways and the metabolic pathways. Six single alkaloids (1, 7, 21, 33, 66, 76) extracted by the MDF technique could not be distinguished by this strategy, mainly because of insufficient fragment information. Among them, compound 76 was an alkaloid with two optical isomers, which give almost the same fragment ions and cannot be distinguished by this strategy.
In the same way, 16 of 35 dimer alkaloids were identified and 4 of them have not been reported before ( Supplementary Fig. S2

Conclusions
Alkaloids are a class of secondary metabolites in natural products with significant biological and therapeutic activities. Position isomers of alkaloids are challenging to distinguish because the exact structure is difficult to determine when multiple position isomers are possible. In this paper, a strategy for determining the structures of alkaloids was developed based on a self-feedback network of MS2 ions obtained from LC-Q-TOF MS analysis, combined with logical cleavage pathways and metabolic pathways. Taking P. quassioides as an example, 89 alkaloids, including 70 position isomers, were successfully characterized using this strategy. Moreover, 24 compounds had not been reported before (new compounds), and 19 known compounds were detected in this herbal medicine for the first time. The results show that this is an efficient and practical method for identifying different kinds of secondary metabolites, especially position isomers, in natural products.

Materials and Methods
Chemicals. Alkaloid standards of β-carboline-1-carboxylic acid, 3-methylcanthin-5,6-dione, 5-hydroxy-4-methoxycanthin-6-one, 4,5-dimethoxycanthin-6-one, 1-ethanoyl-β-carboline and canthin-4-one were isolated from P. quassioides in our laboratory and identified by MS2, 1H and 13C NMR spectral data. Their purities were determined to be more than 98% by HPLC. HPLC grade acetonitrile was purchased from Shanghai Xingke Biochemistry Co., Ltd (Shanghai, China) and deionized water was prepared by a Milli-Q water
Samples preparation. The 70% methanolic extract of P. quassioides was provided by Qingfeng Medical Investment Group (Jiangxi, China). The extract was accurately weighed and dissolved in methanol to prepare a sample solution at 10 mg·mL−1. The reference compounds were separately dissolved in methanol to produce reference solutions at concentrations ranging from 0.365 to 0.115 mg·mL−1. Aliquots of 10 μL of the sample and reference solutions were injected for LC-Q-TOF MS analysis after filtration through a 0.45 μm Millipore membrane.
Chromatographic conditions.
Mass spectrometric conditions. Mass spectra of the test solutions were obtained on an Agilent 6520 Q-TOF mass spectrometry system (Agilent Corp., USA) equipped with an electrospray ionization interface. High purity nitrogen was used as the sheath gas and ultra high purity helium as the auxiliary gas. The sample was analyzed in positive ionization mode with the parameters as follows: drying gas flow at 8. The elemental compositions of the target candidates ranged from 0 to 60 for carbon and hydrogen, from 0 to 8 for oxygen, from 2 to 4 for nitrogen and zero for other elements. The tolerance of the predicted formula (mass error) was less than 5 mDa. The candidate compounds were extracted according to the average elemental composition of the structural analogues ± the mass defect tolerance (half the width of the mass defect range) using the qualitative analysis software Ver. B.04.00 (Agilent Corp., USA). The relevant data for the candidate compounds, including peak number, retention time, accurate mass, predicted chemical formula and the corresponding mass error, were then exported.
Strategy for network construction and candidate alkaloid identification. To identify the candidate alkaloids, the first step was to extract the rational MS fragment ions. All the fragment ions from all the target candidates that had an exact molecular mass (mass error <5 mDa) and a predicted molecular formula were selected and rearranged according to their NTAs. Based on the NTAs, these ions were divided into two groups: Group-I with NTAs ≥4 and Group-II with NTAs <4. The ions in Group-I with logical losses of molecular weight corresponding to a definite chemical formula were selected to establish the primary network. The second step was to determine the structures and the cleavage pathways of the fragment ions in the primary network. With the aid of reported chemical information, the structures of known compounds could be easily identified.
The fragment ions that were obtained from these compounds, were included in the primary network and could be linked by reasonable cleavage pathways were used as DFIs. The DFIs can be used to recognize the same substructures in other compounds.
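The grouping of fragment ions by NTA in the first step can be sketched as follows. This is a minimal illustration under stated assumptions: the greedy clustering rule, the function names and the toy observations are hypothetical, and only the 5 mDa tolerance and the NTA threshold of 4 come from the text.

```python
def cluster_ions(observations, tol=0.005):
    """Group (compound_id, mz) observations into ions.

    Observations are sorted by m/z and added to the current cluster while they
    lie within tol (Da) of the cluster's first m/z, so no cluster is wider than tol.
    """
    clusters = []
    for compound, mz in sorted(observations, key=lambda obs: obs[1]):
        if clusters and mz - clusters[-1]["mzs"][0] < tol:
            clusters[-1]["mzs"].append(mz)
            clusters[-1]["compounds"].add(compound)
        else:
            clusters.append({"mzs": [mz], "compounds": {compound}})
    return clusters

def split_by_nta(clusters, threshold=4):
    """Split ions into Group-I (NTA >= threshold) and Group-II (NTA < threshold).

    NTA is taken here as the number of distinct compounds that produce the ion.
    """
    group_1 = [c for c in clusters if len(c["compounds"]) >= threshold]
    group_2 = [c for c in clusters if len(c["compounds"]) < threshold]
    return group_1, group_2

# Toy observations: compound labels and ion assignments are invented for illustration.
observations = [
    ("cand_A", 128.0495), ("cand_B", 128.0497), ("cand_C", 128.0493),
    ("cand_D", 128.0496), ("cand_A", 167.0604), ("cand_B", 167.0601),
    ("cand_E", 140.0495),
]

group_1, group_2 = split_by_nta(cluster_ions(observations))
print(len(group_1), "Group-I ion(s);", len(group_2), "Group-II ion(s)")
```

In this toy data set only the ion near m/z 128.049 is produced by four candidates, so it alone would qualify for the primary network; the formula and logical-loss checks described in the text would still have to be applied afterwards.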
The third step was to enhance the primary network with feedback DFIs. For the identified compounds, some fragment ions that could be linked by reasonable cleavage pathways occurred in Group-II rather than in Group-I. Such fragment ions, together with their cleavage pathways, were fed back to the primary network to strengthen the identification function of the system.
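The feedback step can be sketched as a simple update of the list of network ions. This is a hypothetical illustration, not the authors' implementation: the function name and the Group-II and pathway values are made up, and the four primary network ions are taken from the example ions quoted in the Results.

```python
def feed_back(network, pathway_ions, group_2, tol=0.005):
    """Add to the network the Group-II ions found on an identified compound's pathway.

    network, pathway_ions and group_2 are lists of m/z values; matching uses a
    tolerance of tol Da. The network list is extended in place and the newly
    added ions are returned.
    """
    added = []
    for mz in pathway_ions:
        in_network = any(abs(mz - known) < tol for known in network)
        in_group_2 = any(abs(mz - ion) < tol for ion in group_2)
        if in_group_2 and not in_network:
            network.append(mz)
            added.append(mz)
    return added

# Primary network ions quoted in the Results; all other values are invented.
network = [128.0495, 155.0604, 156.0444, 183.0553]
group_2 = [199.0866, 211.0866, 227.0815]           # hypothetical low-NTA ions
pathway_of_identified = [183.0553, 199.0866, 227.0815]  # hypothetical cleavage pathway

added = feed_back(network, pathway_of_identified, group_2)
print("fed back:", added)
print("enhanced network size:", len(network))
```

Calling the function again after each newly identified compound mirrors the iterative enhancement described above: ions already in the network are skipped, so repeated calls do not create duplicates.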
After the known compounds had been identified, the unknown candidate alkaloids were characterized mainly by the enhanced network, combined with the cleavage pathways (including diagnostic neutral losses) and the metabolic relationships between compounds of the same type. New DFIs that were obtained from the identified alkaloids and belonged to Group-II could be continually fed back to the network.