Introduction

Traditional herbal medicine has long been practiced in health care systems in many countries worldwide. The global trade of herbal remedies and supplements is estimated to increase every year and is expected to reach approximately USD$ 117.02 billion by 20241. The usage of herbal products has gained significant momentum in the recent past and is expected to continue to increase in the near future. In Thailand, traditional Thai medicine (TTM) was the most conventional healthcare system until the establishment of modern health care2,3. Consequently, as a result of many social and economic status changes, the use of TTM became limited to indigenous Thai people. However, the government has been trying to rejuvenate TTM to benefit the Thai medical system, especially in rural areas4,5. The quality parameters of herbal products are generally documented in the Thai Herbal Pharmacopoeia (THP) and Monographs of Selected Thai Materia Medica (TMM), the two reference textbooks endorsed by the Thai government.

The THP, currently in its 2021 edition6, was first established in 1989 by the Bureau of Drug and Narcotics, Department of Medical Sciences, Ministry of Public Health, to set forth quality standards for plants or herb-based drugs and herbal product preparations marketed in Thailand to ensure their identity, quality, safety, and efficacy. Intentional or unintentional adulteration of herbs leads to lower efficacy and affects herbal trade7,8,9,10. The common traditional authentication process of herbal products includes methods of botanical identification such as plant taxonomy, microscopic and macroscopic examination, and advanced chemical methods11. However, each method has advantages and limitations. The most frequent approaches are macroscopic and microscopic identification, which are fast and cost‐effective qualitative techniques. However, macroscopic analysis requires the whole plant, and it is difficult to apply to forms where plant morphology cannot be determined, for instance, mixtures of multiple herbs or extracted samples12. Phytochemical approaches or metabolomics profiling has been used for the identification of botanical drugs, dietary or food supplements and plant extracts13. Generally, phytochemical authentication depends on the selection of chemical markers that are unique to the selected plant species and is not always successful due to variation in geographical location and environmental conditions, including soil type, plant age, plant part, processing, storage conditions and other factors8. In addition, phytochemical analysis requires more reference samples from multiple populations to account for natural variability14. Among all recently developed methods, DNA-based methods are well-established for identifying plants in mixtures of herbal medicine products15,16.

A precise assessment is of foremost significance for purchasers, customers, patients and researchers along herbal product value chains17 including collectors, processors, harvesters, producers, regulators, traders, distributors, retailers, and traditional and medical practitioners18. Today, the armamentarium of prescription treatment is communicated through ‘pharmacopoeias’, which are standard collections of information on the quality of pharmaceutical drugs, excipients and flavoring correctives. The pharmacopoeia includes information on testing methodologies, purity, storage guidelines, composition and concentration for drugs. Pharmacopoeias ensure the consistency of cures endorsed by delegates of a particular unit and outline required quality principles. However, the regulatory affairs or policies for natural herbal products differ among nations. In some countries, for example, Canada, the United States and countries in the European Union (EU), governing regulatory agencies assess the quality and safety of herbal drugs/medicines before they enter the herbal market, but in practice, activities to control the authenticity and quality of herbal products in the herbal market appear to be limited19. The European Medicines Agency (EMA) updates the European Pharmacopoeias, including the monographs and testing methods in their database20, and the databases provide the most recent monographs and suitable methods for quality estimations of particular herbal drug products21,22. With the availability of accurate and rapid DNA-based techniques, DNA barcoding is now officially recognized as a method for identifying herbal drugs23. DNA barcoding for quality control of herbal drugs is included in the British Pharmacopoeia (BP)22,23, the Pharmacopoeia of the People’s Republic of China24 and the Korean Pharmacopoeia25; the procedure covers plant sampling, DNA isolation, PCR amplification and development of standard reference sequence databases8.

Herein, we aimed to develop a digital reference DNA barcode library of plants listed in the THP and TMM using the nuclear and chloroplast DNA regions and to test for species adulteration in selected herbal products obtained from local markets and the Thai FDA. The centralized digital DNA barcode database developed here will also aid in the identification of any botanicals or herbal products in registration or regulatory processes.

Results

DNA barcoding of selected plants in the THP and TMM

Genomic DNA was successfully extracted from all 101 plant species belonging to 89 genera and 51 families (Table S1). The core DNA barcode regions, namely, the ITS2, matK, rbcL and trnH-psbA intergenic spacer regions, were amplified. In the PCRs, positive and negative control amplifications gave accurate results. All PCR amplicons were clearly segregated and visible as single bands of the expected size. The partial sequence lengths ranged between 228 and 278 bp (average 258) for ITS2, 424 and 478 bp (average 450) for matK, 540 and 580 bp (average 550) for rbcL and 420 and 458 bp (average 428) for the trnH-psbA intergenic spacer. All nucleotide sequences were submitted to NCBI GenBank, and their accession numbers are listed in Table 1.
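The length ranges and averages reported above can be recomputed for any marker directly from a FASTA file of its sequences. The following is a minimal sketch under stated assumptions: the function names parse_fasta and length_summary are hypothetical, and the sequences are toy placeholders, not data from this study.

```python
from statistics import mean

def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records

def length_summary(records):
    """Return (shortest, longest, rounded mean) sequence length in bp."""
    lengths = [len(seq) for seq in records.values()]
    return min(lengths), max(lengths), round(mean(lengths))

# Toy placeholder sequences (assumption), not sequences from this study.
toy_fasta = """>sample_A|rbcL
ATGTCACCACAAACAGAG
>sample_B|rbcL
ATGTCACCACAAACAGAGACTAAAGC
>sample_C|rbcL
ATGTCACCACAAACAGAGACTA
"""
summary = length_summary(parse_fasta(toy_fasta))
print(summary)
```

In practice, one FASTA file per barcode region would be passed to the same two functions to obtain the per-marker range and average.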

Table 1 List of medicinal plants used in this study and their detailed information.

Authentication of herbal products

Genomic DNA was successfully isolated from all twenty herbal products of different dosage forms (Fig. 1; Table S2) and amplified for four barcode regions, namely, the ITS2, matK, rbcL and trnH-psbA intergenic spacer regions. The authenticity of all twenty single-herb formulation products was then tested using our reference DNA barcode database and nucleotide Basic Local Alignment Search Tool (BLAST) analysis of available NCBI GenBank sequences (Table S3). The results confirmed the authenticity of eighteen of the twenty samples tested. The sequences obtained from the other two samples, no. 3 and no. 13, which were purchased from local markets, did not match the names on their labels (Table 2). Sample no. 3 was labeled as Cyanthillium cinereum and sample no. 13 was labeled as Pueraria candollei. However, our nucleotide BLAST results showed that samples no. 3 and no. 13 were Emilia sonchifolia and Butea superba, respectively. All samples provided by the Thai FDA were correct according to their claims. The NCBI GenBank nucleotide BLAST results of these samples are provided in Table 2.
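The comparison of a product label with its best BLAST match can be sketched as follows. This is an illustration only, not the procedure used in this study: the function names, the 98% identity cut-off and the percent-identity values are assumptions chosen for the example, and only the species names are taken from the results above.

```python
def best_hit(hits):
    """Pick the hit with the highest percent identity from (species, identity) pairs."""
    return max(hits, key=lambda hit: hit[1])

def check_label(label, hits, min_identity=98.0):
    """Compare the labeled species with the best BLAST hit.

    min_identity is a hypothetical cut-off (assumption), not a value from the study.
    """
    species, identity = best_hit(hits)
    if identity < min_identity:
        return "inconclusive"
    if species.lower() == label.lower():
        return "authentic"
    return f"mismatch: {species}"

# Percent identities below are illustrative placeholders, not measured values.
hits_sample_3 = [("Emilia sonchifolia", 99.5), ("Cyanthillium cinereum", 91.0)]
verdict = check_label("Cyanthillium cinereum", hits_sample_3)
print(verdict)
```

A single best hit is a deliberately simple rule; a real workflow would also weigh query coverage and agreement among the four barcode regions.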

Figure 1

Different dosage forms of herbal products analyzed in this study.

Table 2 Nucleotide sequence BLAST results of herbal products.

Maximum likelihood phylogenetic analysis

Maximum likelihood (ML) phylogenetic analysis of all reference plant species was performed using the ITS2, matK, rbcL, and trnH-psbA regions. The unrooted phylogenetic tree of the rbcL region showed clear clades, and each cluster represented a specific group of plant species (Fig. 2). Each color represents a monophyletic clade based on plant genera and families, indicating their close phylogenetic relationships. A large number of plant species clustered within the Asteraceae, Fabaceae, Lamiaceae, Rutaceae, and Zingiberaceae families. Bootstrap support values were estimated with 1000 replicates. These findings show that the rbcL region-based phylogenetic tree can be used as an efficient resource for species authentication of Thai medicinal plants. Our unrooted ML phylogenetic tree of reference species mirrored the taxonomic classification of Thai plants listed in the THP and TMM (Fig. 2).

Figure 2

Maximum likelihood tree showing the phylogenetic relationships of reference Thai medicinal plants based on the Kimura-2-parameter (K2P) model using the rbcL region. The bootstrap support values were estimated with 1000 replicates. The respective family names are shown to the right.

Development of a centralized reference DNA barcode database

In this study, a centralized digital reference DNA barcode system for regulating herbal products was developed. The reference DNA barcode database incorporates voucher numbers, scientific names, common names, Thai names, plant habitats, collection forms, plant photographs, herbarium images and other information, such as collection dates, collection locations, collectors, and taxonomists, along with geocoordinates (Fig. 3). All DNA barcode marker information, including genes, gene sequences, and GenBank accession numbers, will be included in the database. By entering the scientific name or Thai name in the search option, the end user can obtain all the information for a particular plant. This first attempt at a digital database system proved to be an efficient tool with which to systematically assess traditional medicines and their herbal products and to connect them with both national and international herbal trade regulators. This database system is a novel concept in Thai herbal development, and its availability to the industry as well as consumers and researchers will bring a noticeable change in the regulation of herbal trade.
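The record structure and name-based lookup described above might be modeled as in the sketch below. It is not the implementation of the database itself: the class BarcodeRecord, its field names and the search function are hypothetical, only a subset of the listed fields is shown, and all voucher numbers, Thai names and accession numbers are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class BarcodeRecord:
    """One reference entry; field names are illustrative assumptions."""
    voucher_number: str
    scientific_name: str
    thai_name: str
    common_name: str = ""
    # Maps a barcode marker (e.g. "rbcL") to its GenBank accession number.
    accessions: dict = field(default_factory=dict)

def search(records, query):
    """Return records whose scientific or Thai name equals the query (case-insensitive)."""
    q = query.strip().lower()
    return [r for r in records
            if q in (r.scientific_name.lower(), r.thai_name.lower())]

# Placeholder entries (assumption): identifiers below are not real voucher or accession numbers.
records = [
    BarcodeRecord("VOUCHER-0001", "Pueraria candollei", "thai-name-1",
                  accessions={"rbcL": "ACCESSION-PLACEHOLDER"}),
    BarcodeRecord("VOUCHER-0002", "Butea superba", "thai-name-2"),
]
matches = search(records, " Butea Superba ")
print([r.voucher_number for r in matches])
```

Exact matching keeps the example short; a production search would more plausibly support partial names and synonyms.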

Figure 3

Overview of the proposed digital reference DNA barcode database.

Discussion

The global markets of herbal drugs are large and increasing every year. However, increasing demand leads to adulteration or substitution in the raw materials10,26,27. Many reports of adverse reactions may often be due to the consumption of unintended herbs, which has directly affected the marketing or campaign of herbal products9,10,12,16,27. Various identification methods, including taxonomic, genomic, and phytochemistry methods, have been used to authenticate herbal products28. However, each method has advantages and limitations. Recently, DNA-based methods have been widely established for the authentication of herbal products12,26,27.

In this study, DNA barcodes of 101 highly traded medicinal plants listed in the THP and TMM of Thailand were developed. Highly traded single-herb formulations, not restricted to closely related plant species, were obtained from a local market and the Thai FDA and tested for their authenticity. For all herbal samples, DNA analysis was performed using our own reference database together with NCBI nucleotide BLAST analysis. Owing to the inherent limitations of single-locus DNA barcoding, an emerging DNA-based or phylogenetic method is needed for the identification of closely related plant species. The utilization of DNA as a source of information for identifying inaccurate plant ingredients on herbal product labels is starting to be explored9,10,16. Four core DNA barcode regions, namely, the ITS2, matK, rbcL, and trnH-psbA intergenic spacer regions, were used to develop a reference DNA barcode library for testing the authenticity of twenty single formulation herbal products. Our analysis indicated that eighteen of the twenty samples were correct according to their labels; the exceptions were samples no. 3 and no. 13, a powder and a capsule product labeled Cyanthillium cinereum and Pueraria candollei, respectively. Nucleotide BLAST results revealed that the Cyanthillium cinereum (sample no. 3) powder was replaced by Emilia sonchifolia, and the Pueraria candollei (sample no. 13) capsules instead contained Butea superba. Similar morphologies and confusion of vernacular names could explain these replacements. Cyanthillium cinereum has high antioxidant activity29 and is used in Thai medicine to reduce smoking withdrawal symptoms and treat skin ailments, as well as asthma, bronchitis, cough, cancer, malaria, gastrointestinal conditions, diuresis, pain, and diabetes30,31. Emilia sonchifolia is used for the treatment of anti-inflammatory stomach tumors, ophthalmia, diarrhea, wounds, intestinal worm infections and bleeding piles32. Pueraria candollei is used to relieve menopausal symptoms, including vasomotor symptoms, reproductive symptoms, depression, and musculoskeletal pain, in estrogen-deficient women33. Butea superba has been used for rejuvenation, for sexual arousal, and to prevent erectile dysfunction34. These results clearly indicate the extent of the problem that might occur due to the use of unauthentic raw drugs in Thai medicine. There were no rbcL reference sequences for sample no. 13 in NCBI GenBank; hence, the Barcode of Life Data System (BOLD) database was used to analyze this sample. Both of these samples were obtained from a local market. Herbal products purchased from local marketplaces may be more likely to contain adulterants or admixtures, especially powder samples, because it is very difficult to differentiate mixed powdered forms. Many previous reports showed that powdered samples had a greater chance of admixture than other forms, for example, the powdered form of ginger (Zingiber officinale Roscoe) admixed with chili powder (Capsicum annuum L.)35 and the powdered form of black pepper (Piper nigrum L.) admixed with chili powder (Capsicum annuum L.)36.

For the purpose of this study, an ML phylogenetic tree of our reference plant species was constructed using all four DNA barcode regions. Among the markers, rbcL is highly conserved, and its sequence query revealed the highest identity with plant species or closely related plant species. However, identification by this marker will not be reliable if the taxonomic identity of the nucleotide sequence in the GenBank database is incorrect. These issues can be resolved by using a phylogenetic tree wherein the incorrectly identified samples are highly likely to be located in unexpected clades37. Our rbcL region phylogenetic tree showed the arrangement of all the plant species in appropriate clades or plant groups, as would be expected based on phylogenetic relationships among the plant species (Fig. 2). Therefore, taxonomic identification using the rbcL region at the species level is more reliable than other regions tested in this study. These results were consistent with those of previous reports that the rbcL region is a suitable candidate region for plant species identification38,39. Previously, the utility of the rbcL region in discriminating land plants was successfully validated40. The use of rbcL has increased due to its high discrimination proportions at low taxonomic levels39. In this study, the matK and trnH-psbA regions were unable to differentiate the plant species, and the ITS2 region showed similar results, with a few of the plants of the same genus clustered with different groups of plants (Fig. S1). Therefore, this study was restricted to the rbcL region-based ML phylogenetic tree; however, multilocus DNA barcode techniques could be used as advanced tools for the accurate identification of medicinal plants.

Numerous adulteration and substitution studies of herbal products have been reported worldwide, including in Thailand. In addition, the international herbal product supply chain repeatedly lacks botanical expertise to provide suitable documentation for the identification of raw herbal materials37. Unfortunately, in Thailand, there is no systematic regulatory mechanism for the quality control of herbal drugs before entering the market. It is very important to use appropriate analytical techniques for herbal products. Through this study, we propose a centralized digital DNA barcode database to aid in the regulatory step of identifying the plants used in herbal products. This reference database incorporates voucher numbers, scientific and common names, Thai names, plant habitats, collection forms, and plant photographs, including herbarium images, and other information such as collection dates, collection locations, and geocoordinates. By using scientific or common names, one can obtain all the information on a particular plant species or herbal product. This database could play a very important role in monitoring or checking medicinal plants or herbal trade and could ensure that all essential information is freely accessible to consumers and regulatory authorities in Thailand. Herbal testing centers and certification facilities will enhance the quality control of herbal products and help regulate the national and international herbal trade. Further, we are planning to extend the test to medicinal and non-medicinal plants available in Thailand. Future research will continue to validate and update the reference DNA barcode library and protocol or procedure for analyzing herbal samples. Furthermore, certification of the ingredients mentioned on herbal product labels using our reference DNA barcode database will continue.

Conclusion

Admixture or adulteration in herbal products is one of the main problems in herbal trade because the identification of herbal ingredients is challenging. Hence, there is an important requirement to develop a reference DNA barcode library or centralized digital database system that could serve as a regulatory database for ensuring the safety and quality of traded herbs. It is very important that the Thai FDA immediately begin to strictly enforce the development of pharmacopeial standards as well as revisions or modifications of existing regulatory guidelines to check or monitor the authenticity of raw materials or herbal products before they enter the herbal market. For quality assessment of herbal products, we strongly recommend incorporating DNA-based methods into the THP and TMM to maintain the safety, quality and efficacy of herbal medicines prior to them entering the market.

Materials and methods

Plant materials and herbal products

Multiple accessions of plant species listed in the THP and TMM were collected from several locations in Thailand (Table 1). Plant collection and field studies followed the standard guidelines of Chulalongkorn University, Thailand. All collections, including the samples from the Thai FDA, were permitted and legal. Herbarium specimens of a total of 101 plant species were prepared, assigned voucher numbers and deposited at the Museum of Natural Medicine, Chulalongkorn University, Bangkok, Thailand. All plant species were identified by an independent expert taxonomist, Associate Professor Thatree Phadungcharoen of the Faculty of Pharmaceutical Sciences, Chulalongkorn University. Details of the collected plant species, with their voucher numbers, respective Thai names and GenBank accession numbers, are provided in Table 1. The binomial names and author citations of the plant species were confirmed according to The Plant List (TPL)41. Seventeen single formulation herbal products from local herbal markets across Thailand and three herbal products from the Thai FDA were analyzed in this study. Herbal sample codes are listed in Table 2.

DNA isolation and PCR amplification

Genomic DNA was isolated from leaves using a DNeasy Plant Mini Kit (Qiagen, Germany) according to the manufacturer’s protocol. PCR amplification was carried out in a 25 µL reaction volume that consisted of 1X PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTP mix, 0.2 mM each of the forward and reverse primers, 0.5 U of Platinum Taq polymerase (Invitrogen, USA) and 30–40 ng of genomic DNA. Amplification was performed with an Eppendorf Mastercycler Gradient (Hamburg, Germany). Universal primers for the barcode regions40 were used: the ITS2 nuclear region (ITS2F: ATTCCCGGACCACGCCTGGCTGA, ref. 42; ITS4: TCCTCCGCTTATTGATATGC, ref. 43) and three chloroplast regions, namely matK (matK_xF: TAATTTACGATCAATTCATTC, ref. 44; matK-MALPR1: ACAAGAAAGTCGAAGTAT, ref. 45), the trnH-psbA intergenic spacer (trnHf_05: CGCGCATGGTGGATTCACAATCC, ref. 46; psbA3_f: GTTATGCATGAACGTAATGCTC, ref. 47) and rbcL (rbcLa-F: ATGTCACCACAAACAGAGACTAAAGC, ref. 48; rbcLa-R: GTAAAATCAAGTCCACCRCG, ref. 49). PCR amplification of the ITS2 and trnH-psbA intergenic spacer regions was performed with an initial denaturation at 95 °C for 4 min followed by 30 cycles of 94 °C for 45 s, 58 °C for 45 s, and 72 °C for 90 s, with a final extension at 72 °C for 7 min. The amplification profiles for matK and rbcL consisted of an initial denaturation at 94 °C for 4 min followed by 30 cycles of 94 °C for 60 s, 55 °C for 45 s, and 72 °C for 90 s, with a final extension step at 72 °C for 10 min. The obtained PCR amplicons were sequenced bidirectionally using their respective primers on an ABI3500 sequencer (Applied Biosystems, USA).
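As a quick check on the two cycling programs, the programmed hold time of each can be added up. The sketch below is an illustration with hypothetical names (PROFILES, programmed_seconds); the step durations are transcribed from the text, and the time spent ramping between temperatures is ignored, so real run times will be longer.

```python
# Hold times in seconds, transcribed from the cycling conditions in the text.
# "cycle" lists the three steps of each cycle in the order given there.
PROFILES = {
    "ITS2 / trnH-psbA": {"initial": 4 * 60, "cycle": (45, 45, 90),
                         "cycles": 30, "final": 7 * 60},
    "matK / rbcL": {"initial": 4 * 60, "cycle": (60, 45, 90),
                    "cycles": 30, "final": 10 * 60},
}

def programmed_seconds(profile):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return (profile["initial"]
            + profile["cycles"] * sum(profile["cycle"])
            + profile["final"])

run_minutes = {name: programmed_seconds(p) / 60 for name, p in PROFILES.items()}
for name, minutes in run_minutes.items():
    print(f"{name}: {minutes} min")
```

Under these assumptions, the ITS2 and trnH-psbA program holds for 101 min and the matK and rbcL program for 111.5 min.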

Genomic DNA of the different dosage forms of herbal products was extracted using a DNeasy Plant Mini Kit (Qiagen, Germany) and further purified using a GENECLEAN Kit (MP Biomedicals, France). DNA isolation from the herbal samples required multiple attempts to obtain good PCR amplification of the ITS2, matK, rbcL and trnH-psbA intergenic spacer regions. Subsequently, all PCR products were sequenced as described above.

DNA sequencing and phylogenetic analysis

The sequences were edited using BioEdit software (version 5.0.6). The edited sequences were used as queries in BLAST analysis to determine their similarity to nucleotide sequences in NCBI GenBank. The sequences with the maximum query coverage, highest identity, and maximum score were downloaded in FASTA format from the database and included in our analysis. The ML method was used to infer the relationships among plant samples with an appropriate model of nucleotide evolution. The final alignment file was imported into MEGA 7 to determine the character information prior to phylogenetic analysis using the Kimura 2-parameter molecular evolution model with 1,000 rapid bootstrapping replicates50.
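For readers unfamiliar with the Kimura 2-parameter model, its pairwise distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below computes this distance for two aligned sequences. It only illustrates the substitution model and does not reproduce the ML tree inference performed in MEGA 7; the function name k2p_distance and the example sequences are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated pairs where the argument is <= 0.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy sequences (assumption): one transition in ten compared sites.
distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(distance, 4))
```

Sites containing gaps or ambiguity codes are skipped pairwise here, which is one common convention; other gap treatments give slightly different distances.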