Introduction

Serenoa repens (W.Bartram) Small–commonly known as saw palmetto–is a palm (Arecaceae) indigenous to the southeastern United States of America (Alabama, Florida, Georgia, Louisiana, Mississippi and South Carolina)1. The closest living relative of S. repens2,3,4, Acoelorrhaphe wrightii (Grisebach & H.Wendland) H.Wendland ex Beccari, occurs in the United States of America (southern Florida), the Bahamas, Cuba, southeastern México (Campeche, Chiapas, Tabasco, Veracruz, Yucatán and Quintana Roo), Belize, Guatemala, Honduras, Nicaragua, Colombia (Isla de Providencia) and Costa Rica1,5,6. Although morphological and molecular data strongly support the close relationship between S. repens and A. wrightii, until recently their relationship to the other species of tribe Livistoneae could not be resolved2,3,4. New data suggest that S. repens and A. wrightii are sister to subtribe Livistoninae (Johannesteijsmannia, Lanonia, Licuala, Livistona, Pholidocarpus, Pritchardiopsis and Saribus) and that the Acoelorrhaphe/Serenoa/Livistoninae clade is in turn sister to Brahea and subtribe Rhapidinae (Chamaerops, Guihaia, Maxburretia, Rhapidophyllum, Rhapis and Trachycarpus)4.

The fruits (drupes) of S. repens are ellipsoid, about 2 cm in length, 1 cm wide, smooth, blue–black when mature (green to yellow–orange when immature)1,7,8. The fruits are eaten by an assortment of wild animals, livestock and people7,8,9. When labeled as saw palmetto, S. repens can be legally sold in the United States of America as an herbal dietary supplement10. In 2011, it was the third most popular supplement with sales totaling more than US$ 18 million11. Although the fruits of S. repens are reported to be useful in the treatment of 51 different medical ailments7, they are most frequently taken to ameliorate benign prostate hyperplasia7,8,9,12. Extracts of S. repens fruit inhibit the conversion of testosterone to dihydrotestosterone by 5α-reductases13,14,15,16. Benign prostate hyperplasia is associated with elevated concentrations of dihydrotestosterone17. Clinical studies report few, mostly mild, adverse events from S. repens consumption18,19, but treatment outcomes vary greatly–on average little success has been reported19.

Wild S. repens grows abundantly on as many as 450,000 hectares7 of coastal sand dunes, mesic hammocks, pine flatwoods and sand–pine scrub1,7,8,9. Each hectare annually produces an average of 200 kg of fruit (range = 100–1,500 kg/hectare)9. The magnitude of annual fruit harvest is unknown, but estimates are as high as 6,800,000 kg7. Almost all of the fruit is harvested from wild plants7,8 and approximately half is picked by independent wildcrafters8. Fruit is often harvested when immature: a final product with a minimum of 10% mature (blue–black) and 60% partially ripened (yellow–orange) fruit is commercially acceptable8.

DNA barcode researchers collectively aim to produce a global public reference library of standardized, high–quality, vouchered DNA sequences that can be used to identify specimens. The protein coding plastid genes matK and rbcL have been sanctioned by the Consortium for the Barcode of Life for use in plant DNA barcoding20,21. Using standard genomic regions allows data and protocols to be shared, making the most of scarce research funds.

We aim to (i) generate and test a DNA barcode reference library for S. repens, (ii) devise a barcode assay capable of unambiguously identifying S. repens and (iii) estimate the frequency of mislabeled saw palmetto herbal dietary supplements on the market in the United States of America.

Results

For this study, 27 matK and 37 rbcL barcode sequences were generated from 37 morphologically identifiable specimens (Table 1; GenBank KF746442–KF746505). Median sequence quality (B30)22 of the newly generated sequences was 0.891 (IQR = 0.829–0.928) for matK and 0.909 (IQR = 0.756–0.939) for rbcL. Trimmed and edited matK sequences were 840 bp in length for A. wrightii and 837 bp in length for all other species examined (A. wrightii has a lysine codon [AAG] inserted at nucleotide position 306). All trimmed and edited rbcL sequences were 607 bp in length.

Table 1 Voucher information

When the newly generated sequences were analyzed in concert with publicly available sequences (Table 2)3,4,23,24,25,26,27,28,29,30,31, no unambiguous matK sequence variation was detected within S. repens (n = 12) or A. wrightii (n = 15). Variation was detected at two rbcL nucleotide positions in S. repens (n = 18): GenBank sequence AJ621936 (ref. 25) had a ‘C’ at nucleotide position 60 whereas all other sequences examined had an ‘A’ at that nucleotide position, and GenBank sequence M81815 (ref. 23) had a ‘C’ at nucleotide position 234 whereas all other sequences examined had a ‘T’. Both nucleotide substitutions are predicted to result in amino acid substitutions. Neither nucleotide substitution has been detected in more than one individual. No rbcL sequence variation was detected in A. wrightii (n = 17).

Table 2 Previously published reference sequences deposited in GenBank that were analyzed alongside newly generated sequences

Serenoa repens and A. wrightii can be consistently distinguished from Brahea, Livistoninae and Rhapidinae by a combination of matK nucleotide positions 802 and 818 (Fig. 1). Serenoa repens, A. wrightii, and Pholidocarpus majadum Becc. (tribe Livistoneae) have a ‘G’ at nucleotide position 818 whereas all other examined species have an ‘A’. Pholidocarpus majadum has an ‘A’ at nucleotide position 802 and thus can be differentiated from S. repens and A. wrightii which have a ‘T’ at that nucleotide position. Serenoa repens and A. wrightii can be differentiated from one another by a three–base insertion in A. wrightii at matK nucleotide position 306. Serenoa repens and A. wrightii can also be differentiated from one another by rbcL nucleotide positions 292 (S. repens has a ‘C’, A. wrightii has a ‘T’) and 398 (S. repens has an ‘A’, A. wrightii has a ‘C’; Fig. 1).
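The decision logic in the preceding paragraph can be sketched as a small lookup. This is an illustrative sketch only, not the software used in this study (identifications were made with BRONX). The function name `identify` and its argument names are hypothetical, and the sketch assumes that the four diagnostic bases have already been read from aligned sequences numbered as in Fig. 1.

```python
from typing import Optional


def identify(matk_802: Optional[str], matk_818: Optional[str],
             rbcl_292: Optional[str] = None,
             rbcl_398: Optional[str] = None) -> str:
    """Hypothetical helper applying the diagnostic positions given in the text."""
    def norm(base: Optional[str]) -> Optional[str]:
        # Treat missing or empty reads as None; compare bases in upper case.
        return base.upper() if base else None

    m802, m818, r292, r398 = map(norm, (matk_802, matk_818, rbcl_292, rbcl_398))

    if m802 is None or m818 is None:
        return "undetermined (matK mini-barcode missing)"
    if m818 == "A":
        # All examined species other than S. repens, A. wrightii and
        # P. majadum have an 'A' at matK position 818.
        return "not S. repens (Brahea, Livistoninae or Rhapidinae)"
    if m818 == "G" and m802 == "A":
        return "Pholidocarpus majadum"
    if m818 == "G" and m802 == "T":
        if r292 is None or r398 is None:
            return "S. repens or A. wrightii (rbcL mini-barcode missing)"
        if (r292, r398) == ("C", "A"):
            return "Serenoa repens"
        if (r292, r398) == ("T", "C"):
            return "Acoelorrhaphe wrightii"
    # Any base combination not described in the text is left unclassified.
    return "unrecognized combination"


print(identify("T", "G", "C", "A"))
print(identify("T", "G"))
```

Returning an explicit "missing" result rather than a best guess mirrors the point made in the text: neither marker alone is sufficient to name S. repens.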

Figure 1

Variable nucleotide positions for mini–barcode sequences.

Diagnostic positions that, in combination, unambiguously differentiate Serenoa repens from its close relatives are highlighted. Nucleotide positions are numbered in reference to Britton et al. 9614 (NY). Periods indicate nucleotides identical to Brahea. Question marks indicate unsequenced positions. The four sequence types (A, B, C and D) found in herbal supplements are reported.

Preliminary attempts to PCR amplify full–length barcode markers from saw palmetto herbal supplements were unsuccessful. Fragmented DNA was determined to be the primary cause of PCR failure–the barcode regions are larger than the average fragment size in DNA extracts of saw palmetto herbal supplements. To overcome DNA fragmentation, novel mini–barcode PCR primers were designed to amplify positions diagnostic of S. repens while limiting the amplicon size to 200 bp or less (Fig. 1). Unfortunately, there are no regions less than 200 bp within matK or rbcL that can consistently distinguish S. repens from the other species examined. A novel matK mini–barcode was designed to span nucleotide positions 802 and 818 and can thus distinguish S. repens and A. wrightii from all of the other species examined. A novel rbcL mini–barcode was designed to span nucleotide positions 292 and 398 and can thus distinguish S. repens from A. wrightii (Fig. 1). In combination these novel mini–barcodes can distinguish S. repens from all of the other species examined.
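The size constraint can be checked with simple arithmetic on the diagnostic positions reported above. The sketch below is illustrative: the helper names are hypothetical, and the span it computes covers only the diagnostic positions themselves, so a real amplicon would be longer once primer binding sites are included.

```python
MAX_AMPLICON_BP = 200  # amplicon size limit stated in the text


def span_bp(positions):
    """Inclusive length of the shortest window covering all positions."""
    return max(positions) - min(positions) + 1


def fits_mini_barcode(positions, limit=MAX_AMPLICON_BP):
    """True if the positions could, in principle, share one amplicon."""
    return span_bp(positions) <= limit


candidates = {
    # Separates S. repens and A. wrightii from the other species examined.
    "matK 802 + 818": [802, 818],
    # Separates S. repens from A. wrightii.
    "rbcL 292 + 398": [292, 398],
    # Adding the A. wrightii insertion at matK 306 to the same amplicon.
    "matK 306 + 802 + 818": [306, 802, 818],
}

for label, positions in candidates.items():
    print(label, span_bp(positions), fits_mini_barcode(positions))
```

Under these assumptions the two mini–barcode spans (17 bp and 107 bp) fit within the limit, whereas a single matK window that also covers position 306 would need at least 513 bp. This is consistent with the use of two separate markers.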

PCR amplification with the novel mini–barcode primer sets worked well on the 31 morphologically identifiable validation samples as well as saw palmetto herbal supplements. Median sequence quality (B30) of validation mini–barcode sequences was 0.633 (IQR = 0.455–0.732) for matK and 0.530 (IQR = 0.386–0.689) for rbcL. All validation samples were correctly identified using the combination of matK and rbcL mini–barcodes [n = 13 S. repens; n = 18 A. wrightii; specificity = 1.00 (95% confidence interval = 0.74–1.00); sensitivity = 1.00 (95% confidence interval = 0.66–1.00)].

Of the 37 saw palmetto herbal supplements examined, amplifiable DNA could be extracted from 34 (92%). At least one mini–barcode could be PCR amplified and sequenced from all 34 samples. Both matK and rbcL mini–barcodes could be PCR amplified and sequenced from 30 of the samples (81%). Mini–barcode analysis conclusively demonstrates that 29 (85%) saw palmetto herbal supplements contain S. repens (Fig. 1, supplement type A). The identity of 3 (9%) supplements could not be definitively determined due to failure of the rbcL mini–barcode to amplify and sequence (Fig. 1, supplement type B). These supplements could contain S. repens, A. wrightii, or a mixture of the two. Two (6%) supplements contain related species that cannot be legally sold as herbal dietary supplements in the United States of America10–one supplement (Fig. 1, supplement type C) is definitively A. wrightii; the other cannot be conclusively identified to species (Fig. 1, supplement type D; it is a species of Brahea, Chamaerops, Guihaia, Johannesteijsmannia, Lanonia, Licuala, Livistona, Maxburretia, Rhapidophyllum, Rhapis, Saribus, or Trachycarpus).
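The percentages in this paragraph use two different denominators: all 37 supplements examined, and the 34 that yielded amplifiable DNA. The short sketch below recomputes them from the counts given in the text; the variable names are hypothetical and rounding to the nearest whole percent is assumed.

```python
EXAMINED = 37      # supplements purchased and examined
BOTH_MARKERS = 30  # supplements yielding both matK and rbcL mini-barcodes

# Supplements per sequence type (Fig. 1), as reported in the text.
type_counts = {"A": 29, "B": 3, "C": 1, "D": 1}
amplifiable = sum(type_counts.values())


def pct(part, whole):
    """Percentage rounded to the nearest whole number (assumed convention)."""
    return round(100 * part / whole)


summary = {
    "amplifiable / examined": pct(amplifiable, EXAMINED),
    "both markers / examined": pct(BOTH_MARKERS, EXAMINED),
    "S. repens (A) / amplifiable": pct(type_counts["A"], amplifiable),
    "undetermined (B) / amplifiable": pct(type_counts["B"], amplifiable),
    "mislabeled (C + D) / amplifiable": pct(
        type_counts["C"] + type_counts["D"], amplifiable),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```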

Discussion

All newly generated matK and rbcL reference sequences exceed the quality requirements of the BARCODE data standard (version 2.3)32.

Intraspecific sequence variation was detected at two rbcL nucleotide positions in previously published23,25 S. repens sequences. Such barcode variation is uncommon in plants–particularly in rbcL20,33,34,35,36,37,38,39,40,41. From the available data, we cannot determine if the variation is real or the result of sequencing error. If genuine, both of these nucleotide substitutions would result in amino acid substitutions. The fact that these variable sites have not been found in more than one individual each strongly suggests that the variation is artifactual. The rbcL mini–barcode does not include these, possibly variable, nucleotide positions and thus these nucleotide positions had no influence on the resulting species identifications (Fig. 1).

Our inability to PCR amplify full–length barcodes from saw palmetto herbal supplements was not unexpected: the processing of plant materials frequently results in highly fragmented DNA, particularly if the samples are heated42,43,44,45,46,47,48,49,50,51. Failure of PCR amplification from degraded DNA samples is frequently reported when amplicons are greater than 200 bp42,43,44,45,46,47,48,49,50,51, thus one cannot expect full–length barcodes to reliably amplify from processed materials given that the median full–length matK barcode is 889 bp (IQR = 880–889)21 and rbcL is uniformly 654 bp21. Mini–barcodes were thus designed to ensure PCR amplification from degraded samples.

Amplifiable DNA could not be extracted from three saw palmetto herbal supplements. There are three possible explanations: (i) the supplements did not contain any S. repens (or closely related species); (ii) the supplements contained S. repens (or closely related species), but the material was processed in such a way that all amplifiable DNA was destroyed; or (iii) amplifiable DNA was present, but PCR inhibitory compounds were co–purified with the DNA. The successful PCR amplification and sequencing of only the matK mini–barcode from four saw palmetto herbal supplements cannot be conclusively explained without assuming that the region containing the rbcL mini–barcode is more sensitive to DNA degradation than the region containing the matK mini–barcode.

The validation experiment conclusively demonstrates that it is possible to distinguish between S. repens and closely related species using a combination of matK and rbcL mini–barcodes (specificity = 1.00; sensitivity = 1.00). Samples can be unambiguously identified provided that both mini–barcodes can be PCR amplified and sequenced. Without the matK mini–barcode, it is not possible to distinguish among S. repens, Brahea and most Rhapidinae (Fig. 1). Without the rbcL mini–barcode, it is not possible to distinguish between S. repens and A. wrightii (Fig. 1).

Two saw palmetto herbal supplements in our sample (6%) were unambiguously mislabeled. One of these supplements contained A. wrightii (Fig. 1). Given the relative rarity of A. wrightii within the native geographic range of S. repens1,6 and the distinct macro–morphological differences between the species (S. repens is an acaulous to short stemmed palm whereas A. wrightii grows in clusters of tall slender stems)1,6, it is difficult to imagine such a misidentification occurring at the point of harvest. It seems most likely that fruits–which appear similar in both species–were misidentified post harvest. We cannot explain the other mislabeled saw palmetto herbal supplement.

Variation in the chemical composition of S. repens fruit and fruit extracts52,53 is commonly cited to explain the mixed treatment outcomes observed in clinical studies19. An alternate explanation is species misidentification. Between 4 and 15% of the samples we examined were misidentified. If our sample is representative, misidentification may account for a substantial portion of the variation observed in clinical studies. To ensure that misidentified materials are not inadvertently used, clinical researchers should authenticate all saw palmetto herbal supplements using the DNA barcode methodology described here.

Methods

Plant material

Reference and validation samples were morphologically identified by the authors (Table 1). Validation samples were arbitrarily selected from the set of morphologically identified samples. Herbal supplements were purchased in retail stores or on–line. The herbal supplements consisted of dry, cut and sifted plant materials (gelatin capsules or compression tablets).

DNA extraction

Samples (10 mg) of dried leaf tissue or herbal supplements were disrupted in a 1.6 mL tube using two stainless steel ball bearings (3 mm) and a TissueLyser (Qiagen) at 30 Hz (2 × 1.5 min). Samples were incubated for 18 h at 42°C with 40 rpm horizontal shaking in 600 μL extraction buffer (8 mM NaCl, 16 mM sucrose, 5.8 mM EDTA, 0.5% [w/v] sodium dodecyl sulphate, 12.4 mM tris [pH 9.1] and 200 μg/mL proteinase K)54. After incubation, 200 μL of 3 M potassium acetate (pH 4.7) was added to each sample. Following 10 min of incubation at 0°C, samples were centrifuged at 14,000 g for 5 min. 600 μL of each sample's aqueous phase was mixed with 900 μL 2 M guanidine hydrochloride in 95% (v/v) ethanol. The mixtures were applied to silica spin columns (Epoch Life Science), 500 μL at a time, by centrifugation at 7,000 g for 1 min. Wash buffer (50% [v/v] ethanol, 10 mM tris [pH 7.4], 0.5 mM EDTA and 50 mM NaCl)55 was applied twice as described above. Columns were dried by centrifugation at 7,000 g for 2 min. Total DNA was eluted in 200 μL 10 mM tris (pH 8.0) by centrifugation at 7,000 g for 1 min.

DNA amplification and sequencing

Markers were amplified in 15 μL Polymerase Chain Reactions (PCR). Each reaction contained 1.5 μL PCR buffer (200 mM tris [pH 8.8], 100 mM KCl, 100 mM (NH4)2SO4, 20 mM MgSO4, 1% [v/v] Triton X-100, 50% [w/v] sucrose, 0.25% [w/v] cresol red), 0.2 μM dNTPs, 48 mM betaine (rbcL mini–barcode only), 0.5 (rbcL only) or 1.0 μM/mL of each amplification primer (Table 3), 0.25 units of Taq polymerase, 0.025 mg/mL bovine serum albumin and 0.5 μL purified DNA.
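Because 1.5 μL of PCR buffer is added to a 15 μL reaction, the buffer components are diluted tenfold. The sketch below works through that arithmetic. It assumes that the concentrations in brackets describe the buffer stock rather than the final reaction, which the text does not state explicitly, and the dictionary keys are hypothetical labels.

```python
REACTION_UL = 15.0  # total reaction volume
BUFFER_UL = 1.5     # volume of PCR buffer per reaction

# Buffer composition as listed in the text (assumed to be stock values).
buffer_stock = {
    "tris (mM)": 200,
    "KCl (mM)": 100,
    "(NH4)2SO4 (mM)": 100,
    "MgSO4 (mM)": 20,
    "Triton X-100 (% v/v)": 1,
    "sucrose (% w/v)": 50,
    "cresol red (% w/v)": 0.25,
}

# Final concentration = stock concentration x (buffer volume / total volume).
final = {name: conc * BUFFER_UL / REACTION_UL
         for name, conc in buffer_stock.items()}

for name, conc in final.items():
    print(f"{name}: {conc:g}")
```

Under this assumption each reaction would contain, for example, 20 mM tris and 2 mM MgSO4.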

Table 3 Amplification and sequencing primers used

The matK reaction mixtures were incubated for 150 sec at 95°C, cycled 10 times (30 sec at 95°C, 30 sec at 56°C, 30 sec at 72°C), cycled 25 times (30 sec at 88°C, 30 sec at 56°C, 30 sec at 72°C) and incubated 10 min at 72°C. The rbcL reaction mixtures were incubated for 150 sec at 95°C, cycled 35 times (30 sec at 95°C, 30 sec at 58°C, 30 sec at 72°C) and incubated 10 min at 72°C. The matK mini–barcode and rbcL mini–barcode reaction mixtures were incubated for 150 sec at 95°C, cycled 35 times (30 sec at 95°C, 30 sec at 60°C) and incubated 10 min at 60°C.
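The three cycling programs can be written down as plain data, which makes them easy to compare. The sketch below encodes each program as stages of (temperature in °C, seconds) holds and sums the programmed hold times. The data layout and function name are hypothetical, and ramping time between temperatures is ignored, so actual run times would be longer.

```python
# Each stage: number of cycles and a list of (temperature_C, seconds) holds.
PROFILES = {
    "matK": [
        {"cycles": 1, "holds": [(95, 150)]},
        {"cycles": 10, "holds": [(95, 30), (56, 30), (72, 30)]},
        {"cycles": 25, "holds": [(88, 30), (56, 30), (72, 30)]},
        {"cycles": 1, "holds": [(72, 600)]},
    ],
    "rbcL": [
        {"cycles": 1, "holds": [(95, 150)]},
        {"cycles": 35, "holds": [(95, 30), (58, 30), (72, 30)]},
        {"cycles": 1, "holds": [(72, 600)]},
    ],
    # Two-step program shared by the matK and rbcL mini-barcodes.
    "mini-barcode": [
        {"cycles": 1, "holds": [(95, 150)]},
        {"cycles": 35, "holds": [(95, 30), (60, 30)]},
        {"cycles": 1, "holds": [(60, 600)]},
    ],
}


def total_seconds(profile):
    """Sum of programmed hold times, excluding temperature ramping."""
    return sum(stage["cycles"] * sum(sec for _, sec in stage["holds"])
               for stage in profile)


for name, profile in PROFILES.items():
    print(f"{name}: {total_seconds(profile) / 60:.1f} min of hold time")
```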

PCR products were treated with ExoSAP-IT (USB) and bidirectionally sequenced on a 3730 automated sequencer (Life Technologies) using the amplification primers and BigDye v3.1 (Life Technologies; High–Throughput Genomics Unit, University of Washington).

Data analysis

Raw chromatograms were processed with KB (version 1.4; Life Technologies) and contigs were created and edited with Sequencher (version 4.10; Gene Codes). Sequence quality was evaluated using B (version 1.2)22 with the quality threshold (q) set to 30.

Publicly available reference sequences were analyzed along with the sequences generated for this study (Tables 1 and 2)3,4,23,24,25,26,27,28,29,30,31. Diagnostic nucleotide positions were located in multiple sequence alignments constructed with MUSCLE (version 3.8)56. Novel mini–barcode primers spanning diagnostic positions were designed with PRIMER3 (version 1.1)57.

Sequences from validation samples and herbal supplements were taxonomically identified using BRONX (version 2.0)58.