Developing single nucleotide polymorphism markers for the identification of pineapple (Ananas comosus) germplasm

Zhou, Lin; Matsumoto, Tracie; Tan, Hua-Wei; Meinhardt, Lyndel W; Mischke, Sue; Wang, Boyi; Zhang, Dapeng

doi:10.1038/hortres.2015.56

Download PDF

Article
Open access
Published: 25 November 2015

Developing single nucleotide polymorphism markers for the identification of pineapple (Ananas comosus) germplasm

Lin Zhou^1,3,
Tracie Matsumoto²,
Hua-Wei Tan³,
Lyndel W Meinhardt¹,
Sue Mischke¹,
Boyi Wang⁴ &
…
Dapeng Zhang¹

Horticulture Research volume 2, Article number: 15056 (2015) Cite this article

4747 Accesses
19 Citations
10 Altmetric
Metrics details

Subjects

Abstract

Pineapple (Ananas comosus [L.] Merr.) is the third most important tropical fruit in the world after banana and mango. As a crop with vegetative propagation, genetic redundancy is a major challenge for efficient genebank management and in breeding. Using expressed sequence tag and nucleotide sequences from public databases, we developed 213 single nucleotide polymorphism (SNP) markers and validated 96 SNPs by genotyping the United States Department of Agriculture - Agricultural Research Service pineapple germplasm collection, maintained in Hilo, Hawaii. The validation resulted in designation of a set of 57 polymorphic SNP markers that revealed a high rate of duplicates in this pineapple collection. Twenty-four groups of duplicates were detected, encompassing 130 of the total 170 A cosmos accessions. The results show that somatic mutation has been the main source of intra-cultivar variations in pineapple. Multivariate clustering and a model-based population stratification suggest that the modern pineapple cultivars are comprised of progenies that are derived from different wild Ananas botanical varieties. Parentage analysis further revealed that both A. comosus var. bracteatus and A. comosus var. ananassoides are likely progenitors of pineapple cultivars. However, the traditional classification of cultivated pineapple into horticultural groups (e.g. ‘Cayenne’, ‘Spanish’, ‘Queen’) was not well supported by the present study. These SNP markers provide robust and universally comparable DNA fingerprints; thus, they can serve as an efficient genotyping tool to assist pineapple germplasm management, propagation of planting material, and pineapple cultivar protection. The high rate of genetic redundancy detected in this pineapple collection suggests the potential impact of applying this technology on other clonally propagated perennial crops.

Assessment of genetic diversity and SNP marker development within peanut germplasm in Taiwan by RAD-seq

Article Open access 25 August 2022

Genetic diversity, population structure, and relationships of apricot (Prunus) based on restriction site-associated DNA sequencing

Article Open access 01 May 2020

Genome-wide SNP discovery and core marker sets for assessment of genetic variations in cultivated pumpkin (Cucurbita spp.)

Article Open access 01 August 2020

Introduction

Pineapple, Ananas comosus (L.) Merr., is a perennial herbaceous fruit crop belonging to the family Bromeliaceae. The crop is cultivated in all tropical and subtropical regions and ranks third in production among noncitrus tropical fruits, following banana (including plantain) and mango. The annual worldwide production reached 21.9 million metric tons in 2012 and the top seven producers (Brazil, Philippines, Thailand, Costa Rica, Indonesia, India, and China) jointly accounted for 90% of the global production (FAO, 2014).¹ The pineapple plant is indigenous to South America.^2,3 The putative center of origin is located in the Paraná–Paraguay River drainages between southern Brazil and Paraguay, based on the diversity distribution of related species and botanical varieties of pineapple in this region.^4–6 However, the eastern part of the Guiana shield has also been hypothesized as the center of domestication for pineapple, based on the variation of chloroplast and nuclear DNA markers, the high level of phenotypic diversity, and the large number of primitive cultigens in this area.^7,8 Pineapple was widely cultivated in tropical Americas before the arrival of Christopher Columbus, the first European to see this fruit, in 1493.⁹

The introduction of pineapple into Asia and the Pacific began with the Spaniards in the early sixteenth century and pineapple reached Africa in mid-sixteenth century.⁴ Since then, there have been multiple introductions and exchanges of germplasm among the pineapple producing countries. ⁴ Although many landraces and traditional cultivars exist in the Americas, only a few cultivars have been dispersed to Asia and Africa for use in commercial production.^4,10 About 70% of the world’s production comes from a single cultivar, Smooth Cayenne,¹⁰ which is a highly productive pineapple excellent for canning.¹¹ The current fresh fruit pineapple market is largely comprised of two cultivars bred by the Pineapple Research Institute, CO-2 and MD-2.¹¹ Developing new cultivars with desirable resistance and postharvest traits will depend on the available germplasm of this species. The United States Department of Agriculture (USDA) - Agricultural Research Service pineapple germplasm collection in Hilo, Hawaii, is one of the major collections in the world, along with the collections maintained by EMBRAPA/CNPMF in Cruz das Almas, Brazil, and by CIRAD-FLHOR in Martinique. As part of the ARS National Clonal Germplasm Repository for Tropical and Subtropical Fruit and Nut Crops, the collection at Hilo currently maintains over 180 accessions of pineapple cultivars and their wild relatives.

As with many other tropical perennial crops, pineapple germplasm is almost exclusively maintained by vegetative propagation, by crowns, slips, suckers, or in vitro culture. Vegetative propagation has allowed the exchange of germplasm as clones among regions, countries, and continents. However, the exchange of vegetative planting materials has also resulted in problems for conservators of pineapple germplasm because records and labels of the cultivars have not always followed the same naming conventions, and accessions have limited information about their correct identity. Therefore, homonyms and synonyms are common among the names of pineapple cultivars and that restricts the sharing of information and materials among pineapple researchers and hampers the use of pineapple germplasm in breeding.^12–14 Another major challenge for pineapple cultivar identification is that the protracted vegetative propagation has led to the accumulation of somatic mutations. Some mutations caused noticeable phenotypic effects and created intra-cultivar variation, which became the target of clonal selection.¹⁰ While these selected mutants are important in horticultural production, it is necessary to identify them so that breeders and genebank curators can efficiently conserve and use these genetic materials.

The utilization of biochemical and DNA molecular markers for pineapple germplasm management has been recently reviewed.¹⁴ Using isozyme markers, Aradhya et al. studied pineapple germplasm and found considerable variation within and between species of Ananas.¹⁵ In the Hawaii pineapple collection, they identified 66 distinct zymotypes that were able to differentiate all species and botanical varieties. Their results also suggested that, rather than genetic divergence due to reproductive isolating barriers, differentiation among the species of Ananas may be due to ecological isolation, and therefore may represent a species complex.

Both dominant DNA markers (amplified fragment length polymorphism, AFLP) and co-dominant markers (restriction fragment length Polymorphism simple sequence repeat, SSR) have been used to assist pineapple cultivar identification and germplasm management.^16–23 In spite of the significant progress in marker-assisted germplasm management over the last 20 years, cultivar identification in pineapple remains a challenging task. Using AFLP markers, Kato et al. characterized 148 A. comosus accessions maintained in the USDA pineapple collection in Hilo, Hawaii.²⁰ They showed that a unique profile for major groups that had been classified by morphological traits, such as ‘Cayenne’, ‘Spanish’, and ‘Queen’, could not be established using AFLP-based DNA fingerprints. SSR markers likewise lacked congruence between phenotype and molecular marker-based classification in pineapple.^22,23 Moreover, neither AFLP nor SSR are the most suitable marker tool for detection of duplicates in the pineapple germplasm.

Single nucleotide polymorphisms (SNPs) are the most abundant class of polymorphisms in plant genomes. Compared to SSR markers, SNP analysis can be done without requiring DNA separation by size and can, therefore, be automated in high-throughput assay formats. The diallelic nature of SNPs facilitates a much lower error rate in allele calling and promotes compatibility between laboratories. These advantages have resulted in the increasing use of SNPs as the markers of choice for accurate genotype identification and diversity analysis in perennial crops, as recently demonstrated in cacao (Theobroma cacao, 2013),²⁴ grapevine (Vitis vinifera, 2011),²⁵ pummelo (Citrus maxima, 2014),²⁶ strawberry (Fragaria spp, 2013),²⁷ tea (Camellia sinensis, 2014), and longan (Dimocarpus longan, 2015). Like other perennial horticulture crops, DNA fingerprinting that uses a small set of SNP markers is in great demand by the pineapple community for a broad range of research and field applications. These applications include, but are not limited to, identification of mislabeled accessions, parentage, and sibship analysis for quality control in breeding and seeds programs, and authentication and traceability to support the production of high-value clones for premium markets. Nonetheless, this most powerful tool for germplasm management has not been applied to pineapple germplasm management.

Ample genomic resources have been developed for pineapple.^14,28–31 The premier online database, ‘PineappleDB’ (http://genet.imb.uq.edu.au/Pineapple/index.html), includes a more than 5,600 expressed sequence tags (ESTs) with 3,383 consensus sequences. The comprehensive sequence, bioinformatics, and functional classification of EST resources are available for text or sequence-based searches. A draft genome of A. comosus has been developed, which covers about 375 Mb (62%) of the estimated 526 Mb genome of this species.¹⁴ These readily available genomic resources provide opportunities for mining new markers to use for pineapple germplasm management and breeding. The objectives of the present study were to develop SNP markers through the data mining of ESTs and transcriptome data and to assess their potential application for pineapple cultivar identification. The results reported herein represent the first validation study of SNPs in pineapple, demonstrating the utility of a transcriptome as an approach for rapid development of a high-quality genotyping tool. These SNP markers, as well as the genotyping method, will be particularly useful for intellectual property rights in varietal protection, germplasm management, and pineapple breeding programs.

Materials and methods

Mining of putative SNPs from EST and nucleotide sequences

All available nucleotide sequences of Ananas spp. were downloaded from NCBI GenBank (http://www.ncbi.nlm.nih.gov, 4 October 2014). Redundant entries were examined and excluded using the CD-HIT program with a 95% sequence similarity threshold. The FASTA-formatted files of pineapple were merged into a single data set for further data mining. Putative EST-SNPs were detected using the QualitySNP program.³² All of these selected clusters included a minimum of six EST sequences, whereas both the minimum redundancy threshold and minimal confidence score required by QualitySNP was set at three. In order to meet the requirements and constraints for primer design, all candidates for SNP markers with less than 50 nucleotides between two neighboring SNPs were removed. A subset of 96 identified SNP sequences was then chosen for design and manufacture of SNP assay.

Validation of putative SNPs

To evaluate the putative SNP markers for suitability of cultivar identification, we used a nanofluidic genotyping system and validated the SNPs for 170 pineapple accessions (Table 1; Supplementary Table S1). The pineapple germplasm samples were from the pineapple collection maintained by the USDA-ARS Tropical Plant Genetic Resources and Disease Unit, at the National Plant Germplasm Repository in Hilo, Hawaii (http://www.ars.usda.gov/main/site_main.htm?modecode=20-40-05-10) were harvested and dried in silica gel. DNA was extracted from dried pineapple leaves with the DNeasy Plant Mini kit (Qiagen Inc., Valencia, CA, USA), which is based on the use of silica as an affinity matrix. The dry leaf tissue was placed in a 2-mL microcentrifuge tube with one quarter-inch ceramic sphere and 0.15 g garnet matrix (Lysing Matrix A; MP Biomedicals. Solon, OH, USA). The leaf samples were disrupted by high-speed shaking in a TissueLyser II (Qiagen Inc.) at 30 Hz for 1 min. Lysis solution (DNeasy kit buffer AP1 containing 25 mg/mL polyvinylpolypyrrolidone), along with RNase A, was added to the powdered leaf samples and the mixture was incubated at 65 °C, as specified in the kit instructions. The remainder of the extraction method followed the manufacturer’s suggestions. DNA was eluted from the silica column with two washes of 50 µL buffer AE, which were pooled, resulting in 100 µL DNA solution. Using a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, DE, USA), DNA concentration was determined by absorbance at 260 nm. DNA purity was estimated by the 260:280 ratio and the 260:230 ratio.

Table 1 Species categories of 170 A. comosus accessions in USDA-ARS pineapple collection at Hilo, Hawaii.

Full size table

Ninety-six putative SNP sequences were submitted to the Assay Design Group at Fluidigm Corp. (South San Francisco, CA, USA) for design and manufacture of primers for a SNPtype genotyping panel. The assays were based on competitive allele-specific PCR, and they enable bi-allelic scoring of SNPs at specific loci (KBioscience Ltd, Hoddesdon, UK). The Fluidigm SNPtype Genotyping Reagent Kit was used according to the manufacturer’s instructions.^34,35 Using these primers, the isolated DNAs were subjected to Specific Target Amplification in order to enrich the SNP sequences of interest.³⁴ Genotyping was performed on a nanofluidic 96.96 Dynamic Array IFC (Integrated Fluidic Circuit; Fluidigm Corp.). This chip automatically assembles PCRs, enabling simultaneous testing of up to 96 samples with 96 SNP markers. The use of a 96.96 Dynamic Array IFC for SNP genotyping of human samples has been described by Wang et al.³³ End-point fluorescent images of the 96.96 IFC were acquired on an EP1 imager (Fluidigm Corp.). The data were analyzed with Fluidigm Genotyping Analysis Software (Fluidigm Corp.).³⁶

Data analysis

Duplicate accessions were identified using pairwise multilocus matching among all individual samples. DNA samples that were fully matched at the genotyped SNP loci were declared the same cultivar or clones. The program GenAlEx 6.5 (2006, 2012) was used for computation.^37,38

After duplicate identification, the redundant samples were removed and descriptive statistics for measuring the informativeness of the SNP markers were calculated based on the remaining distinctive cultivars. The key descriptive statistics included minor allele frequency, observed heterozygosity, expected heterozygosity, Shannon’s information index, and inbreeding coefficient. Computations were carried out using the same program.

A cluster analysis using the neighbor-joining (NJ) method was used to further examine the genetic relationship among accessions. Kinship coefficient was chosen as genetic distance measurement of shared ancestry among the individual accessions. The computation was executed using MICROSATELLITE ANALYZER (MSA, 2003).³⁹ A dendrogram was generated from the resulting distance matrix using the NJ algorithm available in PHYLIP.^40,41 The unrooted tree was visualized using the web-based tool Interactive Tree of Life v2 (http://itol.embl.de/).⁴²

A model-based clustering algorithm implemented in the STRUCTURE software program was applied to the SNP data.⁴³ This algorithm attempted to identify genetically distinct subpopulations based on allele frequencies. The admixture model was applied and the number of clusters (K-value), indicating the number of subpopulations the program attempted to find, was set from 1 to 10. The analyses were carried out without assuming any prior information about the genetic group or geographic origin of the samples. Ten independent runs were assessed for each fixed number of clusters (K), each consisting of 1 × 10⁶ iterations after a burn-in of 2 × 10⁶ iterations. The ΔK value was used to detect the most probable number of clusters and the computation was performed using the online program STRUCTURE HARVESTER.⁴⁴ Of the 10 independent runs, the one with the highest Ln Pr (X|K) value (log probability or log likelihood) was chosen and represented as bar plots.

To test the hypothesis that vars. ananassoides, bracteatus, and erectifolius are the putative progenitors for cultivated pineapples, we applied parentage analysis to verify the origin of the 53 accessions in A. comosus var. comosus (as labeled in Table 1). These cultivars or breeding lines were considered as ‘offspring’ for which parentage analyses were carried out. A. comosus vars. ananassoides, bracteatus, and erectifolius were used as candidate parents. A likelihood-based method implemented in the program CERVUS 3.0 was used for computation.^45,46 For each parent–offspring pair, the natural logarithm of the likelihood ratio (LOD score) was calculated. Critical LOD scores were determined for the assignment of parentage to a group of individuals without knowing the maternity or paternity. Simulations were run for 10 000 cycles, assuming that 10% of candidate parents were sampled, a total of 90% of loci was typed with a 1% typing error rate. The most probable single mother (or father) for each offspring was identified on the basis of the critical difference in LOD scores (D) between the most likely and the next most likely candidate parent at greater than 95 or 80% confidence.^45,46

Results

SNP discovery

A total of 13 203 mRNA nucleotide and 5941 EST sequences from pineapple were gathered using methods previously described. After adapter removal, trimming, and quality control, 18 241 higher quality sequences were selected. The program CAP3,⁴⁷ using default parameters, was used to assemble sequences into 1793 contigs and 11 809 singlets with an average size of 3.59 sequences per contig, among which putative SNPs were detected in only 48 contigs using the QualitySNP program. Each of these selected clusters included a minimum of six EST sequences. In total, we obtained 213 putative SNPs, including 75C/T, 59A/G, 10A/T, 12A/C, 4T/G, 11C/G, 41 indel, and 1 high tri-allelic polymorphism. To select high-quality SNPs for validation, candidate SNP sites with at least 50 bp before and after the site were filtered. We calculated the number of all sequences in a cluster and the number containing the SNP type in this cluster. We then selected 96 SNPs for validation by genotyping the 170 pineapple accessions in the USDA-ARS pineapple collection.

Screening for polymorphic SNP markers

Out of the chosen 96 SNP markers, 80 were successful for genotyping. The failure of the remaining 16 SNPs was likely due to the sequence complexity or the presence of polymorphisms within the flanking sequences. However, among the 80 successful SNPs, 23 were monomorphic across the 170 pineapple samples (i.e. only one SNP variant was identified in all individuals). These monomorphic markers may have resulted from errors in transcriptome sequencing, which then led to the incorrect identification of SNP. It is also possible that some of these SNPs may correspond to rare alleles that were not present in the set of pineapple accessions we analyzed. A total of 57 polymorphic SNPs were retained for further analysis of this sample set. These 57 SNPs were reliably scored across the validation panel and thus were considered true SNPs. The flanking sequences of these 57 SNPs are listed in Table 2.

Table 2 Flanking sequences and SNPs of the 57 polymorphic markers.

Full size table

Cultivar identification

SNP profiles of the multiple accessions from the same pineapple cultivar showed that genotyping results were highly consistent (Table 3). Multilocus matching of SNP fingerprints revealed a high rate of duplicates in this pineapple collection. A total of 130 accessions could be classified into 24 synonymous groups (Table 4). The largest synonymous group, which includes 36 accessions, was found in cultivar Cayenne. It is also noticeable that some accessions within the same synonymous group have apparent morphological differences, despite matching SNP profiles, indicting somaclonal mutation within the synonymous group. For example, Cayenne 7898 QC has atypical yellow flesh color, whereas Cayenne 7898 4N has a white color, but their SNP profiles are the same (Figure 1).

Table 3 Examples of DNA fingerprints based on the array of 57 SNPs for pineapple genotype identification (showing truncated profiles).

Full size table

Table 4 List of 24 synonymous groups, including 130 accessions, identified by SNP markers in USDA pineapple collection, Hilo, Hawaii.

Full size table

Descriptive statistics and clustering analysis of 64 distinctive pineapple accessions

From each of the synonymous groups, only one accession was retained and used for subsequent diversity analysis. Among the 170 genotyped accessions, there were 64 accessions with a unique SNP profile. Descriptive statistics were then computed for the 57 polymorphic SNPs across the 64 pineapple accessions with a unique SNP profile and the result is presented in Table 5. The minor allele frequencies of these 57 SNPs ranged from 0.090 to 0.495 with an average of 0.324. The mean information index was 0.601, ranging from 0.304 to 0.693. The observed heterozygosity ranged from 0.110 to 0.935 with an average of 0.520, whereas the mean expected heterozygosity was 0.414 ranging from 0.164 to 0.500 (Table 5).

Table 5 Minor allele frequency, information index, heterozygosity, and inbreeding coefficient of the 57 SNP loci scored on 64 pineapple accessions.

Full size table

The unrooted NJ tree grouped the 64 accessions into three main clusters (Figure 2). The clustering patterns presented relationships among accessions based on the different botanical varieties or origins from different geographical regions. The first cluster includes all the accessions of A. comosus vars. ananassoides, bracteatus, and erectifolius, as well as the hybrids derived from these related botanical varieties. Within this cluster, vars. ananassoides bracteatus and erectifolius are clearly separated. This cluster also included several cultivated pineapple clones, such as Bogota, Pina Lisa, and Criolla from Colombia, Bermuda from Barbados, Cayenne Lot 520 from Hawaii, Cabezona from Puerto Rico, and Trinidad from Trinidad. The proximity between these cultivars and the two related botanical varieties indicates that these cultivars are either selected or derived from vars. ananassoides and bracteatus. The two Bolivian accessions (N94-92 Short Fruit#1 and N94-92 Long Fruit#2) were labeled as Ananas species in their passport record data. The cluster result showed that they should be A. comosus var. ananassoides or hybrids derived from A. comosus var. ananassoides.

The second cluster comprised of exclusively A. comosus var.comosus, including several well-known cultivars such as Cayenne Hilo, Mauritius, and Antigua. Since these three cultivars represent the reference horticultural groups of ‘Cayenne’ (Cayenne Hilo) and ‘Queen’ (Mauritius and Antigua), respectively, their grouping here, in one main cluster, demonstrated that the differences used to designate membership to these two horticultural groups are relatively small, in comparison with the other botanical varieties. The third cluster includes 26 cultivated pineapples that formed a large and diverse group. Within this large cluster there are several important pineapple cultivars such as Criolla from Mexico, Montelirio from Guatemala, and Pernambuco and Manzana from Brazil. The majority of the accessions in this cluster seemed mainly cultivated in South America.

Assignment test by STRUCTURE

Population stratification of the 64 accessions, based on ΔK value computed by STRUCTURE HARVESTER, revealed two clusters as the most probable number of K (Figures 3 and 4) and this partitioning was largely compatible with the cluster analysis (Figure 2). All the accessions related to var. ananassoides were assigned to one Bayesian cluster, whereas the cultivated germplasm, as well as vars. bracteatus and erectifolius, were grouped in a different cluster. The F1 hybrid of Wild Brazil × Plot 520 was confirmed by analysis with STRUCTURE. In addition, several accessions were classified as hybrids of the two clusters, such as N94-92, F1 Ananassoides × Plot 435, Wild Brazil × Cayenne Lot 520, and Cb 32 (Figures 3 and 4). The result of assignment by STRUCTURE is largely compatible with the result of clustering analysis (Figure 4). All the accessions assigned by STRUCTURE in the cluster of var. ananassoides or its hybrids were in the first cluster of the NJ tree.

Parentage analysis

Among the 52 cultivars and hybrids derived from related botanical varieties, paternal or maternal parents were assigned (>80% confidence level) to 14 accessions (Table 6). A. comosus var. ananassoides was responsible for parentage of three accessions including Bogota, Bermuda, and Pina Lisa, whereas A. comosus var. bracteatus was assigned to parentage of 10 accessions. No parentage was assigned to A. comosus var. erectifolius. The result of parent–offspring assignment is largely compatible with the cluster analysis (Figure 2). Accessions assigned as offspring from the same parent tended to be grouped together in the NJ tree (Figure 2). For example, CB 17 was found to be the likely progenitor for Mauritius, Phu Qui, and Congo, all of which grouped together in the same subcluster in group 3 (Figure 2).

Table 6 Likelihood assignment of parentage of pineapple (A. comosus var. comosus) accessions based on 57 SNP markers with LOD scores above 80% probability.

Full size table

Discussion

Despite substantial progress in genomics research on pineapple, advanced molecular tools to support germplasm management are not available. Developing SNP markers from transcriptome sequences has been considered an efficient strategy for non-model species. In the present study, we validated 96 SNP markers based on the transcriptome sequences of pineapple at various development stages and used them to genotype a diverse panel of cultivated and wild germplasm. We obtained a success rate of approximately 60% for marker validation, which demonstrated that this approach can serve as a shortcut for SNP development. As shown in the present study, even a small set of SNP markers can significantly improve accuracy and efficiency in germplasm management.

Pineapple cultivar identification

Reliable identification of pineapple cultivars is invaluable for germplasm conservation and cultivar protection. In the present study, it has been demonstrated that the set of 57 SNP markers was effective for the assessment of genetic identity of pineapple germplasm. Results from multiple clones of the same cultivar showed 100% concordance, demonstrating that the nanofluidic system is a reliable platform for generating pineapple DNA fingerprints with high accuracy. The present result revealed a high rate of genetic redundancy in this pineapple collection. Some of the identified duplicates are well-documented synonymous cultivars. For example, the Cayenne cultivars are known to be derived from a few ancestral pineapple plants that originated from Cayenne, French Guiana.^10,48 But majority of the clones or synonymous groups have been less known to the pineapple community, such as Pernambuco vs. Sugar Loaf, Spanish Samoa vs. Natal, and Ruby vs. Los Banos. Identification of these clone groups will significantly facilitate the efficient exchange, conservation, and use of pineapple germplasm.

However, caution is needed for the interpretation of genetic redundancy in pineapple. It is well known that somatic mutation is common in pineapple. Many phenotypic traits such as spiny leaves, fruit flesh color, acidity, and sugar content of fruit have been well documented. These somatic mutations are the major source of variation exploited for the selection of new cultivars. For example, the spiny or smooth leaf margins, caused by a single gene,¹⁰ are the signature character for the cultivar group Smooth Cayenne. Such a mutation is difficult to detect when a small set of molecular markers are applied. Similar problems were found in fingerprinting projects dealing with other vegetative propagated crops such as bananas (Musa spp., 2014), ⁴⁹ bread fruit (2015),⁵⁰ and apple (Malus spp., 2012).⁵¹ More comprehensive genomic approaches, such as next-generation sequencing, would be needed to detect which genes or alleles had been changed, thereby causing the phenotypic variation. For this reason, the reduction of identified duplicates in pineapple germplasm genebank needs to be considered on a case-by-case basis. Characterization of phenotypic traits among the synonymous group members is still essential to complement DNA fingerprinting for genotype identification.

Classification of pineapple germplasm

A. comosus is a mostly self-incompatible diploid with 2n=2x=50 chromosomes.^52,53 This species includes five botanical varieties of A. comosus: vars. comosus, ananassoides, parguazensis, erectifolius, and bracteatus, based on the revised classification of Coppens d’Eeckenbrugge and Leal.⁷ The present results show that out of the 170 Ananas accessions maintained in the USDA pineapple collection, there are only 64 distinctive genotypes. Clustering analysis and model-based stratification both showed that A. comosus var. ananassoides differs from A. comosus var. bracteatus, thus supporting the revised taxonomy system that classified A. comosus var. ananassoides and var. bracteatus as two different botanical varieties.⁷ However, accessions of var. erectifolius were found to have high similarity and grouping closely together with the accessions of var. bracteatus. This result differs with a previous report based on isozyme variation,¹⁵ which showed that A. comosus var. erectifolius did not have a distinctive isozyme profile, in comparison with the rest of the A. comosus var. comosus cultivars. Nonetheless, the present study only used two accessions of A. comosus var. erectifolius, which may be a bias in terms of the sample representation. Additional samples of A. comosus var. erectifolius from other genebanks need to be examined and a larger number of SNP markers need to be analyzed to clarify if the classification of A. comosus var. erectifolius should be revised.

The second observation is that several cultivated pineapple accessions (Bogota, Pina Lisa, Bermuda, Cayenne Lot 520, Cabezona, and Trinidad) were grouped together with A. comosus var. ananassoides or A. comosus var. bracteatus, instead of with the rest of the A. comosus var. comosus accessions (Figures 2 and 3). This result indicates that the current system that classifies all cultivated pineapple into a single botanical variety (A. comosus var. comosus) may be questionable. It would be appropriate to consider cultivated pineapple as a complex of different botanical varieties, with possible significant gene flow among them.

The third observation is about the validity of the horticultural classification of pineapple germplasm. Pineapple cultivars are classified into several horticultural groups. The commonly known groups include ‘Abacaxi’, ‘Cayenne’, ‘Maipure’, or ‘Perolera’, ‘Queen’, and ‘Spanish’.^10,54,55 Despite these horticultural groups having been adopted by many users, little investigation has been done to show a genetic basis to reinforce this categorization. Kato et al. examined the efficacy of the horticultural groups and reported that the classifications of ‘Cayenne’, ‘Spanish’ and ‘Queen’ were not well supported by AFLP analysis.²⁰ Shoda et al. analyzed 31 pineapple accessions using SSR markers.²² Their results also showed disagreement between the horticultural type and the results of the SSR analysis. The current study showed that the ‘Cayenne’ cultivars have a distinguishable genetic identity, and most of the affiliated accessions were grouped in a single cluster. However, accessions in the other groups did not appear well clustered. For example, cultivars Mauritius and Antigua are two well-known reference cultivars in the ‘Queen’ group, but in the NJ tree (Figure 2) they were separated in different subclusters, where cv. Antigua showed higher proximity with the ‘Cayenne’ group than with Mauritius. Similar discordance was found between cultivars of the ‘Spanish’ group (Figure 2). Therefore, our results support the previous conclusions of Kato et al.²⁰ and Shoda et al.²² that the classification of pineapple cultivars into horticultural groups lacks consistency in terms of their genetic bases. Revision seems needed on this classification with the support of new evidence generated by DNA markers.

Putative progenitors of pineapple

Parentage analysis showed that both vars. bracteatus and ananassoides can be progenitors of pineapple cultivars (Table 6). This result is in agreement with the fact that both var. bracteatus and var. ananassoides can intercross successfully with var. comosus to produce fertile offspring.^7,56 Coppens d’Eeckenbrugge and Leal hypothesized that var. ananassoides is the likely progenitor of cultivated pineapple, and it is likely that domestication happened in the Guiana shield.⁷ One strong piece of evidence supporting this hypothesis is that all four chloroplast haplotypes that have been identified in cultivated materials are present in the wild var. ananassoides.⁷ On the other hand, var. bracteatus was not considered as a progenitor in this hypothesis, mainly because var. bracteatus appeared to be a homogeneous variety with narrow genetic diversity, which is an unlikely basis for diverse domesticated cultigens of pineapple.⁷ The present result, however, shows that 11 pineapple cultivars (Canterra, Papuri Vaupes Colombia, CB 30, Pina de Castilla, Rondon, Congo, Phu Qui, Mauritius, Cheese pine), which are dispersed across different clusters as shown in the NJ tree, Figure 2), could have their parentage (either male or female) traced back to var. bracteatus (Table 6). Ananas comosus var. bracteatus is native to Brazil, Bolivia, Argentina, Paraguay, and Ecuador but not to the Guiana shield. The present result thus indicates the possibility that pineapple could have been domesticated at multiple sites, involving both var. ananassoides and var. bracteatus. The Parana-Paraguay river drainage area could be one of the domestication sites, since both var. bracteatus and var. ananassoides are indigenously distributed in this area.^4,5 Geographically disparate origins of crop domestication are not uncommon in the Americas, as in the case of common bean (Phaseolus vulgaris), chili pepper (Capsicum spp.), potato (Solanum spp.), and cacao (T. cacao), as reviewed by Clement et al.⁵⁷

In conclusion, we conducted a study to develop a set of SNP markers for pineapple and employed them for fingerprinting the USDA’s pineapple collection, using a nanofluidic array. This approach enabled us to generate high-quality SNP profiles for the purpose of pineapple cultivar identification. This is a highly useful tool for genebank management, which will also lead to more efficient crop improvement and, furthermore, has the potential to protect intellectual property rights of breeders. Our result also generated significant insight regarding the origin and domestication of pineapple. Efforts to sequence multiple cultivars from the same synonymous groups with somaclonal mutations are underway, in order to gain a comprehensive understanding about the genetic basis for mutation-based changes in important agronomic traits. This information will be highly useful for verification of pineapple cultivars and will improve the efficiency of pineapple genebank operation. The high rate of genetic redundancy detected in this collection, also suggests the potential impact of applying this technology on other tropical perennial crops.

References

Food and Agriculture Organization of the United Nations. Statistical databases. United Nations: FAO, 2014. Available at http://faostat.fao.org/site/567/DesktopDefault.aspx?PageID=567#ancor (accessed December 31, 2014).
Baker KF, Collins JL . Notes on the distribution and ecology of Ananas and Pseudananas in South America. Am J Bot 1939; 26: 697–702.
Article Google Scholar
Pearsall DM . The origins of plant cultivation in South America. In: Cowan CW, Watson PJ, editors. The Origins of Agriculture: An International Perspective. Washington, DC: Smithsonian Institution Press; 1992, pp 173–205.
Google Scholar
Bartholomew DP, Paull RE, Rohrbach KG . The Pineapple: Botany, Production and Uses. Wallingford: CABI Publishing; 2003.
Book Google Scholar
Purseglove JW . Tropical Crops. Monocotyledons. London: Longman; 1972. pp 75–91.
Google Scholar
Loison-Cabot C . Origin, phylogeny and evolution of pineapple species. Fruits 1992; 47: 25–32.
Google Scholar
Coppens d'Eeckenbrugge G, Leal F . “Chapter 2: Morphology, Anatomy, and Taxonomy”. In: Bartholomew DP, Paull RE, Rohrbach KG, editors. The Pineapple: Botany, Production, and Uses. Wallingford: CABI Publishing; 2003. pp 13–32.
Chapter Google Scholar
Coppens d’Eeckenbrugge G, Sanewski GM, Smith MK et al. Ananas. In: Kole C editor. Wild Crop Relatives: Genomic and Breeding Resources, Tropical and Subtropical Fruits. Berlin and Heidelberg: Springer-Verlag; 2011, pp 21–41.
Chapter Google Scholar
Morrison SE . Journals and Other Documents of the Life and Voyages of Christopher Columbus. New York: Heritage Press; 1973.
Google Scholar
Collins JL . The Pineapple, Botany, Utilisation, Cultivation. London: Leonard Hill Ltd; 1960. p294.
Google Scholar
Bartholomew DP, Hawkins RA, Lopez JA . Hawaii pineapple: the rise and fall of an industry. HortScience 2012; 47: 1390–1398.
Article Google Scholar
Hidayat T, Abdullah FI, Kuppusamy C, Samad AA, Wagiran A . Molecular identification of Malaysian pineapple cultivar based on internal transcribed spacer region. APCBEE Procedia 2012; 4: 146–151.
Article CAS Google Scholar
Sripaoraya S, Marchant R, Power JB, Davey MR . Herbicide-tolerant transgenic pineapple (Ananas comosus) produced by microprojectile bombardment. Ann Bot 2001; 88: 597–603.
Article CAS Google Scholar
Zhang J, Liu J, Ming R . Genomic analyses of the CAM plant pineapple. J Exp Bot 2014; 65: 3395–3404.
Article PubMed Google Scholar
Aradhya MK, Zee F, Manshart RM . Isozyme variation in cultivated and wild pineapple. Euphytica 1994; 79: 87–99.
Article CAS Google Scholar
Noyer JL, Lanaud C, Duval MF, Coppens G . RFLP study on rDNA variability in Ananas genus. Acta Hort 1997; 425: 153–160.
Article CAS Google Scholar
Duval MF, Noyer JL, Perrier X, Coppens d’Eeckenbrugge G, Hamon P . Molecular diversity in pineapple assessed by RFLP markers. Theor Appl Genet 2001; 1: 83–90.
Article Google Scholar
Duval MF, Buso GSC, Ferreira FR et al. Relationships in Ananas and other related genera using chloroplast DNA restriction site variation. Genome 2003; 46: 990–1004.
Article CAS PubMed Google Scholar
Popluechai S, Onto S, Eungwanichayapant PD . Relationships between some Thai cultivars of pineapple (Ananas comosus) revealed by RAPD analysis. Songklanakarin J Sci Technol 2007; 29: 1491–1497.
Google Scholar
Kato CY, Nagai C, Moore PH et al. Intra-specific DNA polymorphism in pineapple (Ananas comosus (L.) Merr.) assessed by AFLP markers. Genet Resour Crop Evol 2005; 51: 815–825.
Article Google Scholar
Wöhrmann T, Weising K . In silico mining for simple sequence repeat loci in a pineapple expressed sequence tag database and cross-species amplification of EST-SSR markers across Bromeliaceae. Theor Appl Genet 2011; 123: 635–647.
Article PubMed Google Scholar
Shoda M, Urasaki N, Sakiyama S et al. DNA profiling of pineapple cultivars in Japan discriminated by SSR markers. Breeding Sci 2012; 62: 352–359.
Article CAS Google Scholar
Feng S, Tong H, Chen Y et al. Development of pineapple microsatellite markers and germplasm genetic diversity analysis. BioMed Res Int 2013; 2013: 11.
Google Scholar
Ji K, Zhang D, Motilal LA, Boccara M, Lachenaud P, Meinhardt LW . Genetic diversity and parentage in farmer varieties of cacao (Theobroma cacao L.) from Honduras and Nicaragua as revealed by single nucleotide polymorphism (SNP) markers. Genet Resour Crop Evol 2013; 60: 441–453.
Article Google Scholar
Cabezas JA, Ibáñez J, Lijavetzky D et al. A 48 SNP set for grapevine cultivar identification. BMC Plant Biol 2011; 11: 153.
Article CAS PubMed PubMed Central Google Scholar
Wu GA, Prochnik S, Jenkins J et al. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication. Nature Biotechnol 2014; 32: 656–662.
Article CAS Google Scholar
Ge AJ, Han J, Li XD et al. Characterization of SNPs in strawberry cultivars in China. Genet Mol Res 2013; 12: 639–645.
Article CAS PubMed Google Scholar
Koia JH, Moyle R, Botella JR . Microarray analysis of gene expression profiles in ripening pineapple fruits. BMC Plant Biol 2012; 12: 240.
Article CAS PubMed PubMed Central Google Scholar
Moyle R, Fairbairn DJ, Ripi J et al. Developing pineapple fruit has a small transcriptome dominated by metallothionein. J Exp Bot 2005; 56: 101–112.
Article CAS PubMed Google Scholar
Neuteboom LW, Kunimitsua WY, Webb D, Christopher DA . Characterization and tissue-regulated expression of genes involved in pineapple (Ananas comosus L.) root development. Plant Sci 2002; 5: 1021–1035.
Article Google Scholar
Ong WD, Voo L-YC, Kumar VS . De Novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing. PLoS One 2012; 7: e46937.
Article CAS PubMed PubMed Central Google Scholar
Tang J, Vosman B, Voorrips RE, van der Linden CG, Leunissen JA . QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species. BMC Bioinformatics 2006; 7: 438.
Article PubMed PubMed Central Google Scholar
Zhang HN, Wei YZ, Shen JY et al. Transcriptomic analysis of floral initiation in litchi (Litchi chinensis Sonn.) based on de novo RNA sequencing. Plant Cell Rep 2014; 33: 1723–1735.
Article CAS PubMed Google Scholar
Platel RK, Jain M . NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 2012; 7: e30619.
Article Google Scholar
Wang J, Lin M, Crenshaw A et al. High-throughput single nucleotide polymorphism genotyping using nanofluidic dynamic arrays. BMC Genomics 2009; 10: 561.
Article PubMed PubMed Central Google Scholar
Fluidigm. Fluidigm SNP Genotyping User Guide Rev H1, PN 68000098. South San Francisco, CA: Fluidigm Corporation; 2011.
Peakall R, Smouse PE . Genalex 6: Genetic analysis in excel. Population genetic software for teaching and research. Mol Ecol Notes 2006; 6: 288–295.
Article Google Scholar
Peakall R . Smouse PE. GenAlEx 6.5: Genetic analysis in excel. Population genetic software for teaching and research-an update. Bioinformatics 2012; 8: 2537–2539.
Article Google Scholar
Dieringer D, Schlötterer C . Microsatellite analyser (MSA): Aplatform independent analysis tool for large microsatellite datasets. Mol Ecol Notes 2003; 3: 167–169.
Article CAS Google Scholar
Saitou N, Nei M . The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 1987; 4: 406–425.
CAS PubMed Google Scholar
Felsenstein J . PHYLIP-phylogeny inference package (version 3.2). Cladistics 1989; 5: 164–166.
Google Scholar
Letunic I, Bork P . Interactive tree of life v2: Online annotation and display of phylogenetic trees made easy. Nucleic Acids Res 2011; 39: W478–W478.
Article Google Scholar
Pritchard JK, Stephens M, Donnelly P . Inference of population structure using multilocus genotype data. Genetics 2000; 155: 945–959.
CAS PubMed PubMed Central Google Scholar
Earl DA, vonHoldt BM . Structure Harvester: website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour 2012; 4: 359–361.
Article Google Scholar
Marshall TC, Slate J, Kruuk LEB, Pemberton JM . Statistical confidence for likelihood-based paternity inference in natural populations. Mol Ecol 1998; 7: 639–655.
Article CAS PubMed Google Scholar
Kalinowski ST, Taper ML, Marshall TC . Revising how the computer program CERVUS accommodates genotyping error increases success in paternity assignment. Mol Ecol 2007; 16: 1099–1106.
Article PubMed Google Scholar
Huang X, Madan A . CAP3: A DNA sequence assembly program. Genome Res 1999; 9: 868–877.
Article CAS PubMed PubMed Central Google Scholar
NoyerJL. Preliminary study of genetic diversity of the genus Ananas by RFLP. Fruits (France) 1991; 46: 372–375.
Google Scholar
Irish BM, Cuevas HE, Simpson SA et al. Musa spp. germplasm management: microsatellite fingerprinting of USDA-ARS national plant germplasm system collection. Crop Sci 2014; 54: 2140–2151.
Article Google Scholar
Zerega N, Wiesner-Hanks T, Ragone D et al. Diversity in the breadfruit complex (Artocarpus, Moraceae): Genetic characterization of critical germplasm. Tree Genet Genomes 2015; 11: 4.
Article Google Scholar
Gross BL, Volk GM, Richards C . Identification of “Duplicate” accessions within the USDA-ARS national plant germplasm system Malus collection. JASHS 2012; 5: 333–342.
Google Scholar
Marchant CJ . Chromosome evolution in the Bromeliaceae. Kew Bulletin 1967; 21: 161–168.
Article Google Scholar
Brown GK, Palací CA, Luther HE . Chromosome numbers in Bromeliaceae. Am J Bot 1997; 76: 85–88.
Google Scholar
Leal F, Soule J . Maipure, a new spineless group of pineapple cultivars. HortScience 1977; 12: 393–403.
Google Scholar
Wee YC, Thongtham MLC . Ananas comosus (L.) Merr. In: Verheij EWM, Coronel RE, editors. Plant Resources of South-East Asia No. 2 Edible Fruits and Nuts. Wageningen: The Netherlands, Pudoc; 1991. pp66–71.
Google Scholar
Coppens d'Eeckenbrugge G, Leal F, Duval MF . Germplasm resources of pineapple. Hort Rev 1997; 21: 133–175.
Google Scholar
Clement CR, Cristo-Araújo MD, Coppens D’Eeckenbrugge G, Pereira AA, Picanço-Rodrigues D . Origin and domestication of native amazonian crops. Diversity 2010; 2: 72–106.
Article Google Scholar

Download references

Acknowledgements

We would like to give special thanks to Carol Mayo-Riley for sampling the pineapple germplasm and organizing the passport information; Stephen Pinney for assisting SNP genotyping using nanofluidic array. This work was partially supported by the National Natural Science Foundation of China (30800884, 31370688). References to a company and/or product by the USDA are only for the purposes of information and do not imply approval or recommendation of the product to the exclusion of others that may also be suitable.

Author information

Authors and Affiliations

Sustainable Perennial Crops Laboratory, USDA-ARS, Beltsville Agricultural Research Center, Beltsville, 20705, MD, USA
Lin Zhou, Lyndel W Meinhardt, Sue Mischke & Dapeng Zhang
Daniel K. Inouye Pacific Basin Agricultural Research Center, USDA-ARS, Hilo, 96720, HI, USA
Tracie Matsumoto
College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
Lin Zhou & Hua-Wei Tan
Yunnan Forestry Technological College, Kunming, 650224, Yunnan, China
Boyi Wang

Authors

Lin Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Tracie Matsumoto
View author publications
You can also search for this author in PubMed Google Scholar
Hua-Wei Tan
View author publications
You can also search for this author in PubMed Google Scholar
Lyndel W Meinhardt
View author publications
You can also search for this author in PubMed Google Scholar
Sue Mischke
View author publications
You can also search for this author in PubMed Google Scholar
Boyi Wang
View author publications
You can also search for this author in PubMed Google Scholar
Dapeng Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dapeng Zhang.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplemental Information for this article can be found on the Horticulture Research website (http://www.nature.com/hortres).

Supplementary information

Supplementary Table S1 (PDF 49 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 Unported License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Zhou, L., Matsumoto, T., Tan, HW. et al. Developing single nucleotide polymorphism markers for the identification of pineapple (Ananas comosus) germplasm. Hortic Res 2, 15056 (2015). https://doi.org/10.1038/hortres.2015.56

Download citation

Received: 30 September 2015
Revised: 24 October 2015
Accepted: 25 October 2015
Published: 25 November 2015
DOI: https://doi.org/10.1038/hortres.2015.56

This article is cited by

Cultivar-specific markers, mutations, and chimerisim of Cavendish banana somaclonal variants resistant to Fusarium oxysporum f. sp. cubense tropical race 4
- Bo-Han Hou
- Yi-Heng Tsai
- Ho-Ming Chen
BMC Genomics (2022)
Complete chloroplast genome of Lilium ledebourii (Baker) Boiss and its comparative analysis: lights into selective pressure and adaptive evolution
- Morteza Sheikh-Assadi
- Roohangiz Naderi
- Vahid Shariati
Scientific Reports (2022)
DNA fingerprinting reveals varietal composition of Vietnamese cassava germplasm (Manihot esculenta Crantz) from farmers’ field and genebank collections
- John Ocampo
- Tatiana Ovalle
- Luis A. Becerra Lopez-Lavalle
Plant Molecular Biology (2022)
Complete chloroplast genomes shed light on phylogenetic relationships, divergence time, and biogeography of Allioideae (Amaryllidaceae)
- Ju Namgung
- Hoang Dang Khoa Do
- Joo‑Hwan Kim
Scientific Reports (2021)
Population structure analysis and association mapping for iron deficiency chlorosis in worldwide cowpea (Vigna unguiculata (L.) Walp) germplasm
- Waltram Ravelombola
- Jun Qin
- Wei Yang
Euphytica (2018)

Subjects

Abstract

Similar content being viewed by others

Introduction

Materials and methods

Mining of putative SNPs from EST and nucleotide sequences

Validation of putative SNPs

Data analysis

Results

SNP discovery

Screening for polymorphic SNP markers

Cultivar identification

Descriptive statistics and clustering analysis of 64 distinctive pineapple accessions

Assignment test by STRUCTURE

Parentage analysis

Discussion

Pineapple cultivar identification

Classification of pineapple germplasm

Putative progenitors of pineapple

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Search

Quick links