Abstract
Buckwheat is a nutritional and economically crop belonging to Polygonaceae, Fagopyrum. To better understand the mutation patterns and evolution trend in the chloroplast (cp) genome of buckwheat, and found sufficient number of variable regions to explore the phylogenetic relationships of this genus, two complete cp genomes of buckwheat including Fagopyrum dibotrys (F. dibotrys) and Fagopyrum luojishanense (F. luojishanense) were sequenced, and other two Fagopyrum cp genomes were used for comparative analysis. After morphological analysis, the main difference among these buckwheat were height, leaf shape, seeds and flower type. F. luojishanense was distinguishable from the cultivated species easily. Although the F. dibotrys and two cultivated species has some similarity, they different in habit and component contents. The cp genome of F. dibotrys was 159,320 bp while the F. luojishanense was 159,265 bp. 48 and 61 SSRs were found in F. dibotrys and F. luojishanense respectively. Meanwhile, 10 highly variable regions among these buckwheat species were located precisely. The phylogenetic relationships among four Fagopyrum species based on complete cp genomes was showed. The results suggested that F. dibotrys is more closely related to Fagopyrum tataricum. These data provided valuable genetic information for Fagopyrum species identification, taxonomy, phylogenetic study and molecular breeding.
Similar content being viewed by others
Introduction
Buckwheat is an ancient dicotyledonous crop belonging to Polygonaceae, Fagopyrum, which is widely distributed around the world1. The Fagopyrum was classified into 25 species, which including two recently identified wild species F. luojishanense and F. hailuogouense 2, 3. Buckwheat is a kind of traditional and economically food crop in Southwest China with the advantages of short growth period, high adaptability, high yield and resistance to harsh environments where the major food crops may hard to survive. At the same time, buckwheat also known for their high-quality proteins and rich in many rare components including flavonoids and phytosterols which have healing effects on some chronic diseases4 and play an important role in anti-oxidation metabolism. Due to its high content of starch, microelement, lysine and methionine, buckwheat also has been paid attention to its feeding value in recent years. At present, the potential commercial values of buckwheat have been widely paid attention. It was difficult for the buckwheat species identification, especially for the wild buckwheat, which impede the effective exploration and utilization of the resource advantage. Morphological research is the foundation of classification and evolution research. Meanwhile, the biological components and nutritions of plants which influenced by weather and metabolism are normally different among species. Based on morphological characteristics, the Polygonaceae family which is morphologically diverse and distributed was divided into two classical subfamilies, Polygonoideae Eaton and Eriogonoideae Arn 5. It has been supported by molecular phylogenies with a broad sampling of genera6.
During the past years, the use of cp DNA sequence data has been proved useful for phylogenetic relationships and inferring phylogenetic relationships in genus Fagopyrum at taxonomic levels. Ohnishi et al. classified 11 species of Fagopyrum based on their morphology, isozyme variability and RFLP of cp DNA7. After that, Yasui et al. analyzed the 3′ of rbcL, the accD 5′ end regions and the ITS regions of 12 species of Fagopyrum, they found the polyploidization in F. cymosum might have occurred more than once8. On the other hand, Ohsako et al. studied the intra- and interspecific phylogeny of Fagopyrum using trnC-rpoB, and trnK/matK in the urophyllum group, and results showed that these two regions can be used to study genetic relationships at lower taxonomic levels9. The maturase K (matK) gene of the chloroplast genome has been provided good resolution in phylogenetic analysis of several groups within genus Fagopyrum. In 2001, Ohsako et al. used trnK/matK regions of cp DNA to confirm the phylogenetic relationship of two wild Fagopyrum (Polygonaceae) species10. Two years later, Nishimoto et al. used two nucleotide genes (FLO/LFY and AG) and three segments of cp DNA to analyze the species of Fagopyrum, however their results showed that the topologies were partially incongruent between the nuclear and chloroplast DNA phylogenies11. All of these reports mentioned above demonstrated that cp DNA analysis is useful in Fagopyrum classification, but they only focused on some parts of the chloroplast genome. It became a major problem, only using some portions of the chloroplast genome for phylogenetic research, the incomplete dates will limit the development of plant evolutionary studies12. Owing to the incomplete sampling, insufficient phylogenetic markers and complex evolutionary issues, the phylogenetic relationships among Fagopyrum species are still not fully understood.
Chloroplast genome sequences could provide sufficient information for analysis and comparison the diversifications among plant species. It contains a series of function genes which play a key role in plant cells, especially orchestrate the photosynthesis process, sugar synthesizing and certain fixation12. On the other hand, the chloroplast DNA also has the high copy number which makes it could be easily extracted from samples. The chloroplast sequence is more conserved than nuclear sequence13, the non-recombinant and mostly uniparental inheritance characteristics of the chloroplast genome makes the gene content and genome structure is highly conservative in plants14. But it also exhibit different mutation events15, which will provide many sequences divergence information between species and it also could been used as a potential tool for evolutionary, taxonomy, phylogeny studies in plants16,17,18,19. All in all, the sequencing data of chloroplast DNA could be applied to DNA barcoding20, phylogeny reconstruction21 and transplastomic studies22.
Our research group has devoted to the collection of buckwheat germplasm resources for decades. F. luojishanense is a new wild buckwheat specie which was discovered recently2. It is a transitional species of buckwheat, both phylogenetic and genomic studies are lacking for this species. While the F. dibotrys belongs to Fagopyrum genus, which has been used as herb-medicine for many years, but the affinity of F. dibotrys also remain debatable and the relationship between F. dibotrys and other two cultivated buckwheat (F. esculentum and F. tataricum) was still controversy. To discuss the clear taxonomic status, additional sequence information such as chloroplast genome sequence is highly desirable. Until recently, there are only two complete chloroplast genome of buckwheat was sequenced and reported23, 24, which is insufficient for discussing the evolutionary relationships among buckwheat species in many cases. At the same time, the reported biomarkers like ITS and matK sequences could not to reveal the phylogenetic relationships in this genus perfectly and resolve the species identification problems accurately. Meanwhile, the results of different molecular markers also have considerable differences. The more important is the sequence of wild buckwheat chloroplast genomes not only could be utilized for its phylogenetic studies but also could provide more useful information for other practical aspects. For example, we can highlight the physiologically important traits genes in buckwheat25 and developing species specific transformation vectors26 through analyse the chloroplast genome sequence. All in all, the research of the buckwheat chloroplast genome sequence opens an avenue for the application of buckwheat plastid genetic engineering. And it is prerequisite for an efficient breeding program that assessing genetic variability within Fagopyrum and its variation among populations.
In the present study, we compared cultivated buckwheat and wild buckwheat in morphology aspect. At the same time, we sequenced chloroplast genomes of two wild buckwheat species and comparative analyzed four complete chloroplast genomes including cultivated buckwheat species. After that, we also discussed the phylogenetic relationship between the new species and some other species. Finally, we tried to answer the following questions: (1) Are the differences among buckwheat in morphology related with the chloroplast genomes evolution? (2) What is the typical structural pattern of chloroplast genomes in buckwheat? (3) What kind of mutation events happened among the four chloroplast genomes? (4) Identified highly variable regions by compared the four complete chloroplast genomes, which could be utilized as potential markers or candidate DNA barcoding for phylogenetic analysis. To our knowledge, this paper is the first report of different wild buckwheat’s complete chloroplast genome sequences and it is also the first comprehensive analysis based on complete chloroplast genomes for wild buckwheat with cultivated buckwheat. Using the Fagopyrum chloroplast genome sequences for the phylogenetic analysis could demonstrate the evolution of each species and provide abundant information for potential biomarkers to the species identification, taxonomy, phylogenetic research and assist in utilization and exploration of wild buckwheat.
Results
Morphological Analysis
Phylogenetic relationships based on morphological characteristics were discussed in this section. The morphological characteristics of different buckwheat species indicated that the differences among four buckwheat including plant height, stem thickness, node number, pistil number, inflorescence, seeds luster and so on. Results reflected the morphological differences of four buckwheat species were shown in Table 1. On the other hand, Fagopyrum species contained a variety of nutrients like flavonoids and amino acids. In the present study, we also measured the agronomy data of different buckwheat, such as thousand grain weight, protein, amino acid, flavonoids content and so on, which were showed in Supplementary Table S1. The values represent means of different independent replicates ± SE, respectively. From the results, the morphological characteristics of wild buckwheat species were prominently different from the cultivated species. Meanwhile, the amino acid content of the four different buckwheat was basically different, and the flavonoids in F. dibotrys and F. tataricum were higher than other two species. At the same time, in order to further determine and evaluate the differences of four Fagopyrum species, eight indicators were processed by principal component analysis (PCA). The scatter plot constructed by two discriminate components based on PCA was shown in Fig. 1. The components 1 explained 51.9% of the variance, and components 2 explained 47.4%. It shown that the wild buckwheat F. luojishanense was easily distinguished from other three buckwheat, but it still hard to discriminate the other three buckwheat species clearly. In summary, the PCA result based on combining the morphology data and agronomy data could not well determine the differences among the four buckwheat species. The relationships of these four species were still unclear when only using morphological and agronomy characteristics data. Therefore, other analysis should be used to obtain better and reliable results. And these differences probable derived from the chloroplast genome of buckwheat, and this hypothesis also need further comparative analysis of complete chloroplast genome sequences.
Chloroplast Genome Features of F. dibotrys and F. luojishanense
In this paper, the chloroplast genomes of two wild buckwheat species were sequenced using the Illumina HiSeq. 4000 system. The F. luojishanense and F. dibotrys produced 17,562,298 and 29,833,124 paired-end raw reads after sequencing. In these paired-end reads, 203,148 and 1,154,010 chloroplast genome reads were extracted after aligned with the reference genomes, and the content of chloroplast reads was 1.16% and 3.87%. The coverage of F. luojishanense and F. dibotrys chloroplast genomes were 191 and 1086 respectively. The entire chloroplast genome of F. luojishanense consisted of 159,265 bp nucleotides and the chloroplast genome of F. dibotrys had a length of 159,320 bp. The map of gene was shown in Fig. 2. The results of comparison of the four complete chloroplast genomes contents were shown in Table 2. The Fagopyrum chloroplast genomes include a pair of IR regions of 30,870 bp in F. luojishanense and 30,817 bp in F. dibotrys. And it was separated by a LSC region of 84,431 bp in F. luojishanense and 84,422 bp in F. dibotrys and a SSC region of 13,094 bp in F. luojishanense and 13,264 bp in F. dibotrys. The average GC content was 37.8% in F. luojishanense and 37.9% in F. dibotrys respectively, which was almost similar value with each other among the four complete Fagopyrum chloroplast genomes. Both of the two Fagopyrum chloroplast genomes contained 114 different functional genes, including 81 protein-coding genes, 29 tRNA genes, and 4 rRNA genes, which were identical to those of other published Fagopyrum chloroplast genomes in gene arrangement and composition. Among these functional genes, 11 protein-coding genes and 6 tRNA genes contained introns (Supplementary Table S2), whereas, ycf3 and clpP contained two introns. However, the rps12 was a special trans-splicing gene which the 5′ exon of the gene located in the LSC region and the 3′ exon located in the IR region.
In this paper, we updated the annotation on the chloroplast genome sequence of two wild buckwheat species after used the DOGMA program. Based on the gene function, the different functional genes were divided into three categories (Supplementary Table S2). The genes of the first category was participate in transcription and translation, a total of 60 genes, most of them were tRNA including RNA polymerase, ribosomal RNA, ribosomal proteins products and so on. The second category was photosynthesis related genes, a total of 47 genes, including rubisco genes, genes participate in electron transfer of photosynthesis and NADPH dehydrogenase genes. The third category was amino acids and fatty acids biosynthesis related genes, including one pseudogenes and two unknown function genes.
Sequence Divergence Hotspot Regions Analysis
The mVISTA program was used to perform a sequence identity analysis and elucidate the level of sequence divergence after compared the complete chloroplast genomes of the four Fagopyrum species. The results were revealed in Fig. 3, and the alignment showed high sequence similarity which suggested that the four chloroplast genomes were conserved. At the same time, we also used the DnaSP 5.0 software to elucidate the sequence divergence, which shown that the nucleotide diversity (π) was 0.02372 among the chloroplast genomes of Fagopyrum species. After sequence divergence analysis across the four buckwheat chloroplast genomes, the total number of mutations (Eta) was 7152, and the number of polymorphic (segregating) sites (S) was 7001. The theta (per site) from S (Theta-W) was 0.02472 and the theta (per site) from Eta was 0.02525 while the theta (per sequence) from S was 3818.727. Meanwhile, the Theta (per sequence) from S was 3818.727. The sliding window of DnaSp 5.0 software was displayed in Fig. 4, ten highly variable regions were located precisely, which the nucleotide diversity (π) was higher than 0.06, including trnS-trnG, rpoB-trnC, trnT-psbD, trnT-trnL, rbcL-accD, ycf4-cemA, psbE-petL, ndhF-rpl32, ndhA and ycf3-trnS. And the psbE-petL regain has a highest Pi value which is 0.07563 (Supplementary Table S3). Eight of these regions located at the LSC region, two of them located in the SSC region but no one in the IR region. After that, to validate the ten highly variable regions, we also amplified the fragments of these regions. Four Fagopyrum plants were chosen to test the discriminatory powers of these high variable sequences. The result was shown in Supplementary Figure S1. All the bands in different lanes could been seen, which means these ten variable regions could be used as new molecular markers for phylogeny analysis in Fagopyrum species.
Numbers and Pattern of SNP Mutations
We investigated the SNP mutations of the four chloroplast genomes, which were the most abundant type of mutations in the genomes. The numbers of nucleotide substitutions in four complete chloroplast genomes was shown in Table 3. We calculated the numbers of nucleotide substitutions between every two species. The number between F. luojishanense and other species was huge, which was 5940, 6260, 5992 between F. dibotrys, F. esculentum and F. tataricum respectively. In addition, we only found 317 nucleotide substitutions between F. dibotrys and F. tataricum, which was the smallest number across all the chloroplast genomes of the four species.
SSRs Polymorphisms
We used Perl script MISA to detect the SSRs mutations sites. SSRs consist of mononucleotide (A/T) repeats, dinucleotide (AT/TA) repeats, trinucleotide (CTT/AAG) repeats and tetranucleotide (GTCT/AATG/AATA) repeat were detected among the four Fagopyrum cp genomes. The number of SSRs in chloroplast genomes is different among the four Fagopyrum species which was shown in Table 4. The number of SSRs in F. dibotrys and F. luojishanense chloroplast genomes was 48 and 61, while the number of SSRs in F. esculentum and F. tataricum chloroplast genomes was 57 and 49 respectively. The mono-, di-, trin-, tetra-, penta-, and hexanucleotide SSRs were all counted among the four Fagopyrum chloroplast genomes. Two pentanucleotide repeats SSRs were found in F. dibotrys cp genome, and one pentanucleotide repeats SSRs were found in F. tataricum cp genome, and no hexanucleotide repeats were observed in these cp genomes. In F. dibotrys cp genomes, the mononucleotide repeats was the biggest with a portion of 58.33%, the dinucleotide, trinucleotide, tetranucleotide and pentanucleotide repeats were 11, 3, 4 and 2 respectively. In F. luojishanense chloroplast genomes, the mononucleotide repeats was highest with a portion of 62.30% among the species, the dinucleotide, trinucleotide, tetranucleotide and pentanucleotide repeats were 15, 4, 4 and 0 respectively. On the other hand, in the cultivated species, the mononucleotide repeats was the biggest with a portion of 57.89% while the dinucleotide, trinucleotide, tetranucleotide and pentanucleotide repeats were 16, 4, 4 and 0 in F. esculentum. At last, in F. tataricum cp genomes, the mononucleotide repeats was also the biggest with a portion of 61.22%, the dinucleotide, trinucleotide, tetranucleotide and pentanucleotide repeats were 11, 3, 4 and 1 respectively. Normally, the SSRs in cp genome were dominated by A/T mononucleotide repeats. From our results, the A/T repeat unit was the most abundant which has a particular repeat numbers of 10–15, and most repeats were consist of A or T.
All SSRs with a length of at least 10 bp among the four Fagopyrum chloroplast genomes species were detected (Supplementary Table S4). Based on the results, we also found that many SSRs were observed in the same 14 locus among the four different Fagopyrum species. Most of these locations were intergenic, including trnS-trnG, rpoB-trnC, trnS-psbZ, trnS-rps4, ndhC-trnV, rpl33-rps18, rpl32-trnL, ndhD-psaC, the remaining five were located in protein-coding genes, which were atpA, rpoC2, cemA, rpl22, ycf2 and ndhD. On the other hand, among the four Fagopyrum cp genomes sequences, SSRs mainly located in intergenic, following by genes and intron, and no SSRs were observed in the tRNAs and rRNAs. Most of those SSRs detected were located in LSC region, followed by IR and SSC regions. From our results, the IR reign speared by SSC contain the same number of nucleotide repeats because of concerted evolution in cp genome. In F. dibotrys cp genome, there were 16 SSRs in gene regions, 30 SSRs in intergenic regions and 2 SSRs in introns, which were ycf3 intron and petD intron. Most of these SSRs were located in LSC region, which the number was 33, followed by IR and SSC regions. Meanwhile, 61 SSRs were detected in the chloroplast genome of F. luojishanense. These SSRs mainly located in intergenic spacer which have 43 SSRs, following by 11 in gene regions and 7 in introns regions. Of all these SSRs events, 46 SSRs were in the LSC region while 8 SSRs were in the IR region and 7 in the SSC region. At the same time, the SSRs in F. esculentum and F. tataricum were also detected used the same method. The number of SSRs in F. esculentum was little more than that in F. tataricum. But both of them mainly located in the LSC region, while 37 of 57 SSRs in the F. esculentum and 35 of 49 SSRs in the F. tataricum were in the LSC region.
Repetitive Sequences Analysis
We used the REPuter to find all kinds of repeats in the cp genome sequences, the results of repetitive sequences analysis were shown in Fig. 5 with the criterion of copy size 30 bp or longer. A total of 135 repeats were detected in the four Fagopyrum cp genomes sequences, including forward, palindromic and reverse repeats, which abbreviated as letter F, P and R in Supplementary Table S5. Nevertheless, the reverse match repeats only been found in F. luojishanense and F. esculentum cp genomes. While the F. dibotrys and F. tataricum cp genomes only included two repeat types, forward and palindromic. We identified a total of 42 repeats in the F. dibotrys cp genomes and 31 repeats in the F. luojishanense cp genomes which including forward, reverse and palindromic repeats. At the same time, we also identified a total of 40 repeats in the F. tataricum chloroplast genomes and 22 repeats in the F. esculentum chloroplast genomes (Fig. 5A). The wild Fagopyrum species F. luojishanense possessed 16 forward repeats, 14 palindromic repeats and one reverse repeats, except for the other wild species F. dibotrys with 21 forward repeats and 21 palindromic repeats. On the other hand, we also detected 20 forward repeats and 20 palindromic repeats in F. tataricum chloroplast genomes while 11 forward repeats, 10 palindromic repeats and one reverse repeats in F. esculentum chloroplast genomes. The number of the forward repeats and palindromic repeats are basically the same, accounting for 68 forward repeats of the total repeats, and 65 palindromic repeats, following with two reverse repeats among the four cp genome sequences.
Most repeats sequences were detected in IR region, following by LSC region, and the SSC region also has a few repeats been found. The repeat length of the first part in four Fagopyrum cp genomes ranged from 30 to 59 bp. Most starting position of the first part of repetitive sequences were distributed in IR regions of F. dibotrys cp genomes. On the other hand, in F. luojishanense chloroplast genomes, most starting position of the first part of repetitive sequences were distributed in LSC region, only minority starting position of the first part of repetitive sequences were located in SSC regions (Supplementary Table S5). The lengths of most repeats in these four Fagopyrum cp genomes was shorter, mainly ranging from 30 to 40 bp (Fig. 5B), whereas the longer repeats, such as ranging from 51 to 60 bp, was also be found in the F. dibotrys and F. esculentum.
At the same time, we found the distribution of repeats are conserved among four different cp genomes. In this study, we investigated the location of these repeats. It was found that the majority repeats were located in intergenic spacer (53%) and gene coding regions (39%), while a minority of repeats was detected in intron (8%) (Fig. 5C). After that, we analysis the repeats deeply, and found there has many repeats shared in the same locus, which indicated that the repeats sequences could be detected in these locus among the four cp genomes. Here, we found nine locus speared in three different region, which all could detected repeats sequences among four Fagopyrum cp genomes. These locus were pasA, pasB, psbI-trnS, psbC-trnS, ycf3 and rbcL-accD in LSC, flowing with rrn4.5-rrn5, ycf1 in IR region and ndhA in SSC. At the same time, we also found the numbers, repeat types and distributions of the repeats between the cp genomes of F. dibotrys and F. tataricum were similar and conserved. From our results of highly divergent hotspot regions, it also showed that the repeat sequences were associated with the divergent regions of cp genomes, such as the ndhA and rbcL-accD. It implied the repeats sequences also could be used as genetic markers for phylogenetic studies.
Phylogenetic Analysis
In the present studies, four complete chloroplast genomes of Fagopyrum species and 24 outgroups were used to construct the phylogenetic trees. These outgroups were shown in Supplementary Table S6, which came from five different families that contain all cp genomes gene data in the Caryophyllales. And we used three different methods MP/ML/BI to build a phylogenetic tree based on these gene data. Figure 6 showed the results of ML analysis, which was completely coincident with the phylogenetic tree that build by the MP and BI analysis. The symbol * in the phylogenetic tree indicated that the support rate of branch was 100/100/1.0. It was clear that all the species were classified into five big groups, and every species in each group came from the same families which confirmed these species came from five different families. The results showed that all the Fagopyrum species cluster together with much higher internal resolution, and they were classified into a big group with Rheum palmatum and Rumexacetosa. In this Polygonaceae group, F. dibotrys, F. tataricum, F. esculentum and F. luojishanense formed a subgroup, which was different with Rheum palmatum and Rumexacetosa subgroup. As the Fig. 6 illustrated, F. dibotrys was closer to F. tataricum than to F. esculentum. In addition, F. dibotrys was closer to F. tataricum and F. esculentum than to F. luojishanense. The results suggested that F. dibotrys was more closely related to F. tataricum.
Discussion
The morphology data of different species could reflect the difference among species. Besides, the morphology data and agronomy data like amino acids content have already been used to determine the differences between species. From the result, two wild buckwheat species have been characterized by leaf shape, petioles and flower type, on the basis of morphological characteristics, F. loujishanenes was distinguishable from the two cultivated species easily. Genetically, F. loujishanenes is also quite distant to the other species of Fagopyrum. F. dibotrys has many similarities with F. tataricum and F. esculentum in morphological characteristics, especially with petal, leaf, inflorescence type and seed color. Although the F. dibotrys and two cultivated species are genetically similar, they differ considerably in habit and gross morphology. Sometimes, the morphology of species in genus is hard to distinguish different species accurately. On the other hand, the PCA result based on the means of morphology profiles and agronomy data also cannot well determine the differences among the four buckwheat species. In the PCA result, wild buckwheat F. luojishanense was obviously distinguished from other three buckwheat, but it is still hard to discriminated the other three buckwheat species clearly. Therefore, the divisions and the evolutionary relationships within Fagopyrum species need further investigating and searching more evidences. Combined evidence from morphology characteristics and cp DNA has been served as a powerful tool in hybrid studies27. On the other hand, it has been reported that the cp DNA genomes independent from the mechanism of cytoplasmic in the study of the modified flower type with the chloroplast DNA28. Because of the parallel and convergent evolution of morphological characteristics, the data of morphological analysis maybe internally inconsistent. Only based on morphological characteristics, we could not conclusive with distinction and identification the hybrids, wild relative species and paternal species29.
The chloroplast genomes normally have a circular structure which ranging from 115 to 165 kb in length, and it is composed from a large single copy (LSC) region, a small single copy (SSC) region and two copies of inverted repeat (IR) regions30. Both chloroplast genomes of the two Fagopyrum species showed a typical circular structure, which consisted of two copies of IR regions separated by the LSC region and the SSC region. The chloroplast genomes contain much genetic information, which could be discovered through comparative analysis of complete chloroplast genome sequences. From the results, the sequence divergence of both IR regions is lower than LSC and SSC regions while it was also occurred in many plants which have been reported before. And it probably due to the gene conversion between IR sequences31.
The molecular markers such as high variable sequences, SSRs and SNPs are useful tools in research. Comparative analysis of complete chloroplast genome sequences of five Camellia species identified 15 molecular markers which have over 1.5% sequence divergences. After that, these high divergent sequences were used to phylogenetic analysis and species identification32. Our analysis data could serve to enrich the resource of Fagopyrum in systematic, molecular phylogenetic and genetic breeding studies. The mutation region in the cp genome was not random, and these mutational dynamics created the highly variable regions in the genome33. Ten highly variable regions been detected, including trnS-trnG, rpoB-trnC, trnT-psbD, ycf3-trnS, trnT-trnL, rbcL-accD, ycf4-cemA, psbE-petL, ndhF-rpl32 and ndhA intron. As our results showed, most of them occurred in the LSC and SSC regions but not IR regions. Among these variable regions, the largest was located in the LSC region within the intergenic sequence trnT-trnL, and other two large variable regions were also found within psbE-petL and rpoB-trnC in the LSC region. Many highly variable regions have been identified in intergenic spacers, including ndhF-rpl32, trnE-trnT, rpl32-trnL, trnQ-rps16 and some other protein coding genes, like rpl20, ycf1, ycf15 and accD 34. Among these regions, the highly variable regions rpoB-trnC, psbE-petL, trnT-psbD and rbcL-accD have been reported in seed plants before, these regions showed very high nucleotide diversity per site (p-values) after across comparison35. At the same time, the ycf4-cemA also has been reported recently after comparative analysis of whole plastid genomes from the Apiales 36, while the ndhF-rpl32 and trnT-trnL also have been reported as a molecular maker for phylogeny analysis in Machilus 37 and Lupinus 38. Especially the trnS-trnG region, it has been used as the molecular maker for phylogenetic relationship analysis39 and solving origin problems40 in many species, such as Solanaceae 41, Bromeliacea 42 and Lamioideae 43.
However, the ycf3-trnS and ndhA intron seem to be especially variable in Fagopyrum, which were rarely reported highly variable regions before. For now, the most important is these highly variable regions have not been used as potential molecular markers to investigate the phylogenetic relationships and identification of Fagopyrum species, except the rpoB-trnC 44, rbcL-accD 8 and ndhF-rpl32 45. In contrast, the matK and trnK which in the chloroplast genome used for phylogenetic analysis of wild Fagopyrum species previously was not found to be highly variable10, 11. And most of these variable sequence regions had higher variation percentages than rbcL-accD, which has been used for interspecific relationships study before8. All of these highly variable regions are better to use for phylogenetic analysis in Fagopyrum at the species level. Because of the abundant germplasms resources and complex evolutionary issues, the taxonomy of Fagopyrum is still difficult to assess, especially the taxonomy status of wild species. Therefore, we believe these ten highly variable regions could provide abundant information for developing molecular markers, phylogenetic analysis and identification of Fagopyrum species.
There are many forms of SSRs in the genome, and the copy numbers of SSRs was different between species. Compared to other neutral DNA regions, SSRs usually have higher mutation rate because of slipped DNA strands, which could be used as potential genetic markers for assays detecting polymorphisms at population-level, phylogenetic relationships studies among species and plant ecological studies46. At the same time, the SSRs polymorphisms study has been used to investigate the evolutionary relationships among closely related species recently, which has the advantages of low cost but high precision47. From the results, the most abundant one was the mononucleotide repeats, followed by the di-, trin-, tetra-, pentanucleotide. All in all, the number of trinucleotide repeats and tetranucleotide repeats is substantially the same, and the number of pentanucleotide was very small across the chloroplast genomes. At the same time, no hexanucleotide SSRs was detected among the chloroplast genomes. These SSRs identified in our study could useful in phylogenetic and evolutionary studies as well as they were in Cocos nucifera 48, Pyrus pyrifolia 49 and Elodea Canadensis 50.
On the other hand, large and complex repeat sequences possibly related with the chloroplast genomes sequence divergence and rearrangement51. It has been suggested that the repeat sequences played an important role in chloroplast genomes variation and sequence rearranging because of the recombination46 and slipped-strand mispairing52. From our results, it also shown that there are connections between divergent regions of chloroplast genome and various repeat sequences. And most repeat sequences of F. luojishanense detected in the LSC and SSC regions which have higher sequence divergence than IR regions. Based on the results of highly divergent regions, it also showed that the repeat sequences were associated with the divergent regions of cp genomes, such as the ndhA and rbcL-accD. Interestingly, in F. dibotrys chloroplast genome sequences, most repeat sequences were located at the IR regions which was the same as the distribution of most repeat sequences of F. tataricum. This fact also could explain F. dibotrys has a closer phylogenetic relationship with F. tataricum. All in all, these repeats maybe further serve as potential genetic markers for phylogenetic studies on Fagopyrum species.
The single nucleotide polymorphism has been proven to have significant potential for genome structure analysis and species identification. From the results, the smallest number of nucleotide substitutions is 317, which was detected between F. dibotrys and F. tataricum chloroplast genome. It indicated the nucleotide substitution events between the chloroplast genome of F. dibotrys and F. tataricum are less than species of ginseng, potato, and orange53, 54. But most nucleotide substitutions events which detected between two Fagopyrum species was larger than 2000, especially the nucleotide substitutions between F. luojishanense and the two cultivated species. The nucleotide substitution events detected between F. luojishanense and F. esculentum was 6260 while the nucleotide substitution events detected between F. luojishanense and F. tataricum was 5992, which indicated that the variations in wild species are much higher than in cultivated species.
The chloroplast genome sequences have been used for the phylogenetic studies in many species successfully55,56,57. From the results, these phylogenetic analysis identify major geographic groups similar to which have been researched using the ITS and matK for phylogeny study before. But in our study, there are much higher internal resolution with a high level of bootstrap values, which reflecting the higher number of base substitutions between wild and cultivated species. It indicated that the cultivated species have obviously evolutionary trends compared with the wild species. The phylogenetic tree based on the analysis of chloroplast genome has highly supported nodes, which nearly resolved the phylogenetic problems between wild and cultivated buckwheat. In outline of classification of magnoliophyta, Cronquist suggested the Fagopyrum belongs to family Polygonaceae, order Polygonales 58. However, in the classification of APG III, the Fagopyrum belongs to family Polygonaceae, order Caryophyllales. Our results showed that APG III taxonomic system of Fagopyrum is reasonable on the whole and phylogenetic relationships within buckwheat species59. At the same time, our phylogeny inferred from complete chloroplast genome sequences also showed the F. dibotrys is closer to F. tataricum than to F. esculentum 10. Phylogenetic analysis of chloroplast sequences and morphology dates all indicate a unique position of F. loujishanenes. The positions of two wild species F. loujishanenes and F. dibotrys were consistent, and the two species both have high bootstrap values (<95%). We found convincing evidence for the phylogenetic relationship of F. loujishanenes and F. dibotrys from Fagopyrum. All in all, our results suggest that the chloroplast genome data could resolve the phylogenetic relationships of Fagopyrum effectively. Meanwhile, the evolutionary relationships of Fagopyrum genus also need further investigation and more complete evidences.
Materials and Methods
Morphological analysis
Plant materials were cultivated in green house of Sichuan Agricultural University. The young single seedlings of four different buckwheat species were collected for the morphological comparison. The details of plant materials used were observed and measured. To analysis Fagopyrum species based on morphological differences, we mainly focus on the plant type, stems, leaf, inflorescence and seeds. As food and medicinal raw materials, we also focus on the agronomic characteristics of these buckwheat. The methods of observation and measurement are shown in Table 5. The analysis data were checked and measured for 3 plants of each species. All observation data were designed in completely randomized block design. In this section, we also used four morphology data including plant height, stem thick, leaf long and leaf wide, as well as four agronomy data which were thousand grain weight, protein content, flavonoids content and amino acid content from four different species to reflect the differences of these buckwheat based on PCA. The PCA was used to reduce the dimensionality of the morphology and agronomy data for further determined and evaluate the difference of four Fagopyrum species. The statistical analysis were performed using IBM SPSS Statistics software version 16.0 (SPSS Inc., Chicago, IL, USA).
DNA Extraction, Sequencing, Assembly
The young leaves of F. dibotrys and F. luojichanese from single seedlings were sampled for DNA Extraction. Total genomic DNA of buckwheat was extracted using the modified CTAB protocol from 100 mg fresh leaves60. For all the Fagopyrum species in this paper, total genomic DNA was broken by ultrasound, yielded fragments of 300–500 bp in length, and the fragmentation quality was checked using Bioanalyzer 2100 (Agilent Technologies). The library of 400 bp DNAs was constructed using the NEBNext Ultra™ DNA Library Prep Kit (Illumina, San Diego, California, USA), and the genomic DNAs of the four Fagopyrum species were sequenced using HiSeq. 4000 PE150 (Illumina Inc.). For each of the species, the High-throughput sequencing dates were spliced using SPAdes 3.6.161 and SOAPdenovo262. The high quality reads were obtained using the CLC-quality trim tool, and the contig of the chloroplast genome was selected using the Blast program63. The contig of the chloroplast genome was assembled using Sequencher 4.10 (http://www.genecodes.com) with default parameters and the gaps of sequence found in the splicing were amplified with PCR-based conventional Sanger sequencing using ABI 3730. The specific primers at both ends of the gap of the four sampled species were designed for PCR until the chloroplast genome sequence is complete. The PCR program is 94 °C for 4 min followed by 34 cycles of 94 °C for 30 s, 55 °C for 30 s, 72 °C for 1.5 min, and 72° for 10 min. After that, all reads were mapped to the spliced chloroplast genome sequence using Geneious 8.164 to avoid assembly errors and proofread the contig is correct. Finally, we obtained two wild buckwheat high quality complete chloroplast genome sequences.
Chloroplast Genome Annotation
The gene annotation of the four chloroplast genomes was performed using DOGMA65, and the positions of coding genes, transfer RNAs and ribosomal RNAs were searched and identified by BLASTX and BLASTN. The potential coding gene promoter/terminator as well as the intron/exon boundaries were manually corrected and selected by compared to homologous genes of other sequenced chloroplast genomes. Because of the limitations of BLAST, some of the short exon regions (6–9 nt) cannot be annotated with the DOGMA, such as rps16, petB, petD and some intron and exon boundaries are not well recognized in the chloroplast genome annotation process. For these genes, we accurately adjusted and annotated them according to other published genomes. The circular genome maps of F. dibotrys and F. luojishanense were drawn by Organellar Genome DRAW (http://ogdraw.mpimp-golm.mpg.de/index.shtml)66.
Sequence Divergence Hotspot Regions Analysis
Comparison among the four Fagopyrum chloroplast genomes was performed using the mVISTA program67 to show the interspecific variation. First of all, the four chloroplast genomes were aligned by MAFFT v568 with the default parameters set and then the sequences were manually adjusted using Se-al (http://tree.bio.ed.ac.uk/software/seal.html) where necessary. The principle of multiple alignments was pulling open for the inversion of the sequence, in case the wrong data polymorphism occurs. The variation sites and sequence polymorphisms across the four chloroplast genomes were analyzed using DnaSp 5.069 after aligned. We used the Sliding Window of DnaSp 5.0 software for investigating and analyzing the hyper-mutation fragment all over the sequenced Fagopyrum chloroplast genomes, and the window length was set to 800 bp and step size was set as 50 bp. The LSC, SSC and IR regions of the four species were calculated also using the DnaSP v5.0 software. The primer designed based on the sequence of these four Fagopyrum species was showed in the Supplementary Table S7. PCR amplification to validate these hotspot regions carried out in 25 µL volumes containing 9.5 µL Taq Mix (Takara Bio, Shiga, Japan), 12.5 µL ddH2O, 1 µL forward primer, 1 µL reverse primer and 1 µL genomic DNA. The PCR protocol as follow: 94 °C 5 min; 94 °C 40 s, 58 °C 45 s, 72 °C 1 min for 32 cycles, 72 °C 10 min. After that, PCR products were determined through electrophoresis on ethidium bromide-stained agarose gels in 1X Tris Borate EDTA (TBE), and amplification profiles were photographed using Gel Doc™ XR System (Bio-Rad Inc.).
SSRs and Repeat sequence characterization
For the SSRs detection, the microsatellites were analyzed using the Perl script MISA70, and the SSRs parameters were defined as follow, thresholds of ten repeat units for mononucleotide SSRs, five repeat units for dinucleotide SSRs, four repeat units for trinucleotide SSRs, and three repeat units for tetra-, penta-, and hexanucleotide SSRs. We used the REPuter to find all kinds of repeats in the chloroplast genome sequences (http://bibiserv.techfak.uni-bielefeld.de/reputer), including forward match repeats, reverse match repeats, complement match repeats and palindromic match repeats. The minimal repeat size was 30 bp and the Maximum Computed Repeats was 60 bp71. Meanwhile, the similarity between two repeat copies was 90% at least, and the gap size between repeats had a maximum length of 3 kb.
Phylogenetic Analysis
Phylogenetic analysis was conducted based on the gene sequences of Fagopyrum and other taxa of basal dicotyledonous. In this study, we used 26 species from 5 different families, including 6 species in the Polygonaceae for investigating the evolution of Fagopyrum. The 20 species (Sileneconica, Carnegiea gigantean, Salicornia brachiate and so on) were used as outgroups. The chloroplast genome and the nucleotide sequence data were obtained from NCBI. The phylogenetic trees which based on maximum likelihood analysis were performed using RAxML v7.2.872, and the bootstrap replicates were 1000. The phylogenetic trees which based on maximum parsimony analysis were conducted using PAUP v4b1073, and 1000 random addition sequenceswere used for Heuristic search. The phylogenetic trees which based on Bayesian inference was performed using MrBayes v3.1.274. Markov chain Monte Carlo simulations run twice for 2 million generations independently, and the remaining trees were used to construct a majority-rule consensus tree after the first 25% of trees was discarded.
References
Yasui, Y. & Ohnishi, O. Interspecific relationships in Fagopyrum (Polygonaceae) revealed by the nucleotide sequences of the rbcL and accD genes and their intergenic region. Am. J. Bot. 85, 1134–1142 (1998).
Hou, L. L. et al. Fagopyrum luojishanense, a New Species of Polygonaceae from Sichuan, China. Novon 24, 22–26 (2015).
Zhou, M. L. et al. Fagopyrum hailuogouense (Polygonaceae), One New Species from Sichuan, China. Novon 24, 222–224 (2015).
Zhang, Z. L. et al. Bioactive compounds in functional buckwheat food. Food Res. Int. 49, 389–395 (2012).
Van Leeuwen, P., Punt, W. & Hoen, P. Polygonaceae. Rev. Palaeobot. Palynol. 57, 81–151 (1988).
Burke, J. M., Sanchez, A., Kron, K. & Luckow, M. Placing the woody tropical genera of Polygonaceae: A hypothesis of character evolution and phylogeny. Am. J. Bot. 97, 1377–1390 (2010).
Ohnishi, O. & Matsuoka, Y. Search for the wild ancestor of buckwheat, 2: Taxonomy of Fagopyrum (Polygonaceae) species based on morphology, isozymes and cpDNA variability. Genes Genet. Syst. 71, 383–390 (1996).
Yasui, Y. & Ohnishi, O. Phylogenetic relationships among Fagopyrum species revealed by the nucleotide sequences of the ITS region of the nuclear rRNA gene. Genes Genet. Syst. 73, 201–210 (1998).
Ohsako, T. & Ohnishi, O. Intra- and interspecific phylogeny of wild Fagopyrum (Polygonaceae) species based on nucleotide sequences of noncoding regions in chloroplast DNA. Am. J. Bot. 87, 573–582 (2000).
Ohsako, T. & Ohnishi, O. Nucleotide sequence variation of the chloroplast trnK/matK region in two wild Fagopyrum (Polygonaceae) species, F. leptopodum and F. statice. Genes Genet. Syst. 76, 39–46 (2001).
Nishimoto, Y., Ohnishi, O. & Hasegawa, M. Topological incongruence between nuclear and chloroplast DNA trees suggesting hybridization in the urophyllum group of the genus Fagopyrum (Polygonaceae). Genes Genet. Syst. 78, 139–153 (2003).
Grevich, J. J. & Daniell, H. Chloroplast Genetic Engineering: Recent Advances and Future Perspectives. Crit. Rev. Plant Sci. 24, 83–107 (2005).
Ingvarsson, P. K., Ribstein, S. & Taylor, D. R. Molecular evolution of insertions and deletion in the chloroplast genome of silene. Mol. Biol. Evol. 20, 1737–1740 (2003).
Shaw, J. et al. Chloroplast DNA sequence utility for the lowest phylogenetic and phylogeographic inferences in angiosperms: The tortoise and the hare IV. Am. J. Bot. 101, 1987–2004 (2014).
Guisinger, M. M., Chumley, T. W., Kuehl, J. V., Boore, J. L. & Jansen, R. K. Implications of the plastid genome sequence of typha (typhaceae, poales) for understanding genome evolution in poaceae. J. Mol. Evol. 70, 149–166 (2010).
Jung, J., Kim, K. H., Yang, K., Bang, K. H. & Yang, T. J. Practical application of DNA markers for high-throughput authentication of Panax ginseng and Panax quinquefolius from commercial ginseng products. J. Ginseng Res. 38, 123–129 (2014).
Jansen, R. K. & Palmer, J. D. A chloroplast DNA inversion marks an ancient evolutionary split in the sunflower family (Asteraceae). Proc. Natl. Acad. Sci. USA 84, 5818–5822 (1987).
Dong, W. et al. A chloroplast genomic strategy for designing taxon specific DNA mini-barcodes: a case study on ginsengs. BMC Genet. 15, 1–8 (2014).
Yang, J. B., Tang, M., Li, H. T., Zhang, Z. R. & Li, D. Z. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analysis. BMC Evol. Biol. 13, doi:10.1186/1471-2148-13-84 (2013).
Hollingsworth, P. M., Graham, S. W. & Little, D. P. Choosing and Using a Plant DNA Barcode. PLoS ONE 6, e19254 (2011).
De, L. R. J., Lozano, J. J. & Ortiz, A. R. Comparative analysis of chloroplast genomes: functional annotation, genome-based phylogeny, and deduced evolutionary patterns. Genome Res. 12, 567–583 (2002).
Bock, R. & Khan, M. S. Taming plastids for a green future. Trends Biotechnol. 22, 311–318 (2004).
Logacheva, M. D., Samigullin, T. H., Dhingra, A. & Penin, A. A. Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale -A wild ancestor of cultivated buckwheat. BMC Plant Biol. 8, 1–15 (2008).
Cho, K. S. et al. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum). PLoS ONE 10, e0125332 (2014).
Daniell, H., Ruiz, O. N. & Dhingra, A. Chloroplast genetic engineering to improve agronomic traits. Methods Mol. Biol. 286, 111–137 (2004).
Madanala, R., Gupta, V., Singh, P. K. & Tuli, R. Development of chloroplast transformation vectors, and a new target region in the tobacco plastid genome. Plant Biotechnol. Rep. 6, 77–87 (2012).
Yuan, J. H. & Zhou, S. L. Hybrid origin of Paeonia × yananensis revealed by microsatellite markers, chloroplast gene sequences, and morphological characteristics. Int. J. Plant. Sci. 171, 409–420 (2010).
Belliard, G., Pelletier, G., Vedel, F. & Quetier, F. Morphological characteristics and chloroplast DNA distribution in different cytoplasmic parasexual hybrids of Nicotiana tabacum. Mol. Gen. Genet. 165, 231–237 (1978).
Mes, T. H. M., Wijers, G. J. & Hart, H. T. Phylogenetic relationships in Monanthes (Crassulaceae) based on morphological, chloroplast and nuclear DNA variation. J Evol. Biol. 10, 193–216 (1997).
Jansen, R. K, Ruhlman, T. A. Genomics of chloroplast and mitochondria in Plastid genomes of seed plants 103–126 (Springer, 2012).
Khakhlova, O. & Bock, R. Elimination of deleterious mutations in plastid genomes by gene conversion. Plant J. 46, 85–94 (2006).
Hui, H., Chao, S., Yuan, L., Mao, S. Y. & Gao, L. Z. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships. BMC Evol. Biol. 14, 4302–4315 (2014).
Shaw, J., Lickey, E. B., Schilling, E. E. & Small, R. L. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. Am. J. Bot. 94, 275–288 (2007).
Nashima, K. et al. Complete chloroplast genome sequence of pineapple (Ananas comosus). Tree Genet. Genomes 11, 1–11 (2015).
Dong, W., Liu, J., Yu, J., Wang, L. & Zhou, S. Highly variable chloroplast markers for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding. PLoS ONE 7, e35071 (2012).
Downie, S. R. & Jansen, R. K. A Comparative Analysis of Whole Plastid Genomes from the Apiales: Expansion and Contraction of the Inverted Repeat, Mitochondrial to Plastid Transfer of DNA, and Identification of Highly Divergent Noncoding Regions. Syst. Bot. 40, 359–364 (2016).
Song, Y. et al. Comparative analysis of complete chloroplast genome sequences of two tropical trees Machilus yunnanensis and Machilus balansae in the family Lauraceae. Front. Plant Sci. 6, doi:10.3389/fpls.2015.00662 (2015).
Drummond, C. S. Diversification of Lupinus (Leguminosae) in the western New World: derived evolution of perennial life history and colonization of montane habitats. Mol. Phylogenet. Evol. 48, 408–421 (2008).
Lo, E. Y. Y. & Dickinson, T. A. Molecular Reappraisal of Relationships Between Crataegus and Mespilus (Rosaceae, Pyreae)—Two Genera or One? Syst. Bot. 32, 596–616 (2007).
Chomicki, G. & Renner, S. S. Watermelon origin solved with molecular phylogenetics including Linnaean material: another example of museomics. New Phytol. 205, 526–532 (2015).
Levin, R. A., Myers, N. R. & Bohs, L. Phylogenetic relationships among the “spiny solanums” (Solanum subgenus Leptostemonum, Solanaceae). Am. J. Bot. 93, 157–169 (2006).
Jabaily, R. S. & Sytsma, K. J. Phylogenetics of Puya (Bromeliaceae): Placement, major lineages, and evolution of Chilean species. Am. J. Bot. 97, 337–356 (2010).
Scheen, A. C. & Albert, V. A. Molecular Phylogenetics of the Leucas Group (Lamioideae; Lamiaceae). Syst. Bot. 34, 173–181 (2009).
Yamane, K., Yasui, Y. & Ohnishi, O. Intraspecific cpDNA variations of diploid and tetraploid perennial buckwheat, Fagopyrum cymosum (Polygonaceae). Am. J. Bot. 90, 339–346 (2003).
Hu, Y. et al. Genetic relationships of buckwheat species based on the sequence analysis of ITS and ndh F-rpl32. Biodivers. Sci. 24, 296–303 (2016).
Zhao, Y. et al. The complete chloroplast genome provides insight into the evolution and polymorphism of Panax ginseng. Front. Plant Sci. 5, doi:10.3389/fpls.2014.00696 (2014).
Matthee, C. A. et al. Indel evolution of mammalian introns and the utility of non-coding nuclear markers in eutherian phylogenetics. Mol. Phylogenet. Evol. 42, 827–837 (2007).
Ya-Yi Huang, A. J. M. M. Marjori Matzke. Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera). PLoS ONE 8, e74736 (2013).
Terakami, S. et al. Complete sequence of the chloroplast genome from pear (Pyrus pyrifolia): genome structure and comparative analysis. Tree Genet. Genomes 8, 841–854 (2012).
Huotari, T. & Korpelainen, H. Complete chloroplast genome sequence of Elodea canadensis and comparative analysis with other monocot plastid genomes. Gene 508, 96–105 (2012).
Weng, M. L., Blazier, J. C., Govindu, M. & Jansen, R. K. Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol. Biol. Evol. 31, 645–659 (2014).
Choi, K. S., Chung, M. G. & Park, S. The Complete Chloroplast Genome Sequences of Three Veroniceae Species (Plantaginaceae): Comparative Analysis and Highly Divergent Regions. Front. Plant Sci. 7, doi:10.3389/fpls.2016.00355 (2016).
Chung, H. J. et al. The complete chloroplast genome sequences of Solanum tuberosum and comparative analysis with Solanaceae species identified the presence of a 241-bp deletion in cultivated potato chloroplast DNA sequence. Plant Cell Rep. 25, 1369–1379 (2006).
Su, H. J., Hogenhout, S. A., Alsadi, A. M. & Kuo, C. H. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids. PLoS ONE 9, e113049 (2014).
Jansen, R. K. et al. Phylogenetic analysis of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids. BMC Evol. Biol. 6, doi:10.1186/1471-2148-6-32 (2006).
Liu, Y. et al. Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants. PLoS ONE 8, e57533 (2013).
Kyunghee, K. et al. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci. Rep. 5, doi:10.1038/srep15655 (2015).
Anderson, W. R. An Integrated System of Classification of Flowering Plants. Brittonia 34, 268–270 (1982).
Group, T. A. P. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Bot. J. Linn. Soc. 161, 105–121 (2009).
Li, J., Wang, S., Jing, Y. & Ling, W. A Modified CTAB Protocol for Plant DNA Extraction. Chin. Bull. Bot. 48, 72–78 (2013).
Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
Luo, R. et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1, 1–6 (2012).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
Kearse, M. et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
Wyman, S. K., Jansen, R. K. & Boore, J. L. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252–3255 (2004).
Lohse, M., Drechsel, O., Kahlau, S. & Bock, R. OrganellarGenomeDRAW—a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 41, 575–581 (2013).
Frazer, K. A., Pachter, L., Poliakov, A., Rubin, E. M. & Dubchak, I. VISTA: computational tools for comparative genomics. Nucleic Acids Res. 32, W273–279 (2004).
Katoh, K. & Standley, D. M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 30, 772–780 (2013).
Librado, P. & Rozas, J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–1452 (2009).
Thiel, T., Michalek, W., Varshney, R. K. & Graner, A. Exploiting EST databases for the development and characterization of gene-derived SSRS-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106, 411–422 (2003).
Kurtz, S. et al. REPuter: The Manifold Applications of Repeat Analysis on a Genomic Scale. Nucleic Acids Res. 29, 4633–4642 (2001).
Stamatakis, A. & Stamatakis, A. RAxML-VI-HPC: Maximum Likelihood—based Phylogenetic Analysis with Thousands of Taxa and Mixed Models, v. 7.03. Bioinformatics 22, 2688–2690 (2006).
Swofford, D. Commands used in the PAUP Block in PAUP 4.0: phylogenetic analysis using parsimony 132–135 (Smithsonian Institution, 1998).
Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574 (2003).
Acknowledgements
This research was supported by the Key Project of Science and Technology of Sichuan, China (grant no. 04NG001–015), the National Program on Key Basic Research Project (973 Program) (grant no. 2014CB138701) and the Investigation of Forage Germplasm in Central China (grant no. 2017FY100604).
Author information
Authors and Affiliations
Contributions
C.W., J.S. and M.Z. conceived the experiments, M.D., C.Z., X.Z. and Y.T. collected the samples, C.W., M.D. and C.Z. analyzed the results, C.W. wrote the manuscript. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, CL., Ding, MQ., Zou, CY. et al. Comparative Analysis of Four Buckwheat Species Based on Morphology and Complete Chloroplast Genome Sequences. Sci Rep 7, 6514 (2017). https://doi.org/10.1038/s41598-017-06638-6
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-017-06638-6
This article is cited by
-
The pan-plastome of tartary buckwheat (fagopyrum tataricum): key insights into genetic diversity and the history of lineage divergence
BMC Plant Biology (2023)
-
Plastome comparison and phylogenomics of Fagopyrum (Polygonaceae): insights into sequence differences between Fagopyrum and its related taxa
BMC Plant Biology (2022)
-
Morphological Characterization Reveals High Intraspecies Diversity in Fagopyrum esculentum Moench and Fagopyrum sagittatum Gilib from North-Western Himalayan Regions
Agricultural Research (2022)
-
Towards a better understanding of Fagopyrum dibotrys: a systematic review
Chinese Medicine (2021)
-
Complete plastome sequencing resolves taxonomic relationships among species of Calligonum L. (Polygonaceae) in China
BMC Plant Biology (2020)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.