Introduction

Pepper (Capsicum spp.) is an important vegetable crop and condiment widely grown in tropical and subtropical regions of the world. The genus Capsicum has 5 cultivated species, C. annuum, C. frutescens, C. baccatum, C. chinense and C. pubescens, and 33 wild species1, while its genome size ranges from 3.3 to 3.6 GB, the most complex and largest genome in the Solanaceae family2,3.

Pepper germplasm resources showed huge differences at the phenotypic, biochemical and DNA levels, leading to high genetic diversity in the germplasm resources4,5,6. To ascertain the origin and process of domestication, the scientists examined the genetic diversity and population structure of semi-wild Capsicum germplasm and domesticated varieties collected from Central America (from central Mexico to northwest Costa Rica)7,8. Using molecular markers such as Amplified Fragment Length Polymorphism (AFLP), Simple Sequence Repeats (SSR) and Single Nucleotide Polymorphism (SNP), the genetic diversity and population structure of large-scale germplasm materials were analyzed, and some core germplasm banks were constructed to further understand the genetic diversity of germplasm materials9,10,11,12. Numerous studies on the genetic diversity and structure of China's pepper germplasm resources, including those for particular regions13,14,15, specific uses or fruit shape of chili germplasm resources16,17,18,19, pepper germplasm resources introduced from abroad20,21,22, and core germplasm for breeding materials23 have also been published.

Pepper was introduced to China in the late sixteenth century, and the earliest record was found in Gao Lian’s "Zun Sheng Ba Jian" in 1591. Due to the differences in ecological conditions and dietary habits of various regions of China, a large number of local varieties of pepper have been formed in the long-term natural selection and artificial improvement process24. In China, from 1965 to 1985, research was carried out on the collection, identification and preservation of pepper germplasm resources. At present, more than 2000 pepper germplasm resources are available in the National Genebank of China, which mostly consists of local landraces and also includes a small number of foreign commercial varieties12,25.

The pepper varieties in China are dominated by C. annuum species, with a small amount of C. chinense. In 1972, breeders produced the first hybrid pepper variety26. Since then, Chinese breeders have developed a wide range of current breeding lines and hybrid varieties using classical breeding techniques like hybridization and backcrossing along with modern breeding technologies like molecular marker-assisted selection27,28. With the innovation of pepper breeding technologies, the development and promotion of pepper varieties has progressed a lot29,30. Pepper varieties have been improved for three to four generations in China. Currently, hybrids account for more than 95% of the total pepper production27,28,31,32. In the 1970s, our team at the Institute of Vegetable Crops, Jiangsu Academy of Agricultural Sciences bred the first hybrid variety of pepper 'Zaofeng No.1' in China by using the local landrace 'Nanjing Zao Jiao' and the foreign introduced germplasm 'German'. Since then, we obtained a large number of advanced generation inbred lines through genetic improvement of the progeny of local landraces and domestic and foreign commercial varieties. Over the past 50 years, we have bred the 'Su Jiao' series of varieties, which have been widely promoted and utilized in Jiangsu, Shandong, Anhui, Guangdong and other significant pepper production areas, directly reflecting the development trend of pepper breeding in China.

Breeders generally agree that the genetic diversity of current pepper breeding lines is decreasing, but the exact situation is unclear. The reason is that there are few comparative studies of genetic diversity between current breeding lines and former local landraces. In this study, 179 materials were selected as research materials, including 94 local landraces and 85 current breeding lines. The 179 materials were primary core germplasm collections that we identified by evaluating 26 SSR markers and 1 InDel marker on 697 accessions, including 334 local landraces and 363 current breeding lines (data unpublished). The 334 local landraces were introduced from the National Genebank of China and originated from different regions of China in the 1980s, which were highly representative of varieties and breeding lines used 40 years ago in China. The 363 current breeding lines were improved and purified by our team in the recent five years. Among them, some current breeding lines mainly came from the improvement and selection of local landraces, while other current breeding lines came from the isolation and purification of commercial varieties from various regions of China. Therefore, the current breeding lines basically represent the main types of breeding materials for the market of China in recent years. Based on 35 phenotypic traits, the 179 accessions in the core germplasm collections represented about 70% of the genetic information present in the 697 accessions evaluated initially33. For the research reported herein, we evaluated the genetic diversity of 94 local landraces and 85 current breeding lines based on their phenotypic and molecular marker polymorphism in order to explore the changes in genetic diversity of pepper breeding resources after 40 years of artificial improvement and selection. This study will serve as a useful reference for guiding the future improvement of pepper breeding lines and the breeding of new varieties.

Material and methods

Plant materials

In the previous study, a core collection of 179 materials according to SSR analysis was established based on 334 local landraces and 363 current breeding lines (data unpublished). The local landraces were collected in different regions of China before 1985, almost 40 years ago. In contrast, the current breeding lines are 8–10 generation inbred lines that were used for commercial breeding in the last 5 years. A small percentage of current breeding lines were improved based on local landraces and most of them were purified and improved based on market varieties. The core collection of 179 materials were used in the study, including 94 local landraces and 85 current breeding lines. The 94 local landraces, of which 85 accessions originated from 26 provinces or regions in China, and the remaining 9 accessions were introduced from abroad (Table 1). The 85 current breeding lines were belong to the pepper innovation team of the Institute of Vegetable Research, Jiangsu Academy of Agricultural Sciences.

Table 1 The accession number, variety name and origin of the 94 local landraces used in the research.

Phenotypic traits measurement

Six plants of each material were grown in green houses at Nanjing in 2021 and 2022 for phenotypic traits measurement. The 35 phenotypic traits that were investigated in this study included 22 qualitative traits and 13 quantitative traits. Pepper phenotypic traits investigations were carried out from April to July, referring to the methods of Li and Zhang34, and modifications were done appropriately according to the actual situation (Tables 2, 3).

Table 2 Distribution frequencies and Shannon Diversity indexes and their differences for 22 qualitative traits evaluated in 94 local landraces (LLR) and 85 current breeding lines (CBL) of pepper in China.
Table 3 Shannon Diversity indexes, coefficients of variation, and maximum, minimum, range, mean and standard deviation values and their differences for 13 quantitative traits evaluated in 94 local landrace (LLR) and 85 current breeding lines (CBL) of pepper in China.

SSR analyses

The genomic DNA was extracted from young plantlets that had been selected from six individuals of each material by using the Cat#DP3112 kit (Beijing Bioteke Corporation). The quality and concentration of DNA was measured by 1% agarose gel electrophoresis and Nano Drop One (ThermoFisher Scientific, USA). The DNA concentrations were adjusted to 50 ng/μL and kept at − 20 °C in a refrigerator.

For SSR analysis, we used 26 microsatellite markers publicly available from Nicola et al. and 1 InDel from Guo et al. that spanned 12 pepper chromosomes9,35. Each SSR and InDel forward primer was labeled with one of four fluorescent dyes: 6-carboxy-fluorescein, 6-carboxy-tetramethylrhodamine, hexachloro-6-carboxy-fluorescein, or carboxy-X-rhodamine. All primers were synthesized by Tsingke Biotechnology Company (Nanjing, China).

PCR was performed in 10 μL reaction volumes containing PrimeSTAR Max Premix (2 ×) 5 μL (Takara, Beijing), 0.25 μL of each primer (10 µmol L−1), double distilled H2O 3.5 μL and approximately 1 μL of genomic DNA (50 ng/μL) as templates. Thermocycling was started at 98 °C for 3 min and followed by 30 cycles of 98 °C for 20 s, 55 °C for 10 s and 72 °C for 20 s, with a final extension at 72 °C for 5 min. Following the reaction, 2 μL of PCR product was taken for electrophoretic detection, and then the PCR product was diluted to 2 ng/μL. 10 μL of liz500 mixture (liz500:HiDi = 1:130) were added to each PCR product, which was then heated at 95 °C for 5 min and then cooled in ice water. Fluorescence detection was carried out on an ABI 3730 genetic analyzer (Applied Biosystems, USA).

Data analysis

Phenotypic data were organized and summarized by using Microsoft Excel 2016 software. For qualitative traits, distribution frequencies were based on grading criteria. For quantitative traits, six plants or six fruits per material were evaluated, and results for each accession were given a score based on a scale of 1 to 10, according to the mean (M) and standard deviation (S) of the aggregate data. A score of 1 was given to any value less than M − 2S and a score of 10 was assigned to any value greater than or equal to M + 2S. Intermediate scores were based on increments of 0.5 S. Distribution frequencies were obtained from these scores. Shannon indices (H′) were also calculated from the scores (H′ = − ΣPiLnPi, where Pi denotes the frequency of accessions with the particular trait score), as was coefficient of variation (CV = SD/M × 100%, where SD is the standard deviation and M is the mean value.)

GeneMapper 4.0 software was used to evaluate the sizes of amplified fragments (SSR and InDel alleles) from different samples. The number of unique alleles, gene diversity index, heterozygosity, and polymorphism information content (PIC) were calculated using Power Marker v3.25 software. Genetic distances were calculated by the method of Cavalli-Sforza and Edwards (1967)36 and were used to construct a phylogenetic tree based on the unweighted pair-group method of mathematical averages (UPGMA 5.10 Software) as implemented in MEGA (Molecular Evolutionary Genetics Analysis) software.

To infer the population structure of the 179 C. annuum accessions from China, we used the model-based program STRUCTURE v2.3.4. The optimal number of genetic groups (K) were computed by performing eight Markov chain Monte Carlo runs for each value of K from 1 to 10. Each run consisted of 100,000 iterations, with a burn-in period of 30,000 iterations, and used an admixture model allowing for Hardy–Weinberg equilibrium, correlated allele frequencies, and independent loci. A Pr(X|K) index with respect to each K was used to calculate ΔK (Evanno et al. 2005) where X denotes the genotypes of the sampled individuals37. According to this approach, the optimal K is estimated as the location of the first peak of A ΔK =|L2(K)|/s[Pr(x|k)], where |L2(K)| denotes the absolute value of the second order rate of change of Pr(X|K), and s[Pr(x|k)] is the standard deviation of the Pr(X-|K), and s indicates standard deviation of L(K). The principal coordinate analysis (PCA) was performed with DARwin 5.0.158 Software.

Ethics declarations

All plant experiments were carried out in accordance to relevant institutional, national, and international guidelines and legislation.

Results

Phenotypic diversity of qualitative traits

Twenty-two qualitative traits of 94 landraces and 85 current breeding lines were investigated, and the total number of variation types were 86, with a very rich variation in fruit morphology (Table 2 and Fig. 1). The Shannon Diversity indices of 22 qualitative traits of 94 landraces were between 0.21 and 1.55, with a mean value of 0.82. The Shannon Diversity indices of qualitative traits of 85 current breeding lines were between 0.06 and 1.39, with a mean value of 0.82. For overall comparison, the differences in Shannon Diversity indices for qualitative traits between local landraces and current breeding lines were not significant, indicating that the phenotypic diversity of the above-mentioned qualitative traits showed little difference between the two populations.

Figure 1
figure 1

Examples of mature fruit types fruit harvested from the 94 local landraces (A) and from the 85 current breeding lines (B).

There were six qualitative traits related to plant morphology, including plant habit, main stem color, stem pubescence, leaf shape, leaf color and leaf pubescence (Table 2). The Shannon Diversity Index of stem pubescence was the greatest, but there was no significant difference between landraces and current breeding lines, while the Shannon Diversity Index of leaf pubescence had the greatest difference between the two populations. The Shannon Diversity index for leaf pubescence of the landraces was higher than that for the current breeding lines by 0.19. In current breeding lines, the proportion with no hairs on the leaf surface was 72.9%, which was 31.5% higher than that of local landraces, while the types with weak and medium hairs on the leaf surface was lower by 29.3% and 2.1%, respectively. The Shannon Diversity index of leaf shape was 0.12 higher in current breeding lines compared to local landraces. In local landraces and current breeding lines, the ovate leaf shape predominated, making up 89.4% and 83.5%, respectively. The frequency of lines with lanceolate leaf shape was higher by 5.7% in current breeding lines as compared to the local landraces. The Shannon Diversity indices of the other three traits were not significantly different between the two populations. (Table 2).

Five qualitative traits related to flowers, including peduncle attitude, corolla color, anther color, stigma color and stigma exertion were also examined (Table 2). The Shannon Diversity indices for corolla color were the lowest, but the greatest difference was found between the two populations, with local landraces being 0.20 higher than the current breeding lines. The percentage of lines with white corolla was 94.7% and 98.8% in the two populations. The percentage of lines with light purple or purple corolla was 0 and 4.2% in current breeding lines and local landraces, respectively. Secondly, the major difference between these two populations was found in the Shannon Diversity index for stigma color. The Shannon Diversity index of current breeding lines was 0.14 higher than that of the landraces. The white stigma was also the main trait, accounting for 87.2% and 81.2% of the landrace and current breeding lines, respectively. The proportion of light purple and purple stigmas in current breeding lines was higher by 3.4% and 1.4%, respectively. These results indicated that corolla color and stigma color were not closely linked, but inherited independently. The Shannon Diversity indices of the other three traits, peduncle attitude, anther color and stigma exertion were higher than those for the above two traits, but there were no significant differences between the landraces and current breeding lines (Table 2).

A total of 11 qualitative traits were related to the fruits, among which the fruit shape had the highest number (9) of variation types (Table 2). The Shannon Diversity indices were also highest, which were 1.55 and 1.32 in landraces and current breeding lines, respectively, with the largest difference between them. For fruit shape, lantern shape was most frequent, with 35.1% and 44.7%, respectively. Ox’s horn shape and goat’s horn shape had 17.0% and 31.8% in landraces, but 33.0% and 4.7% in current breeding lines, respectively. The proportion of linear shape peppers in current breeding lines was higher by 9.9% compared with that in landraces. In the current breeding lines, fruit morphologies other than spherical, teardrop, and finger-shaped had been eliminated. The acute fruit apex comprised the largest proportion of fruit apex shapes, accounting for 56.4% and 45.9%, respectively. The frequency of acute apexes in current breeding lines were 10.5% lower, while correspondingly there was an increase of 3.1%, 4.4% and 2.9% in rounded, depressed and depressed with acute apexes, respectively, thus the Shannon Diversity index of the apex fruit shape in the current breeding lines was 0.13 greater than that of the landraces. The two forms of variation "present" and "absent" for the quality traits of fruit appearance were fruit glossiness and fruit navel appendages. Both of the quality traits were dominated by “present” in local landraces and current breeding lines. So, the Shannon Diversity indices for both traits were low, but differed significantly between the two populations. For fruit glossiness, glossy materials accounted for 98.8% of current breeding lines, which was 4.1% higher than that of landraces. The Shannon Diversity index of landraces was 0.14 higher than that of current breeding lines. Regarding fruit navel appendages, the percentage of current breeding lines without fruit navel appendages was 19.1% greater than that of the landraces, and the Shannon Diversity index was 0.13 greater compared to that of landraces (Table 2).

For the other seven fruit related traits, the Shannon Diversity index differences between two populations was not significant, but there were certain regularities in the frequency distribution of variation types (Table 2). In terms of fruit color at maturity, the proportion of lines with yellow and orange fruits increased 3.8% and 1.4% in the current breeding lines. For texture of fruit surface, the proportion of current breeding lines with slightly wrinkled fruit surface increased by 15.1%, while the proportion of smooth fruit surface decreased by 15.4%. In terms of fruit spiciness, the frequency of absent spicy lines in current breeding lines increased by 22.1%, while the proportions of mildly spicy, medium spicy and spicy lines all decreased accordingly. The above results showed that the proportion of colorful sweet pepper lines, large fruit lines, fruit surface with slightly winkled lines and absent spicy lines increased in current breeding lines (Table 2).

Phenotypic diversity of quantitative traits

Thirteen quantitative traits of 94 landraces and 85 current breeding lines were investigated. The Shannon Diversity indices for the 13 quantitative traits in the 94 local landraces ranged from 0.98 to 2.17, with a mean value of 1.73, while the Shannon Diversity indices for these 13 quantitative traits of the 85 current breeding lines ranged from 1.70 to 2.04, with a mean value of 1.90. Overall, the Shannon Diversity indices of the current breeding lines were higher than those of the local landraces (Table 3).

Plant height is a very important quantitative trait for breeding lines. The Shannon Diversity indices of plant height were 2.04 and 2.01 for landraces and current breeding lines, respectively, with no significant differences between them. Plant height of the local landraces ranged from 38.3 to 168.3 cm, with a range of 130.0 cm and a mean of 112.5 cm. The plant height of current breeding lines ranged from 62.8 to 143.3 cm, with a range of 80.5 cm and a mean of 102.0 cm. As compared with the local landraces, the current breeding lines were on average 10.4 cm shorter than the landraces. The results showed that the plant height differences among landraces were very large, while the plant height differences among current breeding lines were less, as was the average plant height (Table 3).

Leaf morphology included three quantitative traits, including length of blade, width of blade and petiole length. Compared with landraces, the Shannon Diversity indices of the three traits of current breeding lines increased by 0.5, 0.34 and 0.43 respectively, and the coefficient of variation increased by 7.15%, 4.15% and 5.66% respectively. Comparing the mean values of the three traits in the two populations, it was found that the average length and width of blade in the current breeding lines increased by 3.91 cm and 1.88 cm, respectively, and the average petiole length increased by 2.25 cm (Table 3).

Data of first flower and the first flowering node are maturity-related traits. The Shannon Diversity indices of the two traits in landraces were 1.93 and 2.06, respectively, while those in current breeding lines were 1.70 and 1.75, a reduction of 0.23 and 0.31, respectively. Comparing the maximum, minimum, range and average values of the two traits in the two populations, it was found that the longest flowering period in the two populations were both 124.4 days and the shortest were 73.6 days and 54.1 days, respectively. The average time of beginning of flowering of the current breeding lines was 8.2 days less than that of the local landraces. The first flowering node of local landraces ranged from 5.2 to 23.8 with a mean value of 10.2, while the range of first flowering node of current breeding lines ranged from 4.5 to 16.3 with a mean value of 9.7 (Table 3). The above results showed that the flowering characteristics related to maturity varied greatly among landrace lines, while these characteristics for current breeding lines were earlier than those of the local landraces.

Seven quantitative traits related to fruit included fruit pedicel length, single fruit weight, fruit longitudinal diameter, fruit transverse diameter, thickness of flesh, number of locules and the fruit firmness (Table 3). Compared with local landraces, the Shannon Diversity indices of current breeding lines for single fruit weight, fruit longitudinal diameter, fruit transverse diameter and thickness of flesh increased by 0.88, 0.38, 0.14 and 0.36, respectively. The single fruit weight of local landraces ranged from 1.1 to 202.1 g, with a mean value of 37.8 g. The single fruit weight of current breeding lines ranged from 6.4 to 409.8 g, with an average value of 121.7 g. The mean value of single fruit weight of current breeding lines was 3.2 times higher than that of landraces. Fruit longitudinal diameter, transverse diameter and thickness of flesh were closely related to single fruit weight. The average values of the above three traits in current breeding lines increased by 8.9 cm, 1.4 cm and 0.13 cm, respectively. The Shannon Diversity indices of fruit pedicel length and number of locules in current breeding lines was less 0.13 and 0.17, respectively, compared to that of landraces. The fruit pedicel lengths of local landraces and current breeding lines ranged from 2.4 to 7.3 cm and 2.6 to 9.0 cm, respectively, with an average value of 4.1 cm and 5.0 cm, respectively. The average fruit pedicel length of current breeding lines was 0.9 cm longer than that of landraces. There was no difference in the range of the number of locules between landraces and current breeding lines, but the mean number of locules in landraces and current breeding lines were 2.8 and 3.2, respectively. This result may be related to the 9.6% increase in the proportion of lantern-shaped pepper in current breeding lines. Compared with landraces, the mean of fruit firmness was less 3.95 Pa in current breeding lines, which may be due to the higher proportion of lines with large fruit in current breeding lines (Table 3).

Genetic diversity based on SSR and InDel markers

The genotyping of 94 landraces and 85 current breeding lines revealed 26 SSR and 1 InDel marker loci diversity (Tables 4, 5).

Table 4 Genetic diversity analysis of 94 local landraces at 26 polymorphic simple sequence repeat (SSR) loci and 1 insertion-deletion (InDel) loci.
Table 5 Genetic diversity analysis of 85 current breeding lines at 26 polymorphic simple sequence repeat (SSR) loci and 1 insertion-deletion (InDel) loci.

A total of 251 alleles were amplified from 27 markers in the landraces. The number of alleles at different loci ranged from 2 to 20, and the average number of alleles per locus was 9.3. Gene Diversity index (GDI) and Polymorphism Information content (PIC) ranged from 0.16 to 0.88 and 0.14 to 0.87, with an average value of 0.56 and 0.53, respectively. Among these markers, the GDI and PIC of marker Epms331 were the highest, which were 0.88 and 0.87, respectively. The GDI and PIC of marker Epms391 was the lowest, which were 0.16 and 0.14, respectively (Table 4 and Supplementary Fig. 1). Observed heterozygosity was highly variable between loci (0.01–0.94), with an average value of 0.16, indicating that the local landraces were not fixed to homozygosity (Table 4).

A total of 175 alleles were amplified from 27 markers in the current breeding lines, with a range of 2 to 19 alleles at different loci and an average number of 6.48 alleles per locus. The GDI and PIC varied from 0.01 to 0.84 and 0.01 to 0.82, respectively, with mean values of 0.48 and 0.45. The GDI and PIC of Epms397 were the highest, which were 0.84 and 0.82, respectively. The lowest GDI and PIC were found for marker Es350, both of which were 0.01 (Table 5 and Supplementary Fig. 2). The Heterozygosity ranged from 0.01 to 0.93, with an average value of 0.07 in the current breeding lines (Table 5).

Population genetic structure and phylogenetic tree analysis

To investigate the differences of genetic background between 94 landraces and 85 current breeding lines, a population genetic structure analysis was performed by using the 27 markers. The ΔK analysis representing the population structure (Fig. 2A) showed that the ΔK value was maximum at K = 2, i.e., the 179 pepper lines could be divided into two groups, Pop1 and Pop2, where Pop1 contained 70 lines and Pop2 contained 109 lines (Fig. 2B).

Figure 2
figure 2

Results of population structure analysis based on molecular markers present in 94 Local landraces (green) and 85 current breeding lines (red) of pepper. (A) Distribution of ΔK as determined by eight Markov chain Monte Carlo runs. K, the optimal number of genetic groups. (B) Two pop inferred by genetic structure analysis. Red mean current breeding lines, Green mean local landraces. (C) Principal coordinate analysis of 179 accessions without missing simple sequence repeat (SSR) genotype data. In (C) red mean current breeding lines, green mean local landraces.

The Pop1 group was predominated by current breeding lines, with 55 current breeding lines (78.5%), and 15 landraces (21.5%). The Pop2 group was predominated by landraces, with 79 local landraces (72.5%), and 30 current breeding lines, (27.5%). The number of lines with Q values greater than 0.6 in Pop1 and Pop 2 were 62 and 95, accounting for 88.6% and 87.2%, respectively, and 8 and 14 lines with Q values less than 0.6, respectively. The above results indicated that most of the Pop1 and Pop2 lines had a separate sources of genetic components, but some lines combined genetic components from both sources. There was some genetic information shared between the groups. The results for all 179 pepper lines were subjected to principal coordinate analysis, which supported the classification of the two groups, and some lines shared genetic components for both sources (Fig. 2C).

The genetic divergence of the 179 lines ranged from 0.04 to 0.90 with an average genetic distance of 0.50. The genetic distances between the GW422 and B644 lines were the most distant, while the genetic distances between the B507 and B596 lines were the closest. Based on the genetic distance, the Neighbor Joining (NJ) cluster tree was constructed. The results showed that the 179 lines could be divided into two clusters. Cluster I included 85 lines, including 54 current breeding lines and 31 local landraces, accounting for 63.53% and 36.47%, respectively. Cluster II included 94 lines, including 63 local landraces and 31 current breeding lines, accounting for 67.02% and 32.98%, respectively. In the clustering process, some lines with similar agronomic traits were clustered together, such as B227, B247, B262, and B282were all long lantern-shaped, thin-flesh and mildly spicy peppers, while B305, GW090, GW143, GW221, GW160, and GW225 were all long lantern-shaped and thick-flesh sweet peppers. Meanwhile, the ox-horn-shaped and goat-horn-shaped lines, which accounted for a high proportion, also had similar preferential clustering. However, some lines of different fruit shape were clustered together, probably because some of the current breeding lines were obtained by purification after crossing of different fruit shapes (Fig. 3).

Figure 3
figure 3

Phylogenetic tree of 94 local landraces (No. start with GW) and 85 current breeding lines (No. start with B) based on SSR and InDel markers using the NJ clustering method. (UPGMA 5.10, https://www.megasoftware.net/.

Discussion

In this study, the Shannon Diversity indices of qualitative traits of current breeding lines were higher for fruit shape, depth of stalk cavity, fruit color (before maturity) and fruit spiciness (Table 2). All the above four traits were associated with fruit organs, indicating that fruit diversity was abundant, which was consistent with the results of previous studies20,22,38. The Shannon Diversity indices of qualitative traits of current breeding lines were lower for corolla color, fruit glossiness and leaf color (Table 2), indicate that there was less genetic diversity. Compared to the Shannon Diversity indices of qualitative traits, the Shannon Diversity indices of quantitative traits was higher, implying more diversity of quantitative traits, (Tables 2, 3). This result was consistent with those of in previous reports11,12. Among the 13 quantitative traits, the Shannon Diversity indices of the length of fruit pedicel was highest, followed by the plant height (Table 3). The Shannon Diversity indices of the length of fruit pedicel in this study was higher than the results of the previous study20,22,38. The coefficients of variation for single fruit weight and fruit transverse diameter were over 50% (Table 3), indicating great variation among lines within the population.

Analysis of 22 qualitative traits in 94 local landraces and 85 current breeding lines revealed significant differences in Shannon Diversity indices of eight qualitative traits between the two populations. In comparison with local landraces, the Shannon Diversity indices of four qualitative traits, including leaf pubescence, corolla color, fruit shape, and fruit glossiness in current breeding lines were lower. While the Shannon Diversity indices for leaf shape, stigma color, fruit shape of apex, and fruit navel appendages were higher in current breeding lines (Table 2). Specifically, for each trait, the percentages of current breeding lines without leaf pubescence, glossy fruit surface, lanceolate leaf shape, light purple and purple stigma, rounded, depressed and depressed with acute of fruit apex and without fruit navel appendages increased by 31.5%, 4.1%, 5.7%, 6.0%, 10.5% and 19.1%, respectively. In contrast, current breeding lines with purple or light purple corollas and finger-shaped, spherical-shaped and teardrop-shaped fruits were likely discarded in the selection process. The above results indicated that the increase in the proportion of current breeding lines with lanceolate leaf shape and rounded, depressed and depressed with acute fruit apex reflected the diversification of current breeding lines, while the breeding process improved the commercial acceptance of fruit appearance by improving the fruit surface gloss and reducing the fruit navel appendages.

To summarize, for the flowering characteristics related to maturity, the Shannon Diversity indices of date of first flower and first flowering node in current breeding lines were less, with flowering time being earlier and first flowering node being lower. The above results indicated that the diversity of flowering characteristics in current breeding lines were reduced and flowering stages tended to be earlier. For leaf-related traits, the Shannon Diversity indices for length of blade, width of blade and petiole length of current breeding lines were higher, while the mean values were greater, 3.9 cm, 1.9 cm and 2.3 cm, respectively. Compared to local landraces, the Shannon Diversity indices for five of the seven fruit organ-related traits in current breeding lines were higher, indicating increased fruit diversity during artificial selection. We thought the main reason for this result might be that most of the current breeding lines were obtained by isolation, purification and improvement of market varieties, including all kinds of varieties from different regions of China. In addition, the diversity of cultivation facilities as well as consumption diversity in China may also influence the genetic diversity of current breeding lines. The aforementioned findings showed that in order to adjust to the facilitated cultivation conditions and market demand, breeders preferred the lines with large leaves, earlier flowering, lower flowering nodes, and higher single fruit quality during these election process. However, the increase of the proportion of large fruit lines likely led to an increase in the number of fruits locules and a decrease in fruit firmness.

Compared with the current breeding lines, the local landraces had 76 more alleles in 27 markers, with 2.8 more alleles per locus, and the mean values of GDI and PIC were 0.08 and 0.09 higher, respectively. In addition, the mean value of heterozygosity of 27 markers in the landraces was 0.16, while the heterozygosity of the current breeding lines was 0.07. The results indicated that some accessions of the local landraces were more Heterozygosity, while the purity of current breeding lines was higher than that of local landraces (Tables 4, 5). The above results indicated that the genetic diversity of local landraces was higher than that of current breeding lines, while the heterozygosity of current breeding lines was less than that of local landraces.

While using the same SSR molecular markers as used in this study, the germplasm resources used in the previous studies were more numerous and more broadly sourced compared to those used in the present study, so the GDI and PIC values were higher in those studies compared to those reported herein. For example, Nicola et al. analyzed 1352 pepper accessions using 28 SSR markers, and the mean values of the GDI and PIC were 0.70 and 0.67, respectively9. Zhang et al. analyzed 372 landraces of China by using 28 SSR markers, and the mean values of the GDI and PIC were 0.63 and 0.60, respectively11. However, Lee et al. analyzed 3821 accessions of 11 Capsicum species using 48 SNP markers, and the mean value of PIC was 0.38. Among them, 3383 C. annuum accessions had the highest PIC of 0.6910. Cheng et al. used 5149 SNP markers to analyze the genetic diversity of 398 accessions (local landraces and backbone parents) with GDI and PIC of 0.36 and 0.29, respectively39. Genetic diversity analysis of a 200-member core collection using 61 SSR markers showed the mean values of GDI and PIC were 0.40 and 0.35, respectively23. The above results indicated that the diversity of pepper accessions of China was lower compared with accessions from some other countries, and artificial selection has led to a further reduction in the genetic diversity of breeding lines.

The population genetic structure analysis and principal coordinate analysis indicated that there were relatively significant differences of the genetic backgrounds between the 94 local landraces and 85 current breeding lines, but some lines in the two populations likely shared genetic backgrounds. More similar to the results of the present study, Gu et al. clustered 1904 germplasm resources of China into two major taxa by population structure analysis and principal coordinate analysis12. Cheng et al. used 5149 SNP markers to classify 398 local landraces and backbone lines into two different taxa39. Meanwhile, phylogenetic tree analysis in this study showed that the clustering results based on genetic distance were correlated with fruit shape, and germplasms of the same fruit shape were more likely to be clustered together. This result was consistent with several previous reports12,23.