Morphological and molecular characterization of some pumpkin (Cucurbita pepo L.) genotypes collected from Erzincan province of Turkey

Plant genetic resources are among the most valuable assets of countries. It is therefore important to determine the genetic variation among these resources and to use the data in breeding studies. To determine the genetic diversity among genotypes of pumpkin (Cucurbita pepo L.), which is widely grown in Erzincan, 29 pumpkin genotypes were collected and examined for morphological and molecular characteristics. SSR (Simple Sequence Repeat) markers were used to determine genetic diversity at the molecular level. Morphological characterization showed wide variability among genotypes in plant, flower, fruit, and leaf traits. In the evaluation performed using SSR markers, all primers exhibited a polymorphism rate of 100%. Seven SSR markers yielded a total of 15 polymorphic bands; the number of alleles per marker ranged from 2 to 3, with a mean of 2.14. Polymorphic information content (PIC) ranged from 0.06 (GMT-M61) to 0.247 (GMT-P41), and the mean PIC value per marker was 0.152. Cluster analysis using Nei's genetic distance divided the 29 genotypes into 4 major groups. The present findings reveal the genetic diversity among pumpkin genotypes collected from Erzincan province and may form the basis for further breeding studies in pumpkin.

The family Cucurbitaceae comprises about 118 genera and 825 species 1 . The genus Cucurbita, which belongs to this family, is among the genera that show the greatest diversity in morphological characteristics. This genus consists of 22 wild and 5 cultivated species 2 . C. maxima Duch. (winter squash), C. moschata Duch. ex Lam. (butternut squash), C. pepo L. (pumpkin/summer squash), C. argyrosperma Huber (syn. C. mixta Pang.) and C. ficifolia Bouché are the important cultivated species 3 . Cucurbita pepo L. is an important species of the Cucurbitaceae family with high economic value and genetic diversity 4 and shows wide variation in fruit characteristics such as fruit size, shape and color. Although Turkey is outside the area of primary genetic diversity for Cucurbita species, its geographical location and favorable ecological conditions have allowed Cucurbita species to develop significant genetic diversity over the years 5 . However, despite the agricultural and biological importance of squash/pumpkin (Cucurbita spp.) species, molecular studies have been very limited so far. Today, the widespread use of biotechnological methods has provided many advantages in crop breeding. Different DNA markers have been used successfully in diversity studies evaluating inter- and intra-species genetic relationships. Many studies have examined genetic diversity among Cucurbita species using molecular markers such as amplified fragment length polymorphism (AFLP) 6 , random amplification of polymorphic DNA (RAPD) 7 , inter simple sequence repeat (ISSR) 8 , sequence related amplified polymorphism (SRAP) 9 , and simple sequence repeat (SSR) 10 . Allozymes and different DNA marker systems (RFLP, AFLP, ISSR) have been used to determine genetic variability within Cucurbita pepo L. 8,11,12 . Most marker systems used to date have limitations associated with their dominant and/or unreliable nature.
Simple sequence repeats (SSRs) are suitable for detecting variation within varieties because they are reliable, co-dominant and highly polymorphic, and they detect high levels of allelic diversity 13 . After these markers were first found in humans 14 , they began to be used in other living organisms as well. SSRs are repetitive DNA sequences of 1-6 base pair units 15 . Some have functional significance in chromatin organization, regulation of gene activity, and recombination 17 , but they are more often apparently randomly distributed in the nonfunctional genomic regions. SSR markers can be used effectively in population genetics and gene mapping studies because of their advantages as an informative marker system: they require small amounts of DNA, are codominant and stable, are abundant and scattered throughout the genome, are reproducible and suitable for automation, and show a high level of polymorphism 18 . The SSR technique has been used successfully to assess genetic diversity in cucurbit species such as pumpkin/squash [19][20][21][22] , bowler 23 , snake melon 24 , watermelon 25,26 , bitter melon 27 , and cucumber 28 . The rate of cross-fertilization in pumpkin is very high. Because of cross-pollination, lines that differ from the original seed may arise, leading to increased genetic variation. Over time, pumpkin cultivars have spread to the regions of our country through both natural and artificial selection, and different populations have formed in these regions. Plant genetic resources of this type form the basis of the genetic material used in breeding studies in our country. It is therefore important to prevent the disappearance of such local genetic resources so that they can be used in breeding studies. A comprehensive characterization study covering both morphological and molecular parameters has not yet been carried out in Erzincan province.
In this study, we aimed to determine the degree of genetic relationship at the molecular level using SSR markers, as well as the morphological characteristics, of certain pumpkin genotypes grown in Erzincan province.

Material and method
Plant material. In this study, the 29 pumpkin genotypes were collected from different regions of Erzincan   Table 2.

Data analysis.
The PIC value of each SSR marker, a codominant molecular marker system, was calculated from the allelic data using the Power Marker 30 program 31 . Genetic variation within genotypes was determined using Nei's gene diversity index 32 and the Shannon information index 33 with the Popgen program 34 . NTSYS-pc version 2.11f 35 was used for the clustering analysis of the data set obtained from the SSR markers. The clustering was performed with the SAHN subprogram using the unweighted pair group method with arithmetic mean (UPGMA). The STRUCTURE 2.2 program was used to determine the genetic structure of the genotypes 36 . In many genetic diversity studies with pumpkin, genotypes have been successfully separated into groups using the STRUCTURE program 37,38 . The F-statistic (FST) value reflects the variation between sub-populations 39 . Principal coordinate analysis was performed with the GenAlEx program to better understand the diversity among genotypes.
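As an illustration of the diversity statistics named above, the following minimal Python sketch computes PIC, Nei's gene diversity and the Shannon information index from the allele frequencies at one locus. It assumes the commonly used Botstein-type PIC formula, the simple (uncorrected) form of Nei's gene diversity, and a natural-logarithm Shannon index; whether the programs used in this study apply exactly these variants is an assumption. The function names and the example allele calls are hypothetical.

```python
import math
from collections import Counter


def allele_frequencies(allele_calls):
    """Return {allele: frequency} from a flat list of allele calls at one locus."""
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def nei_gene_diversity(freqs):
    """Simple (uncorrected) gene diversity: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Assumed Botstein-type PIC: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return nei_gene_diversity(freqs) - cross


def shannon_index(freqs):
    """Shannon information index with natural logarithm: I = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


if __name__ == "__main__":
    # Hypothetical allele calls for a single two-allele SSR locus.
    calls = ["A", "A", "A", "B"]
    p = list(allele_frequencies(calls).values())
    print("H   =", round(nei_gene_diversity(p), 4))
    print("PIC =", round(pic(p), 4))
    print("I   =", round(shannon_index(p), 4))
```

Under this formula, a locus with two alleles cannot exceed a PIC of 0.375 (reached at equal allele frequencies), so low PIC values are expected when most markers amplify only two alleles.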

Results
Morphological properties of pumpkin genotypes. In this study, 29 pumpkin genotypes belonging to Cucurbita pepo were collected from different locations in Erzincan province. This pumpkin population was characterized according to morphological and molecular traits. Because morphological traits change in response to external conditions, it is important to support morphological variation with molecular studies. Morphological features of the genotypes are given in Tables 3, 4 and 5. Significant morphological differences in plant phenotype, leaf, flower and fruit characteristics were observed among the collected Cucurbita pepo genotypes. The plant growth habit was creeping in 14 genotypes, semi-creeping in 10 genotypes and shrub in 5 genotypes. Branching was observed in 24 genotypes, while the other 5 genotypes did not branch. Leaf attitude of petiole was erect in 16 genotypes and semi-erect in 13 genotypes. In addition, pumpkin genotypes showed high variation in leaf characteristics such as leaf blade size, incisions of leaf blade, green color of leaf blade and green color of petiole. Incisions of leaf blade were weak in 11 genotypes, medium in 9 genotypes, strong in 1 genotype and very strong in 1 genotype, whereas in 7 genotypes incisions of leaf blade were absent (Table 3). As with other morphological features, there was variation among genotypes in flower traits (male and female). In female flowers, 10 of the genotypes had a ring at the inner side of the corolla and 19 genotypes had no ring. In terms of pistil color in female flowers, genotypes were divided into 2 groups, yellow and orange. In the vast majority (approximately 76%) of the genotypes the pistil color was yellow.
Based on the expression of the colored ring at the inner side of the corolla of male flowers, genotypes were divided into 5 groups: absent, weak, medium, strong and very strong. The largest group (11 genotypes) had strong expression of the colored ring at the inner side of the corolla. Genotypes were divided into 3 groups, yellow, yellow-green and green, according to the color of the pedicel of the male flower: 12 genotypes had yellow, 9 genotypes had yellow-green and 8 genotypes had green pedicels. Genotypes also differed in the hairiness of the pedicel of the male flower and were divided into 3 groups based on this trait: 9 genotypes were classified as weak, 11 genotypes as medium and 9 genotypes as strong (Table 4). In addition, pumpkin genotypes showed high variation in fruit shape and skin color. The fruit shape was transverse elliptical in 8 genotypes, wide elliptical in 8 genotypes, elliptical in 6 genotypes, transverse wide elliptical in 4 genotypes, cylindrical in 2 genotypes and ovoid in 1 genotype. Four different major skin colors were determined among the pumpkin genotypes: cream (6 genotypes), yellow (2 genotypes), orange (1 genotype) and green (20 genotypes) (Table 5).
SSR analysis. The seven SSR markers yielded a total of 15 polymorphic bands. The number of alleles per marker ranged from 2 (the other markers) to 3 (GMT-P68 marker) and the mean number of alleles was 2.14 (Table 6). The PIC value ranged from 0.06 (GMT-M61) to 0.247 (GMT-P41), with a mean of 0.152. The markers GMT-P41, GMT-P25 and GMT-P68 were the best of the markers used for discriminating between genotypes because of their higher PIC values.

Cluster analysis and principal component analysis for SSR markers.
Comparative analysis of molecular sequence data enables the determination of proximity or distance between genotypes as well as the construction of a phylogenetic tree for clustering genotypes. For this purpose, cluster analysis was performed between pumpkin genotypes using UPGMA based on Nei's genetic distance. The Dice genetic similarity coefficient, which is often used to estimate genetic distance, was used to estimate genetic diversity. The highest genetic difference (0.63) was found between genotypes ≠ 36 and ≠ 46. As a result of the analysis, pumpkin genotypes were divided into four major groups. The first cluster mostly included genotypes from the Bahçeliköy (60%), Cevizli (90%), Çatalarmut (100%), Çayırlı (100%), Üzümlü (100%) and Ortayurt (50%) locations. The second group contained only a single genotype from the Bahçeliköy location (≠ 3). The third group contained a single genotype from each of the Bahçeliköy (≠ 2) and Ortayurt (≠ 51) locations. The fourth group contained 4 genotypes collected from the Cevizli (≠ 46) and Ortayurt (≠ 49, ≠ 50 and ≠ 53) locations (Fig. 1). According to present findings, the genotypes Bahçeliköy (≠ 1, ≠ 2), Çatalarmut (≠ 7,

Genetic structure analysis of SSR markers. ΔK is used to determine the optimal value of K. The highest value in our study was obtained at K = 4 (Fig. 3). The low number of subpopulations (K value) in our study is thought to be due to the high gene flow between the sample collection regions. Similar results have been reported for the population structure of pumpkin genotypes in other studies 21 . In our study, 22 genotypes were found in the first subpopulation, 1 genotype in the second subpopulation, 2 genotypes in the third subpopulation, and 4 genotypes in the fourth subpopulation (Fig. 4; Table 7).
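The clustering step described in the cluster analysis above can be illustrated with a short, pure-Python sketch. It scores hypothetical band presence/absence profiles with the Dice coefficient, converts similarity to distance as 1 - S (an assumption; the study reports both Nei's genetic distance and the Dice coefficient), and merges clusters by UPGMA average linkage. The genotype names and profiles are invented for illustration and are not data from this study, and the sketch is not the NTSYS-pc implementation.

```python
from itertools import combinations


def dice_similarity(x, y):
    """Dice coefficient for two binary band profiles: 2a / (2a + b + c)."""
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)      # bands shared by both
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)  # bands only in x
    c = sum(1 for xi, yi in zip(x, y) if yi and not xi)  # bands only in y
    denom = 2 * a + b + c
    # Two empty profiles are treated as identical (a convention of this sketch).
    return 1.0 if denom == 0 else 2 * a / denom


def upgma(profiles):
    """Average-linkage clustering on Dice distances (1 - similarity).

    profiles: dict mapping genotype name -> list of 0/1 band scores.
    Returns a list of merges as (cluster_a, cluster_b, merge_distance),
    where each cluster is a tuple of genotype names.
    """
    dist = {
        frozenset((p, q)): 1.0 - dice_similarity(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }

    def average_distance(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


if __name__ == "__main__":
    # Hypothetical band profiles (1 = band present, 0 = band absent).
    example = {
        "G1": [1, 1, 0, 1],
        "G2": [1, 1, 0, 1],
        "G3": [0, 1, 1, 0],
    }
    for left, right, height in upgma(example):
        print(left, right, round(height, 3))
```

Reading the merge heights from first to last gives the branching order of the dendrogram: genotypes with identical profiles join at distance 0, and more distinct genotypes join later.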
The FST (F-statistics) values in the first, second, third and fourth subpopulations were determined as 0.0399, 0.0217, 0.072 and 0.000, respectively (Table 8).
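The ΔK criterion mentioned in the genetic structure analysis above can be sketched as follows. This is a minimal illustration that assumes the widely used ΔK definition, in which the absolute second-order change of the mean log-likelihood L(K) across replicate runs is divided by the standard deviation of L(K). The log-likelihood values are hypothetical placeholders, not output from this study, and the function name is invented for the example.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute delta K for each K that has both neighbours K-1 and K+1.

    log_likelihoods: dict mapping K -> list of ln P(D) values from
    replicate runs (at least two runs per K are needed for stdev).
    Returns a dict mapping K -> delta K.
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the ends of the tested range
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(log_likelihoods[k])
        if sd > 0:  # skip K values whose runs have no variance
            result[k] = second_diff / sd
    return result


if __name__ == "__main__":
    # Hypothetical ln P(D) values for two replicate runs at each K.
    runs = {
        1: [-100.0, -102.0],
        2: [-80.0, -82.0],
        3: [-78.0, -80.0],
        4: [-77.0, -79.0],
    }
    dk = delta_k(runs)
    best_k = max(dk, key=dk.get)
    print(dk, "best K:", best_k)
```

The K with the largest ΔK is taken as the most likely number of subpopulations. Because ΔK needs both neighbours, it cannot be evaluated for the smallest or largest K tested.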

Discussion
Morphological characterization showed wide variation among genotypes in plant, flower, fruit and leaf characteristics. Many studies of the Cucurbitaceae family have emphasized that diversity in morphological characteristics is high [40][41][42][43] . In a similar study 8 , pumpkin genotypes showed high diversity in fruit characteristics. Another study 44 showed that the major color of the skin was yellow in 21 (24%) pumpkin genotypes, green in 2 (2%), green-yellow grayish in 15 (18%), dark yellow-green grayish in 22 (27%), light yellow in 17 (21%) and dark yellow in 4 (5%). In the present study, the 7 SSR markers used in pumpkin genotypes yielded a total of 15 bands, and the mean number of alleles per locus was 2.14. The SSR method has been successfully applied to various species to identify genetic relationships 21,[45][46][47][48] . These markers are very effective tools in genetic diversity and association studies because of their highly polymorphic nature and transferability [49][50][51] . In similar studies of Cucurbita pepo, researchers reported a mean of 3 alleles amplified per SSR marker 21,52 , which is similar to the result of our study. Many studies using SSR markers have stated that these markers successfully detect polymorphism and diversity in species belonging to the genus Cucurbita 11,52,53 . Polymorphic information content (PIC) is an important value that evaluates the efficiency of polymorphic loci and determines the discrimination ability of markers. PIC values vary among studies with the number of SSR markers used, the number of genotypes and the analysis method. In other studies with SSR markers, the PIC value was between 0.49 and 0.75 for melon and between 0.18 and 0.64 for cucumber.
Of the markers, PKCT111 was considered the most informative as it showed the greatest genetic variation 54 . In a study conducted in Kenya with 96 pumpkin samples using SSR markers, the mean PIC value was 0.49, and cluster analysis showed that the level of similarity between genotypes was high 55 . In the present study, 4 groups were identified based on genetic structure analysis and UPGMA analysis. Principal component analysis (PCA) presents the spatial distribution of relative genetic distance between populations 56 . In the present study, PCA was performed for better and more detailed visualization of the variation within and between the populations. With the aid of this method, a 2-D diagram is generated from the similarity or distance matrix between the genotypes, and the distances between the resulting groups reflect the actual distances 57 . Expanding our knowledge of the genetic variation of genotypes is crucial for crossbreeding studies that aim to obtain lines resistant to various stress conditions or more productive varieties. Therefore, the assessment of genetic variability in the gene source is the first step, called pre-breeding, in improving and developing superior varieties. SSRs with high polymorphism information content successfully assisted in the differentiation of genotypes in this study. The results of this study suggest that SSR analysis can be used successfully in the estimation of genetic diversity among   Table 4).