## Introduction

The migration of species in a new environment means that the environment offers favorable conditions for the species’ survival and accommodation. Such migration and accommodation are often accompanied by the expression of different phenotypes depending on the environment and the interaction between genotype and environment1,2,3,4,5,6. Their persistence in the new environment along with natural selection result in their adaptation, and other time, increase their fit to the environment7,8,9. The processes leading to the fitness of the species in the new environment involve natural random mating and genes transfer from parents to progeny10, mutation and genes recombination which result in an assortment of genotypes with dominance, co-dominance, and recessive individuals. Natural selection eventually removes the purely recessive genotypes from the population as they often cannot survive in the new environment. These processes occur gradually and lead to genetic variability. An illustration is the different degrees of resistance to pesticides or heavy metal toxins11,12.

Genetic variability is the driving force of species persistence and evolution. It is the natural insurance policy as it provides an insurance for the species survival against the unexpected. Natural random mating allows the transfer of genes from individuals to others. In addition to the gene transfer, the recombination of genes and genetic mutation are the main cause of genetic variability that insures the survival and the evolution of species. In agricultural research, genetic variability is the backbone of cultivar improvement13,14,15,16,17,18,19. It makes the genetic resources available for undertaking the breeding program for crop improvement. For example, the introgression of quantitative trait loci on southern leaf blight resistance in maize hybrids20 was feasible due to the genetic variation for the disease. Modern biotechnology tools can be used to introduce foreign genes into the genome of an individual21, thereby enhancing the genetic variability of the species as the introduced genes are heritable and can be passed to the progeny. With the importance of genetic variability in a breeding program, germplasm collection, maintenance and regular assessment have been an important part of the activities in research institutions22,23,24. The current study concerns the Cucurbita moschata germplasm in Cote d’Ivoire, precisely its morphological characterization as part of a program of Cucurbita moschata improvement.

Cucurbita moschata originated from Latin America25,26 and was an important crop for the indigenous people who first cultivated the crop for its edible seed and fruit27. From Latin America where it is first cultivated, Cucurbita moschata migrated to the Caribbean Islands where it adapted to the local ecosystems and further diversified28,29. After the discovery of America and through the intercontinental voyages that followed, Cucurbita moschata was spread to Europe, Asia and Africa and became naturalized to those continents, with increased genetic diversity in the new environments27,30. The diversification that resulted from the adaptation and acclimation of C. moschata to the diverse regions of the world created geographical subspecies31. Other studies add support to the geographical subspecies formation with the identification of distinct clusters based on geographic origins19,32,33.

C. moschata is an annual crop that grows well in warm tropical areas17,34,35,36. It is even called “tropical pumpkin”35. It is very well suited to the climatic conditions of Africa. Unfortunately, it is not yet an important revenue-generating crop in Africa and it is neglected across the whole continent. This neglect is especially true in Cote d’Ivoire. Cote d’Ivoire is a country where agriculture occupies more than 60% of the active population and contributes up to 35% to its gross domestic product. However, most of the agricultural research activities and crop promotion are focused on cash crops for export. Food crops for local consumption have received very little attention from the public authorities. And C. moschata has totally been ignored in agricultural research programs until now. In Cote d’Ivoire, C moschata is sparsely cultivated by small-scale farmers. To date, they are the only ones with any information on the vegetable crop. Given its lauded nutritional and medicinal importance34, consumption of the fruit pulp, the seed, the leaves and the flowers in diverse dishes is growing in Cote d’Ivoire and necessitates programs for the collections of the various accessions of C. moschata, the characterization, the conservation and the improvement of C. moschata genetic resources in Cote d’Ivoire. We conducted the present study to determine the agro-morphological characteristics of the different accessions of C. moschata collected from the main growing regions of Cote d’Ivoire and a region in Burkina Fasso. Because the environmental conditions of the growing regions are strikingly different, we evaluated the possibility of ecotypes of C. moschata in relation to those ecosystems. All the inferences in this study are based on the quantitative measures of the vegetative, phenological and yield traits.

## Results

### Description of the phenological, vegetative and yield traits of the accessions per habitat

The process of data management included the computation of mean squares for the assessed phenological, vegetative and yield traits of the accessions with the sampling habitats considered as the treatment factor. The error mean squares served in the multiple comparison of means reported in Table 1.

Regarding the phenological traits, the accessions from the habitat of Zh have the longest period from seeding to first male (102.39 d) and first female (108.14 d) flower appearances, and the longest period from seeding to physiological maturity (153.95 d). For those traits, the accessions from Tiassale and Soubre are not significantly different from those of Zh. And, accessions from Tiassale and Zh have the longest periods from seeding to 50% flowering. On the other hand, accessions from Korho, Ferke, Bondu and Burki develop their first male and female flowers and attain 50% flowering in a very short period. They also reach physiological maturity faster. Accessions from Korho, however, have the longest period from seeding to 50% emergence (6.07 d) and accessions from Bondu have the longest period from first female flower appearance to physiological maturity (53.04 d).

For the vegetative traits, accessions from Tiassale and Soubre have the largest girth size (4.43 cm and 4.63 cm, respectively). Accessions from Tiassale have the longest (24.98 cm) and widest (19.94 cm) leaves, the longest male (16.2 cm) and female (4.03 cm) peduncles and the longest petioles (34.94 cm). The measures for those organs on accessions from Soubre rank second to those of Tiassale. On the other hand, accessions from Korho, Ferke, Bondu and Burki are characterized by smaller girth size, smaller leaves, smaller petioles and smaller peduncles of male and female flowers. But the accessions from Bondu are the tallest (586.91 cm) followed by the accessions from Ferke (489.20 cm). And the accessions from Zh are the shortest (417.38 cm).

For the flowering and yield traits, accessions from Tiassale and Soubre show the largest numbers of male (27.33 units and 22.58 units, respectively) and female (5.22 units and 6.05 units, respectively) flowers per plant, largest numbers of fruits per plant (2.78 units and 2.53 units, respectively) and largest measures of all fruit-related traits. Their seeds are very large, but in small numbers. In contrast, accessions from Korho, Ferke, Bondu and Burki have the smallest numbers of male and female flowers per plant, the smallest numbers of fruits per plant and the smallest measures of fruit-related traits. They have large numbers of seeds, but their seeds are smaller, except the seeds of the accessions from Burki. Refer to Table 1 for more detailed information.

### Variability of the phenological, vegetative and yield traits

Table 2 shows the spread of the phenological and morphological traits of the assessed accessions of C. moschata. All the evaluated traits showed very wide ranges of distribution of the observations. Some conspicuously wide ranges of traits include number of days to 50% flowering (DTF) that goes from 52 to 152 d, plant height with a minimum of 48 cm and a maximum of 1510 cm, diameter of the fruit that is between 5.8 cm and 35 cm, weight of the fruit that varies between 150 g and 10,930 g and number of seeds per fruit that spreads in the interval from 32 units per fruit to 729 units per fruit. Excluding the number of days to 50% emergence (DTE), all the other assessed traits have remarkably wide ranges of phenotypic expressions (Table 2). All the traits but DTE, DTF, days from first female flower appearance to fruit maturity, fruit length and length of the dry seed, had outliers. The number of outliers ranged from 1 to 67. Except the outliers observed with the width of the dry seed, all the outliers were above 1.5*IQR + Q3 where IQR is the inter-quartile range and Q3 is the third quartile. The presence of outliers is indicative of the richness and large variability of the population of accessions. The outliers are exceptional performances that fall outside the normal distribution of the observations. They are a stock of unusual traits that can be used in a crop improvement program when beneficial. For example, the observed outliers for diameter of the fruit, weight of the fruit or thickness of the pulp can be used in a breeding program for the improvement of fruit yield. Similarly, outliers for beneficial traits related to the seed can be used to improve C. moschata crop for seed yield. Besides, the computed mean squares (data not reported) showed highly significant variations between accessions for the assessed traits. They all yielded p-values less than 0.01, providing additional support to the evidence of large variability among the accessions of C. moschata of Cote d’Ivoire. The computed standard deviation, and median absolute deviation for each trait are additional evidence. We should note that in most cases, the mean squares associated to year (data not reported) were not significant, indicating the relative stability of the assessed traits.

The components of variance, the quantitative genetic differentiation, the overall mean, and the coefficients of variation are reported in Table 3. The lme4 package37 used in the determination of the components of variance, does not provide p-values in the analysis of mixed or random models. The reported quantities in Table 3 are not accompanied with tests of significance. It is worth mentioning that the respective units of measure of the assessed traits are squared for the variances and the evaluated estimates will be reported without the units of measure. The phenotypic variance ($$\sigma_{p}^{2}$$) is partitioned into variance between morphotypes or genotypic variance ($$\sigma_{g}^{2}$$), and within morphotypes or residual variance ($$\sigma_{e}^{2}$$). For the class of phenological traits, considerable genotypic variances were observed with days to 50% flowering (266.21) and days to first male flower appearance (254.40), compared with their respective residual variances (148.13 and 199.50). Regarding the class of vegetative traits, only the peduncle length of male flowers had a genotypic variance (9.22) greater than its residual variance (8.86). In the class of flowering and yield traits, 8 of the 15 traits assessed showed large genotypic variances in comparison with their respective residual variances. They are number of female flowers per plant ($$\sigma_{g}^{2}$$ = 3.02 versus $$\sigma_{e}^{2}$$ = 2.36), length of the fruit ($$\sigma_{g}^{2}$$ = 53.96 versus $$\sigma_{e}^{2}$$ = 48.97), diameter of the fruit ($$\sigma_{g}^{2}$$ = 37.17 versus $$\sigma_{e}^{2}$$ = 16.76), volume of the fruit ($$\sigma_{g}^{2}$$ = 10,713,468 versus $$\sigma_{e}^{2}$$ = 3,904,590), weight of the fruit ($$\sigma_{g}^{2}$$ = 5,413,819 versus $$\sigma_{e}^{2}$$ = 1,420,187), diameter of the cavity enclosing the seed ($$\sigma_{g}^{2}$$ = 19.12 versus $$\sigma_{e}^{2}$$ = 7.75), thickness of the fruit pulp ($$\sigma_{g}^{2}$$ = 1.11 versus $$\sigma_{e}^{2}$$ = 0.94) and weight of the fruit pulp ($$\sigma_{g}^{2}$$ = 5,979,212 versus $$\sigma_{e}^{2}$$ = 1,088,750). For a trait to have a lager genotypic variance than the residual variance is synonymous to a relative ease of improvement of the crop for that trait through a breeding program.

The coefficient of variation (CV) is another statistic that measures variation. It is actually the dispersion of a trait per unit measure of its mean, which can be used to compare variations of traits with different measurement units or different scales. As a rule-of-thumb, a coefficient of variation greater than 20% is indicative of large variation for the trait. The phenotypic coefficient of variation is considerably high for 25 of the 28 assessed traits. Only the number of days from seeding to physiological maturity, the first and second longest axes of the dry seed show coefficients of variation less than 20%. Traits with very large phenotypic coefficients of variation include the peduncle length of female flowers ($$CV_{p}$$ = 93.98%), weight of the pulp ($$CV_{p}$$ = 92.96%), volume of the fruit ($$CV_{p}$$ = 89.17%), weight of the fruit ($$CV_{p}$$ = 78.30%) and number of female flowers per plant ($$CV_{p}$$ = 65.81%). With respect to the residual coefficients of variation, only the number of days from seeding to 50% emergence and number of days from first female flower appearance to physiological maturity have residual coefficients of variation greater than 20%, among the phenological traits. All the vegetative traits have residual coefficients of variation greater than 20%, and show a near-perfect linear relation (r = 0.98; p < 0.001) with the phenotypic coefficients of variation. From that observation, we may conclude that the variations in the phenotypic expressions of the vegetative traits are largely due to the variations within morphotypes. For the flowering and yield traits, all the residual coefficients of variation are greater than 20%, except the first and second longest axes of the seed. The genotypic coefficient of variation is less than 20% for all the phenological and vegetative traits except the peduncle length of male flowers with a coefficient of variation of 33.01%. On the other hand, 13 of the 15 flowering and yield traits have genotypic coefficient of variation greater than 20%. Among them, are the weight of the pulp with a genotypic coefficient of variation of 90.26%, the volume of the fruit with a genotypic coefficient of variation of 84.20% and the weight of the fruit with a genotypic coefficient of variation of 75.12%. Besides, for the flowering and yield traits, the genotypic coefficients of variation are highly correlated (r = 0.99, p < 0.001) with the phenotypic coefficients of variation in a near-perfect linear trend ($$b = 0.97;p < 0.001$$). That finding forms the basis to infer that most of the variations in the phenotypic expressions of the flowering and yield traits are largely caused by genotypic variability, without dismissing the contribution from the variability within morphotype.

The quantitative genetic differentiation, termed $$Q_{ST}$$38, is broadly the ratio of genotypic variance to phenotypic variance. It is closely related to the estimator of heritability. It scales between 0 and 1. It is well suited to the genetic analysis of morphological traits. In this study, the computed estimates of $$Q_{ST}$$ take values between 0.01 and 0.73. A value of $$Q_{ST}$$ = 0.28 is considered moderate quantitative genetic differentiation38. And it is easy to see that a value of $$Q_{ST} = \tfrac{1}{3}$$ implies that the between-morphotype variance is equal to the within-morphotype variance for a morphological or phenological trait. And a $$Q_{ST}$$ = 0.5 means the between-morphotype variance is twice the within-morphotype variance and can be qualified as a considerably large estimate of genetic differentiation. Based on our estimates of genetic differentiation, we may affirm that moderate to considerably large differentiation has occurred for several phenological, vegetative and yield traits and the differentiation is particularly high for fruit-related traits such as diameter of the fruit ($$Q_{ST}$$ = 0.53), diameter of the cavity enclosing the seeds ($$Q_{ST}$$ = 0.55), volume of the fruit ($$Q_{ST}$$ = 0.58), weight of the fruit ($$Q_{ST}$$ = 0.66), and weight of the pulp ($$Q_{ST}$$ = 0.73). The observed morphological differences of the accessions in the plots of the experimental trials led to the attempt to regroup the morphotypes of C. moschata in clusters with unsupervised methods. The results are given in the section below.

### Segmentation of the accessions and identification of ecotypes

The clustering was first performed with base R39. The NbClust package40 determined 3 clusters based on the majority rule. The hclust object was then used with the ape package41 to create the circular phylogenetic tree of Fig. 1. The cluster validation was verified with the fpc package42. The tree regrouped morphotypes according to their phenological and morphological similarities in three clusters that also reflect the 3 geographical zones labeled forest, mountain, and savannah. The 3 zones are characterized by distinct bioclimatic parameters (seasons, rainfall, temperature and humidity, see Table 4). Accessions from the same geographical zone were similar and accessions from different zones were distant. In general the forest region is characterized by two rainy seasons that alternate with two dry seasons, an accumulated annual rainfall of 1200 mm, an average daily temperature of 27 °C and a relative humidity of 70%. The mountain region has one very long rainy season with an accumulated annual rainfall of 1400 mm and a short dry season, an average daily temperature of 27 °C and a relative humidity of 69%. The savannah region has a long dry season followed by a short rainy season with an accumulated annual rainfall of 900 mm, an elevated daily temperature of 29 °C, and an elevated relative humidity of 80%. The forest region includes the morphotypes of Tiassale and Soubre, the mountain region has the morphotypes of the habitat of Zh and the savannah region has the morphotypes of the habitats of Bondu, Ferke, Korho, and Burki. The K-means algorithm was also used to cluster the accessions and the results similarly showed that the accessions within a cluster were from the same geographical zone as defined above (data not shown). The three regions showed large genotypic diversity among the accessions of C moschata. Figure 2 gives a picture of the diversity of the accessions with the dissimilarity measures between and within geographic regions representing the main growing areas of C. moschata. The diversity is presented by the quartiles, the minimum and the maximum rank of the dissimilarities within a region. The width of the box is determined by the number of morphotypes considered in the drawing of the boxplot and is not related to the genotypic richness of the accessions in a region. The median dissimilarity within the forest region is ranked approximately 190000th with a total number of 220,000 dissimilarity points in the population. The forest region also presents some outliers which are accessions of C. moschata that are morphologically or phenologically distinct from the commonly observed morphology and phenology of C. moschata in Cote d’Ivoire. The forest region has the largest genotypic diversity. The genotypic diversity in the other two regions is also considerably large with median dissimilarity ranking about 85000th and 90000th, respectively for the mountain and the savannah regions. The minimum and maximum dissimilarity ranks of the mountain region are about the same as the minimum and the maximum dissimilarity ranks of the distribution of accessions between regions.

A principal components analysis (Fig. 3) separated the morphotypes in distinct clusters. The morphotypes from the forest region form a distant cluster in the lower left of the two-dimensional representation of the first two principal components. The morphotypes of the mountain region form another cluster at the upper-right of the biplot. And the morphotypes from the savannah are grouped at the lower-right. The vectors indicate the traits that most characterize the morphotypes of a given region. It appears that the length of the vector is an indication of the degree of significance of the trait in the differentiation of the accessions. There is no evidence that the characters number of day to 50% emergence, weight of dry seeds and weight of fresh seeds exerted any genotypic differentiation. However, girth size and all fruit-related traits such as number of fruits per plant, diameter of the fruit, thickness of the pulp, weight of the fruit, diameter of the cavity enclosing the seeds, and volume of the fruit set the morphotypes of the forest region apart. The number of seeds per fruit and the phenological traits including the number of days from seeding to first female and male flower appearances, number of days to 50% flowering and number of days to physiological maturity are strong characteristics of the morphotypes from the mountain region. The morphotypes of the savannah region diverged morphologically with longer plant height, and accelerated phenology with shorter vegetative and reproductive phases. The morphological and phenological divergence of the accessions from the three regions is reflective of the ecosystems where they are thriving, to the point that the accessions from a region may be considered a separate variety or ecotype. All three regions showed high diversity of C. moschata with a Shanon-Weaver diversity index ranging between 1.39 for the forest region, to 1.95 for the savannah region. The Shanon-Weaver index is an indicator of the richness in term of number of different genotypes of C. moschata in a region, and evenness meaning that the different genotypes are represented in fairly equal proportion43,44,45. The Simpson index is an indicator of evenness. It scales between 0 and 1. The Simpson indices are 0.69 for the forest region, 0.77 for the mountain region and 0.80 for the savannah region (Fig. 3). The computation of the two indices takes into account the sample size. And the lower Shannon–Weaver and Simpson indices for the forest region could be due to smaller sample size compared to the other regions.

## Discussion

Cucurbita moschata has migrated to Cote d’Ivoire from Latin America with the intercontinental maritime exchanges that began in the fifteenth century. C. moschata has become an integral part of the Cote d’Ivoire’s landscape where it adapted to different ecosystems. A phenological and morphological analysis of C. moschata shows large variability for all assessed traits between the sampling habitats where the crop is largely grown. Further analysis of the structure of the population of C. moschata in Cote d’Ivoire indicates the existence of three clines, or ecotypes, that thrive in three separate and distinct ecosystems with their respective bioclimatic parameters. The forest region clines are comparatively shorter in height with larger girth size, larger dimensions of the leaves and carry about 2.7 fruits per plant. They have remarkably large fruits that are heavier with a thick pulp and a large diameter of the cavity containing the seeds. They comparatively have smaller number but large seeds. And they form a distant cluster with a very high genotypic diversity indicated by a high median dissimilarity rank and outliers. The savannah clines are longer in height with small girth size and smaller dimensions of the leaves. They carry about 1.34 fruits per plant and the fruits are smaller. They reach physiological maturity faster and have a high growth rate. The mountain clines are characterized by a delayed phenology with longer periods from seeding to flowering and to physiological maturity. They carry about 1.49 fruits per plant and the plants are short. The fruits have a larger number of seeds, larger weights of fresh and dry seeds, but the seeds are smaller compared to the size of the seeds from the forest region clines.

The observed genotypic differentiation of the clines is likely due to different mechanisms of adaptation that evolved through natural selection in order to better adapt and thrive in the different ecosystems with their bioclimatic exigencies and constraints. In the natural habitat, organisms must make changes to accommodate their physiology and anatomy to utilize the available resources of their environment. For example, an examination of the wood frog in the tundra of Canada, the mountains of Virginia and the lowlands of Maryland showed that observed larval developmental patterns were locally adaptive and reflected differential selection pressures unique to each environment. Environmental differences (especially temperature) accounted for most of the observed phenotypic variation46. In the current study, the three ecotypes adopted different mechanisms of adaptation to their respective environments. The mountain region characterized by a very long rainy season and a short dry season, with a temperature of 27 °C on the average and a relative humidity of 69% likely favors a longer retention of soil moisture and the availability of water to the growing plant. Genotypes of C. moschata in that region expressed delayed phenology with longer period of the vegetative phase and longer period to reach physiological maturity. They have a relatively higher seed yield. On the other hand, the bioclimatic parameters of the savannah region are characterized by a short rainy season and a long dry season, higher mean temperature and higher relative humidity that contribute to a shorter period of soil moisture retention and water availability to the growing plant. Accessions of the savannah region have accelerated phenology and a high growth rate to accommodate the short rainy season. They have longer plant height with small girth size and small leaves likely to reduce loss of water through transpiration under the high humidity and high temperature, while fulfilling their ontogenetic needs to produce harvestable fruits. Compared to the other two environments, the forest region appears to present bioclimatic parameters favorable for the growth of accessions of C. moschata. The forest region has two rainy seasons that alternate with two dry seasons and contribute to a better soil moisture retention and availability of appropriate quantity of water for the growing plant. With an average temperature of 27 °C and a relative humidity of 70%, it presents the best environment for the growth of the accessions of C. moschata in Cote d’Ivoire with larger leaves, longer peduncles, longer petioles, larger girth size, comparatively higher fruit yield, larger volume of the fruit and thicker fruit pulp. In an environment with low water availability, high water-use efficiency correlates with small leaf size and small organs, while larger organs are the norm in an environment with high water availability47. Smaller organ sizes may be favored in drier habitats because, for example, smaller leaves provide less surface area for transpiration water loss and smaller organ and plant size can reduce developmental time6.

Differences in the size of the organs may be due to differences in cell numbers and/or cell sizes. For example, divergence in the body size of Drosophilla melanogaster from two geographically isolated regions are linked to cytological differences, one caused by a variation in cell number and the other one by both cell number and cell size, attributable to diverse genetic mechanisms48. In this study, the three clines of C. moschata evolved as the result of local adaptation through natural selection in order to thrive in the ecologically different environments. The clines differ mostly in sizes of organs and phenology as responses to what each environment can allow for the full completion of the ontogeny of the accessions. Obviously, anatomical and physiological differences in response to the environments resulted in the observed differences in sizes and weights of organs, and phenology of the accessions of C. moschata. To meet local environmental conditions, accessions of C. moschata developed different genetic mechanisms which underlie the observed genotypic divergence. Clearly, the accessions are pre-disposed to survive and adapt to the new ecosystems in order to do so and to persist. Pre-existing variation for plasticity permitted the accessions to persist under the new environments and over time, the persistence allowed new genetic variation to arise through mutations and/or recombination3,8 that led to the adaptive divergence of the accessions and the emergence of the ecotypes of C. moschata.

## Conclusion

Cucurbita moschata is a native crop of Latin America that spreads to Africa, Asia and Europe with its predisposition for adaptation to various ecosystems. In Cote d’Ivoire, C. moschata grows in several habitats and expresses large morphological and phenological variability along with outliers. Three clines of C. moschata are distinguished and their developmental needs, their morphology and their phenology are in congruence with the three distinct ecosystems and their respective bioclimatic parameters. The allopatric formation of the three clines is likely conditioned by soil water availability for the ontogeny of the plant.

## Material and methods

### Origin of germplasm

The plant material consists of 85 accessions from 34 morphotypes of Cucurbita moschata maintained in the germplasm bank of University Nangui Abrogoua. They are collected from six sites in Cote d’Ivoire and one site in Burkina Fasso (Fig. 4). The distribution of the accessions used in this study is as follows: Twenty-one accessions are from the North of Cote d’Ivoire, including 11 accessions from the region of Korhogo (Korho) and 10 accessions from the region of Ferkessedougou (Ferke). Twenty-five accessions are from the West of Cote d’Ivoire, in the region of Zouan-Hounien (Zh). Nineteen accessions are from the North-East of Cote d’Ivoire in the region of Bondoukou (Bondu), 5 accessions are from the South-West in the region of Soubre and 5 accessions are from the South of Cote d’Ivoire, in the region of Tiassale. Finally, 10 accessions are from the South-West of Burkina Faso, in the region of Gaoua (Burki). The geographic coordinates of the collection sites of the accessions span from 5°47’ N to 10°21’ N and from 8°13’ W to 2°48’W (Fig. 4). Table 4 gives a detail on the origin, number of accessions, geographic coordinates, weather parameters and types of vegetation of the collection sites of the accessions of Cucurbita moschata germplasm.

### Experimental methods

The experiment was conducted at the experimental station of University Nangui Abrogoua, Abidjan (4° 10′ 20’’ W, 5° 35′ 20’’ N), Cote d’Ivoire, in 2019 and 2020. The climate of Abidjan is characterized by two rainy seasons from April to July and October to November and two dry seasons from December to March and August to September. But, this pattern of alternating seasons has been disturbed in recent years. The annual rainfall in Abidjan is over 1200 mm, the daily mean temperature is 27 °C, and the average relative humidity is 70%. The vegetation is characterized by a dense forest and the ferralitic soil is rich in organic matter. The seeding date for the first year of the experiment was September 10th, 2019. Due to changing weather patterns, the usually short rainy season started in early September of that year and sparsely extended to December. The seeding date for the second year of the experiment was April 20th, 2020, at the onset of the long rainy season. The experiment was arranged in a randomized complete block design with three replications and covered a total area of 0.816 ha in both 2019 and 2020. The accessions were randomly assigned to different plots in each block and the blocks were randomized. Each plot occupied an area of 288 m2 (12 m × 24 m) and was composed of 5 rows. The space between rows, and between plants within a row was 3 m, resulting in a planting density of 1190 plants ha−1. Each year a total, 335 plants were randomly sampled, followed from emergence to physiological maturity, and used for all observations and measurements. Figure 5 presents some images of the seedlings and the fruits of the accessions of C. moschata of the current study. All agricultural practices were followed according to recommendations.

### Assessment of phenological, agro-morphological and yield traits

The assessed agro-morphological characteristics are reported in Table 5. We followed the morphological descriptors suggested by Bioversity International and the European Cooperative Programme for Plant Genetic Resources.

### Statistical model and analysis

The factors, block, accessions and year were considered random and we fitted a random effects model to get a better assessment of the components of variance. We used the following model:

$$Y_{ijk} = \mu + \beta_{i} + \alpha_{j} + \tau_{k} + \varepsilon_{ijk}$$

where $$\mu$$ is the overall mean, $$\beta_{i}$$ is the random effect of block, $$\alpha_{j}$$ is the random effect of year, $$\tau_{k}$$ is the random effect of accession (genotype) and $$\varepsilon_{ijk}$$ is the random error term. The model assumes that $$\beta_{i} {, }\alpha_{{j{,}}} {\text{ and }}\tau_{k}$$ are independently distributed with Normal distributions having a mean zero and respective variances $$\sigma_{\beta }^{2} ,$$ $$\sigma_{\alpha }^{2} ,$$ and $$\sigma_{\tau }^{2}$$. The analysis showed that the differences between blocks were not significant and the model was reduced with the removal of the random effect of block. In addition, for many response variables, the year to year difference was null and for those traits the model was further reduced to only retain the main treatment variable,$$\tau_{k}$$. With the lme4 package37 of the R statistical software39, we obtained the genotypic variance ($$\sigma_{g}^{2} )$$ which is $$\sigma_{\tau }^{2}$$, the residual variance ($$\sigma_{e}^{2} )$$ and, when possible, the year to year variance ($$\sigma_{\alpha }^{2}$$). The phenotypic variance is $$\sigma_{p}^{2} =$$$$\sigma_{g}^{2}$$ + $$\sigma_{e}^{2}$$ and the quantitative genetic differentiation6,38,49 for a trait between accessions across all habitats is given by $$Q_{ST} = \sigma_{g}^{2} /(\sigma_{g}^{2} + 2\sigma_{e}^{2} )$$. Phenotypic coefficient of variation (CVp), genotypic coefficient of variation (CVg) and environmental coefficient of variation (CVe) are respectively computed as follow: $$CV_{p} = {{\left( {100*\sqrt {\sigma_{p}^{2} } } \right)} \mathord{\left/ {\vphantom {{\left( {100*\sqrt {\sigma_{p}^{2} } } \right)} \mu }} \right. \kern-0pt} \mu }$$, $$CV_{g} = {{\left( {100*\sqrt {\sigma_{g}^{2} } } \right)} \mathord{\left/ {\vphantom {{\left( {100*\sqrt {\sigma_{g}^{2} } } \right)} \mu }} \right. \kern-0pt} \mu }$$ and $$CV_{e} = {{\left( {100*\sqrt {\sigma_{e}^{2} } } \right)} \mathord{\left/ {\vphantom {{\left( {100*\sqrt {\sigma_{e}^{2} } } \right)} \mu }} \right. \kern-0pt} \mu }$$. The tidyverse package50 was used for data management. We computed the mean effect of each trait per sampling habitat, and used the Fisher’s least significant difference (LSD) procedure to separate means that are significantly different.

### Clustering accessions and identification of ecotypes

The segmentation procedure used to regroup closely related accessions is the hierarchical agglomerative clustering. With this procedure, each accession is assigned to its own cluster. The distances between clusters are computed and the clusters with the shortest distance are merged to form a new cluster, as they are the closest to each other. The distances are then recomputed and the process is repeated until all accessions are regrouped in one cluster. We used the hclust function in the stat package of R, version 4.2.139 with the “ward.D2” method51. We then used the phylo function of the ape package41 for visualization of the structure of the populations of C. moschata in Cote d’Ivoire. The NbClust package40 helped to determine the number of clusters, based on the majority rule. For the purpose of clarity, we used the averages of the accessions of the 34 morphotypes. We then conducted a principal components analysis (pca). The pca is a technique for reducing the dimensionality of a large data set while preserving most of its variability. It does so by creating new uncorrelated variables called principal components and concentrates the maximum variance in the first few components52. The technique increases interpretability and allow low-dimensional graphical representation of the data set. With those advantages of the pca, we graphically presented the accessions along with the variable vectors in the two axes formed by the first two principal components that have the largest variances. This graphical representation helped to assess the morphological traits that best characterize the accessions grouped together according to their similarities in the attempt to identify the ecotypes of C. moschata in Cote d’Ivoire, along with their particularities. We used the ggbiplot package53 for the visualization of the biplot.

### Compliance to plant material collection guidelines

For the sampling of the accessions of Cucurbita moschata, all procedures were conducted in accordance with the guidelines. And permission was given to collect the accessions from the sampling sites of this study.