Introduction

Strawberries (Fragaria × ananassa) are not true fruits. The red fleshy pseudocarp of a strawberry is formed from a swollen flower base or receptacle. The true fruits are, in fact, the achenes, which develop from a whorl of carpels and together form an aggregate-accessory fruit. The viability of both carpels and pollen has an important role in the resulting uniformity of berries1. Carpel position, density and viability dictate the shape, size and uniformity of a strawberry. Indeed, strawberry breeders have selected for high carpel densities in order to produce larger fruits1. Simple, classical studies, which remove all or part of the achenes from undeveloped pseudocarps has led to a cessation in the auxin “swelling signal” in the area beneath each achene and thus uneven fruit development2. In a similar fashion to achene removal, uneven pollination of the carpels, or absence of achene development, are the main causes of uneven pseudocarps3. Uneven successful pollination can be caused by damage to flowers through high temperature, frost or precipitation1. A late frost in spring could lead to carpel and other damage, resulting not only in malformation but also complete lack of strawberry development1. Strawberry flowers have a variable proportion of viable carpels and anthers between flower orders, both within a plant and also between different cultivars4,5. Indeed, primary fruit are more likely to be malformed due to the relatively lower quantities of viable anthers and pollen6,7.

In spite of the environmental factors known to influence uniformity, literature has shown that strawberry uniformity still has a large genetic component and can be improved through breeding8,9. Indeed, where breeders have selected for increased uniformity within and among berries, improvements in uniformity were observed over time8. Cultivars have been shown to differ in their susceptibility to misshapen fruit, indicating a significant genetic component controlling uniformity1,8,10. For example, ‘Florida Elyana’ is susceptible to rain damage, disrupting carpel development and thus misshapen fruit leading to lower market value9, similarly ‘Camarosa’ has been noted as a cultivar, which is particularly susceptible to misshapen fruit with ~4% of yields lost as a result of misshapes10,11. By contrast, ‘Florida Radiance’ has high marketable yields and does not exhibit a high proportion of misshapen fruits9. Breeders can influence the proportion of uniform strawberries through selecting—be it directly or indirectly—for (1) even allocation of viable carpels across the receptacle within the flower (2) ready access to pollen within flowers and (3) high fertility of carpels ensuring even successful pollination.

Strawberry is an important fruit crop with a global production of 9.25 million tonnes in 201712. Producing visually appealing strawberry fruit is one of the primary objectives in a strawberry breeding programme13. Shape uniformity is an essential trait of strawberry fruits due to the direct association with product quality and value14. Increasing the uniformity of berries can increase the proportion of marketable fruit as berry irregularity is one of the primary imperfections leading to culling and reduced marketable yield8.

As there are no well-defined strawberry phenotyping guidelines for fruit uniformity, the current system used in the strawberry breeding programmes at NIAB EMR relies on visual assessments, which are subjective and laborious. Unlike morphological traits such as length, volume and colour, which can be accurately measured manually in a low-throughput manner, uniformity assessment is extremely subjective. As there is no quantitative method of generating phenotypic data for uniformity, the genetic determinants of strawberry uniformity are still unknown.

Computer vision has shown great potential to quantify external fruit quality and 2D imaging has been successfully implemented to measure the shape and size of fruits such as strawberries15, apples16, watermelon17, cherries18 and mangos19. Basic shape traits such as length, width, aspect ratio and volume, and more sophisticated traits such as elliptic Fourier descriptors20 have been quantified and used to describe variation in fruit quality. 3D imaging has been successfully used for phenotyping the crop canopy21,22 and root architecture23,24, and a 3D strawberry phenotyping platform has been explored in our previous study25. With the 3D point cloud reconstructed based on the Structure from Motion (SfM) method26, basic size-related parameters have been measured in three dimensions allowing volume estimation with high accuracy27. Compared with shape and size evaluation, uniformity is a multidimensional trait, therefore it is not possible to quantify through 2D image analysis with a single viewing angle. The application of 3D image analysis for phenotyping the external qualities of fruit has not been sufficiently explored, and the basic, previously characterised, shape- and size-related parameters are not adequate for understanding uniformity.

Here the application of a 3D phenotyping platform allows us to investigate the genetic basis of strawberry shape uniformity. The 3D image analysis software leverages the previously developed platform25 in order to define eight new external variables and investigate their importance on manual uniformity assessment. This method was applied to a multi-parental strawberry mapping population in order to quantify the genetic components underpinning strawberry uniformity.

Materials and methods

Plant material and experimental set-up

A multi-parental strawberry population was generated through crossing 26 diverse cultivars and breeding lines to create a population of 416 genotypes made up of 26 crosses each containing up to 16 individuals (Supplementary Fig. 1). The progenitors were crossed in March 2016, seed was germinated in June 2016, experimental plants were pinned down in July–August 2017 and cold stored from January to June 2018. Progenitors were selected to represent diversity across multiple fruit quality traits. Twelve replicate runner plants were pinned down from each genotype into 9 cm square pots containing compost. Clonal plants were separated from parental plants and then placed in cold storage (−2 °C) until the start of the experiment. Plants were potted into 2 L pots containing coir and fertigated at 1 kg L−1 (rate: 10 s every 45 min) using Vitex Vitafeed (N:P:K, 176:36:255). The trial was located at NIAB EMR, Kent, UK (51° 17′ 24.202″ N 0° 26′50.918″ E). Two 150 m polytunnels with standard luminescence ‘3 year plastic’, 150 micron covers; external temperatures were: June 16(±4) °C, July 16(±4) °C, August 18(±4) °C, September 15(±4/3) °C and October 16(±3) °C. Pots were arranged in a randomised blocks design on the two central most gutters, pots were held in place using wire and cable ties. Due to the large scale of the experiment, replicate blocks were set up at intervals, plants were removed from the cold store on the following dates: block 1—29th May; block 2—20th June, block 3—2nd July, block 4—23rd July, block 5—1st August. A Natupol Koppert bumble bee hive was added into each polytunnel to assist even pollination. Where possible, three strawberries per plant were picked when ripe into egg boxes. The phenotyping took place over a single season in 2018. Picking for block one began on 11th July 2018 and picking for block 5 finished on 8th November 2018. Boxes were labelled with QR codes to assist tracking of genotypes. Strawberry shape uniformity was scored for individual berries on a scale from 1 (completely irregular and undulating fruit surface) to 9 (complete 3D uniformity around the central axis) with extensive training provided for all assessors. Strawberry shape was allocated into 9 categories: globose, globose-conic, conic, long-conic, bi-conic, conic-wedge, wedge, square and miscellaneous, strawberry shape categories were based on the definitions provided by strawberry breeders at NIAB EMR, some of these categories have been described before15. Manual uniformity and shape scores were recorded in the field book app 28, the QR scanning feature allowed quick access to the correct entry form.

3D reconstruction

The 3D imaging platform was a modified version of that developed by He et al.25. Strawberry fruit were placed in the middle of a turntable, on a dark blue holder made by polymeric foam (38 mm × 19 mm × 19 mm; height, length and width). Unlike the previous study, a webcam (Logitech C920, Newark, CA, USA) was fitted at a height of 30 cm and horizontal distance of 25 cm away from the sample. QR codes on containers were scanned through the webcams allowing tracking of berries and automated labelling of image files. The imaging rig was placed inside a photography studio tent with constant LED illumination. The turntable rotated at a frequency of 50 s per full turn, and an image was captured every second. Six imaging platforms allowed concurrent imaging of replicate berries. The 3D reconstruction was implemented with Agisoft Photoscan (Agisoft, LLC, St. Petersburg, Russia), and in order to increase the accuracy and processing speed, all images were pre-processed by cropping to a smaller size (400 × 600 pixels). Background subtraction was achieved through arbitrary colour thresholding. The image processing software for webcam control and automated image pre-processing were written in C++ utilising the OpenCV Library25,29.

Data processing pipeline of phenotypic trait extraction

Point cloud pre-processing

In the pre-processing stage (Fig. 1), each point cloud model was converted from the colour space of RGB (red, green and blue) to HSV (hue, saturation and value). Arbitrary thresholding on the hue channel was used to remove the noise introduced in the reconstruction stage. The clean point cloud was translated to the origin of the 3D coordinate system based on the distance between the moment of the point cloud and the origin. By calculating the eigenvector associated with the largest eigenvalue of the coordinates of points, a rotation matrix could be derived to represent the main orientation of the point cloud, which can be used to rotate the point cloud with the main orientation aligned with the z axis. After rotation, the arbitrary threshold was applied again on the hue channel in order to segment the strawberry body and blue holder from the whole point cloud. The height of the holder was obtained by calculating the difference between the maximum and minimum values of the holder point cloud on the z axis. As the original coordinate system generated by SfM method has an arbitrary scale, each point cloud model needed to be standardised by the height of the holder, so that the sizes of all point clouds are comparable.

Fig. 1
figure 1

Point cloud pre-processing for strawberry body extraction, translation to origin of xyz coordinate system and size standardisation

Shape uniformity-related trait measurements

Eight uniformity-related traits were calculated from the point cloud data of the strawberry body after pre-processing. These are:

Coefficient of variation (CV) of side view areas (CV_A) and the ratio between maximum and minimum side view area (Max_A/Min_A). All side views should be identical in a perfectly uniform strawberry. In order to eliminate the heterogeneity introduced from the calyx and the holder, only the points within the middle 50% of the body height of each point cloud were retained for analysis (Fig. 2). In order to understand the heterogeneity of different side views of a point cloud, each point cloud was rotated along the z axis by 3.6° for 99 rotations, and after each rotation, the side view of the point cloud was projected onto the xz plane in 2D (labelled in white). A convex hull was fitted to each projected image and the contour area was calculated. For area metrics, two traits were obtained; the CV of side view areas (CV_A) and the ratio between the maximum and minimum area (Max_A/Min_A). An ideal uniform strawberry will have a value of zero for CV_A and one for Max_A/Min_A.

Fig. 2: Side view of strawberry body for the CV measurement of the area and principal orientations.
figure 2

Convex hulls are outlined in blue, and blue and green arrows indicate the principal orientations (a). Extraction of example slice images horizontal to xy plane at the height of 20%, 40%, 60% and 80% of the total height. A minimum bounding box is fitted to each slice image (b). Sixteen patches of points labelled in different colours for curvature estimation (c)

CV of principal orientations (CV_D). The major eigenvector indicating the main orientation was calculated by principal component analysis (PCA) for all 100 side view projected images, and the heterogeneity of the orientations of the projected images was quantified by calculating the CV of angles of the main orientation. Like CV_A, a perfectly uniform strawberry will have a value of zero for CV_D.

Aspect ratio of the minimum bounding box (L/W). A lateral slice image was obtained by identifying the intersection between the plane in parallel with the xy plane and point cloud (Fig. 2b). Based on the values on the z axis, 100 evenly spaced slice images were obtained. The slice image with the largest contour was obtained by calculating the contour area of the convex hulls for all slice images. The principal orientation of the contour was identified by the eigenvectors of the first two principal components. With calculating the centroid of the contour, a bounding box could be computed along the principal orientation. The ratio between the length and width of the bounding box was derived and the ratio should be one for a perfectly uniform fruit.

Circularity of the maximum circumference (CIR). Visually, the circularities of the contours in horizontal slice images are high if the strawberry is uniform. Circularity (CIR) was calculated as previously described30:

$$C = \frac{{4\pi A}}{{p^2}},$$

where A and p are the area and perimeter of the convex hull respectively. For each point cloud, the circularity was calculated for the slice image with largest contour area.

Straightness of centre axis (STR). The coordinates of the centroids for each horizontal slice image can be located by calculating the moment of the contour. The centroids can be connected as a straight line for a uniform strawberry. The centroids were calculated for all the slice images within the middle 80% of the body height, and the straightness of the central axis was characterised by:

$${\mathrm{STR}} = \frac{{\mathop {\sum }\nolimits_{i = 1}^{N - 1} d_i}}{D},$$

where N (N = 80) is the number of slice images used for the analysis, di is the Euclidean distance between neighbouring slice images, and D is the Euclidean distance between the centroids of the top and bottom slice images.

CV of curvature and the ratio between maximum and minimum curvatures (Max_C/Min_C). The principal curvature can be calculated for each point in the point cloud, which describes how much the curve deviates from a straight line at this point. It can be imagined that the 3D curve can be sliced orthogonally around the direction of normal in to 2D curve, and the maximum curvature k1 and minimum curvatures k2 are the two principal curvatures for the 3D curve31. The average curvature k, which is defined as the mean value of the magnitudes of principal curvatures in the two main directions was applied to quantify the curvature for a given point. As the curvature measurement is sensitive to noise, the point cloud surface of strawberry body was first smoothed by using Moving Least Squares (MLS) method32, which could reconstruct a smooth surface from the noisy point cloud. Sixteen patches of the points were selected evenly from the points forming the largest slice in parallel with the xy plane (Fig. 2c). For each patch, the first half of the largest curvatures were averaged and used to represent the curvature of the patch. With the curvatures of all 16 patches, the CV of curvature (CV_C) and the ratio between maximum and minimum curvatures (Max_C/Min_C) were calculated.

Statistical analysis

Ordinal regression

Statistical analysis was performed using R (version 3.5.1) and the Genstat statistical package (Version 13.0, VSN International Ltd. England). Differences in uniformity traits within each shape groups were distinguished using ANOVA and Tukey post-hoc test. Pearson coefficients of correlation were calculated between all proposed uniformity-related traits. As the group labels are ordinal dependent variables, ordinal regression was used to evaluate the performances of all traits33. Model fit was ascertained by using selection criterion values based on the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). In general, a better model fit generates lower values for both AIC and BIC34. In order to identify the optimal variable combination related to manual assessment, stepwise AIC and BIC methods were applied35. The most significant variable was identified by comparing the criterion values of all models. Other variables were added successively and retained if the model fit was improved.

Genotyping and linkage map

DNA was extracted for each genotype from unopened leaflets using the Qiagen DNeasy plant mini extraction kit. Genotyping was conducted using the Axiom® IStraw35 384HT array36 (i35k). Marker curation was undertaken using the Affymetrix analysis power tools software and the SNPpolisher R package, parameters were 0.8 for minimum dish quality control, 96% minimum call rate and a call threshold of 0.01. A total of 18,790 markers were found to segregate across the multiparental population. Crosslink was used to generate linkage maps—a programme developed specifically for polyploid plant species37. A total of 34,949 markers were used to generate the consensus map. The map orders from 5 populations were combined to make the consensus map as detailed in the study of Cockerton et al.38Fragaria × ananassa chromosome number is denoted by 1–7 and the sub-genome number is represented by A–D as specified in ref. 39.

Genetic analysis

The best linear unbiased estimate (BLUE) was calculated for all genotypes in order to correct for the influence of assessor, data and block, estimated scores were weighted by replicate number. Linear mixed-effects models were generated for each phenotypic trait with and without covariates. Grand scores for each genotype were calculated using mixed models to account for significant covariates. Broad-sense heritability (H2) was calculated using mixed model output parameters generated through the asreml function in the statistical package ASReml-R version 4.1.0.10640, whereby H2 = VG/VP, where VG is the genotype variance and VP is the total phenotypic variance. Narrow sense heritability was calculated using the vpredict function in ASReml-R h2 = VA/VP, where VA is the additive genotypic variance.

Genomic prediction

A genetic additive relationship matrix was calculated for the population using r package snpReady41. Genetic best linear unbiased prediction (GBLUP) was performed using the statistical package ASReml-R version 4.1.0.10640. 75% of genotypes were used as the ‘training’ population to predict the phenotype of the remaining 25% of genotypes used as the ‘test’ population, as advised by Erbe et al.42. Phenotype was weighted by the number of replicates. Predictive ability (PA) was defined by the correlation between the predicted and actual phenotypic score for the test population over 10 cross validation events, thus allowing us to determine the PA of the model. Prediction accuracy (PAC) was calculated by PAC = PA * H2 as detailed in Gezan et al.43.

Composite interval mapping

Multi-parental QTL mapping was conducted in R using package “mppR”44. A permutation test determined the significance threshold45. A two-step QTL analysis was implemented: the selection of cofactors was achieved through Simple Interval Mapping (SIM) proceeded by a multi-QTL model search using composite interval mapping (CIM)46,47. As a multi-parent population CIM works on parent relationships. Therefore the ‘FA2654’ × ‘FA8752’ cross was removed as it is not directly related through the parental cultivar network. All other crosses were interrelated and formed a single network (Supplementary Fig. 1).

Results

Characterisation of uniformity-related traits

All the uniformity-related traits were calculated based on the point cloud. The mean values and the standard errors for each visual uniformity class are presented in Fig. 3. ANOVA results showed that significant differences were observed between uniformity classes for all traits (p < 0.001). The Pearson’s linear correlation coefficients were calculated between all traits, and strong correlations were found amongst Max_A/Min_A, L/W and CIR (Table 1).

Fig. 3
figure 3

Mean value and standard error of calculated uniformity-related traits by the newly developed 3D image analysis software against defined uniformity scale based on manual assessment

Table 1 Pearson’s linear correlation coefficients among all uniformity-related traits

Ordinal regression models were constructed for all variables and each variable independently. L/W was not significant due to the high correlation with other variables and CIR showed the best model fit with the lowest AIC and BIC values (Table 2). New variables were added sequentially to the model until no further improvement of the criterion value was observed. The AIC- and BIC-based stepwise selection methods showed inconsistent results (Table 3). The AIC-based method showed the optimal criterion value with all variables except L/W and Max_C/Min_C, but BIC-based method showed that STR could not improve the model fit.

Table 2 Summary of individual ordinal models and variable significance of ordinal model with all variables, toward prediction of manual assessment uniformity scores
Table 3 Model comparison values for uniformity metrics, toward prediction of manual assessment uniformity scores based on AIC and BIC

The influence of shape on uniformity

The shape of a strawberry influences the uniformity trait score. Bi-conic strawberries were seen to have high uniformity based on the area overlap measures (CV_A & Max_A/Min_A), L/W and CIR scores indicating bi-conic strawberries have consistently circular horizontal transects at the middle point. Whereas for curvature uniformity measures (CV_C & Max_C/Min_C), globose fruit are the most uniform and miscellaneous fruit the least (data not shown). Both the manual uniformity score and CIR could discriminate miscellaneous shapes from the other shape categories (Fig. 4).

Fig. 4: Circularity of the maximum circumference (CIR) scores for each manually classified strawberry shape category.
figure 4

Letters denote significant differences between categories. Error bars are standard errors of the mean

Genomic prediction

Prediction accuracy values varied between 0.05 and 0.32 (Table 5). Unsurprisingly, traits with low narrow sense heritability such as CV_A also show a low predictive ability as the model is based on additive variance (Table 5). CIR, a trait, which may be considered a valuable trait for a breeder exhibited a relatively high prediction accuracy of 0.29.

QTL identification

A total of 23 QTL were found to be associated with uniformity traits (Table 4). Of which 20 were detected in more than one progenitor. Five focal SNP’s, on chromosome 2B and 5D were found to represent more than one trait (Table 4, Fig. 5). The same focal SNP AX.166521293 was identified as an important region in Max_A/Min_A, CV_C and CIR uniformity traits. Global adjusted R2 values for all QTL in linear models were between 3.16 and 38.89 indicating the proportion of variation explained by identified QTLs (Table 5). The maximum variation explained by a single QTL is 10.96% (Table 4). All uniformity traits apart from CV_A were significantly affected by date of picking. CV_D had the largest broad-sense heritability score of 64.1 and narrow sense heritability of 51.4 (Table 5).

Table 4 Focal SNPs representing strawberry uniformity QTL
Fig. 5: Location of QTL on the octoploid consensus map scaled to the Fragaria vesca ‘version four’ genome.
figure 5

Horizontal grey lines represent iStraw-35k axiom array markers

Table 5 Broad-sense heritability (H2) and narrow sense heritability (h2) across the multiparental population for automated uniformity trait scores

Discussion

We report for the first time a robust method to measure strawberry shape uniformity and apply this technique to generate genetic markers for uniformity traits. Several studies have attempted to quantify strawberry fruit shape using 2D images with neural networks48, 3D imaging49 and by machine learning15. However, none of these studies investigated berry uniformity. Unlike the aforementioned studies, who measure a relatively small number of genotypes intensively, we have implemented a high-throughput imaging platform across a large population to facilitate genetic analysis of the trait. Although strawberry shape has received greater attention in the literature, in our opinion berry uniformity is a more important trait for a breeder to improve.

In current strawberry breeding practice, there are no widely accepted criteria for quantifying uniformity due to the difficulty of defining a multidimensional trait. Here, the manual strawberry uniformity scale has been designed by NIAB EMR breeders. As such, the absence of a straightforward definition, has meant that it has not been possible to study the genetic components controlling strawberry uniformity in the past. To overcome this, we have used 3D image analysis to define the parameters underlying a breeder’s perception of strawberry uniformity. The original 3D strawberry phenotyping system25 could accurately measure basic size-related traits. In this study, the point cloud analysis software was further developed to quantify strawberry uniformity through eight proposed metrics. By comparing with the manual scale, the image processing pipeline has demonstrated an objective method of characterising strawberry uniformity components.

Quantifying berry uniformity

Circularity of the maximum circumference (CIR) of strawberries showed the best predictive ability for manual uniformity scores based on the ordinal regression model fit, when studying individual variables alone. A completely misshapen fruit with a severely undulating fruit surface will score a value of 1 for manual assessments, and these completely misshapen fruits were the easiest category to identify by eye, as they were clearly distinct from regular shapes. A low CIR value appears to represent the undulating misshapen and “miscellaneous” fruit (Fig. 4). Miscellaneous berries are the most undesirable fruit shape category therefore it is highly beneficial to select against them. When multiple traits are combined to describe uniformity, the best fitting model required the combination of CIR, CV_A and Max_A/Min_A, CV_D and CV_C. The five factors required for optimal model construction indicate that there are multiple uniformity components influencing the manual uniformity score.

Misshapen fruit QTL

One of the QTL represented by the focal marker AX.166521293 on chromosome 2B was found to be associated with CIR, this QTL was also associated with CV_C and Max_A/Min_A, each of which were found in the best fitting model used to describe the manual uniformity score. The focal SNP AX.166521293 was found to be present and significant in six progenitors and had an effect size of 10.96% on Min_A/Max_A. Therefore, this marker is a potential candidate for marker-assisted breeding in selection against completely misshapen and irregular strawberries. Furthermore, this work has highlighted a region of interest for further study to pinpoint the causative allele associated with reduced uniformity. Dissecting the contribution of genetic and environmental components believed to underpin strawberry uniformity; susceptibility to heat stress, carpel and pollen viability, achene position, size and distribution1 may help to further elucidate the mechanism of uniformity segregating in the multiparental population.

Uniformity trait selection

The trait L/W shows little improvement on the overall combined trait model fit due to the high correlation with other traits including Max_A/Min_A and CIR, but it was still a good predictor of uniformity based on the model fit when studying individual variables alone. AIC- and BIC-based stepwise feature selection showed disagreement on the selection of the STR parameter. The difference between calculating AIC and BIC is that AIC does not account for the sample size, so when sample number is large, BIC applies larger penalty for complex models and leads to a simpler model50. However, this study does not aim to identify the optimal feature combination to develop prediction model related to manual uniformity evaluation, but develop a new image-based quantification to replace the manual scale, because the ground-truth data are subjective and as such any large bias can reduce the robustness of model development. Moreover, the manual scale cannot be considered a comprehensive assessment as the parameter STR cannot be visually evaluated by eye. However, it must be said that if a trait cannot be detected by the human eye, then it is not a valuable trait for a strawberry breeder to select upon.

Limitations of the system

The 3D point cloud analysis software is independent of the imaging acquisition system, and the uniformity-related traits can be extracted automatically in a high-throughput manner. However, the imaging collection throughput was 50 s per fruit and the 3D reconstruction has to be performed separately, which limits use to pre-breeding experiments. Due to the occlusion from the viewing angles, the strawberry nose cannot be fully reconstructed especially for globose-shaped fruit, which decreases the accuracy of STR measurements and also limits the study on automated shape classification. To increase the throughput and accuracy of 3D phenotyping, it is necessary to further develop the hardware with multiple cameras to allow more viewing angles or a structured light-based imaging system with a robotic arm, and also integrate the hardware driver with 3D reconstruction software. The current point cloud image analysis software was able to characterise many key external traits, which are important for strawberry breeding, however, the measurement of other parameters such as achene density must be investigated through an improved phenotyping platform in future studies.

Genetic control of fruit quality

Papers detailing strawberry fruit quality QTL report genetic alleles associated with multiple fruit quality traits including fresh weight, metabolites, external colour and firmness51,52, however, there are currently no papers, which report QTL associated with strawberry uniformity. Here, we provide a phenotyping platform, which has facilitated the assessment of the genetic components underlying strawberry uniformity for the first time. The use of a multi-parental population has allowed the study of a diverse set of germplasm and has ensured that resulting QTL have a greater relevance for breeders when compared to alleles identified in bi-parental studies. Furthermore, implementation of an additive genomic selection model indicates that strawberry shape uniformity—particularly CIR may be somewhat improved using a genomic selection approach. Overall, 20 of the QTL were found to have an effect on uniformity in more than one of the 26 progenitors indicating that there has been limited linkage decay between the causal allele and marker, and that the relationships have been maintained across generations. Furthermore, the QTL on chromosome 2B was observed three times across different uniformity traits, such traits are only partially correlated and thus describe discrete components, as such this allele can be seen to have a major role in uniformity.

Genetic control of strawberry fruit shape

Unlike uniformity the mechanism controlling fruit shape has been studied extensively in the wild strawberry; Fragaria vesca and may act as a surrogate model for the cultivated octoploid strawberry Fragaria × ananassa. In F. vesca, fruit shape is primarily controlled by phytohormones53,54. Auxin increases the width of fruit and by contrast gibberellic acid (GA) increases the length of a strawberry, whereas abscisic acid (ABA) down regulates both auxin and GA and thus reduces fruit expansion53,54. GA-deficient F. vesca mutants were found to have a “short” or globose berry shape, which, through the application of GA to the berry, could be restored to result in a “long” or long-conic fruit shape54.

It is clear that breeders wish to select for greater berry uniformity however the confounding relationship between shape and uniformity must also be considered. For example, square, wedge and wedge-conic strawberries may have high 2D symmetry but not 3D symmetry. UK breeders primarily aim to select for conic or long-conic fruit, whereas globose, square, wedge and miscellaneous berries are classified as undesirable and bi-conic, globose-conic and conic-wedge fruit are seen as acceptable shapes (Abigail Johnson, personal communication). Here we provide an objective measure (CIR) that can be used to discriminate the least desirable berries—miscellaneous or misshapen berries and select for regular fruit shapes.

Heritability of uniformity

Broad-sense heritability scores were between 36.6 and 64.1 for automated uniformity metrics, whereas narrow sense heritability varied between 2.1 and 51.4, as such, some traits were found to have a higher heritability than manual uniformity. These values indicate the proportion of variation segregating in the study population, however improvement in the heritability may also be caused, in part, by more accurate phenotypic measurements. In particular, high heritability was observed for CV_D, which indicates the angle of a strawberry related to whorl of carpels (Table 5) is under strong genetic control. Date of picking was seen to have a significant impact on all uniformity metrics apart from CV_D, which had a large genetic component. The high significance of date indicates the developmental environmental conditions have a significant impact on strawberry uniformity. Extreme temperatures were observed during the experiment, which may have caused the significance of date. All traits apart from CV_C showed a significant genotype by environment interaction indicating that genotypes were responding differently to heat stress. Misshapen fruit have been found to have a greater proportion of small underdeveloped achenes following exposure to extreme temperatures during embryo development55,56.

In this study, we detail a comprehensive dissection of the traits underlying strawberry uniformity and show that the visual perception of a strawberry can be represented by five metrics. The generation of an objective measure of uniformity has allowed the assessments of genetic components in a multi-parental breeding population. We show uniformity has a strong genetic component that can be improved by breeding and identify genetic components controlling uniformity that are present across a wide array of germplasm.