The Rosaceae

The Rosaceae family comprises some 90 genera containing approximately 3000 species which include: economically important fruit trees such as cultivated apples (Malus pumila Mill.) and pears (Pyrus spp.); stone fruits including peaches (Prunus persica) and sweet cherries (P. avium); numerous ornamental species, including the roses (Rosa spp.), mountain ash (Sorbus aucuparia) and ninebark (Physocarpus opulifolius); and the soft-fruit species strawberry (Fragaria ×ananassa), raspberry (Rubus idaeus) and blackberry (Rubus spp.), among many others. Several classifications of the family based on morphology have been proposed (e.g., ref. 1), while Schulze-Menz2 proposed the categorisation of the family into four subfamilies: Maloideae, Amygdaloideae, Rosoideae and Spiraeoideae based on chromosome number and fruit type. Molecular phylogenetic analyses strongly supported the monophyly of the Rosaceae and demonstrated that chromosome number is a better character for subclassification within the family than fruit morphology.3,4

Rosoideae phylogenetics

Morgan et al.3 were the first to establish the monophyly of the Rosoideae within the Rosaceae, and the subsequent reclassification of the family by Potter et al.5 retained the Rosoideae as a subfamily resulting from the first phylogenetic split within the family. Figure 1 shows a schematic depiction of the phylogeny of the Rosaceae adapted from Potter et al.5 displaying the three main subfamilies Rosoideae, Spiraeoideae and Dryadoideae, and the tribes within each subfamily. According to this molecular classification,5 the Rosoideae subfamily contains a single supertribe, the Rosodae, containing all genera except for Filipendula. Rosodae genera are contained within three tribes, the Sanguisorbeae, the Potentilleae (containing the Fragariinae subtribe to which the genus Fragaria belongs) and the Colurieae. Rosa and Rubus, which are contained within the Rosodae, are not contained within any tribe. Plants of the Rosoideae are characterized by a base chromosome number of x=7. The majority of the plants within the subfamily are perennial herbs and shrubs; however, a small number of annual herbs and perennial tree species exist within the current classification.5

Figure 1

A schematic representation of the phylogeny of the Rosaceae adapted from Potter et al.,5 showing three subfamilies, Rosoideae (red), Spiraeoideae (blue) and Dryadoideae (yellow), and the tribes within each subfamily. The relative size of each triangle represents the relative number of genera contained/analysed in each tribe in the phylogenetic reconstruction by Potter et al.5

Rosoideae genera contain diverse numbers of species; the genus Fragaria for example contains relatively few species, with just 21 currently recognised and described,6 while Rosa is thought to contain at least 120 species,7 Potentilla up to 500 species8 and Rubus some 750.9 Species contained within the Rosoideae genera exhibit impressive morphological diversity and exist at numerous ploidy levels, and thus, the classification and phylogenetic determination of species within each genus have been the subject of various phylogenetic analyses, with molecular phylogenies presented for Fragaria,10 Potentilla,8 Rosa11 and Rubus.9 Within these studies, the evolutionary origins of some important polyploid species in relation to extant diploid progenitors have been determined.

The Rosoideae and its economic importance

The Rosoideae contains species prized for their ornamental value such as Rosa species, those that produce sweet edible aggregate and accessory ‘fruits’ including Rubus idaeus (red raspberry) and Fragaria ×ananassa (cultivated strawberry), highly dispersed wild herbaceous species such as Potentilla reptans and species that have become regarded as invasive pests such as Rosa rubiginosa.12 The economic importance of the subfamily is exemplified by three main genera, Fragaria, Rosa and Rubus, although other genera of more minor economic value exist. Fragaria and Rubus species are valued for their ‘fruits’, which take the form of berries (swollen receptacles that bear the true fruits, the achenes, on their surface) in Fragaria, and aggregates of drupelets in Rubus, while Rosa is prized for its large, often fragrant and colorful flowers, and over 20 000 commercial rose cultivars are reported to exist worldwide.13 Together, products of these three genera are of immense economic significance, which has prompted the need to continually produce improved varieties of both flowers and ‘fruits’ for a market that is constantly increasing its demands in terms of quality and quantity. As a consequence of the demand for new varieties, breeding programmes for the genetic improvement of key species have been established worldwide, and in line with almost all other crop plants, a huge investment in the development of molecular tools to assist these programmes, and to answer more fundamental biological questions, has been made over the past 20 years. This review will outline the major advances made in molecular genetics and genomics research in the Rosoideae over the past two decades and highlight the direction this research will take in the coming years in the light of technological advances made in the field of biological sciences.

Rosoideae cytology and estimates of genome size

Cytological studies performed on members of the Rosoideae including Fragaria, Potentilla, Rosa and Rubus have demonstrated the base chromosome number of the subfamily to be x=7, and further works have categorised and described the karyotypes of different species within the genera.14–17 While the base chromosome number of members of the Rosoideae is the smallest for any of the plants within the Rosaceae,18 polyploidisation events within each of the genera have led to reports of species ranging from diploid (2n=2x=14) through to tetradecaploid (2n=14x=98), as well as various aneuploids and interploidy hybrids.19 Fertile interspecific hybrids have been reported within discrete ploidy levels in many of the genera, suggesting that within a genus, species are closely related,20,21 and while intergeneric hybrids between Fragaria and Potentilla22 have been produced, viable fertile hybrids were observed only at higher levels of ploidy.

The genomes of diploid Rosoideae species that have been investigated to date were shown to be relatively small. Flow cytometry estimates for diploid Rosa species, which vary between subgenera, range from 0.78 pg/2C–1.29 pg/2C (Yokoya et al.23) to 1.10 pg/2C–1.36 pg/2C (Rajapakse et al.,7 who estimated a genome size in nucleotides of 600 Mbp/C), while the estimates for Rubus species ranged from 0.58 pg/2C for R. idaeus24 to 0.75 pg/2C for R. sanctus.25 The genome size of F. vesca was estimated by flow cytometry through comparison with the 125 Mbp/C sequenced genome size of Arabidopsis thaliana (0.34 pg/2C) and was determined to be 164 Mbp/C (and thus presumably in the range of 0.45 pg/2C);26 however, this was later adjusted in the light of changes in the size estimates of the Arabidopsis genome27 to be in the region of 206 Mbp/C. The genome size of the cultivated octoploid strawberry was also estimated in the same study and determined to be, following adjustment against new estimates of the genome size of A. thaliana, in the region of 703 Mbp, less than four times the estimated genome size of the diploid Fragaria, suggesting extensive loss of chromosomal content following polyploidisation.26 Thus, Rosoideae genomes have been estimated to be among some of the smallest of all angiosperm genomes, with Fragaria having the smallest, followed by Rubus and then Rosa.
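The unit conversions behind these figures can be illustrated with a short sketch. It assumes the widely used conversion of 1 pg of DNA to approximately 978 Mbp, and it assumes a revised Arabidopsis genome size of 157 Mbp/C; the latter value is not stated in the text and is chosen here only because it reproduces the adjusted 206 Mbp/C estimate. The function names are hypothetical.

```python
# Assumption: 1 pg of DNA corresponds to roughly 978 Mbp (a commonly used factor).
MBP_PER_PG = 978.0


def pg_2c_to_mbp_1c(pg_2c: float) -> float:
    """Convert a 2C DNA content (pg) to a 1C genome size (Mbp)."""
    return pg_2c / 2.0 * MBP_PER_PG


def rescale_to_revised_standard(estimate_mbp: float,
                                old_standard_mbp: float,
                                new_standard_mbp: float) -> float:
    """Rescale an estimate made against a reference standard whose
    accepted genome size has since been revised."""
    return estimate_mbp * new_standard_mbp / old_standard_mbp


if __name__ == "__main__":
    # Diploid Rosa range reported by Rajapakse et al. (1.10-1.36 pg/2C).
    for pg in (1.10, 1.36):
        print(f"{pg:.2f} pg/2C is about {pg_2c_to_mbp_1c(pg):.0f} Mbp/C")

    # F. vesca: 164 Mbp/C against a 125 Mbp/C Arabidopsis standard,
    # rescaled to an ASSUMED revised standard of 157 Mbp/C.
    revised = rescale_to_revised_standard(164.0, 125.0, 157.0)
    print(f"Rescaled F. vesca estimate: about {revised:.0f} Mbp/C")

    # Octoploid to diploid ratio using the figures quoted in the text.
    print(f"Octoploid/diploid ratio: {703.0 / 206.0:.2f}")
```

Under these assumptions the Rosa range brackets the quoted 600 Mbp/C, and the octoploid genome comes out at roughly 3.4 times the diploid estimate rather than four times.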

Molecular marker development and application

Arbitrary marker systems

Numerous molecular marker systems have been employed in a large number of genetic analyses of Rosoideae species over the past 20 years. These include arbitrary Polymerase Chain Reaction (PCR)-based marker systems, such as amplified fragment length polymorphisms (AFLP), random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats. These marker systems were used in Fragaria, Rosa and Rubus for a range of genetic analyses, including cultivar and sport identification and genetic fingerprinting of closely related germplasm;28–30 parentage analyses and evaluations of genetic diversity among species;31–34 linkage map construction;35–37 and population genetics.38,39 The inherent limitations of arbitrary marker systems in transferability and reproducibility between populations, species and laboratories, however, have meant that, as in the vast majority of plant species, more robust, sequence-characterized marker systems such as microsatellites have gained greater favour in genetic analyses in recent years.

Microsatellite markers

Microsatellite development

Microsatellites, or simple sequence repeats (SSRs), are PCR-based, sequence-characterized markers that have become the marker of choice for genetic analysis. Prior to the advent of second-generation sequencing technologies, various protocols for the efficient characterisation of SSRs from enriched genomic libraries of Rosoideae species were reported,40–42 along with the characterisation of SSR markers from expressed sequence tag (EST) libraries.43–45 Subsequently, following the availability of high-throughput sequencing methodologies and the release of the F. vesca genome sequence,46 SSR markers were developed in abundance for both Fragaria, where more than 4000 SSRs have been characterized from the genome sequence and mapped in both wild and cultivated species,47,48 and Potentilla, from which 74 SSR markers were characterised from 1476 sequences identified in 454 genomic sequence data derived from P. pusilla.49

Cultivar identification

In Fragaria, Rosa and Rubus, SSR markers have been successfully employed for cultivar identification. In Fragaria, a set of 10 SSRs in three multiplexes has been reported for fingerprinting cultivated strawberry germplasm,50,51 and more recently, Chambers et al.52 used 16 high repeat number SSR markers to discriminate between a collection of 219 cultivated strawberry varieties. In rose, 24 polymorphic SSR markers were used to characterise over 70 hybrid tea and rootstock varieties,53 while just six SSR markers were needed to effectively discriminate between 65 accessions of old garden roses.54 Likewise in Rubus, 21 SSR markers were used to discriminate between 148 wild and cultivated black raspberry accessions,55 and two fingerprinting sets have been developed for red raspberry.56,57

Transferability between species

While SSR markers are robust and reliable for genetic analysis within a species, their transferability between species and genera is more limited than for other marker types. Despite reports of transferability of SSR markers between Rosaceous genera being possible,58,59 in practice, most of the markers reported display very low levels of polymorphism and thus there are only a few examples of SSR markers from other Rosaceous genera being employed successfully for genetic analysis in Rosoideae species.7

The rate of transferability and heterozygosity of SSR markers has been investigated within the Rosoideae. Within genera, the transferability of SSR markers is high and markers developed from one species have been shown to be directly applicable to other species. Davis et al.60 and Zorilla-Fontanesi et al.61 demonstrated up to 90% transferability of polymorphic SSR markers from diploid to octoploid strawberry and others47,48,62 have employed large numbers of diploid Fragaria SSR markers in the construction of linkage maps of the cultivated strawberry. Likewise in Rubus, SSR markers developed for red raspberry were successfully used to genotype accessions of black raspberry and five other Rubus species by Fernandez Fernandez et al.,56 while Marulanda et al.63 demonstrated the transferability of SSR markers developed in a number of Rubus species to the Andean blackberry, R. glaucus. Similarly, high rates of SSR transferability have been observed in other Rosoideae genera, including Potentilla49 where 86%–94% transferability between species was reported.

Lewers et al.43 reported that transferability of SSR markers between the genera was low (31%–19%), whereas Rousseau-Gueutin et al.64 reported 75% transferability of Fragaria EST-SSRs to Potentilla but just 30% to Rosa. Koning-Boucoiran et al.65 found that just 17% of the Fragaria SSR markers tested transferred to Rosa and just 2% were heterozygous in a rose mapping population. Park et al.,44 however, demonstrated higher rates of transferability from Rosa to Fragaria, reporting that 61% of the EST-SSR markers they developed for Rosa were transferred to Fragaria, 47% of which were reported to be polymorphic. These studies highlight the potential for cross-genera application of SSR markers; however, they also demonstrate that the more distantly related the two genera are, the lower the transferability and polymorphism of SSR loci will be. Thus in practice, as SSR development and characterisation has become easier, molecular genetics research has focused on applying SSR markers to species within the genera from which they were developed, and cross-genera studies have mainly utilized other marker systems for the evaluation of synteny between genera within the Rosaceae and the Rosoideae, such as restriction fragment length polymorphism (RFLP) and sequence-tagged site markers developed from ESTs.66,67


RFLPs have been used to a limited extent for the molecular characterisation of Rosoideae species, including cultivar identification68 and phylogenetic analyses.39 Since RFLP probes are commonly developed from gene-coding sequence, they are often highly conserved between species. Such markers, termed conserved orthologous set markers, have found enormous utility in comparative mapping studies between plant genera and even families.69,70 In the Rosoideae, they have been employed in the development of a linkage map of rose71 and their conserved nature was also exploited, along with PCR-based markers designed from the conserved coding regions of ESTs, in a study of the conservation of synteny between Fragaria and Prunus.66

Single nucleotide polymorphism (SNP) detection and analysis

Traditionally, SNPs in the genomes of Rosoideae species were identified through projects that employed Sanger sequencing to sequence EST collections72,73 or through direct sequencing of PCR products amplified from genomic regions of interest.66,74,75 SNP markers were then scored in progenies either by direct sequencing of PCR products from specific genotypes or segregating progenies66,74,75 or through the utilisation of the cleaved-amplified polymorphic site approach where suitable restriction enzymes were available, to differentially digest alternative alleles at a locus.76,77 Such approaches, while effective and reliable, remain extremely low throughput since individual assays can rarely be multiplexed effectively, and the results are most usually visualised through agarose gel electrophoresis. Thus, for the development of large numbers of markers for mapping and large-scale surveys of populations or varieties, SNPs have not traditionally been used routinely.
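The logic of the cleaved-amplified polymorphic site approach can be sketched in a few lines: a SNP is scorable in this way only if a restriction enzyme cuts the amplicon of one allele differently from the other. The sequences and function names below are hypothetical and purely illustrative; EcoRI (recognition sequence GAATTC) is used as the example enzyme.

```python
def cut_positions(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of each occurrence of a recognition site.

    Only the given strand is searched, which is sufficient for palindromic
    sites such as GAATTC; a non-palindromic site would also need a search
    of the reverse complement.
    """
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def is_differentially_cut(allele_a: str, allele_b: str, site: str) -> bool:
    """True if the enzyme cuts the two alleles at different positions,
    so that digestion yields distinguishable fragment patterns."""
    return cut_positions(allele_a, site) != cut_positions(allele_b, site)


ECORI = "GAATTC"  # EcoRI recognition sequence

# Hypothetical amplicon alleles differing by a single A/G SNP.
allele_a = "ACGTGAATTCTTGACC"  # carries the EcoRI site
allele_b = "ACGTGAGTTCTTGACC"  # SNP abolishes the site

if __name__ == "__main__":
    print(cut_positions(allele_a, ECORI))
    print(cut_positions(allele_b, ECORI))
    print(is_differentially_cut(allele_a, allele_b, ECORI))
```

In this toy case only the first allele is digested, so the two alleles would resolve as different banding patterns on an agarose gel, one locus per assay, which is why the approach scales poorly.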

Since the advent of second-generation sequencing technologies, however, high-throughput methods for SNP discovery and analysis have been developed, which have revolutionised the ability to discover, screen and associate SNPs in progenies and germplasm collections. High-throughput deep sequencing of both genomic DNA and mRNA (termed RNA-seq) has been used to identify SNPs in Fragaria,78 Rubus79 and Rosa,80 following which a number of downstream analysis methods are available for interrogation and analysis of thousands of SNP markers in a single genotyping assay. Novel techniques have been developed for assaying large numbers of SNPs using the Illumina sequencing platform; Celton et al.78 exploited reduced representation restriction fragment libraries of the FV×FB selective mapping progeny to identify and map large numbers of SNP markers in diploid Fragaria, while Ward et al.,79 using the ‘genotyping by sequencing’ (GBS) approach of Elshire et al.,81 developed a densely saturated linkage map for a red raspberry mapping population. Likewise, using a novel sequence capture technique utilizing RNA baits termed targeted capture,82 Tennessen et al.83 developed a high density SNP-based linkage map of F. vesca subsp. bracteata composed entirely of segregating SNP markers.

The use of direct sequencing permits the identification and analysis of large numbers of SNPs without a priori knowledge of their genomic position or nature, but the data generated are technically demanding to analyse and interpret reliably and accurately. If SNP data are already available for an organism, however, a number of other genotyping assays have become available and have begun to be exploited recently in Rosoideae species. Kompetitive (sic) allele-specific PCR, which can be used to genotype individual SNPs in moderate numbers (96–1536) of individuals reliably and accurately, has been used to generate segregation data for mapping in tetraploid rose,65 while high-throughput, massively multiplexed genotyping array technology, which permits the interrogation of tens of thousands of SNPs simultaneously using microarray technology, has been exploited recently for the development of genotyping tools for wild and cultivated strawberry species.84 The development and exploitation of such technologies in the Rosoideae is still in its infancy and there are yet to be full reports of their application in the scientific literature; however, similar technology is currently being applied in related Rosaceous genera such as apple,85 peach86 and cherry,87 and unpublished data suggest that the IStraw90 array for F. ×ananassa is effective at producing mapping data for markers spanning the cultivated strawberry genome.

Linkage map development

Linkage mapping permits the location of genes controlling traits of importance to chromosomes through their association with molecular markers and thus represents a powerful tool for the positional cloning of major genes, the characterisation of quantitative trait loci (QTL), and the anchoring and ordering of contigs and scaffolds of both physical maps and genome sequence assemblies. With the rapid developments made in molecular marker characterisation and the ease with which they can be used to genotype individuals, genetic linkage maps of Rosoideae species have continually evolved to become increasingly saturated, enabling the genetic dissection of the genomes of members of the Rosoideae genera and the identification of loci controlling traits of agronomic importance in Fragaria, Rosa and Rubus.


Diploid Fragaria linkage maps

Fragaria species are found in a range of ploidy levels from diploid to decaploid, while the most economically important species, F. ×ananassa, is a complex allo-octoploid. The phylogenetic origins of the octoploid Fragaria species have been investigated, and evidence has been presented that a number of extant diploid species, including F. vesca and F. iinumae, have genomes similar to the diploid progenitors of the modern octoploid species.10 Thus, the application of molecular markers to genetic mapping studies in Fragaria has permitted the investigation of the structure of both the diploid and octoploid genomes, comparative genomic analyses, the development of saturated maps of the cultivated species and the identification of markers useful for breeding.

Genetic investigations of the genome of the cultivated strawberry were initially confounded by its complex polyploid nature, and thus initial linkage studies focused on F. vesca, a wild diploid closely related to the polyploid strawberry species.10 The first linkage map for F. vesca spanning the expected seven chromosomes of the species was reported by Davis and Yu.35 The map was developed from an F2 population comprising 80 individuals derived from an intraspecific cross between F. vesca var. Baron Solemacher and the F. vesca accession WC6. The map was composed of a total of 80 markers resolved into seven linkage groups and covering a genetic distance of 445 cM. This map included 75 RAPD markers (64 dominant and 11 codominant), a gene-specific marker for the alcohol dehydrogenase locus, two isoenzyme markers, phosphoglucose isomerase (Pgi-2) and shikimate dehydrogenase (Sdh), and morphological markers for the major genes controlling the runnerless (r) and yellow fruit color (c) phenotypes.

Subsequently, a second diploid Fragaria linkage map was reported by Deng and Davis.88 They used two F2 populations derived from a cross between F. vesca subsp. bracteata DN1C and the F. vesca variety Yellow Wonder, and a cross between Yellow Wonder and the F. nubicola accession FRA520 containing 40 individuals each, to map six candidate genes: chalcone synthase, chalcone isomerase, flavanone 3-hydroxylase, dihydroflavonol 4-reductase, anthocyanidin synthase structural genes and a Del-like regulatory gene. The genes mapped to five of the seven Fragaria linkage groups, identified through comparative mapping of RAPD markers previously mapped by Davis and Yu.35 In the study, they showed, through cosegregation, that a mutation in the flavanone 3-hydroxylase gene of the anthocyanin biosynthesis pathway was likely responsible for yellow fruit colour in the variety Yellow Wonder.

Following the development of these initial linkage maps using predominantly arbitrary PCR-based markers, a map characterizing an interspecific mapping progeny was reported by Sargent et al.20. A total of 73 molecular markers (66 SSR, 1 sequence-characterized amplified region (SCAR) and 6 gene-specific markers) and three genes controlling morphological traits were mapped in an F2 population of 94 individuals (later reduced to 76) obtained from a cross between F. vesca f. semperflorens ‘815’ and F. bucharica ‘601’ (formerly F. nubicola ‘601’) (FV×FB). The map covered 448 cM and was resolved into the seven linkage groups expected for the genus. A high degree of segregation distortion was observed along the linkage groups, which was suggested to have arisen from the interspecific nature of the cross.20 This genetic differentiation, however, permitted greater numbers of markers to be mapped compared with previous intraspecific mapping progenies and thus, it was adopted as the international reference mapping progeny for Fragaria.
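Segregation distortion of the kind described above is commonly assessed with a chi-square goodness-of-fit test against the expected Mendelian ratio. The sketch below assumes a codominant marker in an F2 progeny (expected 1:2:1) and uses hypothetical genotype counts for a progeny of 76 individuals; it is an illustration of the general test, not the procedure used in the cited study.

```python
# Critical chi-square value for 2 degrees of freedom at alpha = 0.05.
CRITICAL_DF2_ALPHA_05 = 5.991


def chi_square_f2(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for a codominant F2 marker against 1:2:1."""
    total = n_aa + n_ab + n_bb
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def is_distorted(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """True if the marker deviates significantly from 1:2:1 at alpha = 0.05."""
    return chi_square_f2(n_aa, n_ab, n_bb) > CRITICAL_DF2_ALPHA_05


if __name__ == "__main__":
    # Hypothetical counts (AA, AB, BB) in a progeny of 76 seedlings.
    for counts in [(19, 38, 19), (32, 36, 8)]:
        stat = chi_square_f2(*counts)
        print(counts, f"chi2 = {stat:.3f}", "distorted" if is_distorted(*counts) else "ok")
```

In practice, many markers are tested at once, so a correction for multiple testing would normally be applied before declaring a region distorted.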

Successive studies have refined and improved the FV×FB reference map, continually adding markers, including SSRs, gene-specific markers, RFLPs, SNPs and ESTs, and increasing the degree of saturation along the linkage groups.46,61,66,76,78,89–93 The most recent incarnation of the FV×FB map93 comprises a total of 411 sequence characterised markers (SSR, RFLP, EST and gene-specific markers) mapped in the full progeny of 76 individuals, and a further 298 markers mapped using a selective mapping strategy using just six seedlings that divided the linkage groups into a total of 46 mapping bins covering the Fragaria genome.91 The total length of the current FV×FB linkage map is 442.8 cM across seven linkage groups, comparable to that of the first diploid strawberry map, which covered 445 cM,35 and is estimated to almost completely cover the diploid Fragaria genome.93 Due to the exclusive use of sequence-characterized markers in its construction, the diploid Fragaria reference map was employed to anchor and orientate the genome sequence scaffolds that were derived from the sequencing of the F. vesca Hawaii 4 (FvH4) genome by Shulaev et al.46
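The selective (bin) mapping strategy mentioned above can be sketched as a lookup: each bin is defined by the joint genotype of the bin-set seedlings across that interval, and a new marker scored in only those seedlings is placed in the bin whose signature it matches. The bin names and genotype strings below are hypothetical, as is the coding (A and B for the two homozygotes, H for heterozygotes, "-" for missing data); with three genotype classes and six seedlings there are 3^6 = 729 possible signatures, ample for 46 bins.

```python
# Hypothetical bin signatures: one genotype character per bin-set seedling.
BIN_SIGNATURES = {
    "LG1:bin1": "AHHBAH",
    "LG1:bin2": "AHHBHH",
    "LG1:bin3": "HHHBHH",
    "LG2:bin1": "BBAHHA",
}


def matches(marker: str, signature: str) -> bool:
    """A marker matches a bin if every scored genotype agrees with the
    bin signature; missing calls ('-') are compatible with anything."""
    return len(marker) == len(signature) and all(
        m == "-" or m == s for m, s in zip(marker, signature)
    )


def assign_bin(marker: str, signatures: dict[str, str] = BIN_SIGNATURES) -> list[str]:
    """Return all bins compatible with the marker's joint genotype."""
    return [name for name, sig in signatures.items() if matches(marker, sig)]


if __name__ == "__main__":
    print(assign_bin("AHHBHH"))  # fully scored marker
    print(assign_bin("AHHB-H"))  # one missing call: placement is ambiguous
    print(assign_bin("BBBBBB"))  # no compatible bin
```

The second call shows the trade-off of the approach: with so few individuals, a single missing or erroneous genotype call can leave a marker compatible with more than one bin.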

The release of the FvH4 genome sequence46 provided an essential resource for re-sequencing projects aiming to develop SNP-based linkage maps for Fragaria. In order to study male sterility in the gynodioecious diploid strawberry F. vesca subsp. bracteata, parental molecular maps of an intraspecific mapping population were developed.83 The maps contained a total of 7802 SNP and in-del markers, and following linkage analysis, the seven expected linkage groups were resolved on both maps. The 4338 markers that mapped to the maternal map covered 410 cM, while the paternal map comprised 4305 markers spanning 406 cM. These lengths are comparable with the F. vesca diploid maps previously developed (445 cM in the map of Davis and Yu,35 442.8 cM in the map of Sargent et al.93).

Octoploid Fragaria linkage maps

The cultivated strawberry F. ×ananassa originated from the chance hybridization between the allo-octoploid species F. virginiana and F. chiloensis.94 Several genomic formulas, based on cytological observations, have been proposed for the cultivated strawberry, the most commonly accepted of which is that of Bringhurst,95 who proposed the genomic composition AAA′A′BBB′B′ reflecting the contention that the allo-octoploid Fragaria genomes are completely diploidised and that segregation is completely disomic. Initial mapping efforts in the cultivated strawberry suggested mixed disomic and polysomic inheritance.96 However, the growing availability of numerous codominant transferable markers previously mapped to the diploid reference map greatly facilitated the development of well-characterized linkage maps for both F. ×ananassa and F. virginiana,47,62,97 and novel techniques for studying segregation through microsatellite allele dose and configuration establishment98 have demonstrated that segregation in the allo-octoploid Fragaria is exclusively disomic and that the genome is fully diploidised, as suggested by the genome model proposed by Bringhurst.95

The first linkage maps of the cultivated strawberry were reported by Lerceteau-Köhler et al.96 The maps were constructed from a full-sib F1 progeny comprising 133 individuals from a cross between the variety Capitola and the breeding line CF1116 ([Pajaro×Earlyglow]×Chandler) (CA×CF) and were generated using 789 AFLP markers and two putative genes, alcohol acetyl transferase and dihydroflavonol 4-reductase. The female and male parental maps contained 235 and 280 markers and were resolved into 43 linkage groups covering 1604 cM and 1496 cM, respectively. Subsequently, the CA×CF map was extended using a larger population of 213 seedlings and new AFLP, SCAR and SSR markers were added to the existing AFLP framework.99 The additional markers extended the maps to include 367 markers covering 2582 cM on the maternal map and 440 markers covering 2165 cM on the paternal map. Integration of the two maps resulted in a final consensus map for the progeny spanning 2195 cM across 32 linkage groups (LGs). Through comparison of common markers mapped to the diploid Fragaria reference map, linkage groups representing four homoeologous groups were identified for six of the seven diploid LGs, with just three homoeologous groups recovered for LG2. The study highlighted extremely high levels of macrosynteny between the diploid and octoploid maps, suggesting the absence of major chromosomal rearrangements during the evolution of polyploid Fragaria species from their diploid progenitors and validating the use of the FV×FB linkage map as a reference for the genus. The addition of sequence-characterized markers also supported the conclusion that disomy was the predominant, if not the only, meiotic behaviour exhibited by the F. ×ananassa genome.

While the maps of the CA×CF progeny spanned the majority of the F. ×ananassa genome and were well saturated with markers, the predominant marker type used for the development of the maps was AFLPs. Subsequent maps of octoploid strawberry progenies employed larger numbers of transferable markers. In order to study sex determination, Spigler et al.97 developed the first SSR-based linkage map of an octoploid strawberry species. A total of 210 SSR markers derived from various genomic sources were mapped in a F. virginiana mapping population raised from a cross between a female maternal line and a hermaphrodite male line. The progeny consisted of 184 individuals and the resultant consensus linkage map had a total length of 2373 cM across 42 linkage group fragments, 14 more than expected for a consensus map of the species. An updated map for the progeny was presented by Spigler et al.,100 composed of linkage group fragments associated with 30 linkage groups per parent; however, in their second report, the authors chose to present the female and male linkage maps separately rather than as a consensus map.

Subsequently, two integrated linkage maps of the cultivated strawberry, F. ×ananassa, were reported. The first was developed to investigate the inheritance of resistance to Verticillium dahliae and was derived from a cross between the resistant cultivar Redgauntlet and the susceptible cultivar Hapil.47,101 The integrated map had a total length of 2140.3 cM and was developed from a progeny of 174 seedlings. The map, which was almost the same length as the AFLP-based linkage map developed by Rousseau-Gueutin et al.,99 contained 549 molecular markers (89% of which were SSR and gene-specific markers) and was resolved into the expected 28 LGs for an integrated map. One of the groups, however, was composed of separate female and male maps since no biparental markers for that LG segregated in the progeny. While no data pertaining to wilt resistance were reported in either mapping study, due to the transferable nature of the markers, a comparison with maps developed in previous investigations and of loci on each of the four F. ×ananassa homoeologous linkage groups corresponding to the seven linkage groups of the diploid reference map was possible. In line with other studies, the authors demonstrated a high degree of colinearity between the diploid and the octoploid Fragaria genomes. Moreover, this work highlighted regions of homozygosity within the F. ×ananassa genome also present in other F. ×ananassa linkage maps, including those of Rousseau-Gueutin et al.99 and Zorilla-Fontanesi et al.,62 which the authors suggested were evidence of selection in the genome of the cultivated strawberry, since such regions of homozygosity were not evident on the linkage map of F. virginiana presented by Spigler et al.100 The second integrated SSR map reported for F. ×ananassa was composed of 338 markers, of which 250 were SSRs, and was developed from a mapping population obtained from a cross between two F. ×ananassa lines 232 (4-43×Vilanova) and 1393 (Gaviota×Camarosa) (232×1393), for the purposes of studying the genetics of fruit quality.62 The total length of the map was 1259.8 cM spanning 37 linkage groups, exceeding the 28 expected groups for an integrated map of the species; however, all the LGs included a marker present in the reference map (FV×FB), allowing the assignment of each linkage group fragment to one of the seven diploid LGs and permitting further map comparisons between the diploid and octoploid maps to be performed. Later, the resolution of the linkage map was increased through the mapping of additional markers.102 In that study the O-methyltransferase locus, along with other candidate genes, was mapped and a QTL analysis to identify loci controlling the production of volatile compounds was performed. The total length of the integrated map was increased to 1400.1 cM and 363 markers (70% of which were SSRs) were mapped to 39 linkage group fragments, all but one of which were assigned to one of the four homoeology groups associated with each diploid linkage group.

The most comprehensive linkage map developed for the cultivated strawberry to date was recently reported by Isobe et al.48 In their study, the authors developed an integrated linkage map composed exclusively of transferable SSR markers from parental linkage maps of three mapping populations: the F1 progeny of the crosses 02-19×Sachinoka containing 188 individuals and Kaorino×Akihime containing 140 individuals, and an F2 progeny 0212921 containing 169 individuals derived from the selfing of an unnamed genotype. Constructed from five linkage maps, the integrated consensus map produced encompassed a larger amount of diversity than one map could individually and contained a total of 1856 loci distributed throughout 28 linkage groups spanning a total length of 2364.1 cM. This map is of a comparable length to the maps previously reported by Rousseau-Gueutin et al.99 (2195 cM) and Sargent et al.47 (2140.3 cM) and was nearly twice the length of the map developed by Zorilla-Fontanesi et al.102 Given the uniformity in linkage group lengths of the majority of maps of the cultivated strawberry presented to date, and the high numbers of markers mapped in the study of Isobe et al.,48 it is reasonable to assume that the current integrated linkage map covers the majority of the cultivated strawberry genome.
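The map statistics quoted in this section can be compared with a little arithmetic. The sketch below uses only marker counts and map lengths given in the text, derives the expected number of linkage groups from the ploidy level and the base chromosome number x=7, and computes a crude mean marker spacing as map length divided by number of loci. The function names and map labels are illustrative, and the spacing measure ignores uneven marker distribution.

```python
BASE_CHROMOSOME_NUMBER = 7  # x = 7 for the Rosoideae


def expected_linkage_groups(ploidy: int) -> int:
    """Linkage groups expected for an integrated map: n = (ploidy * x) / 2."""
    return ploidy * BASE_CHROMOSOME_NUMBER // 2


def mean_spacing_cm(n_loci: int, length_cm: float) -> float:
    """Crude average spacing between loci, in cM per locus."""
    return length_cm / n_loci


# (number of mapped loci, total length in cM), as quoted in the text.
OCTOPLOID_MAPS = {
    "Redgauntlet x Hapil": (549, 2140.3),
    "232 x 1393 (updated)": (363, 1400.1),
    "Isobe et al. integrated": (1856, 2364.1),
}

if __name__ == "__main__":
    print("Expected LGs, diploid:", expected_linkage_groups(2))
    print("Expected LGs, octoploid:", expected_linkage_groups(8))
    for name, (loci, length) in OCTOPLOID_MAPS.items():
        print(f"{name}: {mean_spacing_cm(loci, length):.2f} cM per locus")
    ratio = OCTOPLOID_MAPS["Isobe et al. integrated"][1] / OCTOPLOID_MAPS["232 x 1393 (updated)"][1]
    print(f"Isobe map length relative to updated 232 x 1393 map: {ratio:.2f}x")
```

On these figures the integrated map of Isobe et al. has roughly three times the marker density of the two earlier SSR-based maps.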

Mapped traits and QTL analysis

Molecular maps are essential for the investigation of the inheritance of important traits and the genes that control them, and thus, are an important precursor for the development of marker assisted selection (MAS) programs. Targeting the genetic elements controlling important agronomic traits like fruit quality, disease resistance or metabolic content, is a crucial step for crop improvement. These traits are, however, usually controlled by QTL, which makes the identification of the genetic determinants accounting for smaller percentages of the genetic variance challenging. Genetic analysis of QTL in polyploid species gives insight into the relationship of how gene copy number influences the trait of interest, and one of the major challenges to genetic research in the cultivated strawberry is understanding how homoeologous gene loci impact the genetic control of agronomically-important traits. The sections below describe some of the traits that have been characterized genetically in Fragaria species.

Disease resistance

There are a number of economically significant fungal pathogens which cause severe damage to strawberry crops worldwide. These include Botrytis cinerea (grey mould), members of the genus Phytophthora, including Phytophthora fragariae (red stele) and P. cactorum (crown rot), Verticillium dahliae, and members of the genus Colletotrichum, most notably C. acutatum (blackspot or anthracnose). As natural resistance to many of these diseases exists in F. ×ananassa germplasm, attention has been focused on the development of molecular markers to characterize and select for resistance through linkage mapping. An F1 population containing 60 individuals obtained from a cross between the F. ×ananassa selection Md683 and the variety Senga Sengana was used in the first study of marker-trait linkage in the cultivated strawberry.103,104 The trait under investigation was resistance to the soilborne fungus P. fragariae var. fragariae, the cause of the red stele root rot disease.103 Seven RAPD markers linked to the Rpf1 gene were identified and one of them was cloned and later converted to a SCAR marker linked in coupling phase to the resistance gene.104 To date, however, there have been no reports of the use of this marker in breeding programmes. Indeed, it has been reported that the marker alone is not sufficient for the selection of comprehensive resistance to red stele, which, to be effective, involves the pyramiding of three resistance genes, Rpf1, Rpf2 and Rpf3.105 Using the map of Lerceteau-Köhler et al.,96 five QTL for resistance to P. cactorum and five for resistance to C. acutatum were characterized and mapped by Denoyes-Rothan et al.106 Later, two SCAR markers linked to the Rca2 anthracnose resistance gene conferring resistance to C. acutatum were developed,107 and the markers have subsequently been used in a programme of marker-assisted breeding, despite permitting selection for resistance to just C. acutatum pathogenicity group 2.105 Recently, markers linked to V. dahliae resistance in ‘Redgauntlet’ were identified using the RG×H mapping population of Sargent et al.47 A number of QTL were identified, with the most promising markers developed from a hydroxyproline-rich glycoprotein on LG1, one allele of which was present in the majority of the resistant cultivars screened and absent from susceptible genotypes (Šurbanovski et al., unpublished data), showing promise for the development of an MAS strategy for the selection of novel resistant material.

Flowering habit

Cultivated strawberries are divided, according to their flowering habits, into two primary types: short-day (SD) and day-neutral (DN). SD genotypes require a day length shorter than 14 h or temperatures below 15°C to initiate flowers, while DN plants require only moderate temperatures to begin flowering, being insensitive to photoperiod. DN plants are also defined as everbearing; however, not all everbearing plants have the same flowering behaviour as DN types. The first DN cultivars were released in 1979 and all are descended from an accession of F. virginiana subsp. glauca from the Utah mountains, with the DN flowering characteristic introduced by backcross breeding. In favourable conditions, DN cultivars can produce fruit continuously all summer and into the autumn. Many everbearing cultivars have existed for much longer than DN types, but typically, they have a different cropping pattern, with two or three peaks during each growing season.

In order to study the genetics of flowering habit in the cultivated strawberry, a map resolving a single linkage group was developed containing markers linked to the Everbearing (Ev) trait.108 In the study, 199 F1 seedlings from a cross between two Japanese octoploid varieties Ever Berry and Toyonoka were used to map 5 RAPD markers in the 39.7 cM surrounding the dominant gene (Ev) regulating day-neutrality.108 This was the first report of DNA markers linked to Ev in strawberry, but as the closest markers were 11.8 cM and 15.8 cM on either side of the gene, they were not close enough to be useful in breeding selection. Later, 127 seedlings obtained from a cross between the DN variety Tribute and the SD variety Honeoye (T×H) were scored with AFLP primer combinations to identify QTL associated with the day-neutral habit.109 The mapping approach allowed the identification of eight QTL linked to the trait, one of which explained 36% of the variability, indicating that while day-neutrality is accepted as being controlled by a single dominant gene in cultivated strawberry progenies, the regulation of the trait under different environmental conditions may be polygenic, at least in the T×H cross. More recently, Gaston et al.110 scored perpetual flowering as a dominant gene (PF) under Mendelian inheritance in the CA×CF population and mapped the trait to LG4b-f. When scored as a QTL, the locus was shown to colocalize with a major dominant QTL that the authors reported also regulated the production of runners (RU), and thus, the locus was denoted FaPFRU. The results of this study have implications for the development of markers for the selection of these two traits, both of which are of immense importance to commercial strawberry production.
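To give a feel for why markers 11.8 cM and 15.8 cM from Ev are of limited use for selection, the map distances can be converted to expected recombination fractions. The sketch below is illustrative only: the source does not state which mapping function was used in the original study, so both the Haldane and Kosambi functions are shown, and the function names are hypothetical.

```python
import math


def haldane_r(d_cm: float) -> float:
    """Recombination fraction for a distance in cM (Haldane: no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def kosambi_r(d_cm: float) -> float:
    """Recombination fraction for a distance in cM (Kosambi: partial interference)."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


# Distances of the two flanking RAPD markers from Ev, as quoted in the text.
for d in (11.8, 15.8):
    print(f"{d} cM: Haldane r = {haldane_r(d):.3f}, Kosambi r = {kosambi_r(d):.3f}")
```

Under either function, each marker would be expected to recombine with Ev in roughly 10% to 15% of gametes, so selection on a single marker would misclassify a comparable share of seedlings.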

Fruit quality

Fruit quality traits are major targets in breeding programs and, as such, a better understanding of the genetic basis controlling these traits is a crucial step towards the implementation of MAS for fruit quality. Agronomic, physical and chemical traits have been evaluated in different studies to determine their genetic control.62,102,111 In total, 87 unique QTL were detected for 19 analysed traits in the study of Lerceteau-Köhler et al.,111 33 were detected for 14 of the 17 agronomic and quality traits studied in the 232×1393 progeny by Zorilla-Fontanesi et al.,62 and 70 QTL for 48 different volatile compounds were detected in the companion study performed by Zorilla-Fontanesi et al.102 on the same population. Some of the traits considered were common between the studies, and those such as fruit weight, firmness, soluble-solids content, titratable acidity, pH and anthocyanins mapped to the same LG in the two mapping populations. In all three works, a non-random distribution of QTL was observed, with QTL forming clusters. Another common feature of the three investigations was the presence of homoeo-QTL, that is, QTL controlling a particular trait that mapped to orthologous positions on homoeologous linkage groups. In the two progenies studied in the three reports, plant width,62 fruit shape, firmness, glucose and malate content and pH,111 and the terpenes linalool and terpineol102 were shown to be controlled by homoeo-QTL, suggesting that more than one homoeologous gene copy regulates the expression of particular quality traits, and raising implications for marker development for MAS.

To date, linkage maps of the cultivated strawberry have been developed by scoring the presence or absence of alleles, i.e., using the alleles as single dose markers. This approach limits the efficiency of QTL mapping as information is discarded where two homeologous loci share a common (multidose) allele. If many of the most economically important traits in the cultivated strawberry are governed by homeo-QTL, a large amount of genetic information and potentially valuable markers would be lost through single-dose allele mapping, since homeo-QTL are likely to share alleles at loci linked to the genes regulating phenotypic expression. A possible solution to this limitation could be the scoring of multidose markers, as has been described by van Dijk et al.,98 who were able to define map positions of multidose SSR markers through analysis following capillary electrophoresis by calculating allele dose based on their electropherogram peak intensities. The methodology proposed to score multidose markers is effective; however, it is labour intensive and is currently not automated, limiting its applicability. Furthermore, due to the nature of SSR variability and segregation in F. ×ananassa, reference markers and multiplexes would almost certainly have to be developed de novo for each progeny investigated.


The cultivated red raspberry, R. idaeus subsp. idaeus, has a complex pedigree and in many cases R. idaeus subsp. idaeus cultivars contain, to a greater or lesser degree, introgression from R. idaeus subsp. strigosus (the American red raspberry), a fact that has contributed to the high number of molecular markers heterozygous in many red raspberry mapping progenies. The distinction between the idaeus and strigosus subspecies is often dismissed by breeders due to the high levels of interfertility between the two subspecies, but the geographic separation has certainly contributed to greater allelic diversity in modern cultivars. In contrast to the high diversity, and thus heterozygosity, of R. idaeus, members of R. occidentalis (black raspberry) populations tend to be highly homozygous, and very limited genetic diversity has been revealed in genetic studies of accessions from throughout the entire geographic range of the species.43,55

Linkage mapping in red raspberry

The first genetic linkage map for red raspberry (R. idaeus L.) was developed by Graham et al.37 from a full-sib family from a cross between two cultivars Latham and Glen Moy (L×GM) belonging to different red raspberry subspecies, R. idaeus subsp. strigosus (American red raspberry) and R. idaeus subsp. idaeus (European red raspberry), respectively. The two cultivars used have very different morphological characteristics; while Glen Moy bears commercially acceptable fruits in terms of flavour, size and colour, on a spine-free plant, it is susceptible to diseases and low-temperature damage, while ‘Latham’ is in contrast extremely hardy, spiny and bears small fruits that are not of a commercially acceptable quality.

A total of 30 heterozygous SSR markers, 4 EST-SSR and 206 AFLP markers (240 markers in total) were mapped in the progeny to produce a linkage map covering a total of 789 cM across nine linkage groups. QTL analysis performed on the progeny identified two significant QTL on LG2 accounting for 48%–50% of the phenotypic variation related to spines on the canes, two QTL on LG8 that explained 33%–79% of the phenotypic variance associated with root sucker spread (diameter), and a single QTL also on linkage group 8 explaining 53% of the variation in root sucker density.

An additional 20 SSR markers were subsequently added to the same linkage map by Graham et al.112, along with additional markers that were scored, but not mapped in the study of Graham et al.37 The updated map thus comprised 349 markers and covered a total distance of 669 cM. While some of the linkage groups from the initial study of Graham et al.37 were joined following the addition of the new markers, there were still two groups, one from each parent, which could not be associated with each other and therefore, the updated map comprised eight linkage groups. The mapping population was scored for resistance to cane botrytis (B. cinerea), spur blight (Didymella applanata), cane spot (Elsinoe veneta) and yellow rust (Phragmidium rubi-idaei). The diseases were scored in two growing sites for presence or absence of the disease, as well as on a severity scale for cane botrytis and spur blight.

The gene controlling cane pubescence, gene H, was also mapped in the L×GM mapping population.112 The homozygous form of the gene (HH) is rarely found because it is linked with a lethal recessive gene;113 however, seedlings carrying a dominant allele for gene H in heterozygous form have been shown to be resistant to cane botrytis and spur blight,114,115 but more susceptible to cane spot, powdery mildew (Sphaerotheca macularis) and yellow rust.116–118 Gene H controlling cane pubescence was mapped to LG2 of ‘Glen Moy’ and its association with a QTL for resistance to cane botrytis and spur blight was confirmed.112 An additional QTL for resistance to rust, spur blight and botrytis was identified on LG3 of the ‘Latham’ linkage map, along with QTL for resistance to cane spot on LG4 of the ‘Glen Moy’ map and rust on LG5 of the ‘Latham’ map.

A set of 23 EST-SSRs developed from cDNA libraries of roots and flower buds was subsequently mapped to the L×GM map by Woodhead et al.,45 locating to six of the seven linkage groups of the map. Fourteen of the newly mapped markers were reported to be associated with QTL for disease resistance, fruit quality, fruit size and developmental stage, although the authors did not elaborate on how variation in these traits was quantified. Subsequently, additional phenotypic data relating to the stages of fruit development from bud break to over-ripe fruit, fruit colour, the production of volatile compounds, cane height and cane splitting were scored in the L×GM progeny, and QTL and candidate genes associated variously with these traits were mapped to the seven Rubus linkage groups.119–121 In its most recently updated version, 37 gene-specific markers were added to the L×GM linkage map, bringing the total number of functional markers mapped in the progeny to 97.122 The linkage map now contains a total of 223 molecular markers (57 AFLP and 69 SSR markers, along with the 97 functional markers), covering a total genetic distance of 840.3 cM across the seven expected groups for a consensus linkage map.122

Mapping aphid resistance

A second red raspberry linkage map was developed from a mapping progeny derived from a cross between the cultivars Malling Jewel and Malling Orion (MJ×MO) containing 94 seedlings. The map was composed of 95 AFLP and 22 SSR markers and covered a total distance of 505 cM over the expected seven linkage groups for a consensus map of the species.90 The authors estimated that the MJ×MO genetic linkage map covered approximately 80% of the raspberry genome, with some regions in LG1, which was composed mainly of markers segregating in the maternal parent, and further regions of LG3 not completely covered. The map was used to determine the position of a major gene A1 involved in the resistance to biotype 1 of the aphid Amphorophora idaei, carried by the ‘Malling Orion’ parent, which was located on LG3, 5 cM from the codominant SSR marker Ru103a. Interestingly, the region on LG3 associated with the A1 resistance gene was also where QTL for resistance to cane botrytis, spur blight and rust were located on the linkage map of Graham et al.112 In addition to aphid resistance, dwarfism (dw) was also scored in the MJ×MO progeny. Different segregation models have been proposed for this trait, including a two-gene model,123 but the segregation data recorded for the MJ×MO progeny (3:1) suggested a single-gene Mendelian model, in line with that proposed by Jennings.113 However, the gene mapped to LG6 of the MJ×MO map, and was not linked to gene H as suggested by Jennings.113

Mapping Phytophthora root rot resistance

Linkage maps were also produced from a backcross progeny of the cross NY00-34 (Titan × Latham) × Titan comprising 159 individuals to investigate the inheritance of resistance to root rot (Phytophthora fragariae var. rubi). Linkage maps of the NY00-34 and Titan parents were generated comprising 138 AFLP, 68 RAPD and 20 resistance gene analogue polymorphism markers spanning 440 cM and seven linkage groups on the ‘NY00-34’ map, and 153 AFLPs, 47 RAPDs and 11 resistance gene analogue polymorphisms spanning 370 cM and seven linkage groups on the ‘Titan’ map.124 QTL analyses performed on both parental maps for a number of disease criteria identified a region on LG1 of ‘NY00-34’ and ‘Titan’ accounting for 30%, 61% and 25% and 26%, 33% and 29% of the variance for plant disease index, incidence of petiole lesions and root regeneration, respectively, while a second QTL region located on LG5 of ‘NY00-34’ was responsible for 28%, 10%, 28% and 15% of the variance of plant disease index, stem lesion size, incidence of petiole lesions and root regeneration, respectively. A second QTL on LG7 of the ‘Titan’ map was responsible for 15%, 18%, 14% and 16% of the variance for plant disease index, stem lesion size, incidence of petiole lesions and root regeneration score, respectively.

Development of a saturated sequence-characterized reference map for red raspberry

The most comprehensive linkage maps for red raspberry produced to date were reported recently by Ward et al.79 Using GBS to generate SNP markers, supplemented with a genome-spanning SSR set, the authors produced highly saturated linkage maps of the parental lines of a mapping progeny produced from the cross Heritage × Tulameen (H×T) comprising 71 progeny. The study revealed almost twice the number of heterozygous markers in the Heritage genome than in the Tulameen genome. Linkage maps were composed only of markers segregating in the individual parental genotypes and spanned 462.7 cM across seven linkage groups containing a total of 4521 SNP and 33 SSR markers in 487 genotyping bins on the Heritage linkage map, and 376.6 cM across seven linkage groups containing a total of 2391 SNP and 12 SSR markers in 274 genotyping bins on the Tulameen linkage map (Figure 2).

Figure 2. The SSR- and SNP-based linkage map of the cultivar H×T mapping population developed by Ward et al.,79 comprising 33 SSR markers and 4521 SNP markers in 487 genotyping bins distributed throughout the seven linkage groups of the Rubus genome. Genetic distances are given in centiMorgans (cM).

Due to the nature of the data generated by GBS, the authors proposed a novel imputation strategy that permitted linkage maps to be developed, despite 30% missing data in each of the parental marker data sets. Almost complete colinearity was observed between the common SSR markers mapped between the H×T linkage map and those previously reported for the L×GM mapping progeny.37,112 However, while the maps covered approximately the same physical distance in the two populations, significantly less genetic distance was covered by the H×T linkage maps,79 suggesting that recombination rates in the parents of the L×GM map are much higher than those of other Rubus mapping populations. The density of molecular markers mapped in the study permitted the precise mapping of segregation distortion along the linkage groups of the two parental maps and the authors postulated the presence of lethal or detrimental sublethal effects of a number of genetic loci in the R. idaeus genome responsible for the high degree of segregation distortion observed on the H×T linkage maps. Since the SNP segregation data generated was derived from sequencing the progeny of the H×T mapping population, the dense saturation of these sequence-characterized markers across the seven R. idaeus linkage groups will be of immense utility for the anchoring and orientation of genome sequence scaffolds produced by the Heritage genome sequencing initiative.

Linkage map development in other Rubus species

Linkage mapping in black raspberry

Linkage mapping investigations were recently extended to other members of the Rubus genus. Bushakra et al.125 developed a linkage map for black raspberry (R. occidentalis) for the purposes of comparison to maps of red raspberry and to other members of the Rosaceae. In their study, the authors developed an interspecific mapping population between an advanced selection of R. occidentalis (96395S1) displaying thornless canes and purple fruit, and the red raspberry cultivar Latham, which is spiny and bears red fruits (S1×L). A linkage map of the S1×L progeny was constructed composed of 131 sequence-characterized markers; however, the majority of these were mapped in ‘Latham’ due to the inherently low genetic diversity in black raspberry.55 The 96395S1 linkage map was composed of 29 markers over six linkage groups corresponding to linkage groups LG1–LG6 of the ‘Latham’ linkage map, and spanning a total genetic distance of 306 cM, while the ‘Latham’ map contained 114 markers spanning the expected seven linkage groups and covering a total genetic distance of 561 cM. Subsequently, Bushakra et al.125 used the S1×L linkage map to identify loci relating to the modification of cyanidin compounds in Rubus. The black raspberry fruits were characterized by two xylose-containing pigments, cyanidin 3-O-sambubioside and cyanidin 3-O-2G-xylosylrutinoside, while the red raspberries contained cyanidin 3-O-sophoroside and cyanidin 3-O-2G-glucosylrutinoside. In total, 27 QTL were identified in the progeny associated with the concentrations of five anthocyanins analysed with ultrahigh performance liquid chromatography that were detected in all 3 years of the study. On the ‘Latham’ parental map, two polyphenolic biosynthetic pathway genes and three transcription factor-derived markers were significantly associated with QTL for anthocyanin production. The QTL and their associated markers on LG2 and LG7, one of which encodes a putative component of the MYB/bHLH/WD protein complex involved in the regulation of anthocyanin biosynthetic pathway genes, were previously identified as influencing the concentrations of anthocyanin compounds in raspberry fruits by Kassim et al.126 Thus, these loci were confirmed in the study of Bushakra et al.,125 and the authors concluded that these loci were extremely important for the control of the accumulation of anthocyanins in raspberry fruits.

Linkage mapping in blackberry

As with other Rosoideae genera, Rubus species exist at a number of ploidy levels. Blackberry (Rubus subgenus Rubus Watson) is a species complex and thus, commercial cultivars have a number of species in their pedigree. Tupy is the most important commercial cultivar worldwide but has an uncertain pedigree (Comanche × wild Uruguayan blackberry) and an unknown ploidy level. The remaining important cultivars are contained in two groups: the western US developed cultivars (e.g., Marion, Black Diamond, Boysen) that have western and eastern US blackberry species as well as red raspberry (R. idaeus L.) in their pedigree and range in ploidy from 6× to 10×, and the cultivars developed from eastern US blackberry species (i.e., Chester Thornless, Loch Ness, Ouachita, Prime-Ark 45) that are autotetraploids. The complex genetic background of the species complex has made linkage map construction for these species more challenging than for other polyploid Rosoideae species such as F. ×ananassa due to the complex polysomic inheritance patterns observed in autotetraploid segregating progenies. Using a full-sib BC1 progeny of the cross PrimeJim (Arapaho × Ark.830) × Arapaho (PJ×AR) containing 188 individuals, a linkage map was developed for tetraploid blackberry to study the inheritance of two single gene traits, primocane fruiting and thornlessness.127 Since PrimeJim is duplex for thorny habit (ssSS) and nulliplex for floricane fruiting (ffff) and Arapaho is nulliplex for thorny habit (ssss) and duplex for floricane fruiting (ffFF), the progeny was observed to segregate 5:1 for both traits in accordance with tetrasomic inheritance in duplex × nulliplex crosses. A total of 120 molecular markers were scored in the progeny, of which 40 uniparental markers segregated in simplex (1:1), and six segregated in duplex (5:1). A further 74 markers segregated bi-parentally; 56 in double simplex (3:1), 14 in double duplex (35:1) and the remaining 15 in an 11:1 ratio indicating a simplex × duplex conformation (e.g., Aaaa × AAaa or the reciprocal). Highly distorted markers and those segregating 11:1 and 35:1 were excluded from the linkage analysis and a linkage map containing eight linkage groups for Arapaho and nine groups for PrimeJim was resolved; however, following comparison with common markers placed on other Rubus genetic maps, LG4 and LG5 in PrimeJim and LG5 in Arapaho were resolved into single linkage groups containing two fragments each.127 Evidence for double reduction was observed at a number of loci, but no genotypes were observed that could only have originated from double reduction. The total length of the Arapaho and PrimeJim linkage maps was 788 cM and 768 cM, respectively. The thornless trait S was mapped to LG4 of the map, while primocane fruiting F was mapped to LG7.127
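The segregation ratios quoted above for dominant markers in an autotetraploid (1:1, 5:1, 3:1, 11:1 and 35:1) can be derived by enumerating the gametes each parent produces. The following is a minimal sketch, assuming random chromosome pairing and no double reduction; the genotype strings and function names are illustrative and are not taken from the cited study.

```python
from fractions import Fraction
from itertools import combinations


def gamete_absence_prob(genotype: str) -> Fraction:
    """Probability that a 2x gamete from a 4x parent lacks the dominant allele 'A'.

    Assumes the six possible chromosome pairs are equally likely
    (random pairing, no double reduction).
    """
    pairs = list(combinations(genotype, 2))
    absent = sum(1 for x, y in pairs if x == "a" and y == "a")
    return Fraction(absent, len(pairs))


def expected_ratio(parent1: str, parent2: str) -> Fraction:
    """Expected present:absent ratio in the progeny, expressed as N in 'N:1'."""
    p_absent = gamete_absence_prob(parent1) * gamete_absence_prob(parent2)
    if p_absent == 0:
        raise ValueError("marker does not segregate: all progeny carry 'A'")
    return (1 - p_absent) / p_absent


crosses = {
    "simplex x nulliplex": ("Aaaa", "aaaa"),
    "duplex x nulliplex": ("AAaa", "aaaa"),
    "simplex x simplex": ("Aaaa", "Aaaa"),
    "simplex x duplex": ("Aaaa", "AAaa"),
    "duplex x duplex": ("AAaa", "AAaa"),
}
for name, (p1, p2) in crosses.items():
    print(f"{name}: {expected_ratio(p1, p2)}:1")
```

Observed marker counts could then be compared with these expectations using a chi-squared goodness-of-fit test, although markers showing double reduction or preferential pairing would deviate from this simple model.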


Linkage mapping in diploid rose

Debener and Mattiesch36 were the first to present a genetic linkage map for Rosa, composed of AFLP and RAPD markers. The two parental linkage maps, produced from an F1 population of 60 diploid R. multiflora accessions from the cross 93/1-117 × 93/1-119, were composed of a total of 305 molecular markers spanning the expected seven linkage groups for the species. In addition to the molecular markers, two morphological traits, petal number (Blfo) and petal colour (Blfa), were mapped to linkage groups 3 and 2 of the map, respectively. This linkage map was further extended to include an additional 469 markers comprising SSR and RFLP markers, along with gene-specific sequence characterised markers derived from protein kinase and resistance gene analogue polymorphism genes.71 The data set contained a significant number of biparental markers, which permitted the authors to construct an integrated map of both parents, spanning seven linkage groups and containing a total of 520 molecular markers covering 545 cM and estimated to cover more than 90% of the diploid rose genome.

Following the map of the 93/1-117 × 93/1-119 population, Crespel et al.128 presented a linkage map of an interspecific cross between a dihaploid rose accession of the tetraploid R. hybrida and the diploid species R. wichurana which contained 91 progeny. This linkage map, composed of AFLP markers, was used to map two major genes, recurrent blooming (r4) and double corolla (d6), the latter being synonymous with the Blfo gene mapped earlier by Debener and Mattiesch,36 along with two QTL for the number of thorns. The linkage map was later extended through the mapping of 64 EST- and genomic-SSR markers by Hibrand-Saint Oyant et al.,129 which enabled the maps of the parental genotypes to be associated with one another. In that study, a QTL controlling flowering date was also identified. Linde et al.130 developed a linkage map of a cross between parental lines differing in their susceptibility to powdery mildew (Podosphaera pannosa) and identified three QTL explaining over 80% of the variance for resistance to mildew race 9 and other QTL for resistance to natural populations of mildew. The progeny also segregated for four traits controlled by major genes: Blfo and Blfa, mapped previously,36 along with absence of prickles and a striped petal phenotype. The map was composed predominantly of AFLP and RGA markers, along with a modest number of SSR markers and a single SCAR, but the relatively high number (30%) of biparental markers identified permitted an integrated linkage map of the progeny to be developed.

These three linkage maps, along with a fourth described by Shupert et al.131 derived from the cross R. chinensis cultivar Old Blush × (R. wichurana cultivar Basye’s Thornless × Old Blush), were used by Spiller et al.132 to develop an integrated consensus linkage map for rose. The consensus map contained a total of 597 markers, of which 206 were sequence characterised and 59 were common to all four linkage maps. The consensus map spanned a total map length of 530 cM and provided the means to propose a standardised linkage group nomenclature for Rosa. Perhaps most importantly, the consensus map permitted the location of ten phenotypic traits controlled by major genes, QTL for seven quantitative traits and 51 gene-based markers, providing a valuable resource for the development of tools for MAS in rose.

Linkage mapping in tetraploid rose

Despite the existence of many diploid rose species, modern rose cultivars are predominantly tetraploid, derived from around eight diploid and a smaller number of tetraploid progenitor species.13 Diploid rose species are thus of immense value in understanding the genetics of specific traits, which would be desirable to introgress into commercial cultivars, and as a tool for unravelling the complex genetics of their tetraploid relations. However, such studies on the diploid are not directly applicable to rose breeding and are of limited value in understanding the inheritance of traits in commercial tetraploid rose genetic background.

Thus, following the development of diploid linkage maps for the genus, a linkage map developed from a tetraploid F2 rose progeny was reported by Rajapakse et al.7 The progeny studied was derived from a cross between an amphidiploid rose selection 86-7 and a tetraploid cultivar Basye’s Blueberry (86-7×BB). The meiotic behaviour of genotypes of this progeny had previously been investigated by Ma et al.,133 who observed multivalent formation of between 15% and 74%, and those authors suggested that the progeny exhibited partial tetrasomic inheritance. The linkage map of Rajapakse et al.7 was produced from an F2 population derived from the self-pollination of seedling 90-69 from the 86-7×BB cross and contained 52 plants. Approximately 70% of markers segregated in simplex in each parent, 20% in duplex, while the remaining 10% displayed other segregation types, including markers that displayed distorted segregation ratios. While the markers segregating in duplex appeared to display segregation patterns fitting both 15:1 and 35:1 ratios, indicating that both disomic and tetrasomic inheritance occurred in the progeny, the population size was not large enough to discriminate between the two inheritance types. Thus, the resultant linkage map was composed exclusively of dominant markers segregating in simplex and therefore, separate parental maps were recovered, containing 15 linkage groups for the 86-7 map and 14 for the 82-1134 map.
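The difficulty of separating disomic from tetrasomic inheritance of duplex markers in a progeny of 52 plants can be illustrated numerically. The sketch below assumes simple binomial sampling, with the marker-absent class expected at a frequency of 1/16 under disomic inheritance and 1/36 under tetrasomic inheritance; the constant and function names are hypothetical.

```python
from math import comb

N_PLANTS = 52                  # F2 population size quoted in the text
P_ABSENT_DISOMIC = 1 / 16      # marker-absent class under a 15:1 ratio
P_ABSENT_TETRASOMIC = 1 / 36   # marker-absent class under a 35:1 ratio


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k marker-absent plants among n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def overlap(n: int, p1: float, p2: float) -> float:
    """Overlap coefficient of two binomial distributions (1.0 = identical)."""
    return sum(min(binom_pmf(k, n, p1), binom_pmf(k, n, p2)) for k in range(n + 1))


expected_disomic = N_PLANTS * P_ABSENT_DISOMIC
expected_tetrasomic = N_PLANTS * P_ABSENT_TETRASOMIC
shared = overlap(N_PLANTS, P_ABSENT_DISOMIC, P_ABSENT_TETRASOMIC)

print(f"Expected marker-absent plants, 15:1: {expected_disomic:.2f}")
print(f"Expected marker-absent plants, 35:1: {expected_tetrasomic:.2f}")
print(f"Overlap of the two sampling distributions: {shared:.2f}")
```

With only around three versus one to two marker-absent plants expected under the two hypotheses, the sampling distributions overlap substantially, which is in keeping with the authors' conclusion that the population was too small to discriminate between the inheritance types.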

Gar et al.134 developed a linkage map of a cross between two tetraploid R. hybrida cultivars Golden Gate and Fragrant Cloud (GG×FC). Assuming tetrasomic inheritance due to the segregation ratios observed in the progeny, they mapped 403 markers, which were variously sequence-characterized RFLP, SSR and SNP (cleaved-amplified polymorphic site) markers (65%) and arbitrary AFLP markers (35%), to seven consensus linkage groups for each parent spanning 632 cM and 616 cM for the female and male parents, respectively. Each linkage group corresponded to a set of four coupling phase homeologous groups. Due to the high number of sequence-characterized markers mapped in the study, the authors were able to show the homology between the tetraploid map and the diploid consensus map of Spiller et al.132 and to determine a high degree of conservation in macrosyntenic genome structure between the Rosa and Fragaria genomes.

Further work investigating the modes of inheritance in tetraploid rose progenies was performed by Koning-Boucoiran et al.65 Using a progeny from the cross P540 × P867 consisting of 184 genotypes, the segregation of AFLP, nucleotide-binding site and SSR markers segregating in both simplex and duplex with both uniparental and biparental inheritance was scored. While the hypothesis of complete disomic inheritance in the progeny was rejected, marker segregation data displayed evidence for both disomic and tetrasomic inheritance, and so the authors proposed the hypothesis that the progeny exhibited tetrasomic inheritance with some preferential pairing occurring on certain chromosomes. Evidence for double reduction at some codominant SSR loci was also observed; however, it was based on the assumption that all alleles scored mapped to the same locus, and as with the data of Castro et al.127 for blackberry, the mapping data they presented did not conclusively support this assumption. In total, the expected 28 linkage groups (seven sets of four coupling phase linkage groups) were recovered for the female map and 30 groups were recovered for the male map, but not all of the groups contained SSR markers and so four female and five male groups remained unassigned. Using the map, the authors identified three QTL for prickles on the stem explaining 44.1% of the trait variance, and single QTL for both petal number and powdery mildew resistance explaining 12.7% and 8.5% of the variance, respectively.

Genome sequencing initiatives

High molecular weight DNA library development and sequencing

High molecular weight bacterial artificial chromosome (BAC) and fosmid libraries have been developed for Fragaria135,136 and Rosa,137 while in Rubus, a protocol for high molecular weight DNA extraction suitable for BAC library construction was published138 and the use of the subsequently-constructed BAC libraries was reported in the literature,139 but the characterisation of the BAC library was not. The 18 432-clone Fragaria BAC library was characterized through the development of pools for PCR-based screening and clones containing markers from the diploid Fragaria reference map were identified.135 However, the library awaits full exploitation. Additionally, a 33 000-clone fosmid library has been developed from the Fragaria species F. vesca subsp. americana, which was characterized through sequencing of a randomly selected set of 30 clones containing 1 Mbp of genomic DNA sequence and gave the first insight into gene content, composition and distribution in the Fragaria genome.136 A BAC library developed for Rosa was used to assemble a BAC contig containing a minimal tiling path of six clones spanning a 400 kbp region of the telomeric section of LG1 of the R. rugosa genome containing a gene conferring resistance to the blackspot fungus (Diplocarpon rosae).137 Subsequent PCR analysis and sequencing of the clones contained in the 340.1 kbp contig revealed the presence of sequences encoding nucleotide-binding site domains typical of resistance gene analogues, and molecular markers tightly linked to the resistance locus were subsequently developed from this region.140,141 In Rubus, two BAC clones associated with QTL for Phytophthora root rot resistance located on LG3 and LG6 of the Rubus genome were sequenced and genes putatively associated with root vigour were identified, which the authors speculated may have a role in Phytophthora root rot resistance.139

Plastid sequencing and applications

The chloroplast genome of F. vesca was recently described.46 The genome is 155 691 bp long and encodes a total of 78 proteins, 30 tRNAs and 4 rRNAs, and its sequence has been utilized as a powerful tool for phylogenetic and population genetics studies within the genus. While previous studies in Fragaria have used chloroplast DNA sequences for phylogenetic analysis,64,142 the publication of the complete chloroplast DNA sequence for F. vesca facilitated a phylogenetic treatment of the genus Fragaria by Njuguna et al.10 based on whole chloroplast DNA re-sequencing. The study revealed compelling evidence for the maternal ancestry of the extant octoploid Fragaria species, indicating that the origins of the genus rest in East Asia and that the maternal origins of the polyploid Fragaria are rooted in the Americas. The study also permitted hypotheses to be formulated about the evolution of the mating and sexual systems in the genus. More recently, parental species-specific SNP markers identified through whole chloroplast re-sequencing were used in conjunction with nuclear SSR markers to study hybrid zones of F. virginiana and F. chiloensis, which showed that there is a high degree of natural interfertility, significant admixture and relatively little maternal bias among parental and hybrid species, and revealed the extent of the genetic complexity within octoploid Fragaria hybrid zones.143

Recently, the whole chloroplast genome of Potentilla micrantha was sequenced using the Pacific Biosciences PacBio RS sequencing platform.144 The genome spanned a total of 154 959 bp and was structurally similar to the F. vesca chloroplast genome, containing the two expected inverted repeats of 25 530 bp each, a large single copy and small single copy region covering 85 137 bp and 18 762 bp, respectively, and 120 genes, 21 of which were duplicated in the inverted repeats, giving a total of 141 genes of known function (Figure 3). Through comparison with the F. vesca chloroplast genome sequence, the authors determined that gene order within the chloroplasts of the two species was identical. The chloroplast genome of P. micrantha was the first to be sequenced using the PacBio RS platform, which, following de novo assembly, yielded a single, highly accurate contiguous sequence spanning the entire chloroplast, suggesting that PacBio RS data may be of immense utility in whole-genome shotgun sequencing and BAC-sequencing projects.

Figure 3

The chloroplast genome of Potentilla micrantha assembled from PacBio RS and Illumina HiSeq2000 data144 detailing genes transcribed clockwise (inside circle) and counterclockwise (outside circle). Genes are coloured according to functional categorisation, IRA and IRB denote inverted repeats and LSC and SSC denote large and small single copy regions, respectively.
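The region lengths reported for the P. micrantha chloroplast genome can be checked against its total length, since a quadripartite plastome consists of one large single copy region, one small single copy region and two copies of the inverted repeat. The short Python sketch below performs that arithmetic with the figures quoted in the text; the function name is chosen here for illustration only.

```python
# Region lengths (bp) as reported in the text for the P. micrantha plastome
LSC = 85_137  # large single copy region
SSC = 18_762  # small single copy region
IR = 25_530   # each of the two inverted repeats


def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two inverted repeats."""
    return lsc + ssc + 2 * ir


total = plastome_length(LSC, SSC, IR)
print(total)  # 154959, matching the reported genome length

# Gene counts as reported: 120 genes, 21 of them duplicated in the inverted repeats
genes = 120
duplicated_in_ir = 21
print(genes + duplicated_in_ir)  # 141
```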

Whole-genome shotgun sequencing projects

The advent of second-generation sequencing platforms made whole-genome shotgun sequencing of moderately-sized genomes and de novo assembly of draft genome sequences a realistic prospect for non-model organisms. Indeed, the continuing development of second-generation sequencing technology, with greater throughput and longer sequencing reads, is driving down the cost of sequencing, and this has permitted the initiation of sequencing efforts for genomes of Rosoideae species by interested consortia, without the requirement for central funding for the projects. Thus, the sequencing efforts are community-driven, community-funded endeavours, the results of which are proving to be of immense benefit for the communities that developed them as well as many other researchers in the plant sciences field. The sequencing projects currently underway in Fragaria, Potentilla, Rosa and Rubus are at various stages of completion. While the genome of F. vesca has been published and is publicly available,46 for others, such as those of P. micrantha144 and R. chinensis, very little public information was available at the time of writing.

Fragaria vesca genome sequence

The first whole-genome sequencing initiative in the Rosoideae was the genome of the woodland strawberry, F. vesca. The genome of F. vesca cultivar Hawaii 4 (FvH4) was sequenced by an international consortium, and since it was the first genome to be characterised using exclusively short-read technology, the project represented a significant milestone in plant genomics.145 The 39× average depth of sequence coverage of the genome was achieved using predominantly data generated with the Roche 454 platform, supplemented with Illumina GAII and ABI/SOLiD data, and was assembled de novo into 3200 sequencing scaffolds with an N50 of 1.3 Mb. The majority of the sequenced genome (209.8 Mb, representing in excess of 95% of the total assembled genome length) was contained in 272 scaffolds of which 222, covering 197.7 Mb (94%), were anchored to the diploid Fragaria reference map to create seven pseudo-chromosomes (Figure 4).93

Figure 4

The seven pseudochromosomes (yellow) of the Fragaria vesca ‘Hawaii 4’ (FvH4) genome sequence as anchored to the FV×FB reference map (red).93 The pseudochromosomes are composed of a total of 222 sequencing scaffolds covering 197.7 Mb of sequence, representing 94% of the F. vesca genome. cM, centiMorgan.
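Assembly contiguity in this section is summarised by the scaffold N50, the length such that scaffolds of at least that length together contain half or more of the assembled bases. A minimal Python sketch of the calculation follows; the toy scaffold lengths are hypothetical, while the anchored fraction is computed from the 197.7 Mb and 209.8 Mb figures quoted in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths supplied")


# Hypothetical toy assembly: total 300, half 150; 100 + 80 = 180 >= 150
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80

# Fraction of the 272 largest scaffolds (209.8 Mb) that was anchored (197.7 Mb)
anchored_fraction = 197.7 / 209.8
print(f"{anchored_fraction:.1%}")  # 94.2%
```

Because the N50 is weighted by length, a handful of very large scaffolds can yield a high value even when many small scaffolds remain, which is why it is usually reported alongside the scaffold count and the total assembled length.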

Multiplexed cDNA pools representing a comprehensive set of diverse tissues were sequenced and a set of 34 809 hybrid gene models, of which over 90% were supported by transcript-based evidence, were presented. A total of 957 genes identified in the sequence were unique to strawberry, of which 416 were assigned gene ontology categories, while the remaining 541 were previously unidentified predicted proteins. The F. vesca sequence, the first genome of the Rosoideae and second only to Malus to be sequenced for the Rosaceae, is one of the smallest plant genomes so far characterised, and is just 25% larger than the genome of Arabidopsis thaliana.26 Its sequencing represents a gateway to functional and structural genomics investigations in the genus, and it has already found many applications in genetics and genomics research, enabling detailed comparisons with the genomes of other related genera,146 permitting the fine mapping83 and positional cloning147 of genes controlling specific phenotypes, and enabling unprecedented numbers of markers to be identified and mapped in both diploid83 and octoploid Fragaria species.48

Rubus genome sequencing initiatives

Two reference genome sequences are currently under construction for Rubus: one from the commercially important red raspberry, R. idaeus,148 and the other from the black raspberry, R. occidentalis,149 which is only grown commercially in specific regions of the United States. Both species are diploid, and the genome sizes of the two species are similar at approximately 350 Mb and 300 Mb for R. idaeus and R. occidentalis, respectively. Despite their close relationship and small genomes, the assembly strategies for the two genomes differ because of their contrasting levels of heterozygosity.

The cultivar Heritage was selected for the red raspberry sequencing project because of its importance as one of the first commercially important primocane fruiting raspberry varieties. Heritage was initially sequenced to over 100× coverage using a combination of Roche 454 and Illumina technologies. Standard assembly algorithms, such as Newbler (Roche, Basel, Switzerland) and ABySS, resulted in fragmented assemblies. An assembly covering about 252 Mb of the genome with an N50 scaffold size of about 218 000 bp was produced,148 with an estimated SNP rate of approximately 1.5%. To improve the assembly, novel algorithms aimed at addressing the problem of assembling a heterozygous genome were specifically developed (J Price, personal communication). Currently, the contiguity of the assembly is being significantly improved using a two-pronged approach that combines recently-developed Moleculo technology (Illumina), which produces single-molecule derived ‘long-read’ contigs, with improved mate pair library construction methodologies.150

Parallel to the sequencing initiative in red raspberry, a preliminary draft genome of the black raspberry has been produced from the sequencing and assembly of data generated from clone ‘ORUS 4115-3’.149 This clone was chosen because of its highly homozygous genome, and thus extremely low SNP rate of just 0.06%, which is 25-fold lower than that of the red raspberry cultivar Heritage. A combination of overlapping fragment and mate pair libraries was sequenced to a combined coverage of over 370×, resulting in an assembly that produced 2226 scaffolds with an average size of approximately 107 000 bp, an N50 of approximately 350 000 bp and genome reconstruction of approximately 240 Mbp. The Rubus genome sequencing initiatives are currently ongoing, but the expectation is that both sequences will be published and made publicly available in 2014.
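The summary statistics quoted for the two Rubus assemblies are related by simple arithmetic, which the following Python sketch makes explicit. It uses only the approximate figures given in the text (genome size, fold coverage, scaffold count, mean scaffold size and SNP rates); the helper function name is an assumption introduced for illustration, and the results are rough estimates rather than values reported by the sequencing projects.

```python
MB = 1_000_000


def bases_for_coverage(fold_coverage, genome_size_bp):
    """Sequenced bases implied by a given fold coverage of a genome."""
    return fold_coverage * genome_size_bp


# Black raspberry: ~300 Mb genome sequenced to over 370x combined coverage
occidentalis_genome = 300 * MB
raw_bases = bases_for_coverage(370, occidentalis_genome)
print(f"at least {raw_bases / 1e9:.0f} Gb of sequence data")

# Scaffold count x mean scaffold size should approximate the assembled length
assembled = 2226 * 107_000
print(f"approximately {assembled / MB:.0f} Mb assembled")

# Fold difference in SNP rate between Heritage (1.5%) and ORUS 4115-3 (0.06%)
fold_difference = 1.5 / 0.06
print(f"{fold_difference:.0f}-fold lower SNP rate")
```

The product of scaffold number and mean scaffold size comes to roughly 238 Mb, consistent with the stated reconstruction of approximately 240 Mbp.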

Evidence that whole-genome sequencing is becoming a commonplace tool for genetic analysis in plant research increased recently when it was reported that the genome of a relatively unknown Rosoideae species P. micrantha was being sequenced using both Illumina and PacBio sequencing platforms.144 The species, while possessing little perceived economic value, is being developed to study a number of evolutionary processes that have led to the expression of economically important traits in strawberry. The fact that whole-genome shotgun sequencing is being used as an economically viable tool to answer such questions in a non-model, non-crop plant, along with the increasing use of genotyping by sequencing for linkage map development and genomic analysis, suggests that x-generation sequencing is coming of age and will soon be as commonplace as the use of PCR in the plant molecular genetics and genomics toolkit.

Comparative analyses with other species

Since, at the time of writing, only the genome of a single Rosoideae species, F. vesca, is publicly available, interspecific synteny studies within the subfamily have relied on comparisons between the genetic positions of markers in Rubus and Rosa linkage maps with the F. vesca genome sequence to infer synteny between the genomes of the subfamily. Gar et al.134 were the first researchers to evaluate the conservation of macrosynteny between genera within the Rosoideae. In their study, they exploited the conserved sequence of EST markers mapped in Rosa to infer evolutionary relationships between chromosome structure of rose and diploid strawberry. BLAST analysis of the sequence of 70 EST markers mapped to the GG×FC linkage map that had a single clear unambiguous position on the FvH4 genome sequence revealed similar distributions of markers in the genomes of the two species, with conserved chromosomal locations of 77% (54) of the markers. Indeed, synteny between the genomes of the two species was high, with five Rosa linkage groups (RG3-7) corresponding to a single Fragaria chromosome each and the remaining two groups showing evidence for a single fusion–fission event in Fragaria leading to the evolution of Fragaria chromosome 6. In fact, just four translocations and six inversions were required to place all orthologous markers in the same order between the two genomes. Later, using the L×S1 linkage map, Bushakra et al.67 compared the genetic positions of 77 Rubus markers with conserved sequences to their positions on the FvH4 genome sequence and evaluated the conservation of synteny between the two genera. Each of the Rubus linkage groups in that study was shown to correspond with a single Fragaria chromosome, suggesting a high degree of conservation of synteny between Rubus and Fragaria as had been demonstrated previously between Rosa and Fragaria.134 Following the conclusion that each of the seven haploid chromosomes of Rubus had a specific identifiable orthologue in Fragaria, the authors proposed that the Rubus linkage groups be renamed to correspond with those of Fragaria, and this revised nomenclature was adopted by Ward et al.79 for their SNP-based map of R. idaeus and by Castro et al.127 for their blackberry linkage map.
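Marker-based synteny comparisons of this kind reduce to pairing each mapped marker's linkage group with the chromosome on which its sequence is located in the reference genome, and then asking how many markers agree with the predominant pairing. The Python sketch below illustrates one simple way to do this, by majority vote; it is an assumed, simplified procedure and not the method of the studies cited, and the linkage group and chromosome labels in the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def assign_groups(marker_hits):
    """Assign each linkage group to the chromosome hit by most of its markers.

    marker_hits is an iterable of (linkage_group, chromosome) pairs, one per
    marker with a single unambiguous position on the reference genome.
    """
    by_group = defaultdict(Counter)
    for group, chromosome in marker_hits:
        by_group[group][chromosome] += 1
    return {group: counts.most_common(1)[0][0] for group, counts in by_group.items()}


def conserved_fraction(marker_hits):
    """Fraction of markers located on the chromosome assigned to their group."""
    hits = list(marker_hits)
    assignment = assign_groups(hits)
    conserved = sum(1 for group, chromosome in hits if assignment[group] == chromosome)
    return conserved / len(hits)


# Hypothetical toy data: five markers on two linkage groups
toy_hits = [
    ("LG_A", "chr1"), ("LG_A", "chr1"), ("LG_A", "chr2"),
    ("LG_B", "chr2"), ("LG_B", "chr2"),
]
print(assign_groups(toy_hits))       # {'LG_A': 'chr1', 'LG_B': 'chr2'}
print(conserved_fraction(toy_hits))  # 0.8

# The proportion reported for the rose comparison: 54 of 70 markers
print(f"{54 / 70:.0%}")  # 77%
```

A majority vote cannot by itself distinguish a linkage group that spans two chromosomes because of a fusion–fission event from one with a few misplaced markers, so marker order along each group would need to be examined as well.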

Recently, large numbers of markers mapped in the cultivated strawberry48 and F. vesca subsp. bracteata83 have enabled a detailed comparison of the genomes within the genus Fragaria. While the studies revealed a high degree of colinearity between the FvH4 genome sequence and the genomes of both the cultivated strawberry and F. vesca subsp. bracteata, both investigations demonstrated that the mapped positions of a proportion of markers disagreed with their expected chromosomal locations on the FvH4 genome sequence. In the case of the comparison of the F. vesca subspecies,83 22 inversions, 11 intrachromosomal translocations and 40 interchromosomal translocations were observed between the F. vesca subsp. bracteata map and the ‘Hawaii 4’ pseudochromosomes. The reported differences were all supported by more than one mapped genetic marker, and thus the authors postulated that these differences could have been the result of real evolutionary differences between the F. vesca subspecies, errors in the genome assembly of the FvH4 sequence or rearrangements resulting from the hybrid nature of the linkage map (FV×FB) used to anchor the FvH4 sequence. The cost of map development using large numbers of SNP markers is becoming ever more affordable as sequencing technologies and genotyping-by-sequencing (GBS) protocols develop, and thus the development of an F. vesca linkage map saturated with sequence-characterized SNP markers would enable rigorous testing of the anchoring of the FvH4 sequence scaffolds, as well as of the quality of the assembly of the sequencing scaffolds themselves.

Comparative analyses of floral regulation in diploid Rosa and Fragaria

Comparative genomics has recently been used within the Rosoideae to demonstrate the role of orthologous genes in the regulation and control of a trait of economic importance in both rose and strawberry. Timing of flowering is an extremely important aspect of a plant’s reproductive biology, and the ability to flower several times during a year is of significant economic importance in ornamental plants, such as roses for which the value is in the flowers, as well as fruit crops like strawberry. In diploid Rosa and Fragaria, two flowering types have been observed: a single period of flower initiation, described as ‘once-flowering’ in the context of roses by Iwata et al.151 and more commonly referred to as ‘short-day’ flowering in the context of strawberry,152 and continuous flowering (CF), also referred to as day neutral, or semperflorens type in Fragaria.153 Continuous flowering is controlled in strawberry by a recessive mutation in a single gene at the locus first described as s by Brown and Wareing154 and later as the seasonal flowering locus (sfl) to denote the wild-type phenotype by Albani et al.152 In rose, the CF phenotype was described first as r4 by Crespel et al.128 and later as recurrent blooming (rb) by Spiller et al.132 The sfl and rb genes were mapped to linkage groups FG6 and RG3 of the diploid Fragaria and Rosa linkage maps, respectively, by Sargent et al.20 and Spiller et al.,132 which following the synteny comparisons of Gar et al.134 were shown to be homologous chromosomes, suggesting that the genes controlling sfl and rb were also homologous.

The rb locus was shown to co-segregate initially with the RoSPINDLY gene by Remay et al.155 and subsequently also with the RoKSN gene by Kawamura et al.,156 which were confusingly mapped to LG4 of the diploid rose progeny from the cross The Fairy × R. wichurana (TF×RW) used in those studies. This linkage group, however, is the homologue of RG3 on the map of Spiller et al.,132 and the colocation of the BFACT-047 SSR, previously shown to map closely to the sfl locus in the diploid Fragaria FV×FB progeny, supported the homology of the rb and sfl loci in Rosa and Fragaria. RoKSN is a homologue of the floral repressor TERMINAL FLOWER 1 (TFL1), which acts as a photoperiodically regulated repressor of flowering. In the TF×RW population, plants displaying the rb phenotype were shown to be homozygous for a mutant allele of the RoKSN gene containing a 9-kbp copia-like retrotransposon in its second intron. These same plants were shown not to accumulate RoKSN mRNA; the authors hypothesized that the retrotransposon blocked the synthesis or accumulation of the mRNA and proposed a model for the regulation of seasonal flowering by RoKSN.151 In the same study, the authors showed that a 2 bp deletion in the coding region of the Fragaria homologue of TFL-1 (termed FvKSN in that study) caused a truncated TFL-1 mRNA, which coincided with the sfl phenotype.151 A concurrent study by Koskela et al.147 provided functional evidence that TFL-1 is the gene responsible for floral repression in diploid Fragaria. In their study, the authors showed that TFL-1 cosegregated with the sfl locus and that it confers a short-day requirement for flowering through repression of FvAP1/FUL. Additionally, they demonstrated that the 2 bp deletion in the mutant TFL-1 gene leads to FT1-dependent long-day flowering in F. vesca.

Such studies demonstrate the utility of both whole-genome sequences and comparative genomics for the investigation of key traits of agronomic importance in closely related species. It is likely that other traits controlled by both single genes and QTL in the Rosoideae are governed by homologous genes, and thus, as sequencing data from a growing number of species within the subfamily accumulate, the elucidation of the genetic basis of traits in one species will help inform studies in other species and genera and enable the development of markers for marker-assisted selection (MAS) in numerous Rosoideae species.

Development of genome-wide functional genomics resources for Rosoideae species

Numerous studies have been performed investigating the structure, function and expression patterns of specific genes in Rosoideae species. A comprehensive review of this body of excellent work is beyond the scope of this article; however, recently, a number of studies have exploited second-generation sequencing technology to characterize the global transcriptome of Rosoideae species challenged with a variety of different biotic and abiotic stresses.

Prior to the sequencing of the genome of FvH4, Folta et al.157 published a data set derived from mRNA from F. ×ananassa cultivars that had been subjected to a range of treatments including harvesting at different stages of diurnal rhythms, plants subjected to different light conditions, the application of growth regulators along with biotic stresses and treatment with a variety of chemicals. mRNA was sequenced using the Roche 454 platform and assembled into over 32 000 contigs, representing a substantial proportion of the expressed portion of the cultivated strawberry genome, as well as characterizing allelic variation in the genes identified that the authors suggested could be applied to the development of genomics tools for breeding.

Likewise, in the other major Rosoideae species, for which genome sequences are currently not available, a number of approaches have been used to characterize the global transcriptome in specific tissues or during specific stresses. In Rubus, Ward et al.158 sequenced mRNA pools derived from resistant and susceptible R. idaeus plants challenged with Phytophthora rubi and from their corresponding uninoculated controls. Exploiting the genetic similarity between Rubus and Fragaria transcripts, they aligned Rubus transcripts against the F. vesca genome sequence46 and identified over 16 000 putatively expressed genes. Additionally, de novo assembly of transcripts enabled the identification of sequences unique to Rubus and also to the pathogen with which the plants were challenged. The authors reported that this approach constituted a powerful analysis method in non-model organisms and could be used to investigate specific, testable hypotheses.

While no genome sequence currently exists for rose, much work has been performed characterizing the expressed portion of the Rosa genome. Using data generated with both the Roche 454 and the Illumina platforms, a transcriptome database resource was created from mRNA isolated from various stages of floral development, leaves and stems, as well as from drought stressed plants.159 In total, 80 714 ESTs based on transcript clusters longer than 100 nucleotides and containing more than two sequences were identified from which 76 611 peptides were predicted, 20 997 of which were clustered into 13 900 protein families. Comparison with the F. vesca transcript set46 revealed that 44 656 clusters had significant homology to 14 252 Fragaria gene predictions, with a mean identity of over 90%. Thus, 53% of the predicted transcripts were represented in the Rosa database. Similarly, Kim et al.160 employed deep transcriptome sequencing from rose flowers of four cultivated varieties of R. hybrida to generate a database containing an average of 35 385 transcripts per cultivar.

Recently, Kang et al.,161 exploiting the structural genomics resources developed for F. vesca,46 performed genome-scale transcriptomic analyses of floral tissue of Hawaii 4 plants at various stages of fruit development from floral anthesis to enlarged fruits. By studying the developmental changes at the whole transcriptome level, they were able to determine that the endosperm and seed coat of strawberry achenes may play a more vital role in auxin and gibberellin biosynthesis than the embryo itself. Their study provided a global and comprehensive overview of the changes in gene expression during this economically important process, providing a platform for future studies of individual genes and pathways involved in fruit set and subsequent development both in Fragaria and other species within the Rosoideae and the Rosaceae as a whole.

These resources, along with the annotated gene predictions derived from the F. vesca ‘Hawaii 4’ genome sequence by Shulaev et al.46, provide a wealth of global genomic information from the expressed portion of the genomes of the three major Rosoideae genera. Such resources, in combination with the structural genomics resources developed for these species, afford researchers working with Rosoideae species the opportunity to link structural and functional information relating to traits of interest and to develop both a detailed understanding of the genetic mechanisms controlling important agronomic traits, and the development of functional markers for use in breeding and selection.

Concluding remarks and future perspectives

The past two decades have seen prolific developments in the molecular genetics and genomics of Rosoideae species. As in other plant species, a variety of techniques have been used to construct genetic linkage maps from segregating progenies, including the use of molecular markers ranging from isoenzymes and RFLPs,66 arbitrary PCR-based markers and sequence-characterized PCR-based markers such as SSRs48 to array-based and second-generation sequencing-based SNP markers.79,83 Developments in second- and third-generation sequencing technologies led to the first genome sequence of a Rosoideae plant, F. vesca,46 and the promise of sequences of an additional four Rosoideae genomes in the near future. These sequence resources enable an improved understanding of the conservation of synteny between Rosoideae species and genera67,134 and will assist in rapidly expanding our basic understanding of key agronomic traits.

Molecular genetics and genomics studies of Rosoideae species have benefited the whole production system, from breeding to nurseries and growers, by providing tools to effectively evaluate genetic diversity and to fingerprint for variety identification. The tools have also led to a greater understanding of the genetics of numerous traits of agronomic importance, including pest and disease resistance,112 anthocyanin and flavonol production affecting fruit quality102,125 and the timing of flowering,147 and have aided in the development of molecular markers linked to a range of traits desirable in modern soft-fruit and cut flower cultivars.141

To fully realize the potential of the molecular genetics and genomics data that have been amassed for Rosoideae species, MAS will need to be fully integrated into breeding programmes. To meet this goal, there must be an open dialogue between breeders and genomicists leading towards a more focused research strategy dependent on accurate phenotyping. This will lead to high-throughput and user-friendly markers for deployment in breeding and selection of strawberries, raspberries, roses and other Rosoideae crops of the future. With the continued development of ever more cost-effective genomics solutions, we predict that MAS will be quickly incorporated into breeding programmes worldwide, permitting the development of varieties with improved pest and disease resistance, quality and yield.