Introduction

Rice is one of the world's major crops, with an annual production of over 700 million metric tons1. Half of the world population consumes rice as a staple food2. Demand for rice is increasing rapidly with the growth of the human population3. However, current rice production cannot meet this demand, causing severe food security issues. Biotic and abiotic stresses also reduce rice production4. Rice farming is also a way of life for many people, especially in Asian countries5. At present, 1.8 million Sri Lankan families engage in rice farming over 870,000 hectares6. The annual rice production in Sri Lanka is approximately 2.3 million metric tons (MT), which is insufficient to fulfill the domestic rice demand of 3.0 million MT7. Hence, the Sri Lankan government spends about USD 400 million annually to import rice7,8.

Rice production is mainly affected by drought and irregular rainfall patterns caused by climate change9,10,11, adverse soil conditions such as salinity6,12, and pest and disease attacks13. The biotic and abiotic stresses in rice farming can be controlled using agronomic practices such as irrigation, drainage, fertilization, and the application of pesticides. However, the success rate of these control methods is limited13 by the unpredictable nature of climate change, soil degradation, variations in pest dynamics, and the development of pest resistance14. Therefore, breeding is considered the most successful strategy to produce high-yielding and stress-resilient rice varieties15. Improved rice genotypes can also carry traits for higher consumer preference and organic farming16. In the past, rice varietal improvement was conducted with classical breeding techniques, which are tedious, lengthy, and less feasible in cases such as breeding for pest resistance and submergence tolerance. In modern breeding programs, marker-assisted breeding (MAB) is employed instead to introgress valuable genetic loci from landraces and traditional varieties17,18 and the desirable haplotypes of Quantitative Trait Loci (QTL) into improved rice varieties19,20,21.

The decision-making process in a breeding program is crucial for successful outcomes. Formulating decisions before breeding is a multi-step process that consists of identifying the breeding priorities, determining the genetics and estimated breeding values (EBV) of the target traits, and employing pre-breeding methods if required. The economic and technical feasibility, the number of parents for crosses, the number of selfing and outcrossing cycles, the length of the breeding program/cycles, and the selection methods must also be assessed22. In the decision-making process, the market trends based on consumer and other stakeholder preferences must be recognized first23. Subsequently, the novelty and uniqueness of the breeding objective must be assessed before the breeding program is executed22.

The selection of suitable varieties or individual plants as parents and the determination of the selection methods are the two most critical aspects in planning breeding programs24. The parental selection depends on the number of traits prioritized for breeding. When multiple characteristics are to be introgressed, breeders require a prioritized order of parents for stepwise crossing and selection25,26. The decision-making process in breeding is entirely based on the available information on phenotypes, genotypes, pedigree, EBVs of key traits, available budget, field and greenhouse space, desired time-to-market, etc. Although data are indispensable for decision-making in breeding, haphazardly collected information is of less value to breeders. In many conventional breeding programs, most of the data are recorded in field notebooks and stored in the breeding stations, while very little information is available as computerized databases. If an organized database containing all the essential information on the released rice varieties and the parental genotypes used in breeding were available, decisions could be made easily.

The construction of a database with all the necessary information on varieties and their parents promotes data sharing, mining, visualization, and retrieval27. Pedimap is a pedigree visualization software. The required data can be imported into Pedimap from FlexQTL or, with a custom script, from any other database program. Pedimap is used by many contemporary plant genetics and breeding programs worldwide. As stated in Voorrips et al.28, Pedimap can be used to record and utilize breeding history. Pedimap illustrates the available phenotypic and genetic data through pedigrees. All the information, including parentage, qualitative and quantitative data, marker alleles/genotypes, and the calculated identity-by-descent (IBD) probabilities, can be presented in Pedimap. Currently, breeders prefer pedigree visualization tools like Pedimap because they provide quick access to a large pool of genetic and phenotypic data and generate the pedigrees that are essential in making breeding decisions.

In Sri Lanka, the Rice Research and Development Institute (RRDI) is the sole organization conducting rice breeding programs for the national needs. Therefore, in the present study, we report an attempt to organize the information on the released varieties and the parental genotypes of the RRDI breeding programs as a Pedimap-based database, which is a valuable step toward making accurate breeding decisions and speeding up the release of novel varieties.

Materials and methods

Data curation

The data were collected from RRDI, Sri Lanka, and classified under three main categories, namely pedigree history, phenotypic data, and molecular data on rice varieties/landraces/genotypes (hereinafter collectively referred to as cultivars). The male and female parents and the order of crosses were recorded under pedigree history. The average yield of the rice plants, the maturity period in different growing seasons (Yala and Maha, the two main rice-growing seasons of Sri Lanka, which follow the two seasons of monsoonal rains; the Yala season is generally drier29), plant height, basal leaf sheath color and additional color patterns, recommended type of land, level of phosphorus deficiency tolerance, amount of brown rice recovery, milling recovery, head rice recovery, amylose content, gelatinization temperature, the weight of 1,000 grains, shape of the grain, pericarp color, the weight of a kg, the color of the buff coat, and resistance/susceptibility to pests and diseases (brown planthopper (BPH), bacterial leaf blight, and rice blast disease) were recorded under phenotypic data (Supplementary Table S1 online). The available DNA marker alleles, marker positions in the linkage map, and allelic scores were entered under molecular data30,31,32,33 (Supplementary Table S2 online).

Pedimap procedure

A Pedimap input data file was created in MS Excel (2019) and exported as a tab-delimited text (.txt) file (Supplementary Table S3 online). The input file contains four main sections: the header, the pedigree, the marker data, and the IBD probabilities (Fig. 1). The header consists of five essential elements and one additional element. The name of the population and the symbols for unavailable or missing data, null homozygous alleles, and confirmed null alleles are entered in the header section, as shown in Fig. 1a. The name of each cultivar must be a string of text or numerical values without spaces.

Figure 1

The structure of the Pedimap input data file. The input file was created as an MS Excel worksheet and contains four main sections. (a): Header, (b): Pedigree and phenotypic data, (c): Genotypic data. (a): In the header section, the essential elements are highlighted in blue; they contain the population name, the ploidy, and the codes used in the data. (i): abbreviations for missing data (i.e., unknown), possible null alleles, and confirmed null alleles; (ii): NALLELES is only necessary if the IBD probabilities are used, and specifies the total number of founder alleles (i.e., the number of founders times the ploidy). (b): The pedigree section contains the pedigree data of all the individuals and any phenotypic data of the individuals. The pedigree part is highlighted in purple. (iii): founders (initial parents) are entered with missing values for their parents. Phenotypic data are entered in subsequent columns (iv). (c): The genotypic data section (if present) is divided into three parts: a part that gives, for each linkage group, the genetic map (v), general information per locus (vi), and the positions where IBD probabilities are calculated (vii); a part with the observed alleles per locus per individual (viii); and a part with the identity-by-descent (IBD) probabilities per position per individual (ix). The final file must be saved as a text (.txt) file.

After the header, the pedigree section is entered, as shown in Fig. 1b. The first column denotes the name of the variety or landrace, and the second and third columns are reserved for maternity and paternity information, respectively. Numbers and strings can be included to represent the phenotypes in the first three columns. From the fourth column onwards, any desired quantitative or qualitative trait values can be entered. All the collected phenotypic data are entered as shown in Fig. 1b. The third section of the input data file is for marker information. The linkage group of each DNA marker and the marker positions in the linkage map are entered as shown in Fig. 1c. If there is more than one linkage group, all the linkage group maps should be defined successively before the allelic scores are entered. The detailed data for each DNA marker can be inserted after the map positions are given. The number of columns used for allelic scores should correspond to the ploidy level. The fourth section is for IBD probability values (Fig. 1c). The IBD probabilities cannot be calculated within Pedimap but can be calculated using other software, e.g., FlexQTL34, which is a software for QTL analysis (https://www.wur.nl/en/show/FlexQTL.htm; FlexQTL Version 0.1.0.42). FlexQTL can also generate a complete Pedimap input data file.
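The rules stated above for the pedigree section (names without spaces, founders entered with missing parents, and NALLELES equal to the number of founders times the ploidy) can be checked with a short script before the table is exported. The following Python sketch is illustrative only: the cultivar names and yield values are hypothetical, the missing-data symbol `*` and the column labels `NAME`, `PARENT1`, and `PARENT2` are assumptions that should be checked against the Pedimap manual, and the script writes only the pedigree table, not a complete Pedimap input file.

```python
import csv
import io

MISSING = "*"  # assumed missing-data symbol; use the one declared in the header
PLOIDY = 2     # rice is diploid

# Hypothetical rows: (name, female parent, male parent, yield in mt/ha)
rows = [
    ("ParentA", MISSING, MISSING, "3.8"),
    ("ParentB", MISSING, MISSING, "4.2"),
    ("F1_AB", "ParentA", "ParentB", MISSING),
    ("LineX", "F1_AB", "ParentB", "4.5"),
]


def validate_pedigree(rows, missing=MISSING):
    """Return a list of problems found in the pedigree rows."""
    problems = []
    names = [row[0] for row in rows]
    known = set(names)
    if len(known) != len(names):
        problems.append("duplicate cultivar names")
    for name, mother, father, *_ in rows:
        # Names must be non-empty strings without spaces.
        if not name or any(ch.isspace() for ch in name):
            problems.append(f"invalid name: {name!r}")
        # Every named parent needs a row of its own in the table.
        for parent in (mother, father):
            if parent != missing and parent not in known:
                problems.append(f"{name}: parent {parent!r} is not listed")
    return problems


def count_founders(rows, missing=MISSING):
    """Founders are the rows whose two parents are both missing."""
    return sum(1 for _, m, f, *_ in rows if m == missing and f == missing)


def to_tab_delimited(rows, trait_names):
    """Render the pedigree table as tab-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["NAME", "PARENT1", "PARENT2", *trait_names])  # assumed labels
    writer.writerows(rows)
    return buf.getvalue()


problems = validate_pedigree(rows)
n_founders = count_founders(rows)
n_alleles = n_founders * PLOIDY  # number of founders times the ploidy
table_text = to_tab_delimited(rows, ["Yield"])

print(problems, n_founders, n_alleles)
print(table_text)
```

Running such a check before export catches unlisted parents and malformed names early, which is cheaper than tracing a broken pedigree after the file has been loaded.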

Demonstration of the usability of Pedimap

We used examples 1 and 2 given in Table 1 to show how parental cultivars can be selected for crossing based on diverse breeding objectives and the prioritized traits. Example 3 in Table 1 was used to select parents, display the DNA marker allelic representation for MAB, present identity-by-descent calculations, and plan crosses in order to deduce the details necessary for decision-making in breeding.

Table 1 The examples used to demonstrate the use of Pedimap in making breeding decisions.

Estimation of breeding values (EBV)

The selection of parental cultivars in examples 1, 2 and 3 (Table 1), which illustrate the procedure of making breeding decisions using Pedimap, was verified by calculating EBVs36 for yield, maturity period, and plant height. The EBVs were calculated for all rice cultivars according to the formula given below. The true breeding values (TBV) and the accuracies of the EBVs, based on their correlation with the TBVs, were also calculated37. The representative heritability (\(H^{2}\)) estimates for the traits (0.50 for yield, 0.85 for maturity period, and 0.85 for plant height) were obtained from breeding records available at RRDI and other published sources38,39,40.

$$ \text{EBV} = H^{2}\left( P - \overline{P} \right) \quad \quad (\text{Ref}^{37}) $$

where \(H^{2}\) = heritability of the trait; \(P\) = trait value of the individual or cultivar; \(\overline{P}\) = population mean value of the trait; \(P - \overline{P}\) = phenotypic superiority.
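The formula above can be applied trait by trait and the resulting EBVs ranked. The following Python sketch shows one way to do this. The heritability values are those quoted in the text, whereas the cultivar names and trait values are hypothetical and the function names are introduced here for illustration only. For maturity period and plant height, lower values are treated as desirable when ranking, as in Fig. 7.

```python
from statistics import mean

# Representative heritability estimates quoted in the text
HERITABILITY = {"yield": 0.50, "maturity": 0.85, "height": 0.85}

# Hypothetical cultivars and trait values (not taken from the RRDI data)
records = {
    "CultivarA": {"yield": 5.0, "maturity": 105, "height": 80},
    "CultivarB": {"yield": 4.0, "maturity": 120, "height": 100},
    "CultivarC": {"yield": 3.0, "maturity": 135, "height": 90},
}


def estimate_breeding_values(records, trait, h2):
    """EBV = H^2 * (P - population mean of P) for every cultivar."""
    pop_mean = mean(r[trait] for r in records.values())
    return {name: h2 * (r[trait] - pop_mean) for name, r in records.items()}


def rank_by_ebv(ebvs, higher_is_better=True):
    """Rank cultivars from 1 (most desirable) by their EBV."""
    ordered = sorted(ebvs, key=ebvs.get, reverse=higher_is_better)
    return {name: position for position, name in enumerate(ordered, start=1)}


ebv_yield = estimate_breeding_values(records, "yield", HERITABILITY["yield"])
ebv_maturity = estimate_breeding_values(
    records, "maturity", HERITABILITY["maturity"]
)

rank_yield = rank_by_ebv(ebv_yield)  # higher yield is desirable
rank_maturity = rank_by_ebv(ebv_maturity, higher_is_better=False)  # shorter is desirable

print(ebv_yield)
print(rank_yield)
print(rank_maturity)
```

In this toy data set the population mean yield is 4.0 mt/ha, so CultivarA, with a phenotypic superiority of 1.0 mt/ha and a heritability of 0.50, receives an EBV-yield of 0.5.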

Results

Plant genetics and breeding programs worldwide use Pedimap as the platform for maintaining breeding databases and for pedigree visualization. In the RosBREED project41, parental and progeny identification, the tracing of founders, and the calculation of allelic representation are conducted using Pedimap. The pedigree display of Pedimap is used to plan crosses in the Rosaceae research community42,43 and the HIDRAS project44, and to visualize Arabidopsis thaliana crosses45. Selecting parentage, sketching out crossing schemes, estimating the probability of allelic segregation, and choosing compatible molecular markers for MAB can all be achieved using Pedimap28. The use of Pedimap as a pedigree visualization tool for the decision-making process in rice breeding is described using three examples (Table 1).

Example 1: Selecting parents for higher yield, BPH tolerance, short duration and white pericarp with diverse grain shapes

The Pedimap database of rice breeding germplasm in Sri Lanka has a total of 224 input cultivars. Of these, 36 are intermediate genotypes, such as F1 and F2, that were not reported but were included to complete the pedigree in Pedimap. Thus, the database has a total of 188 rice cultivars and accessions with known identities and records (Supplementary Table S1 online and Supplementary Fig. 1 online). In Example 1, we considered a scheme to select cultivars as parents with the parameters given in Table 1 for white pericarp, yield, BPH resistance, maturity period, and grain shape. These thresholds defined a subpopulation of 26 cultivars (Fig. 2). The variation in yield is given in Fig. 2a. Using the color shading, the breeder can select the required parents for crossing to obtain higher yield levels. However, as shown in Fig. 2b, only three cultivars show complete resistance to BPH. If the breeder plans to introgress complete BPH resistance into the novel varieties, only Bg250, At307, and At306 are available as sources of resistance. Figure 2c displays the variation in the maturity period. The breeder can choose the parents depending on the maturity period intended for the novel varieties. Example 1 was planned exclusively to breed for white pericarp. However, grain shape is also a significant quality trait for a variety to succeed in the market. Figure 2d shows the variation in grain shape on which the breeder can base the selection. If we consider all the traits and select At307 as a parent based on the pedigree visualization in Pedimap, At307 can provide the genetic basis for high yield, complete resistance to BPH, a maturity period of approximately three months, and intermediate-bold shaped grains. If Bg450 is selected, the yield is still in the higher range, with moderate resistance to BPH and short-round grains. However, Bg450 brings the alleles for an extended maturity period (Fig. 2).
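The threshold-based definition of a subpopulation can also be expressed as a simple filter over the trait records. The following Python sketch applies the Example 1 thresholds stated in the caption of Fig. 2 (white pericarp, yield ≥ 3.5 mt/ha, moderate or complete BPH resistance, and maturity period ≤ 125 days). It is an illustration only: the cultivars and their trait values are hypothetical, the field names and resistance labels are assumptions, and Pedimap itself performs the selection through its own interface rather than through code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cultivar:
    name: str
    pericarp: str        # e.g. "white" or "red"
    yield_mt_ha: float   # average yield in mt/ha
    bph: str             # assumed labels: "resistant", "moderate", "susceptible"
    maturity_days: int
    grain_shape: str


def meets_example1(c: Cultivar) -> bool:
    """Example 1 thresholds as stated in the Fig. 2 caption."""
    return (
        c.pericarp == "white"
        and c.yield_mt_ha >= 3.5
        and c.bph in {"resistant", "moderate"}
        and c.maturity_days <= 125
    )


# Hypothetical records, not taken from the RRDI database
candidates = [
    Cultivar("Cv1", "white", 5.0, "resistant", 95, "intermediate-bold"),
    Cultivar("Cv2", "red", 4.5, "resistant", 100, "long-slender"),
    Cultivar("Cv3", "white", 3.5, "moderate", 125, "short-round"),
    Cultivar("Cv4", "white", 3.0, "moderate", 110, "short-round"),
    Cultivar("Cv5", "white", 4.8, "susceptible", 105, "long-slender"),
]

selected = [c.name for c in candidates if meets_example1(c)]

# Only complete resistance qualifies as a source for introgressing BPH resistance
full_resistance = [
    c.name for c in candidates if meets_example1(c) and c.bph == "resistant"
]

print(selected)
print(full_resistance)
```

Grain shape is deliberately left out of the filter because Example 1 asks for diverse grain shapes; it remains available in each record for the breeder's final choice.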

Figure 2

The pedigree visualization for Example 1 (Parents with white pericarp, yield ≥ 3.5 mt/ha, moderate or complete BPH resistance, maturity period ≤ 125 days, and diverse grain shapes). The selected pedigree is colored separately for four traits. (a): Yield; (b): Degree of resistance to brown planthopper (BPH); (c): Maturity period; (d): Grain shape. Female and male parentages are indicated by red and purple lines, respectively. The symbol ‘×’ indicates the cross between two parents. The background colors of the cultivar-name boxes indicate the trait values, as shown in the colored legends below.

Example 2: Selecting parents for high/high-intermediate amylose content, higher yield, short duration, and resistance to blast disease

In Example 2, we considered a scheme to select cultivars/accessions as parents with the parameters given in Table 1 for high/high-intermediate amylose content, higher yield, short duration, and resistance to blast disease. These thresholds defined a subpopulation of 37 cultivars/accessions (Fig. 3). The breeder can select the high yielding, short-duration, and blast-resistant cultivars as parents from pedigrees visualized in Fig. 3a–c, respectively. The high, high-intermediate, and intermediate amylose contents are depicted in the pedigree given in Fig. 3d. Only Bw351, At307, Bg407H, At308, and Bg252 show the complete resistance to blast (Fig. 3c). However, At307 is the most promising parent with high yield (Fig. 3a), short duration (Fig. 3b), and high amylose content (Fig. 3d) along with complete resistance to blast (Fig. 3c). Also, Bg407H is the highest yielding (Fig. 3a), blast-resistant (Fig. 3c), and high in amylose content (Fig. 3d). However, Bg407H is a long duration variety compared to At307. Therefore, the breeder may plan to cross At307 and Bg407H to accomplish the breeding objective of Example 2.

Figure 3

The pedigree visualization for Example 2 (parents with high, high-intermediate, and intermediate amylose content, yield ≥ 3.5 mt/ha, moderate or complete resistance to rice blast disease and maturity period ≤ 125 days). The selected pedigree is colored separately for four traits. (a): Yield; (b): Maturity period; (c): Degree of resistance to rice blast disease; (d): Amylose content. Female and male parentages are indicated by red and purple lines, respectively. The symbol ‘×’ indicates the cross between two parents, and ‘×’ inside the circle represents selfing. The background colors of the cultivar-name boxes indicate the trait values, as shown in the colored legends below.

Example 3: Selecting parents for phosphorus deficiency tolerance, higher yield, short duration, resistance to both BPH and blast, and high/intermediate-high amylose content

We selected a set of rice cultivars from the Pedimap database based on the availability of ranked scores for phosphorus deficiency tolerance (PDT). Twenty-four cultivars contain the PDT ranks of high, moderate, and sensitive (Fig. 4a). The same set was illustrated using Pedimap for yield (Fig. 4b), maturity period (Fig. 4c), degree of resistance to BPH (Fig. 4d) and blast (Fig. 4e), and amylose content (Fig. 4f). If At362 is considered as a parent, it can bring resistance to phosphorus deficiency (PD), and BPH, moderate resistance to blast, high yield, average maturity period, and intermediate-high amylose content. Similarly, if Bg250 is selected, it can bring moderate resistance to PD and blast, resistance to BPH, moderate yield and shortest maturity period, and high amylose content (Fig. 4).

Figure 4

The pedigree visualization for Example 3 (parents ranked for phosphorus deficiency tolerance). The selected pedigree is colored separately for six traits. (a): PDT; (b): Yield; (c): Maturity period; (d): Degree of resistance to BPH; (e): Degree of resistance to BLAST; (f): Amylose content. Female and male parentages are indicated by red and purple lines, respectively. The symbol ‘×’ indicates the cross between two parents. The background colors of the cultivar-name boxes indicate the trait values, as shown in the colored legends below. White boxes indicate the cultivars with missing-trait values.

A sample crossing scheme is shown in Fig. 5 to produce a rice variety with high PDT, mean yield ≥ 5.0 mt/ha, maturity period ≤ 105 days, resistant to BPH and blast disease, and higher amylose content. Since there is no reported cultivar for high PDT with complete blast resistance (Fig. 4), the illustrated crossing scheme in Fig. 5 is proposed with two phases. In the first phase, the crossing of At362 and Bg250, followed by numerous rounds of selfing and selection of the most beneficial lines among the recombinant inbred lines (RILs) at advanced generations, would accomplish the breeding objective only without complete resistance to blast (i.e., a moderate level of blast resistance is possible). In the second phase, the selected RILs from phase 1 can be backcrossed to Bg252 as the donor parent to introgress the complete resistance to blast. The breeder can come up with diverse crossing schemes like the one given in Fig. 5 to make effective decisions for breeding and maximize the resource utilization to release varieties in the shortest possible time. The breeder can select any number of parents that are needed to use as sources of resistance and other traits to start crossing. Also, the marker alleles and the IBD probabilities can be checked, as illustrated in Supplementary Fig. 2a,b online, respectively.

Figure 5

The pedigree visualization for planning a crossing scheme. Phase 1: Initial crossing of At362 and Bg250 and pedigree selection to obtain RILs with ≥ 5.0 mt/ha of mean yield, ≤ 105 days of the maturity period, resistant to BPH, moderately resistant to blast and high level of amylose content. Phase 2: Then backcrossing with Bg252 as the donor parent to introgress the blast resistance.

Estimated breeding values (EBV) of the rice cultivars to support the breeding decisions in examples 1, 2 and 3

The calculations revealed that the EBV-yield of the cultivar At307 is 1.03, which ranks second and justifies its selection for the cross in example 1. The cultivar Bg450 ranks 29th for EBV-yield and thus also brings a higher genetic effect for yield. However, Bg450 carries genes for extended maturity (EBV-maturity period rank of 62), whereas At307 ranks 20th; hence, the progeny has a chance to receive genes for a shorter maturity period. For plant height, At307 and Bg450 rank 50th and 24th, respectively, indicating that the progeny would have a strong basis for shorter plant height, which is desirable to prevent lodging and increase fertilizer use efficiency (Table 2; Figs. 6, 7 and Supplementary Table S4 online).

Table 2 Estimated breeding values (EBV) and the ranks of the cultivars based on EBV for yield, maturity period (Yala and Maha seasons) and plant height.
Figure 6

The distribution of the estimated breeding values (EBV) for the rice cultivars assessed. (a): yield; (b): plant height; (c): maturity period in the Yala season; (d): maturity period in the Maha season. The positions of the selected parental cultivars in examples 1, 2, and 3 (Table 1; Figs. 2, 3, 4 and 5) are marked within the histograms. The colored bars in the histograms show the cultivars with desirable EBVs (Table 2; Supplementary Table S4 online).

Figure 7

The comparative visualization of the ranks of rice cultivars based on the estimated breeding values (EBV) for yield, maturity period and plant height. The shorter maturity period and lower plant height were considered as desirable in ranking. (a): 3-D scatter plot depicting the cultivar-positions with respect to EBV ranks of yield, maturity period and plant height; (b): The linear relationship of maturity periods of the rice cultivars reported in Yala and Maha seasons; (c): The linear relationship of the EBVs of the maturity periods of rice cultivars reported in Yala and Maha seasons; (d): The linear relationship of the ranks of EBVs of the maturity periods of rice cultivars reported in Yala and Maha seasons. The positions for the selected parental cultivars in examples 1, 2, and 3 (Table 1; Figs. 2, 3, 4 and 5) are marked on the curves of the figure-panels (a), (b) and (c). In some instances, more than one cultivar is represented by the ‘dot’ positions of the curve, therefore, numbers with pointing arrows are indicated to show the number of cultivars represented by each dot in the figure panels (b), (c) and (d). Because of the patterns observed in (b), (c) and (d), only one axis was used to represent the maturity period in the figure panel (a).

In example 2, our selection of Bg407H and At307 is firmly validated by their EBV-yield ranks of first and second, respectively. However, the EBV-maturity period of Bg407H was ranked 55th and its EBV-plant height was ranked 67th, indicating that Bg407H would bring genes for extended maturity and taller plants. In contrast, At307 ranks 20th for the EBV-maturity period and 15th for plant height, and would therefore have decreasing genetic effects on the extended maturity period and tallness of the plants (Table 2; Figs. 6, 7 and Supplementary Table S4 online).

In example 3, our selection of At362 is validated by the EBV-yield rank. This cultivar provides the second-best possible genetic effect for yield in the breeding germplasm available at RRDI. In the proposed crossing schemes in Figs. 4 and 5, Bg250 and Bg252 were selected to provide the genetic basis for the shorter maturity period, and the EBV-maturity period values of those cultivars support this selection. Also, Bg250 was previously used by RRDI as a breeding parent, and its accuracies of EBV for maturity period were 0.98 and 0.99 for the Yala and Maha seasons, respectively. These near-perfect correlations between the EBV and TBV of Bg250 for the maturity period indicate that it is an ideal parent to provide the genetic basis for a shorter maturity period to the progeny. However, Bg250 and Bg252 provide a higher genetic effect for taller plants than At362 (Table 2; Figs. 6, 7 and Supplementary Table S4 online).

Discussion

The decision-making process in breeding is a tedious task22. Breeding germplasm is complex, with large numbers of improved varieties, traditional cultivars, landraces, wild germplasm, and accessions. There can also be large mapping populations and varieties that remain unreleased for various reasons. The numerous cultivars in breeding germplasm may have extensive records on agronomic data, pest and disease resistance, quality traits, availability of samples, geographic locations, and utilization as parents in diverse breeding programs46,47. With the advent of DNA markers and sequencing technologies, a wealth of genomic information is also available48. However, a recurrent problem in breeding germplasm collections worldwide is that most of the cultivars remain uncharacterized. Thus, they cannot be used directly in breeding activities. Traditionally, breeders keep records in field books. With the development of computer technology, data tabulation is becoming common practice. However, given the highly complex nature of the datasets in breeding germplasm, data tables have limited value to breeders. The tables created with contemporary data management software cannot graphically display complex pedigrees and the variation of qualitative and quantitative traits along with DNA marker information. Such database platforms also lack the pedigree-based capabilities of Pedimap, such as selecting related parental varieties/accessions. In this context, Pedimap provides a considerable advantage, as it can visualize pedigree relationships, trait variations, and any other useful information required for decision-making and planning crosses in breeding programs28. If all the available details on breeding germplasm are arranged as a database, the breeder can define subpopulations based on diverse traits and select the parents for improving multiple traits.
However, simple spreadsheets or manually prepared note pages cannot be used to visualize the essential information and complex pedigrees. Breeding programs often suffer when a breeder retires or moves to a different position49,50,51. A newly hired breeder cannot practically go through the individual records of the existing breeding germplasm. Thus, there is a strong possibility that valuable breeding germplasm will be lost, wasting the time, resources, and effort of the retired breeder and the team. However, if the breeder routinely maintains and updates a Pedimap file for the developing germplasm of breeding materials, newly hired breeders can go through it, identify the value of and gaps in the available material, and plan further. The creation of a Pedimap file is simple, and a novice to informatics can curate and use Pedimap with a little training. Pedimap allows breeders to store data and to fetch and visualize genomic information at any time with less effort and complete accuracy52. The straightforward accessibility, direct data interpretation, ability to customize the views in multiple ways, and editable output file formats are the significant features of Pedimap. The graphic files created can be readily imported into image editing software for further visualization and illustration. Pedimap is not open-source software but can be obtained free of charge by contacting the developers; thus, even breeders in developing countries can benefit from Pedimap28.

In the current study, we created a Pedimap database for the rice cultivars and accessions prominently used by breeding programs in Sri Lanka. With the available information, significant breeding decisions can be made, as we explain in three examples (Figs. 2, 3, 4 and 5). However, it is essential to characterize the cultivars for all the important traits, molecular markers, and SNP haplotypes53 so that breeding decisions can be made effectively17. The EBVs for the parental cultivars and progenies can further be combined with the pedigrees to increase the reliability of the breeding decisions54. The phenotyping methods must be standardized and follow common procedures across different locations, which would greatly increase the power of the Pedimap database. Therefore, breeders should always follow standard, globally accepted phenomic platforms to characterize the material in breeding germplasm44,55. Novel agri-tech practices such as vertical farming, artificial intelligence-powered technology, and re-energizing the plant microbiome would improve conventional breeding, leading to a second green revolution36,56. Therefore, in addition to genotyping technologies, including whole genome sequencing and DNA marker-assisted selection techniques, high-throughput phenotyping tools/phenome platforms are also essential to develop breeding systems further.

The application of EBV in breeding is a common practice to determine the additive genetic effect that each parent can bring to the progeny57. Only three quantitative parameters (yield, maturity period for the Yala and Maha seasons, and plant height) are available for the breeding germplasm at RRDI (Supplementary Table S1 online). We calculated EBVs for these three parameters (Supplementary Table S4 online). The 30 top-ranked cultivars for EBV, together with two other important cultivars used in example 3, are given in Table 2. The accuracy/reliability of the EBVs is also given in Table 2 and Supplementary Table S4 online for the cultivars that were used by RRDI as breeding parents. It is evident from the trait data, EBVs, and accuracies of EBVs given in Table 2 that the breeding germplasm at RRDI contains elite cultivars that can be used as breeding parents in the future. Interestingly, all of these are newly improved and released rice varieties. It is evident that crossing schemes should be planned using these elite cultivars as parents, while carefully adding landraces or exotic types as sources of resistance to avoid linkage drag. The EBVs of yield (Fig. 6a), plant height (Fig. 6b), and maturity periods in the Yala and Maha seasons (Fig. 6c,d) show continuous distributions. For yield and maturity period, the RRDI germplasm has promising rice cultivars. However, for plant height, the high-yielding parents tend to have increasing genetic effects. Lodging is a frequent problem in rice farming in Sri Lanka; thus, the current EBV-plant height estimates imply the necessity of broadening the breeding germplasm with parental cultivars that can provide a genetic basis for short plants.

The relative rankings of the rice cultivars for the EBVs calculated for yield, maturity period, and plant height are given in Fig. 7a. The even distribution of the rice cultivars in the 3-D space (i.e., the box defined by the ranks) highlights the broad genetic diversity of the Sri Lankan rice breeding germplasm. However, the germplasm has to be completely characterized, and the EBVs must be calculated, to understand the complex, multidimensional diversity structure and to carry out the Pedimap decision procedures efficiently when designing crosses. The EBV estimates for the maturity period imply that, except for five cultivars, the cultivars on average do not show any significant differences between the Yala and Maha seasons (Fig. 7b–d). The lack of seasonal variation in the maturity period is an advantage for breeding programs, as two seasons of selection are possible every year to fast-track the variety development process.

In the present study, we only used phenotypic data available for traits to calculate the EBVs. However, for efficient genomic selection, high throughput genomic data such as marker alleles, sequence polymorphisms, and haplotype variants are needed. Thereby EBVs can be translated into more robust genome EBVs (GEBVs). The GEBVs would facilitate the efficient introgression of desirable traits to new varieties through MAB with efficient background and foreground selection schemes58,59.

Conclusion

Pedigree visualization combined with the variation of phenotypic and molecular data in Pedimap is a user-friendly way to plan rice breeding programs with higher accuracy and optimized use of resources. The present study explains the applicability of Pedimap as a decision-making tool to streamline the rice breeding programs in Sri Lanka, and the calculated EBVs strongly support the validity of the decisions based on Pedimap. However, it is also important to note that accurate characterization of the breeding germplasm for phenotypic and molecular data is the critical prior step to harness the value of Pedimap for breeding.