Detecting CSSLs and yield QTLs with additive, epistatic and QTL×environment interaction effects from Oryza sativa × O. nivara IRGC81832 cross.

Chromosome segment substitution lines (CSSLs) are useful tools for precise mapping of quantitative trait loci (QTLs) and the evaluation of gene action and interaction in inter-specific crosses. In this study, a set of 90 back cross lines at BC2F8 generation derived from Swarna x Oryza nivara IRGC81832 was evaluated for yield traits under irrigated conditions in wet seasons of 3 consecutive years. We identified a set of 70 chromosome segment substitution lines, using genotyping data from 140 SSR markers covering 94.4% of O. nivara genome. Among these, 23 CSSLs were significantly different for 7 traits. 22 QTLs were detected for 11 traits with 6.51 to 46.77% phenotypic variation in 90 BILs. Three pleiotropic genomic regions associated with yield traits were mapped on chromosomes 1, 8 and 11. The marker interval RM206-RM144 at chromosome 11 was recurrently detected for various yield traits. Ten QTLs were identified consistently in the three consecutive years of testing. Seventeen pairs of significant epistatic QTLs (E-QTLs) were detected for days to flowering, days to maturity and plant height. Chromosome segments from O. nivara contributed trait enhancing alleles. The significantly improved lines and the stable QTLs identified in this study are valuable resource for gene discovery and yield improvement.

Rice is one of the major crops which grow across wide range of environments and supports majority of the world population mainly located in developing countries of Asia, West Africa, Latin America and the Caribbean. Global rice yields and total rice consumption are projected to rise by 12% and 13% respectively by 2027 1 . However there is a large gap between anticipated production and consumption due to expected population rise and restricted cultivated area expansion in major rice growing countries of Asia. Even though a large number of rice cultivars exist in the world, the most widely grown varieties are few and they harbour a limited genetic diversity due to common parents and origin within Oryza sativa species. The crop is exposed to multiple stresses and evolving pests, as range of cultivation is wide and also climate scenario is changing. Therefore, much focus is needed to widen the gene pool of cultivated germplasm to sustain and improve yield potential. Prebreeding and utilization of wild accessions for introgression of novel genes / QTLs / chromosomal regions are gaining more importance in rice improvement 2,3 .
Chromosome segment substitution lines (CSSLs) are prebreeding genetic stocks which can be utilised in crop breeding and genomics studies for improvement of various traits 4 . Wild species are a reservoir of valuable genes, however the unpredictability of genotypic potential of wild species in a cultivar background, cross incompatibility and sterility in inter specific crosses are common hindrance in prebreeding programs. CSSLs developed using wild donors can overcome these problems as a bridging prebreeding product and can be used for identification, mapping and introgression of novel alleles from the wild. CSSLs can be utilized for faster and easier development of NILs and varieties than using an un-adapted or wild genotype directly. Since a single major introgression is present in each line, its effects are not masked by other components of donor genome and thus linkage drag and negative effect on QTL is limited in an ideal CSSL set 4,5 .
ICAR-Indian Institute of Rice Research, Hyderabad, India. ✉ e-mail: sarla_neelamraju@yahoo.com open About fifty inter-specific and intra-specific CSSL libraries are reported in rice in the last two decades. Most of the initial CSSLs developed in rice involved indica/ japonica crosses but the use of wild species as donor has increased since the last two decades. Accessions from wild rice species viz., O. rufipogon, O. nivara, O. glumaepatula, O.barthii, and O. meridionalis were used as donor parent in CSSL development programmes [6][7][8] . Furuta et al 9 . developed CSSLs using Oryza rufipogon accessions as donor. Rangel et al 10 . developed CSSL in indica cultivar BG 90-2 using O. glumaepatula. Arbelaz et al 11 . developed two populations of interspecific introgression lines (ILs) in cv. Curinga, a tropical japonica upland cultivar using two wild donor species, O.meridionalis and O. rufipogon Griff having 76.73% to 97.6% donor genome respectively. CSSLs are ideal for accurate mapping of gene/ QTL and genetic analysis, largely due to low interaction of genetic backgrounds 12 . Thus, CSSLs have been used to identify chromosome regions affecting any trait 13 , mapping, fine-mapping and cloning genes from novel genomic regions and QTL hotspots 14 . In rice, superiority of introgression lines over recurrent parents has been reported [15][16][17][18][19][20][21][22] , however the effect of alien introgression using CSSLs and their advantage over RILs/ BILs has not been studied extensively. Ideally if each CSSL has only a segment from the donor it becomes easy to define the QTL position. Usually a primary CSSL population contains more than one donor segment in each line, so the gene and marker frequencies do not fit into mendelian ratios as that of F 2 and BC mapping population 23 . Therefore, stepwise regression-based likelihood ratio test (RSTEP-LRT) is used for QTL mapping in CSSLs. Inclusive composite interval mapping (ICIM) 24,25 was used in QTL mapping and to study the epigenetic interactions due to its unbiased QTL effect estimation 26,27 .
In our previous study, population was developed from Swarna /O. nivara IRGC81848 for QTL mapping for yield traits. Individually advanced lines were genotyped at BC 2 F 8 and a set of 74 CSSLs were constructed with donor genome coverage of 89% 20 . Another population developed from a cross between Swarna and O.nivara IRGC81832, was evaluated for yield and quality related traits and QTLs were mapped at BC 2 F 2 generation 16,21,28 . This population was advanced to BC 2 F 8 and was phenotyped in this study for 3 consecutive wet seasons and genotyped using SSR markers to identify a set of CSSLs representing complete genome of O. nivara as overlapping segments substituting Swarna segments and to detect major QTLs for yield traits. Phenotyping in multi environments is essential to determine stable constitutive QTLs as well as environment specific adaptive QTLs. The mapping population was tested for 3 years for identifying environment and epistatic interactions of QTLs using multiple mapping procedures to identify stable significant QTLs and to understand the genetic implications of these QTLs.

Methods
The research work was conducted at Indian Institute of Rice Research (IIRR), Hyderabad during the period 2014 to 2017 wet seasons. The area is located at latitude of 11°00′ N longitude of 77°00′ E and an elevation of 426.72 m above mean sea level (MSL).

Plant material. Advanced back cross introgression lines derived from a cross between an elite cultivar
Swarna (O. sativa) as a recurrent parent and a wild accession O. nivara IRGC81832 as a donor parent 16 were used in this study at BC 2 F 8 to BC 2 F 10 stage after conducting single panicle selection for consecutive generations from BC 2 F 5 and these NPK or K lines are now designated as NK 1-90. Phenotypic evaluation. The experiment was conducted following randomised complete block design (RCBD) with three replications at experimental plots in Indian Institute of Rice Research (IIRR), Hyderabad. A set of advanced backcross introgression lines (BC 2 F 8 ) along with recurrent parent Swarna, were raised during wet seasons of 2014 (E1), 2015 (E2) and 2016 (E3). 25 days old seedlings were transplanted following 15 cm interplant spacing and 20 cm inter row spacing. Package of practices for fertilizer application and pest control measures under irrigated conditions were followed to avoid yield loss. Randomly selected five plants from each replication were evaluated for 9 yield-related traits, viz., days to 50% flowering (DFF), days to maturity (DM), plant height (PH), number of tillers per plant (NT), number of productive tillers per plant (NPT), biomass (BM), single panicle weight (PW), grain yield/plant (GY), bulk yield per square metre (BY) and two derived traits; harvest index (HI) and per day productivity (PDP). The observations on yield and morpho-agronomic traits were recorded from the field experiments following Standard Evaluation System (IRRI, 2013).
The phenotypic data was analysed using Statistical Tool for Agricultural Research (STAR v2.0.1, IRRI) for the descriptive statistics (Supplementary Table 1) and ANOVA (Supplementary Table 2). Broad-sense heritability was calculated using the formula: h 2 = σg 2 /(σg 2 + σe 2 ), where σ 2 g is the genotypic variance and σ 2 e is the environmental variance 29 . Levene and Bartlett test for homogeneity of variance was conducted for yearly means and a Shapiro-Wilk test for normality (Supplementary Table 3 and 4) were performed to test the assumptions of the ANOVA (Supplementary Table 5). Significant pairwise comparisons were carried out comparing with control Swarna. Boxplot and histogram of each trait in different seasons were derived using R software 30 . Phenotypic and genotypic correlation among trait averages was computed for Prob > |r| at 0.05, 0.01 and 0.001 in STAR (STAR v2.0, IRRI) using Pearson's product-moment correlation using replicated phenotypic data across the years. The observed trait data were analyzed with PB tools (Version 1.4, http://bbi.irri.org/products) by considering blocks as random and entry as fixed effect to obtain LS mean estimates and were then used for QTL analysis. SSR genotyping. Genomic DNA was extracted from leaf using CTAB method. The extracted DNA was estimated at 260 nm/280 nm and 260 nm/230 nm for quality and quantity using UV spectrophotometer. The estimated DNA quantity was between 94 to 1294 ng/ul with 1.8 to 2.1 OD at 260 nm/280 nm and 260 nm/230 nm. For genotyping, working DNA concentration of 40 ng/ul was made as template. A set of 300 markers including 165 SSRs 31 which are equally distributed throughout rice genome were used to study parental polymorphism and 140 SSR markers, identified as polymorphic were used in screening CSSL population. Genomic DNA was isolated www.nature.com/scientificreports www.nature.com/scientificreports/ from fresh leaf samples of 90 lines following CTAB method. 140 SSR makers distributed across the genome were used to genotype the mapping population. PCR was performed in thermal cycler (G-STORM, USA) with a final reaction volume of 10 µl containing 15 ng of genomic DNA, 1X assay buffer, 200 μM of dNTPs, 1.5 mM MgCl 2 , 10 pmol of forward and reverse primer and 1 unit of Taq DNA polymerase (Thermo Scientific). PCR cycles were programmed as follows: initial denaturation at 94 °C for 5 min followed by 35 cycles of 94 °C for 30 sec, 55 °C for 30 sec, 72 °C for 1 min and a final extension of 10 min at 72 °C. Amplified products were resolved in 3% agarose gel prepared in 0.5 X TBE buffer and electrophoresis was carried out at 120 V for 2 h. Gels were stained with ethidium bromide and documented using gel documentation system (Alfa imager, USA).
Identification of CSSL and QTL analysis. SSRs were scored as co-dominant markers, and recorded as "A" for band presence female parent Swarna and "B" for presence of male O.nivara parent and "AB" for heterozygous bands. CSSLs were identified using segregation data of 140 polymorphic SSR markers in 90 BILs in the background of Swarna genome using CSSL Finder (http://mapdisto.free.fr/CSSLFinder/). Substituted chromosome segments were estimated on the basis of origin of alleles of flanking markers. A chi-square test for the 1:1 segregation ratio was conducted in the advanced backcross mapping population and no segregation distortion was found. Linkage map was constructed based on genotypic data of 90 BILs using 140 polymorphic SSR markers on all chromosomes. MAP functionality was used for the construction of genetic linkage map involving interfaces of grouping, ordering, and rippling procedures in map construction. Ordering of loci is done by an efficient approximate algorithm of nnTwoOpt in ICIM.
The Kosambi's mapping function was used to transform recombination frequencies between linked loci into cM distances. QTL mapping was performed by single marker analysis (SMA), inclusive composite interval mapping (ICIM-ADD) of additive and dominant QTL and inclusive composite interval mapping of epistatic QTL (ICIM-EPI) functions implemented in the QTL IciMapping v4.1 (www.isbreeding.net) in a stepwise regression for the adjusted means of each trait. The threshold LOD value was determined by a permutation test involving 10000 runs at a significance level of p = 0.05 and the QTLs in a particular genomic region with LOD values larger than this threshold for each trait were used. The LOD test statistic used was −2ln (L0/L1), where L0/L1 is the ratio of the likelihood under the null hypothesis (there is no QTL in the interval) and the alternative hypothesis (there is a QTL in the interval). The QTLs were deemed to exist only at positions where the LOD score exceeded the corresponding significant threshold. Estimation of position, genetic effects and phenotypic variation percentage of QTLs was done at the significant LOD peak in the region under consideration. The phenotypic variance explained by a single QTL was estimated by the square of the partial correlation coefficient (R 2 ). Estimates of the R 2 value or the phenotypic variance explained (PVE) and the additive effects of QTL at their peak LOD positions were obtained from the output of QTL analysis. Parent for trait enhancing allele was identified using the sign of the additive effects, negative sign denotes that trait -enhancing allele is from O. nivara.
Mapping of additive and digenic epistasis genes with chromosome segment substitution lines was carried out with QTL IciMapping v4.0 26 with CSL (Chromosome Segment Line) option using method 'Likelihood ratio test based on stepwise regression for additive QTL' (RSTEP-LRT-ADD) which specifically deals with QTL mapping in collections with multiple introgressions per line 32 . Parameters setting was 'Multi collinearity control'= −1 (equivalent to deletion of duplicated markers only) and 'PIN' = 0.0001 (PIN: the largest-value for entering variables in stepwise regression of residual phenotype on marker variables). Logarithm of odds (LOD) threshold values for each trait was computed after 10000 permutations using a type I error=0.05 and reported in corresponding tables as footnotes. The proportion of variance explained by a single QTL was as provided from single marker analysis in RSTEP-LRT-ADD. QTL-by-environment interaction analysis in biparental population was carried out using MET. Epistatic effect QTLs were analysed using ICIM-EPI with a probability value (PIN) of 0.0001 and threshold LOD 26 .

Results
Phenotyping. The variance components and descriptive statistics of yield related traits are presented in Table 1. The genotypes showed significant variation for BY, PH, DFF, DM, BM and TDM in the BIL population. High broad-sense heritability (h 2 ) of 96.43%, 95.98%, 94.15% and 58.45% was observed for days to flowering, days to maturity, plant height and panicle weight respectively. The variance component associated with genotypes contributed to 92% of total variation. The mean performance of BILs using pooled data for three seasons showed significant increase over Swarna for 7 traits. 31 transgressive BILs that were significantly superior or inferior to the recurrent parent were identified after evaluating the BILs for yield traits over three years from their frequency distribution ( Supplementary Fig. 1). Of these BILs, NK61 out-performed Swarna for GY and NK58 and NK83 for BY in all three seasons. 18 BILs with significantly lower DFF than Swarna were identified. For PH and BM, 13 and 7 BILs were identified respectively with significantly higher trait values compared to Swarna (Supplementary Table 6).
Among 31 lines identified through significant pair wise comparison 21 showed significantly positive values for 7 traits and 16 showed significantly negative values for 3 traits compared to Swarna. Highly significant phenotypic associations were observed among the traits and across years ( Table 2). Although all yield traits were positively associated and were significant at P < 0.01, DFF and DM showed negative association with all yield traits except biomass. GY showed strong positive correlation (P < 0.01) with NT, NPT, BM, PW, TDM, HI and PDP. PH was highly significantly correlated with both TDM and BM. GY showed highly significant correlation with BM and BY. PW was significantly correlated with GY, TDM, HI, BY and PDP but negatively with DFF and DM. CSSL identification. When genotypic data was subjected to CSSL finder, 70 out of 90 lines formed a minimal set of CSSLs with 11.94% of donor and 88.6% of recurrent parent segments (Fig. 1). The minimum donor genome in a CSSL was 2.33% in NK24 and maximum was 26.51% in NK61. NK61 was significantly better than the Swarna for YLDP and PDP and negatively significant for DFF and DM. NK24 was not significant for any trait. The number of substituted chromosome segments per line ranged from 2 to 16 with an average of 8 in the CSSL set and the number of segments observed per chromosome were 2 to 11 with an average of 6 segments. 70 CSSLs represented 94.4% of O. nivara genome but the donor genome was absent in small regions on chromosomes 5, 6, 9, 11 and 12. Chromosomes 5, 9, 11 and 12 were represented by 3 CSSLs each and chromosome 1 by 14 CSSLs. The line NK24 had two donor segments on chromosomes 1 and 5 while CSSLs NK2, NK11, NK18, NK19 and NK21 had only three segments each and on different chromosomes (Supplementary Table 7 Construction of linkage map and QTL mapping. QTL detection was carried out using composite interval mapping (CIM), mapping of additive and digenic epistasis genes with chromosome segment substitution lines (CSL) and QTL-by-environment interaction analysis (MET) to dissect the genetic basis of yield traits. Consistent detection of QTLs was observed across the analytical methods in same genomic region. Eight chromosomes (chromosome 1, 2, 3, 4, 7, 8, 11 and12) showed significant genomic regions associated with major QTLs for yield traits (Fig. 2). LOD scores of QTLs ranged from 2.55 to 22.98 and explained PVE from 6.51% to 46.77% (Table 3). QTLs were detected above threshold LOD values for all the traits except for number of productive tillers after 1000 permutations. 3 QTLs each for days to flowering and days to maturity were detected in chromosome 3, 7 and 12. For plant height 3 QTLs qPH1.1, qPH1.2 and qPH4.1 were mapped. One QTL each was identified for number of tillers (qNT2.1), panicle weight (qPW11.1), grain yield (qGY11.1), biomass (qBM1.1) and harvest index (qHI4.1). Two QTLs, qBY8.1 and qBY11.1 were mapped for bulk yield and qTDM1.1 and qTDM11.1 for total dry matter. For per day productivity (PDP) three QTLs were detected in chromosomes 1, 8 and 11. The QTL on chromosome 1 with the flanking markers RM128 and RM431 was associated with PH, BM and TDM and    Table 10). QTLs that were mapped to the genomic regions located within the www.nature.com/scientificreports www.nature.com/scientificreports/ same set of flanking markers were considered as representing the same QTL. The QTL region within RM128 and RM431 on chromosome 1 showed PVE ranging from 38.74 to 44.11% for PH across the environments, and the same region showed QTL for BM also in all the 3 years. Similarly, the locus between RM16 to RM55 at chromosome 3, RM214 to RM11 at chromosome 7 and RM19 to RM247 at chromosome 12 showed major QTLs for DFF and DM consistently over the years. QTL for NT was detected between RM8080 and RM341 at chromosome 2 in all the 3 years, with PVE varying between 11.28 to 15.32%. Similarly, region RM206 to RM144 at chromosome 11 was consistently identified for TDM. QTLs for GY and PDP were detected on chromosome 8 (RM137-RM408) and chromosome 11 (RM206-RM144), in 2 consecutive years.
The locus RM128-RM431 at chromosome 1 showed QTLs for grain yield, harvest index, biomass, total dry matter and plant height by multi environment testing (MET) QTL mapping and also with high PVE and LOD. Another region RM137-RM408 at chromosome 8 had QTLs for total dry matter, bulk yield and per day productivity and QTLs showed additive by environment interaction. Similarly, the region between RM206 and RM144 on chromosome 11 also harboured QTLs for plant height, biomass, total dry matter, bulk yield, grain yield, per day productivity and panicle weight showing high PVE, PVE (additive) and PVE (additive by environment). 21 out of 31 QTLs showed negative additive effects, indicating that the majority of introgressed segments from O.nivara in these regions contributed to trait enhancing alleles.
QTL mapping using CSSL set. A likelihood ratio test based on stepwise regression-based likelihood ratio test (RSTEP-LRT) was used to map QTLs in marker defined chromosomal regions of 70 non-idealized CSS lines. This method ensures accuracy of QTL detection using CSSLs. The results of QTL mapping from BILs and CSSL set or a subset representing the whole donor genome of mapping population was compared.
Altogether 10 marker loci were identified as significantly linked with trait improvement using 3 environment data (Table 5). Among these RM214, RM55 for DFF and DM; and RM431 for PH were consistently associated with trait increase in three environments and also the average environment. RM206 showed trait increase for BY, PDP and GY in 2015, 2016 and in mean. Similarly RM25 was a significant locus for panicle weight in CSL QTL mapping for two environments. These markers also showed significant trait linkage when analysed through single marker analysis using the complete set of back cross introgression lines. The mean value of the QTL genotype of parents M(QQ) and M(qq) showed that Swarna has higher values of QTL genotype for all traits except DFF and DM.  (Table 6). Other traits did not show any epistasis above LOD 5. At LOD > 3 we found minor interactions between number of loci across chromosomes (Fig. 3). A total of 6 digenic interactions were detected for days to fifty percent flowering. Interestingly, three of them occurred between significant main effect QTLs for DFF and non-significant loci in other chromosomes. A negative digenic interaction involving loci at chromosome 1 and chromosome 5 was detected and both additive effects were from O. nivara. Four of the epistatic interactions for DM were similar in case of DFF involving the main effect QTLs identified through ICIM. However, two different loci RM493-RM1220 on chromosome 1 and RM55-RM85 on chromosome 3 showed epistatic interaction with the same locus RM19-RM247 on chromosome 12 for DFF and DM respectively. An epistatic interaction between chromosome 3 (RM55-RM85) and chromosome 12 (RM19-RM247) explained a high phenotypic variance of 25.85%. Six digenic loci showing epistatic interactions were detected for plant height. Single locus RM16742-RM307 on chromosome 4 showed interactions with loci in chromosomes 1, 3, 5, 8, 9 and 10. It is worthwhile to note that the epistatic interactions had slightly decreased LOD values of already detected main effect QTLs. The total PVE explained by the epistatic QTLs was lower than the PVE of main effect QTLs for plant height.
Ethics statement. The authors declare that the experiments comply with the current laws of the country in which they were performed and in compliance with ethical standards.

Discussion
Swarna is one of the most popular mega varieties in India and other rice growing countries and well-known for its high yield, long duration and stable performance across different ecosystems 33 . Introgressions from wild species O.nivara are reported to improve yield traits in Swarna and a variety DRRDhan 40 has also been released 34 . Improved backcross introgression lines with favourable agronomic traits and high yield stability were also identified through multi environment trials in Swarna background 29 . In this study the advanced back cross introgression lines (BC 2 F 8 ) of Swarna/O.nivara were phenotyped for three years for yield and related traits and transgressive segregation with significantly positive and negative difference with the recurrent parent was observed ( Supplementary Fig. 2) indicating the existence of allelic variations for each trait in each parent. Continuous frequency distribution was observed for all traits demonstrating quantitative inheritance of the traits. DFF, DM, NT, NPT, GY and PDP showed kurtosis values above 1 and among these only PDP showed +ve    www.nature.com/scientificreports www.nature.com/scientificreports/ skewness. Except these, the absolute values of skewness and kurtosis were less than 1 for all traits, and showed normal distribution indicating the suitability for QTL analysis. In addition, PH, BM, BY and TDM also showed higher standard deviations suggesting significant effect of environment in their phenotypic expression.
Significant pair wise comparison showed 31 lines significantly different than Swarna for atleast one trait and 27 lines showed significant difference for more than one trait. This suggests interspecific combination of two divergent parents results in accumulation of superior alleles with additive and complementary effects 35 which can help to not only broaden the genetic base but create desirable novel variation in cultivar gene pools. Lines NK58 and NK83 showed significantly higher bulk yield, biomass and plant height and are valuable breeding materials for yield improvement. NK61 which showed significantly higher grain yield than Swarna and significantly lower days to flowering and maturity has commercial value as an early, high yielding breeding line. A set of lines which are either positively or negatively significant over Swarna for days to flowering and days to maturity were identified in this biparental population and they are useful in dissecting the genetics of heading date in rice.   www.nature.com/scientificreports www.nature.com/scientificreports/ height was found between RM226 and RM5 in chromosome 1. Kadambari et al. 37 reported a major QTL alternate semi-dwarf 1 (asd1) governing plant height on chromosome 1(chr01:2559897-2564321) approximately 36 Mb upstream from sd1. Several factors, such as the number of detected QTL, the degree of linkage of markers with the genes, the range of variation within the population, the population size, genetic background and phenotyping environments influence the magnitude of phenotypic variation explained by a QTL 38 . The main limitation in using QTLs for marker assisted selection and breeding is the poor consistency of QTL effects across environments and so stable QTLs need to be identified 39,40 . Dissection of candidate chromosomal regions to elucidate

Trait
Chr   www.nature.com/scientificreports www.nature.com/scientificreports/ the genetic mechanisms for grain yield, using multi-environment testing is a key approach in crop improvement. Therefore, to detect such chromosomal regions we conducted MET QTL mapping by validating the phenotypic performance in three environments for accurate mapping and effective use of QTL/genes.
QTLs for DFF and DM colocalised in three regions on chromosomes 3, 7 and 12 and explained more than 55% cumulative phenotypic variance in these regions for each trait. Thomson et al. 41 identified a stable QTL dth7.1 with peak marker RM214 using BC 2 F 2 population O. sativa cv. Jefferson/ O.rufipogon IRGC105491 in the same chromosome region as in our study. RM11, the other flanking marker of qDFF7.1 and Qdm7.1, was also reported as linked to heading date 42,43 . The region flanked by RM214 -RM11 on chromosome 7 was reported as a major QTL Qhd7 42 , Hd4 44 , dth7.1 41,42 , qDTH-7-1 45 and hd7 46 for heading date. Similarly, flanking markers (RM16-RM55) for qDFF3.1 were associated with QTLs dth3.2, dth3.3, dth3.4 41,47 and RM19 of dth12.1 was reported linked to heading date previously 48 . Interestingly these regions were also reported as linked to several other traits e.g. yield per plant, grain width, number of tillers per plant, spikelet fertility and grain weight 18,36,49,50 (Supplementary Table 11). QTL analysis for heading date conducted over five years located stably expressing QTL on chromosomes 3 51 .
The QTL qPW11.1 flanked by markers RM206 and RM144 was detected for PW on chromosome 11 and explained 15.40% of the phenotypic variation. The regions RM128-RM431 at chromosome 1, RM137-RM408 at chromosome 8 and RM206-RM144 at chromosome 11 showed clusters of significant QTLs for various yield traits with QTL by environment interaction. The chromosome 11 cluster had QTLs for panicle weight, grain yield, bulk yield, total dry matter and per day productivity. RM206 was reported to be significantly associated with panicle length, grain sterility, spikelet setting density, percent sterility, plant height, number of productive tillers, yield per plant, 1000 grain weight, panicle diameter and seed shattering 18,52,53 . Marathi et al. 52 reported that RM144 was linked to panicle length and yield per plant. A major QTL containing genes Pikh/Pi54 was previously detected between RM206 and RM144 54 for resistance to neck blast and leaf blast 55,56 . RM206 is also reported as linked to canopy temperature at reproductive stage in drought 57 . RM408 was found closely linked to multi-tiller and dwarf mutation on the short arm of chromosome 8 58 .
These traits were also correlated and have contributory effect on each other. Wang et al. 38 suggested pleiotropic effects of individual locus or tight linkage between loci in the same region is responsible for significant correlation between traits. The nature and direction of these strong associations suggests that these traits are under similar genetic control, conferred by pleiotropic effects of the underlying genes. It is mainly due to cluster of co-expressed genes serving different functions and selection of these QTLs may help in simultaneous improvement of these traits 59 . The QTL effect always depends on the background genome as well as direction of correlation among the traits even though the QTLs for correlated traits are colocalised 60 . Development and phenotyping of segregating population of recombinants of these regions are promising in unravelling genetic mechanisms underlying these regions 61 .
Grain yield and yield component traits are typically quantitative in nature and are influenced by environment and epistatic interactions 62 . A total of 10 putative major effect QTLs were detected across three environments and 31 QTLs with significant LOD values were identified for all yield traits. QTL region flanked by RM206 and RM144 at chromosome 11 for panicle weight was detected with significant LOD values and highest PVE additive × environment interaction suggesting it as a stable QTL. Even though, QTLs related to days to flowering, maturity and plant height were consistent with significant LOD levels their phenotypic variance in MET analysis was low indicating major effect of environment and the environment interaction on phenotypic expression of these traits. Hosseini et al. 63 detected QTL clusters for rice grain appearance quality traits with main, epistatic and QTL×environment interaction effects using two populations of backcross inbred lines (BILs) and suggested that co-located QTL or genes with pleiotropic effects could control these traits. Li et al. 48 carried out QTL × E interaction studies of rice doubled-haploid lines in nine diverse environments for heading date and plant height and found varying effects of QTL both in magnitude and direction due to more pronounced epistatic effects and non www.nature.com/scientificreports www.nature.com/scientificreports/ expression of some QTLs, weak expression or their different expression in different environments. Quantitative traits are also influenced by interaction between non allelic genes and QTL × environment interactions 63 and even the minor effect interactions influence trait expression significantly. It is observed that in most of the cases, complete effects of detected QTLs are not expressed in advanced generations or in marker assisted selection. This is mainly due to the fact that generally the major effect of QTLs are determined only by the additive component, however epistatic and environmental interactions are mostly ignored. Simultaneous detection of major effect, epistatic and environmental interaction is required for unbiased determination of QTL effect and breeding value 64,65 .
A total of 17 pair wise epistatic interaction QTLs were in this study above a threshold LOD of 5 with significant additive × additive (AA) effects and showed interaction loci for either days to flowering or plant height. We found only 17 significant epistatic interactions but a large network of weak epistatic interactions among yield traits. Additive by additive effects of epistatic QTLs were comparatively lower than additive effect of any of the corresponding QTL. This might be mainly due to presence of few individuals with rare loci combinations in the populations, interference from other loci and low epistatic variance from interactions 66 .
Interestingly, all the pair wise interacting loci for days to flowering and maturity involved at least one of the major QTL except an interaction QTL pair on chromosome 1 and 5. 7 pairs of loci showed epistatic interaction for days to 50% flowering (DFF) and 3 of them also showed interaction effects for days to maturity (DM) and the fourth pair of epistatic loci for DM had only one common QTL with DFF 48,67,68 . Yang et al. 69 characterized epistatic interaction between heading date genes of different chromosomes in rice. Exploration of epistatic interactions among multiple QTLs improved the understanding on genetic network of flowering and the information is useful for pyramiding of QTLs. Subudhi et al. 70 demonstrated the presence of epistatic interactions and found RM214 linked to Ghd7/ qHD7-1 is involved and, in our study, also RM214 was one of 5 pairs of epistatic interacting loci among different flowering related QTLs. Similar to our study, Nemato et al. 71 and Zhang et al. 72 also reported the involvement of novel epistatic loci (which are not detected in composite interval mapping) in regulation of flowering. Yadav et al. 73 reported the existence of epistatic interactions between QTLs that influence the expression of a complex trait such as grain yield under stress and non-stress conditions.
The QTL for plant height at chromosome 4 (RM16742-RM307) showed interactions with all the 6 loci for plant height on other chromosomes. However, these 6 interacting epistatic loci for PH did not involve the major QTL for plant height close to semidwarf gene at chromosome 1 suggesting its independent major effect. Zhao et al. 74 ; Shang et al. 75 reported that digenic epistasis was generally not observed among detected major QTLs; as observed in our study. Thomson et al. 41 showed that qDTH-4-3(t) (dth4.1) located on chromosome 4 and closely linked with marker RM307 controls days to heading. Zhao et al. 76 mapped a semidwarf gene sd-t2 closely linked to RM307. Gaikwad et al. 19 also reported that O. nivara allele at RM307 was responsible for increased plant height in introgression lines. In our study this QTL even with a small additive effect from O. nivara showed a complementary effect on other QTLs for PH indicating an important role of epistatic interaction in phenotypic variation. However, these epistatic loci showed significant interactions only in any two of three environments.
QTLs with main, epistatic and QTL and environment interaction effects were identified previously for salt tolerance 38 , yield traits 62 , grain chalkiness 74 and heading date 77 to dissect the genetic mechanisms of these complex traits. Zhang et al. 78 reported ICIM is an efficient tool to find epistatic QTLs and unbiased by population type or size and the genetic effects from simulation studies using F 2 and doubled haploid (DH) populations. Wang et al. 79 reported that epistatic interaction effects depend on the significance and the direction of effects of participant QTL with a consistent trend towards negative epistatic effects as in the current study and these are of much importance when the QTLs/ gene involved are co introgressed or pyramided for crop improvement. Cumulative effects of QTL were significant for flowering duration and pyramiding QTL by marker-assisted selection would be an appropriate strategy to improve the traits showing cumulative or complementary epistasis 38 . Yield QtL in Bc 2 f 8 vs. BC 2 f 2 . QTL mapping in this study was based on BILs and CSSLs at BC 2 F 8 generation phenotyped for three years in normal irrigated conditions. The current mapping population was derived through single panicle selection from an earlier generation of BC 2 F 2 16 . We compared the previous study where QTLs were detected with CIM and IM using QTL Cartographer in a population of 245 BC 2 F 2 families and genotyped with 75 SSR markers; to understand the changes in QTL detection and QTL effect after 6 generations of selfing. Interestingly the single significant QTL for GY identified in the present study was also reported in the same region with a common flanking marker RM206 in the previous study. It is noteworthy, that QTL for plant height at chromosome 4, number of tillers at chromosome 2 and bulk yield at chromosome 8 were in the same region in both BC 2 F 2 and BC 2 F 8 generations showing a similar region of genetic control for components of biomass production. Population size and marker density directly affect the accuracy of QTL mapping 27,80,81 . However, in this study marker density was higher while population size was lesser than that of BC 2 F 2 QTL mapping. The LOD levels for the same QTL were lower in BC 2 F 8 than in BC 2 F 2 . A QTL for plant height showed highest LOD in both BC 2 F 2 and BC 2 F 8 with common flanking marker RM128. This region was previously reported for plant height and is close to semi dwarf gene sd1 20,32,82 and QTLs for other yield contributing traits 21,83 .
Malathi et al. 20 compared BILs of another mapping population Swarna/ O.nivara IRGC81842 and found only 3 out of 15 QTLs in BC 2 F 8 were reported previously in BC 2 F 2 generation. Similarly, Rangel et al. 10 and Wickneswari et al. 84 identified few common major QTLs across generations. Studies comparing previous QTL mapping with an advanced generation of RILs or back crosses found genetic distance of intervals was reduced resulting in more accurate mapping of QTLs with shorter average marker distances 75,85 . Since many major epistatic or environmental QTLs are context-dependent and do not appear across generations it is prudent to assess worthiness of a QTL for use in MAS. The advanced backcross generations also help in discovering minor QTLs with significant role in phenotypic expression due to a uniform genetic background which may have been masked in an early generation (2020) 10:7766 | https://doi.org/10.1038/s41598-020-64300-0 www.nature.com/scientificreports www.nature.com/scientificreports/ with large number of introgressions 86 . Any QTLs which are stable over generations, background and genomic context are high priority candidates for fine mapping, cloning, gene discovery and marker assisted breeding to improve the associated trait. Ten QTLs qBM1.1, qDFF3.1, qDFF7.1, qDFF12.1, qDM3.1, qDM7.1, qDM12.1 88 . Subudhi et al. 70 reported that the increased recurrent parent genome recovery of individual lines of CSSL helps in improved confidence of QTL by reducing the background noise in the mapping population. The presence of more than one chromosome segment substitution and missing segments is mainly because this population was generated after only two back crosses and no genomic or phenotypic selection during selfing. Incomplete genome coverage of wild accession is mainly attributed to inadvertent or deliberate selection in the early generation which may have removed the lines with donor segments contributing to any unfavourable trait in the population. Presence of genetic factors for cross incompatibility, hybrid sterility, lethality and recombination barriers which are well known in wild introgressions also are the probable reasons for missing segments in substitution lines. Chromosome segments in the CSSL genome composition graph showed uneven distribution of many segments on the two sides of the diagonal segments along with heterozygosity and missing segments and it might reduce mapping accuracy 89 . NK61 with maximum 26.51% alleles from O. nivara was the only CSSL showing significantly higher grain yield and lower flowering days compared to the parent Swarna. Thus, wild introgressions enhance yield and testing under large scale multilocation trials is required.
CSSLs with uniform genetic background harbouring specific chromosome segments from a donor species are ideal for mapping of QTLs 59,77 . Nine out of 20 QTL (45%) detected using CSSLs were observed in at least two environments and also based on mean data across environments. In comparison, only 14 out of 45 QTLs (31%) from the larger BIL population were detected in more than one environment indicating usefulness of CSSLs in identifying stable QTLs compared to primary mapping populations. It was demonstrated that CSSLs are more powerful than primary mapping populations like RIL and F 2 to detect QTLs in multi environment testing using a set of inter specific CSSLs and RILs of cabbage (Brassica rapa) 79 . Most of the major QTLs were consistently detected across mapping populations and methods, however variation was observed in number, loci and phenotypic variance of many QTLs. In CSSLs the masking and interaction effects of multiple genes and loci are comparatively lower than in other mapping populations. This leads to more precise detection of observed variation in QTL mapping along with the genome effect of a background genotype 77,90,91 . Thus, CSSLs has the advantage of increasing the accuracy of QTL mapping while reducing the population size to be phenotyped, by representing the whole genome of the donor parent in a comparatively smaller subset.
Significant quantitative trait loci (QTLs) for different yield traits were located in 10 chromosomal segments above threshold levels and in at least one environment. These included QTLs for BM, BY, DFF, DM, HI, PDP, PH, PW and GY located in all chromosomes except 5, 6 and 9. The QTL locus near RM431 with large phenotypic variance for plant height and identified in 3 environments was previously identified as associated with sd1 gene 20,21,32,82 . This may be a new allele for semi dwarf locus which can be confirmed by sequencing. RM431was reported as closely linked flanking marker for a major and consistent QTL qDTY1.1 in multiple elite genetic backgrounds under both drought stress and non-stress condition 81 and the same marker was used for transfer of the QTLs for yield under drought into different genetic backgrounds 92 .
The QTL locus near RM206 showed large phenotypic variance for BY, HI, PDP and GY with trait increase contributed by O.nivara alleles across two environments. QTL loci RM16-RM55 on chromosome 3 and RM214-RM11 on chromosome 7 for days to flowering and days to maturity showed low environment interaction and were consistently detected across the environments. But interaction between chromosome substitution and environment was not significant in case of regions at chromosome1 (RM431) for BM, at chromosome 2 for PW, at chromosome 3 for DFF and DM, at chromosome 4 and 12 for PH and at chromosome 8 for GY and PDP indicated by their low LOD (QTL×E). For enhancing grain yield, O. nivara allele at RM206 could be explored in breeding for wide adaptation. Wild donor genome segments detected through macro-segment mapping using CSSLs need further fine mapping to understand the genetic mechanisms for phenotypic expression 89 . Near isogenic lines and secondary mapping populations targeting these chromosome substitution segments are essential to assess the use of QTLs in MAS.
Identification of QTL candidate genes for yield and related traits would help in unravelling genotype to phenotype relationships. In genome wide QTL meta analysis reported by Swamy et al. 17 , PPR genes were associated with 15 out of 23 meta QTLs for yield. PPR genes are known to restore fertility in rice 40 . In the present study, cytochrome b561 underlies qGY8.1 for yield. Cytochrome b561, a protein family plays an important role in plant growth, development, and prevention of damage to plants from excess light under drought condition 93 . Recently, the novel MYB-like transcription factor OsMPH1 was identified, where the over expression of OsMPH1 increases plant height and grain yield in rice 72 . SnRK1 (Sucrose-non fermentation1-related protein kinase1) activity plays multiple essential roles in plant growth and in modulating early seedling development after seed germination and in controlling age-dependent leaf senescence 94 . MYB-like transcription factor and SnRK1 underlie the stable QTLs for plant height (qPH1), biomass (qBM1) and total dry matter (qTDM1) on chromosome 1 in our study. Lim et al. 53 identified strong candidate genes underlying qTln2 for tiller number, which encode receptor-like protein kinase precursor, DUF803 domain containing proteins. A major QTL qSSL1b for seedling shoot length was delimited to 80.5 kb region and 16 genes were annotated which includes DUF803 domain containing protein, expressed proteins and candidate region for plant height on chromosome 1 72 . Thus, several candidate genes www.nature.com/scientificreports www.nature.com/scientificreports/ including F box protein and DUF domain and zinc finger proteins underlie the major effect QTLs for plant height, yield and biomass.
Among 31 significant lines identified in the BILs, 23 also belonged to the CSSL set. Among these 22 lines for DFF and 4 lines for DM were positively significant and 11 were negatively significant for these two traits. Two lines NK41 and NK70 which were negatively significant for flowering days were positively significant for plant height. Plant height and days to flowering were associated in Swarna/O. nivara introgression lines 16,95 . In all, 8 lines were positively significant for plant height and 7 of them had a chromosome segment harbouring QTL for plant height at chromosome 1 from O.nivara. Five of the lines positively significant for PH were positively significant for BM also. NK61 showed significantly higher value than Swarna for GY and PDP and significantly lower value for DFF and DM. This line harboured alleles from O.nivara for GY, PW and PDP at chromosome 8 and for GY, BY, HI, PDP at chromosome 11 but Swarna alleles for DFF and is an important line for fine mapping of major QTLs for these traits.
Among the CSSLs, 4 lines were positively significant over Swarna for days to flowering and maturity and harboured Swarna alleles at all the three QTL loci significantly associated with DFF and DM. Conversely, 11 significantly negative lines identified for flowering days harboured a combination of alleles of O.nivara and Swarna except in case of NK79, NK82 and NK88 where alleles at chromosome 3 for DFF and DM were only from O.nivara. Secondary mapping populations from crosses involving a QTL-CSSL with recurrent parent or among two CSSLs carrying different QTL segments will help in developing near isogenic lines (NILs), fine mapping and studying QTL × QTL interaction 39,67,96 . The set of lines which show significant phenotypic and corresponding genotypic difference for the particular chromosome can be used to create a secondary mapping population for fine mapping of the specific trait.

conclusion
A CSSL set of 70 lines of Swarna/ Oryza nivara IRGC 81832 with 94.4% genome coverage of O. nivara was identified. The 23 significantly different CSSLs identified in this study with wide range of variation for 7 yield traits can be further crossed among themselves for trait pyramiding and to study gene interactions in a largely similar genetic background. The chromosomal region between RM206 and RM144 showed a significant major QTL qGY11.1 for GY detected consistently across years and average environment. The same locus was mapped as closely linked to BY, GY and PDP in the derived CSSL population also. This is a high priority candidate region for yield improvement. Similarly ten consistent QTLs qBM1.1, qDFF3.1, qDFF7.1, qDFF12.1, qDM3.1, qDM7.1,  qDM12.1, qPH1.2, qNT2.1 and qTDM11.1 for BM, DFF, DM, PH, NT and TDM were observed over the years and across different mapping techniques such as SMA, ICIM, MET and CSL options. CSSLs with marker defined chromosome segments help in identification of QTLs, underlying genes, and superior lines for any trait of interest and in broadening the genetic base of cultivars.

Data availability
All data generated or analysed during this study are included in this published article and its Supplementary information files.