Deciphering the genetic basis of root morphology, nutrient uptake, yield, and yield-related traits in rice under dry direct-seeded cultivation systems

In the face of global water scarcity, a successful transition of rice cultivation from puddled to dry direct-seeded rice (DDSR) is a future need. A genome-wide association study was performed on a complex mapping population for 39 traits: 9 seedling-establishment traits, 14 root and nutrient-uptake traits, 5 plant morphological traits, 4 lodging resistance traits, and 7 yield and yield-contributing traits. A total of 10 significant marker-trait associations (MTAs) were found along with 25 QTLs associated with 25 traits. The percent phenotypic variance explained by SNPs ranged from 8% to 84%. Grain yield was found to be significantly and positively correlated with seedling-establishment traits, root morphological traits, nutrient uptake-related traits, and grain yield-contributing traits. The genomic colocation of different root morphological traits, nutrient uptake-related traits, and grain-yield-contributing traits further supports the role of root morphological traits in improving nutrient uptake and grain yield under DDSR. The QTLs/candidate genes underlying the significant MTAs were identified. The identified promising progenies carrying these QTLs may serve as potential donors to be exploited in genomics-assisted breeding programs for improving grain yield and adaptability under DDSR.

conditions from flooded to aerobic and vice-versa is also a major constraint to water-nutrient uptake under DDSR. The plant root system architecture is reported to be dependent on water-nutrient availability, uptake, and signaling 4,[6][7][8] . A clear picture of an ideal root system required for higher water-nutrient uptake under DDSR may offer a real possibility of grain yield improvement under DDSR. Non-uniform and fast emergence together with significant seedling death just after seeding and very low weed competitiveness are important factors that determine reduced yield under DDSR. Plant morphological traits such as leaf area index; leaf size; leaf insertion angle, shape, and thickness 9 contributing to photosynthesis 10 ; flowering time [11][12][13] ; growth rate; lodging resistance 4,14 ; root traits that enhance nutrient uptake, especially nitrogen (N), phosphorus (P), and iron (Fe) 4 ; tillering ability; panicle length; and spikelet fertility are important traits that determine crop productivity under DDSR.
To date, no detailed study has been conducted on the genetic basis associated with plant morphological traits and root system architecture associated with nutrient uptake, yield, and yield-contributing traits 15 under DDSR. There is thus an urgent need to decipher an appropriate plant and root system architecture for improving nutrient uptake, grain yield, and adaptability under DDSR.
Emerging genomic technologies may allow the systematic investigation on a genome-wide scale to harness genetic diversity 16,17 and associations leading to rice yield improvement under DDSR. High-throughput, timely, cost-effective, and easy-to-use genotyping technologies offer modern tools and techniques to the plant breeding community to be used for genomics-assisted breeding 18 . With the advent of whole-genome sequencing, high-density SNP arrays enable the detection of significant marker-trait associations, quantitative trait loci (QTLs), and candidate genes in genome-wide association studies (GWAS) 19,20 . High-density SNP arrays covering the whole genome are necessary to monitor the recombination breakpoints in the diverse panel 20,21 . GWAS can be used to identify the specific functional genetic variants such as alleles and QTLs that are linked to the phenotypic differences in a particular trait of interest to facilitate the detection of traits and selection of genotypes possessing traits of interest 22 . Association studies involving landraces, diverse germplasm, multi-parent breeding populations, NAM (nested association mapping) populations [23][24][25][26] , and breeding populations 27 have identified significant marker-trait associations and fine-mapped candidate genes 28 that will help rice breeders to undertake a systematic breeding program. GWAS have been reported as a major success in different crop species, including maize 29,30 , soybean 31 , foxtail millet 32 , rice 24 , chickpea 33 , and pigeonpea 34,35 . The proper understanding of the traits, donors, and markers underlying the genes needed for grain yield improvement and root traits enhancing nutrient uptake under DDSR may lead to breakthroughs in developing climate-resilient DDSR-adapted rice varieties.
In the present study, GWAS were conducted on 39 traits, including seedling establishment, root morphological, plant morphological, grain yield, and yield-contributing traits, in a complex mapping population. The aim of the study is to identify significant associations and QTLs/putative candidate genes to be used directly in marker-assisted breeding programs to develop high-yielding and nutrient-efficient rice varieties for DDSR conditions. Results phenotypic variations of targeted traits. More variability was observed in the dry season (DS) than in the wet season (WS) under DDSR. No significant variability was observed for relative growth rate from 22 to 29 days after seeding (DAS) and from 15 to 29 DAS, maximum root length at any sampling point, flag-leaf length and width, leaf color chart (LCC), and total biomass at flowering in 2015WS (Table 1). Significant variability was observed for all the observed traits in 2016DS except for root length at 29 DAS and vegetative vigor ( Table 1). The mean grain yield of the complex mapping population ranged from 3020 kg ha −1 in 2015WS to 4263 kg ha −1 in 2016DS. Negative skewness was observed for days to 50% flowering, grain yield (2016DS), SPAD (2016DS), and panicle length. Positive skewness was observed for days to first and full emergence, flagleaf angle, flag-leaf area, nodal roots, root hair length, root hair density, total number of productive tillers, 1000-grain weight, plant height, and culm diameter. The parents UPLRi 7 and IR 74371-70-1-1 showed a better performance in terms of grain yield and number of nodal roots in both the wet and dry season ( Table 1). The parent IRRI 123 had better root hair length and density in 2016DS. The parents UPLRi 7 and IRRI 148 had more flag-leaf area than the other parents. In terms of SPAD and LCC, the parents IR 74371-70-1-1 and UPLRi 7 showed a better performance. Across seasons, UPLRi 7 and Vandana showed better lodging resistance than the other parents.
Population structure and linkage disequilibrium (LD). Population structure and LD decay were estimated using the genetic structure of the 300 progenies from the complex mapping population and 6 parents using 10,588 SNPs distributed across all 12 rice chromosomes. The GAPIT screen plot showed that the first two principal components (PCs) were informative, and then a gradual decrease occurred (Fig. 2a) until the tenth PC. The first PC explained 38.3% and the second PC explained 13.2% of the total variance (Fig. 2b). All the progenies were divided into two distinct major groups and further subdivided into subgroups (Fig. 2c). The kinship heatmap showed that most of the kinship value was concentrated from the *0.0 to 0.5 level (0.0: no relatedness, 0.5: weak relatedness), indicating a weak relatedness in the association panel (Fig. 2c). Pairwise linkage disequilibrium Significant MTAs for molecular breeding for DDSR. A total of 971,790 SNPs were called from the genotyping by sequencing (GBS) data. A stringent selection criterion to filter out the SNPs was used, including missing percentage, MAF (minor allele frequency), and percent heterozygosity, to select the panel of robust SNPs. After filtering the SNPs with call rate 85% and MAF of >5%, a total of 10,588 polymorphic SNPs were retained to conduct GWAS studies on 39 traits measured in 2015WS and 2016DS. The significant MTAs that surpass the Bonferroni threshold and that were consistent over seasons and combined seasons analysis were reported in the present study. The marker-trait associations identified on the same chromosome within a region of 2.5 Mb were assigned as putative QTLs.
A total of 10 significant trait-SNP associations and 25 QTLs associated with 25 traits of interest were identified. The QTL span ranged from 141 to 2345 kb. The false discovery rate (FDR) ranged from 0.0227 to 1.21 × 10 −6 , 0.192 to 2.44 × 10 −6 , and 0.190 to 3.66 × 10 −7 in 2015WS, 2016DS, and combined seasons analysis, respectively (Table 2). Detailed information on SNPs, SNP position, QTL span, p-values, R 2 , and FDR is shown in Table 2. The Manhattan plots depicting -log (p-values) and Q-Q plots (quantile quantile) showing the expected vs observed p-values for the MTAs are presented in Fig. 3 (seedling-establishment traits and plant morphological traits), Fig. 4 (root traits and nutrient uptake), and Fig. 5 (agronomic traits, grain yield, and lodging-resistance traits).
A total of eight QTLs on chromosomes 2, 4, 5, and 6 were reported to be associated with root morphological traits and nutrient uptake (N, P, Fe). Putative QTLs were found for nodal root number on chromosome 4 spanning the 760 kb region, for root hair length spanning the 0.07 kb region on chromosome 5, and for root hair density spanning the 1032 kb region on chromosome 2. A 3.4 Mb region on chromosome 2 was associated with QTLs for root hair density and nitrogen uptake. The QTLs for root hair length and phosphorus uptake were 4.6 Mb apart from each other on chromosome 5. On the long arm of chromosome 3, a 3.4 Mb region was identified to be associated with flag-leaf angle, days to 50% flowering, and stem diameter. QTLs were found for lodging resistance traits such as stem diameter spanning the 187 Mb region on chromosome 3, for bending strength and bending moment spanning the 0.038 kb region on chromosome 3, and for culm diameter spanning 1.3 Mb on chromosome 2. A 9 Mb region on the long arm of chromosome 5 was reported to be associated with traits such as root hair length, phosphorus uptake, and flag-leaf area. The 6.9 Mb region on the long arm of chromosome 1 Table 1. Mean data of the co mplex mapping population and six parents for different seedling-establishment, root, plant morphological, yield, and yield-contributing traits under dry direct-seeded conditions. RGR1: relative growth rate from 15    www.nature.com/scientificreports www.nature.com/scientificreports/ was reported to be associated with plant height and relative growth rate. The grain yield QTL harbors a 2.3 Mb region on chromosome 11. The long arm of chromosome 11 was reported to be associated with grain yield, grain yield-contributing traits (panicle length, 1000-grain weight), and seedling-establishment traits (days to first and full emergence) under DDSR. The QTLs for days to first emergence and grain yield were separated from each other by a 0.9 Mb region on chromosome 11. In addition to the grain yield QTL on chromosome 11, another grain yield QTL on chromosome 4 in 2016DS and combined seasons analysis was observed. Over the seasons, the p-value of the SNPs that passed the Bonferroni threshold ranged from 0.00032 to 4.48 × 10 −8 for nodal roots at 15 DAS, 0.000208 to 2.00 × 10 −10 for nodal roots at 22 DAS, 0.05 to 4.96 × 10 −11 for grain yield, and 2.933 × 10 −7 to 3.207 × 10 −7 for P uptake. The SNPs S3_18257114 and S3_18257152 on chromosome 3 were observed to be significantly associated with bending strength and bending moment both and S4_31903323 on chromosome 4 with nodal root number at different growth points. The associated SNPs with a very small interval (≤14 kb) can be considered as the same locus.
The allelic effects of the significantly associated markers for root, nutrient uptake, and grain yield traits are shown in Fig. 6. The detailed description of the identified candidate genes underlying the significant marker-trait www.nature.com/scientificreports www.nature.com/scientificreports/ association/putative QTLs is presented in Supplementary Table S1. The identified candidate genes in the present study that were in the vicinity of the previously reported SNPs/putative QTLs warrant further investigation and validation.
performance of selected promising progenies. Eight improved breeding progenies possessing better root morphology for higher nutrient uptake and higher yield under DDSR were identified ( Table 3). The percentage improvement in the promising breeding progenies ranged from 17% to 52% for nitrogen uptake over the best performing parent, Vandana. The percentage improvement for phosphorus and iron uptake in the selected progenies was 17% to 69% and 14%, respectively, over the parent Kali Aus ( Table 3). The best performing progenies for zinc uptake showed percentage improvement ranging from 13% to 97% over the best parent identified for zinc uptake (IR 74371-70-1-1) under DDSR. The percentage improvement of number of nodal roots at 29 DAS in selected promising progenies varied from 4% to 92%, 7% to 14%, and 5% to 63% in 2015WS, 2016DS, and combined seasons, respectively, over parent Vandana (Table 3). The percentage grain yield advantage of the promising progenies ranged from 0.6% to 45% in 2015WS, 0.6% to 6% in 2016DS, and 0.1% to 18% in combined seasons analysis over the best yielding parent (UPLRi 7) across seasons (Table 3).

Discussion
phenotypic variability and correlation studies. Higher nutrient uptake, weed competitiveness, good crop stand, yield improvement, and adaptability are some of the prerequisites to developing resource-efficient high-yielding rice varieties for dry direct-seeded conditions 36 . The significant differences for different traits in the progenies indicated the existence of variability in the population that can be exploited for marker-assisted breeding. The fast and uniform seedling establishment represented by more nodal roots, longer root length, and more root hair length and density would be important traits for DDSR. The similarity in LCC and SPAD values across the progenies indicated the LCC as a cheap alternative to SPAD, which is expensive for farmers to buy to decide on the time and dose of fertilizer application under DDSR.
The significant positive correlation of grain yield with yield-contributing traits and seedling-establishment traits indicated the role of seedling-and reproductive-stage traits in improving grain yield. The significant positive correlation of nutrient uptake with root traits and plant morphological traits such as flag-leaf length, width, and area and chlorophyll content (which was indirectly estimated with LCC and SPAD) indicated the role of these traits in enhancing photosynthetic ability. Further, the preferential distribution of photosynthetic assimilates from leaves to vigorous roots may allow rice to uptake nutrient efficiently under DDSR.
The direction of the correlation (whether positive or negative) across seasons was the same but the significance levels were different. Traits such as DTFirst emergence, VVG, NR, RHD, RHL, LCC, CD, and PHT showed the effect of seasonal variability. Early vigor and uniform emergence are complex traits that are influenced by different environmental and soil factors. The seasonal variation also indicated the role of phenotypic plasticity behavior of different traits in increasing yield across different seasons. A better understanding of the relationship between root traits, plant morphological traits, and grain yield across different seasons, soil profiles, and developmental stages with water-nutrient uptake is likely to provide a novel dimension in selecting traits for developing the   Marker-trait associations. To take advantage of existing variations 37 , with a high-throughput phenotyping-genotyping platform, advanced statistical strategies 25,26,38 , marker-trait associations, and/or GS (genomic selection) 27 , association mapping was performed directly on the complex mapping population in the present study.
The significant phenotypic variability present in the complex mapping population and the high genome coverage provided a strong base for the genome-wide association study on the complex mapping population. The allelic distribution of each progeny in the population presented an approximately 25% contribution of Vandana, 25% of NSICRc 192, and 12.5% each of the other four parents (IR 74371-70-1-1, Kali Aus, IRRI 123, and UPLRi 7). Among the 25 identified QTLs, 16 were located in the proximity of earlier identified candidate genes (http:// qtaro.abr.affrc.go.jp), and, among the 10 identified significant marker-trait associations, two were located in the proximity of earlier identified candidate genes (http://qtaro.abr.affrc.go.jp) (Supplementary Table S1). A significant association signal on chromosome 1 was detected for relative growth rate at different time points in 2016DS. The reported region in the present study was located in the promixity of lax 39 and moc2 40 candidate genes controlling shoot branching and tiller growth, respectively, indicating the robustness of the associations identified in the present study.
For the number of nodal roots at 15 (NR1) and 22 DAS (NR2) on chromosome 4, ~290 kb upstream, a positional match regulating leaf and adventitious root development in rice was identified as the nal1 (nal5) gene, NARROW LEAF1 41   www.nature.com/scientificreports www.nature.com/scientificreports/ thickness 42,43 and rooting depth 42 and in the region of rfw4a, a QTL was identified for root fresh weight 44 , indicating the role of the region (30-33 Mb) on the long arm of chromosome 4 in root development. The QTL for root hair length on chromosome 5 was observed to be colocated with the earlier identified gene OsIPT3, a gene tightly linked with root growth in rice 45 , and with qRHD 5.1 , a QTL for root hair density 4 . The root hair density QTL on chromosome 2 was found to be colocated with the earlier reported genes EL5 46 , OsCK11 47 , and OsNAR2.1 48 , which had shown a major role in root development and nitrate uptake, respectively. This QTL is also reported to be colocated with the QTL for nitrogen uptake in the present study. This may explain the role of root hair density in the uptake of the nitrate form of nitrogen under DDSR. The SNP S2_5964826 on chromosome 2 associated with SPAD was identified to be colocated with SPK2 (SYG2), a gene that inhibits phosphate starvation responses in rice 49 .
The earlier reported gene bc3 (Rice BRITTLE CULM 3) that encodes OsDRP2B, a classical dynamin that is essential for secondary cell wall synthesis 50 , is located in the region of the QTL identified for culm diameter on chromosome 3 (30.98-32.31 Mb) in the present study. The QTL for stem diameter on chromosome 3 was found to be colocated with the earlier reported QTL/gene SCM3 5 . In the present study, the QTL for plant height on chromosome 1 colocalized with the rice semi-dwarfing gene 51  www.nature.com/scientificreports www.nature.com/scientificreports/ in the present study was located near the earlier reported genes OsPT9 55 (involved in phosphorus uptake in rice), OsGLK1 56 (induced chloroplast development in rice), and nyc3 57 (stay green phenotype during leaf senescence in rice). The genomic region from 20 to 24 Mb on chromosome 2 was reported to be associated with genes DP2 58 (dense panicle), RCN2 59 (panicle morphology), OsAPX8 60 (photosynthesis and adaptation under photo-oxidative stress), and OsNAR2.1 48 (nitrate uptake transporter). This signified that the QTL for nitrogen uptake underlying the genetic region is associated with source and sink capacities in rice.
The phosphorus uptake QTL on chromosome 5 was located in the region associated with the OsRPK1 gene that regulates root development 61 , OsCCaMK gene for microbial symbiosis 62 , OsHAP3B and OsTPS1 genes for chloroplast biogenesis 63 , and OsSTN8 gene for protein phosphorylation of photosystem II 64 . The colocation of identified QTLs or significant SNPs with earlier reported genes/QTLs for root development, photosynthetic traits, nutrient transporter, nutrient uptake, and stress-responsive genes further confirms the contribution of these traits/QTLs in enhancing yield and adaptability under DDSR. The identification of novel markers/SNPs/ candidate genes in the QTL region helps to harness their benefits in genomics-assisted selection.
GWAS: traits and progenies. The 10 identified significant MTAs, 25 QTLs associated with 25 traits of interest, and the 8 selected promising breeding progenies with favorable allele combinations can be deployed after validation for improving grain yield and adaptatbility using molecular breeding. The parental alleles providing improved grain yield under DDSR were contributed by parents Kali Aus, UPLRi 7, IR 74371-70-1-1, and IRRI 123. The parental alleles reported to be reponsible for the increase in nodal root number were contributed by Kali Aus and for nutrient uptake by the parents (including Kali Aus, IRRI 148, IRRI 123, and Vandana). In the identification of suitable traits needed for DDSR, promising progenies possessing those traits and the SNPs associated www.nature.com/scientificreports www.nature.com/scientificreports/ with colocated traits would significantly promote genomics-assisted breeding in developing DDSR varieties. The introgression of a superior haplotype improving grain yield and water-nutrient uptake under DDSR using haplotype-based breeding may open new avenues for designing next-generation DDSR-adapted rice varieties.

Conclusions
A complex mapping population having wide phenotypic variability coupled with large genome coverage was used to identify significant marker-phenotype associations to be exploited directly in breeding programs. A total of 10 significant trait-SNP associations and 25 QTLs associated with 25 traits of interest were identified under DDSR. The colocation of the SNPs related to root morphologcal traits, nutrient uptake, and grain yield under DDSR was further confirmed by the direct significant positive correlation of root traits with nutrient uptake. The identification of suitable traits and promising progenies possessing colocated QTLs may serve as novel breeding material for developing DDSR varieties. The identification and introgression of a superior haplotype exploiting haplotype-based breeding would be the next step to improve the grain yield and adaptability of rice under DDSR.

Materials and Methods
Development of a complex mapping population. The plant material used in the present study comprised a complex mapping population derived from six diverse parents: IR 74371-70-1-1, UPLRi 7, IRRI 123, Kali Aus, Vandana, and IRRI 148. The detailed characterstics of the parental lines used to develop the complex mapping population are presented in Supplementary Table S2. The scheme for the development of the complex mapping population and details on the phenotyping and genotyping screening are provided in Supplementary Fig. S1.  14°10′11.81″N, 121°15′22″E). Seeding was done at ~2 cm depth in 3.0 m rows with two rows per progeny in dry plots. Out of the total 3 m row, the central 2 m [20 cm (hill to hill) × 20 cm (row to row)] was used for the measurment of grain yield and yield-related traits and the remaining 1.0 m [10 cm (hill to hill) × 20 cm (row to row)] was used for the destructive sampling for the root traits at different growth points (15,22,and 29 DAS). A total of 450 F 2 -derived F 3 progenies and 300 F 3 -derived F 4 progenies along with six parents and four checks were evaluated in 2015WS and 2016DS, respectively, under DDSR conditions. In 2015WS, the field evaluations were carried out in an augmented design with 450 F 3 progenies, six parents, and four checks (100 progenies per block and a total of 5 blocks); whereas, in 2016DS, the evaluations involved 300 lines, six parents, and four checks in an alpha-lattice design (5 progenies per block and a total of 62 blocks and 2 replications). soil characteristics. The sand, silt, and clay content of the experimental soil was 21%, 36%, and 39%, respectively, with pH 7.7. The Ca (calcium), Mg (magnesium), K (potassium), P (phosphorus), and KjN (Kjeldahl nitrogen) content was 16.0 meq (milliequivalents) 100 g −1 , 7.8 meq 100 g −1 , 1.09 meq 100 g −1 , 20 mg kg −1 , and 0.10%, www.nature.com/scientificreports www.nature.com/scientificreports/ respectively. The data on average rainfall, temperature, humidity, solar radiation, and pressure across 2015WS and 2016DS are presented in Supplementary Fig. S2.
Details on field managment. The term "dry direct-seeded" used in the present study refers to the cultivation method in which the rice crop is seeded directly in the dry field as wheat and corn (maize) without any nursery-bed raising in non-puddled, laser-leveled fields.
To ensure better pulverization, uniform germination, and, most importantly, weed control under DDSR, the field was prepared 1 month prior to sowing. Land preparation involved plowing using a terra disc plow followed by three rotovations at weekly intervals. The field was laser-leveled and allowed for the first flush of weeds to emerge and grow for 3 weeks, before control with the application of glyphosate (1.0 kg ai ha −1 ; ai: active ingredients). A combination of pre-emergence (oxadiazon at 0.5 kg ai ha −1 at 6 DAS), early post-emergence (bispyribac sodium at 0.03 kg ai ha −1 (9.7%, Nominee) at 11 and 22 DAS), and spot weeding at 35 and 55 DAS was used to control weeds.
During the seedling-establishment stage, sprinkler irrigation was used for 1 month and thereafter surface irrigation was applied once or twice a week depending on the weather and crop water status. The irrigated field was allowed to drain naturally through normal seepage and percolation.
The total fertilizer rate was 100-35-30 N-P-K kg ha −1 and 120-40-40 N-P-K kg ha −1 in 2015WS and 2016DS, respectively. A detailed description of the experiments conducted and traits measured in the present study is presented in Supplementary Table S3. phenotypic evaluation of the complex mapping population. Measurement of seedling establishment traits. Each plot was monitored regularly for first and full emergence from 3 to 12 DAS. The term "days to first emergence" used in the present study refers to the emergence of one or two seedlings from the soil and "days to full emergence" refers to the emergence of 90-95% of the seedlings per plot.
Vegetative vigor in terms of relative growth rate (RGR) was recorded from three randomly selected seedlings per plot. The seedlings were uprooted using a trowel at 15, 22, and 29 DAS. The roots and shoots were separated and the shoots were oven-dried at 60 °C for 72 hours with each sampling. The oven-dried shoots were weighed (DSW) immediately after being taken out from the oven to calculate RGR. RGR was calculated at different timepoints (i.e., from 15 to 22 DAS (RGR1), 22 to 29 DAS (RGR2), and 15 to 29 DAS (RGR3)) using the following equation 4 Table 3. Significant variation (t-test) for selected promising progenies for nutrient uptake, root traits, and grain yield under dry direct-seeded conditions. + Parent, DS: dry season, WS: wet season, Comb seas: combined seasons, **significance at <0.01 level, NR3: number of nodal roots at 29 DAS, RHL: root hair length, RHD: root hair density, GY: grain yield (kg ha −1 ), N uptake: nitrogen uptake (kg ha −1 ), P uptake: phosphorus uptake (kg ha −1 ), Fe uptake: iron uptake (kg ha −1 ), Zn uptake: zinc uptake (kg ha −1 ).
Measurement of root traits. Randomly, three plants were removed from the soil at each sampling to measure root traits. The soil sections surrounding the plant containing roots were removed by digging a hole (40 cm deep). The roots and shoots were then separated by cutting from the topsoil line. The separated root samples at 15, 22, and 29 DAS were washed gently and properly with running tap water on a sieve and root fresh weight (RFW in g) was measured quickly. The number of nodal roots (NR) [NR1 (number of nodal roots at 15 DAS), NR2 (number of nodal roots at 22 DAS), NR3 (number of nodal roots at 29 DAS)] was counted manually. Maximum root length (RL) [RL1 (maximum root length at 15 DAS), RL2 (maximum root length at 22 DAS), RL3 (maximum root length at 29 DAS)] was measured using the centimeter scale. The roots were then oven-dried at 60 °C for 72 hours for root dry weight (RDW) measurements. Root hair length (RHL) and root hair density (RHD) were recorded on six root samples following the procedure as described by Sandhu et al. 4 .
At booting stage, the color of the central part of the middle lobe of three randomly chosen flag leaves was compared with the color strip (LCC: leaf color chart developed by IRRI); to validate the greenness of fully expanded flag leaves, a Minolta 502 chlorophyll meter (SPAD) was used. To estimate the correlation with nitrogen concentration and N uptake, total aboveground fresh biomass was measured immediately after uprooting the same three plants. Then, the plant samples from each progeny were cut into small pieces, mixed properly, and a total of 200 g of plant sample were taken, oven-dried at 70 °C for 3 days, and then weighed immediately for dry biomass. A total of 60 progenies (30 high-yielding and 30 low-yielding and parents) were selected in 2015WS and analyzed for nutrient uptake (N, P, Fe, and Zn) in the IRRI Analytical Service Laboratory. The nitrogen estimation was carried out using the Kjeldahl digestion method. The plant material was digested with a mixture of K 2 SO 4 (potassium sulfate) and concentrated H 2 SO 4 (sulfuric acid) in the presence of a catalyst (fine-powdered selenium). The NH 3 (ammonia) produced was estimated by the colorimetric method using a Technicon autoanalyzer. The other nutrients phosphorus (P), iron (Fe), and zinc (Zn) were estimated using the nitric/perchloric acid digestion method using the AIM 500 Digestion Block System.
Lodging resistance traits. The lodging resistance traits stem diameter (SD; mm), culm diamter (CD; mm), bending strength (BS; kg cm), and bending moment (BM, kg cm 2 ) were measured on three randomly chosen plants. SD and CD were measured using a Vernier caliper. A prostrate tester (Daiki Rika Kogyou Co., Tokyo) 66 was used to measure stem strength. Twenty days after flowering, the main stem of each randomly chosen plant was cut from the ground level and force was given to the second and third internode of the stem at the middle (5 cm) of the internode and the bending ability of the internode was measured by bending the stem to the point at which the stem broke. The displacement in mm was measured as stem strength at the breaking point. The scale displacement (mm) on the prostrate tester due to bending ability was recorded.
Plant morphological, grain yield, and yield-related traits. Five plants were randomly chosen to record data on plant morphological, grain yield, and yield-related traits. The plant morphological traits cover flag-leaf length (FLL), flag-leaf width (FLW), flag-leaf area (FLA), flag-leaf angle (FLAngle), and plant height (PHT). The grain yield and yield-related traits include days to 50% flowering (DTF), number of productive tillers per plant (NPT), panicle length (PL), filled grains per panicle (FG/P), 1000-grain weight (1000 GW), biomass at flowering (BMF), and straw yield (SY). DTF was recorded when ~50% of the plants in a plot showed panicle exsertion. PHT, PL, FLL, and FLW were measured on a centimeter scale (cm). PHT was measured from the ground level to the tip of the highest panicle at maturity stage. NPT was counted manually. Panicle length was measured from the node of the panicle to the tip of the panicle. FLL was measured from the base to the tip of the flag leaf whereas FLW was measured at the middle part of the flag leaf. Flag-leaf area (FLA; cm 2 ) was calculated according to Palamiswamy and Gomez 67 . FLAngle was measured using a protector keeping the stem as the horizontal base. The plants were harvested at hard dough stage and the grains and straw were separated. The harvested grains were threshed, cleaned, and oven-dried for 3 days at 50 °C to 14% mositure content and then weighed to record grain yield. The filled grains per panicle were counted and 1000-grain weight was recorded in g. For straw yield, the harvested straw was oven-dried for 3 days at 70 °C and weighed. Biomass at flowering (BMF) was measured (in g) after harvesting and drying the aboveground biomass at 70 °C in an oven till there was no change in the dry weight of the plants.
DNA isolation, genotyping by sequencing, and SNP calling. The samples for genotyping comprised six parents, four checks, and 300 progenies from the complex mapping population in 2016DS. The fresh leaf tissue samples from six plants per progeny were collected at 40 DAS. Automated leaf sampling with high-throughput DNA extraction using the Brooks' PlantTrak Hx rice leaf tissue sampler and LGC Genomics' oKtopure systems, respectively, was performed to increase throughput and efficiency coupled with precision. ApeKI, a type II restriction endonuclease, was used for DNA digestion. The digested DNAs were ligated to the adapter and 96-plex libraries were constructed as per the genotyping by sequencing (GBS) protocol. GBS was carried out using the HiSeq. 2000 100PE platform of Macrogen Inc. (Korea). From GBS data, a total of 971,790 SNPs were called from the GBS genotyping. A stringent selection criterion to filter out the SNPs was used, including missing percentage, MAF (minor allele frequency), and percent heterozygosity, to select the panel of robust SNPs. SNPs with a call rate of 85% and MAF of >5% (10,588 SNPs) were filtered using Tassel 5 68 . The 10,588 SNPs were used to estimate the genetic relationship, building of a neighbor joining (NJ) tree, and to carry out GWAS. In order to detect and correct for population structure, a PCA was carried out using 10,588 SNP markers.
www.nature.com/scientificreports www.nature.com/scientificreports/ Phenotypic data analysis. Analysis of variance (ANOVA) was computed using PBTools V 1.4.0 (http://bbi.irri. org/products). The trial mean and trait mean for each season were computed using mixed model analysis considering replications, and block within replication as a random effect and progenies as a fixed effect. Broad-sense heritability was computed using the equation where H represents the broad-sense heritability, σ 2 G represents the genetic variance, σ 2 E the error variance, and r the number of replications. Correlation among traits was calculated using function rcorr() [in Hmisc package] in R. v.1.1.423. The best parent for the promising traits was selected and the percent improvement in selected promising breeding progenies over the best parent for each promising trait was calculated as % improvement = (trait value in selected promising progeny − trait value in best parent/trait value in best parent) *100.
Population structure, linkage disequilibrium, and association analysis. A total of 10,558 SNPs, 300 progenies, and 6 parents were used to access the population structure using STRUCTURE V. 2.3.4 software. A series of models with K value ranging from 1 to 10 was run with a burn-in period to 50,000 and running length to 10,000 to give consistent results over runs. A total of 10 independent runs for each K were performed to verify the consistency and accuracy of the results. The most probable number of clusters was determined by the K value with maximum likelihood over the runs 69 . The number of principal components (PC) explaining genetic variation was estimated using R/GAPIT and iteratively added to the fixed part of the model, ranging from PC1 to PC10.
The distance matrix was estimated using Tassel 5.0 68 and FigTree v1.4.2 70 was used to visualize an unweighted neighbor joining tree. Tassel 5.0 was used to calculate the pairwise r 2 values. These pairwise r 2 values were averaged over the SNPs grouped based on stepwise increasing base-pair distance (0-10, 10-50, 50-500, 500-1000, 1000-1500, 1500-2000, 2000-2500, 2500-3000, 3000-3500, 3500-4000, 4000-4500, and 4500-5000). The LD (linkage disequilibrium) decay in the complex mapping population was estimated from the average LD over the calculated grouped base-pair distance in the complex mapping population. The genome-wide significant associations of genomic regions and traits of interest were identified using CMLM (compressed mixed linear model)/ P3D (population parameters previously defined) in GAPIT (Genome Association and Prediction Integrated Tool) executed by R package 71 on 2015WS, 2016DS, and the combined data over seasons.
Genetic similarity between individuals was estimated using the relatedness matrix and the random effects were estimated from identical by state (IBS) values. The population structure (Q value) and kinship matrix (K) estimated from the genotyping data were used to improve the statistical power of genome-wide association mapping. The allelic effect of the significant markers associated with the trait of interest in the present study was determined by representing the phenotypic data for the alleles as boxplots and the significant allelic variation for the associated trait of interest was determined performing the Kruskal-Wallis test in "R".