Genotyping-by-sequencing based QTL mapping for rice grain yield under reproductive stage drought stress tolerance

QTLs for rice grain yield under reproductive stage drought stress (qDTY) identified earlier with low density markers have shown linkage drag and need to be fine mapped before their utilization in breeding programs. In this study, genotyping-by-sequencing (GBS) based high-density linkage map of rice was developed using two BC1F3 mapping populations namely Swarna*2/Dular (3929 SNPs covering 1454.68 cM) and IR11N121*2/Aus196 (1191 SNPs covering 1399.68 cM) with average marker density of 0.37 cM to 1.18 cM respectively. In total, six qDTY QTLs including three consistent effect QTLs were identified in Swarna*2/Dular while eight qDTY QTLs including two consistent effect QTLs were identified in IR11N121*2/Aus 196 mapping population. Comparative analysis revealed four stable and novel QTLs (qDTY2.4, qDTY3.3, qDTY6.3, and qDTY11.2) which explained 8.62 to 14.92% PVE. However, one of the identified stable grain yield QTL qDTY1.1 in both the populations was located nearly at the same physical position of an earlier mapped major qDTY QTL. Further, the effect of the identified qDTY1.1 was validated in a subset of lines derived from five mapping populations confirming robustness of qDTY1.1 across various genetic backgrounds/seasons. The study successfully identified stable grain yield QTLs free from undesirable linkages of tall plant height/early maturity utilizing high density linkage maps.

GBS offers a powerful approach in producing large number of SNPs for genotyping and genetic analysis for implementing in genome wide association studies (GWAS), diversity analysis, genomic selection (GS), marker and gene discovery, genome profiling and high-resolution QTL mapping 8,[26][27][28] . QTL mapping through GBS using high density linkage maps has been studied for various traits like fusarium wilt resistance 29 and sterility mosaic resistance 30 in pigeon pea, drought tolerance in chickpea 31 rust resistance 32 and flag leaf traits 33 in wheat, plant architecture 23 and yield traits 34 in maize and grain weight, grain length in rice 35 .
In the present study, we genotyped two mapping populations segregating for grain yield using a high throughput genotyping strategy called genotyping-by-sequencing approach. The identified high-quality SNP markers were used to construct high density linkage maps in order to find consistent grain yield QTLs under drought stress situation. Further, the identified stable QTL was validated in the subset of 250 lines derived from five mapping populations.  Table 1. The results indicated reduction in mean grain yield of parents and lines under reproductive stage drought (RS) compared with NS conditions in all the 12 experiments clearly indicating that drought stress imposed during 2016 wet season (WS) and 2017 dry season (DS) was effective. Severity of imposed drought during 2016WS was much higher than 2017DS. Drought stress delayed mean DTF in Swarna*2/Dular population by 9 days under MS while the population was accelerated by 6 days in SS compared to their mean NS trials. Mean PH was drastically reduced under both SS and MS conditions. Mean grain yield of donor parent (Dular) was reduced from 4064 kgha −1 (NS) to 485 kgha −1 and 1382 kgha −1 under SS and MS conditions, respectively; while that of the recipient parent (Swarna) was reduced from 4600 kgha −1 under NS, to 0.0 kgha −1 and 561 kgha −1 under SS and MS, respectively. The mean GY of Swarna*2/Dular population was 2894 kgha −1 , 469 kgha −1 and 1382 kgha −1 in NS, SS and MS conditions, respectively with percent mean grain yield reduction varying from 83% in SS to 52% under MS compared to NS condition. The heritability (H) estimate ranged from 0.28 under NS to 0.30 under SS and 0.42 under MS for Swarna*2/Dular population.

Results
Similarly, mean DTF, PH and GY, H, and LSD 0.05 of parents and lines for IR11N121*2/Aus 196 population is also presented in Table 1 Table S4 and S5). There is a significant reduction in the numbers of SNPs from several thousands to few thousands, due to stringent selection criteria used in the present analysis. SNP calls from a nucleotide-based format were converted to parent-based format using ABH-plugin in TASSEL pipeline. After converting the generated SNPs into parent-based format, in total 3929 SNPs (Swarna*2/Dular) and 1191 SNPs (IR11N121*2/Aus 196) with contrasting alleles in parental genotype were retained to be used for construction of linkage maps.
Snp based high density linkage maps. The genome wide polymorphic SNPs were evaluated for expected segregation ratio using Chi-square analyses in both Swarna*2/Dular and IR11N121*2/Aus 196 populations. For the Swarna*2/Dular population, a total 3929 SNPs with contrasting alleles between the parents were mapped on all 12 chromosomes while 1191 polymorphic SNPs were mapped for IR11N121*2/Aus 196 population (Supplementary Tables S6 and S7). In total, 20 SNPs for Swarna*2/Dular and 8 SNPs for IR11N121*2/Aus 196 population had shown segregation distortion and unable to locate on their respective linkage maps. The total length of genetic map computed for Swarna*2/Dular population was 1454.68 cM (varied from 89.60 cM in chromosome 9 to 169.52 cM in chromosome 1). Average genetic distance between two SNPs was 0.37 cM across the chromosomes, reflecting its utility in fine mapping of QTLs/genes. The number of SNPs mapped to each chromosome varied from 245 SNPs on chromosome 10 to 484 SNPs identified on chromosome 1 with an average of 327 SNPs per chromosome (Table 3). Similarly, genetic map of IR11N121*2/Aus 196 population consisted of 1191 SNPs with total map length of 1399.8 cM ( Table 3). The number of SNPs varied from 47 SNPs on chromosome 6 to 171 SNPs on chromosome 1 with map length of 99.53 cM to 168.97 cM. An average of 99 SNPs were mapped on each chromosome for this population. A calculated average genetic distance between two SNPs across the chromosomes was 1.18 cM (ranged from 0.70 cM on chromosome 4 to 2.12 cM on chromosome 6). The developed high-density genetic maps using filtered polymorphic SNPs were integrated with data for grain yield under drought and its associated traits for QTL analysis (Fig. 1).   Five QTLs for days to flowering (DTF) including two stable ones were expressed under both SS and MS conditions while three QTLs for PH were identified in MS conditions only (Table 4 and Fig. 2). Two significant QTLs (qDTF 3.3 and qDTF 6.3 ) for DTF mapped under both SS and MS conditions were consistent and one of them (qDTF 6.3 ) was located at similar genetic locations with the grain yield QTL qDTY 6.3 in Swarna*2/Dular population. One of the DTF QTL (qDTF 1.1 ) detected under SS in marker interval of S1_42655097S1_42885648 was detected nearly 1 Mb away from the GY QTL under drought, qDTY 1.1 identified in Swarna*2/Dular population. Phenotypic variations explained by DTF QTLs were significantly low (4 to 6%) as compared to GY QTLs under drought (4.34 to 13.50%). For plant height (PH), three QTLs (qPH 1.2 , qPH 1.3 and qPH 5.1 ) were detected with PVE ranged from 3 to 27% under MS on chromosomes 1 and 5. No stable QTL for PH under drought had been identified in Swarna*2/Dular population.  2.4 in marker interval of S2_16924409-S2_17554671 had explained PVE ranged from 8.84 to 14.92% while qDTY 11.2 was flanked by markers S11_23405441 and S11_25462601 with PVE ranged from 4.75 to 9.35% under MS and SS conditions respectively. Additive effect for most of the QTLs except qDTY 1.4 , qDTY 3.4 , qDTY 8.1 had negative values suggesting parent IR11N121 contributed alleles for increased GY under drought. The QTL qDTY 1.1 identified under MS in this population (IR11N121*2/Aus 196) was also detected in Swarna*2/Dular population under MS and SS drought conditions. This common QTL qDTY 1.1 in both the populations lying in the QTL-region of qDTY 1.1 , may have a key region of rice genome to explore the underlying variation related to drought. Several QTLs for DTF and PH under SS and MS conditions were observed at chromosomes 1, 2, 3, 5 and 11 under SS or MS conditions (Table 5 and Fig. 3). However only two of the identified QTLs (qDTF 2.2 and qDTF 11.2 ) were detected in both SS and MS conditions across the years. Stable QTLs for DTF namely, qDTF 2.2 and qDTF 11.2 had explained PVE from 3 to 11%. The positive allele for duration increase was contributed by IR11N121 parent for most of the DTF QTLs except qDTF 3.4. One of the DTF QTL qDTF 11.2 was co-located with qDTY 11.2 detected in IR11N121*2/Aus 196 population. Four QTLs for PH including one stable QTL (qPH 11.2 ) were mapped at chromosomes 1, 5 and 11. The plant height QTL qPH 11.1 was also co-located with QTL for DTF (qDTF 11.2 ) and GY (qDTY 11.2 ) under drought in IR11N121*2/Aus 196 population.

QtL analysis.
Co-localization of qDTY QtLs. Many of the previously reported qDTY QTLs including qDTY 1.1 were linked with undesirable alleles of tallness/earliness and fine mapping approach was to be followed before introgression of any such QTLs. In the present study, we have detected four stable QTLs (qDTY 2.4 , qDTY 3.3 , qDTY 6.3 and qDTY 11.2 ) . One QTL (qDTY 1.1 ) was co-located at the same position as previously reported. We found that qDTY 6.3 was linked with the QTL for DTF (qDTF 6.3 ) and qDTY 11.2 with the QTLs for DTF (qDTF 11.2 ) and PH (qPH 11.2 ).   1 . The SSR markers present within the vicinity of the newly mapped QTL region was utilized on 250 lines derived from five mapping populations to validate the qDTY 1.1 . Out of tested six SSRs, two markers (RM12091and RM12146-at 40.21 Mb & 40.7 Mb respectively) were found polymorphic between the parents and were utilized for the validation. SSR markers reported in the study, presented in the vicinity of the fine mapped QTLs (1.2 Mb), free from undesirable linkages shall be utilized in the breeding programs. Lines with and without qDTY 1.1 QTL and their GY data under drought is provided in Table S8. A significant difference was found in comparative mean for grain yield under drought among the lines with having qDTY 1.1 and lines without having qDTY 1.1 QTL (Table S9). The estimation of QTL effect had shown an advantage of 473.43 kgha −1 in the lines with having positive allele for qDTY 1 Supplementary Fig. S1(a,b). The genomic regions underlying these fine mapped QTLs (qDTY 2.4 , qDTY 3.3 and qDTY 6.3 ) including qDTY 1.1 will be an important candidate region for its utilization in marker assisted selection, sequencing and allele mining for drought tolerance in rice.

Discussion
Marker assisted breeding had enormous potential to achieve desirable phenotypic variation in less time through deployment of markers linked to QTLs for desirable trait 37 . However, discovery and development of SSR markers, their scoring across populations is time consuming, labor intensive and costly process 38,39 . Even large QTL regions identified through low density SSR markers may introduce undesirable linkages through MAS and can make www.nature.com/scientificreports www.nature.com/scientificreports/ the introgression line unacceptable for release and cultivation 40 . Such limitations of SSR marker makes it unrealistic in fine mapping of complex traits and for full use in accelerated breeding programme compare to recent available sequencing technologies at much cheaper cost. In some of the recent studies, undesirable linkages were successfully eliminated using next generation markers such as SNPs by identifying the recombinants to break       www.nature.com/scientificreports www.nature.com/scientificreports/ linkage a favorable allele conferring drought tolerance and an unfavorable allele for tall plant height 41 . Currently, next generation sequencing (NGS) technologies have become powerful tools for discovery of millions of SNPs in cost effective manner to develop high density linkage maps, dissect the complex traits and identify key genomic regions underlying the associated traits.
A rapid, robust and cost-effective genotypic platform called GBS is nowadays becoming more feasible in genomics assisted breeding by providing many markers for QTL/gene discovery at much cheaper rate 14,18,42 . In this study, we have developed two high density genetic maps using GBS approach to identify QTLs for grain yield under various drought conditions (SS and MS) using multi-season phenotypic data from two mapping populations. The average inter-marker distance between two SNPs varied from 0.37 cM to 1.18 cM in both populations used in this study which may be one of the most saturated genetic maps developed for QTL identification in rice for drought tolerance. Drought donors (Dular and Aus 196) used in this study for population development belongs to a distinct genetic group so called aus-type 43 and known for valuable genetic resource for abiotic stress tolerance including drought 5,44 . Recently, drought-responsive metabolite associated with tolerance had been identified in two Aus rice varieties (Dular and N22) underlying genes and pathways for drought tolerance in rice 45 . Both donors (Dular and Aus 196) used in this study gave significantly higher yields than the recipient parents (Swarna, IR11N121 and TDK1) under various drought conditions tested over two years, suggesting their usefulness and reliability in utilization as donors.
Drought screening in rain-out shelter during 2016WS was more effective than field screening in 2017 DS and could be attributed to precise control and monitoring of the amount and timing of irrigation. Broad-sense heritability (H) for GY was moderate under SS, MS and NS conditions while it was moderate to high for DTF and PH in both the populations and in both years. Previous studies also reported moderate heritability of grain yield under drought and high heritability for DTF and PH under non-stress and drought conditions 5,46,47 .
A total of fourteen QTLs for GY under SS and MS conditions were detected from the two populations used in this study. Most of the GY QTLs identified in this study were expressed either in SS or MS conditions of drought and only few of them were detected in both SS and MS drought across the years 2016 and 2017. Three such QTLs (qDTY 1.1 , qDTY 3.3 and qDTY 6.3 ) for Swarna*2/Dular and two QTLs (qDTY 2.4 and qDTY 11.2 ) in IR11N121*2/Aus 196 population were found stable across the environments/seasons. The lack of stability of QTL effects across the environments/genetic backgrounds has been one of the most limiting factors in successful deployment of QTLs through MAS breeding for various complex traits including drought 5,48-50 .
One of the stable grain yield QTL qDTY 3.3 identified in our study was located far away from previously mapped qDTY 3.1 at 30-31 Mb physical position in rice genome reported by Venuprasad et al. 3 . One of the GY QTLs qDTY 4.3 identified at 0.7-0.8 Mb in Swarna*2/Dular population under SS was co-located with qDTY 4.1 reported earlier by Swamy et al. 51 , however this QTL was not detected under moderate level of drought stress in the present study. It is interesting to note that we have detected QTL qDTY 1.1 with significant effects under varying severity of drought (SS and MS) in both the populations used in this study. Also, the physical position (S1_40013502-S1_41216734) of this QTL was nearly same of previously identified QTL qDTY 1. 1 1 a most relevant grain yield QTL under drought. It indicates authenticity, reliability of this study and usefulness of the stable QTLs under drought.
Furthermore, we have validated the effect of this QTL qDTY 1.1 in five alternate mapping populations derived from three genetic backgrounds (TDK1, Swarna and IR11N121) developed in this study. Consistency of qDTY 1.1 in multiple genetic backgrounds were also found in many previous studies in both (lowland and upland) the ecosystems of rice 1,2,52 . The positive alleles of qDTY 1.1 was found in 64% 53 and >50% 54 of the drought tolerant lines from a panel of random drought-tolerant lines used. These findings clearly suggest that the qDTY 1.1 on chromosome 1 could be a hot spot for alleles with positive effect on GY under drought and useful candidate region for explore in genomics assisted breeding.
Two of the consistent QTL qDTY 2.4 and qDTY 11.2 in IR11N121*2/Aus 196 population located at 17 Mb and 27 Mb were far from qDTY 2.1 3 and qDTY 11.1 55 . The QTLs for GY under drought with PVE upto 14.92% were detected in this study, which was low, despite using the highly saturated genetic map and multiple season of precise phenotyping. However, use of high-density maps could be helpful in precise detection of QTLs for complex traits by reducing the chances of getting false positive 29 . Similar finding was discussed in earlier reports, where upto 15% PVE was achieved using GBS based QTL mapping for fusarium wilt resistance in pigeon pea and flag leaf traits in bread wheat 29,33 . Most of previously identified major effect drought QTLs except qDTY 12.1 had explained 10 to 30% PVE with yield advantage of 300-500 kg ha −1 under RS drought stress 5,56 and QTL pyramiding looks a feasible strategy here to achieve the desired level of phenotypic variation (yield advantage of 1.0 t ha −1 ) under severe drought stress 5,57 .
In this study, a stable GY QTL stable qDTY 6.3 was co-located with QTLs for DTF qDTF 6.3 in Swarna*2/Dular population while qDTY 11.2 was co-located with qDTF 11.2 and qPH 11.2 in IR11N121*2/Aus 196 population. Many of the previous studies also reported the co-existence of drought grain yield QTLs with QTLs for DTF and PH under stress. For instance, qDTY 1.1 was coinciding with QTLs for DTF and PH 1 , qDTY 3.1 co-located with DTF 3 and qDTY 12.1 with QTLs for DTF, PH and other morphological traits 4 . Three QTLs (qDTY 1.1 , qDTY 2.4 , qDTY 3.3 ) found in this study were critical and desirable loci without any linkage drag and can be introgressed in multiple genetic backgrounds to find their individual and combined effects. Transfer of major grain yield QTLs under drought co-located with PH is not preferable in MAS breeding as positive alleles of tallness could make the introgression line unacceptable for varietal release. Most of the consistent QTLs for grain yield under drought detected in this study (qDTY 1.1 , qDTY 2.4 and qDTY 3.3 ) were free from undesirable linkages with positive alleles of tallness/earliness except qDTY 11.2 linked with QTLs for PH and DTF and qDTY 6.3 had linked with QTLs for DTF. Introgression of GY QTLs unlinked from PH QTLs will led to development of rice varieties tolerant to drought with optimal plant stature and higher yield. The association of QTL for DTF and PH with grain yield QTLs under drought prevailed for two qDTYs found in this study and a suitable breeding strategy should be followed before introgression www.nature.com/scientificreports www.nature.com/scientificreports/ of such QTLs in elite varieties of rice. Recently, marker assisted linkage -elimination strategy was followed to remove the undesirable linkages of PH QTLs from grain yield QTLs such as qDTY 1.1 , qDTY 3.1 and qDTY 12.1 58 .

conclusion
The developed high-density genetic maps in this study could be a strong foundation for fine mapping of grain yield QTLs under drought stress and identified genomics regions could be utilized in breeding programs. We have identified some novel candidate genomic regions from two populations that contained four stable QTLs for grain yield under drought in multiple environments. Two novel qDTY QTLs (qDTY 2.4 and qDTY 3.3 ) along with qDTY 1.1 detected at same position of previously known drought QTL qDTY 1.1 in rice genome, free from any undesirable linkages have suggested the importance and utility of these QTL cluster regions for rice breeders to be utilized in MAS work and candidate gene identification for higher grain yield under drought stress situation. Methods plant materials. Two drought tolerant donors (Dular and Aus196) and three recipient parents (Swarna, IR11N121, and TDK1), were utilized for the development of five BC 1 F 3 bi-parental mapping populations (Swarna*2/Dular, IR11N121*2/Dular, TDK1*2/Dular, IR11N121*2/Aus 196 and TDK1*2/Aus196). Dular, a drought-tolerant donor identified at IRRI is an early maturing, low yielding, blast resistant, traditional cultivar originated from India 43 . Aus 196, an improved drought tolerant cultivar belongs to aus subspecies and originated from Eastern India 43,59 . Recipient parent, Swarna is a long duration, high yielding, mega variety for rainfed and irrigated rice ecosystems of India, Nepal and Bangladesh but highly susceptible to reproductive stage (RS) drought stress 1,3 . IR11N121 is a high yielding rice variety released for lowland rice ecology of South East Asia, susceptible to drought. TDK1 is a high yielding, long duration, glutinous Lao variety, highly susceptible to drought and submergence 60 . Crop management practices in the field were followed as in Vikram et al. 1 . The standard protocol for reproductive stage drought (RS) screening was adopted as described previously by Kumar et al. 5 . In brief, the stress was imposed by draining out water from the field at 50 DAS (days after seeding) in 2016WS and 2017DS, respectively and the cyclic soil moisture deficit stress was maintained till the maturity stage. Water table depth was measured using PVC pipe of 1.1 m length installed at regular places across the field. When water table level in the PVC fell below 1 meter from soil surface and all the susceptible checks started showing severe leaf rolling and dying, a life-saving irrigation through flash flooding was provided. Water was drained out immediately after 24 hrs to initiate next cycle of stress.
Data on days to 50% flowering (DTF) in days, plant height (PH) in cm and grain yield (GY) in kg ha −1 (GYKGPHA) from NS and RS trials were collected and analyzed using PBTools (http://bbi.irri.org/products) for computation of means, LSD and heritability (H). Experiments with grain yield reduction of more than 65% were classified as severe stress (SS), while 31-64% yield reduction was classified as moderate stress (MS) 39 . Linear mixed model was used for analysis of variance considering the lines/genotypes as fixed and the effect of replications and blocks within replications as random.
The model used for augmented-RCBD design was: where, Yijk is measurement recorded in plot, M is the overall mean of plot, Gi is the effect of the i th genotype, Bl is the effect of the l th block and E ilk is the experimental error.
The model used for alpha-lattice design was: where, Yijk is measurement recorded in plot, M is the overall mean of plot, G i is the effect of the i th genotype, Rj is the effect of the jth replicate, Blj is the effect of the l th block within jth replicate, and E ilk is the experimental error.
The QTL effect on grain yield under drought stress were estimated using mixed model analysis (REML) in CROPSTAT version 7.2.3 using the lines with and without QTL. The effects of the QTL and genotypes within the QTL are considered as fixed while the replicate and blocks within replicate effects are considered random. We first computed the genotype means using PB tools by adjusting blocks in augmented RCBD design and in second stage these genotype means were used to analyze the QTL mean comparisons involving the two classes "with" and "without" QTLs using CROPSTAT.
Leaf tissue sampling, DNA extraction and preparation of GBS libraries. The fresh leaf tissue samples from six plants per line were collected at 42 DAS. High throughput automated leaf sampling and genomic DNA extraction from leaf tissue was performed using Brooks PlantTrak Hx rice leaf tissue sampler and LGC Genomics oKtopure systems at IRRI genotyping service laboratory. The assessment of DNA quality and quality was done by running on 1% agarose gel. Using this platform, a high-quality DNA yield ranges from 40-60 ng/µl was achieved for SNP genotyping. GBS libraries were prepared using the protocol adapted from Elshire et al. 14 .
For GBS a type II restriction endonuclease (ApeKI) was used for DNA digestion, and the digested DNAs were ligated to the adapter, and then 96-plex library was constructed as per GBS protocol 14 . GBS was carried out using HiSeq2500 sequencing platform with Macrogen Inc. (Korea).
SNP identification and genotyping. The sequence reads generated in FASTQ file were processed and analyzed for SNP identification using TASSELGBS analysis pipeline 61 . Pipeline allows searching of all raw sequencing reads with perfectly matched barcode and expected remnant bp of restriction cut site and reads were further sorted, de-multiplexed and trimmed to create a unique, 64-bp long sequences called tags. These good quality tag sequences were aligned with the reference genome using Burrows-Wheeler Alignment (BWA) software 62 , while reads carrying "N" within first 64 bases had removed from further analysis. The perfectly matched and aligned sequences was processed further for SNP calling and genotyping through GBS analysis pipeline. SNPs were further filtered for minor allele frequency (MAF) below 0.05 and for the percentage of missing data (≤90%) per SNP using TASSEL GBS analysis pipeline using default parameters. Filtered data file having final set of SNPs in nucleotide-based hap map format was converted to an ABH-based format using ABH-plugin in TASSEL pipeline where "A" represents donor allele, "B" represents recipient allele and "H" represents heterozygous allele. Finally, imputed SNPs of lines were filtered against parental alleles and only polymorphic SNPs were retained to be used in construction of linkage map.
Linkage map construction and QtL mapping. The genotypic data for 350 lines from each mapping population with filtered SNP markers was used for linkage map construction using the linkage mapping function implemented in the QTL IciMapping software v4.1 63 . The grouping and ordering of 3929 and 1191 polymorphic SNP markers for both the populations were carried out using regression mapping algorithm RECORD (REcombination Counting and ORDering) based on recombination events between adjacent markers. Further, Rippling was done for fine-tuning of the ordered markers on their respective chromosomes by sum of adjacent recombination fractions (SARF) algorithm with a default window size. QTL mapping for GY QTLs and other traits under drought was performed using composite interval mapping (CIM) functions implemented in the QTL IciMapping software v4.1 64 . The threshold LOD value to declare a significant.
QTL was computed by a permutation test involving 1000 runs at a significance level of p = 0.05. After completion of permutation test, window size of 10 cM and walk speed of 1 cM was set to start analysis of composite interval mapping.

Data Availability
The data sets supporting the results of this article are included within the article.