Marker-assisted introgression of three dominant blast resistance genes into an aromatic rice cultivar Mushk Budji

Modern high-yielding rice varieties have replaced most traditional cultivars in the recent past. Mushk Budji is one such short-grained landrace, known for its aroma and exquisite quality; however, it is highly susceptible to blast disease, which has led to a considerable decline in its area. Mushk Budji was crossed to a triple-gene donor line, DHMAS 70Q 164-1b, and followed through marker-assisted foreground and background selection in the first and second backcross generations, which helped to incorporate the blast resistance genes Pi54, Pi1 and Pita. Marker-assisted background selection was carried out using 78 SSR and STS markers, which helped to reduce linkage drag around the genes Pi54, Pi1 and Pita to 2.74, 4.60 and 2.03 Mb, respectively. The three-gene lines in BC2F2:3 were genotyped using a 50 K SNP chip and showed more than 92% genome similarity to the recurrent parent (RP). A 2-D gel assay detected 171 differentially expressed protein spots among a set of backcross-derived lines, of which 38 spots with a match score of 4 were used to calculate proteome recovery. MALDI-TOF analysis detected four significant proteins linked to quality and disease resistance. The improved lines expressed resistance to blast under artificial and natural field conditions.


Results
Marker-assisted backcrossing. A marker-assisted backcross breeding (MABB) strategy was employed to transfer the major blast resistance genes Pi54, Pi1 and Pita from a non-aromatic three-gene donor, DHMAS 70Q 164-1b, which was crossed as the male parent to the popular aromatic landrace Mushk Budji. Hybridity in the F1s was confirmed, and a single F1 was backcrossed to the recurrent parent (RP) Mushk Budji to yield 17 BC1F1 plants. Foreground selection was exercised on the BC1F1 plants to identify heterozygous individuals, using the gene-based InDel marker Pi54 MAS for Pi54, the linked SSR marker RM224 for Pi1, and the gene-based coupling-repulsion marker pair YL155/87 and YL155/83 for Pita. Selected BC1F1 plants were advanced to BC2F1 and subsequently followed through selfing generations to identify plants homozygous at the target loci (Supplementary Figs S1 and S2).
A polymorphism survey was carried out between the RP Mushk Budji and the three-gene donor line DHMAS 70Q 164-1b using 278 genome-wide markers, of which 96 were polymorphic between the parents. The polymorphic markers, uniformly distributed across the genome, were used for background analysis. A total of 55 and 47 markers were screened for carrier chromosomes 11 and 12, of which 11 and 14, respectively, were polymorphic between the parents. A starting three-gene BC1F1 plant, SKUA-485-27 (Pi54+Pi1+Pita), was screened using the polymorphic markers and recorded a recurrent parent genome (RPG) recovery of 60.87% at the polymorphic loci. The plant carried heterozygous segments at markers RM190 (linked to the Wx allele), PKN7 and PKN10, which are linked to rice grain quality.
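As a rough illustration of how an RPG recovery percentage can be derived from marker scores, the sketch below applies a commonly used estimator, RPG (%) = (R + H/2) × 100 / P, where R is the number of polymorphic markers homozygous for the RP allele, H the number of heterozygous markers and P the total number of polymorphic markers scored. Whether this is exactly the estimator used here is an assumption, and the function name and genotype calls are hypothetical rather than taken from the study.

```python
from collections import Counter


def rpg_recovery(calls):
    """Return RPG recovery (%) from per-marker genotype calls.

    calls: iterable of "A" (homozygous RP allele), "H" (heterozygous)
    or "B" (homozygous donor allele). Any other symbol (e.g. "-" for a
    missing call) is skipped.
    Assumed estimator: (R + 0.5 * H) * 100 / P.
    """
    counts = Counter(c for c in calls if c in ("A", "H", "B"))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no scorable markers")
    return (counts["A"] + 0.5 * counts["H"]) * 100.0 / total


# Hypothetical plant: 12 markers homozygous for the RP allele, 8 heterozygous.
example_calls = ["A"] * 12 + ["H"] * 8
print(rpg_recovery(example_calls))  # (12 + 4) * 100 / 20 = 80.0
```

Heterozygous markers count as half recovered under this estimator, which is why an early backcross plant can show moderate recovery even when few loci are already fixed for the RP allele.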
The BC2F1 plants were analyzed for a recombination breakpoint between Pi54 and Pi1 on chromosome 11L. A single plant, SKUA-485-27-6, carrying Pi54+Pi1 showed recovery at RM254. The three-gene plant SKUA-485-27-7 had the RP allele at RM26746 (16. Nine plants showing recovery at markers flanking the target genes were subjected to background analysis using 78 genome-wide markers. Of these, eight polymorphic markers were located on each of chromosomes 1, 6, 11 and 12. Chromosome 4 carried seven polymorphic markers. Six markers each were used for background selection on chromosomes 2, 5, 7 and 10, and five markers each on chromosomes 3, 8 and 9. The RPG recovery ranged from 63.04% for SKUA-485-
The triple heterozygotes (SKUA-485-27-7 and SKUA-485-27-4) and the two- and single-gene BC2F1 plants were selfed through BC2F2 to BC2F2:3 in order to recover homozygous two- and three-gene pyramided lines (PLs) and lines with individual genes. In addition, the selected three-gene BC1F1 plant SKUA-485-27 was advanced to BC1F3:4. The plants in selfing generations were screened using markers that were heterozygous in the previous generation in order to select for the RP allele. The scheme and the number of plants screened and selected at each generation are given in Supplementary Fig. S1. Evaluation was carried out in early backcross generations on an individual-plant basis for agronomic traits, cooking quality and target blast resistance loci using foreground markers, and is detailed in Supplementary Tables S1-S8. Eight backcross-derived lines and the two parental lines were also genotyped using the 50 K SNP chip 'OsSNPnks', which carries 50,051 SNPs spread across the twelve rice chromosomes with an average marker density of 131 SNPs per Mb. The highest number of markers, 10,016, was located on chromosome 1, followed by chromosome 3 with 7,044 markers.
Both 'genome similarity' and 'genome recovery' with respect to the RP were worked out for the gene pyramids (Table 1) Table 2; Supplementary Table S8). It had a score of 78 which is highly significant and also
Grain and cooking quality of the blast R-gene pyramided lines. The donor and RP genotypes differed by at least 1.3 mm in milled rice length (KLBC) and 2.8 mm in cooked kernel length (KLAC). Stringent phenotypic selection for these traits was employed along with background selection for grain type, KLAC, KLBC, KER and aroma, followed by foreground selection for the target genes in BC1Fn and BC2Fn. A three-gene BC1

Discussion
Table 2. MALDI-TOF analysis of proteins showing differential expression between three-gene pyramids and donor lines. *Significance level is based on threshold score of 48 as per SWISSPROT database.
Although conventional breeding assumes RPG recovery at the rate of 1 − (1/2)^(n+1) for every 'n' generations of backcrossing 27, the marker-assisted backcross breeding (MABB) approach 28 helped us to pyramid three dominant blast resistance genes (Pi54, Pi1 and Pita) along with rapid RPG recovery as early as the BC2F3 generation. Pyramiding multiple genes in a single variety based on phenotyping alone would be nearly impossible because of the difficulty of estimating the resistance response of the component genes individually 29. Further, detailed pathotyping assays at various backcross generations can be avoided with the use of molecular markers 30. Marker-assisted foreground selection was carried out using the markers Pi54 MAS, RM224 and the marker pair YL155/87 and YL155/83 to select for the genes Pi54, Pi1 and Pita, respectively. Of these, Pi54 MAS, a gene-based InDel marker, amplifies a 216 bp fragment specific to the Pi54 resistance allele and a 359 bp fragment in susceptible plants 31. RM224 is a linked SSR marker located at 0 cM from Pi1 32. The gene Pita is located in the centromeric region of chromosome 12 and was selected using the gene-based coupling-repulsion marker pair YL155/87 and YL155/83, which target the transcription start site of the gene 33. The use of gene-based markers allows transfer of the gene of interest with high precision and accuracy. Theoretically, when linked markers are used for foreground selection and the flanking markers lie 10 cM from the gene of interest, the probability of losing the gene is 0.024 after a single generation, rising to 0.1182 after five generations.
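The two probabilities quoted for a linked foreground marker are roughly consistent with treating each generation as an independent chance p of losing the gene between the flanking markers, so that the cumulative chance of loss after n generations is 1 − (1 − p)^n. The sketch below works this out; the independence assumption and the function name are ours and are not taken from the cited theory.

```python
def cumulative_loss(p_per_generation, generations):
    """Probability that the target gene is lost at least once over
    `generations` rounds of marker-only selection, assuming the same
    independent loss probability in every generation (an assumption
    of this sketch)."""
    if not 0.0 <= p_per_generation <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 1.0 - (1.0 - p_per_generation) ** generations


# Single-generation loss probability of 0.024, as quoted in the text.
for n in range(1, 6):
    print(n, round(cumulative_loss(0.024, n), 4))
```

With p = 0.024 this gives about 0.114 after five generations, close to the 0.1182 quoted; the small difference presumably reflects rounding of the single-generation value, since a value near 0.0248 gives about 0.118.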
Maintaining lines using the same marker at a considerable distance from the gene would therefore not be practicable in the long term after release of the product. A single three-gene (Pi54+Pi1+Pita) donor, DHMAS 70Q 164-1b, originating from the Vietnamese indica rice cultivar Tetep 34, was used as the male parent in a cross with Mushk Budji. The genes selected here are known to be effective across various locations inside the target region 6,13. The BC1F1 plants showed a wide range in spikelet sterility (3.8 to 92.0%), which may be attributed to genetic divergence between the donor and recurrent parents 35,36. Selection for recombinants with better spikelet fertility (SF) was achieved directly through selection for fertile panicles and indirectly through selection against heading date, as has been supported elsewhere 37. The mean grain yield per plant in SKUA-485-27 increased from 20.5 g in BC1F1 to 22.0 g in BC2F1. Marker-assisted background selection accelerates RPG recovery through selection of RP alleles at a large number of loci; as per Frisch et al. 38, it leads to a selection response 'R' that is the product of the selection intensity (i), the standard deviation of RPG (σ) and the correlation between the proportion of RP alleles at marker loci and the proportion of RP alleles across the whole genome (r). Therefore, one way to account for the proportion of the genome not covered by marker loci would be to select for easily observable (phenotypic) traits, which would enhance 'r' in this relationship. Phenotypic selection was also performed in segregating backcrosses to recover lines with grain and kernel traits highly similar to Mushk Budji.
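Written out, the selection response described in the text is a simple product. The symbols follow the definitions given in the text; the notation itself is our assumption about how the cited result is usually presented.

```latex
R = i \,\sigma\, r
```

For fixed i and σ, any increase in r raises R proportionally.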
The PLs in BC1F3:4 (SKUA-485-27-47-4-1 and SKUA-485-27-77-6-2) and BC2F2:3 (SKUA-485-27-4-38-4, SKUA-485-27-4-40-6, SKUA-485-27-3-7-5), confirmed as homozygous at the target genes, were analyzed for background genome recovery, which ranged from 82.69 to 91.03%. Preferential selection of individuals based on RPG recovery on carrier chromosomes was avoided, as it would presumably have resulted in lower overall RPG content 39. Stringent selection on carrier chromosomes, if performed, might have reduced the selection pressure on non-carrier chromosomes, which form the major part of the genome. Also, a two-stage selection process is reported to be superior to three- and four-stage selection processes in breeding programs carried up to BC1 and BC2 40.
The final set of PLs carried the RP allele at the BADH2 locus on chromosome 8, which correlated well with the phenotype. These lines also carried the RP allele at the Wx locus on chromosome 6 and 8 and were phenotypically similar to the RP Mushk Budji. The lines SKUA-485-27-86, SKUA-485-27-70 and SKUA-485-27-47 showed the Mushk Budji allele at markers PNK7 and PNK10, which are linked to the BADH1 locus for aroma on chromosome 4. The lines with maximum recovery and better plant and grain type were confirmed to have recovered at the SSR locus RM6666 on chromosome 1, which is linked to a QTL for cooking quality 41. Chromosomes 1, 3 and 4 recorded better recovery of the recurrent parent genome. Chromosome 1 carries genes for plant height and for traits related to grain dimensions. The genes for KER and LBR are located on chromosomes 3 and 4. The quick recovery in BC1F2 and subsequent selfing generations towards a shorter grain length and KLBC similar to Mushk Budji may be explained by the dominant nature of the loss-of-function QTLs responsible for shorter grain length, such as GS-3, qGL-3, etc., over those responsible for the fine, long-grained phenotype caused by recessive alleles at these loci 42. Marker-assisted background selection with 15 SSR markers on chromosome 11 helped to minimize the linkage drag around Pi54 to within 2.7 Mb, between markers RM3917 and RM26963. On the same chromosome, Pi1 was delimited to a 1.2 Mb region. Similarly, the segment carrying Pita was narrowed down to 4.1 Mb. In conventional breeding programs, the expected drag around a target locus is 79 and 63 cM in BC1F1 and BC2F1, respectively 43. To our advantage, the background markers helped to reduce the drag to 4 to 16 cM in the BC2F1 PLs. The results are in line with expectations from the simulation experiments of Frisch et al. 44, which concluded that 71 and 86% of the genome would be recovered in BC1F1 and BC2F1 for a population of 20 individuals. Singh et al. 20 incorporated Pi54 and Piz5 in the PRR78 background and achieved RPG of 89.17 and 87.88%, respectively. Gopalakrishnan et al. 45 used foreground and background selection to develop improved PB 1 with xa13 and Xa21, with minimum linkage drag above 1.3 cM.
Specifically, a few SNP loci reported to be linked to traits of agronomic importance, rice quality and tolerance to cold stress were scored for their recovery in BC2F4 lines and carried RP alleles (Table 4). Overall, the background analysis of PLs using the 50 K SNP array revealed lower RPG recovery than the estimate based on SSR markers. Clearly, the recombination events were resolved more efficiently by the high-density SNPs. Overestimation of background genome recovery using SSR markers has been reported previously 9. Further, RPG similarity (%) rather than RPG recovery (%) may be the better measure for evaluating background genome content in backcross-derived lines. The genomic proximity between donor and RP based on RPG similarity (%) may help to decide, more realistically, whether further backcross generations are necessary. In our case, the RP and donor share 56.2% genome similarity, so two backcrosses were sufficient to yield more than 90% similarity between the lines and the RP.
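A back-of-the-envelope check of this reasoning: if only the portion of the genome at which donor and RP differ can reduce similarity, and the donor contribution is expected to be halved with each backcross (the 1 − (1/2)^(n+1) expectation without selection), then the expected similarity to the RP after n backcrosses is 1 − (1 − s0)(1/2)^(n+1), where s0 is the parental similarity. This is a simplifying assumption made for illustration, not the calculation used in the study, and the function name is hypothetical.

```python
def expected_similarity(parental_similarity, n_backcrosses):
    """Expected fraction of the genome identical to the RP after
    n backcrosses without selection (illustrative assumption: only the
    dissimilar fraction of the donor genome can lower similarity)."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    donor_fraction = 0.5 ** (n_backcrosses + 1)
    return 1.0 - (1.0 - parental_similarity) * donor_fraction


# Parental similarity of 56.2%, as reported in the text.
for n in (1, 2, 3):
    print(f"BC{n}: {100 * expected_similarity(0.562, n):.2f}%")
```

Under these assumptions the expectation is roughly 89% after one backcross and about 94.5% after two, which agrees with the observation that two backcrosses were enough to exceed 90% similarity.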
Besides marker-based estimation of RPG recovery, this study reports the evaluation of PLs for recovery on the basis of protein profile. 2-D gel electrophoresis detected 171 protein spots that were differentially expressed among selected backcross-derived lines and parents. The 38 clearly differentiated spots with a match score of 4 were analyzed and revealed an average similarity of 97, 79 and 68% for SKUA-485-27-4-38-4, SKUA-485-27-4-40-6 and the donor DHMAS 70Q 164-1b, respectively, against the RP Mushk Budji. Therefore, in general, the PLs showed high recovery not only at SNP and SSR markers but also in the matrix of proteins in the 2-D profile. Also, 16 spots present in the RP and the lines were altogether absent in the donor DHMAS 70Q 164-1b. MALDI-TOF analysis for peptide fingerprinting was performed on 36 protein spots, of which two peptides had a significant match in the SWISSPROT database. These included Alpha-amylase OS = Oryza sativa subsp. japonica GN = RASI PE = 1 SV = 2, which had a significant score of 78 with a protein sequence coverage of 22%. Its expression was almost two-fold higher in Mushk Budji and the backcross-derived lines than in the donor DHMAS 70Q 164-1b. The theoretical pI and molecular mass were 8.66 and 21.689 kDa, respectively. Alpha-amylase is responsible for the breakdown of starch in rice. Alpha-amylase expression has been shown to differ across varieties 46. The 19 kDa globulin protein OS = Oryza sativa subsp. japonica GN = Os05g0499100 PE = 1 SV = 2 had a nominal mass of 21.497 kDa and a pI of 7.48. A sequence coverage of 21% was found for this 186-amino-acid protein. Globulin is an important storage protein concentrated in the bran layer of brown rice, whereas glutelin is mainly confined to the endosperm. The protein was upregulated in Mushk Budji and the derived backcross lines compared with DHMAS 70Q 164-1b. A 67.85 kDa protein, S-(+)-linalool synthase, chloroplastic OS = Oryza sativa subsp. japonica GN = LIS PE = 2 SV = 1, with a pI of 5.69, was upregulated in the RP and derived lines and downregulated in the donor, with 10% coverage of the 595-amino-acid chain. The protein is reported to be involved in monoterpene biosynthesis. Its major product is S-(+)-linalool, which is induced by jasmonate in response to Xanthomonas oryzae 47. However, its role in rice blast resistance has not been elucidated.
GGE biplot analysis 48 combines the genotype (G) main effect and the genotype × environment (GE) component to assess genotype performance and helps to designate the mega-environments supporting such genotypes. It is a robust technique for marking genotypes in a which-won-where fashion and helped us to understand the performance of the derived lines within and outside the target regions.
The final set of selected PLs was screened under artificial conditions and showed resistance against the four M. oryzae isolates, whereas the RP check succumbed to the disease. The virulence analysis carried out initially using the LTH-background differential set confirmed that isolates SKUA-Mo-3 and SKUA-Mo-9 possess Avr-Pita. Mo-ei-MBI-2 and Mo-nwi-kash-32 were procured from Dr. U. D. Singh, Indian Agricultural Research Institute, New Delhi, India, and could detect the genes Pi54, Pita and Pi54, Pi1, respectively. The isolate Mo-nwi-kash-32 was collected from the RP Mushk Budji and could accurately confirm the resistance carried by the lines.
Further, these lines were tested at five blast hot-spot locations within the Kashmir valley. All the lines expressed a resistance response to prevalent isolates, while the RP Mushk Budji planted as a check succumbed to the disease at each of these locations. This substantiates our choice of genes in constituting the PLs and their suitability for release as cultivars in the traditional Mushk Budji growing areas. Marker-assisted selection and high-throughput validation of RPG recovery led to the development of PLs of Mushk Budji carrying genes for blast resistance. The lines developed here are set for release as improved versions of Mushk Budji for commercial cultivation in farmers' fields. This is a rare report on the improvement of an aromatic rice landrace for resistance to a disease like blast. Although short- and medium-grained traditional rice varieties comprise 4.4% of the total rice cultivars grown by farmers across the globe, and the global scented rice market is growing at 12% per annum, a holistic approach needs to be adopted for the conservation, promotion and genetic enhancement of such valuable rice cultivars.

Materials and Methods
Plant materials. Mushk Budji, a popular short-grained aromatic rice landrace of Jammu and Kashmir, India, which is highly susceptible to blast disease (Supplementary Fig. S11), was used as the RP and crossed as the female parent to the blast resistance donor parent (DP) DHMAS 70Q 164-1b. The donor parent is a doubled haploid line obtained from the cross HPU741/Tetep and harbors three blast resistance genes, Pi54, Pi1 and Pita.
Marker-assisted backcross breeding. From the Mushk Budji/DHMAS 70Q 164-1b cross, a single F1 plant with confirmed hybridity was backcrossed with Mushk Budji to generate BC1F1 plants. Subsequently, the selected BC1F1 plant was crossed to the RP to generate BC2F1 and advanced further through selfing, following a marker-assisted backcross breeding (MABB) scheme (Supplementary Fig. S1). The scheme comprised a four-step selection strategy in each backcross generation: (1) foreground selection for the target genes using gene-based/linked DNA markers; (2) recombinant selection using DNA markers flanking the respective target genes; (3) background selection using polymorphic DNA markers; and (4) stringent phenotypic selection for agro-morphological traits, grain dimension, cooking quality and aroma to accelerate recurrent parent phenome (RPP) recovery. Marker-assisted foreground selection for Pita and Pi54 was carried out using the coupling-repulsion pair of gene-based markers YL155/YL87//YL155/87 33 and Pi54MAS 31, respectively. Selection for Pi1 was carried out using the gene-linked marker RM224 32 (Supplementary Table S14). Marker-assisted background selection was conducted using genome-wide SSR markers. The RP Mushk Budji and DP DHMAS 70Q 164-1b were surveyed for polymorphism with 278 genome-wide SSR/genic markers. The marker information was retrieved from http://www.gramene.org and published literature 49. The extent of RPG recovery was calculated as per Khanna et al. 9. The primers were custom synthesized by Sigma Technologies Inc., USA. The RPG recovery was graphically represented using Graphical Geno Typing (GGT 2.0) software 50.
EDTA (2 mM) and 1 tablet of complete EDTA free protease inhibitor (carrying Pancreas extract, Thermolysin, Chymotrypsin, Trypsin, Papain with 0.02, 0.0005, 0.002, 0.02, 0.33 mg/ml, respectively). Final washing was done with pure acetone. The pellet was kept at −80 °C overnight.
The acetone-free pellet was dissolved in rehydration buffer (8 M urea, 20 mM DTT, 2% w/v CHAPS). Protein was quantified according to Bradford 53, and 250 µg of protein containing IPG buffer and DTT was loaded in the rehydration tray. Immobilized pH gradient (IPG) strips (pH 3-10, 13 cm, GE Healthcare, UK) were rehydrated overnight. IEF was carried out using an Ettan IPGPhor3 (GE Healthcare, UK) with a standardized programme. Second-dimension electrophoresis was carried out using a Hoefer SE600 Ruby electrophoresis unit (GE Healthcare, UK) at 40 mA/gel for two and a half hours at 25 °C. The gels were stained overnight with CBB solution as described 39 and then destained with 0.5 M NaCl solution 54. All 2-D CBB-stained gels were scanned with a GE Image Scanner III at 300 dpi and analyzed using Image Master 2-D Platinum V.7.0 software (GE Healthcare, UK). Protein spots were excised from preparative polyacrylamide gels that had been stained with Coomassie Brilliant Blue G-250, and each gel fragment was immersed in purified water and sonicated twice for 10 min each at 50 W and 20 kHz. Subsequently, the gel pieces were destained with 50 mM ammonium bicarbonate and an equivalent volume of 50% acetonitrile, followed by sequential washing with 25 mM ammonium bicarbonate, 50% acetonitrile and 100% acetonitrile. After lyophilization, the gel fragments were rehydrated in digestion buffer containing 25 mM NH 4
GGE biplot analysis was carried out to work out genotype relationships across different environments with respect to grain yield performance. A "which-won-where" view of the GGE biplot was used to characterize the genotypes for their agronomic performance and to demarcate the distinct mega-environments that suit them. For our purpose, the biplot was interpreted using a polygon, spread across the coordinates, with perpendicular lines, called equality lines, drawn onto its sides. These lines divide the biplot into sectors that hold particular environments. Genotypes located on the vertices of the polygon were regarded as the best performers within the sector 55. The average environment coordination (AEC) view of the GGE biplot, which compares genotypes on the basis of mean performance and stability across environments, was drawn to rank the genotypes on the AEC abscissa. The GGE biplot analysis was conducted using Genstat v.12.
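To make the "G + GE" idea concrete: a GGE analysis works on yields centered by environment mean, which removes the environment main effect and leaves genotype plus genotype × environment variation; the biplot itself is then drawn from a singular value decomposition of that centered matrix. The sketch below shows only the centering step and a direct per-environment winner lookup in plain Python. It does not reproduce the biplot or the Genstat procedure used in the study, and the genotype names, environment names and yields are hypothetical.

```python
def environment_centered(yields):
    """yields: {genotype: {environment: value}}.

    Returns the same structure with each environment's mean subtracted,
    which removes the environment main effect and keeps G + GE.
    """
    envs = sorted({e for row in yields.values() for e in row})
    means = {e: sum(row[e] for row in yields.values()) / len(yields) for e in envs}
    return {g: {e: row[e] - means[e] for e in envs} for g, row in yields.items()}


def winners(yields):
    """Best-performing genotype in each environment (a table lookup,
    not a biplot)."""
    centered = environment_centered(yields)
    envs = next(iter(centered.values())).keys()
    return {e: max(centered, key=lambda g: centered[g][e]) for e in envs}


# Hypothetical grain yields (g per plant); not data from the study.
demo = {
    "line_A": {"env_1": 22.0, "env_2": 18.0},
    "line_B": {"env_1": 20.0, "env_2": 21.0},
    "check": {"env_1": 18.0, "env_2": 15.0},
}
print(winners(demo))  # {'env_1': 'line_A', 'env_2': 'line_B'}
```

Environments that share the same winner would fall in the same sector of a which-won-where biplot, which is the sense in which mega-environments are demarcated.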

Evaluation for blast disease resistance under controlled conditions. The isolates collected from Mushk Budji were used to inoculate the PLs and screen them for resistance to rice blast. The genes Pi54, Pi1 and Pita were tested using isolates Mo-nwi-kash-32 and Mo-ei-MBI-2, kindly provided by the Division of Plant Pathology, IARI, New Delhi. The seedlings were inoculated at the three-leaf stage by spraying 50 ml of spore suspension (~5 × 10^4 conidia ml^−1) and incubated in growth chambers for 24 h in the dark at 26-27 °C. The seedlings were sprayed with water every 6-7 h for 4-5 days to maintain humidity and facilitate penetration by the fungus and disease establishment. The disease was scored 7 days after inoculation using the scale given by Mackill and Bonman 21.
Evaluation for blast disease resistance under field conditions. The pyramids were also screened in the Uniform Blast Nursery at five hot-spot locations in Jammu and Kashmir, viz., Sagam, Pombay, Khudwani, Shalimar and Budgam. A 50-cm row of each PL, along with the RP and DP controls, was planted in a raised-bed nursery with a row-to-row spacing of 10 cm. To ensure uniform spread of disease, a row of the susceptible check was planted after every five rows as well as on the borders. Disease evaluation was done on the 0-9 Standard Evaluation Scale of IRRI 56. Lines with a score of 0-3 were considered resistant, those with a score of 4-5 moderately resistant, those with a score of 6-7 moderately susceptible and those with a score of 8-9 susceptible.
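The score classes above translate directly into a lookup. The following is a minimal sketch using exactly the thresholds stated in the text; the function name is hypothetical.

```python
def blast_reaction(score):
    """Map a 0-9 field blast score to the reaction classes used in the text."""
    if not isinstance(score, int) or not 0 <= score <= 9:
        raise ValueError("score must be an integer from 0 to 9")
    if score <= 3:
        return "resistant"
    if score <= 5:
        return "moderately resistant"
    if score <= 7:
        return "moderately susceptible"
    return "susceptible"


# One score from each class, plus both ends of the scale.
for s in (0, 3, 4, 7, 9):
    print(s, blast_reaction(s))
```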