Rapid identification of a stripe rust resistant gene in a space-induced wheat mutant using specific locus amplified fragment (SLAF) sequencing

Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most devastating diseases of wheat. Resistant cultivars are the preferred strategy to control the disease. Space-induced wheat mutant R39 has adult-plant resistance (APR) to Pst. Genetic analysis indicated that a single recessive gene, designated YrR39, was responsible for the APR of R39 to Pst. Bulked segregant analysis (BSA) combined with a SLAF sequencing (SLAF-seq) strategy was used to fine-map YrR39 to a 17.39 Mb segment on chromosome 4B. The region was confirmed by analysis with simple sequence repeat (SSR) markers. A total of 126 genes were annotated in the region and 21 genes with annotations associated with disease response were selected for further qRT-PCR analysis. The candidate gene Traes_4BS_C868349E1 (annotated as an F-box/LRR-repeat protein) was up-regulated after 12, 24, 48, and 96 hours post inoculation with Pst, suggesting it is likely involved in the resistance. The current study demonstrated that BSA combined with SLAF-seq for SNP discovery is an efficient approach for mapping and identifying candidate functional gene.

Traditional map based cloning of genes of agronomic importance requiring genotyping of large numbers of individuals in segregating populations, which is time-consuming and labor-intensive, is rapidly being replaced by newer molecular methods such as specific locus amplified fragment sequencing (SLAF-seq), a strategy combining bulked segregant analysis (BSA) and next-generation sequencing (NGS) 11 . For example, Liang et al. 12 quickly mapped an aphid resistance QTL to a 0.31 Mb region in cucumber chromosome 5 by using SLAF-seq + BSA. Jia et al. 13 used a SLAF-seq strategy to develop SNP makers to fine-map the barley ari-e gene from a previously estimated 10 Mb region to a 0.58 Mb interval. Similarly, genes controlling agronomic traits, such as fruit length and flesh thickness in cucumber 14,15 , dwarfness in Lagerstroemia 16 , pericarp color in gourd 17 , flowering-time in orchardgrass 18 , and low-tiller in rice 19 have been genetically mapped using SLAF-seq + BSA. However, until now, SLAF-seq has not been used to map genes for Pst resistance gene in wheat.
In September 2005, common wheat cultivar Zhengmai 9023 had went through a space mutation project. Du et al. 20 then performed the system field evaluation of the mutant progenies and found improved adult-plant Pst resistance in several lines. Lv et al. 21 further evaluated the advance generation of resistant lines with Pst races CYR32, CYR33 and V26/G22 and found these lines to be stable for APR. In this study, an eighth generation of Zhengmai 9023 space mutation progeny, named R39, was resistance evaluated by seven Pst races at seedling and adult-plant stage, which confirmed us the adult-resistance of R39 to these races. Genetic analysis further revealed that R39 carries a recessive gene for APR to Pst race CYR33, and then used BSA and NGS-based SLAF-seq to locate the gene to a 17.39 Mb region on chromosome 4B.

Results
Wheat mutant R39 has APR to stripe rust. Seven Pst races, Su11-4, Su11-11, CYR29, CYR30, CYR31, CYR32 and CYR33, were used to evaluate the resistance performance of R39 at seedling and adult-plant stage. R39 was susceptible to all seven Pst races (IT 3 + to 4) at the seedling stage, but was resistant to seven Pst races (IT 0; to 1 + ) at the adult-plant stage (Fig. 1a, Supplementary Table S1). However, Zhengmai 9023 and Mingxian 169 were susceptible to seven Pst races (IT 3 + to 4) at both seedling and adult-plant stage. Besides, TSW of R39 was significant higher than Zhengmai 9023 and Mingxian 169 in field evaluation, suggesting a higher grain production and better seed quality of R39 under Pst disease ( Fig. 1b-d).
A recessive gene controls the APR of R39 to CYR33. As shown in Table 1, R39 was resistant (IT 0;), whereas Mingxian 169 was susceptible (IT 4), and F 1 plants were also susceptible (IT 3-4). The F 2 population segregated 1 resistant: 3 susceptible and the BC 1 population segregated 1: 1, respectively, suggesting that resistance was conferred by a single recessive gene ( Table 1). The clear differences in response of the parents, F1 individuals, and contrasting resistant and susceptible plants in F 2 and BC 1 are shown in Fig. 1. The resistance gene was tentatively designated as YrR39. A striking feature of R39 and resistant F 2 and BC 1 plants was the leaf spotting (Fig. 1a). This condition was also recessive and completely associated with the resistance phenotype in segregating populations and segregated as a recessive gene in non-inoculated populations ( Table 1).
Analysis of SLAF-seq data and SLAF tags. Following SLAF library construction and high-throughput sequencing, more than 301 million 100 bp reads were obtained. Most of the bases (90.05%) were of high quality, with quality scores exceeding 30 ( Table 2). The SLAF numbers were 420,531 for R39 and 433,264 for Mingxian 169. The average depths of the SLAF markers were 27.02-fold in R39, 34.05-fold in Mingxian 169, 62.07-fold in the resistant pool, and 58.37-fold in the susceptible pool (Table 2). And 40,735 of these tags were polymorphic, with a polymorphism rate of 8.49%. Target regions were located into chromosome 4B by SNP-based association analysis. SNP-based association analysis was used to locate the chromosomal region containing YrR39. SNPs were firstly called from polymorphic SLAF tags and 394,138 differences were identified between samples and the reference genome. SNPs with (1) multiple alleles, (2) read depths smaller than 4-fold in either bulk, (3) same genotype between R-pool and S-pool, (4) heterozygous genotypes in the recessive parent, and (5) homozygous genotype in the recessive parent but heterozygous genotypes in the recessive bulk, were excluded. Finally, 73,002 high quality SNPs were selected for association analysis. By merging the results from ED (cutoff value: 0.23, P < 0.01, Fig. 2a) and SNP-index (P < 0.01, Fig. 2b) association analysis, the stripe rust resistance gene YrR39 was located in a candidate 17.39 Mb region on chromosome 4B ( Table 3).

Confirmation of the target region by SSR markers.
In total, 83 SSR primers from chromosome 4B were tested for polymorphism and 8 markers showed clear polymorphisms between the resistant and susceptible parents as well as the DNA bulks. Genotype data from 462 F2 individuals were used to construct a genetic map, in which YrR39 was located a 2.6 cM interval flanked by Xwmc495 and Xwmc48 on chromosome 4BL (Fig. 3). This location was consistent with the target regions identified by association analysis and thus supported the region detected by SLAF-seq.
Influence of stripe rust infection on expression of candidate genes. The target regions contained 126 genes ( Table 3); 21 of which were annotated as disease resistance or defense response-related genes, WRKY transcription factors, hormone-related genes, LRR or ABC transporters, receptor kinases, protein phosphatases, and/or involved in signal transduction, were selected as potential candidates for confirmation by qRT-PCR assays. The predicted functions of all 21 genes are listed in Supplementary Table S2. According to the qRT-PCR results, candidate gene Traes_4BS_C868349E1 (annotated as an F-box/LRR-repeat protein) exhibited higher expression levels in R39 than in Mingxian 169 at 12, 24, 48, and 96 hours post inoculation with Pst. The other 20 genes exhibited no obvious regularity in expression pattern in response to infection (Fig. 4). These results suggested that Traes_4BS_C868349E1 was likely involved in the resistance of R39 to stripe rust.

Discussion
In this study, a classic genetic analysis was performed in an adult-resistance cultivar R39 and results indicated that adult-resistance of R39 to Pst race CYR33 was controlled by a recessive gene, which was tentatively designated as YrR39 (Fig. 1a). Next generation sequencing based SNPs markers development is an effective and high-resolution technique for fine mapping of major genes and QTLs 22 , but this approach has not been frequently used for marker development and isolation of functional genes in common wheat. To rapidly identify YrR39, in this study, we used SLAF-seq + BSA approach to analyze two parents and two pooled F2 population samples to detect genomic regions associated with Pst resistance in wheat. Finally, YrR39 was located on chromosome 4B with a size of 17.39 Mb (Fig. 2, Table 3). Considering the fact that only draft genome data of wheat is available at this moment 23 and wheat is an allohexaploid species with extremely large genomes and high proportion (>80%) of repeated     Totally, 126 genes were identified from the target regions (Table 3). Based on functional annotations, 21 genes, which were descripted as disease resistance and defense response related genes, WRKY transcription factors, hormone related genes, LRR or ABC transporter, receptor kinase, protein phosphatase, and/or involving in signal transduction [25][26][27][28][29][30] , were likely to be involved in plant resistance and were selected as good candidates for further analysis (Supplementary Table 2). According to the qRT-PCR results, the expression level of the gene Traes_4BS_ C868349E1, encoding F-box/LRR protein, was up-regulated in flag leaves of resistant R39 in response to infection (Fig. 4). It is widely accepted that resistance genes with LRR domains have major roles in the regulation of the resistance of plants to pathogens and insects 12 . For example, in tobacco and tomato, van den Burg et al. 25 proposed that ACRE189/ACIF1, a F-Box/LRR protein, which was activated by pathogen recognition to regulate cell death and defense responses, could regulate defense responses via methyl jasmonate-and abscisic acid-responsive genes. In Arabidopsis, Yan et al. 31 has proved that F-box/LRR protein COI1 was directly functioning as a jasmonate receptor to involve in defense responses. In the present study, we speculated that Traes_4BS_C868349E1 may be the key candidate gene responsible for R39 resistance to stripe rust and it may activate wheat defense response by regulating the hormone signaling such as jasmonate and abscisic acid. Elucidation of a more detailed mechanism of Traes_4BS_C868349E1 in regulating wheat plant defense response will be the subject of future studies.
Until now, Yr50, a dominant gene and characterized as seedling resistance to Pst races CYR32 and CYR33, and Yr62, a quantitative trait loci characterized as high-temperature adult-plant (HTAP) resistance to Pst races PST-116 and PST-117, were mapped to 4BL 9,32 . In contrast, YrR39 was a recessive gene and characterized as APR to Pst race CYR33. Moreover, R39 had a different genetic background from former used cultivars PI 192252 and CH223. Therefore, YrR39 is different from these two genes and is considered to be a new resistance gene (Fig. 3). Interestingly, a striking feature, the abnormally spotty phenotype, was always observed in the leaves of resistant parent R39, resistant F 2 and BC 1 progenies (Fig. 1a), indicating that the spotty trait is also recessive and completely associated with the resistance phenotype in segregating populations and segregated as a recessive gene in non-inoculated populations (Table 1). Although R39 showed a flecking response, its production and seed quality was significantly better than Zhengmai 9023 and Mingxian 169 under field evaluation (Fig. 1b-d). In several pathosystems, lesion mimic mutations have been shown to be involved in programmed cell death, which in some instances leads to enhanced disease resistance to multiple pathogens 33 . Therefore, we speculated that R39 is a lesion mimic mutant and the spotty trait may be responsible for the Pst resistance at adult-plant stage. The relationship between spotty trait and resistance is being analyzed at present.
In conclusion, we found that the space-induced mutant R39 showed APR to Pst and genetic analysis indicated that a recessive gene was responsible for the adult-resistance to Pst race CYR33. Combined of BSA and SLAF-seq method was used and candidate region was located into chromosome 4B with a size of 17.39 Mb. The regions were further confirmed by linkage SSR markers. qRT-PCR results show the gene Traes_4BS_C868349E1, encoding F-box/LRR protein, was obviously up-regulated in stripe rust infected R39, but not in Mingxian 169 (susceptible parent), suggesting it is involved in resistance response. Regulating the hormone signaling processes is the possible mechanism of Traes_4BS_C868349E1 acting wheat defense response.

Material and Methods
Plant materials and phenotypic collection. R39 was crossed as male parent with susceptible wheat cultivar Mingxian 169. F 1 , F 2 and BC 1 (R39/Mingxian 169//R39) progenies were generated for the study. Plants of the parents, progenies, and Zhengmai 9023 were used in seedling and adult-plant tests in a temperature-controlled greenhouse as described by Zhou et al. 2 . Briefly, Seedlings were grown in the greenhouse under controlled conditions. When the first leaves were fully expanded, all seedlings were inoculated with fresh urediniospores of seven Pst races (CYR29, CYR30, CYR31, CYR32, CYR33, Su11-4 and Su11-11). After 24 h at 10 °C in dew chambers plants were transferred to an environmentally controlled greenhouse with a daily cycle of 16 h light at 18 °C and 8 h darkness at 10 °C. For adult-plant tests, germinated seedlings were vernalized for 5 weeks in a 4 °C refrigerator, prior to transplanting to pots and grown in a greenhouse. One month later, the parents and progenies plants at the booting to heading stage were inoculated with urediniospores mixed with talc and incubated as described above. Infection type (IT) data were recorded 18-20 days after inoculation based on a 0-4 scale as descripted by Zhou et al. 2 . Plants with ITs 0 to 2+ were considered to be resistant and those with ITs 3− to 4 were considered to be susceptible. R39, Zhengmai 9023 and Mingxian 169 were planted in 2016 for field evaluation (Jingzhou, GPS: 112.15028E, 30.362437 N). After harvesting and drying, the seeds were collected to measure thousand seed weight (TSW). Three replicates of each cultivar were performed in this experiment.
DNA isolation and SLAF library construction for high-throughput sequencing. Young healthy leaves from R39, Mingxian 169 and F 2 individuals were collected, frozen in liquid nitrogen, and used for DNA extraction. Total genomic DNA was prepared from each plant using the CTAB method 5 . The DNA was quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, USA). In total, 45 resistant plants and 45 susceptible plants were selected from the F 2 population and an equal amount of DNA from each plant in each response group were pooled as the resistant pool (R-pool) and susceptible pool (S-pool) and adjusted to final concentrations of 40 ng/ul. DNA of the parents and pools were digested to completion with RsaI (NEB, Nanjing, China). A single-nucleotide A overhang was added to the digested fragments with Klenow Fragment (3′-5′ exo-) (NEB, Nanjing, China) and dATP at 37 °C, and then the Duplex Tag-labeled Sequencing adapters (PAGE purified, Life Technologies, Gaithersburg, MD, USA) were ligated to the A-tailed DNA with T4 DNA ligase. The sequence depth of two parental lines was about 10×, and each pool was about 50×. PCR was performed using diluted shearing-ligation DNA samples, dNTP, Q5 ® High-Fidelity DNA Polymerase and PCR primers. PCR products were then purified using Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK). Fragments ranging from 300 to 500 base pairs (with barcodes and adaptors) in size were excised and purified using a QIAquick gel extraction kit (Qiagen, Hilden, Germany). Gel-purified products were then diluted for pair-end sequencing (each end 100 bp) on an Illumina HiSeq 2500 platform using the standard protocol (Illumina Inc., San Diego, CA, USA) at Beijing Biomarker Technologies Corporation (http://www.biomarker.com.cn).
Analysis of SLAF-seq data. Low-quality reads (quality score < 30) were filtered out and raw reads were sorted to each progeny according to barcode sequences. After the barcodes were trimmed from each high-quality read, clean reads from the same sample were mapped onto the Triticum aestivum L. genome sequence 23 using Burrows-Wheeler Aligner software 34 . Samtools 35 was used to mark duplicates, and then GATK 36 was used for local realignment and base recalibration. A SNP set was formed by combining GATK and Samtools SNP calling analysis with default parameters. SNPs identified between the pools were regarded as polymorphic for association studies. In this study, P and M refer to the male (resistant) and female (susceptible) parents, while ab and aa refer to the R-pool and S-pool, respectively. Two association analysis methods, SNP-index and Euclidean distance (ED), were used. SNP-index is an association analysis method to find significant differences in genotypic frequency between the pools, indicated by Δ(SNP-index) 37 , which was calculated as: In which Maa was the depth of the aa population derived from Maa, and Paa was the depth of the aa population derived from P; Mab indicates the depth of ab population derived from Mab, and Pab indicates the depth of ab population derived from P 38 .
Euclidean distance (ED) association analysis is a typical method that calculates Euclidean distance (quadratic sum root of differences between bulks from the depth of four types of base) and is satisfied by ED. Theoretically, the higher the ED value, the closer the object site 38 . ED was calculated as follows: Aaa, Caa, Taa, and Gaa respectively represent the depth of bases A, C, T and G at a site in the susceptible pool, and Aab, Cab, Tab, and Gab represent the depth of bases A, C, T and G at a site in the resistance pool, respectively. To decrease the background noise, the original ED values were then raised to a four power set. Peak regions were defined as regions where the Loess-fitted values are greater than three standard deviations above the genome-wide median 39 .  qRT-PCR. Total RNA was isolated from leaves using RNAiso Plus (Takara, Dalian, China). Dried RNA samples were dissolved in DEPC-water to 1 μg/μL using a BioPhotometer Plus spectrophotometer (Eppendorf, Hamburg, Germany). RNA was reverse-transcribed using a Takara PrimeScript ® RT reagent kit with a gDNA eraser according to the manufacturer's specifcations. qRT-PCR was performed using a RealMasterMix (SYBR Green) kit (Tiangen, Beijing, China) according to the manufacturer's specifications. SYBR Green PCR cycling was performed using an iQ ™ 5 multicolour real-time PCR detection system (Bio-Rad, Hercules, CA) with 20 μL samples. PCR primers were designed using Primer Premier 5.0 to avoid conserved regions. Primer sequences are listed in Supplementary Table 1. Wheat TaEF-α (No. Q03033) was used as internal reference gene. The relative quantities were calculated using the 2 −ΔΔCt method 43,44 . Each treatment included three replications, and each replication included two technical replications.
Data availability statement. All data generated or analysed during this study are included in this published article (and its Supplementary Information files). The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.