The soybean GmSNAP18 gene underlies two types of resistance to soybean cyst nematode

Two types of resistant soybean (Glycine max (L.) Merr.) sources are widely used against soybean cyst nematode (SCN, Heterodera glycines Ichinohe). These include Peking-type soybean, whose resistance requires both the rhg1-a and Rhg4 alleles, and PI 88788-type soybean, whose resistance requires only the rhg1-b allele. Multiple copy number of PI 88788-type GmSNAP18, GmAAT, and GmWI12 in one genomic segment simultaneously contribute to rhg1-b resistance. Using an integrated set of genetic and genomic approaches, we demonstrate that the rhg1-a Peking-type GmSNAP18 is sufficient for resistance to SCN in combination with Rhg4. The two SNAPs (soluble NSF attachment proteins) differ by only five amino acids. Our findings suggest that Peking-type GmSNAP18 is performing a different role in SCN resistance than PI 88788-type GmSNAP18. As such, this is an example of a pathogen resistance gene that has evolved to underlie two types of resistance, yet ensure the same function within a single plant species.

S oybean cyst nematode (SCN), Heterodera glycines Ichinohe, one of the most devastating pathogens of soybean, causes more than $1 billion in yield losses annually in the United States alone 1 . Although planting resistant cultivars forms the core management strategy for this pathogen, the mechanism of soybean resistance to SCN is still unknown. Moreover, the genetic diversity of resistance is limited and virulent nematode populations have been identified for most known resistant sources. Therefore, understanding the molecular nature of soybean's resistance to SCN is increasingly important for the long-term management of this devastating nematode disease.
The first Rhg genes (for resistance to H. glycines) were identified in the early 1960s. Numerous reports are available on the identification and mapping of quantitative trait loci (QTL) in soybeans with underlying resistance to SCN from a variety of different germplasm sources. QTL on chromosomes 18 (rhg1) and 8 (Rhg4) are the two major resistance QTL that have been consistently mapped and reported in a variety of soybean germplasm. In some accessions, such as plant introduction (PI) 88788, rhg1 is sufficient to provide resistance against the nematode with the gene itself displaying incomplete dominance 2 . In other cases, such as the soybean cultivar (cv.) Forrest, resistance to SCN requires both rhg1 and Rhg4 (ref. 3). Brucker et al. classified the resistant rhg1 into two types: rhg1-a in Peking-type soybeans and rhg1-b in PI 88788-type soybeans 4 . Here we define the susceptible rhg1 type in Essex and Williams 82 as rhg1-s; otherwise, we use rhg1 to refer to the locus in general.
Resistant soybean lines carrying the Rhg resistance genes display an incompatible interaction with the nematode. Although infective juvenile nematodes penetrate the plant root, the feeding cells eventually degenerate causing the nematodes to die before they reach the adult stage. H. glycines exhibits genetic variability and nematodes carrying the as yet undefined ror (reproduction on a resistant host) alleles are able to survive on resistance cultivars thus causing population shifts in the field associated with monoculture or resistant soybean varieties 5 . Recently, progress has been made in the identification of the major soybean genes underlying SCN resistance at the rhg1-b and Rhg4 loci, and in the development of specific DNA markers for SCN resistance selection. Liu et al. 6 identified and functionally validated a serine hydroxymethyltransferase on chromosome 8 (GmSHMT08) as the Rhg4 gene by high-density genetic mapping, targeting-induced local lesions in genomes (TILLING), gene silencing (virus induced gene silencing (VIGS) and RNA interference) and genetic complementation. Serine hydroxymethyltransferases (SHMTs) are important in onecarbon folate metabolism, but how this particular soybean SHMT functions in resistance to the nematode remains unknown. Cook et al. reported that the resistance of soybean rhg1-b to SCN is simultaneously mediated by the high copy number of three genes (Glyma18g02580, Glyma18g02590 and Glyma18g02610) within a 31 kb PI 88788 rhg1-b segment 7 . This segment also includes partial gene sequences for Glyma18g02570 and Glyma18g02610. Direct repeats (2)(3)(4)(5)(6)(7)(8)(9)(10) are present in SCN-resistant soybean lines, whereas only one copy is present in SCN-susceptible soybean lines [7][8][9] . Using the Glyma18g02590 gene sequences, two specific KASP (kompetitive allele-specific PCR) single-nucleotide polymorphism (SNP) markers were developed to select the rhg1 resistance alleles and the assay can be used to differentiate Peking-type and PI 88788-type resistance 10 .
Using a positional cloning approach, we identified rhg1-a candidate genes within a contrasting 370 kb chromosomal interval between the resistant line Forrest and the susceptible line Essex. We then applied 'region-specific extraction sequencing' (RSE-Seq), a capture technology developed to enrich a targeted chromosomal segment for genome sequencing [11][12][13] , to identify SCN resistance genes within the identified 300 kb chromosomal segment carrying the rhg1 locus. RSE-Seq has been applied successfully to study yeast, human and zebrafish [11][12][13][14][15] . The SNPs and insertions and deletions (InDels) of four soybean lines (Essex, Forrest, Peking and PI 88788) were analysed using the Williams 82 genomic sequence as a reference 16 . The analysis of the relevant SNPs and InDels identified the GmSNAP18 as the strongest candidate gene conferring resistance to SCN at the rhg1 locus. Genetic complementation analyses of the Forrest GmSNAP18 confirmed its major role in resistance to SCN. The complementation analyses also showed the Forrest GmSNAP18 is specifically required for Peking-type resistance, as the resistant GmSNAP18 allele of PI 88788 or the susceptible rhg1-s could not restore resistance in the Forrest (Peking-type) Rhg4 background.

Results
Map-based cloning of rhg1-a gene conferring SCN resistance. To perform positional cloning by high-density genetic mapping of the rhg1-a gene conferring resistance to SCN, three F 5:6 recombinant inbred line (RIL) populations segregating for resistance to SCN were developed from crosses of the SCN-resistant cv. Forrest (F) with the SCN-susceptible soybean cvs. Williams 82 (W) and Essex (E). As Forrest-type resistance to SCN requires both the rhg1-a and Rhg4, identification of recombinants was conducted using DNA markers flanking both loci to detect informative recombinants at the rhg1-a locus (Supplementary Table 1). We identified a total of 222 recombinant lines with chromosomal breakpoints around the rhg1-a locus, of which three recombinants (ExF4361, ExF3126 and WxF6034) were crucial in defining the interval carrying the rhg1-a or rhg1-s gene. All three lines carried the Forrest allele at the Rhg4 locus; ExF4361 and ExF3126 were resistant, whereas WxF6034 was susceptible to SCN (Fig. 1a). The ExF4361 recombinant suggested that the rhg1-a gene(s) was (were) located on the left side of marker 37591. The WxF6034 recombinant indicated that the rhg1-a gene(s) was (were) located to the left of marker 600, excluding the DNA marker 600. The ExF3126 recombination event showed that the rhg1-a gene(s) was (were) located in the interval between DNA maker 560 and the DNA marker SIUC-SAT185, excluding DNA marker 560. Taken together, the recombinant analyses suggested that the interval (B14.3 kb) carrying DNA markers 570, 580 and 590 was underlying resistance to SCN at the rhg1-a locus in Forrest. Three genes were identified within the 14.3 kb interval; one codes for an armadillo/b-catenin-like repeat (Glyma18g02570) at the interval carrying marker 570; the second is an amino acid transporter (AAT, Glyma18g02580) at the interval carrying marker 580; the third is a soluble N-ethylmelaimide sensitive factor (NSF) attachment protein (GmSNAP18, Glyma18g02590) at the interval carrying marker 590. The complementary DNAs for each of the three identified genes were cloned and sequenced. Amino acid alignments of the predicted armadillo/b-catenin-like repeat and amino acid transporter protein sequences of Forrest and Essex revealed identical amino acid sequences between both the resistant and susceptible lines ( Supplementary Fig. 1). A GmSNAP18 genomic DNA sequence comparison of Forrest and PI 88788 with Essex identified six SNPs (C2447A, C2464G, G4203C, G4206C, G4206T and C4215A) and four InDels (À 4211G, À 4212G, À 4213T and À 4213G) within the exons (Fig. 1b), resulting in nine amino acid changes (Fig. 1c). Thus, GmSNAP18 was deemed as the strongest candidate gene for conferring resistance to SCN at the rhg1-a locus.      Table 2 and Supplementary Data 1). However, when the genomic DNA sequences of the GmSNAP18 from Forrest and PI 88788 were compared with those of the susceptible line Essex, only six SNPs (C2447A, C2464G, G4203C, G4206C, G4206T and C4215A) and four InDels ( À 4211G, À 4212G, À 4213T and À 4213G) were mapped to the exons of gene Glyma18g02590 (GmSNAP18). The identified polymorphisms resulted in nine amino acid changes (Supplementary Table 2). Thus, GmSNAP18 was identified to be the most probable candidate gene conferring resistance to SCN at the rhg1 locus, consistent with the results of map-based cloning (Fig. 1). The predicted GmSNAP18 protein of Forrest and Peking is different from that of PI 88788. This is consistent with previous genetic results (termed as rhg1-a in Peking-type soybeans and rhg1-b in PI 88788-type soybeans) 4 .

SNPs
Insertions Deletions E-88-S294 EP88-S1 EP-S232 P-S16 P88-S16  Multiple types of GmSNAP18 at the rhg1 locus. The RSE-Seq data indicated that multiple types of GmSNAP18 existed within rhg1 (Gm18: 1480001..1780000). In Forrest and Peking, only one type was identified within the whole segment between Glyma18g02570 and Glyma18g02610, whereas PI 88788 was different. All types of GmSNAP18 within the exons with or without amino acid changes, compared with the predicted GmSNAP18 protein of Williams 82, are summarized in Table 2. PI 88788 showed two types of GmSNAP18 (Type I, whose sequence is identical to that in the map-based cloning section mentioned above, and Type II, identical to the specific type of Williams 82 and Essex) at amino acid positions 203 and 285-288. These two types of GmSNAP18 are the same as the two types of GmSNAP18 in PI 88788 reported previously 7,9 . However, the third type of GmSNAP18 in PI 88788 (deletion at position 203) reported previously 7 was not detected within the 150 or more RSE-Seq reads at each position. The ratio of number of reads of Type I and Type II of the GmSNAP18 in PI 88788 is about 8-9:1. Meanwhile, the sole type of GmSNAP18 in both Forrest and Peking is different from the two types of GmSNAP18 in PI 88788.
In addition, Glyma18g02610 also exhibited two types at the 5 0 -untranslated region in PI 88788: one is the same as that    Haplotypes of the GmSNAP18 predict SCN resistance.
To establish a link between GmSNAP18 alleles and soybean resistance to SCN, we scored 81 soybean lines (including PIs, landraces and elite cultivars, as described previously 6 ) representing 95% of the sequence diversity of soybeans for their SCN female index (FI) 17 and then determined their SNP-based GmSNAP18 haplotype after genotyping them at the rhg1 and Rhg4 loci ( Table 3). The genotyping data were clustered ( Fig. 2 and Supplementary Table 1 of Liu et al. 6 ) and 11 soybean lines were selected to fully sequence their GmSNAP18 genes. As a result, 14 polymorphisms (10 SNPs and 4 InDels) at 13 positions were identified within the exons of GmSNAP18 (Fig. 3). In total, three different GmSNAP18 haplotypes were identified. Soybean lines with Haplotype I carry resistant alleles at both GmSNAP18 and Rhg4 (GmSHMT08), and are resistant to SCN ( We successively analysed the GmSNAP18 expression in the ExF RILs together with GmSNAP18 and GmSHMT08 genotyping and phenotyping ( Fig. 4 and Supplementary Data 2). In this work, GmSNAP18 þ and GmSNAP18 À , and GmSHMT08 þ and GmSHMT08 À represent Forrest and Essex GmSNAP18, and Forrest and Essex GmSHMT08, respectively; GmSNAP18-P þ represents PI 88788 GmSNAP18. The results demonstrate that GmSNAP18 transcripts were induced in the ExF RILs carrying GmSNAP18 þ / GmSHMT08 þ (ExF7) and GmSNAP18 þ /GmSHMT08 À (ExF12), whereas the ExF RILs carrying GmSNAP18 À /GmSHMT08 þ (ExF68) did not show significant changes in the expression of GmSNAP18, after SCN infection (3 days). ExF RILs carrying GmSNAP18 þ /GmSHMT08 þ display resistance to SCN, whereas the ExF RILs carrying GmSNAP18 À /GmSHMT08 þ or GmSNAP18 þ / GmSHMT08 À do not show resistance to SCN (Fig. 4 and Supplementary Data 2). These results further support a role for GmSNAP18 in resistance to SCN.

Relative transcription level
the SCN-susceptible RIL ExF50 carrying GmSNAP18 À / GmSHMT08 þ , which has the resistant Forrest allele at the Rhg4 locus and the susceptible Essex allele at the rhg1 (rhg1-s) locus. As GmSNAP18 occurs in multiple copies at this locus in the resistant soybean 7-9 , the construct was made to express the gene under the control of a CaMV 35S promoter in planta. Roots of RIL ExF67 transformed with the plasmid vector were used as a positive control for resistance. The transgenic hairy roots overexpressing Forrest GmSNAP18 (rhg1-a) grew normally, but showed a significant reduction in SCN development compared with the control transgenic hairy roots in infection assays ( Fig. 5a and Supplementary Figs 3a and 4). We also tested the amino acid transporter (AAT, Glyma18g02580, the predicted protein sequences between Forrest and Essex are identical (Supplementary Fig. 1) and did not observe a significant reduction in SCN development compared with the control transgenic hairy roots in infection assays ( Fig. 5a and Supplementary Figs 3b  and 4). The predicted protein sequences of Glyma18g02570 of both Forrest and Essex are identical to that of Williams 82 (http://soybase.org) (Supplementary Fig. 1). Glyma18g02570 did not contribute to SCN resistance 7 and therefore was not tested in this study. These data suggest that the GmSNAP18 gene at the rhg1-a locus is sufficient to confer SCN resistance in Forrest. We also expressed the GmSNAP18 genes from two other soybean lines: Essex (rhg1-s) and PI 88788 (rhg1-b). Expression of the GmSNAP18 gene from either Essex or PI 88788 in RIL ExF50 did not result in a significant reduction in SCN development compared with the control (Fig. 5b and Supplementary Fig. 4), demonstrating the specificity of Forrest GmSNAP18 in conferring Peking-type soybean resistance to SCN.
Modelling of GmSNAP18. To understand the locations of the different identified haplotypes of GmSNAP18 and how they may confer SCN resistance, the Forrest GmSNAP18 was modelled and then each haplotype of the other soybean lines (Fig. 3) was mapped. The modelling showed that most of the identified differences were at the carboxy-terminus of the GmSNAP18. These haplotypes not only varied between PI 88788-type and Peking-type resistant lines, but also between the resistant and susceptible lines (Fig. 6). Through structural analyses, variations in the GmSNAP18 C-terminal residues 285-289 (E285, Y286, E287, V288 and I289) may play a role in establishing proteinprotein interactions or modulating vesicular exocytosis 18 . It has been reported that the C-terminus of soluble N-ethylmaleimide sensitive fusion attachment proteins (SNAPs; last 25 residues) determines its functionality in vesicle trafficking and localization 19 . The Q203K haplotype in PI 88788-type soybeans (Fig. 3) has an altered charge of the immediate environment from an uncharged glutamine to a positively charged lysine. The T261A variation only occurring in PI 90763 (Fig. 3) shows a polarity shift from a polar threonine in all other soybean lines to a non-polar alanine, which probably enables PI 90763 to acquire new resistance to certain SCN Hg Types. Similarly, the T251S variation in PI 209332 and A39D in PI 548316 may play a role in nematode effector recognition, as this ability differs from its closest soybean haplotypes (Fig. 3).

Discussion
In this study, GmSNAP18 was identified as the rhg1-a gene conferring resistance to SCN using an integrated set of genomic and genetic approaches in combination with genetic complementation analyses. A high-density genetic map of a 370 kb chromosomal segment carrying the Forrest rhg1-a locus was developed using three RIL populations (Fig. 1a). Three recombination events (ExF3126, ExF4361 and WxF6034) narrowed the rhg1-a gene candidate to a region containing three genes, Glyma18g02570, Glyma18g02580 and Glyma18g02590 (GmSNAP18) (Fig. 1a). Similar to Peking 7 , three tandem copies   (Fig. 1a), excluding the rhg1-a LRR-RLK candidate resistance gene in support of earlier work.
To further confirm the reliability of the constructed genetic map, RSE-Seq was used to study the SNPs and InDels among resistant and susceptible genotypes to pinpoint candidate genes for resistance to SCN at the rhg1 locus. RSE has been successfully applied to yeast, human and zebrafish [11][12][13][14][15] , but has not yet been applied to plants. Here we show that RSE-Seq of a targeted 300 kb genomic DNA segment (Gm18: 1480001..1780000) of contrasting chromosomal regions underlying resistance was effective in the direct identification of a candidate rhg1 SCN resistance gene (Table 1 and Fig. 2). We postulated that the SNPs and InDels at rhg1 most likely to be associated with the alleles underlying resistance to SCN would be shared by Forrest, Peking, and/or PI 88788, but not present in the genomic DNA of susceptible line Essex at the rhg1 locus. Furthermore, those with changes within exons that resulted in amino acid changes would be considered high priority candidates. Compared with the genomic sequence of Williams 82 at that targeted 300 kb rhg1 genomic DNA segment, the sequence analysis of Forrest, Peking and PI 88788 resulted in the identification of 835, 872, and 1021 SNPs and InDels, respectively. In contrast, Essex showed 886 SNPs and InDels (Table 1 and Supplementary Data 1), of which only 26 SNPs and InDels were potentially associated with the alleles underlying resistance to SCN (Supplementary Table 2). Furthermore, of those 26 SNPs and InDels, 10 were specific to Forrest, Peking, and PI 88788, and all were located within the exons of one gene Glyma18g02590 (GmSNAP18). This resulted in nine amino acid changes to the predicted GmSNAP18 protein (Supplementary Table 2). The other SNPs and InDels identified did not result in any amino acid changes. Thus, consistent with the genetic mapping results, GmSNAP18 was identified as the rhg1 candidate gene probably conferring resistance to SCN. The results obtained in this study suggest that the RSE method is a powerful tool for the preparation of specific genomic regions for next-generation sequencing (NGS). However, the design of specific primers is critical when working with organisms possessing repetitive genome sequences. For instance, the soybean genome is about 1.115 Gb in size 27 , but 40-60% of the soybean genome is repetitive sequence and heterochromatic [28][29][30] . To enrich the specific region (Gm18: 1480001..1780000), one primer was designed approximately every 5 kb for a total of 60 primers spanning the targeted genomic DNA region. Specific primers ensured that other homeologous genomic DNA segments (that is, chromosome 11) were not enriched to avoid biased sequence reads which could have complicated the sequence data analyses.
The genetic complementation analyses of GmSNAP18 clearly showed that the transgenic ExF50 (GmSNAP18 À /GmSHMT08 þ ) gained resistance to SCN after being complemented with the Forrest GmSNAP18, but not when complemented with Forrest AAT (Fig. 5a and Supplementary Fig. 4), indicating that the Forrest GmSNAP18 is the gene at the rhg1-a locus that confers resistance to SCN. In addition, the PI 88788 GmSNAP18 (rhg1-b) or Essex GmSNAP18 (rhg1-s) could not restore resistance to ExF50 (Fig. 5b and Supplementary Fig. 4), showing that the only form of GmSNAP18 that could complement the ExF50 RIL is the Forrest-type GmSNAP18 (rhg1-a). These data suggest that the resistant form of the PI 88788 GmSNAP18 is unable to communicate with the Forrest GmSHMT08 (Rhg4), to confer resistance to SCN. Our genetic mapping results did not include Glyma18g02610 among the rhg1-a candidate genes (Fig. 1a). Thus, the complementation results indicate that the SCN resistance at the rhg1-a locus in Forrest is triggered by GmSNAP18 alone, contrasting with the rhg1-b in PI 88788, whose resistance is due to the high copy number and concerted activity of three genes 7 .
Forrest, Peking and PI 88788 require the presence of either the resistant rhg1-a or rhg1-b allele to exhibit resistance to SCN 2,3,6,7 . In addition, the resistance of Forrest and Peking to SCN requires the Rhg4 gene 6 . The resistance reaction of Peking-type soybeans to nematodes is quick and forceful leading to a rapid degeneration of the syncytium, whereas PI 88788-type soybeans react to nematodes comparatively slower 31 . SCN populations that have adapted to reproduce on soybeans with either Peking-type Q203 E208 E287 V288 I289 E285 Y286 T251 T261 A39 Figure 6 | Homology model of the GmSNAP18 from Forrest with mapped residues of identified haplotypes. The C-terminal residues 285-289 (E285, Y286, E287, V288 and I289) vary among all the three GmSNAP18 types; the residue E208 varies in GmSNAP18 Types II and III; the residue Q203 varies in GmSNAP18 Type II; the residue T261 varies only in PI 90763 (from Type I); the residue T251 varies only in PI 209332 (from Type II); and the residue A39 varies only in PI 548316 (from Type II).
or PI 88788-type resistance have been reported 32 . According to the different haplotypes of GmSNAP18 identified from a collection of 81 soybean lines, Peking, Forrest, PI 90763, PI 437654 and PI 89772 were classified as Peking-type soybeans; PI 88788, PI 209332 and PI 548316 were classified as PI 88788type soybeans; and Essex, Williams 82 and PI 603428C were classified as SCN-susceptible-type soybeans (Table 3 and Fig. 3). All are in agreement with the previous genetic results, showing that the SCN resistance of Peking-type soybeans such as Forrest requires both rhg1-a and Rhg4 (GmSHMT08) 2,3,6 . The genomic and cDNA sequences of the GmSNAP18 conferring SCN resistance were different between Peking-type soybeans and PI 88788-type soybeans (Fig. 1b), consistent with the different rhg1 loci identified in both lines (rhg1-a and rhg1-b) 4 .
SNAPs are highly conserved proteins and belong to the tetratricopeptide repeat containing protein family. Tetratricopeptide repeat proteins are known to be involved in vesicular trafficking, cytokinesis and plasma membrane repair and stability [33][34][35][36][37] . SNAP proteins have been described as the member of the SNARE complex involved in many pathogen resistance pathways. It has been reported that SNAP, together with syntaxins SYP121 and SYP132, contributes to the resistance to fungi and bacteria 38,39 . The major amino acid changes of GmSNAP18 between susceptible, Peking-type and PI 88788-type soybean lines were mostly mapped in the C-terminus of the protein (Fig. 6), which we classified into different haplotypes (Fig. 3). Considering that the C-terminus of SNAP proteins have been found to control vesicular trafficking 19 , the haplotype differences among susceptible, Peking-type and PI 88788-type soybeans may alter the destination of a GmSNAP18-guided vesicle. In addition to the two major resistant haplotypes (Peking and PI 88788), there are also the PI 90763 (T261A), PI 548316 (A39D), and PI 209332 (T251S) haplotypes. SNAP proteins have also been found to have their activity altered by phosphorylation. As most kinases act on serine or threonine, the T261A and T251S may have altered phosphorylation capabilities. Various a-SNAP haplotypes may also play a role in nematode effector recognition 40 or in establishing protein-protein interactions 41,42 . Thus, the differences in GmSNAP18 among soybean genotypes could determine the type of interactions between the nematode and the soybean host.
Taken together, our results support a model wherein GmSNAP18 is a major factor mediating the different types of soybean resistance to SCN. Our findings suggest that the Pekingtype GmSNAP18 is performing a different role in SCN resistance than PI 88788-type GmSNAP18. This is an example of a pathogen resistance gene that has evolved to underly two types of resistance, yet ensure the same function within a single plant species. Now that the major genes for resistance to SCN have been discovered, the molecular and biochemical mechanisms of soybean resistance to SCN can be explored in detail.

Methods
Nematode and plant materials. The inbred SCN population PA3 (Hg Type 0) used in this study was mass selected on soybean cv. Williams 82 according to standard procedures 32 46 . The collection of soybean lines used in this study was obtained from the United States Department of Agriculture (USDA) Soybean Germplasm Collection, University of Illinois.
Extraction of soybean genomic DNA. The soybean seeds were planted in the greenhouse at Southern Illinois University, germinating and growing at 25-30°C. One top young trifoliate leaf of a seedling from each soybean line was collected for the extraction of genomic DNA about 2 weeks after germination. The genomic DNA extraction was performed using a Qiagen DNeasy Plant Mini Kit or a Qiagen DNeasy 96 Plant Kit (Qiagen Sciences, USA) per the manufacturer's manual instructions. The DNA extracted was quantified to 100 ng ml À 1 for RSE-Seq, genetic mapping, haplotyping and TILLING mutation screening.
Map-based cloning of the rhg1-a gene. Three genetic populations segregating for resistance to SCN PA3 (Hg Type 0) were used for mapping. These included an F 5:6 RIL population from a cross between Forrest and Essex (98 individuals) and two large F 5:6 RIL populations generated from crosses between Forrest and either Essex (1,755 lines) or Williams 82 (2,060 lines) to enrich the chromosomal interval carrying the rhg1 gene recombinants.
As Forrest resistance to SCN requires both the rhg1-a and Rhg4 genes 3 , genotyping was conducted using DNA markers flanking both loci to detect informative recombinants at the rhg1 locus (Supplementary Table 1). The simple sequence repeat (SSR) markers Sat_210, Satt309 and SIUC-SAT143 were used to identify chromosomal breakpoints at the rhg1-a locus. PCR amplifications were performed using DNA from individuals from each of the three genetic populations. Cycling parameters were as follows: 35 cycles of 94°C for 30 s, 50°C for 30 s and 72°C for 30 s with 7 min of extension at 72°C. The PCR products were separated on 3-4% metaphor agarose gels. The identified recombinants were subjected to a second screening by using the EcoTILLING marker SHMT 6 and the simple sequence repeat (SSR) marker Sat_162 to identify the Rhg4 genotype of each recombinant.
To enrich the chromosomal regions carrying the rhg1-a locus with DNA markers, the GenBank published sequence AX196295 spanning the region and the Williams 82 reference sequence on http://soybase.org, spanning this region were used to design PCR primers every 5-10 kb of the 370 kb carrying the rhg1 locus (Supplementary Table 1). DNA from Forrest, Essex and Williams 82 was tested with each primer, using a modified EcoTILLING protocol, to find and map polymorphic sequences at the rhg1 locus 6,26,47 . The identified SNP and InDel DNA markers were integrated into the informative recombinants to identify chromosomal breakpoints and the interval that carried the rhg1-a gene.
Isolation of the genomic and cDNA sequences of GmSNAP18. The genomic DNA and cDNA of GmSNAP18 in Forrest, Essex and PI 88788 were cloned and sequenced. Genomic DNA was isolated from young leaves using the DNeasy Plant Mini Kit (Qiagen Sciences). Total RNA was isolated from roots using the RNeasy Plant Mini Kit (Qiagen Sciences) and cDNA was synthesized using a cDNA synthesis kit (Invitrogen, USA). PCR primers based on the Forrest and Essex genomic DNA sequences were used to amplify the corresponding cDNA sequences (Supplementary Table 1).
Enrichment of one targeted 300 kb rhg1 genomic DNA segment. Using Williams 82 genomic sequence as the reference sequence, a specific oligonucleotide primer was designed about each 5 kb between Gm18: 1480001 and 1780000 for the RSE of one targeted 300 kb rhg1 genomic DNA segment (Supplementary Table 3). RSE was performed using the genomic DNA of Essex, Forrest, Peking and PI 88788 as described by Gabriel et al. 11 and Dapprich et al. 12 , by Generation Biotech (Lawrenceville, NJ, USA). Briefly, capture primers designed were designed to hybridize to targeted areas of the genome by exploiting sequence elements that are unique to the region of interest. The bound oligos are extended with biotinylated nucleotides to label the targeted DNA segments. Streptavidin-coated magnetic microparticles are then added to the reaction mix to isolate the targeted DNA, along with its flanking regions. The 30 ml RSE reaction mix consisted of a premixed set of targeting primers combined with 600 ng of genomic DNA. The genomic DNA was denatured and an automated capture was performed, followed by washing and elution in preloaded reagent cartridges. After RSE, the enriched DNA from each sample was removed from the microparticles by heating the solution at 80 o C for 15 min to disrupt the biotin-streptavidin complex 12 . The microparticles were magnetically collected and the elute containing the enriched regional genomic DNA was retained for theNGS.
NGS of the targeted 300 kb rhg1 genomic DNA segment. The NGS of the targeted 300 kb rhg1 genomic DNA segment of the soybean lines Essex, Forrest, Peking and PI 88788 was carried out by GenXPro GmbH in Frankfurt, Germany. Briefly, the enriched samples RSEed in the last step were amplified with an Illustra GenomiPhi V2 DNA amplification kit (GE Healthcare) according to the manufacturer's protocol. Residual primers and dNTPs were deactivated with ExoSAP-IT as per the manufacturer's protocol. For each sample, approximately 2 mg of enriched, amplified genomic DNA was used as input for the preparation of the sequencing library. The library was prepared for sequencing using the Illumina Genomic DNA Sample Prep Kit. Sequencing was performed using an Illumina Hiseq2000.
Reads mapping and calling of SNPs and InDels. After the reads were analysed and filtered following sequencing, the high-quality reads were mapped with novoalign version 2.07.14 to the Williams 82 genomic sequence, resulting in a BAM file for each soybean line with the SAMTools 48 . The sequences were analysed with a Tablet viewer 49 and an IGV viewer 50 for the building of consensus sequences, and the manual calling of SNPs and InDels using the Williams 82 genomic sequence as a reference sequence.
Haplotyping of GmSNAP18 in soybean lines. A total of 81 soybean lines (PIs, landraces and elite cultivars), representing 95% of the genetic variability of soybeans 17 , were scored for their SCN FI. Lines were classified resistant (R) to SCN if the FI was r10% and susceptible (S) if the FI was 410%. Soybean lines were genotyped at the rhg1 locus by using the DNA markers 560, 570 and Satt309. They were also genotyped at the Rhg4 locus, using the DNA marker Sat_162 and by the sequencing of GmSHMT08 (ref. 6). The coding region of GmSNAP18 was sequenced for 11 lines. Common SNPs and InDels were identified and used to determine the different GmSNAP18 haplotypes.
Quantitative RT-PCR of the GmSNAP18 gene. Soybean seedlings from the susceptible line Essex, the resistant lines Forrest and PI 88788, and two RILs ExF7 (GmSNAP18 þ /GmSHMT08 þ ) and ExF68 (GmSNAP18 À /GmSHMT08 þ ) were grown in autoclaved sandy soil in the growth chamber for 1 week and then infected with infective eggs from the PA3 population. Total RNA was isolated from root samples 3 and 5 days after SCN infection using Qiagen RNeasy Plant Mini Kit (Qiagen Sciences). Total RNA was DNase treated and purified using a Turbo DNA-free Kit (Life Technologies, USA). RNA was quantified using NanoDrop 1000 (V3.7), then a total of 400 ng of treated RNA was used to generate cDNA using the cDNA synthesis kit (Life Technologies) with random hexamers and 2 ml of cDNA was used for the GmSNAP18 quantitative PCR with the specific primers (Supplementary Table 1), using the Power SYBR Green PCR Master Mix kit (Life Technologies). Gene transcription from three individual biological replicates was used for quantification, then normalized by the deltadelta C q method using Ubiquitin as a reference gene (DC q ¼ C q(TAR) -C q(REF) ). GmSNAP18 expression was exponentially converted using the formula: DC q Expression ¼ 2 À DCq .
VIGS infection analyses. BPMV VIGS vectors pBPMV IA-R1M and pBPMV-IA-V1 were used in this study 51 . Briefly, a 302 bp fragment (spanning base pairs 365-667) of GmSNAP18 (Glyma18g02590) cDNA sequence (GenBank accession number: KX147332) was amplified from soybean cv. Forrest root cDNA by RT-PCR, using primers listed in Supplementary Table 1. PCR products were digested with BamH1 and ligated into pBPMV-IA-V1, then digested with the same enzyme to generate pBPMV-SNAP18-AS (BPMV-SNAP18-AS) with the insert in the antisense orientation. Gold particles coated with pBPMV-IA-R1M and BPMV-SNAP18-AS were co-bombarded into soybean leaf tissue as described previously 52 . For vector control, pBPMV-R1M and pBPMV-GFP-AS were used. At 3-4 weeks after inoculation, trifoliate leaves showing virus symptoms were collected, lyophilized and stored at À 20°C to be used as virus inoculum for plants. The SCN-resistant RIL ExF67 (GmSNAP18 þ /GmSHMT08 þ ) seedlings were inoculated with virus carrying pBPMV-SNAP18-AS or a vector at 7-9 days post germination and the plants were maintained at 20°C, 16 h light/8 h dark for 30 days post virus infection. The experiment was repeated twice with similar results.
Candidate gene complementation experiments. Based on the aforementioned mapping data, the candidate gene GmSNAP18 and a putative amino acid transporter (AAT) were selected for complementation experiments. For this, Forrest GmSNAP18 and Forrest AAT were subcloned into a 35S pAKK gateway vector by recombination cloning. Briefly, the 873 bp open reading frame of GmSNAP18 or the 1311 bp open reading frame of AAT were PCR amplified from Forrest root cDNA using gateway primers (Supplementary Table 1). They were cloned into pDONR/Zeo and subsequently moved into the gateway vector pAKK downstream of CaMV 35S promoter. Transgenic hairy roots were generated from the RIL ExF50, which carries the Forrest Rhg4 (resistant allele) and Essex rhg1 (rhg1-s, susceptible allele) as described previously 53 . Transgenic hairy roots of vector transformed RIL ExF67 (ref. 6) were used as an SCN-resistant control. The infection experiments were done in square Petri plates as described previously 6 . In all experiments, at least 12 independent transgenic hairy root lines were used per treatment. The experiments with GmSNAP18 were repeated at least five times and those with AAT were repeated twice. For experiments with GmSNAP18 of Essex (rhg1-s) or PI 88788 (rhg1-b), the constructs were made as described above, using GmSNAP18 cloned from the root cDNA of these lines, respectively. Three biological replicates of transgenic root generation and infection assays were conducted as described above. The results from infection assays were plotted and analyzed for statistical significance by an unpaired t-test using GraphPad PRISM software.
Modelling GmSNAP18. Homology modelling of a putative GmSNAP18 protein structure was conducted with Deepview and Swiss Model Workspace software 54 , using the predicted GmSNAP18 protein sequence from Forrest and an available a-SNAP crystal structure from Rattus norvegicus as a template; PDB accession is 3J96 chain G 55-57 . Residues 6-284 were modelled against this template with a sequence identity of 39%. The remaining residues were appended on the structure with Deepview, followed by an energy minimization using the ModRefiner server 58 . Haplotype mapping and visualizations were performed using the UCSF Chimera package 59 .
Statistical analysis. All the presented qRT-PCR results were performed with the analysis of variance by Student's t-test means comparison using JMP Pro V12 software.
Data availability. The genomic DNA and cDNA sequences of GmSNAP18 in Forrest, Essex and PI 88788 have been deposited in NCBI GenBank with the accession number of KX147329, KX147331, KX147330, KX147332, KX147333 and KX147334, respectively. The sequences of the 300 kb rhg1 segment of four soybean lines have been deposited in NCBI Sequence Read Archive with the study number of SRP090423. The authors declare that all other data supporting the findings of this study are available upon request.