The AvrPm3-Pm3 effector-NLR interactions control both race-specific resistance and host-specificity of cereal mildews on wheat

The wheat Pm3 resistance gene against the powdery mildew pathogen occurs as an allelic series encoding functionally different immune receptors which induce resistance upon recognition of isolate-specific avirulence (AVR) effectors from the pathogen. Here, we describe the identification of five effector proteins from the mildew pathogens of wheat, rye, and the wild grass Dactylis glomerata, specifically recognized by the PM3B, PM3C and PM3D receptors. Together with the earlier identified AVRPM3A2/F2, the recognized AVRs of PM3B/C (AVRPM3B2/C2) and PM3D (AVRPM3D3) belong to a large group of proteins with low sequence homology but predicted structural similarities. AvrPm3b2/c2 and AvrPm3d3 are conserved in all tested isolates of wheat and rye mildew, and non-host infection assays demonstrate that Pm3b, Pm3c, and Pm3d also restrict the growth of rye mildew on wheat. Furthermore, divergent AVR homologues from non-adapted rye and Dactylis mildews are recognized by PM3B, PM3C, or PM3D, demonstrating their involvement in host specificity.

where it is individually expressed or combined with BgtE-20069b_96224, BgtE-20069a_96224, and BgtE-20069a_94202. HR was assessed using HSR imaging 5 days after Agrobacterium infiltration. Results were consistent over at least two independent assays of 6 to 8 independent leaf replicates.

The effector benchmarking approach was developed to reduce the large number of known candidate effector genes (595 at the time of analysis) to a manageable number for experimental analysis. It is based on the hypothesis that effector genes encoding avirulence proteins are likely to share similar features in terms of sequence properties and gene expression levels. We therefore implemented a benchmarking scheme to identify powdery mildew effectors that resemble the functionally validated avirulence genes AvrPm3a2/f2 and AvrPm2 from B.g. tritici 1,7 and Avra1 and Avra13 from B.g. hordei 8.

We classified the features defining a putative candidate Avr effector into four groups: "1.1 Sequence polymorphism" between the Bgt_96224 reference isolate (avirulent on all Pm3 alleles) and the isolates Bgt_94202 (virulent on all Pm3 alleles) and Bgt_JIW2 (virulent on Pm3c and Pm3f only), "1.2 Protein structure", "2.1 Absolute expression" in the reference isolate Bgt_96224, and "2.2 Differential expression" between Bgt_96224 and the phenotypically contrasting isolates Bgt_94202 and Bgt_JIW2 (Supplementary Data 1). For each category, we defined a scoring scheme based on the assumption that Bgt_96224, which is avirulent on all the Pm3 alleles, should encode all AvrPm3 specificities. Therefore, putative effectors best fulfilling the criteria for an Avr in the Bgt_96224 isolate can be considered possible AvrPm3 genes. For each of the four categories, we defined a series of criteria, each of which describes specific features of an expected Avr. For example, in the category "1.2 Protein structure" we defined 10 criteria (codes 121-130 in Supplementary Data 1) assessing the features of each mildew effector in terms of presence of a signal peptide, the number of cysteines, and the size of the native peptide (i.e. including the signal peptide). Each criterion was given a weight so that, for example, a putative effector encoding a protein within the size range of the previously identified AVRs (defined as 115-135 aa) would receive a higher score than those encoding much bigger or much smaller peptides (e.g. < 70 aa or > 300 aa).
To determine the appropriate weight each criterion should receive, we manually tested several scoring schemes and progressively adapted the weights so that the functionally validated Avrs would score among the top 20 best [...]

All candidates were manually re-annotated, and a subset of 16 effectors was subjected to molecular validation of mRNA structure by RACE-PCR (Supplementary Data 4). We excluded members of the AvrPm3a2/f2 family that had already been tested at the time we designed the assay 1,6. Subsequently, the top 100 candidates were codon optimized for expression in N. benthamiana and cloned by gene synthesis (Supplementary Data 3).

This approach led to the identification of AvrPm3b2/c2 and AvrPm3d3, thus demonstrating that effector benchmarking is indeed a rapid and effective alternative to Avr identification by classical map-based cloning or GWAS. However, while effector benchmarking has several advantages compared to GWAS and map-based cloning, one major limitation is that it can only be used if the candidate genes have well-defined features. Also, the effectiveness of effector benchmarking is highly dependent on the quality of the genome annotation, since it is based on the comparison of well-annotated effectors. We therefore propose that this approach is complementary to, and builds on, classical genetics approaches, and that it can be adapted to other plant pathogenic fungi based on specific features of avirulence genes in those systems.
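The weighted scoring idea behind effector benchmarking can be illustrated with a minimal Python sketch for the "1.2 Protein structure" category. Only the 115-135 aa size range and the < 70 aa / > 300 aa extremes come from the text; the class `Effector`, the function names, the numeric weights, and the cysteine threshold are hypothetical placeholders, not the values listed in Supplementary Data 1.

```python
from dataclasses import dataclass


@dataclass
class Effector:
    name: str
    has_signal_peptide: bool
    n_cysteines: int
    length_aa: int  # native peptide, including the signal peptide


def size_score(length_aa: int) -> int:
    """Score peptide size; the weights are illustrative assumptions."""
    if 115 <= length_aa <= 135:  # size range of previously identified AVRs
        return 3
    if length_aa < 70 or length_aa > 300:  # much smaller or much bigger
        return 0
    return 1


def structure_score(e: Effector) -> int:
    """Sum the weighted criteria of the protein-structure category."""
    score = 2 if e.has_signal_peptide else 0  # hypothetical weight
    score += 1 if e.n_cysteines >= 2 else 0   # hypothetical threshold and weight
    score += size_score(e.length_aa)
    return score


def rank_candidates(effectors):
    """Return candidates sorted from highest to lowest score."""
    return sorted(effectors, key=structure_score, reverse=True)


def validated_in_top_n(ranked, validated_names, n=20):
    """Check whether all validated Avrs fall within the top n candidates."""
    top = {e.name for e in ranked[:n]}
    return set(validated_names) <= top
```

In such a sketch, weight tuning amounts to adjusting the constants until `validated_in_top_n` returns true for the known Avrs, which mirrors the manual adaptation described above. A full scheme would add the other three categories in the same way and sum their scores.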

Supplementary Note 2. Annotation of the AvrPm3b2/c2 genetic locus.
The position of the AvrPm3b2 GWAS peak was located within the genetic interval previously identified as the genetic Locus_3, which controls specificity towards Pm3b and Pm3c 1. In an initial effort to map the AvrPm3c gene in a genetic cross between the mildew isolates Bgt_96224 and Bgt_JIW2 segregating for Pm3c 9, two flanking markers, M049LE and ctg118_21, were identified (Supplementary Figure 2). Here we took advantage of the Bacterial Artificial Chromosome (BAC) clone library that was assembled for the reference isolate Bgt_96224 as another source for uncovering the full sequence of Locus_3 10. The BAC clones were previously assembled into Finger Printed Contigs (FPC), thus allowing the identification of 6 overlapping BAC clones covering the physical region defined by Locus_3 (Supplementary Figure 2). We used the same approach previously described by Bourras and colleagues 1 to validate the physical overlap between the BACs, which resulted in the selection of five clones (7i16, 28j03, 7p01, 29k04, and 4k17) for sequencing (Supplementary [...]

We combined different resources to thoroughly annotate this genetically complex locus as follows: (i) we used the high-quality PacBio sequence annotation of the locus, derived from the reference isolate Bgt_96224 11, (ii) we assembled the sequences of the 5 BAC clones from the same Bgt_96224 reference isolate, spanning the whole region covering the flanking genetic markers (Supplementary Figure 2), (iii) we used RNA-Seq data from the Bgt_96224 reference (avirulent on Pm3b and Pm3c) to manually curate and thoroughly annotate genes and transposable elements, and (iv) we used RNA-Seq and genome re-sequencing data from the Bgt_94202 isolate (virulent on Pm3b and Pm3c) to identify sequence polymorphisms, locus rearrangements, and differential expression patterns that can be associated with the phenotype.
This has resulted in a very high-quality sequence annotation of Locus_3, including the identification of novel effector sequences.

Supplementary Note 3. Epitope tagging of AVR and SVR proteins.
HA and FLAG epitope tags were added N- and C-terminally to the mature peptide encoded by AvrPm3a2/f2, AvrPm3b2/c2, AvrPm3d3 and SvrPm3a1/f1 using site-directed mutagenesis (SDM). All constructs were recombined into the pIPKb004 expression vector and mobilized by electroporation into the Agrobacterium tumefaciens strain GV3101 as previously described 1,12. Protein detection assays upon transient expression in [...] (Fig. 2a-c), and detectable in N- and C-terminal fusions (SVRPM3A1/F1) (Fig. 3d). For AVRPM3D3 no protein could be detected independently of tag position or sequence, despite several attempts optimizing the western blotting procedure, using different ODs of Agrobacteria (0.5-1.5) and [...]

Compared to other well-described allelic series of resistance genes such as RPP13, the L and the Mla series from Arabidopsis, flax and barley, respectively, the Pm3 alleles stand out with their high level of similarity (>97%) at the protein level 13-16. An extreme case is exemplified by PM3D and PM3E, which differ by only two amino acids in the LRR domain but recognize distinctly different spectra of mildew races 17. Furthermore, PM3D and PM3E differ from the PM3CS susceptible allele by only 3 and 2 residues in the LRR domain, respectively, yet they are among the strongest alleles in the field 17,18. Interestingly, neither AVRPM3D3 nor its recognized homologues from B.g. secalis or B.g. dactylidis are recognized by PM3E or PM3CS. Similarly, in the fungus, the duplicated paralog of AvrPm3d3 (BgtE-20069a) found in the genome of Bgt_96224 and Bgt_94202 encodes a protein that differs from the active AVR by only 2 and 3 amino acid polymorphisms, respectively.
Taken together, these observations indicate that specificity of AVR recognition by the PM3 variants is highly sensitive to single amino acid changes on both sides of the interaction.

Evidence of inter-allelic suppression among the Pm3 variants was initially reported from several genetic crosses between near-isogenic Pm3 lines 12. In one case this observation was molecularly validated in transient co-expression assays in N. benthamiana 1,12, where it was shown that Pm3b was able to suppress the HR induced by an auto-active variant of Pm3f 12, and also the HR induced by the natural Pm3a and Pm3f alleles upon recognition of AvrPm3a2/f2 1. Here, we tested the suppression spectra of all Pm3 alleles and the Pm3CS ancestral sequence in the presence of the newly identified AvrPm3 genes (AvrPm3b2/c2 and AvrPm3d3), taking full advantage of our improved experimental setup (i.e. use of codon-optimized Avr constructs and HR visualization using the HSR imaging technology).

We found that the active AvrPm3a2/f2 allele encoded by Bgt_96224 and Bgt_JIW2 (Supplementary Figure 15a), the active AvrPm3b2/c2 allele encoded by all three isolates (Supplementary Figure 15b), as well as the active [...]

Based on the functional data from the genetic diversity screens suggesting the presence of specific domains involved in R protein recognition, we wanted to study a possible structural basis for the specificity of the AVRPM3B2/C2-PM3B and AVRPM3D3-PM3D interactions. Therefore, we designed domain swaps between AVRPM3B2/C2 and AVRPM3D3 and the most closely related member of their effector families, BGT-51460 and BGTE-5883, respectively. We did so by exchanging regions of approximately 20 amino acids, flanked by conserved residues, between the AVR and its non-recognized partner (Fig. 5a, Supplementary Figure 21a). Care was taken that regions with high levels of natural diversity (Fig. 5b, Supplementary Figure 21b) would be located entirely within one exchanged segment. We postulate that structural conservation among close family members will allow the replacement of protein subdomains while preserving AVR protein structure. This should reveal which parts of the protein are involved in recognition.
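The logic of the domain-swap design can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the two proteins are treated as gap-free aligned strings of equal length, the sequences are toy placeholders (not the real AVRPM3B2/C2 or BGT-51460 sequences), and the function names and segment coordinates are hypothetical.

```python
def swap_segment(recipient: str, donor: str, start: int, end: int) -> str:
    """Replace residues [start, end) of the recipient with the donor's
    residues at the same aligned positions."""
    if len(recipient) != len(donor):
        raise ValueError("sequences must be aligned to the same length")
    if not 0 <= start < end <= len(recipient):
        raise ValueError("segment boundaries out of range")
    return recipient[:start] + donor[start:end] + recipient[end:]


def swap_segments(recipient: str, donor: str, segments) -> str:
    """Exchange several non-overlapping segments at once."""
    chimera = recipient
    for start, end in segments:
        chimera = swap_segment(chimera, donor, start, end)
    return chimera


# Toy aligned sequences: 'A' marks AVR residues, 'B' marks homologue residues.
avr = "A" * 12
homologue = "B" * 12

# Hypothetical segment boundaries; in the real design each segment is about
# 20 amino acids long and flanked by conserved residues.
segments = {"a": (0, 3), "b": (3, 6), "c": (6, 9), "d": (9, 12)}

# AVR segments introduced into the homologue, and the reciprocal constructs.
into_homologue = {k: swap_segment(homologue, avr, *v) for k, v in segments.items()}
into_avr = {k: swap_segment(avr, homologue, *v) for k, v in segments.items()}

# A combined construct exchanging two segments simultaneously.
double_swap = swap_segments(homologue, avr, [segments["a"], segments["c"]])
```

Working on aligned positions keeps each exchanged block in register with its counterpart, which is the property the design relies on when it assumes structural conservation between family members.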

We designed eight swaps between AVRPM3B2/C2 and BGT-51460, all of which were codon optimized for N. benthamiana, cloned without signal peptide by gene synthesis and tested for recognition by PM3B and PM3C. In constructs #1-4 we replaced a defined region of BGT-51460 by its counterpart from AVRPM3B2/C2, whereas for constructs #5-8 the opposite strategy was used (Fig. 5b). Interestingly, none of the defined regions of AVRPM3B2/C2 introduced into BGT-51460 conferred recognition by PM3B or PM3C on its own (swaps #1-4, [...] Fig. 5b). In contrast, one region of AVRPM3B2/C2 (segment 'b') with low genetic diversity among natural isolates could be replaced without negative impact on R-gene recognition, and resulted in a stronger HR response (swap #6, Fig. 5c). Similarly, replacement of segment 'd' from AVRPM3B2/C2 by its counterpart from BGT-51460 in swap #8 resulted in significantly stronger Avr recognition by Pm3b and Pm3c (swap #8, Fig. 5d). Finally, while individual replacement of segments 'a' and 'c' from BGT-51460 with their counterparts from AVRPM3B2/C2 had no impact on recognition (swaps #1 and #3), a stronger HR was observed with swap #9, where these segments were simultaneously exchanged (Fig. 5e). These data demonstrate that regions 'a' and 'c' are necessary and sufficient to confer AVR function, and that together with regions 'b' and 'd' they can additionally affect the strength of NLR-AVR recognition. Taken together, these findings imply that AVRPM3B2/C2 recognition is dependent on two regions that correspond to sequences previously defined by the natural diversity screens, plus two regions possibly corresponding to a structurally conserved region in the AVRPM3B2/C2 family.

[...] additional ones in which only the N- and C-terminal ends of AVRPM3D3, consisting of two and six polymorphic residues, were replaced by sequences from BGTE-5883 and vice versa (#5-6, Supplementary Figure 21a-b).
Similar to the findings from AVRPM3B2/C2, none of the defined subdomains of AVRPM3D3 was sufficient to confer recognition by PM3D on its own (#1-5, Supplementary Figure 21b [...]

To summarize, these experiments indicate structural conservation between AVRPM3B2/C2 and the closest effector family member, since several subdomains can be readily exchanged without loss of recognition. Furthermore, our data imply that multiple protein surface regions are involved in the interaction with the corresponding R proteins and that information from natural diversity screens can be used to define such regions. This further supports the hypothesis that overall protein structure as well as specific contact regions are important for recognition.

[...] AvrPm3b2/c2 and AvrPm3d3 homologues encoded in these non-adapted pathogens are expressed (Supplementary Figure 26b). The ability of the rye isolates to infect the non-host wheat was assessed microscopically at an early stage of the infection (48 hours), where compatible isolates can usually form a haustorium and a few secondary hyphae (hereafter referred to as "microcolony"). We used two different staining methods (see Methods) to distinguish the following phenotypic categories: (i) microcolony formation in the absence of hypersensitive cell death, indicating successful host penetration at an early stage of infection, reminiscent of an infection by an adapted mildew, and (ii) arrest of spore growth in the presence or absence of a detectable hypersensitive cell death, reminiscent of a race-specific resistance response. In agreement with our hypothesis that Pm3b, Pm3c and Pm3d contribute to non-host resistance to non-adapted formae speciales, the rate of microcolony formation of both tested B.g. secalis isolates was significantly (p < 0.05) and consistently reduced on the transgenic and near-isogenic Pm3 wheat lines when compared to the susceptible controls 'Bobwhite' and 'Chancellor' (Figure 6d-e). We conclude that these assays further demonstrate that the Pm3 alleles are potent host-specificity determinants, as predicted by Matsumura and Tosa two decades ago 23.