Populations of European ash trees (Fraxinus excelsior) are being devastated by the invasive alien fungus Hymenoscyphus fraxineus, which causes ash dieback. We sequenced whole genomic DNA from 1,250 ash trees in 31 DNA pools, each pool containing trees with the same ash dieback damage status in a screening trial and from the same seed-source zone. A genome-wide association study identified 3,149 single nucleotide polymorphisms (SNPs) associated with low versus high ash dieback damage. Sixty-one of the 192 most significant SNPs were in, or close to, genes with putative homologues already known to be involved in pathogen responses in other plant species. We also used the pooled sequence data to train a genomic prediction model, cross-validated using individual whole genome sequence data generated for 75 healthy and 75 damaged trees from a single seed source. The model’s genomic estimated breeding values (GEBVs) allocated these 150 trees to their observed health statuses with 67% accuracy using 10,000 SNPs. Using the top 20% of GEBVs from just 200 SNPs, we could predict observed tree health with over 90% accuracy. We infer that ash dieback resistance in F. excelsior is a polygenic trait that should respond well to both natural selection and breeding, which could be accelerated using genomic prediction.
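As a rough illustration of how marker-effect estimates can be turned into GEBVs and then into a health-status allocation, the following is a minimal Python sketch. It is not the gppool pipeline or the rrBLUP model itself: the effect sizes, the -1/0/1 genotype coding, the zero threshold and all function names are hypothetical choices made only for this example.

```python
from typing import List, Sequence


def gebv(genotype: Sequence[int], effects: Sequence[float]) -> float:
    """Return the breeding value of one tree: sum of dosage x marker effect."""
    if len(genotype) != len(effects):
        raise ValueError("genotype and effects must cover the same SNPs")
    return sum(g * e for g, e in zip(genotype, effects))


def allocate(gebvs: Sequence[float], threshold: float = 0.0) -> List[str]:
    """Call a tree 'healthy' when its GEBV exceeds the threshold, else 'damaged'."""
    return ["healthy" if v > threshold else "damaged" for v in gebvs]


def accuracy(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Fraction of trees whose predicted status matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


# Hypothetical toy data: 3 SNPs, 4 trees, genotypes coded -1/0/1 (assumption).
effects = [0.5, -0.2, 0.1]
trees = [[1, -1, 0], [-1, 1, 0], [0, 0, 1], [-1, 0, -1]]
observed = ["healthy", "damaged", "damaged", "damaged"]

scores = [gebv(t, effects) for t in trees]
calls = allocate(scores)
print(calls, accuracy(calls, observed))
```

In a real analysis the effect sizes would come from a model trained on the pool-seq data, and the threshold would need to be chosen with care; the fixed value of zero here is only a placeholder.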
The gppool pipeline, developed as part of this project to run genomic prediction (GP) trained on pool-seq data, can be found at https://github.research.its.qmul.ac.uk/btx330/gppool. All software used (Trimmomatic v.0.38, BWA MEM v.0.7.17, SAMtools v.1.9, BCFtools v.1.8, VCFtools v.0.1.15, PoPoolation2, R v.3.5.3, RepeatMasker v.4.0.5, Bowtie v.2.3.0, BlobTools v.1.1, SnpEff v.4.3T, Haploview, rrBLUP v.4.6, NCBI BLAST, RaptorX-Binding, qvalue v.2.16.0, SWISS-MODEL, Phyre2, SMILES, AutoDock Vina v.1.1.2, PyRx v.0.8, PyMOL v.2.0, DRONA, SignalP 4.1 server, Phobius server and NetPhos 3.1 Server) is commercially or freely available.
All trimmed reads are available at the European Nucleotide Archive under primary accession number PRJEB31096. A guide to these is given in Supplementary Table 7b. The reference F. excelsior genome is available for download at www.ashgenome.org and is Assembly GCA_900149125.1 at the European Nucleotide Archive. Biological materials from the Forest Research Mass Screening trials are available through negotiation of a Materials Transfer Agreement with Forest Research, Northern Research Station, Roslin, Midlothian EH25 9SY.
This study was supported by Forest Research, Queen Mary University of London and the Royal Botanic Gardens, Kew. J.J.S. was funded by a Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) studentship 202790/2014-2 and was part of the Brazilian Scientific Mobility Program—Science without Borders (SwB). S.J.L. and R.J.A.B. were partly funded by Living with Environmental Change (LWEC) Tree Health and Plant Biosecurity Initiative—Phase 2 grant BB/L012162/1, funded jointly by the Biotechnology and Biological Sciences Research Council, the Department for Environment, Food and Rural Affairs (Defra), the Economic and Social Research Council, the Forestry Commission, the Natural Environment Research Council and the Scottish Government. R.J.A.B. and L.J.K. were also supported in this work by funding from the Defra Future Proofing Plant Health scheme and the Erica Waltraud Albrecht Endowment Fund. Sequencing was paid for by a direct grant from Defra to the Royal Botanic Gardens, Kew. W.J.P. was supported by a Walsh Fellowship from the Department of Agriculture, Food and the Marine, Ireland. C.L.M. was supported by a studentship funded by Defra. Forest Research designed and set up the field trials with funding supplied by Defra, contract number TH032, ‘Rapid screening for Chalara resistance using ash trees currently in commercial nurseries’, with additional financial contribution from the Department of Agriculture, Food and the Marine, Ireland. The ash trees were all British-grown and sourced from various participating nurseries in England and Scotland. Maelor Forest Nurseries donated free of charge around half the total number of trees planted.
The authors declare no competing interests.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Sampling and pooling strategies and dependencies of analyses for the genome-wide association study and genomic prediction.
Extended Data Fig. 2 Circle plot of major allele frequency correlation values between all 31 pools in the Pool-seq dataset.
Numbers after the seed source code correspond to health status (1, healthy; 2, damaged by ADB). Pool NSZ204:1 (with low ADB damage) was technically replicated (NSZ204:1R) using the same set of trees. The pools from NSZ106 and NSZ107 were biologically replicated for both high- and low-damage pools, using different sets of trees. High correlations can be seen for both the technical replicate (NSZ204:1R) and the biological replicates (NSZ106 and NSZ107).
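The pairwise comparison behind such a plot can be sketched as follows. This is a minimal example assuming a Pearson correlation coefficient, which the caption does not specify; the allele frequencies are invented and the function names are hypothetical. Only the pool labels are taken from the caption.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long frequency vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(
    pools: Dict[str, Sequence[float]]
) -> Dict[Tuple[str, str], float]:
    """Correlation for every ordered pair of pools, keyed by (pool_a, pool_b)."""
    names = list(pools)
    return {(a, b): pearson(pools[a], pools[b]) for a in names for b in names}


# Invented major allele frequencies at five SNPs (illustration only).
pools = {
    "NSZ204:1": [0.90, 0.75, 0.60, 0.85, 0.55],
    "NSZ204:1R": [0.91, 0.74, 0.62, 0.84, 0.56],
    "NSZ204:2": [0.80, 0.70, 0.70, 0.65, 0.75],
}

for pair, r in correlation_matrix(pools).items():
    print(pair, round(r, 3))
```

With real data, each vector would hold the major allele frequency of one pool at the same ordered set of SNPs, so a technical replicate is expected to correlate very highly with its original pool.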
Extended Data Fig. 3 Blobtools plot showing the taxonomic affiliation of contigs at the phylum rank level, distributed according to GC content and base coverage. Contigs that were not classified as Streptophyta corresponded to 0.5% of the genome assembly and 0.24% of all mapped reads.
Extended Data Fig. 4 Pool-seq GWAS p-value density histogram with line plots of the q-values and local False Discovery Rate (FDR) values versus p-values.
The π0 estimate is also displayed.
Extended Data Fig. 5 Predicted protein structures for genes containing amino acid changes associated with tree health status under ADB pressure.
The protein structures to the left were more common in damaged trees, and those to the right were more common in healthy trees. Variant amino acids are coloured in magenta and indicated with a black arrowhead. (a) Gene FRAEX38873_v2_000003260, a BED finger-NBS-LRR resistance protein, where position 157 is a leucine (left) versus tryptophan (right) variant. Two ATP molecules are shown in orange to indicate the location of nucleotide binding sites. (b) Gene FRAEX38873_v2_000164520, an F-box/kelch-repeat protein, where position 13 is a glutamine (left) versus arginine (right) variant. (c) Gene FRAEX38873_v2_000180950, a Protein DAMAGED DNA-BINDING, where position 99 is a proline (left) versus leucine (right) variant. DNA molecules are shown in orange docked at the proteins’ DNA binding sites. (d) Gene FRAEX38873_v2_000116110, a 60S ribosomal protein L4-1, where position 251 is an arginine (left) versus glycine (right) variant, position 285 is a methionine (left) versus arginine (right) variant, position 287 is an asparagine (left) versus lysine (right) variant and position 297 is a threonine (left) versus alanine (right) variant.
Extended Data Fig. 6 Genomic prediction results using the 150 individually genotyped samples (Dataset B) as both training and testing set, showing little difference between GWAS SNPs and random SNPs in correlations between GEBVs and health statuses.
(A) GEBV-health status correlation using GWAS candidate SNPs with all data filters applied (mapping quality, indel and repeat removal); (B) GEBV-health status correlation using GWAS candidate SNPs filtered only by mapping quality and indel removal; (C) GEBV-health status correlation using a random selection of SNPs and all data filters (mean and standard error shown for N=10 runs, each of 500 iterations); (D) GP allocation accuracy using GWAS candidate SNPs with all data filters applied. The scale on the left-hand vertical axis is for correlation, and the scale on the right-hand vertical axis is for accuracy. Between 100 and 5 million SNPs were used to train and test the rrBLUP model.
Extended Data Fig. 7 Genomic prediction using Pool-seq data for training and 150 NSZ 204 individuals for testing.
Dashed lines show results excluding Pool-seq data from NSZ 204 (the test seed source) from the training dataset, whereas solid lines show results with NSZ 204 included. The left column shows correlation of observed phenotype and GEBV and the right column shows accuracy of phenotypic assignment from GEBV.
Supplementary Tables 1–6.
a, Contigs that may be contamination from other organisms in the Fraxinus excelsior BATG0.5 genome as identified by Blobtools. b, Guide to read data on ENA. c, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 100 SNPs from the pool-seq GWAS. d, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 200 SNPs from the pool-seq GWAS. e, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 500 SNPs from the pool-seq GWAS. f, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 1,000 SNPs from the pool-seq GWAS. g, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 5,000 SNPs from the pool-seq GWAS. h, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 10,000 SNPs from the pool-seq GWAS. i, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 25,000 SNPs from the pool-seq GWAS. j, Estimated effect sizes from genomic prediction model trained on the pool-seq data using the top 50,000 SNPs from the pool-seq GWAS.
Cite this article
Stocks, J.J., Metheringham, C.L., Plumb, W.J. et al. Genomic basis of European ash tree resistance to ash dieback fungus. Nat Ecol Evol 3, 1686–1696 (2019) doi:10.1038/s41559-019-1036-6