Abstract

Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. This work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Change history

  • Update 05 April 2018

    In the version of this article initially published, owing to technical errors during production Supplementary Tables 2–26 were linked to the incorrect legends, and replacement files posted were corrupted. The errors have been corrected in the HTML version of the paper.

References

  1. Ley, R. E. et al. Evolution of mammals and their gut microbes. Science 320, 1647–1651 (2008).

  2. Baumann, P. Biology of bacteriocyte-associated endosymbionts of plant sap-sucking insects. Annu. Rev. Microbiol. 59, 155–189 (2005).

  3. Sprent, J. I. 60Ma of legume nodulation. What’s new? What’s changing? J. Exp. Bot. 59, 1081–1084 (2008).

  4. Pfeilmeier, S., Caly, D. L. & Malone, J. G. Bacterial pathogenesis of plants: future challenges from a microbial perspective: Challenges in Bacterial Molecular Plant Pathology. Mol. Plant Pathol. 17, 1298–1313 (2016).

  5. Chowdhury, S. P., Hartmann, A., Gao, X. & Borriss, R. Biocontrol mechanism by root-associated Bacillus amyloliquefaciens FZB42—a review. Front. Microbiol. 6, 780 (2015).

  6. Fibach-Paldi, S., Burdman, S. & Okon, Y. Key physiological properties contributing to rhizosphere adaptation and plant growth promotion abilities of Azospirillum brasilense. FEMS Microbiol. Lett. 326, 99–108 (2012).

  7. Santhanam, R. et al. Native root-associated bacteria rescue a plant from a sudden-wilt disease that emerged during continuous cropping. Proc. Natl. Acad. Sci. USA 112, E5013–E5020 (2015).

  8. Peters, N. K., Frost, J. W. & Long, S. R. A plant flavone, luteolin, induces expression of Rhizobium meliloti nodulation genes. Science 233, 977–980 (1986).

  9. Hiei, Y., Ohta, S., Komari, T. & Kumashiro, T. Efficient transformation of rice (Oryza sativa L.) mediated by Agrobacterium and sequence analysis of the boundaries of the T-DNA. Plant J. 6, 271–282 (1994).

  10. Hueck, C. J. Type III protein secretion systems in bacterial pathogens of animals and plants. Microbiol. Mol. Biol. Rev. 62, 379–433 (1998).

  11. Bulgarelli, D. et al. Revealing structure and assembly cues for Arabidopsis root-inhabiting bacterial microbiota. Nature 488, 91–95 (2012).

  12. Lundberg, D. S. et al. Defining the core Arabidopsis thaliana root microbiome. Nature 488, 86–90 (2012).

  13. Bulgarelli, D., Schlaeppi, K., Spaepen, S., Ver Loren van Themaat, E. & Schulze-Lefert, P. Structure and functions of the bacterial microbiota of plants. Annu. Rev. Plant Biol. 64, 807–838 (2013).

  14. Ofek-Lalzar, M. et al. Niche and host-associated functional signatures of the root surface microbiome. Nat. Commun. 5, 4950 (2014).

  15. Gottel, N. R. et al. Distinct microbial communities within the endosphere and rhizosphere of Populus deltoides roots across contrasting soil types. Appl. Environ. Microbiol. 77, 5934–5944 (2011).

  16. Bai, Y. et al. Functional overlap of the Arabidopsis leaf and root microbiota. Nature 528, 364–369 (2015).

  17. Hardoim, P. R. et al. The hidden world within plants: ecological and evolutionary considerations for defining functioning of microbial endophytes. Microbiol. Mol. Biol. Rev. 79, 293–320 (2015).

  18. Bulgarelli, D. et al. Structure and function of the bacterial root microbiota in wild and domesticated barley. Cell Host Microbe 17, 392–403 (2015).

  19. Hacquard, S. et al. Microbiota and host nutrition across plant and animal kingdoms. Cell Host Microbe 17, 603–616 (2015).

  20. Tatusov, R. L., Galperin, M. Y., Natale, D. A. & Koonin, E. V. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36 (2000).

  21. Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M. & Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, D457–D462 (2016).

  22. Haft, D. H., Selengut, J. D. & White, O. The TIGRFAMs database of protein families. Nucleic Acids Res. 31, 371–373 (2003).

  23. Huntemann, M. et al. The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand. Genomic Sci. 10, 86 (2015).

  24. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).

  25. Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285 (2016).

  26. Ives, A. R. & Garland, T. Jr. Phylogenetic logistic regression for binary dependent variables. Syst. Biol. 59, 9–26 (2010).

  27. Brynildsrud, O., Bohlin, J., Scheffer, L. & Eldholm, V. Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary. Genome Biol. 17, 238 (2016).

  28. Hultman, J. et al. Multi-omics of permafrost, active layer and thermokarst bog soil microbiomes. Nature 521, 208–212 (2015).

  29. Louca, S. et al. Integrating biogeochemistry with multiomic sequence information in a model oxygen minimum zone. Proc. Natl. Acad. Sci. USA 113, E5925–E5933 (2016).

  30. Coutinho, B. G., Licastro, D., Mendonça-Previato, L., Cámara, M. & Venturi, V. Plant-influenced gene expression in the rice endophyte Burkholderia kururiensis M130. Mol. Plant Microbe Interact. 28, 10–21 (2015).

  31. Long, S. R. Rhizobium-legume nodulation: life together in the underground. Cell 56, 203–214 (1989).

  32. Ruvkun, G. B., Sundaresan, V. & Ausubel, F. M. Directed transposon Tn5 mutagenesis and complementation analysis of Rhizobium meliloti symbiotic nitrogen fixation genes. Cell 29, 551–559 (1982).

  33. Hershey, D. M., Lu, X., Zi, J. & Peters, R. J. Functional conservation of the capacity for ent-kaurene biosynthesis and an associated operon in certain rhizobia. J. Bacteriol. 196, 100–106 (2014).

  34. Nett, R. S. et al. Elucidation of gibberellin biosynthesis in bacteria reveals convergent evolution. Nat. Chem. Biol. 13, 69–74 (2017).

  35. Scharf, B. E., Hynes, M. F. & Alexandre, G. M. Chemotaxis signaling systems in model beneficial plant-bacteria associations. Plant Mol. Biol. 90, 549–559 (2016).

  36. Büttner, D. & He, S. Y. Type III protein secretion in plant pathogenic bacteria. Plant Physiol. 150, 1656–1664 (2009).

  37. Gao, R. et al. Genome-wide RNA sequencing analysis of quorum sensing-controlled regulons in the plant-associated Burkholderia glumae PG1 strain. Appl. Environ. Microbiol. 81, 7993–8007 (2015).

  38. Weller-Stuart, T., Toth, I., De Maayer, P. & Coutinho, T. Swimming and twitching motility are essential for attachment and virulence of Pantoea ananatis in onion seedlings. Mol. Plant Pathol. 18, 734–745 (2017).

  39. De Weger, L. A. et al. Flagella of a plant-growth-stimulating Pseudomonas fluorescens strain are required for colonization of potato roots. J. Bacteriol. 169, 2769–2773 (1987).

  40. de Weert, S. et al. Flagella-driven chemotaxis towards exudate components is an important trait for tomato root colonization by Pseudomonas fluorescens. Mol. Plant Microbe Interact. 15, 1173–1180 (2002).

  41. Ravcheev, D. A. et al. Comparative genomics and evolution of regulons of the LacI-family transcription factors. Front. Microbiol. 5, 294 (2014).

  42. Yamauchi, Y., Hasegawa, A., Taninaka, A., Mizutani, M. & Sugimoto, Y. NADPH-dependent reductases involved in the detoxification of reactive carbonyls in plants. J. Biol. Chem. 286, 6999–7009 (2011).

  43. Burstein, D. et al. Genome-scale identification of Legionella pneumophila effectors using a machine learning approach. PLoS Pathog. 5, e1000508 (2009).

  44. Dean, P. Functional domains and motifs of bacterial type III effector proteins and their roles in infection. FEMS Microbiol. Rev. 35, 1100–1125 (2011).

  45. Stebbins, C. E. & Galán, J. E. Structural mimicry in bacterial virulence. Nature 412, 701–705 (2001).

  46. Price, C. T. et al. Molecular mimicry by an F-box effector of Legionella pneumophila hijacks a conserved polyubiquitination machinery within macrophages and protozoa. PLoS Pathog. 5, e1000704 (2009).

  47. Rothmeier, E. et al. Activation of Ran GTPase by a Legionella effector promotes microtubule polymerization, pathogen vacuole motility and infection. PLoS Pathog. 9, e1003598 (2013).

  48. Xu, R.-Q. et al. AvrAC(Xcc8004), a type III effector with a leucine-rich repeat domain from Xanthomonas campestris pathovar campestris confers avirulence in vascular tissues of Arabidopsis thaliana ecotype Col-0. J. Bacteriol. 190, 343–355 (2008).

  49. Shevchik, V. E., Robert-Baudouy, J. & Hugouvieux-Cotte-Pattat, N. Pectate lyase PelI of Erwinia chrysanthemi 3937 belongs to a new family. J. Bacteriol. 179, 7321–7330 (1997).

  50. Cesari, S., Bernoux, M., Moncuquet, P., Kroj, T. & Dodds, P. N. A novel conserved mechanism for plant NLR protein pairs: the “integrated decoy” hypothesis. Front. Plant Sci. 5, 606 (2014).

  51. Sarris, P. F. et al. A plant immune receptor detects pathogen effectors that target WRKY transcription factors. Cell 161, 1089–1100 (2015).

  52. Sarris, P. F., Cevik, V., Dagdas, G., Jones, J. D. & Krasileva, K. V. Comparative analysis of plant immune receptor architectures uncovers host proteins likely targeted by pathogens. BMC Biol. 14, 8 (2016).

  53. Le Roux, C. et al. A receptor pair with an integrated decoy converts pathogen disabling of transcription factors to immunity. Cell 161, 1074–1088 (2015).

  54. Brown, G. D. & Netea, M. G. (eds.). Immunology of Fungal Infections. (Springer, Dordrecht, The Netherlands, 2007).

  55. Gadjeva, M., Takahashi, K. & Thiel, S. Mannan-binding lectin—a soluble pattern recognition molecule. Mol. Immunol. 41, 113–121 (2004).

  56. Ma, Q.-H., Tian, B. & Li, Y.-L. Overexpression of a wheat jasmonate-regulated lectin increases pathogen resistance. Biochimie 92, 187–193 (2010).

  57. Xiang, Y. et al. A jacalin-related lectin-like gene in wheat is a component of the plant defence system. J. Exp. Bot. 62, 5471–5483 (2011).

  58. Yamaji, Y. et al. Lectin-mediated resistance impairs plant virus infection at the cellular level. Plant Cell 24, 778–793 (2012).

  59. Weidenbach, D. et al. Polarized defense against fungal pathogens is mediated by the Jacalin-related lectin domain of modular Poaceae-specific proteins. Mol. Plant 9, 514–527 (2016).

  60. Sahly, H. et al. Surfactant protein D binds selectively to Klebsiella pneumoniae lipopolysaccharides containing mannose-rich O-antigens. J. Immunol. 169, 3267–3274 (2002).

  61. Osborn, M. J., Rosen, S. M., Rothfield, L., Zeleznick, L. D. & Horecker, B. L. Lipopolysaccharide of the gram-negative cell wall. Science 145, 783–789 (1964).

  62. Tans-Kersten, J., Huang, H. & Allen, C. Ralstonia solanacearum needs motility for invasive virulence on tomato. J. Bacteriol. 183, 3597–3605 (2001).

  63. Cole, B. J. et al. Genome-wide identification of bacterial plant colonization genes. PLoS Biol. 15, e2002860 (2017).

  64. Poggio, S. et al. A complete set of flagellar genes acquired by horizontal transfer coexists with the endogenous flagellar system in Rhodobacter sphaeroides. J. Bacteriol. 189, 3208–3216 (2007).

  65. Ho, B. T., Dong, T. G. & Mekalanos, J. J. A view to a kill: the bacterial type VI secretion system. Cell Host Microbe 15, 9–21 (2014).

  66. MacIntyre, D. L., Miyata, S. T., Kitaoka, M. & Pukatzki, S. The Vibrio cholerae type VI secretion system displays antimicrobial properties. Proc. Natl. Acad. Sci. USA 107, 19520–19524 (2010).

  67. Tian, Y. et al. The type VI protein secretion system contributes to biofilm formation and seed-to-seedling transmission of Acidovorax citrulli on melon. Mol. Plant Pathol. 16, 38–47 (2015).

  68. Peiffer, J. A. et al. Diversity and heritability of the maize rhizosphere microbiome under field conditions. Proc. Natl. Acad. Sci. USA 110, 6548–6553 (2013).

  69. Agler, M. T. et al. Microbial hub taxa link host and abiotic factors to plant microbiome variation. PLoS Biol. 14, e1002352 (2016).

  70. Bokulich, N. A., Thorngate, J. H., Richardson, P. M. & Mills, D. A. Microbial biogeography of wine grapes is conditioned by cultivar, vintage, and climate. Proc. Natl. Acad. Sci. USA 111, E139–E148 (2014).

  71. Coleman-Derr, D. et al. Plant compartment and biogeography affect microbiome composition in cultivated and native Agave species. New Phytol. 209, 798–811 (2016).

  72. Shade, A., McManus, P. S. & Handelsman, J. Unexpected diversity during community succession in the apple flower microbiome. MBio 4, e00602–e00612 (2013).

  73. Turner, T. R. et al. Comparative metatranscriptomics reveals kingdom level changes in the rhizosphere microbiome of plants. ISME J. 7, 2248–2258 (2013).

  74. Edwards, J. et al. Structure, variation, and assembly of the root-associated microbiomes of rice. Proc. Natl. Acad. Sci. USA 112, E911–E920 (2015).

  75. Kroj, T., Chanclud, E., Michel-Romiti, C., Grand, X. & Morel, J.-B. Integration of decoy domains derived from protein targets of pathogen effectors into plant immune receptors is widespread. New Phytol. 210, 618–626 (2016).

  76. Mukhtar, M. S. et al. Independently evolved virulence effectors converge onto hubs in a plant immune system network. Science 333, 596–601 (2011).

  77. Vimr, E. & Lichtensteiger, C. To sialylate, or not to sialylate: that is the question. Trends Microbiol. 10, 254–257 (2002).

  78. de Jonge, R. et al. Conserved fungal LysM effector Ecp6 prevents chitin-triggered immunity in plants. Science 329, 953–955 (2010).

  79. Doty, S. L. et al. Diazotrophic endophytes of native black cottonwood and willow. Symbiosis 47, 23–33 (2009).

  80. Weston, D. J. et al. Pseudomonas fluorescens induces strain-dependent and strain-independent host plant responses in defense networks, primary metabolism, photosynthesis, and fitness. Mol. Plant Microbe Interact. 25, 765–778 (2012).

  81. Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).

  82. Beszteri, B., Temperton, B., Frickenhaus, S. & Giovannoni, S. J. Average genome size: a potential source of bias in comparative metagenomics. ISME J. 4, 1075–1077 (2010).

  83. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).

  84. Varghese, N. J. et al. Microbial species delineation using whole genome sequences. Nucleic Acids Res. 43, 6761–6771 (2015).

  85. Kerepesi, C., Bánky, D. & Grolmusz, V. AmphoraNet: the webserver implementation of the AMPHORA2 metagenomic workflow suite. Gene 533, 538–540 (2014).

  86. Wu, M., Chatterji, S. & Eisen, J. A. Accounting for alignment uncertainty in phylogenomics. PLoS One 7, e30288 (2012).

  87. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).

  88. Sen, A. et al. Phylogeny of the class Actinobacteria revisited in the light of complete genomes. The orders ‘Frankiales’ and Micrococcales should be split into coherent entities: proposal of Frankiales ord. nov., Geodermatophilales ord. nov., Acidothermales ord. nov. and Nakamurellales ord. nov. Int. J. Syst. Evol. Microbiol. 64, 3821–3832 (2014).

  89. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).

  90. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).

  91. Wang, Z. & Wu, M. A phylum-level bacterial phylogenetic marker database. Mol. Biol. Evol. 30, 1258–1262 (2013).

  92. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 289–300 (1995).

  93. Finn, R. D. et al. HMMER web server: 2015 update. Nucleic Acids Res. 43, W30–W38 (2015).

  94. Alexeyev, M. F. The pKNOCK series of broad-host-range mobilizable suicide vectors for gene knockout and targeted DNA insertion into the chromosome of gram-negative bacteria. Biotechniques 26, 824–826 (1999).

  95. Hadjithomas, M. et al. IMG-ABC: a knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites. MBio 6, e00932 (2015).

  96. Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

  97. Stamatakis, A., Hoover, P. & Rougemont, J. A rapid bootstrap algorithm for the RAxML Web servers. Syst. Biol. 57, 758–771 (2008).

  98. Finkel, O. M., Béjà, O. & Belkin, S. Global abundance of microbial rhodopsins. ISME J. 7, 448–451 (2013).

  99. Traore, S. M. Characterization of Type Three Effector Genes of A. citrulli, the Causal Agent of Bacterial Fruit Blotch of Cucurbits. (Virginia Polytechnic Institute and State University, Blacksburg, VA, 2014).

  100. Basler, M., Ho, B. T. & Mekalanos, J. J. Tit-for-tat: type VI secretion system counterattack during bacterial cell-cell interactions. Cell 152, 884–894 (2013).

Acknowledgements

The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. J.L.D. and S.G.T. were supported by NSF INSPIRE grant IOS-1343020, and J.L.D. was also supported by DOE–USDA Feedstock Award DE-SC001043 and by the Office of Science (BER), US Department of Energy, grant no. DE-SC0014395. S.H.P. was supported by NIH Training Grant T32 GM067553-06 and was a Howard Hughes Medical Institute (HHMI) International Student Research Fellow. D.S.L. was supported by NIH Training Grant T32 GM07092-34. J.L.D. is an Investigator of the HHMI, supported by the HHMI and the Gordon and Betty Moore Foundation (GBMF3030). M.E.F. was supported by NIH Dr. Ruth L. Kirschstein NRSA Fellowship F32-GM112345. D.A.P. and T.-Y.L. were supported by the Genomic Science Program, US Department of Energy, Office of Science, Biological and Environmental Research as part of the Oak Ridge National Laboratory Plant Microbe Interfaces Scientific Focus Area (http://pmi.ornl.gov) and Plant Feedstock Genomics Award DE-SC001043. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. J.A.V. was supported by a SystemsX.ch grant (Micro2X) and a European Research Council (ERC) advanced grant (PhyMo). We thank I. Bertani, C. Bez, R. Bowers, D. Burstein, A. Chun Chen, D. Chiniquy, B. Cole, O. Cohen, A. Copeland, J. Eisen, E. Eloe-Fadrosh, M. Hadjithomas, O. Finkel, H. Schnitzel Meule Fux, N. Ivanova, J. Knelman, R. Malmstrom, R. Perez-Torres, D. Salomon, R. Sorek, T. Mucyn, R. Seshadri, T.K. Reddy, L. Ryan, and H. Sberro Livnat for general help, text editing, and ideas for this work. We thank R. Walcott (University of Georgia, Athens, GA, USA) for providing the Acidovorax citrulli VasD mutant strain.

Author information

Author notes

  1. Asaf Levy and Isai Salas Gonzalez contributed equally to this work.

Affiliations

  1. DOE Joint Genome Institute, Walnut Creek, CA, USA

    • Asaf Levy
    • Scott Clingenpeel
    • Kyra Stillman
    • Bryan Rangel Alvarez
    • Tijana Glavina Rio
    • Susannah G. Tringe
    • Tanja Woyke
  2. Department of Biology, University of North Carolina, Chapel Hill, NC, USA

    • Isai Salas Gonzalez
    • Sur Herrera Paredes
    • Freddy Monteiro
    • Derek S. Lundberg
    • Meredith McDonald
    • Andrew P. Klein
    • Meghan E. Feltcher
    • Sarah R. Grant
    • Jeffery L. Dangl
  3. Howard Hughes Medical Institute, Chevy Chase, MD, USA

    • Isai Salas Gonzalez
    • Sur Herrera Paredes
    • Freddy Monteiro
    • Derek S. Lundberg
    • Meredith McDonald
    • Andrew P. Klein
    • Meghan E. Feltcher
    • Jeffery L. Dangl
  4. Institute of Microbiology, ETH Zurich, Zurich, Switzerland

    • Maximilian Mittelviefhaus
    • Julia A. Vorholt
  5. Department of Horticulture, Virginia Tech, Blacksburg, VA, USA

    • Jiamin Miao
    • Kunru Wang
    • Bingyu Zhao
  6. International Centre for Genetic Engineering and Biotechnology, Trieste, Italy

    • Giulia Devescovi
    • Vittorio Venturi
  7. Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA

    • Tse-Yuan Lu
    • Dale A. Pelletier
  8. Department of Microbiology, University of Tennessee, Knoxville, TN, USA

    • Sarah Lebeis
  9. Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA

    • Zhao Jin
  10. School of Environmental and Forest Sciences, University of Washington, Seattle, WA, USA

    • Sharon L. Doty
  11. Max Planck Institute for Developmental Biology, Tübingen, Germany

    • Ruth E. Ley
  12. School of Natural Sciences, University of California, Merced, Merced, CA, USA

    • Susannah G. Tringe
    • Tanja Woyke
  13. The Carolina Center for Genome Sciences, University of North Carolina, Chapel Hill, NC, USA

    • Jeffery L. Dangl
  14. Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, NC, USA

    • Jeffery L. Dangl
  15. Department of Biology, Stanford University, Stanford, CA, USA

    • Sur Herrera Paredes
  16. The Grassland College, Gansu Agricultural University, Lanzhou, Gansu, China

    • Jiamin Miao
  17. BD Technologies and Innovation, Research Triangle Park, NC, USA

    • Meghan E. Feltcher

Authors

  1. Asaf Levy

  2. Isai Salas Gonzalez

  3. Maximilian Mittelviefhaus

  4. Scott Clingenpeel

  5. Sur Herrera Paredes

  6. Jiamin Miao

  7. Kunru Wang

  8. Giulia Devescovi

  9. Kyra Stillman

  10. Freddy Monteiro

  11. Bryan Rangel Alvarez

  12. Derek S. Lundberg

  13. Tse-Yuan Lu

  14. Sarah Lebeis

  15. Zhao Jin

  16. Meredith McDonald

  17. Andrew P. Klein

  18. Meghan E. Feltcher

  19. Tijana Glavina Rio

  20. Sarah R. Grant

  21. Sharon L. Doty

  22. Ruth E. Ley

  23. Bingyu Zhao

  24. Vittorio Venturi

  25. Dale A. Pelletier

  26. Julia A. Vorholt

  27. Susannah G. Tringe

  28. Tanja Woyke

  29. Jeffery L. Dangl

Contributions

A.L. performed most data analysis and wrote the paper. I.S.G. performed phylogenetic inference, performed phylogenetically aware analyses, analyzed the data, provided the supporting website, and contributed to manuscript writing. M. Mittelviefhaus and J.A.V. designed and performed experiments related to Hyde1 gene function and contributed to manuscript writing. S.C. isolated single bacterial cells and prepared metadata for data analysis. F.M. analyzed data. S.H.P. analyzed data and contributed to manuscript writing. J.M. produced a mutant strain for Hyde1. K.W. tested Hyde1 toxicity in E. coli. G.D. and V.V. produced deletion mutants and designed and performed rice root colonization experiments. K.S. helped in data analysis. B.R.A. prepared metadata for data analysis. D.S.L., T.-Y.L., S.L., Z.J., M. McDonald, A.P.K., M.E.F., and S.L.D. isolated bacteria from different plants or managed this process. T.G.d.R. managed the sequencing project. S.R.G., D.A.P., and R.E.L. managed bacterial isolation efforts and contributed to manuscript writing. B.Z. managed Hyde1 deletion and toxicity testing. S.G.T. contributed to manuscript writing. T.W. managed single-cell isolation efforts and contributed to manuscript writing. J.L.D. directed the overall project and contributed to manuscript writing.

Competing interests

J.L.D. is a cofounder of and shareholder in, and S.H.P. collaborates with, AgBiome LLC, a corporation that aims to use plant-associated microbes to improve plant productivity.

Corresponding authors

Correspondence to Susannah G. Tringe or Tanja Woyke or Jeffery L. Dangl.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–29 and Supplementary Note 1.

  2. Life Sciences Reporting Summary

  3. Supplementary Table 1

    All genomes used. Lists of all genomes used from the nine taxa (pre-filtration). Cells filled with yellow are Brassicaceae root isolates from the USA; cells filled with green are single cells isolated from Arabidopsis thaliana; cells filled with pink are poplar isolates; cells filled with blue are recently published Arabidopsis leaf and root isolates and soil isolates from Europe; cells filled with purple are maize root isolates. The “Filtered out?” column is ‘N’ if the genome was retained for analysis after the QA process. “Representative genome taxid” is the taxon id of another genome (a different row in the same tab) that represents at least two redundant genomes. Completeness and contamination values were calculated with CheckM. The full genome sequence, gene annotation, and metadata of each genome used can be found on the IMG website, https://img.jgi.doe.gov/. For example, the metadata of taxon id 2558860101 can be found at https://img.jgi.doe.gov/cgi-bin/mer/main.cgi?section=TaxonDetail&page=taxonDetail&taxon_oid=2558860101.

  4. Supplementary Table 2

    Statistics of genomes in the taxa used

  5. Supplementary Table 3

    Sequencing and assembly information of new genomes

  6. Supplementary Table 4

    Abundance of the nine taxa in 16S marker gene surveys. The relative abundances of the taxa composing a specific taxon were taken from the different publications and summed to yield the relative abundance of that taxon. In cases with biological replicates (e.g., Lundberg et al., Nature 2012), we used the median value.

  7. Supplementary Table 5

    Genome size comparison. Genome sizes were compared between the different isolation sites by t-test and PhyloGLM. Each cell denotes the group with the largest genomes if the difference is significant (P < 0.05); N.S., not significant. The PhyloGLM test takes into account the phylogenetic structure of the taxon.

  8. Supplementary Table 6

    COG-to-COG category mapping

  9. Supplementary Table 7

    Acinetobacter PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). To be considered a significant cluster in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected). To be considered a significant cluster in OrthoFinder + hypergbin/hypergcn, we required Bonferroni-corrected P < 0.1. To be considered a significant PA/RA cluster in phyloglmbin/phyloglmcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or an estimate < 0 for significant NPA/soil). To be considered a significant PA/RA cluster in Scoary, we required P < 0.05 in three tests (Fisher exact test with Benjamini–Hochberg FDR correction, worst pairing scenario test, and empirical test) and a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  10. Supplementary Table 8

    Actinobacteria1 PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). To be considered a significant cluster in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected). To be considered a significant cluster in OrthoFinder + hypergbin/hypergcn, we required Bonferroni-corrected P < 0.1. To be considered a significant PA/RA cluster in phyloglmbin/phyloglmcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or an estimate < 0 for significant NPA/soil). To be considered a significant PA/RA cluster in Scoary, we required P < 0.05 in three tests (Fisher exact test with Benjamini–Hochberg FDR correction, worst pairing scenario test, and empirical test) and a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  11. Supplementary Table 9

    Actinobacteria2 PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). To be considered a significant cluster in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected). To be considered a significant cluster in OrthoFinder + hypergbin/hypergcn, we required Bonferroni-corrected P < 0.1. To be considered a significant PA/RA cluster in phyloglmbin/phyloglmcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or an estimate < 0 for significant NPA/soil). To be considered a significant PA/RA cluster in Scoary, we required P < 0.05 in three tests (Fisher exact test with Benjamini–Hochberg FDR correction, worst pairing scenario test, and empirical test) and a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  12. Supplementary Table 10

    Alphaproteobacteria PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). To be considered a significant cluster in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected). To be considered a significant cluster in OrthoFinder + hypergbin/hypergcn, we required Bonferroni-corrected P < 0.1. To be considered a significant PA/RA cluster in phyloglmbin/phyloglmcn, we required q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or an estimate < 0 for significant NPA/soil). To be considered a significant PA/RA cluster in Scoary, we required P < 0.05 in three tests (Fisher exact test with Benjamini–Hochberg FDR correction, worst pairing scenario test, and empirical test) and a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  13. Supplementary Table 11

    Bacillales PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 in three tests (Fisher exact test, Benjamini–Hochberg FDR corrected; worst pairing scenario test; and empirical test) together with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  14. Supplementary Table 12

    Bacteroidetes PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 in three tests (Fisher exact test, Benjamini–Hochberg FDR corrected; worst pairing scenario test; and empirical test) together with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  15. Supplementary Table 13

    Burkholderiales PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 in three tests (Fisher exact test, Benjamini–Hochberg FDR corrected; worst pairing scenario test; and empirical test) together with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  16. Supplementary Table 14

    Pseudomonas PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 in three tests (Fisher exact test, Benjamini–Hochberg FDR corrected; worst pairing scenario test; and empirical test) together with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  17. Supplementary Table 15

    Xanthomonadaceae PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini–Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 in three tests (Fisher exact test, Benjamini–Hochberg FDR corrected; worst pairing scenario test; and empirical test) together with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

  18. Supplementary Table 16

    Validation of PA/NPA/RA/soil genes through metagenomes. a. Samples used (n = 38), b. Summary of results based on a two-sided t-test.

  19. Supplementary Table 17

    Validation of PA genes in Paraburkholderia kururiensis M130. a. Mutant used and statistical test results, b. Raw data: cfu/g root, c. Primers used.

  20. Supplementary Table 18

    The number of operons predicted by different approaches.

  21. Supplementary Table 19

    Reproducible PA domains. a. Protein domains that are significantly PA in at least three taxa by at least two tests. NA – test results are not available (untested), NS – non-significant result. b. Fractions for LacI proteins within genomes, c. Fraction of pfam00248 domain within genomes.

  22. Supplementary Table 20

    DNA motifs predicted to be bound by LacI transcription factors. Predicted promoter sequences are intergenic sequences, at least 25 bp long, located upstream of carbohydrate metabolism and transport genes that are found directly adjacent to LacI genes. The most abundant k-mers of different lengths were detected using wordcount (EMBOSS package). The most abundant motifs found in multiple taxa were compared against their distribution in random intergenic sequences using the Fisher exact test.

  23. Supplementary Table 21

    PREPARADOs. Pfam domains that are both significant PA/RA domains (reproducibly found as such in multiple taxa or by multiple approaches) and more abundant in plants than in bacteria according to Pfam (PREPARADOs). Pfams labeled in yellow are carbohydrate-related and are part of proteins found in eukaryotes and bacteria with full-length sequence similarity, having an N-terminal signal peptide, and lacking a transmembrane domain. Cells marked in green are domains that are predicted to be secreted by Sec or T3SS (more than 50% of the bacterial proteins having the domain are predicted to be secreted by these secretion systems).

  24. Supplementary Table 22

    Full-length proteins conserved between PA bacterial genes and eukaryotic genes. LAST alignment results of PREPARADO-containing proteins from bacteria (query) against plant, fungal, oomycete, and protist proteins from RefSeq (target). Only alignments that have over 40% identity and stretch across at least 90% of the query and target length are shown.

  25. Supplementary Table 23

    Jekyll and Hyde. Gene homologs of Jekyll and Hyde proteins based on protein homologs on IMG. To find all homologs and paralogs of Jekyll and Hyde genes (a-d), we used an IMG BLAST search with an E-value threshold of 1e-5 against all IMG isolates, some of which were not included in the original comparative analysis and hence their genes are not part of any cluster. Since Hyde1 proteins are rapidly evolving, they are scattered across multiple OrthoFinder orthogroups. Metadata in a-d was retrieved from the IMG website. a. Jekyll protein homologs of Acidovorax gene Ga0102403_10160, b. Hyde1 protein homologs of Acidovorax protein Aave_1071, c. Hyde1-like protein homologs of Pseudomonas protein A243_06583, d. Hyde2 homologs of Ga0078621_123530, e. Hyde1-like-Hyde2 loci in representative Proteobacteria, one per genus, and their location adjacent to T6SS genes and within genomes that encode T6SS. Hyde2 was found based on a BLAST search against the nr database with Acav_4635 as the query.

  26. Supplementary Table 24

    Divergence of Jekyll gene operon. An analysis of the Jekyll gene cluster that is presented in Figure 6b. Control genes are shown in Figure S26c. The table summarizes a comparison between multiple sequence alignments of the Jekyll locus (Figure S24b) and the control genes (Figure S24c).

  27. Supplementary Table 25

    Toxicity of Hyde proteins and recovery of prey cells confronted with Hyde-encoding Acidovorax and different mutants. Includes primers used to make Acidovorax deletion strains, strains used as prey and their antibiotic resistance, raw results for cell toxicity and competition assays.

  28. Supplementary Table 26

    Significant orthogroups (OrthoFinder clusters) supported by three statistical approaches: either hypergbin, phyloglmbin, and Scoary, or hypergcn, phyloglmcn, and Scoary.
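
The significance rules stated in the table legends above (Benjamini–Hochberg q-value < 0.05, Bonferroni-corrected P < 0.1, and support from three approaches) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the orthogroup identifiers in the usage note are hypothetical, and the raw P values are assumed to come from the tests named in the legends.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input P values."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def bonferroni(pvalues):
    """Return Bonferroni-corrected P values, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def supported_by_three(hyperg_sig, phyloglm_sig, scoary_sig):
    """Return the clusters called significant by all three approaches."""
    return set(hyperg_sig) & set(phyloglm_sig) & set(scoary_sig)
```

For example, `supported_by_three({"OG1", "OG2"}, {"OG2", "OG3"}, {"OG2"})` keeps only the hypothetical orthogroup `OG2`, mirroring the consensus rule described for Supplementary Table 26.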

About this article

DOI

https://doi.org/10.1038/s41588-017-0012-9