Legionella pneumophila is a strictly environmental pathogen and the etiological agent of legionellosis. It is known that non-vertical processes have a major role in the short-term evolution of pathogens, but little is known about the relevance of these and other processes in environmental bacteria. We report the whole-genome sequencing of 69 L. pneumophila strains linked to recurrent outbreaks in a single location (Alcoy, Spain) over 11 years. We found some examples where the genome sequences of isolates of the same sequence type and outbreak did not cluster together and were more closely related to sequences from different outbreaks. Our analyses identify 16 recombination events responsible for almost 98% of the SNPs detected in the core genome and an apparent acceleration in the evolutionary rate. These results have profound implications for the understanding of microbial populations and for public health interventions in Legionella outbreak investigations.
The authors acknowledge the Dirección General de Salud Pública (DGSP; Conselleria de Sanidad, Generalitat Valenciana) and the Centro de Salud Pública de Alcoy for sharing information and comments on the Legionella outbreaks in Alcoy and facilitating access to epidemiological data. We also thank the Microbiology Service at the Hospital Virgen de los Lírios (Alcoy) and the Laboratorio de Salud Pública for sharing the Legionella samples used in this study, as well as the Genomics section of the Serveis Centrals de Support a l'Investigació Experimental (SCSIE; University of Valencia) for their technical help in the deep sequencing of the strains. We would also like to thank J. Parkhill and S. Harris (Wellcome Trust Sanger Institute) for their comments and for providing access to their computational resources during the revision of the manuscript. The work has been funded by project BFU2011-24112 from the Ministerio de Economía y Competitividad (MINECO) of the Spanish government. L.S.-B. is the recipient of a Formación de Profesorado Universitario (FPU) predoctoral fellowship from the Ministerio de Educación of the Spanish government. I.C. was supported by an Intra-European Fellowship (IEF; 272086) from Framework Programme 7 and Ramón y Cajal from the Spanish government (RYC-2012-10627).
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Distribution of the number of core genome pairwise SNP distances between and within the L. pneumophila strains included in the study.
(a) Intra– and inter–sequence type (ST) pairwise distances between L. pneumophila isolates are shown in gray and red, respectively. (b) Within-ST pairwise polymorphisms for the three most common profiles in the data set: ST578, ST637 and ST1.
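The intra- and inter-ST distance distributions described above can be illustrated with a minimal sketch. It assumes the core genome is already available as equal-length aligned sequences; the isolate names, sequences and ST assignments below are toy values, and the helper names `snp_distance` and `split_pairwise_distances` are hypothetical, not part of the study's pipeline.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count aligned sites where both sequences carry A/C/G/T and differ."""
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in valid and y in valid and x != y
    )


def split_pairwise_distances(seqs, st_of):
    """Return (intra-ST, inter-ST) lists of pairwise SNP distances."""
    intra, inter = [], []
    for s1, s2 in combinations(sorted(seqs), 2):
        d = snp_distance(seqs[s1], seqs[s2])
        # Same sequence type goes to the intra-ST list, otherwise inter-ST.
        (intra if st_of[s1] == st_of[s2] else inter).append(d)
    return intra, inter


# Toy alignment (hypothetical data, not from the study).
seqs = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGAACGTAC",
    "iso4": "TCGAACGTNC",  # the ambiguous site (N) is skipped
}
st_of = {"iso1": "ST578", "iso2": "ST578", "iso3": "ST1", "iso4": "ST1"}

intra, inter = split_pairwise_distances(seqs, st_of)
print("intra-ST distances:", intra)
print("inter-ST distances:", inter)
```

Skipping ambiguous sites pairwise is one possible convention; a real analysis may instead mask such columns across the whole alignment.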
Correlation between the SNP distances found from SBT and core genome data for the 15 STs included in this study. Dots of different color show comparisons between each ST and the remainder.
Genetic variability measured by the nucleotide diversity (π, filled dots) and the population mutation rate (τ, unfilled dots) between the core genome of the three most abundant STs in our data set (ST1, ST578 and ST637; with 10, 45 and 6 isolates, respectively) measured using VariScan. Red represents the estimates from clinical samples (C), blue represents the estimates from environmental samples (E) and black represents the estimates from both groups together.
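As a rough illustration of the two statistics named in this caption, the sketch below computes nucleotide diversity (π) as the mean number of pairwise differences per site, and a per-site population mutation rate using Watterson's estimator. Using Watterson's estimator for the caption's population mutation rate is an assumption made here, and the code is not VariScan's implementation: it ignores gaps, ambiguous bases and any sliding-window options. The alignment is toy data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for equal-length sequences."""
    n = len(seqs)
    length = len(seqs[0])
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


def watterson_theta(seqs):
    """Per-site Watterson estimator: S / a_n / L, with a_n = sum_{i<n} 1/i."""
    n = len(seqs)
    length = len(seqs[0])
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a_n = sum(1 / i for i in range(1, n))
    return segregating / a_n / length


# Toy alignment of four sequences (hypothetical data).
aln = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGAACGTAC"]

pi = nucleotide_diversity(aln)
theta = watterson_theta(aln)
print(f"pi = {pi:.4f}, theta_W = {theta:.4f}")
```

Both estimators target the same quantity under neutrality, so a marked gap between them in real data is usually read as a departure from the standard neutral model.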
Nucleotide diversity (π, filled dots) and population mutation rate (τ, unfilled dots) calculated for the core genome using VariScan (a) within and between the ST578 and ST637 strains grouped by outbreak or sporadic cases (sorted by the time at which each outbreak occurred) and (b) between the ST578 strains isolated in different years. Red represents the estimates in clinical samples (C), blue represents the estimates in environmental samples (E) and black shows the grouping of both together.
The correlation between the root-to-tip distances for the strains included in the study and the isolation time estimated using Path-O-Gen in the initial core alignment (a) and in the non-recombining core alignment (b). The two outlier points in b correspond to strains ID_3355 and ID_7371, which have probably acquired mutations at a slightly higher rate than the other strains present in the data set. When these two outliers are removed, the correlation and R² values do not change significantly.
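The root-to-tip analysis in this caption amounts to a linear regression of genetic distance from the root on isolation date. The sketch below shows that regression in plain Python under simplifying assumptions: the root-to-tip distances are taken as already computed from a rooted tree (Path-O-Gen also handles the choice of root, which is not reproduced here), and the dates and distances are invented toy values.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared). The slope is a crude estimate of
    the substitution rate per site per unit of time.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Toy data (hypothetical): perfectly clock-like accumulation of divergence.
dates = [1999.0, 2003.0, 2007.0, 2010.0]
distances = [0.001 + 1e-5 * (year - 1999.0) for year in dates]

slope, intercept, r_squared = root_to_tip_regression(dates, distances)
# The x-intercept gives a rough date for the root of the tree.
root_date = -intercept / slope
print(f"rate = {slope:.2e}, R^2 = {r_squared:.3f}, root date = {root_date:.1f}")
```

With real data, recombinant branches inflate some root-to-tip distances, which is one reason to compare the fit before and after removing recombining regions.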
The distribution of the substitution rates per branch (evaluated from the MCC reconstruction with BEAST using the BSP and uncorrelated lognormal clock) of the ST578 strains from Alcoy is represented as boxplots for the complete core genome and the non-recombining core genome, respectively. The B1–B5 labels for the outlier dots mark the five mainly recombinant branches found in the study. Summary statistics for the corresponding distributions are shown in the box below each plot.
Scatterplot showing the observed number of SNPs per gene against the expected number considering a random normal distribution. Red lines represent the 95% confidence interval for the regression analysis. The 3,120 genes of the initial core genome are represented as black dots, and the 2,688 genes of the non-recombining core genome are shown in orange.
Gene trees were compared topologically to the concatenate of the whole-core alignment. Sequences from other STs were included to account for recombination with external strains and are shown in different colors. (a) The topology of the concatenate. (b–f) The topologies of five different genes detected as recombinant. Note the external STs interspersed within the ST578 population.
The SNP density per non-recombinant gene is shown for branches B1, B2 and B5, taking into account the ancestral SNPs traced using Mesquite and the corresponding gene lengths. Branches B3 and B4 are not present in the topology of the 2,688-gene non-recombining MCC BEAST tree and thus are not represented in the plot. The 432 genes detected as recombinant are shown in the background as gray points for comparison.
Supplementary Figure 10 Representation of the number of SNPs per gene per branch in the 3,120-gene ST578 core tree.
The tree on the left contains the node numbers that identify the different branches (the notation for the five branches, B1–B5, that accumulate more SNPs is shown in gray). The graph on the right shows the total number of SNPs per gene as the header. The following rows represent the number of SNPs per gene in each of the branches. Background colors represent the clades in the tree where the corresponding branches are present. The five branches on which this study focuses are marked in bold with black rectangles. Gray vertical lines represent the 247 genes detected with a non-vertical phylogenetic signal.
The distribution of the functions of the genes involved in the detected recombination events across the Clusters of Orthologous Groups (COG) classification. These genes include virulence factors involved in the type IV secretion system (sidA, sidF, lepA and lepB) and genes involved in invasion and motility (enhA, enhB, enhC and mviN), regulation (letA) and iron acquisition (feoA and feoB). Moreover, with recombination events B and C, ST578B could have incorporated SNPs in genes involved in DNA damage repair and mutagenesis, such as mutH, umuC and umuD. Letters on the x axis correspond to the functional categories described in ftp://ftp.ncbi.nlm.nih.gov/pub/COG/COG/fun.txt.
Comparison between the mean time to MRCA estimates (black dots) and the 95% HPD intervals (blue dots) before (A) and after (B) removing recombination events for the whole ST578 tree and the two sublineages ST578A and ST578B independently.
About this article
Cite this article
Sánchez-Busó, L., Comas, I., Jorques, G. et al. Recombination drives genome evolution in outbreak-related Legionella pneumophila isolates. Nat Genet 46, 1205–1211 (2014). https://doi.org/10.1038/ng.3114