
  • Letter
  • Published:

Dense genomic sampling identifies highways of pneumococcal recombination


Evasion of clinical interventions by Streptococcus pneumoniae occurs through selection of non-susceptible genomic variants. We report whole-genome sequencing of 3,085 pneumococcal carriage isolates from a 2.4-km² refugee camp. This sequencing provides unprecedented resolution of the process of recombination and its impact on population evolution. Genomic recombination hotspots show remarkable consistency between lineages, indicating common selective pressures acting at certain loci, particularly those associated with antibiotic resistance. Temporal changes in antibiotic consumption are reflected in changes in recombination trends, demonstrating rapid spread of resistance when selective pressure is high. The highest frequencies of receipt and donation of recombined DNA fragments were observed in non-encapsulated lineages, implying that this largely overlooked pneumococcal group, which is beyond the reach of current vaccines, may have a major role in genetic exchange and the adaptation of the species as a whole. These findings advance understanding of pneumococcal population dynamics and provide information for the design of future intervention strategies.


Figure 1: Population structure and genetic interactions.
Figure 2: Evolutionary parameters estimated in dominant clusters.
Figure 3: Recombination hotspots in seven prevalent clusters.
Figure 4: Associations between recombining genes and resistant phenotypes.


Accession codes

Primary accessions

Sequence Read Archive

Referenced accessions



  1. O'Brien, K.L. & Nohynek, H. Report from a WHO Working Group: standard method for detecting upper respiratory carriage of Streptococcus pneumoniae. Pediatr. Infect. Dis. J. 22, e1–e11 (2003).

  2. Adetifa, I.M. et al. Pre-vaccination nasopharyngeal pneumococcal carriage in a Nigerian population: epidemiology and population biology. PLoS ONE 7, e30548 (2012).

  3. Hanage, W.P. et al. Evidence that pneumococcal serotype replacement in Massachusetts following conjugate vaccination is now complete. Epidemics 2, 80–84 (2010).

  4. Croucher, N.J. et al. Rapid pneumococcal evolution in response to clinical interventions. Science 331, 430–434 (2011).

  5. Croucher, N.J. et al. Population genomics of post-vaccine changes in pneumococcal epidemiology. Nat. Genet. 45, 656–663 (2013).

  6. Steinmoen, H., Knutsen, E. & Havarstein, L.S. Induction of natural competence in Streptococcus pneumoniae triggers lysis and DNA release from a subfraction of the cell population. Proc. Natl. Acad. Sci. USA 99, 7681–7686 (2002).

  7. Hanage, W.P., Fraser, C., Tang, J., Connor, T.R. & Corander, J. Hyper-recombination, diversity, and antibiotic resistance in pneumococcus. Science 324, 1454–1457 (2009).

  8. Hiller, N.L. et al. Generation of genic diversity among Streptococcus pneumoniae strains via horizontal gene transfer during a chronic polyclonal pediatric infection. PLoS Pathog. 6, e1001108 (2010).

  9. Eldholm, V., Johnsborg, O., Haugen, K., Ohnstad, H.S. & Havarstein, L.S. Fratricide in Streptococcus pneumoniae: contributions and role of the cell wall hydrolases CbpD, LytA and LytC. Microbiology 155, 2223–2234 (2009).

  10. Wei, H. & Havarstein, L.S. Fratricide is essential for efficient gene transfer between pneumococci in biofilms. Appl. Environ. Microbiol. 78, 5897–5905 (2012).

  11. Donkor, E.S. et al. High levels of recombination among Streptococcus pneumoniae isolates from the Gambia. mBio 2, e00040–11 (2011).

  12. Turner, P. et al. A longitudinal study of Streptococcus pneumoniae carriage in a cohort of infants and their mothers on the Thailand-Myanmar border. PLoS ONE 7, e38271 (2012).

  13. Turner, C. et al. High rates of pneumonia in children under two years of age in a South East Asian refugee population. PLoS ONE 8, e54026 (2013).

  14. Cheng, L., Connor, T.R., Siren, J., Aanensen, D.M. & Corander, J. Hierarchical and spatially explicit clustering of DNA sequences with BAPS Software. Mol. Biol. Evol. 30, 1224–1228 (2013).

  15. Corander, J., Marttinen, P., Siren, J. & Tang, J. Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC Bioinformatics 9, 539 (2008).

  16. Salter, S.J. et al. Variation at the capsule locus, cps, of mistyped and non-typable Streptococcus pneumoniae isolates. Microbiology 158, 1560–1569 (2012).

  17. Hsieh, Y.C. et al. Serotype competence and penicillin resistance in Streptococcus pneumoniae. Emerg. Infect. Dis. 12, 1709–1714 (2006).

  18. Zapun, A., Contreras-Martel, C. & Vernet, T. Penicillin-binding proteins and β-lactam resistance. FEMS Microbiol. Rev. 32, 361–385 (2008).

  19. Adrian, P.V. & Klugman, K.P. Mutations in the dihydrofolate reductase gene of trimethoprim-resistant isolates of Streptococcus pneumoniae. Antimicrob. Agents Chemother. 41, 2406–2413 (1997).

  20. Padayachee, T. & Klugman, K.P. Novel expansions of the gene encoding dihydropteroate synthase in trimethoprim-sulfamethoxazole-resistant Streptococcus pneumoniae. Antimicrob. Agents Chemother. 43, 2225–2230 (1999).

  21. Silver, L.L. Multi-targeting by monotherapeutic antibacterials. Nat. Rev. Drug Discov. 6, 41–55 (2007).

  22. Hoge, C.W., Gambel, J.M., Srijan, A., Pitarangsi, C. & Echeverria, P. Trends in antibiotic resistance among diarrheal pathogens isolated in Thailand over 15 years. Clin. Infect. Dis. 26, 341–345 (1998).

  23. O'Brien, K.L. & Nohynek, H. Report from a WHO working group: standard method for detecting upper respiratory carriage of Streptococcus pneumoniae. Pediatr. Infect. Dis. J. 22, 133–140 (2003).

  24. Zerbino, D.R. & Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).

  25. Boetzer, M., Henkel, C.V., Jansen, H.J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2011).

  26. Boetzer, M. & Pirovano, W. Toward almost closed genomes with GapFiller. Genome Biol. 13, R56 (2012).

  27. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  28. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

  29. Croucher, N.J. et al. Role of conjugative elements in the evolution of the multidrug-resistant pandemic clone Streptococcus pneumoniae Spain23F ST81. J. Bacteriol. 191, 1480–1489 (2009).

  30. Harris, S.R. et al. Evolution of MRSA during hospital transmission and intercontinental spread. Science 327, 469–474 (2010).

  31. Corander, J., Waldmann, P. & Sillanpaa, M.J. Bayesian analysis of genetic differentiation between populations. Genetics 163, 367–374 (2003).

  32. Corander, J. & Tang, J. Bayesian analysis of population structure based on linked molecular information. Math. Biosci. 205, 19–31 (2007).

  33. Tang, J., Hanage, W.P., Fraser, C. & Corander, J. Identifying currents in the gene pool for bacterial populations using an integrative approach. PLoS Comput. Biol. 5, e1000455 (2009).

  34. Corander, J., Connor, T.R., O'Dwyer, C.A., Kroll, J.S. & Hanage, W.P. Population structure in the Neisseria, and the biological significance of fuzzy species. J. R. Soc. Interface 9, 1208–1215 (2011).

  35. Mutreja, A. et al. Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477, 462–465 (2011).

  36. Cheng, L., Connor, T.R., Siren, J., Aanensen, D.M. & Corander, J. Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol. Biol. Evol. 30, 1224–1228 (2013).

  37. Willems, R.J. et al. Restricted gene flow among hospital subpopulations of Enterococcus faecium. mBio 3, e00151–12 (2012).

  38. Price, M.N., Dehal, P.S. & Arkin, A.P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  39. Drummond, A.J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).

  40. Feil, E.J., Maiden, M.C., Achtman, M. & Spratt, B.G. The relative contributions of recombination and mutation to the divergence of clones of Neisseria meningitidis. Mol. Biol. Evol. 16, 1496–1502 (1999).

  41. Stamatakis, A. RAxML-VI-HPC: maximum likelihood–based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).

  42. Smith, A.M. & Klugman, K.P. Alterations in MurM, a cell wall muropeptide branching enzyme, increase high-level penicillin and cephalosporin resistance in Streptococcus pneumoniae. Antimicrob. Agents Chemother. 45, 2393–2396 (2001).

  43. Letunic, I. & Bork, P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 39, W475–W478 (2011).

  44. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).



We thank the microbiology and clinical team in the Shoklo Malaria Research Unit, part of the Mahidol Oxford University Research Unit, Faculty of Tropical Medicine, Mahidol University, Thailand, and the core informatics, library-generating and sequencing teams at the Wellcome Trust Sanger Institute. Attending authors are grateful for the opportunity for discussion at the Permafrost workshop. C.C. was funded by a Royal Thai Government scholarship and a Wellcome Trust PhD studentship. J.C., P.M., A.P. and L.C. were funded by Academy of Finland grant 251170 and European Research Council grant 239784. S.D.B. is partly funded by the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre. P.T. was funded by Wellcome Trust grant 083735/Z/07Z. This work is sponsored by Wellcome Trust grant 098051.

Author information

Authors and Affiliations



S.D.B., P.T., J.P., D.G. and F.N. conceived the study. P.T. and C.T. collected and provided the samples for the study. J.P., S.D.B., C.C., S.R.H., N.J.C. and J.C. designed the analyses. C.C., S.R.H., P.M., L.C., A.P., D.M.A., A.E.M., A.J.P., S.J.S., D.H. and J.C. performed the analyses. C.C., S.D.B., S.R.H., A.E.M. and N.J.C. wrote the manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Stephen D Bentley.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Association between recombining pbp genes and resistant phenotypes.

(a) pbp1a gene tree. (b) pbp2b gene tree. (c) pbp2x gene tree. The inner ring is colored according to dominant population clusters (BC1–7), with the rest of the population appearing in white. The outer ring is colored according to resistance to penicillin with black and white showing non-susceptibility and susceptibility, respectively.

Supplementary Figure 2 All possible recombination donors for recombinant fragments detected in one isolate.

Top panel: nine predicted recombination fragments in SMRU1452 (fragments A–I) are highlighted in different colors and are ordered according to their locations on the genome with their size labeled. Bottom panel: the bar chart presents the possible sources of each recombinant fragment based on the above color scheme. The y axis gives the proportion of hits detected per population of particular lineage. For example, recombination fragment A of 1,236 bp in length was found to have identical matches in 96.29%, 75%, 70.59%, 69.84%, 10% and 8.33% of the population of secondary BAPS clusters of serotype 11A, 15B, 34, 6B, NT and 16F, respectively.
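The per-lineage proportions in the bottom panel can be illustrated with a minimal Python sketch. It assumes plain substring matching as a stand-in for the sequence searches used in the study; the function name `donor_proportions`, the cluster labels and the toy sequences are all hypothetical.

```python
def donor_proportions(fragment, genomes_by_cluster):
    """Return, per cluster, the fraction of isolates holding an identical copy of the fragment."""
    proportions = {}
    for cluster, genomes in genomes_by_cluster.items():
        # An isolate counts as a possible donor if the fragment occurs verbatim in its sequence.
        hits = sum(1 for genome in genomes if fragment in genome)
        proportions[cluster] = hits / len(genomes)
    return proportions


# Hypothetical toy population: cluster label -> list of isolate sequences.
population = {
    "11A": ["AAACGTTT", "CCACGTGG", "TTACGTAA", "GGGGGGGG"],
    "NT": ["ACGTACGT", "TTTTTTTT"],
}

result = donor_proportions("ACGT", population)
for cluster, share in result.items():
    print(f"{cluster}: {share:.2%} of isolates carry an identical match")
```

Real genome data would need an alignment-based search rather than exact string containment, so this only shows how the y-axis proportion is defined.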

Supplementary Figure 3 Potential donors characterized by isolates and by BAPS primary clusters.

(a) Boxplots show the distribution of donation probability for isolates within each cluster. The red bar marks the mean frequency of donation events across all isolates (2.53 × 10⁻⁵); that is, each isolate has a probability of 1/39,537 of donating DNA in a recombination event. (b) and (c) show positive correlations between potential donor clusters (based on primary BAPS clusters) and, respectively, outer population size and cluster diversity.
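As a quick arithmetic check, the two figures quoted in (a) are the same quantity written two ways. The short Python sketch below uses only the numbers from the caption; the variable names are illustrative.

```python
from fractions import Fraction

# Probability quoted in the caption: one donation per 39,537 isolate opportunities.
donation_probability = Fraction(1, 39_537)

# Express it in scientific notation with two decimal places.
formatted = f"{float(donation_probability):.2e}"
print(formatted)
```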

Supplementary Figure 4 Single-nucleotide polymorphism (SNP)-based phylogeny using the method described in Croucher et al.

Each panel represents a major Maela cluster: (a) BC1-19F, (b) BC2-23F, (c) BC3-NT, (d) BC4-6B, (e) BC5-23A/F, (f) BC6-15B/C and (g) BC7-14. Subclades where substitution rates were estimated are highlighted in different colors and labeled accordingly. Please note that substitution rates cannot be confidently estimated from any clades in BC6-15B/C. The scale bar represents the number of SNPs.

Supplementary Figure 5 Demonstration that clocklike signals can be detected from the subclades but not from the whole population.

Each dominant cluster comprises more than one subclade, and these subclades coevolve. This plot uses BC5-23A/F as an example. The clock signal cannot be detected in the whole cluster because it is confounded by the separate signals of the subclades.

Supplementary Figure 6 Clocklike signals from Path-O-Gen in the subclades where substitution rates were estimated.

These subclades are highlighted in Supplementary Figure 4. The y axis reports root-to-tip divergence, and the x axis represents the time scale in days from the first date of collection, which was 12 November 2007. The first date (time = 0) is shown as a vertical dashed line.
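A root-to-tip plot of this kind amounts to regressing divergence on sampling time, with the slope read as an approximate substitution rate. The following Python sketch shows the idea under stated assumptions: the sampling dates and divergence values are invented for illustration, and `least_squares` is a hand-written ordinary least-squares helper, not the Path-O-Gen implementation.

```python
from datetime import date

FIRST_COLLECTION = date(2007, 11, 12)  # time = 0 in the caption


def days_since_first_collection(sample_date):
    """Convert a sampling date to days elapsed since the first collection date."""
    return (sample_date - FIRST_COLLECTION).days


def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) of an ordinary least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical tips: days since first collection and root-to-tip divergence (SNPs).
days = [0, 100, 200, 300, 400]
divergence = [10.0, 10.5, 11.0, 11.5, 12.0]

slope, intercept, r2 = least_squares(days, divergence)
print(f"rate ~ {slope:.4f} SNPs/day, intercept {intercept:.1f}, R^2 {r2:.2f}")
```

A positive slope with a high R² is what a clocklike signal looks like in such a plot; pooling subclades with different intercepts would blur this relationship.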

Supplementary Figure 7 Recombination per mutation (r/m) of each cluster calculated by linear regression.

Because of the large sample size available in our studies, we alternatively calculated the ratio of recombination events (y axis) to point mutations (x axis) observed on each branch as the slope (r/m) of a linear regression. The values are tabulated in Supplementary Table 4. To compare r/m values obtained by linear regression, all data were ranked to accommodate the non-parametric ANCOVA analysis.
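The regression-based r/m estimate and the rank transformation can be sketched in Python as follows. The branch counts are hypothetical, and `ols_slope` and `average_ranks` are illustrative helpers; the sketch does not reproduce the ANCOVA itself.

```python
def ols_slope(xs, ys):
    """Slope of an ordinary least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


# Hypothetical per-branch counts for one cluster.
mutations = [2, 4, 6, 8]        # point mutations on each branch (x axis)
recombinations = [1, 2, 3, 4]   # recombination events on each branch (y axis)

r_over_m = ols_slope(mutations, recombinations)
ranked_x = average_ranks(mutations)
ranked_y = average_ranks(recombinations)
print(f"r/m estimated from the slope: {r_over_m}")
```

Ranking both axes before comparing clusters makes the comparison depend only on ordering, which is the usual motivation for a rank-based ANCOVA.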

Supplementary Figure 8 Comparison of two recombination-detecting methods.

(a) Genome view of recombination fragments predicted by both algorithms. Recombination regions are aligned with taxa on the phylogenetic tree (left). Genome coordinates are labeled on top. Recombination regions exclusively predicted by methods described in Croucher et al. and Marttinen et al. are highlighted in red and blue, respectively. Overlapping regions predicted by both algorithms are highlighted in dark grey. (b) A histogram showing the length of recombination fragments (bp) predicted by the two algorithms. (c) Sequence quality of recombination fragments predicted by the two algorithms, reported as percent “N”. For (b) and (c), fragments predicted by tools described in Croucher et al. and Marttinen et al. are shaded in red and blue, respectively. The values given by (b) and (c) are summarized in (d).

Supplementary Figure 9 Identifying donor blocks from recipient block identity.

Donor blocks are sequences that show identical matches to the recipient blocks. (a) A histogram showing the distribution of the length of recipient blocks from recombination events detected at the tips of the phylogenies. Shaded in gray are all recipient blocks used as queries for blast searches. In white are recipient blocks for which identical hits were detected in the rest of the population. (b) Summary of the values from the distribution in (a). (c) A plot showing the association between the length of sequence queries (recipient blocks) and the diversity of detected hits (potential donor blocks classified by secondary BAPS clusters). The data were modeled as exponential decay, with the line of best fit (red line) and the 95% confidence interval (dashed red lines).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–9, Supplementary Tables 2–9 and Supplementary Note. (PDF 4563 kb)

Supplementary Table 1

Epidemiological data associated with strains and accession codes associated with data deposited in the European Nucleotide Archive. (XLS 543 kb)


Chewapreecha, C., Harris, S., Croucher, N. et al. Dense genomic sampling identifies highways of pneumococcal recombination. Nat Genet 46, 305–309 (2014).
