Accurate and universal delineation of prokaryotic species


The exponentially increasing number of sequenced genomes necessitates fast, accurate, universally applicable and automated approaches for the delineation of prokaryotic species. We developed specI (species identification tool), a method to group organisms into species clusters based on 40 universal, single-copy phylogenetic marker genes. Applied to 3,496 prokaryotic genomes, specI identified 1,753 species clusters. Of 314 discrepancies with a widely used taxonomic classification, >62% were resolved by literature support.
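As a rough illustration of what grouping genomes into species clusters by marker-gene similarity can look like, here is a minimal Python sketch. It is not the specI implementation: the function name `cluster_genomes`, the 0.965 identity cutoff, the toy identity values and the use of single-linkage grouping (via union-find) are all assumptions made for this example, and the abstract itself does not specify the threshold or the clustering procedure.

```python
from itertools import combinations


def cluster_genomes(identity, genomes, threshold=0.965):
    """Group genomes by single linkage on pairwise marker-gene identity.

    `identity` maps frozenset({a, b}) to a similarity in [0, 1]; the default
    threshold is an illustrative assumption, not a value taken from the text.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        # Pairs missing from the table are treated as dissimilar.
        if identity.get(frozenset((a, b)), 0.0) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: four genomes with made-up pairwise identities.
genomes = ["A", "B", "C", "D"]
identity = {
    frozenset(("A", "B")): 0.99,
    frozenset(("B", "C")): 0.97,
    frozenset(("C", "D")): 0.80,
}
print(cluster_genomes(identity, genomes))
```

With single linkage, A and C end up in the same cluster through B even though no A-C identity is listed. A different linkage rule could split them, which is one reason the choice of clustering procedure matters for species delineation.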

Figure 1: Comparative performance assessment of specI.
Figure 2: Phylogenetic trees and species-level clustering of Prochlorococcus displaying discrepancies with the NCBI Taxonomy data.


We thank the members of the Bork group for helpful discussions and Y. Yuan and members of the European Molecular Biology Laboratory information technology core facility for managing the high-performance computing resources. We acknowledge funding provided by the CancerBiome project (European Research Council project reference 268985), the 'METACARDIS' project (FP7-HEALTH-2012-INNOVATION-I-305312) and the International Human Microbiome Standards project (HEALTH-F4-2010-261376).

P.B., D.R.M., S.S. and G.Z. designed the study. D.R.M. developed and implemented the program, D.R.M. and G.Z. performed the experiments, D.R.M., S.S. and G.Z. analyzed the data, and D.R.M., S.S., G.Z. and P.B. wrote the manuscript.

Correspondence to Peer Bork.

The authors declare no competing financial interests.

Supplementary Figures 1–8, Supplementary Tables 1–3, 5–7, 15, 17, 19 and 20, and Supplementary Note (PDF 1573 kb)

NCBI Taxonomy information of type strains listed on the list of prokaryotic names with standing in nomenclature (LPSN) that could be linked to NCBI, including their sequencing status (XLS 365 kb)

ANIb values of Prochlorococcus marinus (XLS 19 kb)

ANIm values of Prochlorococcus marinus (XLS 14 kb)

ANIb values of the Serratia and Rahnella clades (XLS 14 kb)

ANIm values of the Serratia and Rahnella clades (XLS 14 kb)

ANIb values of the Buchnera clade (XLS 15 kb)

ANIm values of the Buchnera clade (XLS 15 kb)

Cluster assignments for the 3,496 genomes used in this study (XLS 496 kb)

Literature-based reclassifications of species assignments in the NCBI Taxonomy database (XLS 94 kb)

Assignments of genomes previously not assigned to a named species to known species, made using the species clustering strategy presented in this publication (XLS 28 kb)

Mende, D., Sunagawa, S., Zeller, G. et al. Accurate and universal delineation of prokaryotic species. Nat Methods 10, 881–884 (2013).

