Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko

doi:10.1038/nbt.2939

Article
Published: 06 July 2014

Nature Biotechnology volume 32, pages 822–828 (2014)

Abstract

Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
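
As a rough illustration of the co-abundance idea described above, the sketch below groups genes whose abundance profiles across samples are strongly correlated. It is a simplified greedy grouping and should not be read as the authors' canopy-clustering implementation; the function name `co_abundance_groups`, the Pearson correlation cutoff of 0.9, and the toy abundance matrix are all assumptions made for this example.

```python
import numpy as np


def co_abundance_groups(abundance, min_corr=0.9):
    """Greedily group genes (rows) by correlation of their abundance
    profiles across samples (columns).

    Hypothetical helper for illustration only; min_corr is an assumed cutoff.
    """
    abundance = np.asarray(abundance, dtype=float)
    corr = np.corrcoef(abundance)  # gene-by-gene Pearson correlation matrix
    unassigned = set(range(abundance.shape[0]))
    groups = []
    while unassigned:
        seed = min(unassigned)  # deterministic choice of the next seed gene
        # Always keep the seed itself, so the loop terminates even if a
        # constant profile produces NaN correlations.
        members = {seed} | {g for g in unassigned if corr[seed, g] >= min_corr}
        groups.append(sorted(members))
        unassigned -= members
    return groups


# Toy data (assumed): 5 genes x 6 samples.
# Genes 0-2 follow one abundance profile, genes 3-4 follow another.
profile_a = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 0.5])
profile_b = np.array([7.0, 1.0, 6.0, 0.5, 2.0, 9.0])
abundance = np.vstack([
    profile_a,
    2.0 * profile_a,
    0.5 * profile_a + 0.1,
    profile_b,
    3.0 * profile_b,
])

groups = co_abundance_groups(abundance)
print(groups)  # [[0, 1, 2], [3, 4]]
```

In this toy case the two groups play the role of two co-abundance gene groups: genes that rise and fall together across samples are assumed to belong to the same biological entity. Real data would need abundance normalization and a more careful choice of seeds and thresholds than this sketch uses.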

**Figure 1: Overview of co-abundance clustering and the MGS-augmented assembly.**

**Figure 2: Size distributions of co-abundance gene groups (CAGs).**

**Figure 3: Benchmarking sensitivity and specificity of the co-abundance clustering across a range of sequencing depths or sample numbers.**

**Figure 4: Comparison of the MGS:337 augmented assembly and the *B. animalis* reference genome.**

**Figure 5: Dependency associations among MGS and CAGs.**

**Figure 6: Gut persistence probability for *B. adolescentis*.**

Accession codes

Primary accessions

BioProject

32811

European Nucleotide Archive

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme FP7-HEALTH-F4-2007-201052: Metagenomics of the Human Intestinal Tract (MetaHIT) and FP7-HEALTH-2010-261376: International Human Microbiome Standards, as well as the Novo Nordisk Foundation Center for Biosustainability. Work on the clustering concept has been supported by the OpenGPU FUI collaborative research projects, with funding from DGCIS. Researchers on the project were granted access to the HPC resources of CCRT under the allocation 2011-036707 made by GENCI (Grand Equipement National de Calcul Intensif). The company Alliance Services Plus (AS+) helped to scale up the process, in particular V. Arslan, D. Tello, V. Ducrot, T. Saidani and S. Monot. The authors affiliated with MGP are funded, in part, by the Metagenopolis ANR-11-DPBS-0001 grant. Ciberehd is funded by the Instituto de Salud Carlos III (Spain). M.A. was supported by a grant from the Ministère de la Recherche et de l'Education Nationale (France).

Author information

H Bjørn Nielsen, Mathieu Almeida, Julian Parkhill and Keith Turner: These authors contributed equally to this work.

Authors and Affiliations

Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark
H Bjørn Nielsen, Agnieszka Sierakowska Juncker, Simon Rasmussen, Damian R Plichta, Laurent Gautier, Anders G Pedersen, Ida Bonde, Marcelo B Quintanilha dos Santos, Piotr Dworzynski, Ole Lund, David W Ussery, Thomas Sicheritz-Ponten & Søren Brunak
Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
H Bjørn Nielsen, Agnieszka Sierakowska Juncker, Ida Bonde, Nikolaj Blom, Thomas Sicheritz-Ponten & Søren Brunak
INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France
Mathieu Almeida, Emmanuelle Le Chatelier, Jean-Michel Batto, Fouad Boumezbeur, Joël Doré, Sean Kennedy, Pierre Léonard, Florence Levenez, Bouziane Moumen, Nicolas Pons, Edi Prifti, Pierre Renault, S Dusko Ehrlich, Alexandre Jamet, Antonella Cultrone, Christine Delorme, Emmanuelle Maguin, Eric Guedon, Gaetana Vandemeulebrouck, Ghalia Khaci, Maarten van de Guchte, Nicolas Sanchez, Rozenn Dervyn, Séverine Layec & Yohanan Winogradski
INRA, Institut National de la Recherche Agronomique, US 1367 Metagenopolis, Jouy en Josas, France
Mathieu Almeida, Emmanuelle Le Chatelier, Jean-Michel Batto, Fouad Boumezbeur, Joël Doré, Sean Kennedy, Pierre Léonard, Florence Levenez, Bouziane Moumen, Nicolas Pons, Edi Prifti, S Dusko Ehrlich, Benoit Quinquis, Florence Haimet, Hervé Blottière & Nathalie Galleron
Department of Computer Science, Center for Bioinformatics and Computational Biology, University of Maryland, USA
Mathieu Almeida
BGI Hong Kong Research Institute, Hong Kong, China
Junhua Li & Junjie Qin
BGI-Shenzhen, Shenzhen, China
Junhua Li, Manimozhiyan Arumugam, Karsten Kristiansen, Junjie Qin & Jun Wang
School of Bioscience and Biotechnology, South China University of Technology, Guangzhou, China
Junhua Li
European Molecular Biology Laboratory, Heidelberg, Germany
Shinichi Sunagawa, Manimozhiyan Arumugam, Jens Roat Kultima, Julien Tap, Takuji Yamada & Peer Bork
Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Institut de Génomique, Évry, France
Eric Pelletier, Denis Le Paslier, François Artiguenave, Jean Weissenbach & Thomas Bruls
Centre National de la Recherche Scientifique, Évry, France
Eric Pelletier & Denis Le Paslier
Université d'Évry Val d'Essonne, Évry, France
Eric Pelletier & Denis Le Paslier
The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
Trine Nielsen, Manimozhiyan Arumugam, Kristoffer S Burgdorf, Torben Hansen, Oluf Pedersen & Jun Wang
Digestive System Research Unit, University Hospital Vall d'Hebron, Ciberehd, Barcelona, Spain
Chaysavanh Manichanh, Natalia Borruel, Francesc Casellas, Francisco Guarner, Antonio Torrejon, Encarna Varela & Maria Antolin
Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark
Torben Hansen
Department of Structural Biology, VIB, Brussels, Belgium
Falk Hildebrand & Falony Gwen
Department of Bioscience Engineering, Vrije Universiteit, Brussels, Belgium
Falk Hildebrand & Jeroen Raes
Division for Epidemiology and Microbial Genomics, National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
Rolf S Kaas
Department of Biology, University of Copenhagen, Copenhagen, Denmark
Karsten Kristiansen & Jun Wang
Hagedorn Research Institute, Gentofte, Denmark
Oluf Pedersen
Institute of Biomedical Science, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Oluf Pedersen & Niels Grarup
Faculty of Health, Aarhus University, Aarhus, Denmark
Oluf Pedersen
Department of Microbiology and Immunology, Rega Institute, KU Leuven, Belgium
Jeroen Raes
VIB Center for the Biology of Disease, Leuven, Belgium
Jeroen Raes
Department of Biology, Section of Microbiology, University of Copenhagen, Copenhagen, Denmark
Søren Sørensen
Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands
Sebastian Tims, Willem M de Vos, Jørgensen Torben, Michiel Kleerebezem & Zoetendal Erwin G
Department of Biological Information, Tokyo Institute of Technology, Yokohama, Japan
Takuji Yamada
Max Delbrück Centre for Molecular Medicine, Berlin, Germany
Peer Bork
Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia
Jun Wang
King's College London, Centre for Host-Microbiome Interactions, Dental Institute Central Office, Guy's Hospital, United Kingdom
S Dusko Ehrlich
Institut Mérieux, Lyon, France
Alexandre Mérieux, Christian Brechot & Christine M'Rini
Danone Research, Palaiseau, France
Gérard Denariaz, Johan E T van Hylckama Vlieg, Muriel Derrien & Patrick Veiga
Gut Biology & Microbiology, Danone Research, Center for Specialized Nutrition, Wageningen, the Netherlands
Jan Knol & Raish Oozeer
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, U.K.
Julian Parkhill & Keith Turner
Istituto Europeo di Oncologia, Milan, Italy
Maria Rescigno

Consortia

MetaHIT Consortium

H Bjørn Nielsen, Mathieu Almeida, Agnieszka S Juncker, Simon Rasmussen, Junhua Li, Shinichi Sunagawa, Damian R Plichta, Laurent Gautier, Anders G Pedersen, Emmanuelle Le Chatelier, Eric Pelletier, Ida Bonde, Trine Nielsen, Chaysavanh Manichanh, Manimozhiyan Arumugam, Jean-Michel Batto, Marcelo B Quintanilha dos Santos, Nikolaj Blom, Natalia Borruel, Kristoffer S Burgdorf, Fouad Boumezbeur, Francesc Casellas, Joël Doré, Piotr Dworzynski, Francisco Guarner, Torben Hansen, Falk Hildebrand, Rolf S Kaas, Sean Kennedy, Karsten Kristiansen, Jens Roat Kultima, Pierre Leonard, Florence Levenez, Ole Lund, Bouziane Moumen, Denis Le Paslier, Nicolas Pons, Oluf Pedersen, Edi Prifti, Junjie Qin, Jeroen Raes, Søren Sørensen, Julien Tap, Sebastian Tims, David W Ussery, Takuji Yamada, Pierre Renault, Thomas Sicheritz-Ponten, Peer Bork, Jun Wang, Søren Brunak, S Dusko Ehrlich, Alexandre Jamet, Alexandre Mérieux, Antonella Cultrone, Antonio Torrejon, Benoit Quinquis, Christian Brechot, Christine Delorme, Christine M'Rini, Willem M de Vos, Emmanuelle Maguin, Encarna Varela, Eric Guedon, Falony Gwen, Florence Haimet, François Artiguenave, Gaetana Vandemeulebrouck, Gérard Denariaz, Ghalia Khaci, Hervé Blottière, Jan Knol, Jean Weissenbach, Johan E T van Hylckama Vlieg, Jørgensen Torben, Julian Parkhill, Keith Turner, Maarten van de Guchte, Maria Antolin, Maria Rescigno, Michiel Kleerebezem, Muriel Derrien, Nathalie Galleron, Nicolas Sanchez, Niels Grarup, Patrick Veiga, Raish Oozeer, Rozenn Dervyn, Séverine Layec, Thomas Bruls, Yohanan Winogradski & Zoetendal Erwin G

Contributions

All authors are members of the Metagenomics of the Human Intestinal Tract (MetaHIT) Consortium. S.D.E. and S.B. managed the project. F.C., N.B., F.G., T.H., K.S.B. and T.N. performed clinical sampling. F.L. and C.M. performed DNA extraction. J.L., E.P. and D.L.P. performed sequencing. S.D.E., H.B.N., M.A., A.S.J., S.R., P.R. and P.B. designed the analyses. H.B.N., A.S.J., S.R., M.A., A.G.P., D.R.P., L.G., I.B., M.B., M.B.Q.d.S., M.A., J.L., J.T., S.S., T.Y., E.P., D.L.P. and R.S.K. performed the data analyses. H.B.N., S.B., A.S.J., S.R., A.G.P. and M.A. wrote the manuscript. H.B.N., S.B., S.D.E., D.R.P., I.B., P.B., E.P., O.P. and D.W.U. revised the manuscript. The MetaHIT Consortium members contributed to the design and execution of the study.

Corresponding authors

Correspondence to Søren Brunak or S Dusko Ehrlich.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

A full list of members and affiliations appears at the end of the paper.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17 and Supplementary Notes 1–9 (PDF 4213 kb)

Supplementary Data 1

Sample description (XLS 259 kb)

Supplementary Data 2

MGS taxonomical statistics (XLS 264 kb)

Supplementary Data 3

MGS augmented assembly statistics (XLS 266 kb)

Supplementary Data 4

MGS augmented assemblies comparison to reference genomes (XLS 31 kb)

Supplementary Data 5

Summary information on the 6640 small CAGs (XLS 1156 kb)

Supplementary Data 6

Dependency-association network (XLS 251 kb)

Supplementary Data 7

MGS:4 + dependency-associated CAG assembly statistics (XLS 37 kb)

Supplementary Data 8

eggNOG prevalent in frequently observed MGS (XLS 36 kb)

Supplementary Data 9

Gene catalogue comparison (XLS 10 kb)

Supplementary Data 10

Bacillus subtilis essential COG list (XLS 61 kb)

Supplementary Data 11

Dependency-associations with or without companion species (XLS 37 kb)

Supplementary Software

Source code for canopy-clustering algorithm (ZIP 40 kb)

Source data

Source data to Fig. 1

Source data to Fig. 2

Source data to Fig. 3

About this article

Cite this article

Nielsen, H., Almeida, M., Juncker, A. et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol 32, 822–828 (2014). https://doi.org/10.1038/nbt.2939

Received: 12 February 2014
Accepted: 22 May 2014
Published: 06 July 2014
Issue Date: August 2014
DOI: https://doi.org/10.1038/nbt.2939
