Compendium of 530 metagenome-assembled bacterial and archaeal genomes from the polar Arctic Ocean

Abstract

The role of the Arctic Ocean ecosystem in climate regulation may depend on the responses of marine microorganisms to environmental change. We applied genome-resolved metagenomics to 41 Arctic seawater samples, collected at various depths in different seasons during the Tara Oceans Polar Circle expedition, to evaluate the ecology, metabolic potential and activity of resident bacteria and archaea. We assembled 530 metagenome-assembled genomes (MAGs) to form the Arctic MAGs catalogue comprising 526 species. A total of 441 MAGs belonged to species that have not previously been reported and 299 genomes showed an exclusively polar distribution. Most Arctic MAGs have large genomes and the potential for fast generation times, both of which may enable adaptation to a copiotrophic lifestyle in nutrient-rich waters. We identified 38 habitat generalists and 111 specialists in the Arctic Ocean. We also found a general prevalence of 14 mixotrophs, while chemolithoautotrophs were mostly present in the mesopelagic layer during spring and autumn. We revealed 62 MAGs classified as key Arctic species, found only in the Arctic Ocean, showing the highest gene expression values and predicted to have habitat-specific traits. The Arctic MAGs catalogue will inform our understanding of polar microorganisms that drive global biogeochemical cycles.

Fig. 1: Metagenomic genome reconstruction of the Tara Oceans Polar Circle expedition.
Fig. 2: Taxonomical annotation and novelty of Arctic Ocean MAGs.
Fig. 3: Potential autotrophy in RuBisCO-coding MAGs.
Fig. 4: Chemolithoautotrophic Arctic Ocean MAGs.
Fig. 5: Composition and biogeography of the 530 Arctic microbial MAGs.
Fig. 6: Expression patterns and metabolic potential of sentinel polar MAGs in the Arctic Ocean.

Data availability

Accession numbers for the data used and generated in this study can be found in Supplementary Table 12, which includes the Arctic MAGs Catalogue and their functional annotation (European Bioinformatics Institute BioStudies ID: S-BSST451) and the co-assembly of metagenomic samples used to generate the metagenomic bins (European Nucleotide Archive PRJEB41575). Accession numbers for the metagenomic and metatranscriptomic samples used in the fragment recruitment analyses can be found in Supplementary Table 13. Publicly available datasets used in this study include the following: CheckM v.1.0.11 (https://github.com/Ecogenomics/CheckM/releases/tag/v1.1.0), GTDB release 89 (https://data.gtdb.ecogenomic.org/releases/release89/), SILVA 132 (https://www.arb-silva.de/documentation/release-132/), KEGG release 89.1 (https://www.genome.jp/kegg/docs/relnote.html) and Pfam database release 31.0 (http://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam31.0/). Source data are provided with this paper.

Acknowledgements

Tara Oceans (which includes both the Tara Oceans and Tara Oceans Polar Circle expeditions) would not exist without the leadership of the Tara Ocean Foundation and the continuous support of 23 institutes (http://oceans.taraexpeditions.org). We thank SHOOK Studio for assistance with designing the figures. This work acknowledges the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S). We thank the commitment of the following sponsors and research funding agencies: the Spanish Ministry of Economy and Competitiveness (project MAGGY, grant no. CTM2017-87736-R and Polar EcoGen PID2020-116489RB-I00), Horizon 2020-Research and Innovation Framework Programme (Atlantic ECOsystems assessment, forecasting & sustainability, grant no. H2020-BG-2019-2), Centre National de la Recherche Scientifique (in particular Groupement de Recherche GDR3280 and the Research Federation for the study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans-GOSEE), European Molecular Biology Laboratory, Genoscope/Commissariat à l’Énergie Atomique et aux Énergies Alternatives, the French Ministry of Research and the French Government’s ‘Investissements d’Avenir’ programmes OCEANOMICS (project no. ANR-11-BTBR-0008), FRANCE GENOMIQUE (project no. ANR-10-INBS-09-08), MEMO LIFE (project no. ANR-10-LABX-54), Paris Sciences et Lettres University (project no. ANR-11-IDEX-0001-02), Eidgenössische Technische Hochschule Zürich and Helmut Horten Foundation, the Swiss National Foundation (project no. 205321_184955), MEXT/JSPS/KAKENHI (project nos. 16H06429, 16K21723, 16H06437 and 18H02279). We also thank the support and commitment of agnès b. and E. Bourgois, the Prince Albert II de Monaco Foundation, the Veolia Foundation, Region Bretagne, Lorient Agglomération, Serge Ferrari, World Courier and King Abdullah University of Science and Technology. The global sampling effort was enabled by countless scientists and crews who sampled aboard the Tara from 2009 to 2013. 
We thank Mercator/Coriolis and ACRI-ST for providing daily satellite data during the expedition. We also thank the countries who graciously granted sampling permissions. All data reported herein are fully and freely available from the date of publication, with no restrictions; all of the analyses, publications and ownership of data are free from legal entanglement or restriction by the various nations whose waters the Tara Oceans expeditions sampled in. This article is contribution number 122 of Tara Oceans.

Author information

Contributions

M.R.-L. developed the methodology, analysed the data, designed the data visualizations and wrote the manuscript. P.S. developed the methodology and analysed the data. C.R.-G., G.S., L.P., F.I., B.C. and M.S. analysed the data. L.Z. and S.C. analysed the data and contributed to the interpretation of the findings. C.P.-A., L.K.-B. and C.B. contributed to the interpretation of the findings and provided critical reading of the manuscript. K.L. and P.W. coordinated all sequencing efforts. D.E. and S.S. provided critical reading of the manuscript. E.K. is the director of the Tara Oceans expedition. C.B., L.K.-B. and M.B. directed the Tara Oceans Polar Circle expedition. The Tara Oceans coordinators conceptualized the research, organized the sampling efforts and revised the manuscript. S.G.A. created the study design, developed the methodology, analysed the data and helped to write the manuscript.

Corresponding author

Correspondence to Silvia G. Acinas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Microbiology thanks Eric Collins, David Pearce and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 NMDS of the 16S miTAG community composition of the 41 Tara Oceans Polar Circle metagenomes.

Colors delimit the 9 groups of samples used for co-assembly in order to build Arctic bins. Shape indicates the ocean layer from which each metagenomic sample was collected.

Extended Data Fig. 2 Taxonomic classification of the 27 partial ribosomal genes encoded in the 530 MQ and HQ Arctic MAGs.

a, Number of Arctic MAGs assigned to each phylum in the Bacteria domain. b, Number of Arctic MAGs assigned to each phylum in the Archaea domain.

Extended Data Fig. 3 Rank-abundance curve of Arctic MAGs in Arctic metagenomes.

MAGs are sorted along the X axis by their accumulated RPKGs in the 37 Arctic metagenomes (including the sub-Arctic North Atlantic) used in this study. MAGs are colored by phylum and those recruiting at least 200 RPKGs are labelled with extended taxonomic annotation. Taxonomic annotation reaches the furthest level of classification for each MAG and the number in parentheses is the MAG’s identification code.

Extended Data Fig. 4 Transcript abundance (RPKM; reads per gene kilobase per million of sequenced reads) of the marker gene coxL (Carbon monoxide dehydrogenase large chain) K03520 from the aerobic carbon-monoxide dehydrogenase.

a, Polar maps with the accumulated metatranscriptomic RPKMs of 9 Arctic MAGs expressing the CO fixing coxL Form I, color-coded by CAFF region. The size of the dot is proportional to the accumulated metatranscriptomic RPKMs. In the dot plot below, RPKMs of 9 Arctic MAGs expressing CO fixing coxL Form I are colored based on taxonomic annotation at the phylum level. Accumulated RPKMs per sample are depicted with a dashed black line. b, Polar maps with the accumulated metatranscriptomic RPKMs of 105 Arctic MAGs expressing coxL Form II, color-coded by CAFF region. The size of the dot is proportional to the accumulated metatranscriptomic RPKMs. Absent maps mean that no recruitment was found for that specific metabolism/domain/layer. In the dot plot below, RPKMs of 9 Arctic MAGs expressing coxL Form II are colored based on taxonomic annotation at the phylum level. Accumulated RPKMs per sample are depicted with a dashed black line.
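The RPKM unit used in this caption (reads per gene kilobase per million sequenced reads) can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the study; the sketch only restates the unit definition given in the figure title and sums per-MAG values into the "accumulated" RPKM plotted per sample.

```python
def rpkm(mapped_reads: int, gene_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of gene per million sequenced reads.

    All three inputs are hypothetical placeholders for values that would
    come from a read-recruitment step; they are not taken from the paper.
    """
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total read count must be positive")
    kilobases = gene_length_bp / 1_000
    millions_of_reads = total_reads / 1_000_000
    return mapped_reads / kilobases / millions_of_reads


def accumulated_rpkm(per_mag_rpkm: dict[str, float]) -> float:
    """Sum the RPKM of one marker gene over all MAGs in a sample."""
    return sum(per_mag_rpkm.values())


# Illustrative values only: 500 reads on a 2,000 bp gene in a 10 M read sample.
example = rpkm(mapped_reads=500, gene_length_bp=2_000, total_reads=10_000_000)
sample_total = accumulated_rpkm({"MAG_a": example, "MAG_b": 5.0})
```

In this sketch the normalisation by gene length comes first and the normalisation by sequencing depth second; the order does not affect the result.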

Extended Data Fig. 5 Pan-Arctic profiles of Tara Arctic Ocean MAGs catalogue.

a, Quantification and taxonomic classification of MAGs based on their pan-Arctic profiles. Stacked barplots represent the number of pan-Arctic MAGs (left) and non pan-Arctic MAGs (right), colored by phylum. MAGs have been classified based on the number of Arctic regions in which they are present, shown on the X axis. The absolute number of MAGs per X axis category can be seen in the top barplots. More details on the pan-Arctic categorization can be found in the Methods section. b, Quantification and taxonomic classification of MAGs with a limited distribution, found only in one Arctic region. Stacked barplots represent the number of non pan-Arctic MAGs present only in one Arctic region, colored by phylum. The different Arctic regions and their season of sampling are shown on the X axis. The absolute number of MAGs per X axis category can be seen in the top barplots.

Extended Data Fig. 6 Differences in estimated minimum growth times and optimal growth temperatures across biogeographic categories, mean 16S rRNA gene copy number across latitude and differences in estimated complete genome sizes and optimal growth temperatures between different niche breadth categories.

a, Distribution of Minimum Generation Times estimated for each individual MAG, grouped by their biogeographical categorization (n = 153 MAGs classified as Arctic only, 123 MAGs classified as Arctic & NAO, 23 MAGs classified as bipolar, 4 MAGs classified as NAO only and 227 MAGs classified as Other latitudes). Data are shown as box plots (Tukey style): the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the horizontal line indicates the median and the whiskers indicate the lowest and highest points within 1.5× the interquartile ranges of the lower (first) or upper (third) quartile, respectively. Data beyond the end of the whiskers are outlying points and are plotted individually. Statistical support was calculated using the two-sided Dunnett-Tukey-Kramer Pairwise Multiple Comparison Test Adjusted for Unequal Variances and Unequal Sample Sizes (DTK) and CI 95. DTK test shows significant differences between the ‘Arctic only’ and the ‘Other latitudes’ MAGs. b, Distribution of Optimal Growth Temperatures estimated for each individual MAG, grouped by their biogeographical categorization (n = 153 MAGs classified as Arctic only, 123 MAGs classified as Arctic & NAO, 23 MAGs classified as bipolar, 4 MAGs classified as NAO only and 227 MAGs classified as Other latitudes). Boxplots describe the data as in a. DTK was performed as in a and shows significant differences (p-value < 0.05) between the ‘Arctic only’ and the ‘Other latitudes’ MAGs. c, Dots correspond to Tara Oceans samples from surface and subsurface chlorophyll maxima and are placed across latitude depending on their estimated number of ribosomal copies (derived from miTAGs, see Methods). d, Distribution of estimated complete assembly size of MAGs based on their niche breadth category. Statistical support was calculated using the two-sided Dunnett-Tukey-Kramer Pairwise Multiple Comparison Test Adjusted for Unequal Variances and Unequal Sample Sizes (DTK) and CI 95. DTK test shows significant differences in estimated complete assembly size (p-value < 0.05) between MAGs classified as habitat specialists and those uncategorised (n = 38 MAGs classified as generalists, 111 MAGs classified as specialists, 381 uncategorised MAGs). Data are shown as box plots (Tukey style): the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the horizontal line indicates the median and the whiskers indicate the lowest and highest points within 1.5× the interquartile ranges of the lower (first) or upper (third) quartile, respectively. Data beyond the end of the whiskers are outlying points and are plotted individually. e, Distribution of optimal growth temperatures of MAGs based on their niche breadth category (n = 38 MAGs classified as generalists, 111 MAGs classified as specialists, 381 uncategorised MAGs). DTK was performed as in a and shows significant differences (p-value < 0.05) between the generalists and the uncategorised MAGs.
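Panel d refers to an estimated complete assembly size, and Extended Data Fig. 7 describes assembly sizes as upscaled to 100% of genome completeness. A minimal sketch of such an upscaling is shown here, assuming a simple proportional correction by a completeness estimate in percent. The function name and the example values are hypothetical, and whether the authors applied any further correction (for example, for contamination) is not stated in this section.

```python
def estimated_complete_size(assembly_size_bp: float, completeness_pct: float) -> float:
    """Scale an observed assembly size to 100% completeness.

    Assumption (not confirmed by the source): the correction is purely
    proportional, i.e. observed size divided by the completeness fraction.
    """
    if not 0 < completeness_pct <= 100:
        raise ValueError("completeness must be in the interval (0, 100]")
    return assembly_size_bp * 100.0 / completeness_pct


# Illustrative values only: a 2.4 Mbp bin estimated to be 80% complete.
full_size = estimated_complete_size(2_400_000, 80.0)
```

Under this assumption, a less complete MAG is scaled up more strongly, so errors in the completeness estimate propagate directly into the estimated genome size.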

Extended Data Fig. 7 Disentangling generalists and specialists within the 530 Arctic MAGs.

a, Distribution of Arctic MAGs based on their mean read recruitment in Arctic metagenomic samples (RPKG, X axis) and their Levins' index (that is, niche breadth, Y axis). The color gradient depicts the occurrence (that is, % of samples where a given MAG is present) in the Arctic metagenomic dataset and the shape indicates the niche breadth category (generalists, specialists and uncategorised). b, Number of habitat generalists (orange), specialists (blue) and uncategorised MAGs (grey) in each biogeographic category, shown as bar plots (n = 530 MAGs examined over 32 Arctic metagenomes). The adjacent box plots show the distribution of assembly sizes within each subcategory (upscaled to 100% genome completeness); statistically significant differences are highlighted with an asterisk (DTK test, p-value < 0.05). Box plots are presented horizontally and in Tukey style: the lower (left) and upper (right) hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the vertical line indicates the median and the whiskers indicate the lowest and highest points within 1.5× the interquartile range of the lower (first) or upper (third) quartile, respectively. Data beyond the end of the whiskers are outlying points and are plotted individually. Statistical support was calculated using the two-sided Dunnett-Tukey-Kramer Pairwise Multiple Comparison Test Adjusted for Unequal Variances and Unequal Sample Sizes (DTK) with 95% confidence intervals. Adjacent stacked bar plots indicate the taxonomic composition at the phylum level. Asterisks in the taxonomic annotation legend indicate phyla from domain Archaea; the lack of an asterisk indicates domain Bacteria. c, Abundances of generalist (n = 38 MAGs; orange), specialist (n = 111 MAGs; blue) and uncategorised (n = 381 MAGs; grey) MAGs in Arctic metagenomes (n = 32 samples, 18 SRF, 7 SCM, 7 MES; filled box plots) and metatranscriptomes (n = 29 samples, 18 SRF, 7 SCM, 4 MES; empty box plots) across the three ocean layers. Box plots describe the data as in b. There are no significant differences between the groups.
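
As a rough illustration of the quantities plotted in panels a and b, the sketch below computes Levins' niche breadth in its standard form, B = 1/Σp_i², together with occurrence and completeness-upscaled assembly size. The function names, the zero detection threshold and the example values are assumptions made for illustration. The exact variant of the index and the cut-offs used to call generalists and specialists are not stated in this legend, so none are implemented here.

```python
def levins_breadth(abundances):
    """Levins' niche breadth B = 1 / sum(p_i^2).

    p_i is the share of a MAG's total abundance (e.g. RPKG) found in sample i.
    B ranges from 1 (all abundance in one sample) to the number of samples
    (perfectly even distribution).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("MAG has no recruitment in any sample")
    return 1.0 / sum((a / total) ** 2 for a in abundances)


def occurrence_pct(abundances, detection_threshold=0.0):
    """Percentage of samples in which the MAG is detected.

    The detection threshold is an assumed parameter, not a value from the study.
    """
    present = sum(1 for a in abundances if a > detection_threshold)
    return 100.0 * present / len(abundances)


def upscaled_size(assembly_size_bp, completeness_pct):
    """Scale an assembly size to the size expected at 100% completeness."""
    return assembly_size_bp * 100.0 / completeness_pct


# Hypothetical RPKG values for one MAG across four samples.
rpkg = [1.0, 1.0, 1.0, 1.0]
print(levins_breadth(rpkg), occurrence_pct(rpkg), upscaled_size(2_000_000, 80))
```

A MAG recruited evenly across all samples reaches the maximum breadth (the number of samples), whereas a MAG confined to a single sample scores 1.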

Source data

Extended Data Fig. 8 Seawater temperature in samples where maximum metagenomic recruitment occurred per MAG, per niche breadth category.

a, Histogram of the number of MAGs in the station where their maximum metagenomic RPKG occurred, colored by niche breadth category and horizontally separated by layer. The heatmap at the bottom, above the X axis, represents the temperature of each sample. b, Distribution of temperatures in those samples where the maximum metagenomic RPKG per MAG occurred, by niche category and layer (n = 38 MAGs classified as generalists and 111 MAGs classified as specialists tested against 32 metagenomes, including 18 SRF, 7 SCM and 7 MES). Data are shown as box plots (Tukey style): the lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles), the horizontal line indicates the median and the whiskers indicate the lowest and highest points within 1.5× the interquartile range of the lower (first) or upper (third) quartile, respectively. Data beyond the end of the whiskers are outlying points and are plotted individually.
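
The two steps described in this legend can be sketched as follows: first find, for each MAG, the temperature of the sample where its RPKG peaks, then summarise a set of such temperatures in Tukey style. This is a minimal sketch under stated assumptions. The sample names and values are hypothetical, and the quartile method (`inclusive` in Python's `statistics.quantiles`) is an assumption that may differ slightly from the plotting software used for the figure.

```python
import statistics


def temperature_at_peak(rpkg_by_sample, temperature_by_sample):
    """Temperature of the sample in which a MAG reaches its maximum RPKG."""
    peak_sample = max(rpkg_by_sample, key=rpkg_by_sample.get)
    return temperature_by_sample[peak_sample]


def tukey_summary(values):
    """Hinges, median, whisker ends and outliers of a Tukey-style box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical RPKG and temperature (degrees Celsius) values for one MAG.
rpkg = {"S1": 0.2, "S2": 1.5, "S3": 0.7}
temperature = {"S1": -1.2, "S2": 3.4, "S3": 0.5}
print(temperature_at_peak(rpkg, temperature))
print(tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```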

Source data

Extended Data Fig. 9 Genes found in specialists but not in generalists.

Quantification of genes annotated against the KEGG database that were found in specialist MAGs but not in generalist MAGs, colored by pathway.
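
The comparison summarised in this figure amounts to a set difference followed by a count per pathway. The sketch below shows one way this could be done. The gene identifiers, pathway labels, MAG names and function names are all hypothetical placeholders, and the study's actual annotation pipeline is not reproduced here.

```python
from collections import Counter


def specialist_only_genes(annotations, category):
    """Genes annotated in at least one specialist MAG and in no generalist MAG.

    annotations: dict mapping MAG name -> iterable of gene identifiers.
    category: dict mapping MAG name -> "generalist", "specialist" or other.
    """
    generalist_genes = set()
    specialist_genes = set()
    for mag, genes in annotations.items():
        if category.get(mag) == "generalist":
            generalist_genes |= set(genes)
        elif category.get(mag) == "specialist":
            specialist_genes |= set(genes)
    return specialist_genes - generalist_genes


def count_by_pathway(genes, pathway_of):
    """Number of genes per pathway; genes without a pathway are 'unassigned'."""
    return Counter(pathway_of.get(g, "unassigned") for g in genes)


# Hypothetical annotations for three MAGs.
annotations = {
    "MAG_1": ["gene_A", "gene_B"],
    "MAG_2": ["gene_B", "gene_C", "gene_D"],
    "MAG_3": ["gene_E"],
}
category = {"MAG_1": "generalist", "MAG_2": "specialist", "MAG_3": "uncategorised"}
pathway_of = {"gene_C": "pathway_X", "gene_D": "pathway_X"}

only_specialists = specialist_only_genes(annotations, category)
print(sorted(only_specialists))
print(count_by_pathway(only_specialists, pathway_of))
```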

Source data

Extended Data Fig. 10 Map with the reference stations with metagenomic and metatranscriptomic samples used in the study.

Samples are colored based on the expedition. Supplementary Table 4 contains more details about environmental metadata of these stations.

Source data

Supplementary information

Supplementary Information

Supplementary Figs. 1–9 and text on taxonomical classification of Arctic MAGs.

Reporting Summary

Peer Review Information

Supplementary Tables

Source data

Source Data Fig. 1

Source data for the generation of all plots in Fig. 1.

Source Data Fig. 2

Source data for the generation of all plots in Fig. 2.

Source Data Fig. 3

Source data for the generation of all plots in Fig. 3.

Source Data Fig. 3

Sequence alignment in FASTA format (RoyoLlonch_NatMicrobiol_2021_SOURCEDATA_Figure_3A_Rubisco_alignment.fasta) and phylogenetic tree in Newick format (RoyoLlonch_NatMicrobiol_2021_SOURCEDATA_Figure_3A_Rubisco_phylogenetic_tree.nwk) for the generation of Fig. 3a.

Source Data Fig. 4

Source data for the generation of all plots in Fig. 4.

Source Data Fig. 5

Source data and statistical data for the generation of all plots in Fig. 5.

Source Data Fig. 6

Source data for the generation of all plots in Fig. 6.

Source Data Extended Data Fig. 1

Statistical source data for the generation of Extended Data Fig. 1.

Source Data Extended Data Fig. 2

Source data for the generation of all plots in Extended Data Fig. 2.

Source Data Extended Data Fig. 3

Source data for the generation of Extended Data Fig. 3.

Source Data Extended Data Fig. 4

Source data for the generation of all plots in Extended Data Fig. 4.

Source Data Extended Data Fig. 5

Source data for the generation of all plots in Extended Data Fig. 5.

Source Data Extended Data Fig. 6

Source data and statistical data for the generation of all plots in Extended Data Fig. 6.

Source Data Extended Data Fig. 7

Source data and statistical data for the generation of all plots in Extended Data Fig. 7.

Source Data Extended Data Fig. 8

Source data for the generation of all plots in Extended Data Fig. 8.

Source Data Extended Data Fig. 9

Source data for the generation of Extended Data Fig. 9.

Source Data Extended Data Fig. 10

Source data for the generation of Extended Data Fig. 10.

About this article

Cite this article

Royo-Llonch, M., Sánchez, P., Ruiz-González, C. et al. Compendium of 530 metagenome-assembled bacterial and archaeal genomes from the polar Arctic Ocean. Nat Microbiol 6, 1561–1574 (2021). https://doi.org/10.1038/s41564-021-00979-9

  • DOI: https://doi.org/10.1038/s41564-021-00979-9
