The Antarctic is among the most isolated and dynamic regions on Earth, because of the Antarctic circumpolar current, coupled with massive and seasonally variable sea-ice cover (Arrigo et al., 1997; Arrigo and Thomas, 2004; Arrigo, 2014). The Southern Ocean plays a disproportionately large role in the regulation of Earth's climate and biogeochemical cycles (Boyd, 2002). As a repository of macronutrients, it is unmatched among the world's oceans, and although high nutrient-low chlorophyll conditions prevail in the permanently open-ocean area between the Polar Front and Seasonal Ice Zones, the Seasonal Ice Zones and coastal areas are more productive and support intense diatom-dominated blooms (Tréguer and Jacques, 1992). Consequently, the ecology of the Southern Ocean plays a key role in the global biological carbon pump and nutrient cycling (Murphy et al., 2012), mediated primarily by autotrophic phytoplankton, particularly diatoms.

Over the last half century, dramatic changes associated with rapid warming have occurred in the western Antarctic Peninsula (wAP), with implications for global biogeochemical cycling (Meredith and King, 2005; Clarke et al., 2007; Montes-Hugo et al., 2009; Ducklow et al., 2007, 2013). Regional warming rates in the wAP are among the most rapid globally, causing dramatic sea ice decline, increased glacial input and massive loss of permanent ice shelves. Primary production is intimately linked to sea ice dynamics; meltwater from receding ice edges stabilizes the water column, allowing massive phytoplankton blooms. The effects of warming within the wAP vary regionally: ice retreat in northern areas reduces primary production because of wind-induced mixing of the water column, whereas to the south, the increased extent of summer ice-free areas has improved bloom-forming conditions (Montes-Hugo et al., 2009).

Diatoms contribute up to 25% of global primary carbon fixation (Field et al., 1998), making them the single most important phytoplankton group. As phytoplankton stocks (Boyce et al., 2010) and diatom production (Bopp et al., 2005) decline globally under climate change, carbon export to deeper layers via diatom blooms will be affected (Korb et al., 2010). Additionally, in the Southern Ocean, intense grazing on diatoms by krill efficiently transfers biomass to higher trophic levels (Smetacek et al., 2004).

Owing to the global importance of diatoms, genome sequencing projects (Armbrust et al., 2004; Bowler et al., 2008), genome-wide expression analyses (Allen et al., 2008; Mock et al., 2008; Thamatrakoln et al., 2012) and, most recently, environmental metatranscriptomics (Marchetti et al., 2012; Toseland et al., 2013), have provided the most extensive genomic baseline for any phytoplankton group. However, to date, no environmental genomic comparison between Southern Ocean communities has been reported. Metatranscriptomes are a powerful comparative and experimental tool to gain detailed insight into the metabolic and physiological status of communities (Marchetti et al., 2012; Toseland et al., 2013). Here, we present comparative metatranscriptomes from diverse diatom-rich communities in contrasting coastal habitats of the wAP and western Weddell Sea (WDS). Our data illustrate the flexibility of diatom transcriptomes in natural communities, revealing functional differentiation across habitats, particularly in nutrient acquisition, irradiance and low temperature responses.

Materials and methods

Sampling and environmental data

Samples were collected in the Antarctic Peninsula–Bellingshausen Sea–Weddell Sea region in the austral summer of 2009 (Figure 1). Water from the Bransfield Strait (BFS; 30 m) and WDS (6–45 m) was collected from a conductivity, temperature and depth instrument array using 12 l Niskin bottles, together with environmental data. From the Wilkins Ice Shelf (WKI), sea ice collected from floes was melted with cold seawater (0 ºC; 0.2 μm-filtered). Both WKI (sunrise 06:30, sunset 22:30 h) and WDS (sunrise 05:00, sunset 23:00 h) samples were collected just after midday, whereas BFS was collected near dusk at 21:30 h (sunrise 04:00, sunset 22:00 h). Within 30–60 min after collection, samples were filtered (5 μm pore-etched polycarbonate filters) and frozen in liquid nitrogen for storage and transportation.

Figure 1

Sampling locations and physico-chemical conditions. (a) Map of the Antarctic Peninsula showing the locations for metatranscriptome sampling in the BFS (red), the WDS (green), and the WKI/Bellingshausen Sea (blue). (b) Depth profile for salinity, temperature, oxygen and fluorescence at BFS; the shaded area at ca. 30 m indicates the depth for metatranscriptome sampling, and the major nutrient concentrations at this depth are given in the insert box. (c) Depth profile for WKI at the site where sea ice samples for metatranscriptome analysis were taken. Near-surface water column nutrient concentrations are shown in the insert box. (d) Depth profile at WDS showing the broad depth range for the metatranscriptome sample; average nutrient concentrations across this range are given in the insert box.

Phosphorus, nitrate + nitrite and silicate concentrations were analysed using standard methods (Hansen and Koroleff, 1999) in a Bran Luebbe AA3 autoanalyser (Norderstedt, Germany). Ammonium was measured spectrophotometrically, based on the reaction with ortho-phthalaldehyde and sulphite following the procedure and recommendations described by Kerouel and Aminot (1997). Dissolved iron concentrations in surface waters were sampled and analysed using trace-metal clean techniques (cf. Tovar-Sánchez et al., 2010).

For cell counts, 2–3 l of seawater was concentrated by gentle pressure for <30 min (Lasternas and Agustí, 2010). Duplicate 10 ml aliquots were filtered onto 2 μm pore size black filters, fixed with glutaraldehyde (1% final concentration), and counted under the epifluorescence microscope (Zeiss Axioplan Imaging, Carl Zeiss Iberia S.L., Madrid, Spain).

Library preparation and sequencing

Filters were vortexed in buffer RLT Plus (1200 μl per 47 mm filter) containing 100 μl of glass beads to disrupt the cells. Lysates were transferred onto QIAshredder columns for homogenization. The centrifuged lysate was pipetted onto an AllPrep spin column and total RNA was purified with the AllPrep DNA/RNA Mini Kit (Qiagen, Izasa Portugal, LDA, Carnaxide, Portugal), including DNase digestion. RNA quantity and quality were assessed by spectrophotometry (Nanodrop; Fisher Scientific, Porto Salvo, Portugal) and electrophoresis, respectively.

Full-length double-stranded cDNA (ds-cDNA) was synthesized from 250 ng of total RNA (SMARTer PCR cDNA Synthesis Kit; Clontech, Enzifarma S.A., Lisbon, Portugal). First strand cDNA templates were amplified by long-distance PCR (Advantage 2 PCR Kit; Clontech), optimized to avoid over-amplification (20 cycles). Replicate PCR reactions per library were pooled and purified (MinElute PCR Purification Kit; Qiagen) for sequencing.

A total of 560K high-quality GS-FLX Titanium 454 reads were sequenced at Biocant (Cantanhede, Portugal), resulting in 532K reads after cleaning and removal of SMART adaptors (TagCleaner; Schmieder et al., 2010). Reads were screened for rRNA (Meta-RNA 1.0; Huang et al., 2009) using HMMER 3.0 (Eddy, 1998); significant hits were removed from the mRNA dataset and used for taxonomy binning and biodiversity analyses (Table 1).

Table 1 Sequence, assembly and BLASTx annotation information for three western Antarctic Peninsula phytoplankton community metatranscriptomes from the Bransfield Strait (BFS), Wilkins sea ice (WKI) and Weddell Sea (WDS) samples

Sequence assembly and annotation

Sequences were assembled using MIRA 3.0 (Chevreux et al., 2004). BLASTx (E-value ≤ 10⁻⁴) was performed against a custom protein database (NCBI RefSeq plus additional collections from the Joint Genome Institute (JGI) for the diatom Fragilariopsis cylindrus, the pelagophyte Aureococcus anophagefferens, the haptophyte Emiliania huxleyi and the crustacean Daphnia pulex). Taxonomic assignment of contigs with significant BLASTx scores was performed using MEGAN (v.4.6; Huson et al., 2011) with minimum bit score = 50, minimum support = 5 contigs and winning bit score = 200.

Predicted protein-coding sequences were determined from significant top BLASTx matches. To account for reading frame shifts introduced by NGS, multiple high-scoring segment pairs in top BLASTx matches were ‘annealed’ according to maximum and minimum overlap criteria (±5 nucleotides) to improve annotation success against protein databases. Functional annotation was performed against the KEGG (Ogata et al., 1999), Pfam and KOG databases. Read counts, taxonomy assignment and functional annotation data were stored in a relational database (MySQL) for further analysis.

Total reads assigned to diatoms (Bacillariophyta) by MEGAN were used to define this taxonomic subset. Library size normalization between communities was then performed prior to enrichment analyses for diatoms. Significant differences in normalized read counts for annotation terms or pathways were identified using the R statistic (Stekel, 2000), with an empirically determined cut-off of R > 8 (≈ 98% believability).
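As an illustration only, the R statistic can be sketched as a log-likelihood ratio that compares the observed read count of a term in each library with the count expected if the term had the same frequency everywhere. The function name, the use of the natural logarithm and the read counts below are assumptions made for this sketch; they are not taken from the analysis pipeline used in this study.

```python
import math

def stekel_r(counts, library_sizes):
    """Log-likelihood-ratio R statistic for one annotation term across libraries.

    counts[i] is the number of reads assigned to the term in library i and
    library_sizes[i] is the total number of reads in that library.
    """
    total_count = sum(counts)
    total_size = sum(library_sizes)
    if total_count == 0:
        return 0.0
    # Pooled frequency of the term under the null hypothesis of equal expression
    pooled_frequency = total_count / total_size
    r = 0.0
    for x, n in zip(counts, library_sizes):
        if x > 0:  # a zero count contributes nothing to the sum
            r += x * math.log(x / (n * pooled_frequency))
    return r

# Hypothetical read counts for one term in three equally sized libraries
library_sizes = [100_000, 100_000, 100_000]
even = stekel_r([50, 50, 50], library_sizes)
skewed = stekel_r([120, 20, 10], library_sizes)
```

In this toy case the evenly distributed term gives R close to 0, whereas the skewed term gives R well above the cut-off of 8. Because the library sizes enter the expected counts directly, the sketch takes raw counts; applying it to counts that were already normalized would require equal nominal library sizes.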

Community structure, diversity and taxonomy

Large and small subunit rRNA sequence reads were compared by BLASTn against the Silva database (release 108, reference small subunit and large subunit rRNAs). After taxonomic assignment and binning (MEGAN), the results were represented graphically (KronaTools).

Bacillariophyta (diatom) diversity was studied within a phylogenetic framework (Kembel et al., 2011). In total, 20 ribosomal protein-coding genes were selected (Supplementary Table S2) based on evolutionary conservation, high expression levels and low copy number in sequenced diatoms. Contigs were examined and edited when necessary (CodonCode Aligner v. 1.6, Dedham, MA, USA) and those with 98% nucleotide identity merged into a single operational taxonomic unit. Homologous protein sequences were identified by BLASTp and/or genome annotations from four diatom genomes (Thalassiosira pseudonana, Phaeodactylum tricornutum, Pseudo-nitzschia multiseries and F. cylindrus) and the outgroups Ectocarpus siliculosus (brown alga) and A. anophagefferens (pelagophyte). Each protein data set was then aligned (MAFFT v.7, E-INS-i algorithm; Katoh and Standley, 2013).

Phylogenetic relationships for each protein were inferred by maximum likelihood (PhyML v3.0.1; Guindon and Gascuel, 2003), with node support estimated by approximate likelihood ratio tests under the LG evolutionary model (Le and Gascuel, 2008). These alignments were then concatenated into a single 2781 amino acid alignment stored in 20 partitions. Phylogenetic relationships were then inferred by maximum likelihood for the concatenated dataset. Node support was estimated by approximate likelihood ratio test and bootstrap analyses (1000 replicates). Phylogenetic reconstructions were represented using the ape-package v.2.5-1 (Paradis et al., 2004) in R.

Metatranscriptome assembly quality

The accuracy of the assembly was tested by RT-PCR amplification of 12 predicted community-specific gene targets. Total RNA was prepared from replicate filters as for NGS. Integrity of DNase-treated RNA was checked on an Experion RNA chip (Bio-Rad Laboratories, LDA, Amadora, Portugal), and oligo-dT primed cDNA was synthesized with SuperScript III. PCR with primers designed using BatchPrimer3 amplified small (ca. 400 bp) transcript fragments from 2 μl of 1/10 diluted cDNA in 25 μl reactions containing GoTaq (Promega), 0.2 μM of each primer and 2.5 mM Mg. To check for gDNA contamination, 2 μl of 1/100 diluted total RNA were used in control PCRs. The PCR program was 2 min at 94 °C, 34 cycles of 30 s at 94 °C, 30 s at 58 °C and 30 s at 72 °C, and a final extension of 2 min at 72 °C. PCR products were run on a 2% agarose gel alongside the ‘GeneRuler 100bp’ ladder (Fisher Scientific).

Results and discussion

Sample diversity and community features across habitats

After removing rRNA, 452 694 sequence reads were assembled into 58 101 contigs and 5448 singletons, with an N50 contig length of 398 bp (Table 1). Ribosomal reads accounted for 7.4–18.7% of the total reads, depending on the community (Table 1). Sequences were deposited in the NCBI short read archive (Accessions SRX727358, SRX727361, SRX727362). The resulting ‘global’ metatranscriptomic assembly has the advantages of (i) maximizing read numbers for contig assembly and transcriptome coverage and (ii) allowing co-assembly of homologous transcripts from different samples (subsequent analysis of normalized reads/sample yields relative enrichment estimates).
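For readers unfamiliar with the N50 metric, it is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch follows; the function name and the toy contig lengths are illustrative assumptions and are unrelated to the assembly reported here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(contig_lengths, reverse=True)
    half_of_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        # The first contig that brings the cumulative length to half of the
        # assembly defines the N50.
        if running_total >= half_of_total:
            return length
    raise ValueError("no contig lengths given")

# Toy example: total length 1500 bp, so the threshold is 750 bp
example_n50 = n50([100, 200, 300, 400, 500])
```

In the toy example the two longest contigs (500 + 400 bp) are the first to exceed 750 bp, so the N50 is 400 bp.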

Community composition based on rRNA reflected the ecosystems from which the samples originated (Figure 1). The western BFS, situated on the continental shelf, was the warmest (about 2.0 °C at the surface) and most 'oceanic' (Figure 1a), with a mixed layer extending to 40 m depth. The converging water masses and complex hydrology support moderate primary production, often dominated by small flagellates (Phaeocystis, Cryptomonas) rather than diatoms (Mura et al., 1995; Varela et al., 2002). Sea ice floes adjacent to the WKI were sampled immediately following massive fragmentation and loss of ≈425 km2 of the WKI (February 2009). Here a cold low salinity surface layer dominated, typical of the seasonal ice zone (Figure 1c). In the WDS, the water column at the post-bloom sampling time displayed a stable mixed layer and strongly developed pycnocline (Figure 1d), but had been more stratified near the bloom peak 3 weeks earlier (Supplementary Figure S1). The fluorescence signal clearly indicated that the algal planktonic community was beginning to sink below the pycnocline (Figure 1d).

Bacillariophyta (diatoms) dominated eukaryotic rRNA reads in BFS (67%) and WDS (73%) communities (Supplementary Figures S2a and c), while heterokont silicoflagellates (Dictyochophyceae; 51%) replaced diatoms (26%) as the dominant taxa sequenced from WKI sea ice (Supplementary Figure S2b). The silicoflagellates were overwhelmingly Pedinellales, mixotrophic algae (Sekiguchi et al., 2003) associated with ice-covered habitats (Thomsen, 1988), which constituted <1% in the other communities. Dinoflagellates accounted for 7% and 8% of sequences from BFS and WDS, respectively, and only 2% in sea ice (WKI). Cryptophyta were largely restricted to BFS (2%). Heterokont algae (chiefly diatoms and silicoflagellates), with 70–75% of rRNA reads, therefore appear to contribute overwhelmingly to primary productivity in these communities. Interestingly, fungal sequences featured in both BFS and WKI (10% and 5%, respectively), but not in WDS (<1%). Chytrids represented 78% of the fungal sequences in BFS, while reads from WKI were mainly assigned to Basidiomycetes (ca. 40%) and only 10% to chytrids. Fungi have rarely been reported from Antarctic waters, and deserve future attention (López-García et al., 2001), particularly given the potential role of chytrids as parasites on phytoplankton, likely affecting turnover, succession and nutrient recycling (Kagami et al., 2014). Ciliates in WDS (4% rRNA) may play a significant role, perhaps feeding on the abundant bacteria present, and on other algae.

Contrasting phylogenetic and functional diversity within diatoms

Taxonomic assignment of rRNA reads from the metatranscriptome provided useful information. Bacillariophyceae (=pennate diatoms) dominated BFS and WKI with 60–65% of assigned reads, while in the WDS 50% were Coscinodiscophyceae (=centric diatoms) with only 29% Bacillariophyceae (Supplementary Figure S2).

The phylogenetic method used to describe diatom diversity (based on Kembel et al., 2011) circumvents single gene amplicon approaches (for example, 18S-rRNA). Alignment of individual ribosomal protein contigs against concatenated sequence scaffolds derived from whole genomes revealed divergent communities with distinct taxonomic compositions (Figure 2a). Phylogenetic assignment is limited by the availability of only four diatom genomes, but is consistent with diversity estimates by microscopy (Supplementary Figure S3). Pseudo-nitzschia sp. dominated the BFS sample, and most sequences from BFS clustered near Pseudo-nitzschia multiseries on the tree. Sequences from sea ice (WKI) were more dispersed over the tree, but many clustered near the sea ice diatom F. cylindrus, whose genus (Fragilariopsis) was abundant in our sample. Higher diversity was found in WDS, but most sequences were concentrated near the centric diatom T. pseudonana (Figure 2a).

Figure 2

(a) Maximum likelihood (ML) phylogenetic analysis of diatoms from the western Antarctic Peninsula/WDS communities; Bransfield Strait (BFS; red), Weddell Sea (WDS; green) and Wilkins sea-ice (WKI; blue). The ML tree is based on protein-coding sequences for 20 ribosomal protein genes identified in the metatranscriptome assembly, aligned against the concatenated reference proteins from the genomes of the stramenopiles E. siliculosus (brown alga), A. anophagefferens (pelagophyte), P. tricornutum, Pseudo-nitzschia multiseries, F. cylindrus and T. pseudonana (diatoms). The reference species are indicated by name on the tree, and their position is shown by an asterisk. The size of the pie at each node is proportional to the number of reads identified in the contig, and the colour(s) indicates the community from which the sequences originate. (b) Rarefaction analysis of the same ribosomal protein dataset used for tree construction. Contigs for each gene were considered independent taxonomic units (TU) at 98% identity. Solid lines show the expected mean TU richness, dotted lines show the standard deviation.

Whole-assembly metrics of sequence similarity support the phylogenetic results; very few contigs contained reads from all three communities (0.6% of contigs, 3.2% of reads), with the large majority uniquely assigned to a single community (93% of contigs, 82% of reads) (Figure 3a). The accuracy of the assembly was checked using RT-PCR, in which 8 out of 12 community-specific contigs tested showed community-specific amplification (Supplementary Figures S4 and S5).

Figure 3

Venn diagrams illustrating the relationship between sequence similarity and functional assignment in the metatranscriptome assembly for diatom communities from the West Antarctic Peninsula; BFS, WDS and WKI. (a) Numbers of putative protein-coding contigs (bold) and sum of the total reads (in parentheses) assigned to Bacillariophyta (using MEGAN and based on NCBI taxonomy), and their distribution between the three communities. (b) Functional annotation of Bacillariophyta. The number of annotation terms (KEGG and Pfam non-redundant set) is shown in bold, and the corresponding total number of reads in parentheses.

Unsurprisingly, functional similarity between the three diatom communities appeared to be much greater than their taxonomic relatedness (cf. Venn diagrams; Figures 3a and b). Overall, 54% of KEGG and Pfam annotations (91% of reads) were shared between at least two communities, including a core set (27% of terms, 77% of reads) common to all three communities (Figure 3b). Nevertheless, 46% of functional terms were unique to one community (ca. 15% unique terms per community), indicating considerable habitat specificity in functional processes and constraints.
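The shared, core and unique fractions reported above follow from simple set arithmetic on the annotation terms detected in each community. The sketch below shows one way such a summary could be computed; the function name and the toy term sets are hypothetical and do not reproduce the data behind Figure 3b.

```python
from collections import Counter

def venn_summary(terms_by_community):
    """Fractions of annotation terms that are unique, shared or core.

    terms_by_community maps a community name to the collection of
    annotation terms (for example, KEGG or Pfam identifiers) detected in it.
    """
    occurrences = Counter()
    for terms in terms_by_community.values():
        occurrences.update(set(terms))  # count each term once per community
    n_terms = len(occurrences)
    by_presence = Counter(occurrences.values())
    n_communities = len(terms_by_community)
    return {
        "unique": by_presence[1] / n_terms,              # found in one community only
        "shared": (n_terms - by_presence[1]) / n_terms,  # found in at least two
        "core": by_presence[n_communities] / n_terms,    # found in all communities
    }

# Hypothetical annotation terms, for illustration only
toy_terms = {
    "BFS": {"A", "B", "C", "D"},
    "WKI": {"A", "B", "E"},
    "WDS": {"A", "C", "F", "G"},
}
summary = venn_summary(toy_terms)
```

With these toy sets, seven distinct terms are present: four are unique to one community, three are shared by at least two, and one (term "A") is common to all three. The same partition weighted by read counts rather than term counts would give the read-based percentages.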

Variation in diatom metabolic pathways across communities

The number of reads mapped to central metabolic pathways varied across communities (Figure 4a). Most importantly, key pathways for carbohydrate (glycolysis/gluconeogenesis and pentose phosphate cycle) and energy metabolism (oxidative phosphorylation, photosynthesis, carbon fixation) were under-represented in sea ice (WKI) relative to pelagic diatoms (BFS and WDS). Hence, a potentially depleted supply of reductant via electron transport pathways in sea ice diatoms was matched by downregulation of both carbon fixation (Calvin/Benson cycle) and the oxidative pentose phosphate cycle, indicating restricted flow of energy into carbohydrates compared with pelagic communities. Reduced carbohydrate assimilation due to low temperature and irradiance may be common in diatoms inhabiting brine channels on the underside of sea ice (reviewed by Thomas and Dieckmann, 2002; Mock and Thomas, 2008; Arrigo, 2014).

Figure 4

Functional assignment of diatom metatranscriptome sequences to metabolic pathways. (a) Total numbers of sequence reads (normalized by library size) assigned to KEGG Carbohydrate, Energy, Lipid, Nucleotide and Amino acid metabolic pathways. Each of the pathways shown differs in expression between the three communities (R-statistics shown to the right of the plot; Stekel, 2000). (b) Ternary plot showing differentially expressed KEGG orthologues in carbohydrate, lipid and energy metabolic pathways. Orthologues are represented by circles whose size is proportional to read number. The position of an orthologue relative to the vertices of the triangular plot indicates its relative expression by each community sample. ACAC, acetyl-CoA carboxylase/biotin carboxylase; ADHC, alcohol dehydrogenase (cytochrome c); ADHN, alcohol dehydrogenase (NADP+); ATP6V0A, V-type H+-transporting ATPase subunit I; ATPF0B, F-type H+-transporting ATPase subunit b; ATPF0G, F-type H+-transporting ATPase subunit gamma; BR, bacteriorhodopsin; CA, carbonic anhydrase; COX2, cytochrome c oxidase subunit II; CSH, chitin synthase; CYB5R, cytochrome-b5 reductase; DLD, dihydrolipoamide dehydrogenase; FABB, 3-oxoacyl-[acyl-carrier-protein] synthase I; FAD2, omega-6 fatty acid desaturase (delta-12 desaturase); FADS2, fatty acid desaturase 2 (delta-6 desaturase); FBP, fructose-1,6-bisphosphatase I; FLD, flavodoxin I; FNR, ferredoxin-NADP+ reductase; GADPH, glyceraldehyde 3-phosphate dehydrogenase; GFPT, glucosamine-fructose-6-phosphate aminotransferase (isomerizing); GLCD, glycolate oxidase; GOGAT, glutamate synthase (ferredoxin); ICL, isocitrate lyase; LHCA, light-harvesting complex I chlorophyll a/b binding protein 1; LHCA, light-harvesting complex I chlorophyll a/b binding protein 2; LHCA, light-harvesting complex I chlorophyll a/b binding protein 3; LHCB, light-harvesting complex II chlorophyll a/b binding protein 1; LHCB, light-harvesting complex II chlorophyll a/b binding protein 4; LHCB, light-harvesting complex II chlorophyll a/b binding protein 5; MBOAT, lysophospholipid acyltransferase 1/2; MMSA, methylmalonate-semialdehyde dehydrogenase; NASB, nitrite reductase (NAD(P)H) large subunit; NASB, nitrite reductase (NAD(P)H) small subunit; NDUFAB1, NADH dehydrogenase (ubiquinone) 1 alpha/beta subcomplex 1; NDUFS3, NADH dehydrogenase (ubiquinone) Fe-S protein 3; NR, nitrate reductase (NADH); PCY, plastocyanin; PETC, cytochrome b6-f complex iron-sulfur subunit; PGK, phosphoglycerate kinase; PK, pyruvate kinase; PPA, inorganic pyrophosphatase; PRK, phosphoribulokinase; PSAE, photosystem I subunit IV; PSBO, photosystem II oxygen-evolving enhancer protein 1; PSBQ, photosystem II oxygen-evolving enhancer protein 3; PSBU, photosystem II PsbU protein; SBP, sedoheptulose-bisphosphatase; SLD, delta8-fatty-acid desaturase; TKT, transketolase; TPI, triosephosphate isomerase.

Read counts for carbohydrate, lipid and energy metabolic pathway orthologues confirmed the depletion of general metabolic functions in WKI sea ice; most orthologues lie along the BFS–WDS axis (ternary plot, Figure 4b). Two highly expressed orthologues were transketolase (TKT; a key Calvin cycle and pentose phosphate pathway enzyme, and link to other carbohydrate pathways) and glyceraldehyde 3-phosphate dehydrogenase (GADPH; key enzyme in glycolysis), each of which had four- to fivefold more reads in BFS and WDS than in WKI. Other photosynthesis and Calvin cycle genes were overexpressed in either or both pelagic communities compared with sea ice (Figure 4b; Supplementary Table S1).
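The position of an orthologue in a ternary plot is given by its share of normalized reads in each of the three communities. The following sketch shows how such coordinates could be derived and projected onto a plane; the function names, the choice of triangle vertices and the read counts are assumptions for illustration and do not describe the plotting software used for Figure 4b.

```python
import math

def ternary_fractions(bfs, wki, wds):
    """Share of an orthologue's normalized reads found in each community."""
    total = bfs + wki + wds
    if total == 0:
        raise ValueError("orthologue has no reads in any community")
    return (bfs / total, wki / total, wds / total)

def ternary_xy(bfs, wki, wds):
    """Project the fractions onto a triangle with assumed vertices
    BFS = (0, 0), WDS = (1, 0) and WKI = (0.5, sqrt(3) / 2)."""
    _, f_wki, f_wds = ternary_fractions(bfs, wki, wds)
    x = f_wds + 0.5 * f_wki
    y = (math.sqrt(3) / 2) * f_wki
    return (x, y)

# Hypothetical normalized read counts with 4.5-fold fewer reads in sea ice
fractions = ternary_fractions(450, 100, 450)
point = ternary_xy(450, 100, 450)
```

An orthologue with equal reads in BFS and WDS and few reads in WKI, as in this made-up example, has a small WKI fraction (0.1) and therefore plots close to the BFS–WDS edge of the triangle, midway between the two pelagic vertices.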

Photosynthesis-related transcripts were under-represented, whereas photosynthetic antenna proteins were over-represented, in WKI sea ice diatoms (Figures 4a and b). Further annotation revealed that most (ca. 65%) of the antenna protein reads encoded stress-related light-harvesting complex proteins (LHCSR or LI818 clade). Psychrophilic diatoms adjust their capacity for non-photochemical quenching at low temperatures (Mock and Hoch, 2005) and LHCSR proteins are thought to function in photoprotective energy dissipation, analogous to PsbS in higher plants, a gene missing from diatom genomes (Zhu and Green, 2008). However, although some LHCSR family members in T. pseudonana respond to high light conditions (Zhu and Green, 2008), not all are stress-induced (Bailleul et al., 2010), and they probably play other important and integral roles in mediating diatom responses to light. LHCSR-like proteins are also important in WDS, accounting for ca. 50% of antenna protein reads (Supplementary Table S2), but for only ca. 1% in BFS (where diurnal expression changes cannot be excluded). BLASTx analysis suggests over-representation of centric (Thalassiosira-like) sequences in the WDS, and Fragilariopsis-like sequences in the WKI (Supplementary Table S2). Furthermore, using RT-PCR, we confirmed the specific expression of a WKI LHCSR transcript in sea ice (Supplementary Figure S5).

Although diatom genomes encode the enzymes necessary for classical C4 metabolism (Reinfelder, 2011), subcellular localizations predict futile inorganic carbon cycling and ATP consumption, and a possible role in excess energy dissipation and/or cellular pH homeostasis (Kroth et al., 2008; Haimovich-Dayan et al., 2012). Although not abundant, the key C4 enzyme pyruvate-orthophosphate dikinase (PPDK) was uniquely expressed in WKI, and phosphoenol pyruvate carboxylase (PEPC) was also over-represented (Supplementary Table S1). Sea ice diatoms therefore seem to activate several distinct (but interdependent) mechanisms to deal with excitation pressure arising from high irradiance/low temperature and/or carbon limitation.

Bacteriorhodopsin-like proteins (Pfam PF01036; KEGG orthologue K04641) related to F. cylindrus (Protein id 267528; see also Marchetti et al., 2012) were identified in 29 contigs. Read numbers were highest in BFS and WDS (BR; Figure 4b), with greater sequence diversity in the latter (Supplementary Figure S6). Enrichment was much lower (ca. fivefold) in WKI. Bacteriorhodopsins are light-dependent proton pumps in archaea and eubacteria, and related proteins occur in some eukaryotes (Waschuk et al., 2005; Wada et al., 2011). Their presence in marine eukaryotic phytoplankton genomes is thought to result from lateral gene transfer from bacteria (Slamovits et al., 2011). Although their role is still unclear, bacteriorhodopsins could function under prevailing low-Fe conditions as an alternative mechanism for ATP generation (Marchetti et al., 2012).

Fatty acid biosynthesis (lipid metabolism) was strongly overexpressed in BFS (Figure 4a), and only this community expressed the full KEGG complement of genes reported for diatoms (Supplementary Table S1). Particularly striking was the 5–10-fold enrichment of acetyl-CoA carboxylase (ACAC, Figure 4b), which catalyzes one of the initial steps in fatty acid biosynthesis. Diatoms accumulate lipids under nutrient-limited conditions (N, Si or Fe; Roessler, 1988; Mock and Kroon, 2002a; Allen et al., 2008), a response that is particularly relevant in light of the enriched transport functions (see following section).

Several fatty acid desaturase transcripts (delta-12 desaturase (FAD2), delta-6 desaturase (FADS2) and delta-8 desaturase (SLD)) were over-represented in WDS > WKI > BFS (Figure 4b). The biosynthesis of polyunsaturated fatty acids and membrane remodelling by desaturases may be critical for regulating membrane fluidity at low temperatures (Wada et al., 1990; Nishida and Murata, 1996). Polyunsaturated fatty acids or their aldehyde derivatives are also important in signalling and grazer defence (Pohnert, 2002; Vardi et al., 2008), which might be especially important during late bloom conditions.

Transporter and environmental sensing genes as indicators of physiological status

The chimeric genomes of diatoms confer an unrivalled flexibility among phytoplankton groups to environmental conditions, such as spatio-temporal variability in nutrient levels (Armbrust et al., 2004; Bowler et al., 2008). Here, we identified a wide variety of diatom nutrient acquisition and transport genes, of which 34% (18 of 53 functional terms) showed differential enrichment between communities (Table 2). Ammonium and urea transporters were enriched in the Pseudo-nitzschia-dominated BFS community, suggesting that reduced forms of N are important in this habitat, despite the apparent sufficiency of nitrate in the water column (Figure 1b). Silicon transporters were between 4-fold (versus WKI) and 50-fold (versus WDS) enriched in BFS. The energy required for silicic acid uptake appears to derive from respiratory rather than photosynthetic activity (Martin-Jézéquel et al., 2000), consistent with enrichment of both TCA cycle genes (BFS > WKI > WDS; Figure 4a) and vacuolar (V-type) ATPase subunits that appear closely involved in silica biomineralization processes (Vartanian et al., 2009; Supplementary Table S1). Although transcript and protein levels of silicon transporters are loosely correlated (Thamatrakoln and Hildebrand, 2007), silicic acid uptake is closely coordinated with cell cycle activity and division. The Pseudo-nitzschia spp. in the BFS community were mitotically active, consistent with enhanced silicic acid uptake relative to species in WKI and particularly WDS communities.

Table 2 Differential enrichment of diatom transporters in metatranscriptomes of western Antarctic Peninsula communities

Overall, ATP-binding cassette (ABC) superfamily members were enriched in WKI. ABC transporters are multicomponent active (ATP-dependent) permeases that take up a broad range of substrates. The specific affinities of ABC members identified here are largely unknown, but their relative enrichment suggests a reliance on energy-dependent uptake of certain nutrients by sea ice diatoms. The major facilitator superfamily and solute carrier (SLC) membrane transporters are largely solute exchangers operating in response to chemiosmotic gradients. Of these, a nitrate/nitrite transporter was ca. 10-fold enriched in WKI, while glutamate and sugar transporters were also uniquely identified in this community (Table 2). The WDS community was enriched for only two transporters: a mitochondrial ATP/ADP transporter, responsible for ATP export to the cytosol, and (together with BFS) a ZIP family permease putatively involved in Fe uptake.

Diatom nitrogen metabolism is flexible among habitats

Expression of primary N-uptake systems differed considerably in the three areas sampled (Figure 5 and Table 2). Subcellular localization data were inferred from annotated diatom gene sequences and the available literature (Armbrust et al., 2004; Bowler et al., 2008; Brown et al., 2009; Kinoshita et al., 2009; Allen et al., 2011). A caveat to the following discussion is that variation in diurnal timing (for BFS) could have transcriptome-wide effects on relative enrichment (Ashworth et al., 2013; Chauton et al., 2013). In contrast to the ammonium and urea transporters discussed in the previous section, nitrate transporters were less abundant in the more oceanic BFS community, despite high NO3 levels in the water column (Figure 1b). The Calvin cycle marker genes sedoheptulose bisphosphatase (SBP) and fructose bisphosphatase (FBP) (Parker and Armbrust, 2005; Kroth et al., 2008) were both most abundant in BFS (Figure 5), indicating that N availability is not affecting these processes. These enrichment patterns therefore appear to reflect a physiological stance towards the uptake of reduced N sources, consistent with experimental evidence that NH4+ limits phytoplankton growth in Fe-sufficient areas of the Southern Ocean (Agustí et al., 2009), such as the BFS and the Antarctic Peninsula (Martin et al., 1990; Sañudo-Wilhelmy et al., 2002). Given the enrichment of Fe-transport systems in this community (Table 2), such patterns of N use may minimize the use of Fe-rich N-acquisition proteins, despite similar median values of 1.8 nmol Fe l−1 across the regions sampled in this study (Tovar-Sánchez, unpublished results).

Figure 5

Simplified schematic representing N acquisition and N-based photorespiratory pathways in a diatom cell (left). The subcellular localization of gene products was determined by reference to annotation data and the literature (see text for details). The uncertain localization of NasB is indicated by dashed arrow and ‘?’. The relative expression (as Log2 reads) of the gene products highlighted in colour is shown on the right for BFS, Wilkins sea-ice (WKI) and WDS communities. Asterisks in red indicate significant expression differences between the communities (R-values (where > 8 ≈ 98% believability) and annotation codes (KEGG/Pfam/BLASTx) are given in parentheses after the gene acronyms). AMT, ammonium transporter (44.7, K03320); FBP, fructose-1,6-bisphosphatase (9.3, K03841); GDCT, glycine decarboxylase (57.9, K00281); Fd-GOGAT, ferredoxin-glutamate synthase (plastidial) (13.8, K00284); NADH-GOGAT, NADH-glutamate synthase (cytosolic) (2.3/3.1, K00264/K00266); GSII, glutamine synthetase (plastid) (1.1, K01915); GSIII, glutamine synthetase (cytosolic) (1.1, K01915); NaR, nitrite transporter (4.1, PF01226); NasB, NAD(P)H-nitrite reductase (15.9, K00363); NiR, ferredoxin-nitrite reductase (plastid) (2.1, K00366); NiT, nitrate transporter (21.7, K02575); NR, nitrate reductase (12.3, K00360); SBP, sedoheptulose bisphosphatase (20.4, K01100); UAP, urease accessory protein (2.6, K03189); UR, urease (BLASTx); UT, urea transporter (74.4, K03307).
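The R-values quoted in the caption can be illustrated with a minimal sketch of the log-likelihood ratio statistic of Stekel et al. (2000), as it is commonly written: for one gene, R sums x·log(x / (N·f)) over libraries, where x is the read count in a library, N the library size and f the pooled frequency of the gene across all libraries. The function name, the example counts and the use of the natural logarithm are assumptions made for illustration; they are not taken from the study's own pipeline.

```python
import math


def stekel_r(counts, library_sizes):
    """Sketch of the Stekel R statistic for one gene across several libraries.

    counts        -- reads assigned to the gene in each library (hypothetical input)
    library_sizes -- total reads in each library, in the same order
    The natural logarithm is assumed here.
    """
    if len(counts) != len(library_sizes):
        raise ValueError("counts and library_sizes must have the same length")
    total_reads = sum(counts)
    if total_reads == 0:
        return 0.0
    # Pooled frequency of the gene under the null hypothesis of equal expression.
    pooled_freq = total_reads / sum(library_sizes)
    r_value = 0.0
    for x, n in zip(counts, library_sizes):
        if x > 0:  # a library with zero reads contributes nothing to the sum
            r_value += x * math.log(x / (n * pooled_freq))
    return r_value


# Hypothetical counts for one transporter in three equally sized libraries.
print(stekel_r([30, 0, 0], [1000, 1000, 1000]))
```

A gene with identical relative abundance in every library gives R = 0, and R grows as the counts become more unevenly distributed, which is why larger values in the caption correspond to stronger evidence of differential expression.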

In contrast, the WKI sea ice diatom community seems to primarily assimilate nitrate. Nitrate transporters and assimilatory nitrate and nitrite reductases (NR and NasB) were most enriched in this community (Figure 5). Transcript profiles may not reflect protein levels (or activity), because although nitrate is required for NR protein accumulation (Poulsen and Kröger, 2005), mRNA levels can remain high under NO3 limitation. Nevertheless, enrichment of plastid glutamine synthetase (GSII), which is transcriptionally regulated by external NO3, but not NH4+, supports NO3 as the main N source in WKI (Takabayashi et al., 2005). Moreover, although sea ice environments are generally considered N-limited, owing to reduced exchange with surrounding seawater (Mock and Kroon, 2002a, 2002b), we sampled ice near the ice–water interface (surface to 5–10 cm depth), where somewhat higher nutrient levels are expected (Lyon et al., 2011). Interestingly, urea transporters were relatively enriched in WKI, suggesting that reduced organic N from heterotrophic sources may provide additional N within sea ice brine channels (Table 2).

The assimilation of nitrate by sea ice diatoms may help in controlling cellular energy balance under photorespiratory conditions (Lomas and Glibert, 1999). Photorespiration in WKI is suggested by abundant photoprotective LHCSR proteins, reduced photosynthetic pathway enrichment (Figure 4) and by enrichment of mitochondrial glycine decarboxylase (GDCT), a key enzyme and marker for photorespiration (Douce et al., 2001). Photorespiration consumes ATP and reductant 'wastefully', but may prevent overreduction of the photosynthetic electron transport chain (Wingler et al., 2000), while photorespiratory glycolate may contribute to DOC production in sea ice (Arrigo, 2014). Under high light and/or low CO2 conditions that lead to photoinhibition of photosynthesis, particularly at low temperatures where carbon fixation is enzymatically inefficient, reduction of intracellular NO3 pools by NR may buffer excess electron transport capacity and act as an effective photoprotective mechanism (Lomas and Glibert, 1999).

In comparison with both BFS and WKI, transporters for NO3, NH4+ and urea were all under-represented in WDS (Table 2). Assimilatory pathways were also depleted (Figure 5), which could indicate severe N limitation in the post-bloom mixed layer. Three weeks before sampling, at the bloom peak, NO3/NO2 was < 2 μM, and NH4+ was 0.2 μM at the same station (Supplementary Figure S1).

Enrichment patterns at low temperature

Several contrasts between diatom communities are consistent with strong temperature effects on gene expression and metabolism, even across a relatively small gradient within the wAP (from < −1.8 to 2.0 °C). Communities differed greatly in pathway enrichment for 'Genetic Information Processing', most strikingly in the ribosome pathway, which was enriched in the WDS and WKI relative to the warmer BFS (Figure 6a). In addition to ribosomal proteins, the levels of several other orthologues, including translation factors (EEF1A, EF3), were biased towards these two colder communities (Figure 6b). Overexpression of the translational machinery is a response that compensates for reduced rates of protein synthesis at low temperatures. The capacity for thermal acclimation of RNA translation or protein synthesis appears limited, largely relying on production of additional ribosomes (Nomura et al., 1984; Fraser et al., 2002; Storch and Pörtner, 2003; Kim et al., 2004). Such a large difference in ribosomal protein enrichment implies large increases in cellular RNA at lower temperatures, as RNA:protein in ribosomes is constant. Over a 3–4 °C range, this could have important consequences for population growth rates and nutrient stoichiometry, particularly N:P, as ribosomes are relatively P-rich (Toseland et al., 2013).

Figure 6

Functional assignment of diatom metatranscriptome sequences to KEGG Genetic Information Processing pathways. Each of the pathways shown differs in expression between the three communities (R-statistics shown to the right of the plot; Stekel, 2000). (a) Total numbers of sequence reads (normalized by library size) assigned to KEGG Transcription, Translation, and Folding, sorting and degradation pathways. (b) Ternary plot showing differentially expressed KEGG orthologues in Transcription, Translation, and Folding, sorting and degradation pathways. Orthologues are represented by circles whose size is proportional to read number. The position of an orthologue relative to the vertices of the triangular plot indicates its relative expression by each community sample. AARS, alanyl-tRNA synthetase; cspA, cold shock protein (beta-ribbon, CspA family); dnaK, molecular chaperone DnaK; EEF1A, elongation factor EF-1 alpha subunit; EF3, elongation factor EF-3; EIF4A, translation initiation factor eIF-4A; HSP90A, molecular chaperone HtpG; HSP90B, heat shock protein 90 kDa beta; HSPA1_8, heat shock 70 kDa protein 1/8; NARS, asparaginyl-tRNA synthetase; NFS1, cysteine desulfurase; NMD3, nonsense-mediated mRNA decay protein 3; NOG1, ribosomal RNA large subunit methyltransferase N; PPP1C, protein phosphatase 1, catalytic subunit; RP-L17e, large subunit ribosomal protein L17e; RP-L2, large subunit ribosomal protein L2; RP-L23, large subunit ribosomal protein L23; RP-L24e, large subunit ribosomal protein L24e; RP-L28, large subunit ribosomal protein L28; RP-L40e, large subunit ribosomal protein L40e; RP-L4e, large subunit ribosomal protein L4e; RP-L5e, large subunit ribosomal protein L5e; RP-L6, large subunit ribosomal protein L6; RP-L9e, large subunit ribosomal protein L9e; RP-S12e, small subunit ribosomal protein S12e; RP-S13e, small subunit ribosomal protein S13e; RP-S20, small subunit ribosomal protein S20; RP-S23e, small subunit ribosomal protein S23e; RP-S3, small subunit ribosomal protein S3; RP-S4, small subunit ribosomal protein S4; rpoA, DNA-directed RNA polymerase subunit alpha; SEC61A, protein transport protein SEC61 subunit alpha; secA, preprotein translocase subunit SecA; secY, preprotein translocase subunit SecY; SSRP1, structure-specific recognition protein 1; SUMO, small ubiquitin-related modifier; VARS, valyl-tRNA synthetase.
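How an orthologue's position in a ternary plot such as panel (b) relates to its read counts can be sketched as follows. This is an illustrative reconstruction, not the plotting code used for the figure: counts are first divided by library size, then scaled so that the three shares sum to 1, and finally mapped onto an equilateral triangle. The function name, the vertex layout and the example numbers are all assumptions.

```python
import math


def ternary_position(counts, library_sizes):
    """Return (shares, (x, y)) for one orthologue in three libraries.

    counts        -- reads for the orthologue in the three communities (hypothetical)
    library_sizes -- total reads per library, in the same order
    Assumed vertex layout: first community at (0, 0), second at (1, 0),
    third at (0.5, sqrt(3)/2).
    """
    if len(counts) != 3 or len(library_sizes) != 3:
        raise ValueError("exactly three libraries are required")
    # Normalize by library size so that deeper libraries do not dominate.
    rates = [c / n for c, n in zip(counts, library_sizes)]
    total = sum(rates)
    if total == 0:
        raise ValueError("orthologue has no reads in any library")
    a, b, c = (r / total for r in rates)
    # Barycentric shares to Cartesian coordinates inside the triangle.
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return (a, b, c), (x, y)


# Hypothetical orthologue with equal normalized abundance in the first two
# libraries and no reads in the third.
shares, xy = ternary_position([10, 20, 0], [100, 200, 100])
print(shares, xy)
```

An orthologue expressed equally in all three communities falls at the centre of the triangle, while one confined to a single community sits on the corresponding vertex.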

The levels of several other transcript classes were also correlated with low temperatures: (i) CSP family cold-shock transcription factors (possible 'RNA chaperones'; Phadtare, 2004) (Figure 6b); (ii) fatty acid desaturases (delta−6, −8 and −12), implicated in the control of membrane fluidity during acclimation to cold (D’Amico et al., 2006; Figure 4b; Supplementary Table S1); and (iii) ice-binding proteins, which were prominent in the sea ice community but absent in BFS.

Some 65 ice-binding protein contigs (ca. 490 reads) were identified, including sequences from bacteria, archaea and fungi; diatoms were the most abundant source, accounting for ca. 390 reads in WKI, followed by copepod crustaceans (top BLASTx hit Stephos longipes).


Metatranscriptomes revealed the physiological and metabolic flexibility of diatom genomes facing the challenges of different habitats in the Antarctic Peninsula. The metatranscriptome of a sea ice community reflected its stressful environment: carbohydrate and energy metabolism pathways were depleted relative to pelagic communities, while direct and indirect light energy dissipation pointed to irradiance stress and/or inorganic carbon limitation within the ice.

Transcript abundances for N acquisition and transport varied widely across habitats. Their expression was reduced or absent under post-bloom conditions in the WDS community, suggesting dependence on internal stores for continued growth, in marked contrast with the BFS community, where transporters for N sources were highly enriched. The relative enrichment of fatty acid biosynthesis in BFS likely reflects nutrient (possibly including Fe) stress in this community, which consists of species (mainly Pseudo-nitzschia spp.) with strategies for coping with such stress.

Most remarkable was the evidence for an influence of temperature over the narrow 4 °C range separating the communities. Investment in transcription and translation functions (mainly ribosomal proteins) decreased with increasing temperature (Supplementary Figure S7). A similar phenomenon was recently reported over a much broader thermal range and taxon diversity (Toseland et al., 2013). In summary, this initial metatranscriptomic survey revealed a range of functional responses in diatom communities that reflect the ecological diversity of habitats in waters around the unique Antarctic region.