Introduction

Nitrogen is the second most important nutrient, after carbon, in phytoplankton and is generally considered the major element limiting phytoplankton growth in the marine environment1. Nitrogen coupling with carbon is essential for the biosynthesis of nucleic acids, proteins and chlorophylls, sharing of energy and organic compounds via glycolysis, the tricarboxylic acid (TCA) cycle, the mitochondrial electron transport chain and photosynthesis2. Various studies have shown that several microalgae such as Desmodesmus sp., Chlorella sp. and Chlamydomonas reinhardtii (Chlorophyceae), Nannochloropsis oculata (Eustigmatophyceae) and Porphyridium cruentum (Rhodophyceae) increased lipid accumulation when cultured in nitrogen starvation (N starvation) and, for this reason, they have been proposed as promising feedstock for biodiesel production3,4,5,6. N starvation induced an increase in glycolytic and TCA cycle enzymes in the marine diatom Thalassiosira pseudonana (Mediophyceae7) and de novo biosynthesis of triacylglycerols, decrease of chloroplast galactolipids and reorganization of the photosynthetic apparatus in the flagellate Nannochloropsis gaditana8. However, cellular responses triggered by N starvation are not completely clarified.

Among microalgae, green algae, with more than 7000 species growing in a variety of habitats9, have been frequently studied for energy purposes10, but also as sources of bioactive extracts/compounds11,12. Tetraselmis spp. (green algae) are widely harvested as feed for molluscs, shrimp larvae and rotifers13, for their antimicrobial activity14, as sources of vitamins for animal and human consumption15 and for biodiesel production16. Raw extracts of T. suecica clone CCMP906 did not show any antimicrobial, antioxidant, anticancer or anti-diabetes activity17, but the purified carotenoid extract had strong antioxidant and repairing activity in the human lung cancer cell line A549 and on reconstructed human epidermal tissue cells (EpiDerm™; ref. 12). These data suggest that this species has cosmeceutical activity and potentially interesting biotechnological applications.

In this paper, we present for the first time the full transcriptome of the green alga Tetraselmis suecica (CCMP906) and a differential expression analysis between N-starved and N-replete (control) conditions. We focus not only on lipid metabolism but also on new insights into N starvation responses and possible biotechnological applications for this species.

Even in the absence of a fully sequenced and annotated genome, transcriptomic analysis by RNA-sequencing can provide a powerful tool to improve our understanding of physiological networks that allow microalgae to respond to various environmental cues18. Regarding Tetraselmis, transcriptome sequencing has been done for Tetraselmis sp. GSL018 (MMETSP0419), T. chuii PLY429 (MMETSP0491), T. astigmatica CCMP880 (MMETSP0804) and T. striata LANL1001 (MMETSP0817, MMETSP0818, MMETSP0819, MMETSP0820). In addition, Adarme-Vega et al. studied some specific genes involved in lipid metabolism in the clone Tetraselmis sp. M8 (ref. 19) and, recently, Lim et al. sequenced the transcriptome of the Tetraselmis sp. M8 clone under nitrogen depletion in order to study lipid-related pathways that lead to triacylglyceride accumulation in oleaginous microalgae20. Our study focuses on N starvation-induced metabolic changes and new insights on Tetraselmis responses to low concentrations of this nutrient.

Materials and Methods

Cell culturing and harvesting, RNA extraction and cDNA synthesis

Tetraselmis suecica (CCMP906) was cultured in Guillard’s f/2 medium21 without silicic acid. Experimental culturing for both control and nitrogen starvation conditions was performed in 2 L polycarbonate bottles (each experiment was performed in triplicate) constantly bubbled with air filtered through 0.2 µm membrane filters. For the N starvation experiment the medium was prepared with low concentrations of nitrogen (30 mM of NO3; N starvation condition). Cultures were kept in a climate chamber at 19 °C on a 12:12 h light:dark cycle at 100 µmol photons m−2 s−1. Initial cell concentrations were about 5000 cells mL−1 for each experiment and net growth was monitored22. Aliquots of 50 mL were sampled during the stationary phase (day 7) and centrifuged for 15 min at 4 °C at 1900 g (Eppendorf, 5810 R). Cell concentration was ~2 × 10^6 cells mL−1 for the control condition and ~2 × 10^5 cells mL−1 for N-starved cells. For RNA extraction, pellets for both RNA sequencing (RNAseq) and reverse transcription-quantitative PCR (RT-qPCR) (triplicates for each condition and for each technique) were re-suspended in 500 µL of TRIZOL® (Invitrogen, Carlsbad, CA), incubated for 2–3 min at 60 °C until completely dissolved and kept at −80 °C23.

RNA was extracted as in Lauritano et al.18 following the TRIZOL® manufacturer’s instructions. For RT-qPCR, 500 ng of RNA per replicate was reverse-transcribed into cDNA with the iScript™ cDNA Synthesis Kit (BIORAD, Hercules, CA) following the manufacturer’s instructions.

Library preparation, sequencing and assembly

RNA-Seq libraries were prepared from 2.5 µg of total RNA using the Illumina TruSeq® Stranded mRNA kit (Illumina Inc., San Diego, CA, USA) according to the manufacturer’s instructions. Paired-end sequencing (2 × 100 bp) was performed with the HiSeq1000 Illumina platform. Adaptor trimming of the reads was performed using CutAdapt (ver. 1.6), followed by duplicate removal and then abundance normalization with khmer software (ver. 1.1). De novo assembly was performed with the Oases/Velvet assembler (v. 0.2.08) with multiple k-mer sizes (from 25 to 85 in steps of 10; refs. 24,25). Clustering and redundancy removal were performed using EvidentialGene (ver. 2013-07-27). To evaluate coverage depth on the obtained transcriptome, the filtered RNA-seq reads were mapped against it using BWA mem (ver. 0.7.5a-r405) and data were filtered to eliminate transcripts with fewer than 100 alignments.
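The two numeric settings in this paragraph (the k-mer series and the alignment cut-off) can be written out as a minimal Python sketch. It is only an illustration: the function name `filter_by_alignments` and the dictionary of per-transcript alignment counts are hypothetical and are not part of the CutAdapt, khmer, Oases/Velvet or BWA tool chain.

```python
# k-mer sizes used for the multi-k assembly: 25 to 85 in steps of 10.
KMER_SIZES = list(range(25, 86, 10))

def filter_by_alignments(alignment_counts, min_alignments=100):
    """Keep transcripts supported by at least `min_alignments` alignments.

    `alignment_counts` is a hypothetical mapping of transcript ID to the
    number of reads aligned to it; transcripts with fewer than
    `min_alignments` alignments are discarded.
    """
    return {
        transcript_id: count
        for transcript_id, count in alignment_counts.items()
        if count >= min_alignments
    }
```

For example, a transcript with 99 alignments would be dropped under this reading of the cut-off, whereas one with exactly 100 would be retained.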

Functional annotation and differential expression analysis

Functional annotation of non-redundant contigs was performed using Blast2GO software (Version 4.1.9) with NCBI-NR database and default parameters26. Expression abundances were quantified using RSEM (version 1.1.21) with default settings27. Differentially expressed genes (FDR ≤ 0.001; |Log2(FC)| ≥ 2) were identified by using the R package DESeq28. Raw read counts were transformed to FPKM (fragments per kilobase of exon per million fragments mapped). Analysis with BUSCO software (v3.0.0) was performed using the eukaryote dataset (odb9). In addition, transcript functional categories were investigated in more depth using the Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation29. Raw reads and assembled transcripts have been deposited in GenBank (GEO database30; series entry GSE109461).
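As a rough illustration of the FPKM definition and of the thresholds quoted above, the following Python sketch may help. The function names are hypothetical, and the code does not reproduce the RSEM or DESeq computations; it only applies the standard FPKM formula and the stated cut-offs.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

def passes_deg_thresholds(log2_fc, fdr, min_abs_log2_fc=2.0, max_fdr=0.001):
    """Apply the Methods cut-offs: |log2(FC)| >= 2 and FDR <= 0.001."""
    return abs(log2_fc) >= min_abs_log2_fc and fdr <= max_fdr
```

Under these definitions, 500 fragments on a 1,000 bp transcript in a library of 5 million mapped fragments correspond to an FPKM of 100.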

Reverse transcription-quantitative polymerase chain reaction (RT-qPCR)

Primers for putative reference genes (RGs) and genes of interest (GOI) were designed using the software Primer3 v. 0.4.0 (http://frodo.wi.mit.edu/primer3/) and optimized as in Lauritano et al.31. Supplementary Table S1 lists selected RGs and GOI, their functions, primers’ sequences and efficiencies. To normalize expression levels of the selected GOI, a panel of putative RGs (i.e. actin, alpha and beta tubulins, glyceraldehyde 3-phosphate dehydrogenase, histone 1 and 4) was first screened in the 2 experimental conditions: control and N starvation conditions. The best RGs (i.e. histone 1, actin and β tubulin, see Supplementary Table S2) were identified by using the software BestKeeper32, geNorm33 and NormFinder34. Primer reaction efficiency (E) and correlation factor (R²) were calculated from the standard curve, with E given by the equation E = 10^(−1/slope). GOI were selected among the most up- or down-regulated DEGs with functional annotation: Calcium/calmodulin-dependent protein kinase type 1 (CAMK), Protein phosphatase 2c family protein (PP2C), Squamosa promoter binding protein (SBP), Ammonium transporter (AMT), ATP-binding cassette protein transporter (ABC), Glutathione-S-transferase (GST), Catalase (CAT), Heat shock protein 20 (HSP20), Phosphoenolpyruvate carboxylase kinase (PPCK), Lipoxygenase (LPX), Polyketide synthase (PKS), Nitrilase (NIT1), 3,5-cyclic nucleotide phosphodiesterase (PDE) and Elmo-domain-containing protein 3 (ELMOD3). RT-qPCR was performed as in Lauritano et al.18 in a Viia7 real-time PCR system (Applied Biosystems), and expression levels were analysed with the Relative Expression Software Tool35. The control condition was represented by microalgae cultured in normal replete medium. Statistical analysis was performed using GraphPad Prism statistical software, V4.00 (GraphPad Software). Normality of data was tested by using the Anderson-Darling test36 with the PAST software (v.3.15; ref. 37).
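The efficiency equation can be illustrated with a short Python sketch. It assumes a standard curve of Ct values against log10 template amount fitted by ordinary least squares; the function names are hypothetical and the sketch is not the procedure of any specific RT-qPCR software package.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Least-squares slope and R^2 of Ct versus log10(template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared

def amplification_efficiency(slope):
    """E = 10^(-1/slope); E = 2 means the product doubles every cycle."""
    return 10 ** (-1.0 / slope)
```

With this convention, a slope of about −3.32 corresponds to E ≈ 2, i.e. a doubling of product at each cycle.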

Protein structure prediction of selected transcripts

We selected some transcripts (lipoxygenases, nitrilase and polyketide synthase) with known potential biotechnological applications to further investigate their structure at the protein level. Each nucleotide sequence was first translated into the corresponding amino acid sequence, from the first methionine to the first stop codon, using the Translate tool of ExPASy38 (https://web.expasy.org/translate/); then, functional domains of the protein sequences were annotated using the webserver InterProScan (available at https://www.ebi.ac.uk/interpro/search/sequence-search) and protein structures were predicted using the Phyre2 web server (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index). For PKS, separate analyses were done for each domain identified by InterProScan.
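The translation rule described above (first methionine to first stop codon) can be sketched in Python as follows. This is an illustrative stand-in for the web tool, not its implementation: it assumes the standard genetic code and simply takes the first ATG found in the forward strand, whereas the web tool reports all six reading frames and leaves the choice of frame to the user.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

def translate_first_orf(nucleotides):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns an empty string if no ATG is present. Codons containing
    ambiguous bases are reported as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)
```

If the sequence ends before a stop codon is reached, the sketch returns the partial protein translated up to the last complete codon.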

Phylogenetic analysis of nitrilase

In order to assess the evolutionary relationships between the nitrilase found in our transcriptome and those from closely related organisms, we performed a phylogenetic analysis, retrieving homologous sequences from different databases such as eggNOG v. 4.5 (available at http://eggnogdb.embl.de/#/app/home), the UniRef90 database of the BLAST tool available at the UniProt server (http://www.uniprot.org/blast/) and the genome database for red algae (realDB, http://realdb.algaegenome.org/P.e.a.html#). The latter database was used to include homologous sequences of distantly related taxa that are at the base of the evolutionary lineage that led to green algae and land plants. We also downloaded green algae transcriptomes from the Marine MicroEukaryote Transcriptome Sequencing Project (MMETSP) to increase the number of nitrilase sequences to be analysed. Finally, the sequences of land plant nitrilases from Howden et al.39 were added to our analysis to assess the evolutionary relationships with our sequence. All sequences were aligned in ClustalX2 (ref. 40) and then edited manually. A Maximum Likelihood (ML) tree was built using RAxML41 under the substitution model LG + G suggested by PartitionFinder v.1.1.1 (ref. 42) using the AICc criterion. Branch support was inferred via bootstrap analysis using the autoMRE option of RAxML. The resulting tree was visualised and graphically edited in FigTree v1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/).

Results and Discussion

Transcriptome sequencing, de novo assembly and functional annotation

RNA-sequencing (RNA-seq) from samples cultured in control culturing condition and in nitrogen starvation (N starvation) yielded 26,550,078 and 5,005,719 total raw or normalized fragments, respectively, per sample on average (Table 1). As no reference genome of T. suecica was available, normalised RNA-seq reads were assembled with a de novo approach, producing 621,424 putative transcripts. In order to evaluate the assembled transcriptome, general statistics were computed (Table 2). Clustering and redundancy removal resulted in a transcriptome of 31,352 main transcripts with an average length of 962.6 bp (Table 1). Of these, 24,399 transcripts were supported by sufficient RNA-seq reads (>100) and 15,027 were further associated with a known protein, based on the NCBI database. Analysis with the Benchmarking Universal Single-Copy Orthologs (BUSCO) software identified 94.4% of the 303 BUSCO reference transcripts (81.5% complete and 12.9% fragmented, see Supplementary Table S3), demonstrating that the large majority of the transcriptome has been reconstructed, mostly as full-length transcripts.

Table 1 Total, filtered and normalized fragments obtained from the RNA-sequencing analysis performed using 100 nt paired-end reads. Each experimental condition was analysed in triplicate.
Table 2 De novo assembly statistics.

Functional annotation using Gene Ontology (GO) assigned molecular function, biological process and cellular localization to 46% of the putative transcripts (see Supplementary Fig. S1). The Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation (see Supplementary Table S4) identified 134 metabolic pathways. Of these, the biosynthesis of antibiotics pathway (Pathway ID map01130) was the one with the highest number of associated enzymes (112 enzymes). Other highly represented pathways were purine metabolism (Pathway ID map00230), pyruvate metabolism (Pathway ID map00620), amino sugar and nucleotide sugar metabolism (Pathway ID map00520), glycolysis/gluconeogenesis (Pathway ID map00010), starch and sucrose metabolism (Pathway ID map00500) and alanine, aspartate and glutamate metabolism (Pathway ID map00250).

Differential expression analysis

Differential expression analysis identified 319 genes with significant expression variations (|LogFC| > 2; P value adjusted ≤0.01) in N starvation condition relative to control (i.e. T. suecica cultured in complete K medium). GO annotation was used to identify major categories of genes differentially expressed between the two experimental conditions, and the percentages of sequences for each GO term within cellular component, biological process and molecular function are reported in Fig. 1. Among the 319 differentially expressed genes (DEGs; of which 189 were up-regulated and 130 down-regulated), 166 transcripts had no NCBI NR assignment (of which 107 were up-regulated and 59 down-regulated), while the remaining 153 included 82 up-regulated and 71 down-regulated genes. The full list of DEGs, with log2 fold change, adjusted P value (padj) and GO annotation, is reported in Supplementary Table S5. Among the DEGs, the ones showing the highest expression in N starvation conditions were an extracellular ligand-binding receptor (padj = 4.32E-64), the abc transporter substrate-binding protein (padj = 4.73E-181), the elmo domain-containing protein 3-like (padj = 3.72E-15) and the 3,5-cyclic nucleotide phosphodiesterase (padj = 3.30E-05). Conversely, N starvation induced a strong down-regulation of polyketide synthase (PKS; padj = 8.17E-20). Figure 2 summarizes the main results while details are reported in the following paragraphs. Up-regulated transcripts were mainly involved in signal transduction pathways, stress and antioxidant responses and solute transport, while transcripts involved in amino acid synthesis, degradation of sugars, secondary metabolite synthesis and photosynthetic activity were down-regulated under N starvation.
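The DEG counts quoted in this paragraph can be cross-checked with a few lines of Python. The dictionary layout below is only a convenient, hypothetical way of holding the numbers given in the text.

```python
# DEG counts as stated in the text, split by annotation status.
deg_counts = {
    "with NCBI NR assignment": {"up": 82, "down": 71},
    "no NCBI NR assignment": {"up": 107, "down": 59},
}

total_up = sum(group["up"] for group in deg_counts.values())
total_down = sum(group["down"] for group in deg_counts.values())
total_degs = total_up + total_down

# The sub-totals are consistent with the overall figures in the text.
assert total_up == 189
assert total_down == 130
assert total_degs == 319
```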

Figure 1

Histograms of GO classifications showing sequence distribution of the differentially expressed genes within cellular component (a), biological process (b) and molecular function (c). The y-axis indicates the percentage of sequences for each category.

Figure 2

Summary of the main results. Up-regulated transcripts were mainly involved in signal transduction pathways, stress and antioxidant responses and solute transport while transcripts involved in amino acid synthesis, degradation of sugars, secondary metabolite synthesis and photosynthetic activity were down-regulated in nitrogen starvation condition.

DEGs involved in signal transduction pathways and their regulation

Under N starvation, T. suecica activated several signal transduction pathways involving protein kinases and phosphatases. Some of these are protein phosphatases 2C (PP2C), involved in mitogen-activated protein kinase (MAPK) signalling43, and serine-threonine kinases, playing a central role in cell-cycle regulation by transmitting DNA damage signals to downstream effectors of cell-cycle progression44. Eukaryotic MAPK cascades transduce environmental and developmental cues into intracellular responses45,46. To date, the activation of MAPKs in response to N starvation has been observed in the yeast Saccharomyces cerevisiae47,48 and in the ascomycete fungus Fusarium proliferatum49, but no information exists for microalgae. In the present study, we found a 3 fold up-regulation of MAPK 14 during N starvation. MAPK signalling is regulated by the action of phosphatases50 and we observed a 5.4 fold up-regulation of PP2C51,52. Interactions between PP2C and MAPK have been observed in S. cerevisiae in response to osmotic stress53 and in A. thaliana during stress responses54,55. Since we found a 5.4 fold up-regulation of PP2C, together with a significant up-regulation of MAPK under N starvation, we suggest a possible role of this signalling pathway in the response to N starvation in T. suecica.

We also found a transcript coding for a serine-threonine kinase atr-like which was 4.5 fold up-regulated and a calcineurin-like metallo-phosphoesterase/metallo-dependent phosphatase that was 2.6 fold down-regulated. The serine-threonine kinase atr is a kinase which can be involved in stress responses (https://www.uniprot.org/uniprot/Q13535). The function of calcineurins in plants is less well known than that of other members of this family, and there is no information regarding microalgae. Our study suggests a possible role in the microalgal response to N starvation.

Three transcripts coding for putative SQUAMOSA Promoter-Binding Proteins-Like (SPL) were significantly up-regulated as well (4.9, 3.9, and 2.2 fold, respectively). These are transcription factors that are involved in the regulation of other transcription factors and metabolic processes56. A recent phylogenetic analysis revealed nine major SPL gene lineages in higher plants, each of which is described in terms of function and diversification57, but an extensive knowledge in their closest relatives, the green algae, is still missing. Our study indicates a possible role during nitrogen starvation and calls for further investigations.

Transporter DEGs

A significant differential expression of several genes involved in the transport of different metabolites was observed in our experiment. In particular, the sugar transport protein 13 like (for sugar transport), the solute carrier family 39 protein (generic transporter of solutes) and importin alpha (involved in the import of proteins into the nucleus) were 2.4, 4.1 and 3.9 fold up-regulated, respectively. On the contrary, lysine-histidine transporter-like 5 was significantly down-regulated (4.5 fold). The decrease of transcripts involved in the carriage of amino acids is compatible with the same trend observed in their production as a consequence of N starvation. In contrast, the increase of sugar and solute carriers is probably due to the altered metabolic state of the cell and the consequent need to reallocate such compounds.

Two transcripts referring to ammonium transporters were also found in our transcriptome, whose encoded amino acid sequences shared a 40% similarity. One was 2.7 fold up-regulated and the other one 4.4 fold down-regulated. Ammonium transporter genes (AMTs) have been found in several phytoplankton species including diatoms58, green algae59,60, haptophytes61 and prasinophytes62. A study conducted in the marine diatom Cylindrotheca fusiformis also revealed the occurrence of two types of AMT that were differentially expressed under N starvation conditions63. Both studies suggest that different AMT isoforms of the same species may not be activated under the same experimental conditions (as also observed in higher plants64).

DEGs involved in stress and antioxidant responses

Microalgae are constantly exposed to both physical and chemical stressors, to which they react by activating a series of defense mechanisms. The most common defense strategies include the activation of heat shock proteins (HSPs) and antioxidant enzymes65,66. HSPs are molecular chaperones that can be involved in protein folding and unfolding, and degradation of mis-folded or aggregated proteins65. The antioxidant enzymes detoxify reactive or toxic intermediates (e.g. reactive oxygen species) which can be damaging to DNA, RNA and proteins66. In this study, N starvation induced the activation of HSP20 (3.7 fold up-regulation) and the antioxidant enzymes catalase (CAT; 2.5 fold up-regulation) and glutathione S-transferase (GST; 2.07 fold up-regulation). HSP20 was also highly up-regulated in a recent paper on the diatom P. tricornutum67 and in the plant A. thaliana cultured in N starvation68. CAT enzymatic activity increased in the chlorophytes Chlorella sorokiniana and Coccomyxa sp.69,70 and GST also increased in C. reinhardtii exposed to N depletion71. In other green algae exposed to N starvation/depletion, neither CAT nor GST was differentially regulated. These data highlight a connection between nutrient deprivation, oxidative stress and detoxification of free radicals; however, the responses differed slightly depending on the species studied.

DEGs involved in carbon metabolism

Several transcripts involved in glycolytic pathways, like pyruvate kinase (PK), glycosyl hydrolase, xylulose kinase and xylosyltransferase, were down-regulated. PK plays an important role in the initiation of de novo amino acid synthesis providing carbons to the tricarboxylic acid (TCA) cycle2. The down-regulation of this transcript is in agreement with the decrease in the number of transcripts of enzymes involved in the biosynthesis of several amino acids such as dihydroxy-acid dehydratase for valine-leucine-isoleucine biosynthesis (2.3 fold down-regulation). On the contrary, in the green alga Chlamydomonas reinhardtii no drastic changes were observed in pyruvate kinase expression after N starvation72. The phosphoenolpyruvate carboxylase kinase (PPCK), responsible for phosphoenolpyruvate carboxylase phosphorylation and activation, was strongly up-regulated (5.225 fold). Phosphoenolpyruvate carboxylase (PPC) catalyses the fixation of CO2 to yield oxaloacetate, playing several key roles in the central metabolism of plants (e.g. regulation of carbon fixation and TCA cycle73). Hence, the increased transcription of PPCK indicates a possible cellular signalling for the regulation of carbon fixation and TCA cycle upon nitrogen starvation.

Glycosyl hydrolases (also called glycosylases) are a family of enzymes mainly involved in the degradation of complex sugars such as cellulose, hemicellulose, and starch74. Together with glycosyl transferases, which are involved in the establishment of glycoside linkages, they form the major catalytic machinery of sugar bonds. We found a 2.2 fold down-regulation of a glycosyl hydrolase transcript and a 2.2 fold up-regulation of a glycosyl transferase, indicating that during N starvation there is a trend towards the accumulation of carbon compounds rather than their degradation or de novo biosynthesis. Other transcripts related to carbohydrate metabolism, such as xylulose kinase and xylosyltransferase, were 2 and 2.2 fold down-regulated, respectively. Xylulose kinase is a key enzyme for arabinose and xylose metabolism in the green alga Chlorella protothecoides75, xylose utilisation in S. cerevisiae76 and biosynthesis of plastidial isoprenoids in A. thaliana77. Furthermore, this enzyme is involved in the phosphorylation of xylulose to xylulose 5-phosphate, which plays an important role in the regulation of glucose metabolism and lipogenesis75. Xylosyltransferase is an enzyme involved in the biosynthesis of glycosaminoglycans, which are known to have anticoagulant and anti-inflammatory properties, as well as tissue repairing properties78. Hence, the presence of this enzyme also suggests the possible production of other tissue-repairing compounds.

Transcripts involved in lipid metabolism (such as 3-ketoacyl-ACP synthase and Glycerol kinase) were affected in Tetraselmis sp. M8 during N starvation (in early-stationary growth phase in nitrogen depletion)20 but were not differentially regulated in our study (in stationary growth phase in nitrogen starvation). In summary, our data suggest that lipid metabolism was not the pathway most affected by N starvation in T. suecica, whereas transcripts involved in sugar degradation were strongly down-regulated.

Biosynthesis of secondary metabolites

Lipoxygenases (LOX), which are enzymes involved in fatty acid metabolism and biosynthesis of secondary metabolites with anti-proliferative activities79 (i.e. polyunsaturated aldehydes and other non-volatile oxylipins), were down-regulated in this study. Three transcripts including the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin)/LH2 (Lipoxygenase homology) domains were found (2.458, 2.506 and 3.065 fold down-regulation, respectively). Domain assignment using InterProScan and structure prediction by Phyre2 confirmed that the three transcripts belong to the family of lipoxygenases (Fig. 3). For the first transcript (2.458 fold down-regulated; LOX1), three PLAT/LH2 domains were identified, at aa positions 112–227, 234–356 and 380–496, respectively. The structure was predicted at >90% accuracy based on 47% of residues (aa positions 112–227). The second transcript (2.506 fold down-regulated; LOX2) contained only one PLAT/LH2 domain at aa positions 137–255 and its structure was predicted at >90% accuracy based on 61% of residues. The third transcript (3.065 fold down-regulated; LOX3) contained three PLAT/LH2 domains (aa positions 1–35, 43–158 and 166–286). The 3D model was inferred at >90% accuracy using 87% of residues (Fig. 3). Overall, our data suggest that this fatty acid metabolic pathway is affected by N starvation, although only future chemical analyses can confirm oxylipin production by T. suecica.

Figure 3

(a) Domain annotation of transcripts coding for lipoxygenase according to InterProScan. Query length is in black. Orange arrows refer to the PLAT/LH2 (Polycystin-1, Lipoxygenase, Alpha-Toxin/Lipoxygenase homology) domains. (b) Protein 3D structure predicted by Phyre2. Models are coloured by rainbow from N to C terminus. Helices in the secondary structure represent α-helices, arrows indicate β-strands and faint lines indicate coils. 1, 2 and 3 denote the LOX transcripts down-regulated by 2.46, 2.51 and 3.07 fold, respectively, and reported in the DEG list.

In this study, a transcript for polyketide synthase I (PKS), involved in the synthesis of polyketides (compounds known to have antipredator, antimicrobial, anticancer and sometimes toxic activities80), was identified as well. Type I PKS are large multifunctional proteins, comprising several essential domains: acyltransferase (AT), β-ketosynthase (KS), acyl carrier protein (ACP), β-ketoacyl reductase (KR), enoyl reductase (ER), methyl transferases, thioesterases (TE) and dehydratase (DH) domains81. The InterProScan analysis of the transcript coding for PKS revealed the occurrence of the following domains: dehydratase (DH, aa positions 227–356), keto-reductase (KR, aa positions 692–875), phosphopantetheine-binding acyl carrier protein (ACP, aa positions 986–1060), β-ketoacyl synthase (KS, aa positions 1296–1735), keto-reductase (KR, aa positions 2179–2356) and phosphopantetheine-binding acyl carrier protein (ACP, aa positions 2467–2542) (Fig. 4). Protein structure prediction analysis confirmed the identification of all the domains (Fig. 4) with a coverage between 99.8 and 100%. In previous studies T. suecica showed antioxidant and protective activity on human cells12 without any cytotoxicity17 and this activity was associated with a pool of carotenoids. However, the presence of PKS may suggest the production of secondary metabolites that can be active as well. In N starvation, PKS was strongly down-regulated (5.77 fold down-regulation), as also found for the dinoflagellate Amphidinium carterae18 and for several fungi82.

Figure 4

Detected domains and homology models for PKS transcript. Coloured blocks indicate the amino acid positions of the following domains: dehydratase (DH), keto-reductase (KR), phosphopantetheine-binding acyl carrier protein (ACP), β-ketoacyl synthase (KS), keto-reductase (KR) and phosphopantetheine-binding acyl carrier protein (ACP). Predicted 3D models are coloured by rainbow from N to C terminus. Helices in the secondary structure represent α-helices, arrows indicate β-strands and faint lines indicate coils. For the three consecutive domains of ACP the predicted protein model has been shown once.

Our data suggest that both lipoxygenase and PKS metabolism are affected by N starvation, thus reducing the possible secondary metabolite production they are involved in. This regulatory mechanism is known in fungi as nitrogen metabolite repression82 and this is the first study reporting this type of repression also in green algae.

Photosynthetic activity related DEGs

Among the genes related to photosynthetic activity whose expression differed significantly between control and N-starved conditions, the chlorophyll a/b-binding protein, the chloroplast processing enzyme-like protein and two rhodanese-like domain containing proteins were significantly down-regulated (3.3, 2.1, 2.4 and 4.0 fold, respectively). Light-harvesting chlorophyll a/b-binding (LHCB) proteins are found in the antenna complex of the light-harvesting complex of photosystem II (PSII) and their expression at the gene level is considered to be an important mechanism to modulate chloroplast functions83,84. A significant decrease in chlorophyll a/b-binding protein was also observed in A. thaliana plants grown with low nitrogen concentrations85. Chloroplast processing enzymes are known to be involved in the cleavage of the precursor of the light-harvesting chlorophyll a/b-binding protein of photosystem II (LHCPII) and in the production of mature protein86. Rhodaneses catalyse the transfer of a sulfane sulfur atom from thiosulfate to cyanide in vitro, and in vascular plants are involved in many processes including leaf senescence87, immune response88, and tethering of ferredoxin NADP+ oxidoreductase in electron transfer chains in photosynthesis89,90.

The down-regulation of these transcripts in N starvation can be interpreted as a way for algal cells to balance the alteration of C:N ratio due to N starvation. As a general pattern, photosynthesis has been demonstrated to be down-regulated in many organisms cultured under N starvation7,72,91,92,93. In some cases, major changes involving photosynthetic enzymes and apparatus were observed94; in others, our case included, only a decrease in photosynthetic enzymes (e.g. enzymes involved in maturation of photosystems, electron transfer chain) was found.

DEG coding for putative nitrilase

Nitrilases (EC 3.5.5.1) catalyse the hydrolysis of nitriles to carboxylic acids and ammonia95. Nitrile converting enzymes have attracted substantial interest in several fields because nitriles (used as solvents, synthetic rubber, starting material for pharmaceuticals and herbicides96) are highly toxic and carcinogenic96 and cause hazardous environmental pollution97. Nitrilases are therefore likely to be very important for nitrile biodegradation (enzymatic bioremediation). In addition, enzymes of the nitrilase superfamily have been shown to play different roles in the cell, such as vitamin and co-enzyme metabolism98, detoxification of small molecules99,100, synthesis of signalling mediators101 and post-translational modification of proteins102.

Nitrilase and nitrilase-like enzymes have been identified in several macro and microorganisms, but nitrilases were considered absent in algae103. We did not find publications reporting nitrilases in microalgae but, using our sequence as query, we found homologs in red and green algae as well as land plants, with a percentage of identity of about 60% and labelled as putative nitrilase, nitrilase-like protein 2 or carbon-nitrogen hydrolase (see Supplementary Table S6).

Our transcript was significantly down-regulated in N starvation conditions (3.48-fold down-regulation), suggesting that nutrient starvation affects its function and indicating a putative role in nitrogen metabolism. The cladogram in Fig. 5 shows that it clusters together with other putative nitrilases of green algae (in light green) and is not strictly related to the nitrilases characterized in land plants by Howden et al.39 (in dark green). However, further investigations at the biochemical and molecular levels are needed to unravel the function of these enzymes in the cell.

Figure 5

(a) Phylogenetic tree showing the evolutionary relationships of enzymes belonging to the nitrilase superfamily in the plant lineage. Red algae (Rhodophyta) are highlighted in red, green algae (Chlorophyta) in green and land plants (Spermatophyta) in dark green. Bootstrap values are reported only for internal and basal nodes. (b) Predicted 3D structure of the enzyme using Phyre2. Colours follow a rainbow gradient from the N to the C terminus. Helices in the secondary structure represent α-helices, arrows indicate β-strands and faint lines indicate coils.

Data validation by Reverse transcription-quantitative PCR

Reverse transcription-quantitative PCR (RT-qPCR) of 14 transcripts, selected from among the most up- and down-regulated DEGs with functional annotation (see Supplementary Table S1), showed a good correlation with RNAseq data (R = 0.81, p value < 0.0005; Fig. 6). In particular, transcripts involved in signaling pathways (CAMK, PP2C and SBP), transport (AMT and ABC), stress and antioxidant responses (GST, CAT and HSP20), and carbon metabolism (PPCK) were up-regulated (p < 0.05 for CAMK, PP2C, SBP, GST and PPCK; p < 0.001 for CAT and HSP20). The 3,5-cyclic nucleotide phosphodiesterase (PDE), involved in purine metabolism, and the Elmo-domain-containing protein 3 (ELMOD3), which acts as a GTPase-activating protein, were up-regulated as well (p < 0.05 for ELMOD3 and p < 0.01 for PDE).

Figure 6

Expression levels of selected genes in Tetraselmis suecica cells cultured in nitrogen starvation compared to control conditions (i.e. culturing in complete medium; represented in the figure by the x-axis). Data are represented as log2 x-fold expression ratio ± SD (n = 3). Gene abbreviations are: Calcium/calmodulin-dependent protein kinase type 1 (CAMK), Protein phosphatase 2c family protein (PP2C), Squamosa promoter binding protein (SBP), Ammonium transporter (AMT), ATP-binding cassette protein transporter (ABC), Glutathione-S-transferase (GST), Catalase (CAT), Heat shock protein 20 (HSP20), Phosphoenolpyruvate carboxylase kinase (PPCK), Lipoxygenase (LPX), Polyketide synthase (PKS), Nitrilase (NIT1), 3,5-cyclic nucleotide phosphodiesterase (PDE) and Elmo-domain-containing protein 3 (ELMOD3).

In contrast, lipoxygenase (LPX) and polyketide synthase, both involved in the synthesis of secondary metabolites, and nitrilase, involved in nitrile bioremediation, were significantly down-regulated (p < 0.001 for all). The LPX sequence selected for RT-qPCR primer design was the one that was down-regulated by 3.065 log2 x-fold in the transcriptome differential expression analysis. RT-qPCR data thus confirmed the up- or down-regulation of these transcripts in N starvation (Fig. 6), supporting the reliability of the high-throughput results.
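As an illustration of the arithmetic behind this comparison, the following minimal Python sketch converts a log2 fold change (such as the 3.065 log2 x-fold reported for LPX) into a linear fold change and computes a Pearson correlation coefficient between two sets of log2 fold changes, the kind of statistic reported above as R = 0.81. The function names and the example value lists are hypothetical placeholders, not the data or the analysis pipeline of this study; they only show how such numbers relate to each other.

```python
import math


def log2_to_fold(log2_fc):
    """Convert a log2 fold change into a signed linear fold change.

    Negative inputs are returned as negative fold values, so a
    down-regulation of 3.065 log2 units becomes roughly -8.4-fold.
    """
    fold = 2.0 ** abs(log2_fc)
    return fold if log2_fc >= 0 else -fold


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical log2 fold changes for a handful of transcripts
# (placeholders for illustration only, NOT values from this study).
rnaseq_log2fc = [2.1, 1.4, 3.0, -3.065, -1.8, -2.5]
qpcr_log2fc = [1.7, 1.9, 2.2, -2.4, -2.3, -1.6]

r = pearson_r(rnaseq_log2fc, qpcr_log2fc)
print(f"LPX linear fold change: {log2_to_fold(-3.065):.2f}")
print(f"Pearson R between platforms (toy data): {r:.2f}")
```

Working on the log2 scale keeps up- and down-regulation symmetric around zero, which is why expression ratios from both platforms are usually compared in that form before correlation.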

Conclusion

In summary, this study provides a detailed description of the molecular response of T. suecica to nitrogen starvation, giving a broader picture of how this species copes with N limitation. For example, unlike other green algae4,5 such as C. reinhardtii72 and Tetraselmis sp. M820, T. suecica does not activate transcripts involved in lipid biosynthesis under nitrogen starvation. T. suecica is also able to activate stress and antioxidant transcripts, as well as signalling and solute transporter transcripts, indicating the activation of a series of defense and adaptation strategies to maintain cellular homeostasis and survival.

Our study also identifies enzymes that have never been reported before in T. suecica, such as nitrilase and various PKS domains. Nitrilase, involved in nitrile detoxification95,96, has potential applications in enzymatic bioremediation to clean up hazardous environmental pollutants97. Many of the known nitrilases have drawbacks, such as insufficient stability or selectivity, or low specific activity, that prevent their application, and there is therefore a constant demand for new nitrilases. Our results suggest a possible new source of nitrilase.

PKS enzymes, in turn, are known to be involved in the synthesis of compounds with anti-infective and antiproliferative activities, with possible pharmaceutical applications. The presence of PKS domains in this species suggests the production of still unknown polyketides. Until now, most of the known polyketides have been identified in dinoflagellates80, whereas our study indicates that they may be widespread in other microalgal groups as well. This study confirms that transcriptomic approaches are not only useful for physiological studies but also have the power to discover gene clusters that may be involved in the production of novel metabolites.