Date: 2020-09-12

C. burkhardae in the Malaspina dataset

Marine microbes (0.2–3 µm size fraction) were collected during the Malaspina expedition at 120 surface stations and in 13 vertical profiles of 7 depths, from the surface to the bathypelagic zone. Eukaryotic diversity was assessed by sequencing the V4 region of the 18S rDNA. Details of sample collection, nucleic acid extraction, V4 amplification, and Illumina sequencing are presented elsewhere for the surface data [30] and the vertical profiles [31]. Here, we processed the reads using DADA2 [32] with parameters truncLen 240,210 and maxEE 6,8 and identified the ASV (Amplicon Sequence Variant) corresponding to C. burkhardae. Its relative abundance was calculated against the number of reads per sample after removal of metazoan and plant reads. Metagenomes of the same size fraction in the vertical profiles were generated from the same cruise [33] and used in a BLAST [34] fragment recruitment analysis against the C. burkhardae genome [24]. Direct cell counts were performed in 13 surface samples by FISH as described previously [29, 35].
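The relative abundance calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the ASV names, the read counts, and the taxonomic labels used to flag metazoan and plant reads are all hypothetical.

```python
def relative_abundance(asv_counts, taxonomy, target_asv,
                       excluded_groups=("Metazoa", "Streptophyta")):
    """Fraction of a sample's retained reads that belong to target_asv.

    asv_counts: dict mapping ASV id -> read count in one sample.
    taxonomy:   dict mapping ASV id -> taxonomic group label (hypothetical labels).
    Reads from ASVs in excluded_groups are dropped before computing the total.
    """
    retained = {asv: n for asv, n in asv_counts.items()
                if taxonomy.get(asv) not in excluded_groups}
    total = sum(retained.values())
    if total == 0:
        return 0.0
    return retained.get(target_asv, 0) / total


# Hypothetical sample: 1520 reads, of which 520 are metazoan or plant.
sample = {"asv_cafeteria": 30, "asv_copepod": 500,
          "asv_plant": 20, "asv_other": 970}
groups = {"asv_cafeteria": "Stramenopiles", "asv_copepod": "Metazoa",
          "asv_plant": "Streptophyta", "asv_other": "Alveolata"}

# 30 target reads out of 1000 retained reads.
cafeteria_fraction = relative_abundance(sample, groups, "asv_cafeteria")
print(cafeteria_fraction)
```

Removing the excluded groups before computing the denominator keeps large multicellular contaminants from deflating the apparent abundance of picoeukaryotes.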

Growth of C. burkhardae on Dokdonia sp.

The flavobacterium Dokdonia sp. MED134 was isolated on Zobell agar plates from the Blanes Bay Microbial Observatory [36]. To prepare cell concentrates, a colony was inoculated in 50 mL of Zobell medium and incubated at 22 °C for 3 days. Cells were collected by centrifugation (4500 rpm for 15 min), resuspended in sterile seawater (filtered through 0.2 µm and autoclaved), centrifuged again, resuspended in 100 mL of sterile seawater, and kept at 4 °C for 1 week. To calculate the cell abundance of the concentrate, one aliquot was fixed with ice-cold glutaraldehyde (1% final concentration), stained with DAPI, and filtered on a 0.2 µm pore-size polycarbonate filter. Filters were mounted on a slide and cells were counted by epifluorescence microscopy under UV excitation [37].

C. burkhardae strain E4-10 was isolated in 1989 [38] and maintained on a rice grain with artificial seawater. The culture was acclimated to grow on Dokdonia MED134 as prey in two steps. First, 0.1 mL of the culture was inoculated in a flask with 20 mL of sterile seawater and 10⁸ bacteria mL⁻¹ for 5 days. Second, 1 mL of this culture was inoculated into 400 mL of sterile seawater with 2.4 × 10⁷ bacteria mL⁻¹ for 1 week. Flagellate growth was inspected by light microscopy through the culture flasks. Incubations were done at 22 °C on the lab bench.

Batch cultures, dilution event, and RNA extraction and sequencing

Three batch cultures were prepared with 400 mL of sterile seawater, Dokdonia MED134 at 2.5 × 10⁷ cells mL⁻¹, and 1 mL of C. burkhardae from the last acclimation bottle. Aliquots of 3 mL were fixed with glutaraldehyde, and the abundances of flagellates and bacteria were counted by epifluorescence microscopy immediately after sampling. Flagellate growth rates were calculated as the slope of the linear part of the log-transformed cell numbers versus time. Grazing rates were calculated from the growth rates, the slope of the logarithmic decrease of bacteria, and the geometric means of flagellate and bacterial abundances, using the formulas of Frost [39] and Heinbokel [40]. Growth efficiency was calculated from the growth and grazing rates and the estimated carbon per cell of both species, obtained from cell sizes measured under the microscope [41].
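The rate calculations can be illustrated with a short sketch. It is one simplified reading of the approach, not the exact computation used here: it assumes natural logarithms, assumes that the decline in bacteria is due to grazing alone, takes the geometric mean of the first and last counts as the mean abundance, and uses invented cell counts.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def log_rate(days, cells_per_ml):
    """Slope of ln(abundance) versus time, in d^-1 (positive means growth)."""
    return ols_slope(days, [math.log(c) for c in cells_per_ml])


def grazing_rates(days, flagellates, bacteria):
    """Return (clearance, ingestion) per flagellate per day.

    Assumption: bacterial losses are caused only by grazing, so the
    bacterial loss rate g is minus the slope of ln(bacteria) vs time.
    clearance = g / mean flagellate abundance   (mL flagellate^-1 d^-1)
    ingestion = clearance * mean bacterial abundance (bacteria flagellate^-1 d^-1)
    """
    g = -log_rate(days, bacteria)
    mean_flag = math.sqrt(flagellates[0] * flagellates[-1])
    mean_bact = math.sqrt(bacteria[0] * bacteria[-1])
    clearance = g / mean_flag
    return clearance, clearance * mean_bact


# Invented counts restricted to the exponential (linear in log scale) phase.
days = [0.0, 1.0, 2.0]
flag = [1.0e3, 4.0e3, 1.6e4]      # flagellates mL^-1
bact = [2.5e7, 1.25e7, 6.25e6]    # bacteria mL^-1

mu = log_rate(days, flag)
clearance, ingestion = grazing_rates(days, flag, bact)
print(mu, clearance, ingestion)
```

Only the time points that fall on the linear part of the log-transformed curve should be passed in; including lag or stationary points would bias the slope downward.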

Samples for transcriptomics were taken in triplicate from the last acclimation bottle (Inoculum), and in duplicate from each of the three bottles at the exponential (day 2.3) and stationary (day 3.7) phases. Cells were collected in microfiltration units of 0.8 µm pore size (Vivaclear MINI 0.8 µm PES, Sartorius, Göttingen, Germany). For each sample, four units were filled with 0.5 mL of culture and spun down for 30 s at 1000 rpm, and this step was repeated until 10 mL had been processed. Next, 100 µL of lysis buffer from the RNAqueous-Micro kit (Thermo Fisher Scientific, Waltham, Massachusetts, US) was added to each unit, vortexed, and left for 1 min, and the lysate was spun down at 13,000 rpm for 30 s. The four cell lysates from the same sample were combined and the RNA was extracted following the kit’s protocol. Genomic DNA was removed with DNase I. RNA quantity and purity were assessed with a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific) and the RNA extracts were kept at −80 °C.

During the exponential phase, three dilutions (10 mL of culture in 190 mL of sterile seawater) were prepared from each batch culture and processed after 0.4, 1.4, and 3.3 days for cell counts (5 mL) and RNA extraction (195 mL). As these large volumes prevented the use of microfiltration units, cells were collected on 47 mm polycarbonate filters of 0.8 µm pore size. Filters were cut into four pieces, submerged in 1 mL of lysis buffer, vortexed, and left for 30 s. The lysate was recovered and the RNA was extracted as before.

Polyadenylated RNA transcripts were converted into cDNA following the Smart-seq2 protocol [42], which is designed for very low RNA amounts. In brief, oligo-dT30VN primers were annealed to all mRNAs containing a poly(A) tail, then reverse transcription and template switching were performed, followed by 9 cycles of PCR amplification using IS PCR oligos linked at the two ends of the cDNA molecules [42]. Amplified cDNA was purified and quantified with a Qubit fluorometer (Thermo Fisher Scientific). The complete set of 24 cDNA samples (15 µL at 2–4 ng L⁻¹) was sent to the Sequencing + Bioinformatics Consortium at UBC and, based on the BioAnalyzer results (Agilent, Santa Clara, California, US), 21 samples were chosen for sequencing (Table S1). Illumina Nextera XT libraries with a dual index were prepared and pooled on a single lane of a NextSeq Illumina sequencer, yielding on average 14.1 million 150 bp paired-end reads per sample (Table S1). Raw reads have been deposited in ENA under the accession number PRJEB36247.

Transcriptome assembly, functional annotation, and DE analysis

Quality trimming of Illumina reads was done using Trimmomatic 0.33 [43] with parameters set to crop:149 slidingwindow:6:25 minlen:50. This removed about one third of the reads per sample (Table S1). High-quality reads were mapped with Bowtie2 [44] against the genome of Dokdonia MED134 (3.3 Mb; CP009301) and the C. burkhardae rDNA operon (5800 bp; extracted from a genome contig with the 18S rDNA [KY886365] and the 28S rDNA [FJ032656]). We used Bowtie2 in the sensitive mode, which allows no mismatches in the seed alignment, and removed the mapped reads from the sequencing files. Reads mapping to the bacterial genome were highest in the exponential, intermediate in the dilution, and lowest in the stationary stages (Fig. S1a), while reads mapping to the eukaryotic rDNA operon were similar in all cases (Fig. S1b). Cleaned reads from all samples (4.9 million on average, Table S1) were co-assembled using Trinity-v2.4.0 [45]. The initial transcriptome consisted of 70,652 isoforms, of which the longest one per gene was retained, resulting in 48,502 transcripts. These were compared using BLAST against the genome [26] and the transcriptome [25] of C. burkhardae, and annotated by Trinotate using the UniProt [46], Pfam [47], and eggNOG [48] databases. We retained transcripts having a match to the genome or the transcriptome, or annotated as Eukaryota (19,215 left). Cleaned reads were mapped to this set with RSEM [49] and we kept the 15,887 transcripts that appeared in at least 3 samples (0.3% of the signal removed). An additional BLASTn search removed obvious bacterial and viral genes (15,123 left). Transcripts with several ORFs identified by TransDecoder [45] were split when a different function was predicted for each ORF: 866 were split in two, 92 in three, and 12 in four parts. The expression level of split regions was often very different (Fig. S2). Gene space completeness of the final curated transcriptome of 16,209 genes was estimated with BUSCO V3 [50].
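The step from 15,123 transcripts to the final 16,209 genes follows from the splitting counts given above, since a transcript split into k parts adds k − 1 sequences to the set. The short check below makes that bookkeeping explicit; the function name is ours and serves only as an illustration.

```python
def count_after_splitting(n_before, split_counts):
    """Number of sequences after splitting multi-ORF transcripts.

    split_counts maps the number of parts (k) to the number of transcripts
    split into k parts. Each such transcript contributes k - 1 extra sequences.
    """
    return n_before + sum((k - 1) * n for k, n in split_counts.items())


# Figures taken from the text: 866 split in two, 92 in three, 12 in four.
final_count = count_after_splitting(15123, {2: 866, 3: 92, 4: 12})
print(final_count)  # 15123 + 866 + 184 + 36 = 16209
```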

The curated transcriptome was further processed using TRAPID [51] to annotate sequences with InterPro domains [52]. The processing strategy outlined in the original publication was slightly modified: the sequence similarity search was performed using DIAMOND [53] in ‘more-sensitive’ mode (e-value cutoff of 10⁻⁵) against a stramenopile-oriented PLAZA database [54] comprising genomic data of 35 organisms including C. burkhardae (Table S2). Functional annotation was transferred from the top protein hit and its assigned gene family.
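Selecting the top hit per transcript can be sketched as below. This is not TRAPID's internal code: it assumes the similarity search results are available in the standard 12-column BLAST tabular layout (which DIAMOND can also write), it defines the top hit as the one with the highest bit score, and the sequence identifiers are invented.

```python
def top_hits(tabular_lines, max_evalue=1e-5):
    """Best hit per query from 12-column tabular search output.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    Hits with an e-value above max_evalue are ignored; among the rest,
    the hit with the highest bit score is kept for each query.
    Returns a dict: query id -> (subject id, e-value, bit score).
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Invented hits: t1 has two significant hits, t2 only a weak one.
example = [
    "t1\tprotB\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t80.0",
    "t1\tprotA\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t150.0",
    "t2\tprotC\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-3\t35.0",
]
hits = top_hits(example)
print(hits)
```

A transcript with no hit below the cutoff, such as t2 here, is simply absent from the result and would stay without a transferred annotation.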

Cleaned reads were mapped to the curated transcriptome using RSEM. The TPM (Transcripts Per Million) table was used for sample comparison by NMDS and for DE analyses with edgeR [55]. The latter tool detects DE genes (logFC > 2 and FDR-corrected p values < 10⁻³) in pairwise sample comparisons. InterPro domain enrichment analysis of gene sets showing a specific expression profile (e.g. genes upregulated in the exponential versus the stationary phase) was performed with TRAPID using the hypergeometric distribution, with a maximum Benjamini–Hochberg corrected p value cutoff of 0.05 and the entire curated transcriptome used as background. Enriched protein domains were manually assigned to general processes and cellular functions.
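The enrichment statistics can be illustrated with a small, self-contained sketch. It shows a generic one-sided hypergeometric test followed by the Benjamini–Hochberg adjustment, written from the textbook definitions; it is not TRAPID's implementation, and the numbers in the example are invented.

```python
from math import comb


def hypergeom_enrichment_p(background_size, background_with_domain,
                           set_size, set_with_domain):
    """One-sided hypergeometric p value, P(X >= set_with_domain).

    X is the number of genes carrying the domain when set_size genes are
    drawn without replacement from a background of background_size genes,
    of which background_with_domain carry the domain.
    """
    N, K, n, k = (background_size, background_with_domain,
                  set_size, set_with_domain)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example: 3 of 10 background genes carry a domain,
# and 2 of the 4 genes in the set of interest carry it.
p_domain = hypergeom_enrichment_p(10, 3, 4, 2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
enriched = [q <= 0.05 for q in adjusted]
print(p_domain, adjusted, enriched)
```

In a real analysis one p value would be computed per InterPro domain, with the curated transcriptome as background, and the adjustment would be applied across all tested domains at once.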