Spatio-temporal patterns of Synechococcus oligotypes in Moroccan lagoonal environments

Synechococcus are unicellular cyanobacteria susceptible to environmental fluctuations and can be used as bioindicators of eutrophication in marine ecosystems. We examined their distribution in two Moroccan lagoons, Marchica on the Mediterranean coast and Oualidia on the Atlantic, in the summers of 2014 and 2015 using 16S rRNA amplicon oligotyping. Synechococcus representatives recruited a higher number of reads from the 16S rRNA in Marchica in comparison to Oualidia. We identified 31 Synechococcus oligotypes that clustered into 10 clades with different distribution patterns. The Synechococcus community was mainly represented by oligotype 1 (clade III) in Marchica. Cooccurring clades IV and I had an important relative abundance in Marchica in the summer of 2014, which is unusual, as these clades are widespread in cold waters. Moreover, Clades VII and subcluster “5.3” formed a sizeable percentage of the Synechococcus community in Marchica. Notably, we found low Synechococcus sequence counts in the Atlantic Lagoon. These results showed that the relative abundance of Synechococcus reads is not constant over space and time and that rare members of the Synechococcus community did not follow a consistent pattern. Further studies are required to decipher Synechococcus dynamics and the impact of environmental parameters on their spatial and temporal distributions.


Results
Detection of Synechococcus oligotypes in environmental samples. A total of 535,138 16S rRNA amplicon sequence reads were generated from the microbial populations of four samples collected at two stations on the summer solstice, 21st June in 2014 and 2015. The relative Synechococcus read number compared to the total microbial community varied between both sampling sites during both timepoints ( Fig. 1 and Table 1).
We identified 31 Synechococcus oligotypes, which was equivalent to 95% of all Synechococcus reads analyzed. The most abundant representative Synechococcus reads were used for downstream analyses depending on the sampling location and date. Our phylogeny confirmed that Synechococcus strains are classified into ten different clades based on representative V4-V5 16S rRNA sequences (Fig. 2).
Synechococcus comprised a higher fraction of the microbial population and showed higher relative abundances in the summer of 2014 in Marchica than in 2015. We placed the eight abundant oligotypes within a phylogenetic tree that included known Synechococcus strains (Fig. 2). Table 1 shows higher read counts of Synechococcus oligotypes in the Marchica Lagoon (n = 15,447), (n = 1,667) in the summers of 2014 and 2015, respectively, compared to Oualidia (n = 26), (n = 14) during both the summers of 2014 and 2015.
Interestingly, Oligotype 1 was strongly represented in both sampling lagoons (Table 1), where it comprised a larger segment of the overall Synechococcus community. In contrast to the 2014 summer community in Marchica, a few Synechococcus oligotypes decreased in 2015; for instance, O7 changed from 177 Synechococcus reads to not detected, and O8 changed from 119 Synechococcus reads to not detected. In Oualidia, we noticed the presence of clades III (O1), IV (O4), I (O6, O14, O26), and VII (O7) in 2014 in contrast with 2015, where clade VII was absent and only 5 Synechococcus oligotypes were identified: O1, O4, O6, O9 and O14. sampling site clearly displayed some differences over both space and time, many oligotypes were shared as well. Network analysis allowed visualization of the specificity of the oligotypes and how they were distributed in Mediterranean Marchica and Atlantic Oualidia lagoons and further investigation of which factors influenced this distribution (Fig. 3). We identified oligotypes that were either found in one collected sample, shared by two samples, or present in all samples (Fig. 3). Oligotypes in OSD24-2014 accounted for the largest fraction (25/31), whereas oligotypes shared by the four samples made up a small portion of the total number (4/31). Most oligotypes were shared between both 2014 and 2015 samples of Mediterranean Marchica, in addition to some overlap with Atlantic Oualidia. Among all oligotypes, six were exclusively found in Marchica in 2014 (Fig. 3). Furthermore, the distribution of cooccurring oligotypes was clearly different in Marchica and Oualidia during the summers of 2014 and 2015. Cooccurring Synechococcus oligotypes tended to be well connected to each other (Table 2), showing strong correlations (e.g., O4 (clade IV) and O6 (clade I), = 0.98, O4 (clade IV) and O8 (clade I), = 0.99).
Environmental variables influencing oligotypes diversity. Following principle component analysis (PCA) of oligotype relative abundance, we observed that physicochemical factors in the lagoons correlated with oligotype co-occurrence patterns. The first principle component (PC1) captured 46% of the variance in oligotype relative abundance and discriminated oligotypes according to nitrate. The second principle component explained an additional 27% of the variation and discriminated oligotypes according to salinity and temperature.

Discussion
In a previous study 18 , we used bioinformatics tools to analyze the metagenome and the amplicon 16S sequences to gain an insight into microbial diversity in Moroccan lagoons, namely Marchica and Oualidia. 16S rRNA gene classification revealed a high percentage of bacteria in both lagoons. On average, bacteria accounted for 90%  At the genus level, we found that the highest assigned hits were attributed to Synechococcus, which was highly abundant in Marchica (32%) compared to Oualidia (0.07%) in 2014. This amount dropped to 22% in Marchica and 0.04% in Oualidia in 2015. Hence, in this study we performed the analysis of the Synechococcus genus community using oligotyping to investigate their dynamics and understand their co-occurrence and covariation in space and time within fragile ecosystems such as lagoons. We may divide our results into two emerging Synechococcus communities: one dominated in 2014 and the other was less present in 2015, each composed of different cooccurring Synechococcus oligotypes. The abundant Synechococcus community in Marchica in 2014 consisted of clades I, 5.3, III, IV, and VII. These clades are typically found in either warmer or more oligotrophic environments 19,20 . This result is in accordance with Marchica's environmental characteristics; it is an oligotrophic ecosystem with high primary production and warmer water in summer 21 . The community included clades CB5 and WPC1 in Marchica 2014 and 2015 when the number of Synechococcus reads was lower. Strains belonging to the CB5 clade lack phycourobilin (PUB), contain one motile strain 22,23 , are present in temperate coastal waters and are prevalent in polar/subpolar waters [24][25][26] . WPC1 strains are observed in open-ocean and near-shore waters 1,24,27 . Clades IV and I usually co-occur and are more prevalent in cold coastal waters 19,[28][29][30] . Interestingly, Clade III was prominent in Marchica. This clade is known to be motile and restricted to warm, oligotrophic water 19,20,30 . Although at a smaller read number, clade III was also observed in Oualidia, where the temperature is cooler compared to Marchica. Furthermore, we found that clade III growth has been shown to be severely affected at low temperatures 30 . Moreover, representatives of both clades I and IV were present in Oualidia in both the summers of 2014 and 2015. Some Synechococcus strains,    31 . Their growth rates were marginally lower at low temperatures in strains from clades I and IV, which were dominant in temperate regions. Nitrate levels are typically low or undetectable in these lagoons, which allows the persistence of clades that would not typically thrive in coastal waters at other times of the year. In 2014, the nitrate concentration was higher than the average of 10 mg/l, which could be due to increased agricultural activities and wastewater treatment plant effluent 21 . The decreasing nitrate concentration in Marchica in 2015 could be explained by the newly installed inlet in 2010, which was designed to improve water exchange with the open sea and reduce the amount of suspended matter 21 . Temperature and salinity have a large effect on nitrate in marine ecosystems 32 ; the highest nitrate degradation rates were observed at 35 °C and at increasing salinity rates. Therefore, we expected to see correlations between salinity, temperature and nitrate concentrations. Interestingly, clades CB5 in Marchica and IV in Oualidia increased in relative abundance in summer 2015 compared to 2014, when the nitrate concentration decreased. Moreover, the Synechococcus microbial community diversity and density are variables depending on the variations in the physical and chemical parameters. These parameters are strongly influenced by the marine waters passing through the artificial inlets, which have an impact on the internal hydrodynamics of both lagoons and hence the distribution and co-occurrence of Synechococcus strains. In addition, anthropogenic activities also have a great influence on Synechococcales population growth and interactions with their viruses 33,34 .
This study revealed some differences between Marchica and Oualidia in identified Synechococcus clades. The Marchica lagoon showed more heterogeneity (clades I, II, III, IV, VII, VIII, 5   We presume that the greater variability in oligotype co-occurrence behavior observed in Marchica Lagoon, especially in the summer of 2014, could be due to the higher abundance and diversity of Synechococcus oligotypes, physico-chemical parameter fluctuations or rehabilitation of the lagoon. Less abundant oligotypes could also be considered potential bioindicators of Synechococcus genetic diversity. Their seasonal occurrence might contribute to changing ecological and biogeochemical characteristics of the marine environment 35 . The Synechococcus relative abundance count revealed that the Marchica Synechococcus community included the least abundant oligotypes in 2015. For instance, O7 and O8 were detected in 2014 and were absent in 2015 (Table 1). It is unclear which factors served to constrain the relative abundances of these least present oligotypes, but temperature and salinity could have an impact on their distribution in Marchica (Fig. 4) and the opposite for Oualidia, which are cooler-temperature adapted ones. We noticed that the relative abundance of cooccurring Synechococcus was not constant. For instance, oligotype 4 belonging to Clade IV showed higher values in summer 2014 (974 reads) in Marchica compared to summer 2015 (319 reads), and the opposite was observed in Oualidia, with a lower abundance compared to Marchica. Increased values of cooccurring clade I oligotypes (14, 26, and 6) were detected in the summer of 2014 in both lagoons.
In comparing our results with a study from Little Sippewissett Marsh (LSM) 8 that used oligotyping to investigate the distribution of the genus Synechococcus in space and time sequencing the V4-V6 hypervariable region of the 16S rRNA gene, we found 31 oligotypes, while they identified 12. In both studies, the proportion of Synechococcus oligotypes increased in summer and in coastal waters compared to estuaries. In addition, Clades I and IV were more abundant in saline conditions, such as Marchica Lagoon. However, these clades were found in greater relative abundances at cold temperatures, in contrast to our study, where they were identified in Marchica's warm waters. Moreover, clade CB5 tended to be prominent at relatively warm temperatures (17-20 °C) 6 . In our work, it was not prevalent either in cooler or warmer water. Notably, the relative abundance of rare oligotypes was higher in warm hypersaline estuary waters 8,18 , while in our case study, they occurred in cooler moderately saline Oualidia waters.
The dominance of a certain clade could have many different ecological ramifications, especially as the clades can be incredibly diverse in their growth, loss, nutrient utilization and other attributes. The dominant clade's growth and loss patterns will set the stage for the population dynamics. For instance, if the dominant clade only blooms in a given environmental factor such as temperature, light, or salinity, it will then affect the timing of blooms, and follow-on the effects of subsequent grazing, lysis or even biogeochemical cycling. Even if the www.nature.com/scientificreports/ population is diverse, the dynamics as a whole will be a composite response of each individual clade's ecophysiology, making it important to understand their composition and how it changes over space and time.
While the rpoC1 gene is a higher resolution diversity marker 36 , 16S amplicon data can be used for exploring the entire bacterial assemblage including Synechococcus clade designations via oligotyping 35 . The latter has a great advantage in answering unexplained diversity contained in taxa using 16S rRNA gene sequences. Nevertheless, it has some limitations, as it acts optimally only when performed on taxa that are closely related. Regarding distantly related taxa, the high number of increased-entropy locations makes the supervision steps difficult. In addition, although oligotyping does not rely on clustering conditions or availability of existing reads within reference databases, it demands preliminary operational taxonomic unit clustering to find closely related species appropriate for the analysis. This method is under continuous improvement to better exploit the information within subtle variations in 16S rRNA gene sequences 5 .
In conclusion, we explored the patterns of Synechococcus diversity in space and time using an oligotyping approach to examine these populations in lagoon waters of Mediterranean Marchica and Atlantic Oualidia, in Morocco. Patterns that have been observed at the clade and subclade levels, such as Synechococcus, relative abundance and the co-occurrence of groups from different clades, were shown to occur among oligotypes. The Marchica Lagoon showed a heterogeneous Synechococcus diversity compared to Oualidia in summer 2014. Thirty-one Synechococcus oligotypes were identified. Two distinct communities emerged in the 2014 and 2015 summer samplings, abundant and rare Synechococcus species, each comprising cooccurring Synechococcus oligotypes from different clades. Network analysis showed that six oligotypes were exclusive to Marchica Lagoon. The identified clades I, III, IV, VII, and 5.3 in Marchica were in accordance with its environmental characteristics. In addition, the relative abundance of some cooccurring Synechococcus strains was not constant over time and space (e.g., clades I and IV). Using gene oligotyping, we illustrated some of the challenges associated with the identification of novel Synechococcus strains or studied their co-occurrence in space and time. Oligotyping has been instrumental in discriminating closely related Synechococcus strains. However, this study leaves open questions about how samples differ by location and whether locations differ from year to year. Do cooccurring oligotypes interact with each other and to what extent do they correlate with physicochemical parameters? What triggers the coexistence of clades I and IV with clade III in warm water or 5.3 with VII, which do not know much about. Finally, how do relative abundances change over seasons. Hence, future work needs to consider additional stations and seasons to provide better statistical support for our findings and to better understand their correlation with physical and chemical environmental parameters. Other factors were not considered in this study, such as nutrient availability, chlorophyll, irradiance, viral lysis, and greater sequencing depth, which could also influence the observed seasonal dynamics. www.nature.com/scientificreports/ duced "workable" amplicon fasta files. We used VAMPS 38 to process 16S rRNA gene sequences, where taxonomy assignment was performed using Global Alignment for Sequence Taxonomy (GAST) 39 and the SILVA rRNA gene reference database 40 . The obtained files include the reference ID, the taxonomy assigned, and the source of the taxonomy.

Methods
Oligotyping. For Synechococcus investigation, we used 16S rRNA gene oligotyping as described in 5 . This method is based on a supervised algorithm that identifies microdiversity using 16S rRNA gene sequences. Oligotyping is unlike regular taxonomic classification based on available reference databases available sequences or cluster analysis based on the selection of the similarity threshold. This technique tackles the taxonomic resolution limitation by finding the most information-rich nucleotide positions (i.e., oligotypes). Sequences identified as Synechococcus were extracted from the Vamps database. We aligned Synechococcus reads using PyNAST 41 . Of the 22,387 sequences identified as Synechococcus, 17,941 remained after quality filtration and Pynast alignment. The mean length of Synechococcus reads was 254 bp. Next, we removed the uninformative gaps in the resulting aligned sequences using the "o-trim-uninformative-columns-from-alignment" script. Subsequently, we calculated the entropy of each nucleotide position within the oligotype package. After the initial calculation of Shannon entropy using the "analyze-entropy" script, we ran 16S rRNA oligotyping for the Synechococcus genus until each oligotype had converged. Uninformative nucleotide positions were excluded. Seven nucleotide positions were used in total to define each oligotype, and to minimize the impact of sequencing errors on oligotyping results, we used a "minimum substantive abundance" criterion (M) of 5; thus, an oligotype was not included if the most common sequence for that type occurred less than five times. To reduce the noise, each oligotype was required to appear in at least one sample but was not required to comprise a certain percentage of reads or represent a minimum number of reads in all samples combined. We removed any oligotypes that did not meet these criteria from the analysis. The final number of quality-controlled oligotypes revealed by the analysis was 31 and represented 95% of the total Synechococcus reads. For each oligotype, the oligotyping pipeline chose the most abundant read as the representative sequence to be used for downstream analyses. Upon completion of oligotyping analysis, the resulting "observation matrices" are concatenated to generate a single "observation matrix" for our V4-V5 dataset. These observation matrices report counts, which are the number of reads assigned to each oligotype in each sample (Table 1). We then converted counts to percent abundances within each sample and used these normalized relative abundances for subsequent analyses. We searched the most biologically relevant representative sequence of our oligotypes using blastn version 2.2.26 to assign taxonomy for each oligotype. We kept default parameters, except 'per. identity 100' to have hits with 100% sequence identity reported.
Oligotype network analysis. We performed network analysis using Gephi software, version 0.9.2 42 , to determine the distribution of all Synechococcus oligotypes from both lagoons using a force-directed graph algorithm (ForceAtlas2 in Gephi software). Every dot identifies an oligotype present in at least one sampling site, and each edge on the network connects an oligotype to one or more sampling sites.
Clade identification. We designated a clade for each oligotype's representative sequence by matching this latter to a key reference database of 16S rRNA gene sequences from cultured Synechococcus 6 . Synechococcus sequences downloaded from NCBI GenBank clade classifications were obtained from the following sources 4,6 . We added the representative sequences for each oligotype to the Synechococcus database, and we aligned them with Muscle (version 3.8.4, 17 ). We used exact matches between each oligotype Synechococcus sequence and the Synechococcus sequence database to infer clade designation.

Statistical analyses.
To group oligotypes statistically, we computed a principal components analysis (PCA) using R package "ggfortify" with respect to a sample matrix of Synechococcus oligotype reads normalized to total Synechococcus reads for that sample. Each oligotype was projected onto the first two PCs of the matrix.
To investigate environmental correlates of oligotype grouping, multiple regressions of each of the first two PCs were computed against the three environmental factors, which are the water temperature, salinity, and the concentration of nitrate.