Introduction

Chromatium okenii, belonging to the purple sulfur bacteria (PSB, family Chromatiaceae), was described in the environment as massive purple blooms as early as 1838 by the microbiologists Ehrenberg and Weisse as: “Monas corpore cylindrico, aequabili, parumpcr curvato, ter quaterve longiore quam lato, utrinque rotundato, 1/192 lineae attingens, volutando procedens, vacillans, rubra; socialis1, and was subsequently characterized in more detail by Maximilian Perty in 18522 and Sergei Winogradsky in 18883. Light microscopy was used to study changes in microbial motility patterns under opposing gradients of light and sulfur4,5. C. okenii thereby showed scotophobotaxis – i.e. the sudden reorientation of a moving microorganism when experiencing a decrease in light intensity over time – negative aerotaxis and positive chemotaxis towards H2S in vitro6,7,8. Populations of C. okenii have since been found in freshwater ecosystems worldwide such as lakes, ponds and bacterial mats in bogs9,10,11,12,13. Schlegel and Pfennig managed to isolate a C. okenii strain later designated DSM 169T from a pond in Germany in 196014, however the type strain got lost over time. C. okenii is described as rod-shaped, 5.0 to 6.5 µm wide and 8.0 to 10.0 µm long, stains Gram-negative, contains okenone and bacteriochlorophyll a (BChl a) as the main photosynthesis pigments, and is motile through flagella (Fig. 1a)15. The cells contain intracellular carbon storage compounds, such as polyhydroxybutyrate (PHB), glucose, glycogen, polyphosphate and also elemental sulfur globules that typically functions as an electron donors for anaerobic photosynthesis in PSB16.

Figure 1
figure 1

Visual characterization of Chromatium okenii strain LaCa. (a) Microscopic image of C. okenii str. LaCa cells enriched form water sample from Lake Cadagno take at 12 m depth. Intracellular sulphur globules are visible as yellow, highly-refractive spheres. The polar flagellar tuft is visible (black arrow). Cells were directly mounted on a microscopy slide in 0.9% NaCl solution. An Axio Imager 2 microscope (Carl Zeiss Microscopy GmbH, Germany) with a EC Plan NEOFLUAR objective (100x, phase contrast) and an AxioCam MRc Rev3 digital camera were used to take photomicrographs. Images were processed with the AxioVision SE64 v4.8.2 software suite. (b) Visible C. okenii cell pellets enriched from a water sample from Lake Cadagno after 10 min centrifugation at 15 g.

The meromictic Lake Cadagno in the Swiss Alps (46°32′59″N 8°42′41″E) is a prime example of a stratified, sulfidic environment where anoxygenic phototrophic sulfur bacteria of the families Chromatiaceae (purple sulfur bacteria; PSB), such as Chromatium, Thiodictyon, Lamprocystis and Thiocystis spp. and Chlorobiaceae (green sulfur bacteria; GSB) of the Chlorobium genus, thrive to dense populations during the summer months. At a depth of around 12 m, sulfide of a concentration of approximately 0.2 mM and the relative light availability of around 5 µmol quanta m−2 s−1 allows this heterogeneous community to grow up to 107 cells mL−1 by anoxygenic photosynthesis17,18. The presence of an at least 9,450 year old Chromatium sp. 16S rRNA-gene sequence in Lake Cadagno sediments has been demonstrated19 and in recent years a unique C. okenii strain has been detected using a combination of fluorescent in situ hybridization (FISH) and 16S rRNA gene analysis20,21,22. Whereas C. okenii represents only 1–10% of all bacterial cells and 22–83% of the phototrophic community21,22,23,24,25, due to its large size it comprises to 72% of the total biovolume of the microbial chemocline population24. Large seasonal variations of C. okenii were observed with concentrations ranging from 104–106 cells mL−1. When cell concentrations were monitored using FISH and flow cytometry in Lake Cadagno, C. okenii dominated over small celled PSB from July to September that was then followed by a C. okenii population decline within two weeks in October21,24. However, due to a mixing event in 2000 and the following massive bloom of the Chlorobium clathratiforme, increasing the total number of phototrophic sulfur bacteria three-fold between 2000–2005, also less regular C. okenii population dynamics were observed26, suggesting that environmental influences may have a long lasting impact on microbial community composition26. With a doubling time of 5 to 7 days, C. okenii was found to assimilate up to 70% of total carbon and 40% of ammonium with light27,28. Additional evidence was given by analyses of bulk carbon isotope fractionation (between δ13CPOC and δ13CDIC) in the Cadagno chemocline, in which 36% and 52% of the total bulk δ13C -signal was attributed to C. okenii in October and June, respectively29. As grazing on C. okenii by the ciliate Trimyema compressum was shown in vitro30, C. okenii may also function as a major food source for zooplankton in Lake Cadagno.

The importance of C. okenii in bioconvection in Lake Cadagno has been discussed theoretically31. Interestingly, a spatial and temporal correlation of convection in zones with high concentrations of C. okenii (105–106 cells mL−1) was lately inferred in situ32. Short-time dynamics in sulfide uptake of C. okenii and putative interactions between C. okenii and the GSB Chlorobium phaeobacteroides were also demonstrated recently22. And in another novel in situ study by Berg et al. on C. okenii in Lake Cadagno, it was found that anaerobic sulfide oxidation is coupled to aerobic respiration using sulfide as electron acceptor by active vertical movement from the oxic to anoxic parts of the water column and back33.

Herein, we provide the first annotated high-quality draft genome for a member of a large celled PSB genus Chromatium, namely C. okenii strain LaCa, enriched from Lake Cadagno. Importantly, the available sequence data on PSB and GSB isolates of Lake Cadagno enabled us to first, compare core-genomes and, secondly to further elucidate on strain-specific biological functions. The successful enrichment and the high quality draft genome of C. okenii strain LaCa are fundamental for a detailed understanding of nutrient fluxes and microbial interactions within the balanced ecosystem of the Lake Cadagno chemocline and is an important addition to our understanding of global microbial sulfur cycling.

Material and Methods

The complete materials and methods section has previously been described in Luedin 201834.

Chemicals

All chemicals were purchased from Sigma-Aldrich AG (Buchs, Switzerland), if not further specified.

Enrichment

Physicochemical measurements on Lake Cadagno were made with an YSI 6000 profiler (Yellow Springs Inc., Yellow Springs OH, USA) on 14 July 2016. In order to understand carbon isotope fractionation of PSB and GSB strains, C. okenii was previously enriched to high purity by sedimentation and dilution, however cultivation was not established29. We used a comparable approach for this study. Samples were collected with a 1 L Ruttner-type sampling bottle (Hydrobios Apparatebau GmbH, Germany) taken at depths with maximum turbidity and rapid changes in redox-potential between 11.6–12.0 m, indicating a dense bacterial population at the chemocline, as previously described31. The bottles were brought immediately to the laboratory and were placed at natural illumination (2000 lux PAR, or about 36 µmol quanta m−2 s−1) at 16 °C for 6 hours. The purple precipitates thus obtained (Fig. 1b) were identified by cell morphology with light microscopy as Chromatium sp. based on previous descriptions in the literature (e.g. ref.35). Other bacteria were also present, however only in low numbers (<10%). Cells were collected with a 10 mL pipette and transferred to 50 ml tubes. The cells were then centrifuged 10 min at 15 g at room temperature (RT). The supernatant was carefully discarded, and the residual 10 mL were collected and combined in 100 ml serum bottles. The bottles were then filled up with filtered (0.45 µm) chemocline water. A rubber plug was applied, and a vacuum was generated with a suction pump to remove O2. Using a syringe, 100 µL of a 35 mM Na2S 9 H2O solution was added to each sample, resulting in a final concentration of 0.03 mM. Subsamples were conserved at −20 °C before DNA extraction.

Light Microscopy

Cells were directly added directly to on the microscopy slides in 0.9% NaCl solution. An Axio Imager A.2 microscope (Carl Zeiss Microscopy GmbH, Germany) with an EC Plan NEOFLUAR objective (100×, phase contrast) was used to examine the enrichments. Images were taken with an AxioCam MRc Rev3 digital camera and images were processed with the AxioVision SE64 v4.8.2 software suite (Carl Zeiss Microscopy GmbH).

DNA Extraction, Sequencing and Genomic Analysis

Frozen samples were thawed on ice and cells were collected by centrifugation for 15 min at 10,600 g. Genomic DNA (gDNA) was extracted with phenol/chloroform/isoamylalcohol solution (25:24:1, v/v) adhering to the protocol provided by Pacific Biosciences in combination with phase lock gels for phase separation (VWR International, Radnor, USA). gDNA was concentrated and washed using AMPure beads (Agencourt, Beckman Coulter Life Sciences, Indianapolis, USA) following the E2612 protocol from New England Biolabs36. Concentration of the DNA was assessed using a Qubit UV/VIS absorption reader (Thermo Fisher Scientific, Rheinach, Switzerland).

The library construction and Single-Molecule Real-Time sequencing (SMRT) was done on the Pacific Biosciences RS II platform at the Functional Genomic Center Zurich, Zurich, Switzerland. A 10 kb SMRTbell library was constructed using the DNA Template Prep Kit 1.0 (Pacific Biosciences, Menlo Park, USA). SMRTbell template fragments over 10 kb length were used for creating a SMRTbell-Polymerase Complex with P6-C4 chemistry (Pacific Biosciences) according to the manufacturer instructions. Two SMRT cells v3.0 (Pacific Biosciences) for PacBio RS II chemistry were used for sequencing. Sequencing quality reports were created through the SMRT portal software.

PacBio RSII reads were assembled using the canu assembler v1.4, [RRID:SCR_015880]37. Genes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline GeneMarkS+, v4.3 [RRID:SCR_011930]. PhiSpy v2.338, PHASTER39 and VIRSorter v1.0.340 were used to detect phage and prophage related sequences. EggNOG [RRID:SCR_002456]41 was used to classify the predicted genes into COG-categories and OrthoVenn42 was used to classify gene families and visualize clustering. Genes were to assigned to KEGG categories with the blastKOALA v2.1 [RRID:SCR_012773]43. AmphoraNet [RRID:SCR_005009]44, BUSCO [RRID:SCR_015008]45 and CheckM v1.0.12 [RRID:SCR_016646]46 were used to asses genome completeness and contamination. Centrifuge v1.0.347 was used to classify the PacBio raw reads against the NCBI nt database (including additional genomic sequences of Lamprocystis strain CadA31 and “T. syntrophicum” strain Cad 16T). The CRISPR-CAS++ v1.0.548 web server was used to infer Clustered regularly interspaced short palindromic repeats (CRISPR) and associated proteins (Cas) in the C. okenii genome. CRISPR-Cas arrays with an evidence level of 1 were excluded from the analysis.

Phylogenetic Analysis

Roary49 was used to compare the core genomes of sequenced Chromatiaceae. Out of this dataset, 100 single-copy orthologues were selected randomly and their sequences aligned with MUSCLE50. The best-fit phylogenetic model and subsequent consensus tree estimation, based on maximum-likelihood and 1,000 bootstrap iterations, was performed with the W-IQ-TREE v1.6.3 platform51.

A C. okenii strain LaCa full length 16S rRNA gene sequence (CXB77_RS15475) was used to search 16S rRNA gene NCBI database for related sequences with BLASTn [RRID:SCR_001598]. The online tool W-IQ-TREE v1.6.352 was used to create the phylogenetic trees based on the alignment with MAFFT v7 21553 [RRID:SCR_011811]. A combination of 1,000 bootstrap iterations and 1,000 aLRT replications were performed. FigTree v1.4.354 was used to render phylogenetic trees.

Results and Discussion

Genomic Features and Phylogeny

The de novo sequencing of an enrichment of C. okenii strain LaCa was successfully done with a PacBio RSII system using two SMRTcells. A total of 45 contigs were assembled with a total length of 3,784,749 bp, a N50 of 448,938 bp and a L50 of 3. The GC content was found to be 49.8% (Table 1). More details on the sequencing output can be found under Supplementary Figure S1a and a classification of C. okenii strain LaCa is given by Supplementary Table S1. The chromosome comprises three long contigs (PPGH01000034.1, PPGH01000035.1 and PPGH01000037.1) with coverage values between 24× and 27× (Supplementary Fig. S1b). Further five contigs (PPGH01000013.1, PPGH01000018.1, PPGH01000010.1, PPGH01000038.1 and PPGH01000036.1) were found to be associated by partial sequence overlaps (Supplementary Fig S1b) and showed an average coverage of 25× (22–27×). Due to repetitive sequences of >20 kb we could not circularize the chromosome and therefore created a pseudo-circular map with contig borders indicated (Supplementary Fig. S1c). Additional four putative circular sequences (PPGH01000033.1, PPGH01000043.1, PPGH01000024.1 and PPGH01000027.1) were identified (Supplementary Fig. S1b). The other 33 shorter contigs with coverage <22× showed partial or complete overlap with the seven longest contigs.

Table 1 Genome statistics for Chromatium okenii strain LaCa.

The genome of C. okenii strain LaCa was considered as a high quality draft due to the high number of single copy genes when analysed with BUSCO; 129 complete/single copy genes, amphoraNet; 40 genes homologous to Allochromatium vinosum and CheckM; 88.89% (490 of 545) marker genes specific for Chromatiaceae, respectively. The initial assembly was then reduced to the seven longest contigs and the linking contig PPGH01000036.1 and repeated completeness and contamination analysis was repeated using CheckM. Completeness was thereby found to be 88.86%, whereas contamination could be reduced to 1.11%. Additionally, the number of multiple marker genes was reduced from 56 to 8 (Supplementary Tables S2 and S3 and Supplementary Fig. S2), and tetranucleotide frequency and GC content were found to be uniform amongst the seven longest contigs (Supplementary Fig. S3). Mapping of raw PacBio reads to the NCBI nt database and other sequenced PSB isolates from Lake Cadagno revealed that 96% of all reads assigned to Chromatiales mapped on the C. okenii strain LaCa contigs (Supplementary Table S4).

COG classification of the complete dataset of the 3,016 protein coding genes resulted in 2,022 assigned proteins (Table 2).

Table 2 Clusters of Orthologous Genes (COG) functional categories of Chromatium okenii strain LaCa.

Phylogeny based on the 16S rRNA gene revealed a 99% sequence identity with C. okenii DSM 169T and C. okenii strain LaCa groups with Allochromatium and Thiocystis spp. (Fig. 2). When comparing a subset of 100 core genes of sequenced Chromatiaceae, C. okenii is closely related to T. violascens and A. vinosum (Supplementary Fig. S4). Since the C. okenii DSM 169T type strain or other Chromatium spp. are not available anymore in strain collections54, a more detailed genomic comparison within the genus Chromatium was not possible.

Figure 2
figure 2

Phylogenetic relationship of Chromatium okenii strain LaCa based on 16S rRNA gene sequences. IQTree49 was used to calculate a consensus tree combining 1,000 bootstrap iterations and 1,000 aLRT replications. Branch lengths were optimized by maximum likelihood on original alignment. Full node circles indicate bootstrap support above 60%. Scale bar denotes genomic distance in base-pair (bp) substitutions per 100 bp.

Genome Features

An overview of the features discussed below is depicted in a cell scheme in Fig. 3.

Figure 3
figure 3

Schematic Chromatium okenii cell with the discussed metabolic and structural features indicated. (Phage scheme adapted from “PhageExterior.svg” by “Adenosine”, used under the ‘Creative Commons’ Attribution - Share Alike 3.0 Unported license).

Photosynthesis and Chemotrophy

For C. okenii, an extensive system of photosynthetic membranes has been characterized55 where type II reaction centres (RCs) are used to transform light energy into chemical energy. In the strain LaCa genome, a canonical PSB photosystem II-type RC is encoded in two clusters, containing two pufAB (CXB77_RS07530, CXB77_RS07535, CXB77_RS07560 and CXB77_RS07565), pufL (CXB77_RS07555) and pufM (CXB77_RS07550) genes, a RC complex subunit H puhA (CXB77_RS09135) and a putative photosynthetic complex assembly protein (CXB77_RS09130). Additional light harvesting complex LHC I and LHC II genes (CXB77_RS02170 and CXB77_RS02175) and two extra pairs of pufAB genes were identified (CXB77_RS10755–CXB77_RS10765). In accordance with findings in other PSB, multiple gene copies of LHC are thought to be an adaptation to changes in light availability56 and were also found in “Thiodictyon syntrophicum” strain Cad16T and A. vinosum DSM180T57. For C. okenii, low light adaptation was elucidated by measuring fluorescence kinetics in situ58 and quantum yields below the optimum were observed59. In PSB, light is taken up efficiently by photosynthetic pigments and the energy obtained is then further transferred to the reaction centers (RCs). Both, the carotenoid okenone60 and BChl a14 are synthesized in C. okenii strain LaCa. In agreement, the complete genes encoding for BChl a synthesis (CXB77_RS09140–CXB77_RS09180) and of the carotenoid okenone (crt and cru) were found. Notably, a carotenoid 3,4-desaturase crtD-homologue of the C-4/4′ ketolase cruO-type (CXB77_RS02160)61 was detected in proximity to the crtC hydroxyneurosporene synthase gene (CXB77_RS02155) homologous as in Marichromatium purpuratum DSM 1591T. Interestingly, C. okenii strain LaCa showed a stable BChl a to protein ratio over a three months sampling period in situ23. The BChl a synthesis rate in the dark was thereby found to be independent of sampling depth. However, subtle changes in light intensity (0.06 mol quanta m−2 h−1 in average) had a significant impact on the successive BChl a synthesis rates23. In summary, okenii strain LaCa may modulate light dependent energy uptake efficiency by different combinations of LHC antenna proteins and pigments concentrations, respectively.

Soluble electron-carrier cytochromes ensure cyclic electron flow by shuttling electrons from the cytochrome bc1 complex back to the RCs during photosynthesis. In strain LaCa, the high potential iron sulfur protein (HiPIP; CXB77_RS02565) possibly function as the key high potential cytochrome as in A. vinosum DSM180T61. Furthermore, cytochrome c551/c552 (CXB77_RS07885) may serve as an extra RC reductant under autotrophy62. Variable soluble electron carriers found were cytochrome c′ (CXB77_RS15975), cytochrome c4 (CXB77_RS04500) homologous as in Thiocapsa roseopersicina and two soluble c-type cytochromes (CXB77_RS01960–CXB77_RS01970 and CXB77_04510) most closely related to the gene in A. vinosum DSM180T.

In stratified lakes, motile Chromatium spp. have been found in viable, non-dividing states below the chemocline62 and survival over 1.5 year in darkness has also been described63. In the chemocline of Lake Cadagno, the greatly diminished numbers during winter is possibly due to the low light availability of >0.4 µmol quanta·m −2 s−121. Interestingly however, upward motility of C. okenii in dark conditions in Lake Cadagno has been inferred indirectly by cells at the underside of sediment traps in spring24 and by detection of nocturnal bioconvection in summer31. Both findings may point to the importance of dark heterotrophic metabolism for C. okenii vitality. A relatively low oxycline has been observed from October to December29 due to thermal mixing of the mixolimnion. Furthermore, oxygen (<20 nmol L−1) produced in situ by oxygenic photosynthesis has been detected in summer64. Together, these observations infer the possibility of micro-oxic conditions at the chemocline throughout most time of the year. Despite that aerobic sulfur oxidation yields only about 25–30% of the energy provided by anaerobic photosynthesis65, this amount may still be critical for persistence of these microorganisms and the mixotrophic growth in summer.

Accordingly, a complete respiratory chain was found in C. okenii strain LaCa including NADH-quinone oxido-reductase (CXB77_RS02240–CXB77_RS02300 and CXB77_RS02310), succinate-dehydrogenase (CXB77_RS02325–CXB77_RS02335 and CXB77_RS09655), as well as a multi-subunit terminal cytochrome bd oxidase (CXB77_RS12155 and putatively CXB77_RS12145 and CBX77_RS12150). Both can function as terminal electron acceptor in photosynthesis and substrate respiration. Taken together, we found evidence on the genomic level that C. okenii strain LaCa may be able to perform anoxygenic photosynthesis and chemotrophic respiration in Lake Cadagno, as generally suggested for PSB by Kämpf and Pfennig66. Interestingly, partly complementary information to our findings is given by a metagenomic sequence bin in Berg et al.33.

In contrast, no significant growth after five days in darkness was shown under chemotrophic incubations at room temperature with two strains of C. okenii under a 5% O2 atmosphere66. Additionally, no correlation between light availability and specific dark fixation rates was observed in situ in a chemocline population dominated by C. okenii59. Consequently, the combination of the experimentally described metabolism and the biological functions encoded in the genome of C. okenii does not readily explain the high dark total fixation rates measured in Lake Cadagno28,67. Interestingly however, C okenii might be able to combine aerobic and anaerobic carbon fixation pathways by actively moving along the vertical oxygen gradient in summer conditions33.

Sulfur Metabolism

C. okenii uses reduced sulfur compounds such as H2S and S0 as reductants for photolitho-autotrophic growth68,69. Subsequently, light energy is used to transfer electrons to NAD(P)+ and ferredoxin for CO2 fixation. In accordance, C. okenii strain LaCa encodes flavocytochrome c (FccAB; CXB77_RS06380) and the sulfide:quinone oxidoreductases (SqrD and SqrF; CXB77_RS06755 and CXB77_RS12425) that both oxidize H2S in the periplasm to form sulfur globules (SGBs) containing S0. The SGBs are surrounded by sulfur globule proteins (SGPs) that fold into collagen-like filaments70 filaments. Accordingly, we identified two putative SgpA copies with N-terminal signal peptides (CXB77_RS07855 and CXB77_RS14820). This is important, as the homologue SgpA is essential to build intact sulfur globules in A. vinosum71. Furthermore, the canonical dissimilatory sulfite oxidation pathway (Dsr) that enables sulfite production in the cytoplasm in PSB72 was found completely conserved in C. okenii strain LaCa in one cluster (CXB77_RS03215–CXB77_RS03270) and shows gene synteny to other Chromatiaceae. Interestingly, two arsR family transcriptional regulator genes (CXB77_RS06260 and CXB77_RS07240) possibly involved in H2S-dependent gene regulation73 were also detected. Moreover, in strain LaCa we found the trimeric adenylylsulfate reductase alpha and beta-subunits AprAB (CXB77_RS17245 and CXB77_RS17240) that is anchored by the CoB–CoM heterodisulfide reductase multi subunit complex (CXB77_RS04305–CXB77_RS04320). To complete sulfur oxidation, a Sat sulfate adenylyltransferase (CXB77_RS09675) and the dissimilatory-type SoeABC type enzyme (CXB77_RS11845–CXB77_RS11855) are encoded. An additional cluster of sulfur carrier proteins TusA (XB77_RS15940) and DsrE2 (CXB77_RS15945)74 putatively involved in sulfur oxidation were detected. Furthermore, we identified a cytochrome b561 (CXB77_RS01235) and an octaheme cytochrome c (CXB77_RS01240) homologues to A. vinosum DSM180T. Both enzymes are conserved among PSB and have been found to be upregulated in A. vinosum DSM180T with sulfide as sole electron donor75. Notably, no genes encoding Sox proteins necessary for thiosulfate (S2O32−) oxidation were found76, which is in accordance with previous experimental results69. Furthermore, no genes of the adenylyl-sulfate kinase Cys-pathway for assimilatory sulfate reduction could be detected, confirming previous experimental findings where sulfate uptake was not observed for C. okenii69. Finally, for C. okenii no hydrogenases were predicted in the genome that excludes H2 as a source of electrons68,69.

Nitrogen and Phosphate Assimilation

We detected nif genes involved in nitrogen fixation in the C. okenii strain LaCa genome distributed throughout the genome as in A. vinosum DSM180T77. The presence of a dimeric nitrogenase molybdenum-iron protein nifDK (CXB77_RS12525 and CXB77_RS12530) and the nitrogenase iron protein nifH (CXB77_RS12535) point to a diazotrophic metabolism. Additionally, homologues nifD sequences were identified in T. violascens, Lamprocystis spp. and “T. syntrophicum”. N2-uptake is under transcriptional control of the two-component sensor histidine kinases NtrX and NtrY (CXB77_RS03520/ CXB77_RS03525), the nitrogen regulatory protein P-II (CXB77_RS11185) and nifA (CXB77_RS10450), as well as the oxygen sensor nifL (CXB77_RS10445). In strain LaCa, we found polyphosphate kinase and exopolyphosphatase encoded, however previous studies failed to demonstrate in situ polyphosphate accumulation78. Furthermore C. okenii strain LaCa also encodes genes for ammonium assimilation, glutamate synthase and glutamine synthetase. In accordance, in situ NH4+-consumption of C. okenii was demonstrated27 and modelled for the Lake Cadagno chemocline25. However, alternative N-uptake mechanisms must have still to be described.

Carbon Metabolism

The light driven carbon uptake kinetic has been studied in detail in C. okenii before69,79. In PSB, the Calvin-Benson-Bassham-cycle (CBB) is the central carbon assimilation mechanism69,79,80. For the genome of strain LaCa a complete CBB cycle with the one cbbM ribulose 1,5-biphosphate carboxylase/oxygenase (RuBisCO) form II (CXB77_RS09535), two regulatory genes cbbQ (CXB77_RS09540) and cbbO (CXB77_RS09550), and phosphoribulokinase PrkB (CXB77_RS15420) were described. Furthermore, an additional RuBisCO-like protein gene rbcL (CXB77_RS07520) is also present in the genome. The RbcL is putatively involved in methionine salvage and it is typically found in purple bacteria, such as Rhodopseudomonas palustris81,82. Alternatively, RbcL also functions in a stress response mechanism or in sulfur metabolism in the GSB Chlorobium tepidum83 and the purple bacteria Rhodospirillum rubrum84 or in stress response. In the genome of strain LaCa, no hypothetical carboxysome-like subunits and RuBisCO form I (cbbL and cbbS) were found. This is in contrast to other small-celled PSB such as “T. syntrophicum” strain Cad16T and Lamprocystis spp. isolated from Lake Cadagno85,86.

For PSB, several carbon storage mechanism have been described87,88 that possibly serve both as energy and reductant reserves. In C. okenii strain LaCa, glycogen storage is mediated through glucose-1-phosphate adenylyltransferase and a 1,4-alpha-glucan (glycogen) branching enzyme (CXB77_RS16905). Furthermore, we found a complete tricarboxylic acid (TCA) cycle and enzymes for glycolysis. For C. okenii, polyhydroxybutyrate (PHB) synthesis under nitrogen limitation was described in vitro78 and a high average C:N ratio of 14.8 was previously reported that could potentially could induce carbon storage mechanisms. In accordance, genes encoding PhaC (CXB77_RS16475) and PhaE (CXB77_RS16480) involved in PHB synthesis and depolymerisation are present in C. okenii strain LaCa. Furthermore, C. okenii possibly oxidizes glycogen to PHB and stored sulfur is used as an electron sink by reduction into H2S79. Additionally, for C. okenii under in situ dark conditions the putatively aerobic oxidation of sulfur was found to be favoured over glucose oxidation to acetate and CO278 under in situ dark conditions. However, PHB inclusions were not observed and storage compounds were depleted within hours under in situ conditions in Lake Cadagno78. These results obscure the role of PHB for long time survival of C. okenii. For the small-celled PSB “T. syntrophicum” strain Cad16T, proteins involved in the degradation of PHB were shown to be upregulated in the dark conditions in vitro89 and under micro-oxic conditions in situ90. The degradation of PHB granules results in acetyl-CoA and NAD(P)H, which are both needed in the CO2 fixing process in the absence of light.

Membrane Transport and Bacterial S-layer

Similar to other PSB species such as A. vinosum DSM 180T or “T. synthrophicum” strain Cad16T, the genome of C. okenii strain LaCa encodes both a Type IV pilus (CXB77_RS13225, CXB77_RS13640–CXB77_RS13660 and CXB77_RS13745–CXB77_RS13760) and a Type VI secretion system (CXB77_RS0616–CXB77_RS06190 and CXB77_RS12705–CXB77_RS12755). Other secretion systems encoded are a general secretion (Sec) and twin-arginine translocation (Tat). Moreover, we found ABC-type transporters for di-peptide, oligopeptide, lipoprotein, phosphate and molybdenum uptake, as well as Tol and TRAP and Co2+, Mg2+ and Ni2+-uptake systems.

The main function of the surface layer (S-layer) is to reinforce bacterial cells against osmotic, mechanical and thermal forces91. Moreover, the S-layer possibly also functions as protection against bacterial predation and bacteriophage infection91,92. The S-layer in C. okenii consists of conical shaped hexagonal lattice subunits with a diameter of 13 nm that are regularly spaced by 19 nm and extend 25 nm from the surface54. Accordingly, two putative exported S-layer proteins (CXB77_RS09990 and CXB77_RS09995) and a FhaB-like protein (CXB77_RS10005) similar to alkaline phosphatases in Microcystis spp. were identified in strain LaCa. The S-layer proteins might be exported through a homologues Type I secretion SapDEF system (CXB77_RS09940–CXB77_RS09950)88. We also found a putative SapC protein (CXB77_RS08915), missing the signal peptide homologous to Halorhodospira halochloris (HH1059_1773).

In Lake Cadagno the C. okenii population was observed to wither dramatically within a period of days in October21,22. The increase in C. okenii cells over the preceding summer months possibly leads to metabolic stress and an increased sedimentation rate could lead to conditions of high bacterial predator pressure. In accordance, epibionts were reported for C. okenii in Lake Cadagno27 and for other large-celled Chromatium species elsewhere93. These were characterized as bacterial scavengers that feed on non-dividing Chromatium cells94,95 and may lead to this population collapse96. In contrast, no sequences related to Bdellovibrio, Daptobacter or Vampirococcus-type could be detected in the enrichment samples. While this data is currently unavailable, we expect to detect epibionts on non-viable, sedimented C. okenii cells in samples from the lower monimolimnion.

The importance of bacteriophages for aquatic microbial community dynamics has been recognized97,98 however few studies have focused on stratified systems99,100,101. In this study, several putative prophage and incomplete phage sequences were found in the C. okenii strain LaCa sequence (Supplementary Table S5 and Supplementary Fig. S5). The prophage sequences were thereby similar to sequences from T. violascens DSM 198 and A. vinosum DSM 180 and the putative phage Chok4 showed sequence similarity to T. violascens DSM 198 and A. vinosum DSM 180 using BLASTn. When the database was restricted to viral sequences no significant hits were obtained. Furthermore, a 442 bp CRISPR with seven 36 bp repeats and six spacers was detected on contig PPGH01000029.1 (Supplementary Table S6). A similar CRISPR-sequence was found in A. vinosum DSM180T on plasmid pALVIN01. However, no adjacent CAS protein cluster was detected in the genome of C. okenii strain LaCa. Furthermore, eleven Rha-type phage regulatory proteins were detected. In summary, putative phage sequences and phage-related genes found indicate the presence of phages specific for the dense chemocline community. Therefore, the function of the S-layer against phage attachment, as well as the Type 6 secretion system in defence against bacterial predation, must be further elucidated.

Flagella and Chemotaxis

C. okenii is motile using around 40 lophotrichous flagella that together form a tuft with a length of 20 to 30 µm6,54. The direction is controlled by the action of either pulling or pushing flagella that rotate clockwise or counter clockwise, respectively99. In accordance, one cluster with genes encoding the basal body, hook and filament were identified. Additional genes flg, flh, che and fli were identified to group with motA and motB genes. A histidine–aspartate phosphorelay (HAP) based system100 that comprises chemotaxis genes cheABRWYZ and in total 31 putative chemoreceptors (MCP: methyl-accepting chemotaxis protein) of the TAP or TLPA family were detected. Notably, a putative Aer aerotaxis sensor receptor protein (CXB77_RS12890), bacteriophytochrome (CXB77_RS05740) and two putative blue-light-activated histidine kinases (CXB77_RS09475 and CXB77_RS08785) were found. We also detected a putative circadian input kinase A (CXB77_RS08775), however no complete kaiABC relay was identified. Interestingly, only parts of a set of genes involved in acyl homoserine lactone mediated quorum sensing were detected, including components of the SagS-HptB-HsbR (swarming activity and biofilm formation) two-component regulatory system/cAMP/Vfr signalling (CXB77_RS11060 and CXB77_RS08790) and the putative transcriptional activator protein LasR (CXB77_RS11545).

Large- celled Chromatium spp. have been characterized as metabolically less flexible in comparison with the non-motile, small-celled PSB56,68 and might therefore be forced to adapt to changing conditions by constantly moving along the optimal gradients. Overmann and Pichel-Garcia postulate that motile PSB have an advantage over PSB with gas vacuoles at light intensities above 0.2 µmol quanta·m −2 s−156. In Lake Cadagno, between 5.835 µmol quanta m−2 s−1 were measured at the upper chemocline during summer, whereas a ten-fold decrease in light intensity within the cm to m thick bacterial layer was found17. Moreover, an inverse correlation between available light and thickness of the bacterial plume was described for Lake Cadagno23. That may indicate that members of the microbial population actively move vertically on a minute to hour timescale. In comparison, the velocity of Chromatium minus seems to be determined defined by external sulfide concentration and light intensity in vitro102. Interestingly, for C. minus both swimming speed and run time, respectively, are higher and longer under low light intensity when compared to high light conditions, a phenomena that persists over hours102. The observed swimming speed of (2.7 ± 1.4) × 10−5 m s−132,103 (0.97 m h−1) of strain LaCa enables the vertical crossing of the chemocline the chemocline can be crossed within minutes to hours. Importantly, the resulting accumulations of motile, dense cells at the upper border of the chemocline may provoke bioconvection32. Taken together, these different observations indicate that C. okenii would benefit from upward movement under non light-limiting conditions under the guidance of scotophobotaxis, negative O2 and positive H2S chemotaxis, respectively. However, temporal vertical mobility patterns of C. okenii have been additionally described as diel10,104,105 or stochastic31,32. The intervened signalling pathways that coordinate movement in C. okenii will also have to be examined in more detail.

Comparative genomics

Orthologue gene families can be used to compare the encoded metabolic, structural and behavioural potential between organisms106 and have here been applied in the study of encoded differences between PSB and GSB. When we compared the KEGG enzymatic pathways between PSB only minor differences were detected (Supplementary Table S7). Subsequently, OrthoVenn was used to create a dataset of annotated gene clusters to compare phototrophic sulfur bacteria population of Lake Cadagno. Thereby, the genomes of previously isolated PSB (“T. syntrophicum” strain Cad16T and L. purpurea strain CadA31) and GSB (C. phaeoclathratiforme strain Bu-1) were compared to PSB C. okenii strain LaCa. Using an in silico approach, we sought to find genes potentially elucidating the co-existence of an oxygenic phototrophic sulfur bacteria in the chemocline. In total, 10,632 genes were included, and the four species encompassed 4,536 gene clusters, 3,902 orthologous gene clusters –at least containing two species– and 386 single-copy gene clusters (Fig. 4).

Figure 4
figure 4

Venn diagram showing the shared orthologous gene clusters among four sulfur oxidizing bacteria found in Lake Cadagno. Denoted the number of clusters (orthologues or paralogs) shared between phototrophic sulphur bacteria species. Each cluster contains at least two genes.

Orthologous gene clusters shared by PSB (n = 828) were enriched for GO-terms protein export and membrane insertion, as well as light harvesting complex components and cyclic electron flow, indicating the primary phototrophic lifestyle and possibly the membrane bound enzymes (e.g. RCs) involved. Whereas GSB C. phaeoclathratiforme strain Bu-1 was enriched for chlorosome components among others, “T. syntrophicum” strain Cad16T was enriched for chitinase function and extracellular and outer membrane components and L. purpurea strain CadA31 for phage related sequences and processes, respectively.

Conserved RuBisCO type II (CbbM) and RuBisCO-like (RPL) type IV sequences were detected in all PSB examined here. Interestingly, the heterodimeric RuBisCO type I (CbbLS) is missing in C. okenii and was found only in the both small-celled PSB. Additionally, all PSB studied encoded cytochrome d ubiquinol oxidases (CydAB), whereas only small-celled PSB encoded a ccb3 type cytochrome c oxidase (Table 3).

Table 3 Genome features and growth characteristics of Chromatium C. okenii strain LaCa and of selected purple sulphur bacteria. CBB: Calvin-Benson-Bassham cycle.

In C. okenii, out of 3,016 protein coding genes, 144 exclusive gene-clusters were present. GO-enrichment analysis within this group resulted in over-representation of GO-terms linked to chemotaxis, flagellar movement, the S-layer and arginine uptake, respectively (Fig. 4). Arginine uptake may be important for C okenii, since microbial utilization of free amino acids in lakes has been described as a driver for bacterial community function107 and arginine ammonification has also been used as a proxy for respiration of microbial communities108,109. For strain LaCa, arginine could therefore function as an additional carbon source not available to other PSB and GSB, and also provide extra N due to the high C:N ratio of 3, as proposed for other freshwater bacteria109.

C. okenii comprises an approximately 7× larger cell volume27 and a 30× reduced surface-to-volume ratio compared to small-celled PSB. As bacterial cell size influences metabolic activity and internal organization110,111, transcriptional regulation, functional compartmentalisation and genome organisation (i.e. polyploidy) may be fundamentally diverse between C. okenii and small-celled PSB and GSB. However, no evidence of multiple chromosomes in C. okenii strain LaCa was found when taking into account the uniform coverage and the lack of allele variants of the assembly, respectively.

Conclusions

In the study presented, we could we confirmed several previous experimental findings of metabolic activity66,68,78,79 on the basis of the genomic information. Typically for PSB, the C. okenii strain LaCa genome encodes the CBB-cycle and a type II RC, however sox-proteins, hydrogenases and the Cys sulfate assimilation pathway are missing completely. Furthermore, genes involved in carbon and nitrogen utilization were similar between C. okenii strain LaCa and other PSB and show redundancy with A. vinosum DSM 180T. Interestingly, cytochrome d ubiquinol oxidases were also found in all known PSB genomes of Lake Cadagno, indicating aerobic respiration of oxidized organic carbon compounds, such as glucose. In contrast, the co-occurrence of RuBisCO type II together with cbbQ and cbbO genes as in C okenii has been described in more detail for obligate autotrophs112. However, in the PSB “T. syntrophicum” strain Cad16T the type II RuBisCO was constitutively expressed at dark and light conditions89,90 and it was therefore suggested to be function in cofactor re-oxidation as it was found in in purple non sulfur bacteria81. Taken together, the absence of both, a type I RuBisCO (CbbLS) and a carboxysome-like CO2 concentration mechanism in C. okenii, as well as known low CO2 affinity of RuBisCO form II113, possibly inhibits the cell functioning at low surrounding CO2 concentrations.

In terms of changes in the environment, the C. okenii population in the Lake Cadagno is exposed to abiotic factors that vary on the short-term (minutes to hours), such as light availability, reduced electron donors and oxygen, disturbances of the water column (i.e. internal waves and seiches) and biotic factors, such as grazing pressure67,114. Seasonal factors such as an increase in total cell numbers within the chemocline in summer, changes in the day to night length ratio and the 3–5 months of ice-cover in winter –i.e. light availability and quality– add up additional complexity.

Under low light availability in spring, the reported relative higher sulfide affinity in comparison with other PSB and the benefit of the larger dark-to-light hours ratio may, in turn, give C. okenii an advantage over small celled PSB as observed in vitro115. Furthermore, the low phototrophic population cell concentration of ~25% of the summer community21 may also reduce self-shading116 and predation rates and phage numbers may also be lower. The rapid onset of aerobic photosynthesis after the ice-melt may additionally support chemotrophic microaerophilic growth of C. okenii.

To conclude, the multiple factors that influence C. okenii strain LaCa behaviour have to be further disentangled. The sensing of short-term fluctuations and adaptation to more dramatic longer-lasting changes in the environment were found to have left an imprint in the C. okenii genome by the relative over-representation of genes for motility and sensing, and in some versatility in the assimilatory pathways. Chemo- and scotophobotaxis, quorum sensing and diel and seasonal behavioural patterns have must be considered in future studies on bioconvection. Further studies on genomic heterogeneity within the C. okenii population or, and diversity transcriptional control on single cell level could give further insight on the important ecological role of C. okenii for the Lake Cadagno ecosystem22,27 and other stratified water bodies.