Draft Genome Sequence of Chromatium okenii Isolated from the Stratified Alpine Lake Cadagno

Blooms of purple sulfur bacteria (PSB) are important drivers of global sulfur cycling, oxidizing reduced sulfur in intertidal flats and stagnant water bodies. Since the discovery of the PSB Chromatium okenii in 1838, this species has been found to be characteristic of stratified, sulfidic environments worldwide, and its autotrophic metabolism has been studied in depth. We describe here the first high-quality draft genome of a large-celled, phototrophic γ-proteobacterium of the genus Chromatium isolated from the stratified alpine Lake Cadagno, C. okenii strain LaCa. Long-read technology was used to assemble the 3.78 Mb genome, which encodes 3,016 protein-coding genes and 67 RNA genes. Our findings are discussed from an ecological perspective related to Lake Cadagno. Moreover, findings of previous studies on the phototrophic and the proposed chemoautotrophic metabolism of C. okenii were confirmed on a genomic level. We additionally compared the C. okenii genome with other genomes of sequenced phototrophic sulfur bacteria from the same environment. We found that biological functions involved in chemotaxis, movement and S-layer proteins were enriched in strain LaCa. We describe these features as possible adaptations of strain LaCa to rapidly changing environmental conditions within the chemocline and as protection against phage infection during blooms. The high-quality draft genome of C. okenii strain LaCa thereby provides a basis for future functional research on bioconvection and phage infection dynamics of blooming PSB.

long, stains Gram-negative, contains okenone and bacteriochlorophyll a (BChl a) as the main photosynthetic pigments, and is motile through flagella (Fig. 1a) 15 . The cells contain intracellular carbon storage compounds, such as polyhydroxybutyrate (PHB), glucose and glycogen, as well as polyphosphate and elemental sulfur globules, the latter typically functioning as electron donors for anaerobic photosynthesis in PSB 16 .
The meromictic Lake Cadagno in the Swiss Alps (46°32′59″N 8°42′41″E) is a prime example of a stratified, sulfidic environment where anoxygenic phototrophic sulfur bacteria of the families Chromatiaceae (purple sulfur bacteria; PSB), such as Chromatium, Thiodictyon, Lamprocystis and Thiocystis spp., and Chlorobiaceae (green sulfur bacteria; GSB) of the genus Chlorobium thrive in dense populations during the summer months. At a depth of around 12 m, a sulfide concentration of approximately 0.2 mM and a relative light availability of around 5 µmol quanta m⁻² s⁻¹ allow this heterogeneous community to grow to up to 10⁷ cells mL⁻¹ by anoxygenic photosynthesis 17,18 . The presence of an at least 9,450-year-old Chromatium sp. 16S rRNA gene sequence in Lake Cadagno sediments has been demonstrated 19 , and in recent years a unique C. okenii strain has been detected using a combination of fluorescence in situ hybridization (FISH) and 16S rRNA gene analysis [20][21][22] . Whereas C. okenii represents only 1-10% of all bacterial cells and 22-83% of the phototrophic community [21][22][23][24][25] , due to its large size it comprises up to 72% of the total biovolume of the microbial chemocline population 24 . Large seasonal variations of C. okenii were observed, with concentrations ranging from 10⁴ to 10⁶ cells mL⁻¹. When cell concentrations were monitored using FISH and flow cytometry in Lake Cadagno, C. okenii dominated over small-celled PSB from July to September, followed by a decline of the C. okenii population within two weeks in October 21,24 . However, after a mixing event in 2000 and the subsequent massive bloom of Chlorobium clathratiforme, which increased the total number of phototrophic sulfur bacteria three-fold between 2000 and 2005, less regular C. okenii population dynamics were also observed 26 , suggesting that environmental influences may have a long-lasting impact on microbial community composition 26 .
With a doubling time of 5 to 7 days, C. okenii was found to assimilate up to 70% of total carbon and 40% of ammonium in the light 27,28 . Additional evidence was given by analyses of bulk carbon isotope fractionation (between δ¹³C_POC and δ¹³C_DIC) in the Cadagno chemocline, in which 36% and 52% of the total bulk δ¹³C signal was attributed to C. okenii in October and June, respectively 29 . As grazing on C. okenii by the ciliate Trimyema compressum was shown in vitro 30 , C. okenii may also function as a major food source for zooplankton in Lake Cadagno.
The importance of C. okenii for bioconvection in Lake Cadagno has been discussed theoretically 31 . Interestingly, a spatial and temporal correlation between convection and zones with high concentrations of C. okenii (10⁵-10⁶ cells mL⁻¹) was recently inferred in situ 32 . Short-term dynamics in sulfide uptake of C. okenii and putative interactions between C. okenii and the GSB Chlorobium phaeobacteroides were also demonstrated recently 22 . In another in situ study on C. okenii in Lake Cadagno, Berg et al. found that anaerobic sulfide oxidation is coupled to aerobic respiration, with sulfide serving as electron donor, through active vertical movement from the oxic to the anoxic parts of the water column and back 33 .
Herein, we provide the first annotated high-quality draft genome for a member of the large-celled PSB genus Chromatium, namely C. okenii strain LaCa, enriched from Lake Cadagno. Importantly, the available sequence data on PSB and GSB isolates from Lake Cadagno enabled us, first, to compare core genomes and, second, to further elucidate strain-specific biological functions. The successful enrichment and the high-quality draft genome of C. okenii strain LaCa are fundamental for a detailed understanding of nutrient fluxes and microbial interactions within the balanced ecosystem of the Lake Cadagno chemocline and are an important addition to our understanding of global microbial sulfur cycling.

Material and Methods
The complete materials and methods section has previously been described in Luedin 2018 34 .

Figure 1. (a) LaCa cells enriched from a water sample taken from Lake Cadagno at 12 m depth. Intracellular sulfur globules are visible as yellow, highly refractive spheres. The polar flagellar tuft is visible (black arrow). Cells were directly mounted on a microscopy slide in 0.9% NaCl solution. An Axio Imager 2 microscope (Carl Zeiss Microscopy GmbH, Germany) with an EC Plan NEOFLUAR objective (100x, phase contrast) and an AxioCam MRc Rev3 digital camera were used to take photomicrographs. Images were processed with the AxioVision SE64 v4.8.2 software suite. (b) Visible C. okenii cell pellets enriched from a water sample from Lake Cadagno after 10 min centrifugation at 15 g.

Enrichment. Physicochemical measurements on Lake Cadagno were made with a YSI 6000 profiler (Yellow Springs Inc., Yellow Springs OH, USA) on 14 July 2016. In order to understand carbon isotope fractionation of PSB and GSB strains, C. okenii was previously enriched to high purity by sedimentation and dilution; however, cultivation was not established 29 . We used a comparable approach for this study. Samples were collected with a 1 L Ruttner-type sampling bottle (Hydrobios Apparatebau GmbH, Germany) at depths with maximum turbidity and rapid changes in redox potential between 11.6 and 12.0 m, indicating a dense bacterial population at the chemocline, as previously described 31 . The bottles were brought immediately to the laboratory and placed under natural illumination (2000 lux PAR, or about 36 µmol quanta m⁻² s⁻¹) at 16 °C for 6 hours. The purple precipitates thus obtained (Fig. 1b) were identified by cell morphology with light microscopy as Chromatium sp. based on previous descriptions in the literature (e.g. ref. 35 ). Other bacteria were also present, but only in low numbers (<10%). Cells were collected with a 10 mL pipette and transferred to 50 mL tubes.
The cells were then centrifuged for 10 min at 15 g at room temperature (RT). The supernatant was carefully discarded, and the residual 10 mL were collected and combined in 100 mL serum bottles.
46 were used to assess genome completeness and contamination. Centrifuge v1.0.3 47 was used to classify the PacBio raw reads against the NCBI nt database (including additional genomic sequences of Lamprocystis strain CadA31 and "T. syntrophicum" strain Cad 16 T ). The CRISPR-CAS++ v1.0.5 48 web server was used to infer clustered regularly interspaced short palindromic repeats (CRISPR) and associated proteins (Cas) in the C. okenii genome. CRISPR-Cas arrays with an evidence level of 1 were excluded from the analysis.
Phylogenetic Analysis. Roary 49 was used to compare the core genomes of sequenced Chromatiaceae. From this dataset, 100 single-copy orthologues were selected randomly and their sequences aligned with MUSCLE 50 . Selection of the best-fit phylogenetic model and subsequent consensus tree estimation, based on maximum likelihood and 1,000 bootstrap iterations, were performed with the W-IQ-TREE v1. 6

Results and Discussion
Genomic Features and Phylogeny. De novo sequencing of an enrichment of C. okenii strain LaCa was performed on a PacBio RSII system using two SMRT cells. A total of 45 contigs were assembled with a total length of 3,784,749 bp, an N50 of 448,938 bp and an L50 of 3. The GC content was found to be 49.8% (Table 1). More details on the sequencing output can be found under Supplementary Figure S1a  PPGH01000036.1) were found to be associated by partial sequence overlaps (Supplementary Fig. S1b) and showed an average coverage of 25× (22-27×). Due to repetitive sequences of >20 kb, we could not circularize the chromosome and therefore created a pseudo-circular map with contig borders indicated (Supplementary Fig. S1c). Four additional putative circular sequences (PPGH01000033.1, PPGH01000043.1, PPGH01000024.1 and PPGH01000027.1) were identified (Supplementary Fig. S1b). The other 33 shorter contigs with coverage <22× showed partial or complete overlap with the seven longest contigs.
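The N50 and L50 values reported above follow the usual assembly-statistics definitions: contigs are sorted by decreasing length, and N50 is the length of the contig at which the running sum first reaches half of the total assembly length, while L50 is the number of contigs needed to get there. The following is a minimal sketch of that calculation; the function name and the toy contig lengths are illustrative assumptions and are not the LaCa contig lengths.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp."""
    # Sort contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # The first contig that brings the running sum to at least
        # half of the assembly length defines N50 (its length) and
        # L50 (how many contigs were needed).
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly, for illustration only.
toy_contigs = [500, 400, 300, 200, 100]
n50, l50 = n50_l50(toy_contigs)
print(f"N50 = {n50} bp, L50 = {l50}")
```

An L50 of 3, as reported for strain LaCa, would thus mean that the three longest contigs together cover at least half of the 3,784,749 bp assembly.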
The genome of C. okenii strain LaCa was considered a high-quality draft due to the high number of single-copy genes found with BUSCO (129 complete, single-copy genes), amphoraNet (40 genes homologous to Allochromatium vinosum) and CheckM (88.89%, 490 of 545, of the marker genes specific for Chromatiaceae). The initial assembly was then reduced to the seven longest contigs and the linking contig PPGH01000036.1, and the completeness and contamination analysis was repeated using CheckM. Completeness was thereby found to be 88.86%, whereas contamination was reduced to 1.11%. Additionally, the number of marker genes present in multiple copies was reduced from 56 to 8 (Supplementary Tables S2 and S3 and Supplementary Fig. S2), and tetranucleotide frequency and GC content were found to be uniform amongst the seven longest contigs (Supplementary Fig. S3). Mapping of raw PacBio reads to the NCBI nt database and other sequenced PSB isolates from Lake Cadagno revealed that 96% of all reads assigned to Chromatiales mapped to the C. okenii strain LaCa contigs (Supplementary Table S4).
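The uniformity check on GC content and tetranucleotide frequency mentioned above can be illustrated with a short sketch. This is not the pipeline used in the study; it is a simplified illustration that assumes plain sequence strings, counts only unambiguous A/C/G/T bases, and does not merge reverse-complement tetramers as some binning tools do. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G+C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_freqs(seq):
    """Relative frequency of each of the 256 possible tetramers."""
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


# Hypothetical toy sequence, for illustration only.
toy = "ACGTACGTGGCC"
print(round(gc_content(toy), 3))
print(tetranucleotide_freqs(toy)["ACGT"])
```

Comparing these per-contig profiles (for example by correlation or a distance measure) is one common way to flag contigs whose composition deviates from the rest of an assembly.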
COG classification of the complete dataset of the 3,016 protein coding genes resulted in 2,022 assigned proteins (Table 2).
Phylogeny based on the 16S rRNA gene revealed 99% sequence identity with C. okenii DSM 169 T , and C. okenii strain LaCa groups with Allochromatium and Thiocystis spp. (Fig. 2). In a comparison of a subset of 100 core genes of sequenced Chromatiaceae, C. okenii is closely related to T. violascens and A. vinosum (Supplementary Fig. S4). Since the C. okenii DSM 169 T type strain and other Chromatium spp. are no longer available in strain collections 54 , a more detailed genomic comparison within the genus Chromatium was not possible.

Genome Features. An overview of the features discussed below is depicted in a cell scheme in Fig. 3.

Photosynthesis and Chemotrophy. For C. okenii, an extensive system of photosynthetic membranes has been characterized 55 in which type II reaction centres (RCs) are used to transform light energy into chemical energy. In the strain LaCa genome, a canonical PSB photosystem II-type RC is encoded in two clusters, containing two pufAB (CXB77_RS07530, CXB77_RS07535, CXB77_RS07560 and CXB77_RS07565), pufL (CXB77_RS07555) and pufM (CXB77_RS07550) genes, the RC complex subunit H gene puhA (CXB77_RS09135) and a putative photosynthetic complex assembly protein (CXB77_RS09130). Additional light-harvesting complex LHC I and LHC II genes (CXB77_RS02170 and CXB77_RS02175) and two extra pairs of pufAB genes were identified (CXB77_RS10755-CXB77_RS10765). In accordance with findings in other PSB, multiple gene copies of LHC are thought to be an adaptation to changes in light availability 56 and were also found in "Thiodictyon syntrophicum" strain Cad16 T and A. vinosum DSM180 T 57 . For C. okenii, low-light adaptation was elucidated by measuring fluorescence kinetics in situ 58 , and quantum yields below the optimum were observed 59 . In PSB, light is taken up efficiently by photosynthetic pigments and the energy obtained is then transferred to the RCs. Both the carotenoid okenone 60 and BChl a 14 are synthesized in C. okenii strain LaCa. In agreement, the complete gene sets encoding the synthesis of BChl a (CXB77_RS09140-CXB77_RS09180) and of the carotenoid okenone (crt and cru) were found. Notably, a carotenoid 3,4-desaturase crtD homologue of the C-4/4′ ketolase cruO type (CXB77_RS02160) 61 was detected in proximity to the crtC hydroxyneurosporene synthase gene (CXB77_RS02155), as in Marichromatium purpuratum DSM 1591 T . Interestingly, C. okenii strain LaCa showed a stable BChl a to protein ratio over a three-month sampling period in situ 23 . The BChl a synthesis rate in the dark was thereby found to be independent of sampling depth.
However, subtle changes in light intensity (0.06 mol quanta m⁻² h⁻¹ on average) had a significant impact on the subsequent BChl a synthesis rates 23 . In summary, C. okenii strain LaCa may modulate light-dependent energy uptake efficiency through different combinations of LHC antenna proteins and pigment concentrations. Soluble electron-carrier cytochromes ensure cyclic electron flow by shuttling electrons from the cytochrome bc1 complex back to the RCs during photosynthesis. In strain LaCa, the high-potential iron-sulfur protein (HiPIP; CXB77_RS02565) possibly functions as the key high-potential electron carrier, as in A. vinosum DSM180 T 61 . Furthermore, cytochrome c551/c552 (CXB77_RS07885) may serve as an extra RC reductant under autotrophy 62 . Additional soluble electron carriers found were cytochrome c′ (CXB77_RS15975), cytochrome c4 (CXB77_RS04500), homologous to that of Thiocapsa roseopersicina, and two soluble c-type cytochromes (CXB77_RS01960-CXB77_RS01970 and CXB77_04510) most closely related to the genes in A. vinosum DSM180 T .
In stratified lakes, motile Chromatium spp. have been found in viable, non-dividing states below the chemocline 62 , and survival over 1.5 years in darkness has also been described 63 . In the chemocline of Lake Cadagno, the greatly diminished cell numbers during winter are possibly due to the low light availability of >0.4 µmol quanta m⁻² s⁻¹ 21 . Interestingly, however, upward motility of C. okenii under dark conditions in Lake Cadagno has been inferred indirectly from cells found at the underside of sediment traps in spring 24 and from the detection of nocturnal bioconvection in summer 31 . Both findings may point to the importance of dark heterotrophic metabolism for C. okenii vitality. A relatively low oxycline has been observed from October to December 29 due to thermal mixing of the mixolimnion. Furthermore, oxygen (<20 nmol L⁻¹) produced in situ by oxygenic photosynthesis has been detected in summer 64 . Together, these observations suggest the possibility of micro-oxic conditions at the chemocline throughout most of the year. Although aerobic sulfur oxidation yields only about 25-30% of the energy provided by anaerobic photosynthesis 65 , this amount may still be critical for the persistence of these microorganisms and for mixotrophic growth in summer. Accordingly, a complete respiratory chain was found in C. okenii strain LaCa, including NADH-quinone oxidoreductase (CXB77_RS02240-CXB77_RS02300 and CXB77_RS02310), succinate dehydrogenase (CXB77_RS02325-CXB77_RS02335 and CXB77_RS09655), as well as a multi-subunit terminal cytochrome bd oxidase (CXB77_RS12155 and putatively CXB77_RS12145 and CXB77_RS12150). Both can function as terminal electron acceptor in photosynthesis and substrate respiration. Taken together, we found evidence on the genomic level that C. okenii strain LaCa may be able to perform anoxygenic photosynthesis and chemotrophic respiration in Lake Cadagno, as generally suggested for PSB by Kämpf and Pfennig 66 .
Interestingly, partly complementary information to our findings is given by a metagenomic sequence bin in Berg et al. 33 .
In contrast, no significant growth after five days in darkness was observed in chemotrophic incubations of two C. okenii strains at room temperature under a 5% O₂ atmosphere 66 . Additionally, no correlation between light availability and specific dark fixation rates was observed in situ in a chemocline population dominated by C. okenii 59 . Consequently, the combination of the experimentally described metabolism and the biological functions encoded in the genome of C. okenii does not readily explain the high total dark fixation rates measured in Lake Cadagno 28,67 . Interestingly, however, C. okenii might be able to combine aerobic and anaerobic carbon fixation pathways by actively moving along the vertical oxygen gradient under summer conditions 33 .
Sulfur Metabolism. C. okenii uses reduced sulfur compounds such as H₂S and S⁰ as reductants for photolithoautotrophic growth 68,69 . Subsequently, light energy is used to transfer electrons to NAD(P)⁺ and ferredoxin for CO₂ fixation. Accordingly, C. okenii strain LaCa encodes flavocytochrome c (FccAB; CXB77_RS06380) and the sulfide:quinone oxidoreductases (SqrD and SqrF; CXB77_RS06755 and CXB77_RS12425), both of which oxidize H₂S in the periplasm to form sulfur globules (SGBs) containing S⁰. The SGBs are surrounded by sulfur globule proteins (SGPs) that fold into collagen-like filaments 70 . Accordingly, we identified two putative SgpA copies with N-terminal signal peptides (CXB77_RS07855 and CXB77_RS14820). This is important, as the homologue SgpA is essential for building intact sulfur globules in A. vinosum 71 . Furthermore, the canonical dissimilatory sulfite oxidation pathway (Dsr) that enables sulfite production in the cytoplasm in PSB 72 was found completely conserved in C. okenii strain LaCa in one cluster (CXB77_RS03215-CXB77_RS03270) and shows gene synteny with other Chromatiaceae. Interestingly, two arsR family transcriptional regulator genes (CXB77_RS06260 and CXB77_RS07240) possibly involved in H₂S-dependent gene regulation 73 were also detected. Moreover, in strain LaCa we found the trimeric adenylylsulfate reductase alpha and beta subunits AprAB (CXB77_RS17245 and CXB77_RS17240), which are anchored by the CoB-CoM heterodisulfide reductase multi-subunit complex (CXB77_RS04305-CXB77_RS04320). To complete sulfur oxidation, a Sat sulfate adenylyltransferase (CXB77_RS09675) and the dissimilatory-type SoeABC enzyme (CXB77_RS11845-CXB77_RS11855) are encoded. An additional cluster of the sulfur carrier proteins TusA (CXB77_RS15940) and DsrE2 (CXB77_RS15945) 74 , putatively involved in sulfur oxidation, was detected. Furthermore, we identified a cytochrome b561 (CXB77_RS01235) and an octaheme cytochrome c (CXB77_RS01240) homologous to those of A. vinosum DSM180 T . Both enzymes are conserved among PSB and have been found to be upregulated in A. vinosum DSM180 T with sulfide as sole electron donor 75 . Notably, no genes encoding Sox proteins necessary for thiosulfate (S₂O₃²⁻) oxidation were found 76 , which is in accordance with previous experimental results 69 . Furthermore, no genes of the adenylyl-sulfate kinase Cys pathway for assimilatory sulfate reduction could be detected, confirming previous experimental findings in which sulfate uptake was not observed for C. okenii 69 . Finally, no hydrogenases were predicted in the C. okenii genome, which excludes H₂ as a source of electrons 68,69 .
Nitrogen and Phosphate Assimilation. We detected nif genes involved in nitrogen fixation in C. okenii strain LaCa, distributed throughout the genome as in A. vinosum DSM180 T 77 . The presence of the dimeric nitrogenase molybdenum-iron protein nifDK (CXB77_RS12525 and CXB77_RS12530) and the nitrogenase iron protein nifH (CXB77_RS12535) points to a diazotrophic metabolism. Additionally, homologous nifD sequences were identified in T. violascens, Lamprocystis spp. and "T. syntrophicum". N₂ uptake is under transcriptional control of the two-component sensor histidine kinases NtrX and NtrY (CXB77_RS03520/CXB77_RS03525), the nitrogen regulatory protein P-II (CXB77_RS11185) and nifA (CXB77_RS10450), as well as the oxygen sensor nifL (CXB77_RS10445). In strain LaCa, we found polyphosphate kinase and exopolyphosphatase encoded; however, previous studies failed to demonstrate in situ polyphosphate accumulation 78 . Furthermore, C. okenii strain LaCa also encodes genes for ammonium assimilation, glutamate synthase and glutamine synthetase. In accordance, in situ NH₄⁺ consumption by C. okenii was demonstrated 27 and modelled for the Lake Cadagno chemocline 25 . However, alternative N-uptake mechanisms have yet to be described.
Carbon Metabolism. The light-driven carbon uptake kinetics of C. okenii have been studied in detail before 69,79 . In PSB, the Calvin-Benson-Bassham (CBB) cycle is the central carbon assimilation mechanism 69,79,80 . In the genome of strain LaCa, a complete CBB cycle was identified, with one cbbM ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) form II gene (CXB77_RS09535), the two regulatory genes cbbQ (CXB77_RS09540) and cbbO (CXB77_RS09550), and phosphoribulokinase PrkB (CXB77_RS15420). Furthermore, an additional RuBisCO-like protein gene rbcL (CXB77_RS07520) is present in the genome. RbcL is putatively involved in methionine salvage and is typically found in purple bacteria such as Rhodopseudomonas palustris 81,82 . Alternatively, RbcL also functions in a stress response mechanism or in sulfur metabolism in the GSB Chlorobium tepidum 83 and the purple bacterium Rhodospirillum rubrum 84 . In the genome of strain LaCa, no hypothetical carboxysome-like subunits and no RuBisCO form I genes (cbbL and cbbS) were found. This is in contrast to other, small-celled PSB such as "T. syntrophicum" strain Cad16 T and Lamprocystis spp. isolated from Lake Cadagno 85 .
For PSB, several carbon storage mechanisms have been described 87,88 that possibly serve both as energy and reductant reserves. In C. okenii strain LaCa, glycogen storage is mediated through glucose-1-phosphate adenylyltransferase and a 1,4-alpha-glucan (glycogen) branching enzyme (CXB77_RS16905). Furthermore, we found a complete tricarboxylic acid (TCA) cycle and enzymes for glycolysis. For C. okenii, polyhydroxybutyrate (PHB) synthesis under nitrogen limitation was described in vitro 78 , and a high average C:N ratio of 14.8 was previously reported that could potentially induce carbon storage mechanisms. In accordance, genes encoding PhaC (CXB77_RS16475) and PhaE (CXB77_RS16480), involved in PHB synthesis and depolymerisation, are present in C. okenii strain LaCa. Furthermore, C. okenii possibly oxidizes glycogen to PHB, with stored sulfur used as an electron sink by reduction to H₂S 79 . Additionally, for C. okenii under in situ dark conditions, the putatively aerobic oxidation of sulfur was found to be favoured over glucose oxidation to acetate and CO₂ 78 . However, PHB inclusions were not observed and storage compounds were depleted within hours under in situ conditions in Lake Cadagno 78 . These results obscure the role of PHB for the long-term survival of C. okenii. For the small-celled PSB "T. syntrophicum" strain Cad16 T , proteins involved in the degradation of PHB were shown to be upregulated under dark conditions in vitro 89 and under micro-oxic conditions in situ 90 . The degradation of PHB granules results in acetyl-CoA and NAD(P)H, which are both needed in the CO₂-fixing process in the absence of light.
Membrane Transport and Bacterial S-layer. Similar to other PSB species such as A. vinosum DSM 180 T or "T. syntrophicum" strain Cad16 T , the genome of C. okenii strain LaCa encodes both a Type IV pilus (CXB77_RS13225, CXB77_RS13640-CXB77_RS13660 and CXB77_RS13745-CXB77_RS13760) and a Type VI secretion system (CXB77_RS0616-CXB77_RS06190 and CXB77_RS12705-CXB77_RS12755). Other secretion systems encoded are the general secretion (Sec) and twin-arginine translocation (Tat) systems. Moreover, we found ABC-type transporters for dipeptide, oligopeptide, lipoprotein, phosphate and molybdenum uptake, as well as Tol, TRAP and Co²⁺, Mg²⁺ and Ni²⁺ uptake systems.
The main function of the surface layer (S-layer) is to reinforce bacterial cells against osmotic, mechanical and thermal forces 91 . Moreover, the S-layer possibly also functions as protection against bacterial predation and bacteriophage infection 91,92 . The S-layer in C. okenii consists of conically shaped hexagonal lattice subunits with a diameter of 13 nm that are regularly spaced by 19 nm and extend 25 nm from the surface 54 . Accordingly, two putative exported S-layer proteins (CXB77_RS09990 and CXB77_RS09995) and a FhaB-like protein (CXB77_RS10005) similar to alkaline phosphatases in Microcystis spp. were identified in strain LaCa. The S-layer proteins might be exported through a homologous Type I secretion system, SapDEF (CXB77_RS09940-CXB77_RS09950) 88 . We also found a putative SapC protein (CXB77_RS08915) lacking the signal peptide, homologous to that of Halorhodospira halochloris (HH1059_1773).
In Lake Cadagno the C. okenii population was observed to wither dramatically within a period of days in October 21,22 . The increase in C. okenii cells over the preceding summer months possibly leads to metabolic stress and an increased sedimentation rate could lead to conditions of high bacterial predator pressure. In accordance, epibionts were reported for C. okenii in Lake Cadagno 27 and for other large-celled Chromatium species elsewhere 93 . These were characterized as bacterial scavengers that feed on non-dividing Chromatium cells 94,95 and may lead to this population collapse 96 . In contrast, no sequences related to Bdellovibrio, Daptobacter or Vampirococcus-type could be detected in the enrichment samples. While this data is currently unavailable, we expect to detect epibionts on non-viable, sedimented C. okenii cells in samples from the lower monimolimnion.
The importance of bacteriophages for aquatic microbial community dynamics has been recognized 97,98 ; however, few studies have focused on stratified systems [99][100][101] . In this study, several putative prophage and incomplete phage sequences were found in the C. okenii strain LaCa sequence (Supplementary Table S5 and Supplementary Fig. S5). The prophage sequences were similar to sequences from T. violascens DSM 198 and A. vinosum DSM 180, and the putative phage Chok4 likewise showed sequence similarity to T. violascens DSM 198 and A. vinosum DSM 180 using BLASTn. When the database was restricted to viral sequences, no significant hits were obtained. Furthermore, a 442 bp CRISPR array with seven 36 bp repeats and six spacers was detected on contig PPGH01000029.1 (Supplementary Table S6). A similar CRISPR sequence was found in A. vinosum DSM180 T on plasmid pALVIN01. However, no adjacent Cas protein cluster was detected in the genome of C. okenii strain LaCa. Furthermore, eleven Rha-type phage regulatory proteins were detected. In summary, the putative phage sequences and phage-related genes found indicate the presence of phages specific for the dense chemocline community. Therefore, the function of the S-layer against phage attachment, as well as that of the Type VI secretion system in defence against bacterial predation, must be further elucidated.
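The reported dimensions of the CRISPR array allow a quick consistency check on spacer length. The sketch below assumes that the 442 bp span starts and ends with a repeat and contains no flanking sequence, which the text does not state explicitly; the function name is hypothetical.

```python
def mean_spacer_length(array_len_bp, n_repeats, repeat_len_bp, n_spacers):
    """Average spacer length implied by the array dimensions.

    Assumes the array consists only of repeats and spacers
    (repeat-spacer-...-repeat), with no flanking sequence included.
    """
    spacer_total = array_len_bp - n_repeats * repeat_len_bp
    if n_spacers <= 0 or spacer_total < 0:
        raise ValueError("inconsistent array dimensions")
    return spacer_total / n_spacers


# Values taken from the text: 442 bp array, seven 36 bp repeats, six spacers.
print(round(mean_spacer_length(442, 7, 36, 6), 1))
```

Under this assumption the six spacers would together account for 190 bp, i.e. a little under 32 bp each on average, which is within the range typically seen for CRISPR spacers.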
Flagella and Chemotaxis. C. okenii is motile by means of around 40 lophotrichous flagella that together form a tuft with a length of 20 to 30 µm 6,54 . The direction is controlled by the action of either pulling or pushing flagella that rotate clockwise or counterclockwise, respectively 99 . In accordance, one cluster with genes encoding the basal body, hook and filament was identified. Additional flg, flh, che and fli genes were found to group with the motA and motB genes. A histidine-aspartate phosphorelay (HAP)-based system 100 comprising the chemotaxis genes cheABRWYZ and a total of 31 putative chemoreceptors (MCP: methyl-accepting chemotaxis protein) of the TAP or TLPA family was detected. Notably, a putative Aer aerotaxis sensor receptor protein (CXB77_RS12890), a bacteriophytochrome (CXB77_RS05740) and two putative blue-light-activated histidine kinases (CXB77_RS09475 and CXB77_RS08785) were found. We also detected a putative circadian input kinase A (CXB77_RS08775); however, no complete kaiABC relay was identified. Interestingly, only parts of a set of genes involved in acyl homoserine lactone mediated quorum sensing were detected, including components of the SagS-HptB-HsbR (swarming activity and biofilm formation) two-component regulatory system/cAMP/Vfr signalling (CXB77_RS11060 and CXB77_RS08790) and the putative transcriptional activator protein LasR (CXB77_RS11545). Large-celled Chromatium spp. have been characterized as metabolically less flexible in comparison with the non-motile, small-celled PSB 56,68 and might therefore be forced to adapt to changing conditions by constantly moving along the optimal gradients. Overmann and Garcia-Pichel postulate that motile PSB have an advantage over PSB with gas vacuoles at light intensities above 0.2 µmol quanta m⁻² s⁻¹ 56 .
In Lake Cadagno, between 5.8 and 35 µmol quanta m⁻² s⁻¹ were measured at the upper chemocline during summer, whereas a ten-fold decrease in light intensity within the cm to m thick bacterial layer was found 17 . Moreover, an inverse correlation between available light and thickness of the bacterial plume was described for Lake Cadagno 23 . This may indicate that members of the microbial population actively move vertically on a minute-to-hour timescale. In comparison, the velocity of Chromatium minus seems to be determined by external sulfide concentration and light intensity in vitro 102 . Interestingly, for C. minus, swimming speed is higher and run time longer under low light intensity than under high light conditions, a phenomenon that persists over hours 102 . The observed swimming speed of strain LaCa of (2.7 ± 1.4) × 10⁻⁵ m s⁻¹ 32,103 (0.097 m h⁻¹) enables the chemocline to be crossed within minutes to hours. Importantly, the resulting accumulations of motile, dense cells at the upper border of the chemocline may provoke bioconvection 32 . Taken together, these different observations indicate that C. okenii would benefit from upward movement under non-light-limiting conditions under the guidance of scotophobotaxis, negative O₂ chemotaxis and positive H₂S chemotaxis. However, temporal vertical mobility patterns of C. okenii have additionally been described as diel 10,104,105 or stochastic 31,32 . The signalling pathways involved in coordinating movement in C. okenii will also have to be examined in more detail.
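The relation between swimming speed and the time needed to cross the bacterial layer can be made explicit with a short calculation. The sketch below converts the mean speed of 2.7 × 10⁻⁵ m s⁻¹ into metres per hour (roughly 0.097 m h⁻¹) and estimates crossing times for layer thicknesses in the cm to m range. The chosen thicknesses are illustrative assumptions, and the estimate assumes uninterrupted, straight vertical swimming, which real cells are unlikely to sustain.

```python
SECONDS_PER_HOUR = 3600

# Mean swimming speed of strain LaCa as given in the text (m per second).
MEAN_SPEED_M_PER_S = 2.7e-5


def to_m_per_h(speed_m_per_s):
    """Convert a speed from metres per second to metres per hour."""
    return speed_m_per_s * SECONDS_PER_HOUR


def crossing_time_h(thickness_m, speed_m_per_s=MEAN_SPEED_M_PER_S):
    """Hours needed to swim straight through a layer of given thickness."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    return thickness_m / to_m_per_h(speed_m_per_s)


print(f"{to_m_per_h(MEAN_SPEED_M_PER_S):.4f} m per hour")

# Hypothetical layer thicknesses spanning the 'cm to m' range.
for thickness in (0.01, 0.1, 1.0):
    hours = crossing_time_h(thickness)
    print(f"{thickness:>5} m layer: {hours * 60:.0f} min ({hours:.2f} h)")
```

With these assumptions, a centimetre-scale layer would be crossed in minutes and a metre-scale layer in about ten hours, in line with the minute-to-hour timescale discussed above.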
Comparative genomics. Orthologous gene families can be used to compare the encoded metabolic, structural and behavioural potential between organisms 106 and have here been applied to study encoded differences between PSB and GSB. When we compared the KEGG enzymatic pathways between PSB, only minor differences were detected (Supplementary Table S7). Subsequently, OrthoVenn was used to create a dataset of annotated gene clusters to compare the phototrophic sulfur bacteria populations of Lake Cadagno. The genomes of previously isolated PSB ("T. syntrophicum" strain Cad16 T and L. purpurea strain CadA31) and GSB (C. phaeoclathratiforme strain Bu-1) were thereby compared to that of the PSB C. okenii strain LaCa. Using an in silico approach, we sought to find genes potentially explaining the co-existence of anoxygenic phototrophic sulfur bacteria in the chemocline. In total, 10,632 genes were included, and the four species encompassed 4,536 gene clusters, 3,902 orthologous gene clusters (containing at least two species) and 386 single-copy gene clusters (Fig. 4).
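The cluster categories used above can be illustrated with a small sketch. It is not OrthoVenn's implementation; it only assumes that clustering has already produced, for each cluster, the gene IDs contributed by each genome, and it uses the definitions given in the text (orthologous clusters contain at least two species; single-copy clusters contain exactly one gene from every species). The function name, the short genome labels and the gene IDs in the toy input are hypothetical.

```python
def summarize_clusters(clusters, species):
    """Count cluster categories.

    clusters: dict mapping cluster ID -> dict of species -> list of gene IDs
    species:  iterable of all species labels in the comparison
    """
    species = set(species)
    orthologous = 0
    single_copy = 0
    exclusive = {sp: 0 for sp in species}
    for members in clusters.values():
        # Species that contribute at least one gene to this cluster.
        present = {sp for sp, genes in members.items() if genes}
        if len(present) >= 2:
            orthologous += 1
        elif len(present) == 1:
            # Cluster of paralogues found in a single genome only.
            exclusive[next(iter(present))] += 1
        # Single-copy: every species present with exactly one gene.
        if present == species and all(len(members[sp]) == 1 for sp in species):
            single_copy += 1
    return {
        "clusters": len(clusters),
        "orthologous": orthologous,
        "single_copy": single_copy,
        "exclusive": exclusive,
    }


# Hypothetical toy input with made-up gene IDs.
genomes = ["LaCa", "Cad16", "CadA31", "Bu-1"]
toy_clusters = {
    "c1": {"LaCa": ["g1"], "Cad16": ["g2"], "CadA31": ["g3"], "Bu-1": ["g4"]},
    "c2": {"LaCa": ["g5", "g6"], "Cad16": ["g7"]},
    "c3": {"LaCa": ["g8", "g9"]},
}
print(summarize_clusters(toy_clusters, genomes))
```

In this toy input, c1 counts as both orthologous and single-copy, c2 as orthologous only, and c3 as a cluster exclusive to one genome, the category that is analysed for GO-term enrichment in strain LaCa.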
Orthologous gene clusters shared by PSB (n = 828) were enriched for the GO-terms protein export and membrane insertion, as well as light harvesting complex components and cyclic electron flow, indicating the primary phototrophic lifestyle and possibly the membrane-bound enzymes (e.g. RCs) involved. GSB C. phaeoclathratiforme strain Bu-1 was enriched for chlorosome components, among others, whereas "T. syntrophicum" strain Cad16 T was enriched for chitinase function and extracellular and outer membrane components, and L. purpurea strain CadA31 for phage-related sequences and processes.
Conserved RuBisCO type II (CbbM) and RuBisCO-like protein (RLP) type IV sequences were detected in all PSB examined here. Interestingly, the heterodimeric RuBisCO type I (CbbLS) is missing in C. okenii and was found only in the two small-celled PSB. Additionally, all PSB studied encoded cytochrome d ubiquinol oxidases (CydAB), whereas only the small-celled PSB encoded a cbb3-type cytochrome c oxidase (Table 3).
In C. okenii, out of 3,016 protein-coding genes, 144 exclusive gene clusters were present. GO-enrichment analysis within this group resulted in over-representation of GO-terms linked to chemotaxis, flagellar movement, the S-layer and arginine uptake (Fig. 4). Arginine uptake may be important for C. okenii, since microbial utilization of free amino acids in lakes has been described as a driver for bacterial community function 107 and arginine ammonification has also been used as a proxy for respiration of microbial communities 108,109 . For strain LaCa, arginine could therefore function as an additional carbon source not available to other PSB and GSB, and also provide extra N due to the high C:N ratio of 3, as proposed for other freshwater bacteria 109 .
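Over-representation of a GO-term in a gene set is commonly assessed with a one-sided hypergeometric test. The sketch below illustrates that general approach and is not a description of the exact procedure used in this study. All counts in the example are hypothetical and chosen only to show the calculation.

```python
from math import comb


def hypergeom_upper_tail(hits: int, population: int,
                         annotated: int, selected: int) -> float:
    """P(X >= hits) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO-term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical counts: 1,000 background genes, 50 carry the term,
    # 100 genes are in the set of interest and 15 of those carry the term.
    p = hypergeom_upper_tail(hits=15, population=1000, annotated=50, selected=100)
    print(f"one-sided p-value: {p:.3g}")
```

A small p-value means the term occurs more often in the gene set than expected by chance; in practice the p-values would also be corrected for testing many GO-terms.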
C. okenii has an approximately 7× larger cell volume 27 and a 30× reduced surface-to-volume ratio compared to small-celled PSB. As bacterial cell size influences metabolic activity and internal organization 110,111 , transcriptional regulation, functional compartmentalisation and genome organisation (i.e. polyploidy) may be fundamentally different between C. okenii and small-celled PSB and GSB. However, no evidence of multiple chromosomes in C. okenii strain LaCa was found when taking into account the uniform coverage and the lack of allele variants in the assembly.

Conclusions
In the study presented, we confirmed several previous experimental findings of metabolic activity 66,68,78,79 on the basis of the genomic information. As is typical for PSB, the C. okenii strain LaCa genome encodes the CBB cycle and a type II RC; however, Sox proteins, hydrogenases and the Cys sulfate assimilation pathway are missing completely. Furthermore, genes involved in carbon and nitrogen utilization were similar between C. okenii strain LaCa and other PSB and show redundancy with A. vinosum DSM 180 T . Interestingly, cytochrome d ubiquinol oxidases were also found in all known PSB genomes of Lake Cadagno, indicating aerobic respiration of oxidized organic carbon compounds, such as glucose. In contrast, the co-occurrence of RuBisCO type II together with cbbQ and cbbO genes, as in C. okenii, has been described in more detail for obligate autotrophs 112 . However, in the PSB "T. syntrophicum" strain Cad16 T the type II RuBisCO was constitutively expressed under dark and light conditions 89,90 and it was therefore suggested to function in cofactor re-oxidation, as found in purple non-sulfur bacteria 81 . Taken together, the absence of both a type I RuBisCO (CbbLS) and a carboxysome-like CO 2 concentration mechanism in C. okenii, as well as the known low CO 2 affinity of RuBisCO form II 113
In terms of changes in the environment, the C. okenii population in Lake Cadagno is exposed to abiotic factors that vary on short timescales (minutes to hours), such as light availability, reduced electron donors and oxygen, and disturbances of the water column (i.e. internal waves and seiches), and to biotic factors, such as grazing 67,114 . Seasonal factors such as an increase in total cell numbers within the chemocline in summer, changes in the day to night length ratio and the 3-5 months of ice-cover in winter (i.e. light availability and quality) add further complexity.
Under low light availability in spring, the reported relatively higher sulfide affinity in comparison with other PSB and the benefit of the larger dark-to-light hours ratio may, in turn, give C. okenii an advantage over small-celled PSB, as observed in vitro 115 . Furthermore, the low phototrophic population cell concentration of ~25% of the summer community 21 may also reduce self-shading 116 , and predation rates and phage numbers may also be lower. The rapid onset of aerobic photosynthesis after the ice-melt may additionally support chemotrophic microaerophilic growth of C. okenii.
To conclude, the multiple factors that influence C. okenii strain LaCa behaviour have to be further disentangled. The sensing of short-term fluctuations and adaptation to more dramatic longer-lasting changes in the environment were found to have left an imprint in the C. okenii genome in the relative over-representation of genes for motility and sensing, and in some versatility in the assimilatory pathways. Chemo- and scotophobotaxis, quorum sensing and diel and seasonal behavioural patterns must be considered in future studies on bioconvection. Further studies on genomic heterogeneity within the C. okenii population and on the diversity of transcriptional control at the single-cell level could give further insight into the important ecological role of C. okenii for the Lake Cadagno ecosystem 22,27 and other stratified water bodies.