Main

Marine cyanobacteria of the genera Synechococcus and Prochlorococcus together account for about 25% of global photosynthesis (Li et al., 1993; Liu et al., 1997; Partensky et al., 1999). Several of their cyanophages, from the myovirus and podovirus families, carry photosynthetic genes, and it has been suggested that these genes increase phage fitness (Mann et al., 2003; Lindell et al., 2004, 2005; Millard et al., 2004; Sullivan et al., 2005; Sharon et al., 2007; Dammeyer et al., 2008). Cyanobacterial photosynthetic membranes contain two photosystems: PSII mediates the transfer of electrons from water, the initial electron donor, to the plastoquinone pool, whereas PSI mediates electron transfer from plastocyanin to ferredoxin, thereby generating the reducing power needed for CO2 fixation in the form of reduced nicotinamide adenine dinucleotide phosphate (NADPH). Although PSII is known to be sensitive to photodamage, PSI is considered to be more stable than PSII.

The PSII gene psbA coding for the labile D1 protein is readily detected in various cultured and environmental myoviruses and podoviruses infecting Prochlorococcus and Synechococcus (Mann et al., 2003; Lindell et al., 2004; Zeidner et al., 2005; Sullivan et al., 2006; Sharon et al., 2007). In myoviruses, genes encoding the PSII D2 protein as well as different genes of the electron transport chain are also found (Mann et al., 2003; Lindell et al., 2004; Millard et al., 2004; Sullivan et al., 2005, 2006; Alperovitch et al., 2011; Philosof et al., 2011; Sharon et al., 2011).

Recently, PSI gene cassettes containing whole gene suites (psaJF, C, A, B, K, E and D) sufficient to build a monomeric PSI were reported in marine cyanophages from the Pacific and Indian Oceans (Sharon et al., 2009; Alperovitch et al., 2011). This was observed using both the GOS data set (Rusch et al., 2007) and the viral marine biome from the Pacific Line Islands (Dinsdale et al., 2008a, 2008b). The main gene organization was psaJF->C->A->B->K->E->D, which was found in several GOS stations (from the Pacific and Indian Oceans) and in the marine viral biome (from the Pacific Line Islands). However, a different GOS arrangement, psaD->C->A, was detected only once (clone JCVI_TMPL_1061008099984 (hereafter 9984) in the Pacific open ocean GOS station GS047) and could not be confirmed with the different available 454-pyrosequenced viral biome data sets (Sharon et al., 2009). In addition, this clone’s PsaD, PsaC and PsaA proteins sat on long branches in phylogenetic trees when compared with other viral or cyanobacterial PSI proteins (Sharon et al., 2009). A better understanding of the different viral photosynthesis genes and their genomic organizations might help explain the ‘viral photosynthesis’ phenomenon (Rohwer and Thurber, 2009).

To check whether the rare psaD->C->A clone is an authentic event and not a chimeric artifact, we examined its %G+C content. As can be seen in the PsaA phylogenetic tree (Figure 1), the 50% G+C of the clone’s psaA gene is clearly distinct from that of its Synechococcus tree neighbors (55–60% G+C) and from that of the distant Prochlorococcus counterparts (38–42% G+C). Moreover, when %G+C content is examined across the entire psaD->C->A cassette, a similar intermediate %G+C is observed in all three PSI genes (Figure 2). This is clearly distinct from the low 40–44% G+C observed in the other viral PSI cassettes. Clone 9984 contains the viral neck protein gene gp13 in the vicinity of the PSI genes. A phylogenetic tree of the gp13 neck protein places clone 9984 gp13 close to myocyanophages (Supplementary Figure S1). In addition, the clone carries a viral hypothetical gene that is found in the myocyanophage P-SSM4.
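
For readers who wish to reproduce this kind of comparison, a per-gene %G+C calculation can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure used in the study: the function names `gc_percent` and `per_gene_gc` are hypothetical, the example sequences are made-up toy strings rather than real psa genes, and ambiguous bases (such as N) are assumed to be excluded from the denominator.

```python
def gc_percent(seq: str) -> float:
    """Return %G+C of a DNA sequence, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def per_gene_gc(genes: dict[str, str]) -> dict[str, float]:
    """Compute %G+C for each gene separately, keyed by gene name."""
    return {name: gc_percent(seq) for name, seq in genes.items()}


if __name__ == "__main__":
    # Toy sequences for illustration only (NOT real psaD/psaC/psaA data).
    toy_cassette = {
        "geneA": "ATGGCCGATTTACGC",
        "geneB": "ATGAAATTTGGGCCC",
    }
    for name, value in per_gene_gc(toy_cassette).items():
        print(f"{name}: {value:.1f}% G+C")
```

Computing the value gene by gene, rather than over the whole clone, is what allows a cassette to be checked for internally consistent composition, which is the argument made above against a chimeric origin.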

Figure 1

The relationship between Synechococcus, Prochlorococcus and their GOS phage PsaA proteins and DNA sequences. Following alignment computation (using MUSCLE (Edgar, 2004)), PhyML version 3.0 (Guindon et al., 2009) was used to calculate the phylogenetic tree. Sequences from the GOS expedition are shown in bold. For clarity, the tree shows only a subset of the 583 partial PsaA sequences found in the GOS data set. A phage symbol is attached to each GOS sequence identified as also containing structural viral genes. PsaA protein sequences can be found in Supplementary File S1.

Figure 2

Schematic physical maps of viral GOS clones, reads and long PCRs containing PSI gene cassettes. PSI genes are colored according to their %G+C content. Gray arrows represent viral ORFs. The ORF next to gp13 on clone 9984 is similar to hypothetical protein 133 from Prochlorococcus phage P-SSM4. The color code index indicates %G+C; the calculations were performed for each gene separately. DNA sequences can be found in Supplementary File S2.

We searched for the psaD->C->A gene arrangement in different publicly available data sets on the CAMERA server (Sun et al., 2011; database updated 12 July 2011). Using a blastN search with CAMERA default parameters (e-value 1e+1) against the Sanger sequence-based data sets, a single read (JCVI_READ_1103242526277; raw sequences in Supplementary File S2) from the Indian open ocean GOS station GS112 was detected with a similar psaD->C->A arrangement. The PsaA protein from this read was identical to the PsaA protein from clone 9984, and the genes share similar %G+C content. The other end of the same clone (JCVI_READ_1103242427903) contained a PSI psaB gene and a viral major capsid gene, gp23 (similar to gp23 from myocyanophage S-SSM7). Interestingly, the %G+C content of this psaB gene was 46%, which is clearly distinct from that of the other viral psaB genes described so far (Figure 2), and its protein clusters on a long branch within the Synechococcus PsaB protein cluster (Figure 3).

Figure 3

The relationship between Synechococcus (cyan), Prochlorococcus (green) and viral (red) PsaB proteins. Following alignment computation (using MUSCLE), PhyML was used to calculate the phylogenetic tree. PsaB protein sequences can be found in Supplementary File S3.

When the search was extended to 454 pyrosequence-based data sets, a single similar 500-nt-long read (CAM_READ_0231734843) from the Indian Ocean GOS station GS108 (Cocos Keeling, Inside Lagoon) was detected with a partial psaC->A arrangement. The new PsaA protein from station GS108 falls next to the clone 9984 PsaA protein on a long branch in the phylogenetic tree (Figure 1a), and the genes share similar %G+C content. PCR reactions using primers targeting the psaC->A arrangement, performed on viral concentrates from the Pacific Line Islands (Caroline atoll (Millennium Island)), also confirmed the presence of a psaC->A arrangement with similar %G+C content (GenBank accession numbers JQ653152 and JQ653153). The deduced PsaA proteins from the PCR products were similar to the clone 9984 PsaA protein.

The new viral PSI gene arrangement reported here, psaD->C->A->B, is minimal compared with the other known viral arrangement, psaJF->C->A->B->K->E->D. However, a PSI containing only the PsaA, B, C and D proteins is believed to have been an important step in the evolution of PSI (Nelson, 2011). It is therefore intriguing that this minimal psaD->C->A->B arrangement is the one observed on phages, which would suggest that this minimal set is functional.

We have not ruled out the possibility that clone 9984 and the related reads originate from as yet uncultured cyanobacteria. However, based on the existence of two viral proteins on clone 9984 and one on read JCVI_READ_1103242427903, the amplification from viral concentrate, and the unique gene arrangement, we suggest that the observed PSI GOS sequences, although rare, are authentic and represent new viral PSI gene organizations, which need to be explored further.