Introduction

CO2 is one of the main greenhouse gases. Cyanobacteria can convert CO2 into organic compounds through oxygenic photosynthesis with an efficiency higher than that of terrestrial plants1,2. Cyanobacteria account for 20–30% of earth's photosynthetic productivity; that is, the conversion of solar energy into chemical energy3,4. The use of engineered cyanobacteria capable of producing fuels and other chemicals could reduce greenhouse gas emission and help to address the shortages of energy and resources5. As nonpathogenic aquatic photoautotrophs with a high nutritional value, some cyanobacteria are attractive hosts for producing heterologous proteins that can be used in the health, food, fodder, fertilizer and environment sectors6,7,8.

Proteins of industrial or environmental interest that have been produced in cyanobacteria include a metallothionein protein capable of absorbing heavy metals in wastewater7, a larvicidal protein that can kill mosquito eggs in water8 and a human CuZn superoxide dismutase that can be used in health products6. The human CuZn superoxide dismutase was expressed at a level of 3% of total soluble protein in the cyanobacterium Anacystis nidulans 63016. This expression level was achieved by using a strong native promoter, Prbc343, and an optimized Shine-Dalgarno (SD) sequence. Because this expression cassette was delivered on a replicative plasmid, the constant addition of antibiotics was required to maintain the expression level6.

Several heterologous genes have been expressed in cyanobacteria in an effort to create novel biosynthetic pathways to produce chemicals. Heterologous strong promoters such as Ptrc and Plac and native promoters such as Pcpc, Prbc and PpsbA2 have been used in integrated cyanobacterial expression systems. Lan et al.9 used Ptrc to express trans-enoyl-CoA reductase (Ter) for butanol production and Guerrero et al.10 used Plac to express ethylene-forming enzyme (EFE) for ethylene production. Lindberg et al.11 and Bentley et al.12 used PpsbA2 to drive expression of isoprene synthase (Isps) for isoprene production in cyanobacteria. Although the target chemicals were produced successfully, the expression levels of the proteins encoded by the heterologous genes were not reported9,10. Recently, Angermayr et al.13 tested various promoters (PrnpB, PpsbA2 and Ptrc) to drive expression of L-lactate dehydrogenase in cyanobacteria and found that even powerful native and artificial promoters from Synechocystis were not strong enough to produce the enzyme in sufficient quantities.

The aim of this study was to create novel expression elements that lead to strong expression of heterologous genes in cyanobacteria. We discovered a super-strong promoter, Pcpc560, consisting of two promoters from the cpcB gene and 14 predicted transcription factor binding sites (TFBSs). Using Pcpc560, two heterologous genes were expressed in the cyanobacterium Synechocystis sp. 6803 (hereafter, S. 6803) to a level comparable to that obtained using the Escherichia coli expression system, demonstrating the strength and efficiency of Pcpc560. We further showed that the presence of multiple TFBSs is crucial for the promoter strength of Pcpc560. The discovery of this super-strong promoter will stimulate research on using cyanobacteria as hosts to produce valuable recombinant proteins from CO2 and inorganic nutrients.

Results

Discovery of the super-strong promoter Pcpc560

Efficient expression elements are required to increase the expression level of heterologous genes. Since c-phycocyanin and ribulose-bisphosphate carboxylase/oxygenase (Rubisco) are the major soluble proteins in cyanobacteria, expression elements from genes encoding these two proteins have been used widely to construct cyanobacterial expression vectors6,7,14,15,16. For example, a metallothionein gene was expressed in the cyanobacterium Synechococcus sp. 7942 using the promoter from cpcB, the gene encoding the c-phycocyanin beta subunit7, while a human CuZn superoxide dismutase gene was expressed in Synechococcus sp. 7002 using the promoter and terminator from the rbc operon, which encodes Rubisco6.

RBSDesigner17 predicted that the ribosome binding site (RBS) in the cpcB promoter is approximately 10 times more efficient than that in the rbc promoter in terms of translation. Therefore, we chose the cpcB promoter for further investigation. The genomic sequence 1,000 bp upstream of the initiation codon of the cpcB gene (sll1577) was subjected to promoter analysis using Virtual Footprint18. This analysis revealed two promoters located 135 bp and 374 bp upstream of the initiation codon of the cpcB gene. Scanning the same 1,000-bp sequence also revealed 14 TFBSs located between 381 bp and 556 bp upstream of the initiation codon18 (Supplementary Table S1). Negative TFBSs are usually located between 500 bp and 1,000 bp upstream of the transcription start site19, and some mammalian proteins such as soluble adenylyl cyclase20 are highly similar to these proteins in cyanobacteria. Therefore, we chose the 560-bp genomic sequence upstream of the initiation codon of cpcB as the new promoter sequence; this sequence was designated Pcpc560. Pcpc560 contains two predicted promoters and 14 predicted TFBSs (Fig. 1).

Figure 1

Schematic structure of super-strong promoter Pcpc560.

Green box shows first predicted promoter P1; yellow box shows second predicted promoter P2; red boxes show 14 predicted transcription factor binding sites (TFBSs).
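The layout of Pcpc560 described above can be sketched in a few lines of Python. This is only an illustration, not the analysis performed with Virtual Footprint: the positions are the distances reported in the text, and the names PROMOTERS, TFBS_REGION, features_in_fragment and upstream_fragment are hypothetical helpers introduced here.

```python
# Distances (bp) upstream of the cpcB initiation codon, as reported in the text.
PROMOTERS = {"P1": 135, "P2": 374}   # two predicted promoters
TFBS_REGION = (381, 556)             # span holding the 14 predicted TFBSs


def features_in_fragment(fragment_len):
    """Report which annotated features lie within `fragment_len` bp
    upstream of the initiation codon (hypothetical helper)."""
    promoters = [name for name, pos in PROMOTERS.items() if pos <= fragment_len]
    # The TFBS span counts as included only if its distal end is covered.
    tfbs_included = TFBS_REGION[1] <= fragment_len
    return {"promoters": promoters, "tfbs_region_included": tfbs_included}


def upstream_fragment(upstream_seq, fragment_len):
    """Return the last `fragment_len` bases of a sequence that ends
    immediately before the initiation codon (hypothetical helper)."""
    if fragment_len > len(upstream_seq):
        raise ValueError("fragment is longer than the supplied sequence")
    return upstream_seq[-fragment_len:]


if __name__ == "__main__":
    for length in (374, 560):
        print(length, features_in_fragment(length))
```

Under these assumptions, a 560-bp fragment covers both predicted promoters and the whole TFBS span, whereas a 374-bp fragment covers only the two promoters.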

No terminator was found in the genomic region 300 bp downstream of the stop codon of the cpcB gene. This may be because cpcB is the first gene of the cpc operon. Since the rbc terminator was used in a previous study that achieved high-level gene expression (yielding target protein at a level of 3% of total soluble protein) in a cyanobacterium6, we chose the terminator of rbcL, the gene encoding the Rubisco large subunit (slr0009), as the terminator (TrbcL).

High-level expression of crotonyl-CoA-specific trans-enoyl-CoA reductase gene in S. 6803

To validate the efficacy of the newly discovered promoter Pcpc560, we used it to drive the expression in S. 6803 of two genes encoding important metabolic enzymes. One enzyme was crotonyl-CoA-specific trans-enoyl-CoA reductase (Ter), a key enzyme in increasing the driving force towards butanol biosynthesis21,22. ter has been overexpressed in Synechococcus sp. PCC 7942 with the aim of producing butanol from CO29. A strong promoter from E. coli, Ptrc, was used to express a codon-optimized ter in Synechococcus sp. PCC 79429. However, the activity of Ter in the crude cell extract of Synechococcus sp. PCC 7942 was only 0.057 ± 0.005 μmol/min/mg crude cell extract. This was 64-fold lower than that of Ter (3.7 ± 0.5 μmol/min/mg) in E. coli, in which the ter gene was expressed under the control of a medium-strength promoter, PL lacO122. This suggests that Ptrc is not an efficient promoter in cyanobacteria in terms of gene expression strength. Therefore, we tested whether the newly discovered promoter Pcpc560 could significantly increase the expression level of ter in cyanobacteria.

To express ter in S. 6803, a codon-optimized ter gene (Supplementary Fig. S1) from T. denticola flanked by Pcpc560 and TrbcL expression elements was integrated into the S. 6803 chromosome at the pta (slr2132) insertion site via homologous recombination. The pta gene encodes phosphotransacetylase, the first enzyme in acetate synthetic pathway and disruption of pta does not affect cell growth16. Homoplasmy and gene insertion were verified by PCR and sequencing (Fig. 2a and 2b). SDS-PAGE analysis demonstrated that ter was strongly overexpressed in the S. 6803 mutant Δpta::Pcpc560ter. The protein expression level was approximately 15% of total soluble protein (Fig. 2c), a surprisingly strong expression for cyanobacteria. The band excised from the position at approximately 43.7 kDa was confirmed as Ter from T. denticola by MALDI-TOF MS analysis (Fig. 2d).

Figure 2

Heterologous gene insertion into genome of S. 6803 at pta site and expression of ter and DldhE in S. 6803.

(a) For all mutants, complete segregation was demonstrated by whole-cell PCR with primers from recombinant cassette at pta site and a primer from 100-bp outside of recombinant cassette. (b) Whole-cell PCR with primers for each heterologous gene verified insertion of each heterologous gene into genome of corresponding strain. M: DNA marker III (top to bottom; 4.5, 3, 2, 1.2, 0.8, 0.5 kb). (c) Detection of heterologous gene expression in corresponding strain by 12% SDS-PAGE analysis. Each marker protein band contains 2 μg protein; 20 μg protein was loaded into each lane. Crude extracts from each strain were analyzed. Expression levels of Ter or DldhE under control of strong promoter Pcpc560 were approximately 15% of total soluble protein, whereas Ter expressed under control of Pcpc374 in Δpta::Pcpc374ter strain was undetectable on gel. (d) Peptide mass fingerprint of bands at approximately 43.6 and 36.5 kDa. 43.6-kDa band was identified as Ter from T. denticola, 36.5-kDa band was identified as DldhE from E. coli K12. Mascot score greater than 53 (default MASCOT threshold for such searches) was accepted as significant (p value < 0.05). All strains (WT, Δpta, Δpta::Pcpc374ter, Δpta::Pcpc560ter and Δpta::Pcpc560DldhE) were continuously passaged more than 10 times without antibiotics.

The specific activity of Ter was determined to be 32.22 ± 1.93 μmol/min/mg crude cell extract of S. 6803 strain Δpta::Pcpc560ter (Table 1). In a previous study in which the ter gene was expressed in Synechococcus sp. PCC 7942 under the control of the Ptrc promoter9, the activity of Ter was 0.057 ± 0.005 μmol/min/mg crude cell extract. When ter was expressed in E. coli BW25113 under the control of the PL lacO1 promoter, on a medium-copy number plasmid, the Ter activity was 3.7 ± 0.5 μmol/min/mg crude cell extract22. These activities of Ter reported in the literature were measured in crude lysates obtained from different organisms cultured in various different conditions. The activity of Ter determined in our study indicates that the ter gene was strongly expressed in S. 6803 strain Δpta::Pcpc560ter.

Table 1 Enzymatic activities of Ter and DldhE in S. 6803 strains

TFBSs are crucial for Pcpc560 strength

The newly discovered super-strong promoter Pcpc560 differs from other commonly used promoters in that it contains 14 predicted TFBSs located between −556 bp and −381 bp from the initiation codon of cpcB. To investigate whether these TFBSs contribute to cpcB promoter strength, ter was expressed under the control of Pcpc374, which contains only the two promoters, in S. 6803 strain Δpta::Pcpc374ter (Fig. 2a and 2b). Although RT-PCR confirmed ter transcription from Pcpc374 (Supplementary Fig. S2), the level of Ter protein was too low for detection by SDS-PAGE, whereas the level of Ter protein expressed from ter driven by Pcpc560 was up to 15% of total soluble protein (Fig. 2c). The specific activity of Ter in strain Δpta::Pcpc374ter was only 0.12 ± 0.01 μmol/min/mg crude extract (Table 1). This was 268-fold lower than that of Ter in the strain Δpta::Pcpc560ter (32.22 ± 1.93 μmol/min/mg) and only slightly higher than that of Ter in Synechococcus sp. PCC 7942, in which the ter gene was expressed under the control of the Ptrc promoter9. These data show that the newly discovered super-strong promoter Pcpc560 is indeed much stronger than Pcpc374 and Ptrc and that the multiple TFBSs present in Pcpc560 are crucial for its promoter strength.
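The fold differences quoted above follow directly from the mean specific activities. A minimal sketch of that arithmetic is given below; the function name fold_difference and the dictionary keys are hypothetical labels, and only the mean values stated in the text are used (the standard deviations are not propagated).

```python
# Mean Ter specific activities (umol/min/mg crude extract) quoted in the text.
TER_ACTIVITY = {
    "S6803_Pcpc560": 32.22,   # this study, strain with Pcpc560
    "S6803_Pcpc374": 0.12,    # this study, strain with Pcpc374
    "PCC7942_Ptrc": 0.057,    # value reported in the literature
}


def fold_difference(higher, lower):
    """Ratio of two mean activities (hypothetical helper)."""
    if lower <= 0:
        raise ValueError("activity must be positive")
    return higher / lower


if __name__ == "__main__":
    a = TER_ACTIVITY
    print(fold_difference(a["S6803_Pcpc560"], a["S6803_Pcpc374"]))
    print(fold_difference(a["S6803_Pcpc374"], a["PCC7942_Ptrc"]))
```

Because the literature values were obtained in different organisms and culture conditions, ratios that mix this study with earlier reports are indicative only.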

High-level expression of d-lactate dehydrogenase in S. 6803

To investigate whether the newly discovered super-strong promoter Pcpc560 also drives strong expression of other genes, we used it to control expression of Dldh, which encodes d-lactate dehydrogenase. The Dldh gene from E. coli K12 (designated as DldhE) flanked by Pcpc560 and TrbcL was integrated into the chromosome at the pta insertion site via homologous recombination. Homoplasmy and DldhE gene insertion were verified by PCR and sequencing (Fig. 2a and 2b). SDS-PAGE analysis demonstrated that DldhE was strongly overexpressed in the S. 6803 mutant Δpta::Pcpc560DldhE; the protein expression level was approximately 15% of total soluble protein (Fig. 2c). MALDI-TOF MS analysis confirmed that the band excised from the position at approximately 36.5 kDa was Dldh from E. coli (Fig. 2d). The specific activity was determined to be 51.92 ± 2.31 μmol/min/mg crude extract in S. 6803 strain Δpta::Pcpc560DldhE (Table 1). In a previous study in which the DldhE gene was expressed under the control of PLtetO-1 on a low-copy number plasmid23, the activity of DldhE in E. coli K12 MG1655 was 0.96 ± 0.06 μmol/min/mg crude cell extract. This comparison confirms that the DldhE gene was strongly expressed in S. 6803 under the control of Pcpc560.

We also investigated the genetic stability of the mutant containing the strong integrative expression cassette. The mutants Δpta::Pcpc560ter and Δpta::Pcpc560DldhE were continuously passaged 10 times in the absence of antibiotics. The expression levels of both ter and DldhE did not change during this procedure, demonstrating the stability of this expression system.

Discussion

In this study, we discovered and verified a super-strong promoter Pcpc560 for efficient and strong expression of heterologous genes in cyanobacteria. The newly discovered super-strong promoter Pcpc560 consists of two predicted promoters from the cpcB gene and 14 predicted TFBSs. Using Pcpc560, two heterologous genes were expressed in the cyanobacterium S. 6803 to levels of up to 15% of total soluble protein. Despite the fact that only a single copy of the target gene was inserted into the chromosome of S. 6803, the expression level of the target gene driven by Pcpc560 was comparable to that which can be achieved in an E. coli expression system using low- or medium-copy number plasmids. This demonstrates that Pcpc560 has potential applications for efficient production of recombinant proteins in cyanobacteria.

It is generally accepted that the genomic sequence 200–300 bp upstream of the initiation codon of a gene that is constitutively expressed at high levels can be used as a promoter sequence. Previous studies have shown that the thymine at 259 bp upstream of the initiation codon of cpcB is crucial for cpc promoter activity24. Recently, a genomic map of the transcriptional start sites (TSS) of S. 6803 showed that the cpcB gene is one of the genes with a long distance between its TSS and start codon25. Further examination revealed that most of the S. 6803 genes with a long distance between the TSS and start codon are responsive to environmental factors25. Since the cpcB gene is a light- and redox-responsive gene26,27, it is plausible that the length of the cpcB promoter is related to its responsiveness to environmental factors. Pcpc560 is unusually long in that it contains not only two predicted promoters (at 135 bp and 374 bp), but also 14 predicted TFBSs located between 381 bp and 556 bp upstream of the initiation codon of cpcB. Pcpc560 differs from Pcpc374 in that it has an extra 186 bp DNA fragment containing the 14 predicted TFBSs. The large difference in the expression levels of ter under the control of Pcpc560 and Pcpc374 suggests that the extra 186 bp DNA fragment in Pcpc560 contains positive TFBSs and that the presence of these multiple TFBSs is crucial for its promoter strength.

Transcription factors are proteins that play roles in virtually every aspect of the transcription process28. In prokaryotes, RNA polymerase recognizes and binds to the promoter region and initiates transcription and a sigma factor is required for RNA polymerase to bind to the promoter. In eukaryotes, three types of eukaryotic RNA polymerases all require transcription factors to bind to the promoter sequence before transcription can be initiated28. Therefore, compared with eukaryotic promoters such as the yeast promoter29, the promoter of E. coli is rather short (approximately 30–50 bp) and TFBSs are usually not required. For instance, when we scanned the sequences of E. coli strong promoters (including Ptrc, Plac and T7) that have been used to drive gene expressions in cyanobacteria previously, we found only one or two TFBSs in each of the promoters. In this study, we found 14 predicted TFBSs located between 381 bp and 556 bp upstream of the initiation codon of cpcB and verified that the extra 186 bp DNA fragment containing multiple predicted TFBSs is crucial for Pcpc560 promoter strength in cyanobacteria. This novel discovery raises the possibility that the lack of TFBSs in E. coli strong promoters may explain why they perform poorly in driving gene expression in cyanobacteria. Thus, we propose that native positive TFBSs should be considered when designing promoters to drive gene expression in cyanobacteria.

The newly discovered super-strong promoter Pcpc560 will be useful for further research on the production of useful substances using transgenic cyanobacteria and the very cheap substrate CO2. The principle of considering TFBSs in cyanobacterial promoter design will also contribute to designing strong and controllable promoters to drive the expression of genes involved in new biosynthetic pathways from CO2 in cyanobacteria.

Methods

Materials

We purchased restriction endonucleases and other nucleases for plasmid construction from New England Biolabs (Beverly, MA, USA). All chemicals for enzymatic analyses were purchased from Sigma-Aldrich (St Louis, MO, USA).

Construction of expression vectors

The plasmids used and constructed in this study are listed in Supplementary Table S2. Primers are listed in Supplementary Table S3. General strategies for constructing expression vectors are shown in Supplementary Fig. S3.

The sequences of Pcpc560 and the TrbcL terminator were obtained from S. 6803 genomic DNA by PCR. Codon-optimized ter from Treponema denticola was ordered from Sangon Biotech (Shanghai, China). DldhE was obtained by PCR using genomic DNA of E. coli K12 as the template. Fusion PCR30 was used to join DNA fragments, yielding Pcpc560-ter-TrbcL and Pcpc560-DldhE-TrbcL. To construct the expression vectors pSM2-Pcpc560ter and pSM2-Pcpc560DldhE, the fusion PCR products Pcpc560-ter-TrbcL and Pcpc560-DldhE-TrbcL were digested with XhoI and ligated into the XhoI site between two homologous arms of vector pSM216. The ter gene flanked by Pcpc374 and TrbcL was obtained by PCR from the constructed plasmid pSM2-Pcpc560ter and was inserted into the XhoI site of plasmid pSM216.

Strains

The bacterial strains used in these experiments are listed in Supplementary Table S2. We used E. coli DH5α to propagate constructs and S. 6803 wild-type as the starting strain. All mutant strains were constructed by transforming S. 6803 wild-type with the plasmids listed in Supplementary Table S2, followed by subsequent selection until homoplasmy was achieved.

Transformations were performed as previously described16. All constructed strains are listed in Supplementary Table S2. Briefly, strain Δpta was constructed by integration of the plasmid pSM216 at the pta locus of wild-type S. 6803 via double crossover homologous recombination.

Transforming the wild-type strain of S. 6803 with the plasmids pSM2-Pcpc560ter and pSM2-Pcpc374ter generated strains Δpta::Pcpc560ter and Δpta::Pcpc374ter, respectively. In these strains, a part of the pta gene was replaced with the kanamycin resistance cassette and the Pcpc560- or Pcpc374-driven expression cassette of the codon-optimized gene ter from T. denticola encoding crotonyl-CoA specific trans-enoyl-CoA reductase.

Transforming the wild-type strain of S. 6803 with plasmid pSM2-Pcpc560DldhE generated strain Δpta::Pcpc560DldhE. In this strain, a part of pta was replaced with the kanamycin resistance cassette and the Pcpc560-driven expression cassette of the d-lactate dehydrogenase gene DldhE from E. coli K12.

Culture conditions

Wild-type and mutant lines of S. 6803 were grown in BG11 medium buffered with 10 mM HEPES (pH 8.0) at 30°C at an illumination intensity of 100 μmol photons/s/m2 as described elsewhere31. Kanamycin was added to the medium at a final concentration of 10 μg/ml when necessary. To maintain strains on agar plates, the BG11 medium was supplemented with 1.5% (w/v) agar.

RT-PCR

RT-PCR was performed as previously described16, with modifications. Total RNA was isolated from S. 6803 wild-type (WT), Δpta, Δpta::Pcpc374ter and Δpta::Pcpc560ter cells using Redzol reagent (Qiagen, Beijing, China). Residual DNA in RNA preparations was eliminated by digestion with RNase-free DNase, and reverse transcription reactions were performed using a Reverse Transcription kit (Qiagen). Reverse transcription products were amplified by PCR and analyzed by electrophoresis on 1.2% (w/v) agarose gels. The ter transcript was amplified using the forward primer 5′-CAAGCCTTGTACCGCAAA-3′ and the reverse primer 5′-CGATCAAAGCGTTCCACT-3′. Transcript levels of rnpB were analyzed as a positive control32.
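As a quick sanity check on the ter primer pair listed above, basic primer properties can be computed as follows. This sketch is not part of the reported protocol: the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C) is a rough rule of thumb for short oligonucleotides that is assumed here for illustration, and gc_fraction and wallace_tm are hypothetical helper names.

```python
# Primer sequences for the ter transcript, copied from the text (5' to 3').
TER_FORWARD = "CAAGCCTTGTACCGCAAA"
TER_REVERSE = "CGATCAAAGCGTTCCACT"


def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule:
    2 per A/T base plus 4 per G/C base. An approximation only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("forward", TER_FORWARD), ("reverse", TER_REVERSE)):
        print(name, len(primer), gc_fraction(primer), wallace_tm(primer))
```

Both primers are 18 nt long with matched GC content, so the rule of thumb assigns them the same estimated melting temperature.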

Preparation of cell extract and gene expression level analysis

Crude cell extracts were prepared as described elsewhere33, with some modifications. We collected 10-ml samples of cultures in mid-exponential growth phase and harvested cells by centrifugation at 14,000 g and 4°C for 2 min. Cell pellets were washed twice and then resuspended in the prechilled buffer used for the in vitro assays described below. Cells were disrupted by bead beating, and cell debris and beads were removed by centrifugation at 14,000 g and 4°C for 30 min. Total protein concentrations of the crude extract were determined using the Bradford method with bovine serum albumin as the standard. To measure protein expression levels, crude extract was subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The target protein bands were quantified using VisionWorksLS software, using the absolute quantity of the marker proteins as references. Each marker protein band contained 2 μg protein and samples with 20 μg protein in total were loaded into each lane. The abundance of protein in the Ter and DldhE target bands was about 1.5-fold that of the marker band, as determined by VisionWorksLS analysis. Therefore, there was approximately 3 μg Ter or DldhE on the gel. From this value, we calculated that the expression level of Ter or DldhE was approximately 15% of total soluble protein. The target protein bands were excised and subjected to in-gel digestion and MALDI-TOF MS analysis as described in our previous reports34,35.
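The expression-level estimate above reduces to two steps: scale the marker mass by the densitometric band ratio, then divide by the protein loaded per lane. The sketch below reproduces that arithmetic with the values given in the text; the function name expression_level is a hypothetical helper, and the band ratio would in practice come from the gel-analysis software.

```python
def expression_level(band_ratio_to_marker, marker_ug, loaded_ug):
    """Estimate target protein mass and its share of total soluble protein.

    band_ratio_to_marker: target band intensity relative to the marker band
    marker_ug:            protein mass in the marker band (ug)
    loaded_ug:            total protein loaded in the lane (ug)
    """
    if marker_ug <= 0 or loaded_ug <= 0:
        raise ValueError("protein masses must be positive")
    target_ug = band_ratio_to_marker * marker_ug
    return target_ug, target_ug / loaded_ug


if __name__ == "__main__":
    # Values from the text: 1.5-fold the marker band, 2 ug marker, 20 ug per lane.
    target_ug, fraction = expression_level(1.5, 2.0, 20.0)
    print(f"{target_ug:.1f} ug target protein, {fraction:.0%} of total soluble protein")
```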

Trans-2-enoyl-CoA reductase (Ter) assay

Ter activity was assayed by monitoring the decrease in absorbance at 340 nm, which corresponded to consumption of NADH as previously described9,36, with slight modifications. Briefly, the reaction mixture (total volume, 200 μl) contained 100 mM potassium phosphate buffer (pH 6.2), 200 μM crotonyl-CoA (Sigma), 400 μM NADH and 0.3 μg crude extract protein from strain Δpta::Pcpc560ter or 7 μg crude extract protein from strains S. 6803, Δpta and Δpta::Pcpc374ter. After preincubation at 30°C for 5 min, the reaction was initiated by adding the substrate. The activity was determined by monitoring the decrease in absorbance at 340 nm using a microtiter plate reader (SpectraMax 190, Molecular Devices, Sunnyvale, CA, USA) at 30°C.

d-lactate dehydrogenase (Dldh) assay

Dldh activity was determined by monitoring the decrease in absorbance at 340 nm as previously reported9,37. The 200 μl reaction mixture contained 50 mM sodium phosphate (pH 6.5), 300 μM NADH, 2.5 mM MgCl2, 3 μg crude extract protein and 30 mM sodium pyruvate. After preincubation at 30°C for 5 min, the reaction was initiated by adding the substrate sodium pyruvate.
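For both assays above, the rate of absorbance decrease at 340 nm can be converted into a specific activity with the Beer-Lambert law. The following is a sketch under stated assumptions rather than the exact calculation used in this study: it takes the commonly cited molar absorption coefficient of NADH at 340 nm (6.22 mM⁻¹ cm⁻¹), and the optical path length of a 200 μl well is left as a user-supplied parameter because it depends on the plate and reader. The function name specific_activity is hypothetical.

```python
NADH_EPSILON_MM = 6.22  # mM^-1 cm^-1 at 340 nm (commonly cited value; an assumption here)


def specific_activity(delta_a340_per_min, path_cm, volume_ml, protein_mg,
                      epsilon_mm=NADH_EPSILON_MM):
    """Specific activity in umol NADH consumed per min per mg protein.

    delta_a340_per_min: magnitude of the absorbance decrease per minute
    path_cm:            optical path length of the well (cm), plate dependent
    volume_ml:          reaction volume (ml), e.g. 0.2 for a 200 ul reaction
    protein_mg:         crude extract protein in the reaction (mg)
    """
    if path_cm <= 0 or volume_ml <= 0 or protein_mg <= 0:
        raise ValueError("path length, volume and protein must be positive")
    # Beer-Lambert: concentration change in mM/min, i.e. umol/ml/min.
    rate_mm_per_min = abs(delta_a340_per_min) / (epsilon_mm * path_cm)
    umol_per_min = rate_mm_per_min * volume_ml
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 0.10 A340/min, 0.56 cm path, 200 ul, 0.3 ug protein.
    print(specific_activity(0.10, 0.56, 0.2, 0.3e-3))
```

The absorbance rate and path length in the usage line are illustrative placeholders, not measurements from this work.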