Rational design of an artificial tethered enzyme for non-templated post-transcriptional mRNA polyadenylation by the second generation of the C3P3 system

Le Boulch, Marine; Jacquet, Eric; Nhiri, Naïma; Shmulevitz, Maya; Jaïs, Philippe H.

doi:10.1038/s41598-024-55947-0

Download PDF

Article
Open access
Published: 02 March 2024

Rational design of an artificial tethered enzyme for non-templated post-transcriptional mRNA polyadenylation by the second generation of the C3P3 system

Marine Le Boulch¹,
Eric Jacquet²,
Naïma Nhiri²,
Maya Shmulevitz³ &
…
Philippe H. Jaïs¹

Scientific Reports volume 14, Article number: 5156 (2024) Cite this article

3152 Accesses
1 Altmetric
Metrics details

Subjects

Abstract

We have previously introduced the first generation of C3P3, an artificial system that allows the autonomous in-vivo production of mRNA with ^m7GpppN-cap. While C3P3-G1 synthesized much larger amounts of capped mRNA in human cells than conventional nuclear expression systems, it produced a proportionately much smaller amount of the corresponding proteins, indicating a clear defect of mRNA translatability. A possible mechanism for this poor translatability could be the rudimentary polyadenylation of the mRNA produced by the C3P3-G1 system. We therefore sought to develop the C3P3-G2 system using an artificial enzyme to post-transcriptionally lengthen the poly(A) tail. This system is based on the mutant mouse poly(A) polymerase alpha fused at its N terminus with an N peptide from the λ virus, which binds to BoxBr sequences placed in the 3′UTR region of the mRNA of interest. The resulting system selectively brings mPAPαm7 to the target mRNA to elongate its poly(A)-tail to a length of few hundred adenosine. Such elongation of the poly(A) tail leads to an increase in protein expression levels of about 2.5–3 times in cultured human cells compared to the C3P3-G1 system. Finally, the coding sequence of the tethered mutant poly(A) polymerase can be efficiently fused to that of the C3P3-G1 enzyme via an F2A sequence, thus constituting the single-ORF C3P3-G2 enzyme. These technical developments constitute an important milestone in improving the performance of the C3P3 system, paving the way for its applications in bioproduction and non-viral human gene therapy.

Clamping-mediated incorporation of single-stranded DNA with concomitant DNA synthesis by Taq polymerase involves nick-translation

Article Open access 23 January 2024

Branched chemically modified poly(A) tails enhance the translation capacity of mRNA

Article 22 March 2024

Caliciviral protein-based artificial translational activator for mammalian gene circuits with RNA-only delivery

Article Open access 10 March 2020

Introduction

Many experimental and therapeutic approaches require expression of exogenous genes in eukaryotic cells. Currently, there are three common strategies to achieve exogenous gene delivery and expression. First, DNA plasmids can be introduced into cells, but the poor efficiency of nuclear localization in many cells required for exogenous gene expression limits the potency of DNA plasmid approaches. To circumvent this limitation, viruses such as adeno-associated viruses (AAV) and retroviruses can be used to deliver genomes into the nucleus of cells for expression; albeit viruses come with their own limitations in terms of production and safety¹. Finally, lipid nanoparticles (LNP) can be used to deliver RNAs directly into the cytoplasm of cells for immediate protein production, as is used for mRNA SARS-Cov2 vaccines². There are limitations to RNA-based gene expression approaches however, such as the short duration of protein expression and induction of innate signaling pathways. Given the limitations of existing strategies, we have focused on creating a DNA-based system that drives mRNA and protein expression from the cytoplasm. The rationale for this system is to harness the benefits of DNA for long term expression without requiring nuclear import.

We previously applied synthetic biology and molecular engineering to develop an artificial system called C3P3 (standing for cytoplasmic chimeric capping-prone phage polymerase), which allows the autonomous synthesis of mature target mRNAs in the cytoplasmic compartment of mammalian cells³. The first generation of this system (C3P3-G1) relies on single-subunit artificial chimeric enzyme created by the fusion of the DNA-dependent RNA polymerase from the K1E bacteriophage and the NP868R ^m7GpppN-capping (cap-0) enzyme from the African Swine fever virus, separated by a flexible (G₄S)₂ linker. Due to their physical association, mRNA transcribed by RNA polymerase moiety is nearly fully capped by the capping enzymatic moiety, in contrast to what is observed in the absence of their physical association³. Such difference is explained by the restricted diffusion of large macromolecules such as mRNA and DNA caused by the gel-like viscosity of the cytoplasm or nucleoplasm⁴. Although the C3P3 enzyme functions in both of these cellular compartments, we chose to express it in the cytoplasm of the host cell. This cellular localization in fact has the advantage of avoiding the translocation of the transfected exogenous DNA from the cytoplasm towards the nucleus, which is one of the most important barriers for the delivery of exogenous DNA to the cells⁵.

The C3P3 system was designed to transcribe double-stranded DNA templates containing the target gene with C3P3 promoter, and then add a ^m7GpppN cap-0 at the ends 5′ of the target transcripts. The proximity of these two enzymatic functions makes it possible to synthesize a target mRNA that is predominantly capped at its 5′ end³. Although the C3P3-G1 system is highly processive and produces large amounts of target mRNA in cultured mammalian cells, proportionally lesser amounts of protein are produced, suggesting a further bottleneck to maximum protein expression also exists during mRNA translation. For example, the C3P3-G1 expression system produced 6.7-fold more luciferase mRNA, but 2.2-fold less luciferase luminescence signal at peak relative to the standard CMV-driven Firefly luciferase expression plasmid³. These findings suggest that other modifications, in addition to 5′-end ^m7GpppN capping, are required for full translatability of target transcripts.

Polyadenylation is a post-transcriptional modification found at the 3′-end of virtually all eukaryotic mRNAs, with the notable exception of most histone mRNAs, which have a stem-loop at their 3′-end and produces cleaved non-polyadenylated mature mRNAs, although a subset of replication-dependent histone mRNAs can be also expressed as polyadenylated mRNA^6,7. Polyadenylation occurs post-transcriptionally by poly(A) polymerases (PAP), whose prototype is the nuclear poly(A) polymerase alpha (PAPα). This enzyme post-transcriptionally adds residues of adenosine monophosphate of adenosine triphosphate to RNA by releasing a pyrophosphate group. PAPα is part of a protein complex with cleavage and polyadenylation specificity factor (CPSF), which binds to the AAUAAA polyadenylation signal hexamer, and cleaves pre-mRNA 12–30 nucleotides downstream to the hexamer. The polyadenylation complex also includes the cleavage stimulation factor F (CstF), which binds to the G/U-rich signal and cleaves the 3′-most part of a newly produced RNA (Fig. 1A). Once synthesized, the polyadenylated tail plays a crucial role in regulating the stability, nuclear transport, and translation of mRNAs^8,9,10,11,12.

Given the importance of poly(A) tails in mRNA translation, we sought to extend the length of the poly(A) tails generated by the C3P3-G1 system and evaluate benefits in protein expression levels. The original C3P3-G1 system produced a short 40-residue poly(A) tail on mRNAs, transcribed from an adenosine track on the DNA template, followed by a cis-cleaving hepatitis D virus ribozyme. Conversely, native mRNAs contain poly(A) tails of up to 250–300 nucleotides at synthesis, then shortened to varying lengths in the cytoplasm with a median between 50 and 100 nucleotides at steady state^{13,14,15,16,17,18}. Given the role of polyadenylation in eukaryotes on mRNA processing, we thought of developing a second-generation C3P3 system (C3P3-G2) that generates significantly longer poly(A) tails on C3P3 transcripts.

To extend poly(A) tails in the C3P3 system, a chimeric mutant PAPα was tethered to the 3′-untranslated region (3′UTR) of target mRNA. This modification post-transcriptionally lengthened the poly(A) tail of mRNA synthesized by the C3P3-G1 enzyme (Fig. 1B), and increased expression of Firefly luciferase from mRNAs by nearly 3-times compared to the first-generation C3P3-G1 system. The development of a polyadenylation system is therefore an important milestone in the improvement of exogenous gene expression by DNA-based artificial cytoplasmic expression systems.

Material and methods

Plasmids

Artificial gene sequences were synthesized and assembled from stepwise PCR using oligonucleotides, cloned and fully sequence verified by GeneArt AG (Regensburg, Germany). The coding sequences of all constructions were optimized for protein expression in human cells with respect to codon adaptation index¹⁹.

The candidate poly(A) polymerase plasmids consisted of the IE1 promoter/enhancer from the human cytomegalovirus (CMV), 5′UTR from human β-globin, Kozak consensus sequence followed by the open-reading frame (ORF) of tethered poly(A) polymerases, 3′UTR from human β-globin, and SV40 polyadenylation signal (Supplementary Fig. 1). Unless otherwise indicated, poly(A) polymerases were tethered by the fusion at their amino-terminal ends with the N-peptide sequence of 21 amino acids from the λ virus.

The C3P3-G1 plasmid, also named pCMV-NP868R-(G₄S)₂-K1ERNAP(R551S), was previously described³. It contains a single open-reading frame (ORF) encoding for the NP868R capping enzyme of the African Swine fever virus fused through a flexible (G4S)₂ linker to the mutant DNA-dependent RNA polymerase from the K1E bacteriophage (Supplementary Fig. 2).

The Firefly luciferase reporter plasmid pK1E-Luciferase-4xλBoxBr-A40, which was used for the cell assays unless otherwise indicated, consists of the K1E phage RNA polymerase promoter, 5′UTR from human β-globin, Kozak consensus sequence followed by the ORF of the Firefly luciferase gene from Photinus pyralis, 3′UTR from human β-globin, four BoxBr of 17 nucleotides in tandem from the λ-virus, poly(A) track of 40 adenosine, self-cleaving hepatitis D virus antigenomic ribozyme sequence, and terminated by the bacteriophage T7 φ10 transcription stop (Supplementary Fig. 3).

Cell culture and transfection

For standard experiments, the Human Embryonic Kidney 293 (HEK-293, ATCC CRL 1573) cells were routinely grown at 37 °C in 5% CO₂ atmosphere at 100% relative humidity. Cells were maintained in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 4 mM l-alanyl-l-glutamine, 10% fetal bovine serum (FBS), 1% non-essential amino-acids, 1% sodium pyruvate, 1% penicillin and streptomycin, and 0.25% fungizone.

Cells were routinely plated in 24-well plates at 1 × 10⁵ cells per well the day before transfection and transfected at 80% cell confluence. Transient transfection was performed with Lipofectamine 2000 reagent (Invitrogen, Carlsbad, CA) according to manufacturer’s recommendations, except when otherwise stated. Lipoplexes were prepared by mixing plasmid DNA (μg) with Lipofectamine 2000 (μl) in a ratio of 1:2.5. For standard luciferase and hSEAP gene reporter expression assays, cells were analyzed 48 h after transfection, unless otherwise indicated.

Firefly luciferase oxidation assays and eSEAP gene reporter measurements

Luciferase luminescence was assayed by the Luciferase Assay System (Promega, Madison, WI) as described elsewhere with minor modifications³. In brief, cells were lysed with 200 µL of Cell Culture Lysis Reagent buffer (CLR, Promega). Cell lysates were transferred to microcentrifuge tubes, briefly vortexed, then centrifuged at 12,000×g for two minutes at 4 °C. The supernatant was transferred to a microplate (20 µL/well), then Luciferase Assay Reagent (Promega; 100 µL/well) diluted at 1:10 was added to protein extracts. Luminescence readout was taken on a Tristar 2 microplate reader (Berthold, Bad Wildbad, Germany) with a read time of one second per well. For translatability assays, the area-under-curve from D0-to-D6 (AUC_D0–D6) of the Firefly luminescence was calculated using the linear trapezoid method.

In order to normalize for transfection efficacy, cells were transfected with the pORF-eSEAP plasmid (InvivoGen, San Diego, CA), which encodes for the human secreted embryonic alkaline phosphatase driven by the EF-1α/HTLV composite promoter. Enzymatic activity was assayed in cell culture medium using the Quanti-Blue colorimetric enzyme assay kit (InvivoGen) as described elsewhere³. Gene reporter expression was expressed as the ratio of luciferase luminescence (RLU; relative light units) to eSEAP absorbance (OD, optic density).

siRNA-mediated gene knockdown

HEK-293 cells were transfected at the final concentration of 100 nM with chemically synthesized pools of four 21-nucleotide siRNAs against human cyclin B and human p34^cdc2 (Supplementary Material and Data 1), or non-targeting pool of siRNA used as a negative control (Dharmacon, Lafayette, CO)²⁰.

Cell proliferation and cytotoxicity assays

Cell viability was measured with the CyQUANT LDH Cytotoxicity Assay Kit (Invitrogen), a colorimetric assay to measure lactate dehydrogenase (LDH), which is a cytoplasmic enzyme released from dead or dying cells into the cell culture medium²¹. The assay is based on the conversion by LDH of lactate to pyruvate through the reduction of NAD+ to NADH, which reduces a tetrazolium salt to a red formazan product that can be measured on a microplate reader. Cytotoxicity assay was performed according to the manufacturer's instructions by adding the reaction mixture to the cell culture medium, followed 30 min later by measuring the difference between the absorbance values at 490 nm and that at 680 nm measured with a monochrometer-based Tecan instrument (Männedorf Switzerland).

Cell proliferation was assayed using the CyQUANT Direct Cell Proliferation kit (Invitrogen), which consists of two components, the green cyanine dye and a background suppression dye. This assay is based on a nucleic acid stain is a live cell permeable reagent that mainly binds to the nuclear DNA of mammalian cells²², whereas the suppression dye is impermeable in live cells and suppresses green fluorescence. The assay was performed according to the manufacturer's instructions by adding the reaction mixture to the cultured cells, followed 60 min later by measuring the fluorescence at 508 nm excitation and 527 nm emission wavelengths using a monochrometer-based Tecan instrument.

Flow cytometry analysis

Transfection rate and protein expression levels were assayed with the Guava easyCyte flow cytometer (Luminex, Austin, TX) in HEK-293 cells transfected with eGFP plasmid under control of the C3P3 promoter or a conventional nuclear CMV enhancer/promoter.

Western-blotting

For C3P3-G1 or Nλ-PAPαm7 Western-blotting, one 3xFLAG-tag was introduced in-frame into the ORF the enzymes²³. For the Western-blot of C3P3-G2, the 3xFLAG-tagged Nλ-PAPαm7 sequence was fused in-frame with the 3xFLAG-tagged C3P3-G1 enzyme through an F2A ribosome skipping sequence. Western blotting was carried out essentially as described elsewhere, with minor modifications³. Cells were transfected as previously described, lysed in 200 µl of CLR buffer, then lysate was clarified by spinning for 15 s at 12,000×g at room temperature. Forty micrograms of total protein were resolved on 7% polyacrylamide gel, and transferred onto nitrocellulose Hybond membrane (GE Healthcare, Pittsburgh, PA) overnight at + 4 °C. Membranes were blocked with 5% skim milk powder in Tris Buffered Saline with Tween 20. Membranes were incubated with the rabbit polyclonal F7425 anti-FLAG primary antibody (1:1000; Sigma-Aldrich, Saint-Louis, MO) for 1 h at room temperature, then incubated with the anti-rabbit IgG-conjugated horseradish peroxidase 7074 antibody (1:1000; Cell Signaling Technologies, Danvers, MA). The membrane was submerged with SuperSignal West Pico Chemiluminescent Substrate solution (ThermoFisher Scientific, Waltham, MA) and scanned with the Amersham Biosciences Imager 600 (GE Healthcare, Chicago, IL). Molecular weights were determined using the high range color-coded prestained protein markers (Cell Signaling Technologies, Danvers, MA). Membranes were analyzed with ImageJ software²⁴.

C3P3 protein fluorescence imaging by laser-scanning confocal microscopy

Each moiety of the C3P3-G2 enzyme, which is obtained by fusion into a single ORF of the Nλ-PAPαm7 sequence with the C3P3-G1 enzyme via an F2A ribosome skipping sequence, were imaged by indirect immunofluorescence. To this end, one 3xFLAG tag was fused in frame to the carboxyl-terminal end of the Nλ-PAPαm7 subunit, while a 3xV5 tag to the amino-terminal ends of C3P3-G1²⁵. Chinese Hamster Ovary K1 (CHO-K1, ATCC CCL-61) were transfected as described above, then fixed in 4% paraformaldehyde for 15 min at room temperature, washed in PBS and permeabilized for 10 min in 0.1% Triton X-100. Non-specific-binding was blocked with 5% normal serum (v/v) in 3% PBS-BSA buffer. Cells were incubated with the rabbit F7425 polyclonal IgG anti-3xFLAG antibody (1:100; Sigma-Aldrich, Saint-Louis, MO) and the mouse R960-25 anti-V5 Tag monoclonal IgG2a antibody (1:250; Thermo-Fisher). Cells were then incubated with the secondary Alexa Fluor 568-conjugated goat anti-rabbit A11036 orange-fluorescent (1:2000) and the Alexa Fluor 488-conjugated goat anti-mouse A11029 green-fluorescent antibody (1:500; Thermo-Fisher). Slides were mounted in the anti-fade Vectashield Mounting Medium (Vector Laboratories, Burlingame, CA), then imaged on a Leica SP8 confocal microscope equipped with the appropriate laser excitation filters under oil immersion with image magnification and processed with Leica LAS-AF software. For nuclear stained cell imaging, slides were incubated with Hoechst 33342 dye.

Real-time mRNA synthesis imaging

Real-time live cell production of mRNA by the C3P3 system was performed with SmartFlare probes (Merck-Millipore, Billerica, MA). These probes are made of a gold nanoparticle conjugated to multiple copies of double-stranded oligonucleotide consisting of capture mRNA-specific complementary sequences hybridized to complementary reporter sequences²⁶. The capture sequences are bound to gold nanoparticles, while the 5′-ends of complementary reporter sequences are bound to Cy-5 fluorophore. The fluorescence of the reporter sequences is quenched by its proximity to the gold core. When the target cellular mRNA is present, it hybridizes with the capture sequences linked to the gold nanoparticles, which releases the released reporter strand, which is no longer quenched and becomes fluorescent. CHO-K1 cells were plated on Lab-Tek chambered cover glasses and co-transfected as described above with C3P3-G1 plasmid, together with a plasmid containing the gene of interest under control of the C3P3 promoter. The SmartFlare probe was added to the culture medium at a final concentration of 400 pM and placed in a thermoregulated flow chamber at 37 °C in a humidified atmosphere carrying 5% CO₂ using a microscope video camera (Zeiss, Iena, Germany). Cells were continuously imaged every five minutes for 48 h using the 650 nm laser excitation filters. Images were analyzed with the Zen Blue 2012 software (Zeiss). For nuclear staining, Hoechst 33342 was added to the culture medium and imaged as described above.

mRNA translatability measurement

The effect of lengthening poly(A) tail on kinetics of target transcripts synthesized by the C3P3 system was assessed by quantitative reverse transcription-polymerase chain reaction (RT-qPCR) as described elsewhere with minor modifications³. HEK-293 cells were transfected as described above, then total RNA was isolated using the Nucleospin RNA columns (Macherey–Nagel, Düren, Germany) and subjected to TURBO DNase treatment (Ambion, Foster City, CA). RNA samples were quantified using a Nanodrop One spectrophotometer (ThermoFisher Scientific) and their integrity was assessed using the Agilent 2100 Bioanalyzer with the RNA 6000 Nano Kit (Agilent Technologies). Total RNA was reverse-transcribed using the high-capacity cDNA reverse transcription kit with RNase inhibitor (Life Technologies). The cDNA was then amplified by real-time RT-qPCR using primer and MGB probe sets for the Firefly luciferase and C3P3-G1 mRNAs with TaqMan detection (Supplementary Material and Data). All measured RIN values for the samples were greater than 9. Copy number quantification relied on the following normalization steps (Supplementary Material and Data). First, the purified RNAs were quantified by nanodrop and by Bioanalyzer, which made it possible to carry out the reverse transcription under similar conditions for all the samples. Second, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and β-actin (ACTB) mRNAs were quantified by RT-qPCR simultaneously with the mRNA of interest. The values obtained by RT-qPCR were thus corrected by the possible variations of these two reference genes in each of the samples. Third, the exact determination of the copy number is made by using a reference plasmid carrying all the amplicon sequences of the target genes, which makes it possible to correlate copy number and Ct, and thereby the copy number in the initial sample. For the quantification of translatability, the AUC_D0-D6 of the Firefly luciferase mRNA was calculated using the linear trapezoid method, then reported to the AUC_D0-D6 of the Firefly luciferase quantified by luciferin oxidation and also calculated by the linear trapezoid method.

Poly(A) tailing assay

The poly(A) tail length of C3P3 transcripts was investigated using the Poly(A) Tailing Assay (also named RACE-PAT), which is an application of the 3′-RACE²⁷. In brief, HEK-293 cells were co-transfected with the C3P3-G1 plasmid and pK1E-Luciferase-4xλBoxBr-A40, with or without the Nλ-PAPαm7 plasmid. Total RNA was then extracted with Nucleospin RNA columns as previously described, followed by incubation with the yeast poly(A) polymerase and GTP, which adds a limited number of guanosine and inosine nucleotides to the 3′-ends of poly(A)-containing RNAs, thus creating a unique poly(A)-oligo(G) junction (Affymetrix, Cleveland, OH). The tailed-RNAs was then converted to cDNA through reverse transcription using the newly added G/I tails as the priming sites. The resulting cDNA was hot-start PCR amplified using one of the two gene-specific forward primers (Supplementary Material and Data 4) and the universal 35-mers reverse primer that includes the poly(A) tails of the gene-of-interest. Finally, the PCR products are separated on an 2.5% agarose gel and images were analyzed with the ImageJ software. The length of the poly(A) tail was calculated as the sizes of poly(A) PCR-amplified products minus the calculated length of the gene-specific forward primer to the start of the templated poly(A) sequence.

Bioinformatics

Prediction of p34^cdc2 kinase phosphorylation sites was carried with the NetPhos 3.1 algorithm, which identifies candidate serine, threonine or tyrosine phosphorylation sites in eukaryotic proteins using ensembles of neural networks²⁸.

For phylogenetic tree of the poly(A) polymerases, sequences were aligned by the ClustalW algorithm using a BLOSUM matrix (BLOcks Substitution Matrix) and default alignment parameters²⁹. The phylogenetic tree was then drawn using FigTree software, using the distance-based neighbor-join method (http://tree.bio.ed.ac.uk/software/figtree/).

Statistics

The data analysis was carried out with GraphPad Prism software (version 8.4.3, GraphPad Software for Science Inc., San Diego, CA). Comparison between two groups was performed by two-tailed Student’s t-test of two groups or one-way ANOVA, adjusted by Dunnett’s Post Hoc Test to compare the between means of more than two groups. Results are means (n ≥ 4) ± standard deviation. Significant differences are indicated by asterisks (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001).

Results

C3P3-G1 is an artificial expression system that allows the autonomous synthesis of the mRNA of interest in the cytoplasmic compartment

The C3P3-G1 system relies on an artificial capping-prone DNA-dependent RNA polymerase that functions in the cytoplasmic compartment. Cytoplasmic expression circumvents the limiting dependence on access to the nucleus for exogenous gene expression. Furthermore, transcription by the artificial RNA polymerase C3P3 is under the control of the corresponding promoter sequence, while its cytoplasmic localization restricts illegitimate transcription of the nuclear genome of the host cell, as supported by Agilent oligonucleotide microarray assays (P. H. Jais, unpublished data).

We then focused on the kinetics of expression by the C3P3 system, which we studied by live-cell mRNA synthesis imaging using SmartFlare probes. These gold nanoparticles are conjugated to multiple copies of double-stranded Cy-5-tagged oligonucleotides that fluoresce upon hybridization with cellular target mRNA³⁰. Time-lapse live-cell imaging showed that the fluorescent signal was detectable as clusters of fluorescent dots as early as eight hours after transfection and gradually increased until the end of recording 48 h later (Supplementary Video).

Altogether, these results confirm that the synthesis of the target mRNA by the C3P3-G1 system takes place in the cytoplasmic compartment, which is an essential parameter to take into account for the development of an artificial poly(A) tail elongation process, such as the one developed here.

Several cytoplasmic PAPs from different kingdoms of life can significantly increase protein expression by the C3P3 system

Poly(A) tail on the 3′ end of mRNAs is important for mRNA stability and efficient translation initiation. In the C3P3-G1 system, a 40-adenosine residue poly(A) tail was added to mRNAs by incorporation of 40 adenosines into the DNA template followed by a trans-cleaving ribozyme from the hepatitis D virus. To develop a second generation C3P3 system that generates longer poly(A) tails to mRNAs synthesized by C3P3, we adapted the tethered function assay technique³¹. In this technique, the 3′UTR of the mRNA forms orthogonal non-covalent RNA–protein interaction. This technique has been used to decipher the role of proteins involved in the transport, localization or post-transcriptional processing of mRNA. Moreover, this technique was used to investigate the functioning of nuclear canonical poly(A) polymerase alpha (PAPα) and non-canonical cytoplasmic GLD-2 poly(A) polymerase^32,33.

Among the well-established orthogonal RNA–protein interaction systems, we chose the system from the lambdoid bacteriophage family because it has nanomolar affinity between short peptide sequences of N anti-termination protein and specific hairpin nucleotide sequences³⁴; the short sequences of both peptide and nucleotide were ideal for engineering into the C3P3 system. The tethering peptides consist of a 19–22 amino acid arginine-rich sequence found at the amino-terminus anti-terminator protein N from lambdoid bacteriophages. On the other hand, the tethered RNA sequences consist of 17–22 nucleotide leftward BoxBl and rightward BoxBr hairpins, in which 4 bases adopt a GNRA-like tetraloop structure, where N is any base and R is a purine³⁵. We used this RNA–protein interaction system to tether various candidate poly(A) polymerases that catalyze the template-independent sequential addition of adenosine monophosphate units from ATP to the 3′-terminal hydroxyl groups of RNAs, releasing pyrophosphate.

Poly(A) polymerases are found in various kingdoms of life. The phylogenetic tree of the poly(A) polymerases which will be tested below is depicted in Fig. 2. First, the canonical poly(A) polymerases from the eukaryotes, PAPα, β and γ, are mainly nuclear and have a tripartite structure as described hereinafter. Second, the non-canonical poly(A) polymerases from eukaryotic cells are essentially cytoplasmic and constitute a more heterogeneous group whose prototype is GLD-2. Third, poly(A) polymerases derived from eukaryotic viruses, in particular DNA viruses whose replication is cytoplasmic. Fourth, the poly(A) polymerases from prokaryotes, for which polyadenylation has a function radically different from that of eukaryotes because it is thought to act as a signal for transcriptional destabilization.

To develop an artificial post-transcriptional polyadenylation system, we fused the amino-terminal ends of candidate poly(A) polymerases to Nλ-tethering peptide from the λ bacteriophage, while four tandem BoxBr RNA hairpin also from λ bacteriophage were introduced in the 3′UTR of the Firefly luciferase reporter gene. HEK-293 human cells were co-transfected with these plasmids together with the C3P3-G1 plasmid. Following expression of the C3P3-G1 enzyme, transcripts are synthesized and ^m7GpppN-capped. Subsequently, the tethered poly(A) polymerases are recruited specifically to BoxBr from the target transcripts, which has the expected effect of lengthening their poly(A) tails. The effect of this post-transcriptional modification on reporter protein expression can be monitored by conventional luciferin oxidation assay (Fig. 3).

In a first screening step, we evaluated several wild-type poly(A) polymerases from eukaryotic viruses or mesophilic prokaryotes, which were selected for their predicted cytoplasmic localization, as well as for their optimum temperature close to 37 °C. Four of these poly(A) polymerases were from cytoplasmic DNA viruses, i.e. MG561 from Megavirus chiliensis³⁶, C475L from African swine fever virus³⁷, R341L from Acanthamoeba polyphaga mimivirus³⁶ and VP55 from Vaccinia virus³⁸. In addition, the prokaryotic PcnB poly(A) polymerase from Escherichia coli was tested³⁹. These enzymes were tethered to the Nλ-peptide as described above. All the poly(A) polymerases increased the protein expression of the reporter gene, the best results being obtained with the R341 and MG561 poly(A) polymerases which respectively increased the expression levels of the reporter protein by 1.80 and 1.67-fold respectively (P < 0.001; Fig. 4A, first and second series).

We also included in the initial screening two wild-type mammalian cytoplasmic poly(A) polymerases, which are found at least partially in the cytoplasmic compartment. First, the mouse PAPβ (also named PAPOLB) which is specifically expressed in testicular cells, where it lengthens the poly(A) tail of certain mRNAs⁴⁰. Secondly, the cytoplasmic non-canonical mouse GLD-2 poly(A) polymerase, which associates with the GLD-3 regulatory subunit to form a heterodimer and is mostly expressed in the central nervous system³². These two enzymes were tethered with the Nλ peptide and tested as described above. Mouse tethered PAPβ enhanced significantly the expression levels of the Firefly luciferase reporter protein by 1.37-fold, but surprisingly, the tethered mouse cytoplasmic GLD2 poly(A) polymerase had nearly no detectable activity (P < 0.001 and NS, respectively; Fig. 4A, third series).

PAPα increases protein expression by the C3P3 system when properly tethered and relocalized to the cytoplasm

Owing to encouraging results obtained with cytoplasmic poly(A) polymerases from various domains of life, we next evaluated if the same approach could also be developed to the canonical nuclear mammalian PAPα, which carries out the bulk of pre-mRNA polyadenylation in mammals. The mouse poly(A) polymerase α canonical isoform 1 (PAPα, also named PAPOLA) was chosen as a prototype because of its extensive structural and functional characterization enabling advanced engineering. As other canonical nuclear poly(A) polymerases, this enzyme has a tripartite structure (Fig. 4B): a nucleotidyl-transferase catalytic domain at the N-terminus, a central domain RNA binding region, and two C-terminal nuclear localization signals (NLS1 and NLS2) that surround a serine/threonine-rich region⁴¹. This C-terminal domain has several cyclin-dependent kinase phosphorylation sites, which finely regulate the enzymatic activity during the cell-division cycle⁴².

The C3P3 system being of cytoplasmic localization, the nuclear localization of the wild-type PAPα could represent a major obstacle to achieving high efficiency of polyadenylation in the cytoplasm⁴¹. To test if cytoplasmic localization of PAPα would enhance efficiency of cytoplasmic C3P3-based expression systems, one or both parts of the bipartite nuclear localization signal (NLS) were mutagenized. Four mutants were tested in this optimization step. First, in the mutant Nλ-PAPαm1, two lysine residues of NLS2 were substituted with arginine at residues at positions 656–657. This mutation inhibits the conjugation of PAPα by the protein SUMO 2/3 (Small Ubiquitin-like MOdifier), which results in relocation of PAPα to the cytoplasm⁴³. Second, in the Nλ-PAPαm2 mutant, the bipartite NLS2 was inactivated and has been shown to relocate bovine PAPα to the cytoplasm⁴¹. Third, in the mutant Nλ-PAPαm3, the entire carboxyl-terminal end was deleted, which included the NLS1 and NLS2, as well as the interaction domain with cleavage and polyadenylation specificity factor (CPSF) that is involved in the cleavage of the signaling region from pre-mRNA at their 3′-end. This regulatory region, which was found to be non-essential for the in vitro activity of mammalian PAPα, is found exclusively in vertebrates and absent in protists^33,41. Fourth, for Nλ-PAPαm4, four tandem nuclear export signals from HIV-1 were fused to the carboxy-terminus of PAPα in order to relocate PAPα to the cytoplasmic compartment⁴⁴.

When biological activity of these mutant tethered PAPα were tested, as expected, the nuclear wild-type nuclear tethered PAPα had virtually no effect on the expression of the Firefly luciferase reporter protein. Conversely, all four tethered PAPα mutants increased the expression of the reporter protein at varying levels (Fig. 4C, first series). Of these, Nλ-PAPαm1 gave the best results by increasing by 1.94-fold the expression of Firefly luciferase reporter protein compared to C3P3-G1 system only (P < 0.001). The Nλ-PAPαm1 mutant was therefore selected for further enhancement, all mutations described below containing this mutation.

Additional mutations of mouse PAPα that inactivate p34^cdc2/cyclin phosphorylation and increase enzymatic processivity enhance protein expression by the C3P3 system

The activity of vertebrate PAPα is tightly regulated through post translational modifications, and such regulation could inactivate the activity of tethered PAPα during certain phases of the cell cycle. We have therefore investigated whether modifications to the post-translational regulation of the tethered PAPα could have a favorable effect on the protein expression levels by the C3P3 system. Specifically, native PAPα has a C-terminal serine/threonine-rich domain with several consensus (S/T-P-X-K/R) and non-consensus (S/T-P-X-X) cyclin-dependent kinase sites, which can be phosphorylated by p34^cdc2/cyclin B during the M-phase of the cell cycle⁴⁵. Such cell cycle-related phosphorylation downregulates PAPα enzyme activity and is thought to contribute to the reduction of mRNA polyadenylation during M phase⁴². We hypothesized that the phosphorylation of mouse Nλ-tethered PAPα by p34^cdc2/cyclin B could reduce its enzymatic activity, and thereby decrease the polyadenylation lengthening of the target mRNA.

The role of PAPα phosphorylation by p34^cdc2/cyclin B on luciferase expression were evaluated by co-transfecting siRNA pools against human cyclin B (CCNB1) or p34^cdc2 (CDK1)_. Both siRNA pools significantly increased protein expression compared to a pool of non-targeting siRNAs when co-transfected with the Nλ-PAPαm1 and C3P3-G1 plasmids (P < 0.001 for both siRNA; Fig. 4D). To demonstrate that the enhancement of luciferase expression upon p34^cdc2/cyclin B silencing is specifically induced by tethered PAPα, the catalytically dead tethered PAPα was used under the same conditions. Co-transfection of siRNAs against human cyclin B or p34^cdc2 did not significantly enhance expression of the Firefly luciferase reporter protein context of the catalytically dead tethered PAPα, thus confirming the specificity of the observed effect.

To engineer PAPα to be insensitive to p34^cdc2/cyclin B, we generated Nλ-tethered PAPα with mutated p34^cdc2/cyclin B phosphorylation sites. Specifically, alanine substitution by site-directed mutagenesis was conducted on validated and/or predicted phosphorylatable serine residues in the C-terminal region of the mouse PAPα (Fig. 4C, second series). Two mutants were tested in this optimization step. First, in Nλ-PAPαm5, alanine residues were substituted into the three serine residues conserved between murine and bovine PAPα and previously confirmed to have biological function on PAPα activity in vivo⁴⁶. Secondly, in Nλ-PAPαm6, in addition to the above serine-to-alanine substitutions, four additional candidate phosphorylatable serine residues at non-consensus p34^cdc2/cyclin B sites predicted by the NetPhos 3.1 neural network algorithm were inactivated by alanine substitutions²⁸. Nλ-PAPαm5 was the most active mutant with 2.38-fold greater Firefly luciferase reporter protein expression compared to the C3P3-G1 system alone (P < 0.001), and was therefore selected for further optimization. In contrast, the Nλ-PAPαm6 mutant had much lower activity, which could possibly be explained by excessive structural modifications of the protein caused by additional mutations.

Finally with respect to optimizing PAPα, we evaluated whether mutations known to increase the processivity of the enzyme in vitro could also increase the level of protein expression by the C3P3 system (Fig. 4C, third series). Two mutants were tested in this final optimization step: Nλ-PAPαm7 and Nλ-PAPαm8, which correspond to F100I⁴¹ and R104A substitutions⁴⁷, respectively. Both mutations, which are located in a α-helical helix, are proposed to widen the access channel to the substrate and thus promote the influx of ATP or the release of pyrophosphate (PPi) after formation of the phosphodiester bonds⁴⁷. Nλ-PAPαm7 was the most active and increased expression of Firefly luciferase reporter protein 2.7-fold compared to no PAPα (P < 0.001). The Nλ-PAPαm7 enzyme was therefore selected as the final attached poly(A) polymerase, this term corresponding below to the enzyme used alone. Nλ-PAPαm7 is designated as C3P3-G2 when used as an independent module or assembled C3P3-G2 when fused to the C3P3-G1 enzyme in a single ORF.

Optimal protein expression by the C3P3 system is found with at least four BoxBr hairpin repeats from the λ virus properly spaced in the 3′UTR of the target mRNA

Having optimized the poly(A) polymerase, the next goals was to establish the best we then optimized the RNA hairpin sequences to tether poly(A) polymerase to the 3′ UTR of mRNAs. The λ virus genome contains two hairpins BoxBl and BoxBr of 17 nucleotides which differ by only one nucleotide. The effects of the substitution of 4xλBoxBr from the λ virus by 4xλBoxBl repeats were tested, the latter having a slightly greater in vitro affinity than the former for the Nλ peptide⁴⁸. No significant difference was observed between the BoxBr and BoxBl hairpins with canonical or extended stems with respect to Firefly luciferase reporter protein expression (Supplementary Fig. 4, first series).

With cis elements, the number of repeats can often affect the efficiency of recruiting trans elements. Accordingly, we evaluated the effect of varying the number of λBoxBr repeats on protein expression by the C3P3 system. From one-to-twelve λBoxBr repeats were introduced into the 3′UTR of the Firefly luciferase reporter gene and tested under the same conditions as before. A gradual increase in protein expression was observed up to four λBoxBr, after which a plateau was observed (Supplementary Fig. 4, second series), suggesting that four λBoxBr repeats are optimal.

Thirdly, we evaluated the effect of spacing between the λBoxBr hairpins, which we varied between 2 to 40 nucleotides. A marked decrease in protein expression was observed when the spacing between λBoxBr hairpins was reduced to just two nucleotides, which may be related to steric hindrance of Nλ-PAPαm7 binding to λ-BoxBr hairpins (Supplementary Fig. 4, third series). There was no virtually difference in Firefly luciferase activity when spacing was increased from 10 to 20 or 40 nucleotides.

The integrity of the complete Nλ-PAPαm7/4xλBoxBr system is necessary for increased protein expression by the C3P3 system

In order to confirm the specificity of the post-transcriptional lengthening of the poly(A) tail, we synthesized an inactive Nλ-PAPαm7 mutant by D115A substitution. This mutation impairs the catalytic site of the poly(A) polymerase and thereby reduces the in vitro enzymatic activity of wild-type bovine PAPα to less than 1%⁴⁷. The catalytically dead Nλ-PAPαm7[D115A], when used under the same conditions as above with the C3P3-G1 and pK1E-Luciferase-4xλBoxBr-A40 plasmids, led to no detectable change in Firefly luciferase reporter protein activity compared to the C3P3-G1 system alone, thus confirming the specificity of the observed effect (Supplementary Fig. 5A).

To further confirm the specificity of the Nλ-PAPαm7/4xλBoxBr system, the 4xλBoxBr hairpins were replaced with scrambled 4xscλBoxBr sequences. Again, no augmented luciferase activity was detected with the 4xscλBoxBr system compared to the C3P3-G1 system, thus confirming the importance of this sequence for the activity of the Nλ-PAPαm7/4xλBoxBr system (Supplementary Fig. 5B).

The Nλ-PAPαm7/4xλBoxBr system can increase protein expression by the C3P3 system in various types of mammalian cells and with various genetic constructs

While the Nλ-PAPαm7/4xλBoxBr significantly augmented expression of luciferase, it was important to establish that similar enhancements would occur with other exogenous gene expression constructs, and hence would hold broad utility towards exogenous gene expression applications. To this end, we evaluated the effect of Nλ-PAPαm7/4xλBoxBr on expression of enhanced green fluorescent protein (eGFP). Specifically, eGFP was inserted into the pK1E-gene of interest-4xλBoxBr-A40 plasmid and co-transfected with C3P3-G1, plus-or-minus Nλ-PAPαm7. Levels of eGFP in transfected cells were quantified by flow cytometric analysis. eGFP expression was 3.7 times higher when the Nλ-PAPαm7 plasmid was co-transfected with the C3P3-G1 plasmid (Supplementary Fig. 6A). We then investigated whether the Nλ-PAPαm7/4xλBoxBr system could also increase the expression of other constructs with various 5′UTR or 3′UTR sequences. These different constructions were cotransfected as described above with the CP3-G1 plasmid, with or without the Nλ-PAPαm7 plasmid. A significant increase in expression was observed when the Nλ-PAPαm7 plasmid was co-transfected regardless of the 5′UTR or 3′UTR sequence (Supplementary Fig. 6B,C). Altogether, these findings suggest that the Nλ-PAPαm7/4xλBoxBr system can be utilized with broad exogenous genes and UTRs, so long as the major components such as the K1E promoter and 4xλBoxBr cis elements are present.

We then tested whether the Nλ-PAPαm7/4xλBoxBr system could be used in other mammalian cell types, which seems likely due to the orthogonal nature of the RNA–protein interaction system. Mouse NIH-3T3 fibroblasts, Chinese hamster ovary CHO-K1 cells, and rat K-9 hepatocytes were transfected under the same conditions as described previously (Supplementary Fig. 7A–C). In all cell lines, co-expression of Nλ-PAPαm7 significantly increased the level of the reporter protein by 2-to-fourfold (P < 0.001). These findings confirm that the Nλ-PAPαm7/4xλBoxBr system is functional in mammalian cultured cells other than human. Noticeably, the C3P3 system has poor efficiency in some cell lines compared to a standard nuclear expression plasmid (e.g. NIH-3T3 and K9 cells), unlike other cell lines where its performance is greater than a standard nuclear expression plasmid (e.g. CHO-K1 and HEK-293). The reasons for such difference remain unclear; their understanding constituting an important avenue for future improvement of the C3P3 system.

We next investigated whether there were differences in cell proliferation and cytotoxicity mediated by the C3P3 system that could explain the expression level differences observed between cell lines (Supplementary Fig. 8A,B). The rate of cell proliferation and cytotoxicity was measured in HEK-293 and CHO-K1 cells transfected with wild-type or enzymatically dead C3P3-G1, wild-type or enzymatically dead Nλ-PAPαm7, and pK1E-Luciferase-4xλBoxBr-A40 plasmids. The rates of cell proliferation and cytotoxicity were roughly similar in these two cell lines, with no statistically significant differences between the different experimental conditions compared to the reference. These results suggest that the C3P3-G1 and Nλ-PAPαm7 enzymes per se, as well as the Firefly luciferase mRNA produced by these systems, have no obvious specific effect on cell growth or death, although more subtle effects cannot be excluded. Cellular toxicity and decreased cell proliferation are, however, observed in these different experimental conditions compared to the transfection reagent alone, which are attributable to the plasmid DNA itself or to traces of endotoxin. A possible solution to reduce these nonspecific effects could therefore rely on either the establishment of stable cell lines that produce the C3P3 enzyme constitutively or inducibly, or on the use of cell lines modified by editing certain genes involved in the innate cellular immune response to exogenous DNA such as the Toll-like membrane receptor 9 for unmethylated CpG DNA⁴⁹, or cyclic GMP-AMP synthase (cGAS)-STING pathway that senses DNA in the cytoplasm⁵⁰.

The Nλ/4xλBoxBr system compares favorably to other technological alternatives to polyadenylation

We then compared different technological alternatives that could have been used instead of or together with the present artificial polyadenylation system. First, in place of 3′-end mRNA polyadenylation, the metazoan replication-dependent histone mRNAs contain a conserved stem-loop sequence of 25–26 nucleotides, which binds to the nuclear stem-loop binding protein (SLBP). This complex is involved in all steps of histone mRNA metabolism, including processing, nuclear export, translation, and degradation⁵¹. Since the addition of a histone stem-loop downstream of a poly(A) track has been shown to potentiate the expression of synthetic mRNA vaccines⁵², we thought to compare such construct to the Nλ/4xλBoxBr system. The human H3 clustered histone 10 (H3C10) stem-loop was therefore introduced in the 3′UTR of the Firefly luciferase reporter plasmid with or without a 40-adenosine residue track. Tested under the same conditions as previously, the H3C10 stem-loop alone had almost the same effect as a 40-adenosine poly(A) track on the expression of the Firefly luciferase reporter gene (Supplementary Fig. 9A; ratio 1.06, NS). Moreover, an additive effect on reporter gene expression was observed when the H3C10 stem-loop was placed downstream of a 40-adenosine residue track (ratio 2.15; P < 0.001), but which was however significantly lower than that obtained with the Nλ/4xλBoxBr-A40 system (2.89-fold, P < 0.001). We finally tested the 3′UTR effect of an H3C10 stem-loop in a Firefly luciferase reporter plasmid containing a 4xλBoxBr-40 track. A marginal but non-significant increase in expression level was observed with this construct compared to a 4xλBoxBr-40 lane only (2.95 vs. 3.21-fold).

Several viruses have also developed functional replacement strategies for polyadenylation based on the use of highly structured 3′UTR sequences. For example, the 3′UTR of the S-segment of Bunyamwera orthobunyavirus⁵³ and Andes hantavirus⁵⁴ smRNA mediates efficient translation in the absence of a poly(A) tail through PABP-independent mechanisms, probably by forming a closed-loop mRNA by direct or indirect binding to eIF4G. Conversely, a conserved stem-loop from the non-polyadenylated 3′UTR of dengue virus type 2 mRNA ensures translation efficiency through its direct binding to PABP, thus forming a closed-loop mRNA⁵⁵. These 3′UTR sequences were introduced upstream of the poly(A) track of the plasmid pK1E-Luciferase-A40, then were tested as described previously. A significant increase in the expression of the Firefly luciferase reporter gene was found with the Bunyamwera orthobunyavirus 3′UTR (ratio 1.66; P < 0.001), but much lower than that obtained with the Nλ/4xλBoxBr-A40 system (2.89-fold, P < 0.001; Supplementary Fig. 9B). No significant effect was observed with the Andes hantavirus or dengue virus 3′UTR sequences.

We also tested the effect of adding a 10-residue polycytosine track placed immediately downstream of the polyadenosine track (A40C10) of the pK1E-Luciferase-A40 plasmid. Such homopolymeric sequence was indeed found to increase and prolong the expression of synthetic mRNA both in vitro and in vivo, presumably by blocking its deadenylation⁵⁶. Tested under the same conditions as previously in Firefly luciferase reporter gene construct without 4xλBoxBr repeat, a slight but non-significant increase in the expression of the Firefly luciferase reporter was indeed observed with the A40C10 compared to the A40 track (Supplementary Fig. 9C). Moreover, when A40C10 was inserted instead of A40 in a plasmid containing a 4xλBoxBr repeat, a slight but non-significant increase of expression in comparison to 4xλBoxBr-A10 was found (ratio of 2.77 and 2.95, respectively, P < 0.001).

The Nλ-PAPαm7/4xλBoxBr system lengthens the poly(A) tail of the transcripts synthesized by the C3P3 system and increases mRNA translatability

To better appreciate the extent of poly(A) elongation on transcripts by the Nλ-PAPαm7/4xλBoxBr system, we used the poly(A)-tailing assay developed by Kusov et al.²⁷. This PCR-based tailing method relies on the formation of a poly(A)-oligo(G) junction at the end of poly(A) tail of the target mRNA. The tailed-RNAs, converted to cDNA, is then amplified with a universal reward primer hybridizing at the poly(A)-oligo (G) junction and one of the two gene-specific forward primers tested (Fig. 5A). In the absence of Nλ-PAPαm7, a single band was mainly observed, corresponding to a poly(A) tail of 40 nucleotides (Fig. 5B; Supplementary Fig. 10, tracks 1). Conversely, a ladder of multiple bands was observed with Nλ-PAPαm7, therefore corresponding to a poly(A) tail of 40 to approximately 250 nucleotides (Fig. 5B; Supplementary Fig. 10, tracks 2). These findings therefore confirms that the Nλ-PAPαm7/4xλBoxBr system allows the elongation of the poly(A) tail of the transcripts synthesized with the C3P3 system.

We then focused on establishing the mechanisms that account for higher protein expression by the Nλ-PAPαm7/4xλBoxBr system. Poly(A) tails are well characterized to increase translatability by promoting the formation of closed loops. To investigate this hypothesis, we first measured by reverse transcription-quantitative PCR (RT-qPCR) the kinetics of the Firefly luciferase reporter mRNA synthesized by the C3P3 system with or without Nλ-PAPαm7/4xλBoxBr, and then compared it to that driven by the standard pCMVScript-Luciferase as a control. Firefly luciferase target reporter mRNA produced by C3P3-G1 with or without Nλ-PAPαm7 peaked at D2, unlike that produced by the pCMVScript which peaked at D1 (Fig. 6A). The C3P3-G1 plus Nλ-PAPαm7/4xλBoxBr system shows a slightly lower copy number of Firefly luciferase target mRNA than the C3P3-G1 system alone. On the other hand, Firefly luciferase AUC_D0–D6 target mRNA copy number calculated by linear trapezoid method was 5.91 and 4.64 times higher with C3P3-G1 and C3P3-G1 plus Nλ-PAPαm7 than with the standard pCMVScript plasmid, respectively. The expression of Firefly luciferase protein in the transfected HEK-293 cells was simultaneously measured by luciferin oxidation assay. Firefly luciferase protein expression peaked at D3 in C3P3-G1 transfected cells with or without Nλ-PAPαm7, unlike the pCMVScript plasmid which peaked at D2 after transfection. The C3P3-G1 plus Nλ-PAPαm7/4xλBoxBr plasmid showed an AUC_D0–D6 of Firefly luciferase bioluminescence 1.95 and 1.76 times higher than with the C3P3-G1 plasmid alone or the pCMVScript plasmid, respectively (Fig. 6B). Finally, the combination of the above results made it possible to calculate a translatability index, which we defined as the ratio AUC_D0-D6 luminescence/AUC_D0-D6 mRNA. This translatability index was 2.7 greater with the C3P3-G1 plus Nλ-PAPαm7 plasmids than with the C3P3-G1 plasmid alone (Fig. 6C). Nevertheless, compared to the standard pCMVScript expression plasmid, this translatability index was only 0.13 and 0.35 with respectively C3P3-G1 plasmid alone or together with the Nλ-PAPαm7 plasmid. These results therefore show that the PAPαm7/4xλBoxBr system clearly increases the translatability of mRNA synthesized by the C3P3 system, possibly by pseudo-circularization of the mRNA though the formation of closed-loops, but nevertheless there are still further advancements necessary to achieve the translatability levels exhibited by mRNA from CMV-driven nuclear expression.

To test the hypothesis that the Nλ-PAPαm7/4xBoxBr system enhances translatability through the formation of closed-loop with the poly(A) tail of the target mRNA, we evaluated the effect of proteins involved in such mRNA pseudo-circularization fused with Nλ tethering sequences, i.e. PABP which binds the poly(A) tail to mRNA 3′ ends, eIF4E which binds to the cap of mRNA 5′ ends and eIF4G which bridges these two proteins to form closed-loop mRNA. Tested under the same conditions as previously, expression of all three tethered proteins significantly increased Firefly luciferase reporter expression, although to a lesser degree than Nλ-PAPαm7, therefore supporting our initial hypothesis (Supplementary Fig. 11).

The C3P3-G2 enzyme encoded by a single open reading frame can be efficiently assembled by fusing the tethered poly(A) polymerase Nλ-PAPαm7 with the C3P3-G1 enzyme NP868R-(G4S)2-K1ERNAP(R551S) via an F2A ribosome skipping sequence

An important feature of the C3P3-G1 system is that the enzyme can be encoded by a single ORF, which makes it easier to use for certain applications, in particular therapeutics. We therefore attempted to generate a single ORF enzyme by in-frame fusion of the Nλ-PAPαm7 coding sequence to the amino-terminus of the NP868R-(G4S)2-K1ERNAP(R551S) C3P3-G1 enzyme. Conversely, reverse constructions by in-frame fusion to the carboxyl-terminus of C3P3-G1 were not tested because phage RNA polymerases do not tolerate carboxyl-terminal extension^57,58. We tested two types of constructions, either by ligation though a flexible (G₄S)₂ linker therefore resulting in the production of a single-unit protein, or through a ribosome skipping F2A sequence from the Aphtovirus (Fig. 7A). The 2A sequences found in different virus species, among which the F2A sequence from the foot-and-mouth disease virus, prevents the ribosome from covalently generate a glycyl-prolyl peptide bond at the C-terminus of the F2A and thereby continue protein translation leading to an apparent co-translational cleavage and a two subunits enzyme⁵⁹. These two constructions were tested under the same conditions as previously. Since higher expression of the Firefly luciferase reporter protein was found with the ribosome skipping F2A sequence than with the flexible linker (G4S)2, the former sequence was selected as the final assembled C3P3-G2 enzyme.

Previous findings have shown variable efficacy of pseudo-cleavage by F2A, either with a fraction of the F2A-containing proteins remaining as a read-through fusion, or with decreased expression of the downstream peptide due to ribosome fall-off⁶⁰. We therefore evaluated the efficacy of F2A-mediated pseudo-cleavage of the C3P3-G2 enzyme into two subunits using Western blot analysis. Each subunit was separately tagged with an in-frame 3xFLAG tag either at the carboxyl-terminal end of Nλ-PAPαm7, or at the amino-terminal end of the NP868R-(G4S)2-K1ERNAP(R551S) protein. The tagged Nλ-PAPαm7-3xFLAG was detected as a 97 kDa band (Fig. 7B, track 1), and for the tagged 3xFLAG-NP868R-(G4S)2-K1ERNAP(R551S) subunit (track 2) as a 202 kDa band (Fig. 7B, track 2). We then immunoblotted the C3P3-G2 protein in which two 3xFLAG tags were inserted in-frame: one immediately before the F2A sequence downstream of Nλ-PAPαm7-3xFLAG, the other immediately after the F2A sequence upstream of 3xFLAG-NP868R-(G4S)2-K1ERNAP(R551S) protein. Three major bands of 101 kDa (Nλ-PAPαm7-3xFLAG), 202 kDa (3xFLAG-NP868R-(G4S)2-K1ERNAP(R551S)), and 303 kDa (Nλ-PAPαm7-3xFLAG-F2A-3xFLAG-NP868R-(G4S)2-K1ERNAP(R551S) read-through protein) were detected (Fig. 7B, track 3). These results confirm that although the ribosome-skipping F2A allows the production of the two subunits as expected, it also induces the production of a single fusion protein by read-through, as well as a greater proportion of the upstream subunit than the downstream one due to ribosome fall-off. Other technical solutions remain to be explored to avoid such read-through and fall-off products, the use of mutant ubiquitin to replace 2A peptides appearing particularly attractive⁶¹.

Finally, we imaged the C3P3-G2 protein by immunofluorescence by introducing two different labels into each of the subunits, which were both found in the cytoplasmic compartment (Fig. 7C).

Discussion

We previously introduced the first generation of the C3P3 system, which to the best of our knowledge, is the first artificial non-viral system that enables in vivo autonomous production of mature mRNA³. Although this system is conceptually adaptable throughout the eukaryotic kingdom due to the conservation of the majority of mRNA post-transcriptional modifications, it has been mainly optimized by us for the human species and more generally mammals.

The C3P3-G1 system enables the production of mRNA having 5′-end ^m7GpppN cap, a post-transcriptional modification which is critical for mRNA translation. It relies on an artificial single subunit chimeric enzyme, which is formed by the fusion of a mutant DNA-dependent RNA polymerase from Enterobacteria bacteriophage K1E and the capping enzyme of the African swine fever virus. The RNA polymerase moiety of the C3P3-G1 enzyme allows the transcription of genes under control of its promoter, whereas its capping enzyme moiety ensures efficient ^m7GpppN capping of target mRNAs. Noticeably, a complete ^m7GpppN cap can be synthesized by the capping enzyme, which has all three enzymatic activities necessary for its synthesis: 5′-triphosphatase that removes the γ-phosphate residue of 5′-triphosphate mRNA end resulting in diphosphate 5′-terminus, a guanylyltransferase that transfers GMP from GTP to diphosphate 5′-terminus, and a N7-guanine methyltransferase that adds a methyl residue onto nitrogen-7 of guanine resulting in ^m7GpppN 5′-cap³.

Polyadenylation is another important post-translational mRNA modification, consisting of the addition of homopolymeric sequences at 3′-ends of mRNA, which is carried out by poly(A) polymerases. In mammalian cells, nuclear poly(A) polymerases are part of complexes of more than a dozen individual subunits that are physically coupled to RNA polymerase II via the carboxy-terminal domain of its largest subunit⁶². Upon recognition of polyadenylation signal sequences and enhancer elements on the pre-mRNA, this complex cleaves the pre-mRNAs and then mediates the synthesis of the poly(A) tail. Upon exiting the nucleus, the poly(A) tail is considered to have a fairly constant length of 250–300 adenosine in mammalian cells, but which actually appears to be quite variable depending on the species and cell type^{13,14,15,16,17}. Once in the cytoplasm, the length of the poly(A) tails then becomes very heterogeneous by deadenylation at variable rates, which is a key mechanism for regulating the translatability and stability of the mRNA^9,12.

Unlike nuclear polyadenylation, which is non-templated, that by the C3P3-G1 enzyme is templated, since it is produced by the transcription of a short track of 40 adenosines in the 3′UTR of the DNA template, followed by a trans-cleaving ribozyme from the hepatitis D virus. As the use of a longer adenosine track in the DNA templates did not provide an acceptable solution due to low production yields of the corresponding plasmids and the risk of plasmid recombination, we have opted for the development of a non-templated post-transcriptional polyadenylation system by an engineered poly(A) polymerase. To selectively enable polyadenylation of mRNA synthesized by the C3P3 system, we have developed an orthogonal RNA–protein interaction system that brings a mouse mutant PAPα fused to the N-peptide from the λ virus to the mRNA target in which 4xλBoxBr harpins were introduced the 3′UTR. As shown by the poly(A) tailing assay, this system makes it possible to extend the poly(A) tail to a variable length of up to about 250 nucleotides.

One of the mechanisms that can account for the increase in the level of protein expression by the Nλ-PAPαm7/4xλBoxBr system is the improvement of mRNA translatability. The poly(A) tail is indeed necessary for the formation of mRNA closed-loop which is a state of translation initiation resulting from the interaction of the poly(A) binding protein (PABP) bound to the 3′ poly(A) tail, eIF4E that binds to the 5′ capping and eIF4G that simultaneously binds eIF4E and PABP^63,64,65. This pseudo-circularization is thought to promote the engagement of terminating ribosomes to a new round of translation at the same mRNA molecule, thus accounting for the functional synergy between ^m7GpppN capping and 3′ poly(A) tail for enhancing protein translation. This has been well demonstrated by the pioneering study by Gallie et al., which has shown that the addition of poly(A₅₀) tail increases by 156-fold the protein expression of a Firefly luciferase reporter mRNA electroporated in CHO cells compared to ^m7GpppN capped mRNA without poly(A) tail, whereas polyadenylation has minimal effect in the absence of capping⁶⁶. This is in agreement with the clear increase in translatability of mRNAs synthesized with C3P3 by the Nλ-PAPαm7/4xλBoxBr system. However, translatability remained lower than that of mRNA produced by a transgene with a conventional nuclear promoter, suggesting that other translation-limiting factors are involved.

Several hypotheses could explain the low translatability of the transcripts produced by C3P3, even with the Nλ-PAPαm7/4xλBoxBr system, which are not necessarily mutually exclusive. First of all, translation can be repressed by a type-I interferon response, which can be triggered by the absence of certain post-transcriptional modifications of the transcripts synthesized by the C3P3 system. These modifications could include adenosine-to-inosine editing⁶⁷, internal modifications (e.g. N6-methyladenosine, 5-methylcytosine or pseudouridine)^68,69, cap1 or cap2 2′-O-ribose-methylation at the first or second nucleotide of nascent mRNA⁷⁰ or even reversible N6,2′-O-dimethyladenosine methylation at the first nucleotide⁷¹. Another molecular trigger of type-I interferon response could be the production of double-stranded RNA, which is an aberrant byproduct of transcription by phage RNA polymerases such as the one from the C3P3 enzyme⁷². Nevertheless, a massive interferon response would be difficult to reconcile with the quasi-normality of the polysomal profile observed with the C3P3-G1 system³, even if more subtle anomalies of translation initiation or elongation cannot be excluded. Second, the poor translatability of the newly synthesized RNA could be related to an inappropriate subcellular localization of the C3P3 enzyme. By analogy, the transcription and translation of nucleocytoplasmic large DNA viruses take place in virus-induced intracellular compartments called viral factories, where factors are recruited and concentrated, thus increasing the efficiency of the processes⁷³. The absence of accumulation of these factors in the subcellular region of the C3P3 system could contribute to the poor translatability of the target mRNA. Third, the mRNA synthesis by the C3P3 system could exhaust the cell’s resources of nutrients, energy or oxygen, which could impair protein translation. For example, RNA synthesis by the C3P3 system probably consumes large amounts of nucleotides and could thus induce depletion of the nucleotide pool, by decreasing mRNA synthesis not only, but also with the production of non-coding rRNAs which are the main components of ribosomes⁷⁴. Deciphering which mechanisms are involved is a crucial avenue to further improve the performance of the C3P3 system.

Polyadenylation also plays a central role in mRNA stability through the regulation of its decay⁸. Indeed, eukaryotic mRNA decay typically begins with poly(A) tail shortening by exonucleases, including the poly(A)-specific ribonuclease (PARN), Pan2/Pan3 complex and CCR4–NOT complex⁷⁵. Therefore, the length of the poly(A) tail is generally positively correlated with the half-life of mRNA⁷⁵. It is therefore likely that the same extension of mRNA half-life is conferred by the Nλ-PAPαm7/4xλBoxBr system, but unfortunately this hypothesis could not be tested in this present study due to the absence of a potent specific inhibitor of the phage RNA polymerase moiety of the enzyme C3P3.

The C3P3 system is a versatile platform technology well suited for many in cellulo applications, such as the rescue of wild-type or recombinant RNA viruses by reverse genetics. For example, the C3P3-G1 system was found to increase by 5–10-fold viral protein expression and by 50–100-fold reovirus titers in rescue experiments compared to the standard expression system⁷⁶. Additionally, C3P3-G1 has been shown to rescue > 80% of rotavirus strains by reverse genetics in MA104 N*V cells, in contrast to standard strategies that rescue only < 20% of strains^77,78. The C3P3 system can also be adapted for the production of recombinant lentivirus vectors, which remains problematic with conventional techniques due to their low yields and biosafety. Preliminary results show that the C3P3 system increases lentiviral vector production yields by at least four times compared to conventional techniques, as well as their predicted biosafety⁷⁹.

Furthermore, the C3P3 system is also of potential interest for the production of recombinant proteins in mammalian cells, for which quality and production yields are essential requirements. Although not yet tested on a large scale, the C3P3 system could meet these requirements for the following reasons. First, the C3P3 system was found to produce proteins of usual quality, which is consistent with the fact that this system has no specific effect on the cellular translation process. For example, human erythropoietin produced in human HEK-293 cells with the C3P3-G1 system was found to have normal functional activity, which requires correct protein folding and post-translational glycosylation³. Second, the current performance of the C3P3-G2 system is attractive for its use for bioproduction purposes in CHO and HEK-293, which are the main cell lines used for bioproduction⁸⁰. Third, the C3P3 enzyme acts as an expression master, which makes it possible to simultaneously produce several proteins necessary for the formation of multi-subunit complexes^76,77. Such multi-subunit proteins, and more particularly antibody-based therapies – monoclonal antibodies, bispecific antibodies and antibody–drug conjugates – have emerged as a major class of therapies accounting for 30% of all new drugs and 73% of biologics approved by the FDA in 2022⁸¹. Fourth, the C3P3-G1 enzyme is active in the cytoplasm or in the nuclear compartment³, therefore making the C3P3 system potentially usable for both transient and stable expression systems.

Still another attractive application of the C3P3 system is its use in vivo for human therapeutics. The therapeutic solution we are currently developing is based on the use of semi-synthetic DNA assembling the C3P3 gene, as well as one or more genes under the control of the C3P3 promoter. This non-viral system has the advantages of ease and speed of development, low production cost, high level of expression, and capacity to express simultaneously several proteins if needed. Actually, the delivery of such semi-synthetic DNA could certainly benefit from recent advances in the field of lipid nanoparticles for mRNA vaccination². This system will initially be tested for non-viral genetic compensation for acute diseases involving slow-cycling/resting organs such as the liver or respiratory tract. Another interesting therapeutic application is prophylactic vaccination against infectious diseases, which could benefit from the ability of the C3P3 system to simultaneously express multiple antigens for greater vaccine efficacy and/or immunization against different strains of pathogens⁸². Similarly, encouraging results for curative solid tumor immunotherapy with synthetic mRNA vaccines also highlight the importance of simultaneous expression of multiple neoantigens⁸³, making the C3P3 system attractive for this application.

In summary, we have successfully developed an artificial polyadenylation system that allows extension of the poly(A) tail of transcripts synthesized with the C3P3 system, thereby significantly increasing the performance of the system. This second generation constitutes an important step in the development of the C3P3 artificial expression system which encourages its continued improvement.

Data availability

The GenBank accession numbers (https://www.ncbi.nlm.nih.gov/genbank/) of the sequences of this article are OQ509033, OQ509034, OQ509035, OQ509036, OQ509037, OQ509038, OQ509039, OQ509040, OQ509041, OQ509042, OQ509043, OQ509044, OQ509045, OQ509046, OQ509047, OQ509048, OQ509049, OQ509050, OQ509051, OQ509052, OQ509053, OQ509054, OQ509055, OQ509056, OQ509057, OQ509058, OQ509059, OQ509060, OQ509061, and OQ509062.

Abbreviations

C3P3-G1 and C3P3-G2:: First and second generations of cytoplasmic capping-prone phage polymerase expression system
PAPα:: Poly(A) polymerase alpha

References

Bulcha, J. T., Wang, Y., Ma, H., Tai, P. W. L. & Gao, G. Viral vector platforms within the gene therapy landscape. Signal Transduct. Target. Ther. 6, 53 (2021).
Article CAS PubMed PubMed Central Google Scholar
Hou, X., Zaks, T., Langer, R. & Dong, Y. Lipid nanoparticles for mRNA delivery. Nat. Rev. Mater. 6, 1078–1094 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Jais, P. H. et al. C3P3-G1: First generation of a eukaryotic artificial cytoplasmic expression system. Nucleic Acids Res. 47, 2681–2698 (2019).
Article CAS PubMed PubMed Central Google Scholar
Kwapiszewska, K. et al. Nanoscale viscosity of cytoplasm is conserved in human cell lines. J. Phys. Chem. Lett. 11, 6914–6920 (2020).
Article CAS PubMed PubMed Central Google Scholar
Yao, J., Fan, Y., Li, Y. & Huang, L. Strategies on the nuclear-targeted delivery of genes. J. Drug Target 21, 926–939 (2013).
Article CAS PubMed PubMed Central Google Scholar
Ielasi, F. S. et al. Human histone pre-mRNA assembles histone or canonical mRNA-processing complexes by overlapping 3’-end sequence elements. Nucleic Acids Res. 50, 12425–12443 (2022).
Article CAS PubMed PubMed Central Google Scholar
Griesbach, E., Schlackow, M., Marzluff, W. F. & Proudfoot, N. J. Dual RNA 3’-end processing of H2A.X messenger RNA maintains DNA damage repair throughout the cell cycle. Nat. Commun. 12, 359 (2021).
Article CAS PubMed PubMed Central Google Scholar
Garneau, N. L., Wilusz, J. & Wilusz, C. J. The highways and byways of mRNA decay. Nat. Rev. Mol. Cell Biol. 8, 113–126 (2007).
Article CAS PubMed Google Scholar
Parker, R. & Song, H. The enzymes and control of eukaryotic mRNA turnover. Nat. Struct. Mol. Biol. 11, 121–127 (2004).
Article CAS PubMed Google Scholar
Beilharz, T. H. & Preiss, T. Widespread use of poly(A) tail length control to accentuate expression of the yeast transcriptome. RNA 13, 982–997 (2007).
Article CAS PubMed PubMed Central Google Scholar
Gallie, D. R. & Tanguay, R. Poly(A) binds to initiation factors and increases cap-dependent translation in vitro. J. Biol. Chem. 269, 17166–17173 (1994).
Article CAS PubMed Google Scholar
Passmore, L. A. & Coller, J. Roles of mRNA poly(A) tails in regulation of eukaryotic gene expression. Nat. Rev. Mol. Cell Biol. 23, 93–106 (2022).
Article CAS PubMed Google Scholar
Legnini, I., Alles, J., Karaiskos, N., Ayoub, S. & Rajewsky, N. FLAM-seq: Full-length mRNA sequencing reveals principles of poly(A) tail length control. Nat. Methods 16, 879–886 (2019).
Article CAS PubMed Google Scholar
Nicholson, A. L. & Pasquinelli, A. E. Tales of detailed poly(A) tails. Trends Cell. Biol. 29, 191–200 (2019).
Article CAS PubMed Google Scholar
Subtelny, A. O., Eichhorn, S. W., Chen, G. R., Sive, H. & Bartel, D. P. Poly(A)-tail profiling reveals an embryonic switch in translational control. Nature 508, 66–71 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Chang, H., Lim, J., Ha, M. & Kim, V. N. TAIL-seq: Genome-wide determination of poly(A) tail length and 3’ end modifications. Mol. Cell 53, 1044–1052 (2014).
Article CAS PubMed Google Scholar
Eisen, T. J. et al. The dynamics of cytoplasmic mRNA metabolism. Mol. Cell 77(786–799), e710 (2020).
Google Scholar
Mitschka, S. & Mayr, C. Context-specific regulation and function of mRNA alternative polyadenylation. Nat. Rev. Mol. Cell Biol. 23, 779–796 (2022).
Article CAS PubMed PubMed Central Google Scholar
Raab, D., Graf, M., Notka, F., Schodl, T. & Wagner, R. The GeneOptimizer Algorithm: Using a sliding window approach to cope with the vast sequence space in multiparameter DNA sequence optimization. Syst. Synth. Biol. 4, 215–225 (2010).
Article PubMed PubMed Central Google Scholar
Jackson, A. L. et al. Position-specific chemical modification of siRNAs reduces “off-target” transcript silencing. RNA 12, 1197–1205 (2006).
Article CAS PubMed PubMed Central Google Scholar
Chan, F. K., Moriwaki, K. & De Rosa, M. J. Detection of necrosis by release of lactate dehydrogenase activity. Methods Mol. Biol. 979, 65–70 (2013).
Article CAS PubMed PubMed Central Google Scholar
Jones, L. J., Gray, M., Yue, S. T., Haugland, R. P. & Singer, V. L. Sensitive determination of cell number using the CyQUANT cell proliferation assay. J. Immunol. Methods 254, 85–98 (2001).
Article CAS PubMed Google Scholar
Einhauer, A. & Jungbauer, A. The FLAG peptide, a versatile fusion tag for the purification of recombinant proteins. J. Biochem. Biophys. Methods 49, 455–465 (2001).
Article CAS PubMed Google Scholar
Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).
Article CAS PubMed PubMed Central Google Scholar
Hanke, T., Szawlowski, P. & Randall, R. E. Construction of solid matrix-antibody-antigen complexes containing simian immunodeficiency virus p27 using tag-specific monoclonal antibody and tag-linked antigen. J. Gen. Virol. 73(Pt 3), 653–660 (1992).
Article CAS PubMed Google Scholar
Lahm, H. et al. Live fluorescent RNA-based detection of pluripotency gene expression in embryonic and induced pluripotent stem cells of different species. Stem Cells 33, 392–402 (2015).
Article CAS PubMed Google Scholar
Kusov, Y. Y., Shatirishvili, G., Dzagurov, G. & Gauss-Muller, V. A new G-tailing method for the determination of the poly(A) tail length applied to hepatitis A virus RNA. Nucleic Acids Res. 29, E57-57 (2001).
Article CAS PubMed PubMed Central Google Scholar
Blom, N., Sicheritz-Ponten, T., Gupta, R., Gammeltoft, S. & Brunak, S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4, 1633–1649 (2004).
Article CAS PubMed Google Scholar
Chenna, R. et al. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 31, 3497–3500 (2003).
Article CAS PubMed PubMed Central Google Scholar
Golab, K. et al. Effect of serum on SmartFlare RNA Probes uptake and detection in cultured human cells. Biomed. J. Sci. Tech. Res. 28, 21788–21793 (2020).
PubMed PubMed Central Google Scholar
Coller, J. & Wickens, M. Tethered function assays: An adaptable approach to study RNA regulatory proteins. Methods Enzymol. 429, 299–321 (2007).
Article CAS PubMed Google Scholar
Kwak, J. E., Wang, L., Ballantyne, S., Kimble, J. & Wickens, M. Mammalian GLD-2 homologs are poly(A) polymerases. Proc. Natl. Acad. Sci. USA 101, 4407–4412 (2004).
Article ADS CAS PubMed PubMed Central Google Scholar
Dickson, K. S., Thompson, S. R., Gray, N. K. & Wickens, M. Poly(A) polymerase and the regulation of cytoplasmic polyadenylation. J. Biol. Chem. 276, 41810–41816 (2001).
Article CAS PubMed Google Scholar
Greenblatt, J., Nodwell, J. R. & Mason, S. W. Transcriptional antitermination. Nature 364, 401–406 (1993).
Article ADS CAS PubMed Google Scholar
Correll, C. C. & Swinger, K. Common and distinctive features of GNRA tetraloops based on a GUAA tetraloop structure at 1.4 A resolution. RNA 9, 355–363 (2003).
Article CAS PubMed PubMed Central Google Scholar
Priet, S., Lartigue, A., Debart, F., Claverie, J. M. & Abergel, C. mRNA maturation in giant viruses: Variation on a theme. Nucleic Acids Res. 43, 3776–3788 (2015).
Article CAS PubMed PubMed Central Google Scholar
Iyer, L. M., Balaji, S., Koonin, E. V. & Aravind, L. Evolutionary genomics of nucleo-cytoplasmic large DNA viruses. Virus Res. 117, 156–184 (2006).
Article CAS PubMed Google Scholar
Moure, C. M., Bowman, B. R., Gershon, P. D. & Quiocho, F. A. Crystal structures of the vaccinia virus polyadenylate polymerase heterodimer: Insights into ATP selectivity and processivity. Mol. Cell 22, 339–349 (2006).
Article CAS PubMed Google Scholar
Cao, G. J. & Sarkar, N. Identification of the gene for an Escherichia coli poly(A) polymerase. Proc. Natl. Acad. Sci. USA 89, 10380–10384 (1992).
Article ADS CAS PubMed PubMed Central Google Scholar
Kashiwabara, S. et al. Regulation of spermatogenesis by testis-specific, cytoplasmic poly(A) polymerase TPAP. Science 298, 1999–2002 (2002).
Article ADS CAS PubMed Google Scholar
Raabe, T., Murthy, K. G. & Manley, J. L. Poly(A) polymerase contains multiple functional domains. Mol. Cell Biol. 14, 2946–2957 (1994).
CAS PubMed PubMed Central Google Scholar
Colgan, D. F., Murthy, K. G., Prives, C. & Manley, J. L. Cell-cycle related regulation of poly(A) polymerase by phosphorylation. Nature 384, 282–285 (1996).
Article ADS CAS PubMed Google Scholar
Vethantham, V., Rao, N. & Manley, J. L. Sumoylation regulates multiple aspects of mammalian poly(A) polymerase function. Genes Dev. 22, 499–511 (2008).
Article CAS PubMed PubMed Central Google Scholar
Fischer, U., Huber, J., Boelens, W. C., Mattaj, I. W. & Luhrmann, R. The HIV-1 Rev activation domain is a nuclear export signal that accesses an export pathway used by specific cellular RNAs. Cell 82, 475–483 (1995).
Article CAS PubMed Google Scholar
Colgan, D. F., Murthy, K. G., Zhao, W., Prives, C. & Manley, J. L. Inhibition of poly(A) polymerase requires p34cdc2/cyclin B phosphorylation of multiple consensus and non-consensus sites. EMBO J. 17, 1053–1062 (1998).
Article CAS PubMed PubMed Central Google Scholar
Colgan, D. F. & Manley, J. L. Mechanism and regulation of mRNA polyadenylation. Genes Dev. 11, 2755–2766 (1997).
Article CAS PubMed Google Scholar
Martin, G., Jeno, P. & Keller, W. Mapping of ATP binding regions in poly(A) polymerases by photoaffinity labeling and by mutational analysis identifies a domain conserved in many nucleotidyltransferases. Protein Sci. 8, 2380–2391 (1999).
Article CAS PubMed PubMed Central Google Scholar
Austin, R. J., Xia, T., Ren, J., Takahashi, T. T. & Roberts, R. W. Designed arginine-rich RNA-binding peptides with picomolar affinity. J. Am. Chem. Soc. 124, 10966–10967 (2002).
Article CAS PubMed Google Scholar
Ohto, U. et al. Structural basis of CpG and inhibitory DNA recognition by Toll-like receptor 9. Nature 520, 702–705 (2015).
Article ADS CAS PubMed Google Scholar
Chen, Q., Sun, L. & Chen, Z. J. Regulation and function of the cGAS-STING pathway of cytosolic DNA sensing. Nat. Immunol. 17, 1142–1149 (2016).
Article CAS PubMed Google Scholar
Marzluff, W. F. & Koreski, K. P. Birth and death of histone mRNAs. Trends Genet. 33, 745–759 (2017).
Article CAS PubMed PubMed Central Google Scholar
Roth, N. et al. Optimised non-coding regions of mRNA SARS-CoV-2 vaccine CV2CoV improves homologous and heterologous neutralising antibody responses. Vaccines (Basel) 10, 25 (2022).
Google Scholar
Blakqori, G., van Knippenberg, I. & Elliott, R. M. Bunyamwera orthobunyavirus S-segment untranslated regions mediate poly(A) tail-independent translation. J. Virol. 83, 3637–3646 (2009).
Article CAS PubMed PubMed Central Google Scholar
Vera-Otarola, J. et al. The 3’ untranslated region of the Andes hantavirus small mRNA functionally replaces the poly(A) tail and stimulates cap-dependent translation initiation from the viral mRNA. J. Virol. 84, 10420–10424 (2010).
Article CAS PubMed PubMed Central Google Scholar
Polacek, C., Friebe, P. & Harris, E. Poly(A)-binding protein binds to the non-polyadenylated 3’ untranslated region of dengue virus and modulates translation efficiency. J. Gen. Virol. 90, 687–692 (2009).
Article CAS PubMed Google Scholar
Li, C. Y. et al. Cytidine-containing tails robustly enhance and prolong protein production of synthetic mRNA in cell and in vivo. Mol. Ther. Nucleic Acids 30, 300–310 (2022).
Article CAS PubMed PubMed Central Google Scholar
Mookhtiar, K. A., Peluso, P. S., Muller, D. K., Dunn, J. J. & Coleman, J. E. Processivity of T7 RNA polymerase requires the C-terminal Phe882-Ala883-COO- or “foot”. Biochemistry 30, 6305–6313 (1991).
Article CAS PubMed Google Scholar
Gardner, L. P., Mookhtiar, K. A. & Coleman, J. E. Initiation, elongation, and processivity of carboxyl-terminal mutants of T7 RNA polymerase. Biochemistry 36, 2908–2918 (1997).
Article CAS PubMed Google Scholar
Donnelly, M. L. L. et al. The “cleavage” activities of foot-and-mouth disease virus 2A site-directed mutants and naturally occurring “2A-like” sequences. J. Gen. Virol. 82, 1027–1041 (2001).
Article CAS PubMed Google Scholar
Liu, Z. et al. Systematic comparison of 2A peptides for cloning multi-genes in a polycistronic vector. Sci. Rep. 7, 2193 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Varshavsky, A. Ubiquitin fusion technique and related methods. Methods Enzymol. 399, 777–799 (2005).
Article CAS PubMed Google Scholar
Hsin, J. P. & Manley, J. L. The RNA polymerase II CTD coordinates transcription and RNA processing. Genes Dev. 26, 2119–2137 (2012).
Article CAS PubMed PubMed Central Google Scholar
Shirokikh, N. E. & Preiss, T. Translation initiation by cap-dependent ribosome recruitment: Recent insights and open questions. Wiley Interdiscip. Rev. RNA 9, e1473 (2018).
Article PubMed Google Scholar
Archer, S. K., Shirokikh, N. E., Hallwirth, C. V., Beilharz, T. H. & Preiss, T. Probing the closed-loop model of mRNA translation in living cells. RNA Biol. 12, 248–254 (2015).
Article PubMed PubMed Central Google Scholar
Alekhina, O. M., Terenin, I. M., Dmitriev, S. E. & Vassilenko, K. S. Functional cyclization of eukaryotic mRNAs. Int. J. Mol. Sci. 21, 25 (2020).
Article Google Scholar
Gallie, D. R. The cap and poly(A) tail function synergistically to regulate mRNA translational efficiency. Genes Dev. 5, 2108–2116 (1991).
Article CAS PubMed Google Scholar
Nishikura, K. A-to-I editing of coding and non-coding RNAs by ADARs. Nat. Rev. Mol. Cell Biol. 17, 83–96 (2016).
Article CAS PubMed Google Scholar
Meyer, K. D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3’ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Article CAS PubMed PubMed Central Google Scholar
Anderson, B. R. et al. Incorporation of pseudouridine into mRNA enhances translation by diminishing PKR activation. Nucleic Acids Res. 38, 5884–5892 (2010).
Article CAS PubMed PubMed Central Google Scholar
Daffis, S. et al. 2’-O methylation of the viral mRNA cap evades host restriction by IFIT family members. Nature 468, 452–456 (2010).
Article ADS CAS PubMed PubMed Central Google Scholar
Mauer, J. et al. Reversible methylation of m(6)A(m) in the 5’ cap controls mRNA stability. Nature 541, 371–375 (2017).
Article ADS CAS PubMed Google Scholar
Dousis, A., Ravichandran, K., Hobert, E. M., Moore, M. J. & Rabideau, A. E. An engineered T7 RNA polymerase that produces mRNA free of immunostimulatory byproducts. Nat. Biotechnol. 41, 560–568 (2023).
Article CAS PubMed Google Scholar
Schmid, M., Speiseder, T., Dobner, T. & Gonzalez, R. A. DNA virus replication compartments. J. Virol. 88, 1404–1420 (2014).
Article PubMed PubMed Central Google Scholar
Pelletier, J. et al. Nucleotide depletion reveals the impaired ribosome biogenesis checkpoint as a barrier against DNA damage. EMBO J. 39, e103838 (2020).
Article CAS PubMed PubMed Central Google Scholar
Weill, L., Belloc, E., Bava, F. A. & Mendez, R. Translational control by changes in poly(A) tail length: Recycling mRNAs. Nat. Struct. Mol. Biol. 19, 577–585 (2012).
Article CAS PubMed Google Scholar
Eaton, H. E. et al. African swine fever virus NP868R capping enzyme promotes reovirus rescue during reverse genetics by promoting reovirus protein expression, virion assembly, and RNA incorporation into infectious virions. J. Virol. 91, JVI.0241602416-02416 (2017).
Article Google Scholar
Sanchez-Tacuba, L. et al. An optimized reverse genetics system suitable for efficient recovery of simian, human, and murine-like rotaviruses. J. Virol. 94, 25 (2020).
Article Google Scholar
Kawagishi, T. et al. Mucosal and systemic neutralizing antibodies to norovirus induced in infant mice orally inoculated with recombinant rotaviruses. Proc. Natl. Acad. Sci. USA 120, e2214421120 (2023).
Article CAS PubMed PubMed Central Google Scholar
Jais, P. H. & LeBoulch, M. In International Society for Cell and Gene Therapy Annual Meeting (Cytotherapy, ed.), Vol. 25, S186 (Cytotherapy, 2023).
Dumont, J., Euwart, D., Mei, B., Estes, S. & Kshirsagar, R. Human cell lines for biopharmaceutical manufacturing: History, status, and future perspectives. Crit. Rev. Biotechnol. 36, 1110–1122 (2016).
Article CAS PubMed Google Scholar
Mullard, A. 2022 FDA approvals. Nat. Rev. Drug Discov. 22, 83–88 (2023).
Article CAS PubMed Google Scholar
Freyn, A. W. et al. A multi-targeting, nucleoside-modified mRNA influenza virus vaccine provides broad protection in mice. Mol. Ther. 28, 1569–1584 (2020).
Article CAS PubMed PubMed Central Google Scholar
Rojas, L. A. et al. Personalized RNA neoantigen vaccines stimulate T cells in pancreatic cancer. Nature 618, 144–150 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We would like to thank Dr Valérie Gratio for her technical help with the flow cytometry tests (Inflammation Research Center CRI-U1149 Cytometry Platform, X. Bichat University, France), as well as Dr Samira Benadda, for her help with the SmartFlare mRNA imaging (Inflammation Research Center CRI-U1149, IMA’CRI Platform, X. Bichat University, France).

Funding

This work was supported by Eukarÿs SAS.

Author information

Authors and Affiliations

Eukarÿs SAS, Pépinière Genopole, 4 rue Pierre Fontaine, Genopole Entreprises Campus 3, 4 Rue Pierre Fontaine, 91000, Evry-Courcouronnes, France
Marine Le Boulch & Philippe H. Jaïs
Institut de Chimie des Substances Naturelles, CNRS UPR2301, Université Paris-Saclay, Avenue de la Terrasse, 91198, Gif-Sur-Yvette, France
Eric Jacquet & Naïma Nhiri
Medical Microbiology and Immunology, Li Ka Shing Institute of Virology, University of Alberta, 6-142J Katz Group Centre for Pharmacy and Health Research, 114 Street NW, Edmonton, AB, T6G 2E1, Canada
Maya Shmulevitz

Authors

Marine Le Boulch
View author publications
You can also search for this author in PubMed Google Scholar
Eric Jacquet
View author publications
You can also search for this author in PubMed Google Scholar
Naïma Nhiri
View author publications
You can also search for this author in PubMed Google Scholar
Maya Shmulevitz
View author publications
You can also search for this author in PubMed Google Scholar
Philippe H. Jaïs
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.L.B. has performed the experiments, analyzed the data and edited the manuscript. E.J. and N.N. have performed the RT-qPCR analysis. P.J. and M.S. has designed the experiments, analyzed the data and written the manuscript, M.S. has edited the manuscript. All authors contributed to the article and approved the submitted version.

Corresponding author

Correspondence to Philippe H. Jaïs.

Ethics declarations

Competing interests

Eukarÿs SAS has filed a patent application related to this technology (International Application No. PCT/EP2018-070479).

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1.

Supplementary Information 2.

Supplementary Video 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Le Boulch, M., Jacquet, E., Nhiri, N. et al. Rational design of an artificial tethered enzyme for non-templated post-transcriptional mRNA polyadenylation by the second generation of the C3P3 system. Sci Rep 14, 5156 (2024). https://doi.org/10.1038/s41598-024-55947-0

Download citation

Received: 22 October 2023
Accepted: 29 February 2024
Published: 02 March 2024
DOI: https://doi.org/10.1038/s41598-024-55947-0

Keywords

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.