Identification of a nuclear localization signal in the Plasmodium falciparum CTP: phosphocholine cytidylyltransferase enzyme

The phospholipid biosynthesis of the malaria parasite, Plasmodium falciparum, is a key process for its survival and its inhibition is a validated antimalarial therapeutic approach. The second and rate-limiting step of the de novo phosphatidylcholine biosynthesis is catalysed by CTP: phosphocholine cytidylyltransferase (PfCCT), which has a key regulatory function within the pathway. Here, we investigate the functional impact of the key structural differences and their respective roles in the structurally unique pseudo-heterodimer PfCCT protein in a heterologous cellular context using the thermosensitive CCT-mutant CHO-MT58 cell line. We found that a Plasmodium-specific lysine-rich insertion within the catalytic domain of PfCCT acts as a nuclear localization signal and that its deletion decreases the nuclear propensity of the protein in the model cell line. We further showed that the putative membrane-binding domain also affects the nuclear localization of the protein. Moreover, activation of phosphatidylcholine biosynthesis by phospholipase C treatment induces the partial nuclear-to-cytoplasmic translocation of PfCCT. We additionally investigated the cellular function of several truncated PfCCT constructs in a CHO-MT58 based rescue assay. In the absence of the endogenous CCT activity, we observed that truncated constructs lacking the lysine-rich insertion or the membrane-binding domain provided a cell survival ratio similar to that of the full length PfCCT protein.

Figure 1. Molecular architecture and regulation of the CCT protein. (A) The conceptual regulatory mechanism of CCT on the macromolecular level. In the inactive state (left), an autoinhibitory (AI) helix in the membrane-binding (M, grey) domain inhibits the catalytic domain (C, blue). The decreased PC content alters the physicochemical properties of membranes. This induces a conformational change in the M domain, turning the AI helix into the membrane-induced amphipathic helix (m-AH) that docks into the PC-depleted membrane surface and thereby releases the inhibition (right) 16. (B) The schematic representation of mammalian CCT proteins and the truncated protein constructs of PfCCT used in this study. N, C, M and P represent the N-terminal cap region, the catalytic domain, the membrane-binding domain and the region of phosphorylation, respectively. C1 and C2 constructs solely contain the catalytic domain of the first and second repeat unit of PfCCT, respectively. C1M1 and C2M2 also include the respective putative membrane-binding domains of each repeat unit. ΔK constructs lack the respective lysine-rich loops (red segments noted as K).

Results
Identification of a potential nuclear localization signal in PfCCT. The sequence alignment of PfCCT with two mammalian orthologs, rat CCT (Rattus norvegicus, RnCCT) and human CCT (Homo sapiens, HsCCT), shows a highly conserved catalytic domain with one major exception (Fig. 2A). An 18 amino acid-long, lysine-rich insertion is present in the catalytically important L5 loop in both the first and second catalytic domains (C1 and C2) of PfCCT 11. The L5 loop is responsible for the coordination of the quaternary ammonium moiety of choline with residues Y131/Y714, N133/N716 and Y158/Y741 in C1/C2 and contributes to the formation of the composite aromatic box cleft 43,44. We generated a homology model of PfCCT to complete the recently resolved crystal structure (PDB: 4ZCS, Fig. 2B) 44 by visualizing the 720-737 Lys-rich insertion that was deleted from the crystallized construct. The insertion appears as a highly flexible region, characterized by low QMEAN local quality scores with an average of 0.4. Therefore, the homology model shows only a representative position of this dynamic region. Notably, the deletion of the lysine-rich segment had no significant impact on the in vitro constitutive enzyme activity of a catalytic domain construct PfCCT 528-795 12. However, other potential functions might be related to this segment. Firstly, the complex regulatory mechanism of CCT enzymes includes several dynamic conformational changes, which could be altered by the lysine-rich loop. In addition, its presence may provide an explanation for the unique diffuse cellular localization of PfCCT. As the large molecular weight (~ 105 kDa) of PfCCT excludes the possibility of its passive nuclear transport, we hypothesize that the presence of a nuclear localization signal (NLS) in the protein is indispensable for nuclear accumulation.
In silico predictions with cNLSmapper showed the possibility of a bipartite NLS in the lysine-rich loop with a score of 5.8, which indicates a weak NLS with the potential to partially relocate the protein to the nucleus. Based on its amino acid composition as well as the homology model of the PfCCT protein, the lysine-rich loop is expected to be a disordered and solvent-accessible region, which facilitates its recognition by Plasmodial nuclear import proteins. Since the nuclear import machinery must be able to access the signal peptide to effectively carry its cargo into the nucleus, the lysine-rich loop is a valid candidate for an NLS function. Additionally, the Lys-rich insert might associate with negatively charged membrane surfaces, given its pronounced basic character.
Two additional NLS predictions were found between residues 264-293 and 823-851 with scores of 5.7 and 5.2, respectively. These regions both overlap with the presumed C-terminal peptide of the membrane-induced amphipathic helix of the M domain (m-AH-C, Fig. 2C). It was previously reported that the M domain is required for the shuttle between the nucleus and the cytoplasm 36; therefore, we analysed the effect of the truncation of the M domain on the localization of PfCCT.
To elucidate the impact of the proposed regions on the nuclear localization, confocal microscopy-based colocalization analysis was applied. Both the full length PfCCT and the second repeat unit construct, C2M2, have a mixed nuclear-cytoplasmic localization in CHO-MT58 cells (Fig. 3A,B). However, the lysine-rich loop truncated C2M2ΔK construct showed a predominantly cytosolic localization, which indicates that this insertion acts as a nuclear localization signal (Fig. 3C). Furthermore, the membrane-binding domain truncated second repeat unit construct, C2, also showed a decreased nuclear localization compared to the C2M2 construct and the full length PfCCT protein (Fig. 3D,H,M). We also investigated the consequences of a double truncated C2ΔK construct, which showed a decreased nuclear accumulation similar to the C2 and C2M2ΔK constructs (Fig. 3E,K).
We wanted to further investigate the role of these segments in the translocation mechanism. For this, we used phospholipase C (PLC), an enzyme that hydrolyses PC in the cellular membrane structures and therefore simulates a PC-depleted membrane status in the cells 48,49. The subcellular localization of the full length PfCCT showed a peculiar pattern upon treatment, as the protein distribution became predominantly cytosolic (Fig. 3G). We observed the same change in distribution in the case of the C2M2 construct (Fig. 3H). To test whether the relocation could be associated with the nuclear localization signal of PfCCT, we created a simian virus 40 (SV40) NLS-tagged C2M2 construct as a positive control. Interestingly, this construct did not relocate to the cytosol
Figure 2. In silico analysis of PfCCT-specific protein segments. (A) The protein sequence alignment of the rat (RnCCT), human (HsCCT) and the two CCT repeat units of P. falciparum CCT (PfCCT_1 and PfCCT_2). Red boxes and red letters indicate the identical and similar amino acid residues within the catalytic domain, respectively. Blue box shows the Plasmodium-specific lysine-rich loop. Green, blue and yellow letters represent acidic, basic and hydrophobic residues in the membrane-binding domain, respectively. The highly conserved catalytic domain is highlighted above the sequences (red) and a white line indicates the position of the L5 loop. The membrane-induced amphipathic helix of RnCCT (brown box) with the determined autoinhibitory helix (red line) is shown for reference. The two previously hypothesized N- and C-terminal membrane-induced amphipathic helices of the PfCCT M domain are highlighted under the sequence with yellow boxes 24. Alignment was generated with ESPript 3.0 45. (B) Homology model of the active site of PfCCT with one potential representative conformation of the flexible lysine-rich loop in a close-up view. Catalytically important residues in the proximity of the L5 loop are highlighted, based on the crystal structure of PfCCT (green, PDB: 4ZCS). The position of CDPCho (yellow) is shown in the active site. Blue colored line shows the position of the lysine residues on the main chain of the L5 loop (grey). (C) Helical representations of the putative membrane-induced amphipathic helices in the membrane-binding domain of the first and second repeat unit, respectively, were made with HeliQuest 46. <µH> represents the hydrophobic moment of each helix. The abundance of basic residues (blue) supports the higher affinity of membrane binding towards negatively charged membrane surfaces, whilst a few acidic (red) and the vast majority of hydrophobic residues (grey) found in the longer m-AH-C helices supposedly have a role in the detection of local H+ accumulation and support the docking into the membrane leaflets 47.
Functional impact of the lysine-rich loop on the rescue potential. To evaluate the functionality of the different constructs in a cellular environment, we exploited the thermosensitive nature of the CCT-mutant CHO-MT58 cell line. Notably, these cells behave similarly to the parental CHO-K1 cell line at 37 °C; however, they are not viable at 40 °C, mainly due to the thermal destabilizing effect of the CCT mutation 50,51. Transient transfection of CHO-MT58 cells with the full length PfCCT protein rescued the cells with a rescue potential of 55.9 ± 2.6% (Fig. 4), which is in good agreement with previously reported data 42. We also used the inactive (IA) mutant PfCCT H45N H630N designed previously 42 as a negative control, which has a rescue potential of 11.7 ± 2.5%. Among the constructs from the first repeat unit, C1M1 and C1 showed no significant difference in terms of cell survival, with a respective rescue potential of 46.2 ± 4.9% and 52.2 ± 10.9%. The second repeat constructs, C2M2 and C2, gave similar results with a rescue potential of 45.6 ± 3.3% and 44.9 ± 7.5%, respectively. This suggests that the truncation of the putative M domain had no significant impact on the in vivo functionality of either the first or the second pseudo-monomer of the PfCCT protein. The deletion of the lysine-rich loop also did not perturb the activity, as reflected in a rescue potential of 52.2 ± 10.9% and 47.7 ± 6.7% for C1M1ΔK and C2M2ΔK, respectively. Remarkably, the double truncated constructs C1ΔK and C2ΔK displayed a difference regarding their activity. While the second repeat unit construct C2ΔK had a rescue potential of 52.2 ± 14.1%, the first repeat unit construct C1ΔK had around half this value, 30.8 ± 9.3%. Furthermore, the mammalian CCT control Homo sapiens CCTα (HsCCTα) was also found to be less potent in rescuing the cells, as reflected by its rescue potential of 28.5 ± 7.1%.
While the underlying reason for these different rescue efficiencies remains unclear, the rescue potential of all constructs tested is notably and significantly (p < 0.01) higher than that of the inactive control PfCCT H45N H630N, indicating that the rescue depends on CCT enzyme activity. We hypothesize that this variability is due to the different transfection efficiencies of the constructs in this study.

Discussion
Plasmodium falciparum CCT has long been a focus of research as a potential drug target due to its regulatory role in the phosphatidylcholine de novo biosynthetic pathway. The duplication of catalytic and membrane-binding domains as well as the presence of Plasmodium-specific segments distinguishes PfCCT from its metazoan CCT homologs, and the functional relevance of many of its traits is yet unexplored. While structural and biochemical studies delineated the catalytic mechanism of PfCCT 12,43,44, the precise regulatory mechanism, including membrane binding and compartmentalization, is less understood for this enzyme. Here, we used a mammalian cellular system based on the thermosensitive CCT-mutant CHO-MT58 to evaluate the function of the Plasmodium-specific structural elements of the PfCCT protein. The mammalian pathogens of the Plasmodium genus contain a typically lysine-rich insertion in the catalytic domain of their CCT protein that is absent in higher eukaryotic organisms (Fig. 5A). Intriguingly, other protists of the Apicomplexan phylum also possess similar insertions, albeit with fewer basic residues. Notably, this insertion is conserved in both repeat units of all Plasmodial CCTs, except the P. berghei CCT, which contains an insertion in its C-terminal repeat unit that is less abundant in lysine residues and primarily contains non-charged asparagine residues (Fig. 5B). We investigated whether the positively charged cluster of lysine residues in PfCCT would allow a potential nuclear localization signal role for this segment. The full length PfCCT protein displays a diffuse localization pattern in both schizont stage parasites 41 and the studied CHO-MT58 cell line (present study, Fig. 3). The large molecular mass of the pseudo-heterodimer PfCCT protein prevents its passive nuclear transport; thus, the presence of a nuclear localization signal is required for nuclear targeting of this protein.
Our confocal microscopy analysis shows that the lysine-rich loop has a nuclear localization signal function, as the lysine-rich loop truncated constructs C2ΔK and C2M2ΔK were primarily found in the cytoplasm, in contrast to the full length PfCCT and the C2M2 constructs (Fig. 3). A possible explanation for why this lysine-rich motif is present in many of the Plasmodium species is the AT-rich genome and its tendency to generate indels, possibly due to slippage of DNA polymerase over the AT-rich, low complexity genomic regions, resulting in a biased use of AT-rich codons, e.g. those of lysine or asparagine 53,54. Nevertheless, it is intriguing that a similar insertion is visible in the CCT protein of Toxoplasma gondii, a taxonomically close relative of Plasmodium species, despite the 52% GC content of its genome 55. Here, DNA polymerase slippage-driven emergence of the insert is less likely, which in turn raises the possibility of a functional relevance of this insert in parasites. Notably, a similar insertion with importin-dependent NLS function was also reported previously in the Plasmodium falciparum trimethyl-guanosine synthase enzyme 56. Additionally, low complexity, Lys-rich repetitive regions of Plasmodial proteins are reported to modulate protein targeting into the periphery of infected red blood cells 57. The reason why the presence of a Lys-rich insertion is advantageous for the parasite is still unclear and requires further characterization. Additionally, we investigated how the PfCCT-expressing CHO-MT58 cells respond to PLC treatment that perturbs the phosphatidylcholine homeostasis. Our confocal microscopy analysis revealed a significant decrease in the nuclear fraction of both the full length PfCCT and the construct C2M2 following PLC treatment. In comparison, the PfCCT C2 construct that lacks the M domain displayed a decreased nuclear compartmentalization and the PLC treatment had no further effect on its localization.
It was previously demonstrated that farnesol- or oleate-induced nuclear trafficking of exogenous CCT proteins in CHO-MT58 cells is dependent on the amphipathic helix within the M domain 36. Based on our results, the putative M domain of PfCCT also influences nuclear-to-cytoplasmic localization, though its precise function is still unclear as we found no evidence of translocation occurring in either M domain truncated construct. A notable difference in the function of the M domain was highlighted previously, as it was proposed that it consists of two shorter membrane-induced amphipathic helices 24. Furthermore, this disparity is linked to a mere sixfold activation in the presence of oleate 27 that is in sharp contrast to the robust, up to 100-fold activation increase in the case of rat CCT, accounting for a less fine-tuned PfCCT regulatory mechanism 58.
To evaluate the impact of the presence of the M domain on the exogenous PfCCT activity in a cellular environment, we transiently transfected the CHO-MT58 cells with several truncated constructs and assessed their effect on cell survival at the restrictive temperature of 40 °C. Our findings suggest that neither the lysine-rich insert in the catalytically important L5 loop nor the putative regulatory M domain is essential to rescue the cells. Nonetheless, C1ΔK showed a slightly lower capability of rescuing the thermosensitive CHO-MT58 cells, though it was still significantly more potent than our inactive PfCCT control. Moreover, we showed that HsCCTα was also capable of rescuing the cells, which further supports the previously established concept of testing PfCCT-specific drugs by creating two transgenic CHO-MT58 cell lines, one expressing HsCCT and one expressing PfCCT, and using them as a cell-based test system for structure-function relation studies in drug development targeting the PC biosynthetic pathway 42.
We conclude that PfCCT has adapted to provide a unique robustness in its functionality and has a key role in supporting the excessive need for PC biosynthesis during the intraerythrocytic development. Its unique membrane-binding domain as well as the Lys-rich insertion may contribute to the regulation of the parasite PC homeostasis and intracellular transport throughout the widely diverse developmental stages. As confocal imaging did not reveal any distinct membrane compartmentalization following PLC treatment, it would be of further interest to investigate whether the enzyme could serve the increased need for PC biosynthesis in the absence of the membrane-binding induced regulatory mechanism throughout the intraerythrocytic life cycle of parasites. A notable limitation of our results, combined from several independent experiments, originates from the mammalian cell line-based model system. The true functional relevance of PfCCT-specific structural elements has to be further assessed by in vivo experiments in parasites.
In silico analysis. Multiple sequence alignment was carried out by COBALT (NIH NCBI) with default parameters. To visually represent the alignment, ESPript 3.0 was used and manually annotated with the relevant secondary structure elements and domains. Homology model of PfCCT including the lysine-rich loop was created with SwissModel, based on the crystal structure of the PfCCT catalytic domain (PDB: 4ZCS) 44 and was assessed using the Structure Assessment tool of SwissModel. Nuclear localization signal prediction was carried out with cNLSmapper for the entire protein with a cut-off score of 5.0. Helical representation of the different structural elements was made with HeliQuest using 3-11 helical structure 46. Phylogenetic tree was generated with the Interactive Tree of Life (iTOL) tool 52 after multiple sequence alignment of the different proteins using ClustalOmega.
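For readers unfamiliar with the <µH> value reported for the helical representations, the following minimal sketch shows how a mean hydrophobic moment is commonly computed from per-residue hydrophobicity values, as the length of the vector sum of residue hydrophobicities placed around the helix axis, divided by the number of residues. It is an illustration only and does not reproduce the HeliQuest settings used in this study: the function name, the caller-supplied hydrophobicity values and the default 100° per-residue rotation (the value for an ideal α-helix) are assumptions.

```python
import math


def mean_hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Mean hydrophobic moment <muH> of a helical segment.

    hydrophobicities: one numeric value per residue, taken from a scale
        chosen by the caller (no scale is built in; this is an assumption
        of the sketch, not the setting used in the study).
    angle_deg: rotation per residue around the helix axis; 100 degrees is
        the ideal alpha-helix value and is only a default.
    """
    if not hydrophobicities:
        raise ValueError("at least one residue is required")
    delta = math.radians(angle_deg)
    # Project each residue's hydrophobicity onto two orthogonal axes
    # perpendicular to the helix axis and sum the components.
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(hydrophobicities))
    # Vector length divided by segment length gives the mean moment.
    return math.hypot(sin_sum, cos_sum) / len(hydrophobicities)
```

A segment whose hydrophobic residues cluster on one face of the helix gives a large value, whereas an even distribution around the axis gives a value close to zero.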

Materials
Cloning of PfCCT constructs. The PfCCT cDNA sequence (PlasmoDB: PF3D7_1316600) was cloned into pIRES-EGFP-puro plasmid. The cDNA sequence was then used to design primer pairs for the respective monomers and their truncated constructs. Constructs C1, C1M1, C2, C2M2, SV40-NLS C2M2 and their respective lysine-rich loop deleted (ΔK) versions (presented in Fig. 1B) were amplified with the following primers (Table 1).
Immunofluorescent staining. 50,000 CHO-MT58 cells were seeded onto a coverslip and were transiently transfected at 80% confluence with 50 µl of transfection reagent mix containing 1.5 µl of X-treme GENE HP transfection reagent and 0.5 µg purified plasmid DNA in Opti-MEM serum-free medium. After 24 h, PLC + cells were treated with 10 mU/ml phospholipase C for 3 h at 37 °C. Cells were then fixed with 4% paraformaldehyde for 10 min and permeabilized with 0.1% Triton X-100 solution. A blocking solution containing 3% bovine serum albumin (BSA) and 5% FBS in PBS was added for an hour. After the blocking, polyclonal primary antibody anti-CCT was added in 1:1000 ratio for an hour. Secondary antibody anti-rabbit Alexa Fluor 633 from goat was added in 1:1000 ratio after the primary antibody treatment for an additional hour. Finally, the cells were stained with 0.1 µg/ml 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI) in PBS. After every step, the coverslips were washed 3 times with PBS. Every step was carried out at room temperature. The stained slides were embedded in FluorSave antifading agent to preserve fluorescence. Confocal imaging was carried out on a Zeiss LSM 710 microscope using a Plan-APOCHROMAT 40× oil immersion objective.
Colocalization analysis. 8-bit images acquired by the confocal imaging were analysed in Fiji 60 by the Coloc 2 plug-in. Threshold regression was carried out with the Costes method on images with determined regions of interest (ROI). Manders' correlation coefficient was used to determine the nuclear overlap coefficient based on the DAPI and the corresponding anti-CCT signal using the following formula:
$$M = \frac{\sum_i S1_{i,\mathrm{coloc}}}{\sum_i S1_i},$$
where $S1_{i,\mathrm{coloc}}$ equals $S1_i$ (the intensity of the anti-CCT signal in pixel $i$) if the DAPI signal in that pixel is above the determined threshold, and zero otherwise.
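The nuclear overlap coefficient described above can be illustrated with a minimal NumPy sketch. This is not the Coloc 2 implementation: the function name and arguments are hypothetical, the DAPI threshold is supplied by the caller rather than derived by Costes regression, and the ROI is passed as an optional boolean mask.

```python
import numpy as np


def nuclear_overlap_coefficient(cct, dapi, dapi_threshold, roi=None):
    """Fraction of anti-CCT intensity located in DAPI-positive pixels.

    cct, dapi: 2D arrays of equal shape (e.g. 8-bit channel images).
    dapi_threshold: intensity above which a pixel counts as nuclear;
        here a caller-supplied value (assumption of this sketch).
    roi: optional boolean mask restricting the analysis to one cell.
    """
    cct = np.asarray(cct, dtype=float)
    dapi = np.asarray(dapi, dtype=float)
    if cct.shape != dapi.shape:
        raise ValueError("channel images must have the same shape")
    if roi is None:
        roi = np.ones(cct.shape, dtype=bool)
    else:
        roi = np.asarray(roi, dtype=bool)

    total = cct[roi].sum()  # denominator: all anti-CCT signal in the ROI
    if total == 0:
        return float("nan")  # coefficient is undefined without signal
    # numerator: anti-CCT signal only where DAPI exceeds the threshold
    colocalized = cct[roi & (dapi > dapi_threshold)].sum()
    return colocalized / total
```

A value close to 1 corresponds to a predominantly nuclear anti-CCT signal, while a value close to 0 corresponds to a predominantly cytosolic one.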