Real-time in vivo Global Transcriptional Dynamics During Plasmodium falciparum Blood-stage Development

Genome-wide analysis of transcription in the human malaria parasite Plasmodium falciparum has revealed robust variation in steady-state mRNA abundance throughout the 48-hour intraerythrocytic developmental cycle (IDC) suggesting that this process is highly dynamic and tightly regulated. However, the precise timing of mRNA transcription and decay remains poorly understood due to the utilization of methods that only measure total RNA and cannot differentiate between newly transcribed, decaying and stable cellular RNAs. Here we utilize rapid 4-thiouracil (4-TU) incorporation via pyrimidine salvage to specifically label, capture and quantify newly-synthesized P. falciparum RNA transcripts at every hour throughout the IDC following erythrocyte invasion. This high resolution global analysis of the transcriptome captures the timing and rate of transcription for each newly synthesized mRNA in vivo, revealing active transcription throughout all stages of the IDC. To determine the fraction of active transcription and/or transcript stabilization contributing to the total mRNA abundance at each timepoint we have generated a statistical model to fit the data for each gene which reveals varying degrees of transcription and stabilization for each mRNA corresponding to developmental transitions and independent of abundance profile. Finally, our results provide new insight into co-regulation of mRNAs throughout the IDC through regulatory DNA sequence motifs associated with these processes, thereby expanding our understanding of P. falciparum mRNA dynamics.

The malaria parasite life cycle commences upon the transfer of Plasmodium sporozoites from the saliva of an infected Anopheles spp. female mosquito vector to a human host. These sporozoites travel to and colonize a small number of hepatocytes within the liver where they replicate, forming thousands of merozoites 4 . Liver-stage merozoites are released into the circulatory system and initiate the blood stage of infection, which is responsible for the clinical symptoms of malaria. Development during the 48-hour intraerythrocytic development cycle (IDC) is comprised of parasite maturation and cell division, resulting in the formation of up to 32 daughter cells, which are released and invade new erythrocytes 5 . A small proportion of asexual parasites commit to an alternative cellular fate of sexual differentiation during the IDC, eventually resulting in the development of male and female gametocytes. Maturation of P. falciparum sexual stage parasites takes 10-12 days, and it is these sexual forms that are taken up by a mosquito vector for eventual onward transmission to another human host, thus completing the Plasmodium full life cycle 6 . Differentiation of the malaria parasite in these various cell types and tissues is regulated by coordinated programs of gene expression [7][8][9][10][11][12][13] .
Gene expression during the IDC is highly periodic and follows a transcriptional cascade with the majority of genes expressed in a "just-in-time" fashion 7,13,14 . Although canonical transcriptional machinery is present, the mechanisms that underlie the regulation and specificity of active transcription remain largely uncharacterized 15,16 .
To date, only a single family of specific transcription factors, the 27-member Apicomplexan AP2 (ApiAP2) proteins, has emerged as transcriptional regulators with functions across all developmental stages 15,17-25 . However, the challenge of regulating roughly 5,500 Plasmodium genes with a small repertoire of transcription regulators suggests a significant role for epigenetic control and post-transcriptional regulation in Plasmodium spp. 3,16,[26][27][28][29][30] . The control of gene expression at the RNA level is evidenced by several studies reporting a significant delay between transcription and translation [31][32][33][34] , ribosomal influence on the timing of mRNA translation 35,36 , and a lack of coordination between active transcription and mRNA abundance during parasite development 37,38 . Bioinformatic analyses have suggested that between 4-10% of the Plasmodium genome encodes RNA-binding proteins (RBPs) 39,40 . While the majority of predicted RBPs remain uncharacterized, several post-transcriptional regulatory factors have been well-studied such as DOZI (DDX-6 class DEAD box RNA helicase), CITH (Sm-like factor homolog of CAR-I and Trailer Hitch) and pumilio family proteins (PUF1 and PUF2), that have been demonstrated to play critical roles in the translational repression of genes essential for Plasmodium transmission [41][42][43][44][45] . Additionally, the ALBA (acetylation lowers binding affinity) domain-containing proteins (PfAlba1-4) [46][47][48] , Bruno/CELF family members (CELF1 and CELF2) 49 , and Pfg27 50 have been demonstrated to interact with RNA; however, the exact role has yet to be defined for a majority of these RBPs in parasite development.
In all living systems, cellular mRNA levels are determined by the interplay of tightly regulated processes of RNA production, processing, stabilization and degradation 51 . Thus, changes in transcript abundance measured by DNA microarray, RNA-sequencing, or RT-qPCR studies reflect the relative rates of both transcription and mRNA stabilization. Since the advent of the complete P. falciparum genome sequence, numerous studies have utilized whole-genome methods to measure transcript abundance from populations of parasite-infected red blood cells [11][12][13]31,52 as well as, more recently, single cells 53,54 . In addition, other studies have measured transcriptional activity 38,55 , mRNA half-lives 56 , RNA Pol II binding 57 and transcription start sites 58 throughout development.
Although these studies have revealed many important details of the regulation of gene expression in Plasmodium during the IDC, none capture mRNA dynamics on a whole-genome scale without perturbations to the parasite.
Measurement of mRNA half-lives is commonly performed following chemical inhibition of transcription which induces a stress response or cellular death thereby impacting the ability to accurately measure physiological mRNA turnover parameters 59 . Similarly, whole genome capture of transcriptional activity via nuclear run-on 55 or GRO-seq 38 methods require isolation of nuclei or permeabilization of cells, which compromises cellular physiology and nascent nuclear transcriptional activity. Therefore, we characterized the mRNA dynamics of the P. falciparum transcriptome by comprehensively capturing nascent mRNA transcription, stabilization, and total abundance of mRNA in vivo throughout the IDC using 4-TU biosynthetic RNA labeling.
Biosynthetic labeling of pyrimidines has revolutionized the capture and measurement of in vivo cellular mRNA dynamics during development and in response to various stressors 60-64 across a diverse array of organisms 62,[64][65][66][67][68][69][70] including Plasmodium 37 . In human and yeast cells, short periods of metabolic labeling into nascent RNA has been used to capture RNA synthesis, decay and splicing rates 62,63,69,[71][72][73][74] . However, thiol-modified RNA capture and kinetic profiling of RNA dynamics is dependent upon pyrimidine salvage, and the evolutionary lineage of Plasmodium has lost this biochemical capacity 75 . We have previously demonstrated that genetic engineering of the pyrimidine salvage pathway into Plasmodium parasites enables the efficient uptake of 4-TU and biosynthetic capture of thiol-labeled RNAs to examine RNA dynamics during parasite development 37 .
Herein, we report the application of biosynthetic RNA labeling and capture 37 to measure genome-wide nascent mRNA transcription and stability in vivo throughout the intraerythrocytic development cycle of the malaria parasite. This method relies on the genetically facilitated scavenging of 4-TU and genome-wide DNA microarray measurements of labeled and unlabeled RNAs. Using this method, we have captured mRNA dynamics during asexual development at hourly resolution and apply a quantitative normalization model to define gene-specific mRNA metabolism. These data provide dynamic transcription and stability information for all mRNAs expressed during the IDC and also recapture known features of the "just-in-time" cascade of mRNA transcription.
Additionally, modeling of the mRNA dynamics reveals regulatory mechanisms that cooperate during parasite development, which we used to predict functional roles for putative transcriptional regulators. Our results provide an updated in-depth view of mRNA transcriptional dynamics and gene regulation in P. falciparum throughout the IDC.

RESULTS:
Engineering P. falciparum to establish biosynthetic pyrimidine salvage In vivo mRNA labeling is dependent upon the salvage of 4-thiouracil (4-TU); however, P. falciparum is not capable of pyrimidine precursor uptake and incorporation into RNA 2,75 . We previously reported the use of episomal expression of the bifunctional fusion gene encoding Cytosine Deaminase and Uracil Phosphoribosyltransferase (FCU) from yeast to enable parasite salvage of 4-TU 37 . To avoid cell-to-cell heterogeneity in plasmid copy number expressing FCU, this study uses a single copy cassette that is stably 6 integrated into the attB locus of 3D7 attb 76 to constitutively express a GFP-tagged FCU protein (Fig. S1A) 37 . Stable integration of fcu-gfp (3D7 attb::FCU-GFP ) was verified by PCR (Fig. S1A) and whole-genome sequencing of a final clonal parasite line (data not shown). Expression of FCU-GFP was confirmed by Western blot and localized to the cytoplasm by immunofluorescent imaging analysis ( Fig. S1B and C). Salvage and incorporation of a 4-TU into total cellular RNA by the 3D7 attb::FCU-GFP transgenic parasites was confirmed by Northern blot (Fig. S1D).

Nascent in vivo mRNA labeling and capture throughout the IDC
The goal of this study was to produce high-resolution RNA dynamics data for each transcript expressed during the 48-hour P. falciparum IDC transcriptome. To capture nascent RNA transcription and stabilization we exposed highly synchronized 3D7 attb::FCU-GFP to 10 min pulses of 40µM 4-TU followed by total RNA extraction and capture of labeled and unlabeled transcripts ( Fig. 1A and Fig. S2A) 37 . We chose a short labeling time of 10 min based on the time required for 4-TU uptake and incorporation into 4-thiol-RNA transcripts 37 . When the length of 4-TU incubation is sufficiently short, the labeled RNA is likely still in the nucleus and subjected to minimal degradation, and thus is reflective of the average transcription rate 72 . To capture nascent transcription throughout the IDC, 10 min pulses were repeated on individual timepoints every hour for the full 48-hour developmental cycle (Fig. 1A) followed by total RNA extractions for each hourly sample and biotinylation of the thiolated RNA transcripts.
Nascently transcribed (4-TU labeled) mRNA was affinity-captured via streptavidin magnetic beads ( Fig.   S2A) 37,64 . For each timepoint the genome-wide profiles for the resulting RNA populations, total, labeled (transcribed), and unlabeled (stabilized) RNAs, were quantified using DNA microarray analysis (Fig. S2A). The RNAs present in the cell that were not modified during the 10 min 4-TU labeling pulse are designated as "stabilized" throughout this study. However, we note that our method cannot distinguish whether these "stabilized" RNAs are free or bound by RBPs and are likely turning over at variable rates. Therefore, the term "stabilized" herein is simply used to describe the persistence of an unlabeled transcript that was present prior to the labeling pulse.
To investigate whether in vivo RNA labeling perturbed transcription dynamics, we correlated the total RNA abundance levels from the 4-TU treated 3D7 attb::FCU-GFP timecourse with an earlier P. falciparum transcriptome from untreated wild-type 3D7 cells 13 and found no significant changes in the pattern of RNA levels during the IDC (median corr = 0.72, Fig. S2B). This suggests that transcription or steady-state mRNA levels are not perturbed in the presence of a short pulse of 4-TU. We next determined if there was bias in our RNA labeling due to the brevity of the 4-TU pulse by comparing gene length and dUTP content to intensity of the signal captured ( Fig. S2C and S2D). Although the distribution of gene length and dUTPs/per gene distribution is broad (23-30864 bps and 7-9242 dUTPs, respectively; Fig. S2C), there is no significant bias in the mRNAs that are labeled with 4-TU and captured by biotin-streptavidin interaction (Fig. S2D). Therefore, the chosen 10 minute pulse length allows for the capture and accurate measurement of nascent transcription genome-wide without further normalization. Overall, genome-wide biosynthetic labeling and capture throughout the IDC revealed nascent transcription profiles for 5,198 genes (91.4% of P. falciparum annotated transcripts). Biosynthetic mRNA capture timecourse with 4-TU (40uM) labeling (10 min pulse) of highly synchronous transgenic 3D7 attb::FCU-GFP parasites every hour throughout the 48 hour IDC followed by total RNA extraction. 4-TU labeled RNA is separated from the total RNA resulting in two fractions, transcribed (labeled) and stabilized (unlabeled). These RNAs and the total cellular RNA pool were quantified by DNA microarray analysis (Fig. S2). B) Data obtained by cDNA microarray was normalized and modeled to determine the contribution of nascent transcription (•) and stabilization (•) to the total abundance profile (•) for each gene. C) The modeled expression values of nascent transcription (5207 genes, red) and stabilization (5236 genes, blue) values for each gene were ordered based on the peak timing of total abundance (5428 genes) during the 48 hour IDC as determined by Lomb-Scargle analysis and displayed in a heatmap as the Log2 expression value. The summation of the expression values for both transcription and stabilization is displayed as a heatmap representing the estimated total mRNA abundance (black), recapitulating the canonical phaseogram of measured total mRNA abundance (left, RGB heatmap).

Statistical model of P. falciparum asexual mRNA dynamics
Because standard two-color DNA microarray analysis normalizes log2(green/red) values within individual timecourses, the log2 values do not distinguish the relative contributions of the various pools of RNA (labeled, unlabeled) to the total RNA signal ( Fig. 1B; Raw Log2 Ratios) when timecourses are directly compared to each other. To circumvent this issue, we developed a statistical model to quantitatively determine the contribution of nascent transcription (labeled) and mRNA stabilization (unlabeled) to the total abundance profile of each gene by comparing the signal intensities for each probe on the DNA microarrays across all three timecourses (Materials and Methods). After the three timecourses were separately normalized by Rnits 77 (Fig. S3), the mRNA dynamics were estimated by non-negative least squares since the contribution of unlabeled and labeled mRNA to the total cellular RNA is always additive (Fig. S3). The balance between nascent transcription and mRNA stabilization is estimated by optimizing the following equation: Total mRNA abundance = α(nascent transcription) + β(mRNA stabilization) Estimated coefficients for α and β were utilized to calculate the Expression Value of nascent transcription and mRNA stability, respectively (Fig. S3). Expression Values representing the total abundance, transcription and stabilization were averaged across the gene to which they mapped using Tukey's biweight 78,79 and plotted across the 48 hour IDC (Fig. 1C, Calculated Expression Values), resulting in mRNA dynamic profiles for most P.
falciparum genes (Table S1). Gene plots can be found on individual gene pages on http://PlasmoDB.org 80 and http://iris.bx.psu.edu/projects/manuel-llinas/output/results.html where both the raw log2(green/red) ratios and normalized Expression Values are also available for download.
Based on the calculated normalized nascent mRNA expression values, we found that active transcription from the nuclear genome occurs throughout the IDC ( Fig. 1C and Table S1). We also found that transcription of nascent mRNAs from both the mitochondrial and apicoplast organellar genomes is tightly coordinated and begins 8-10 hpi ( Fig. S4A and 4B). Transcription of these non-nuclear genes peaks at 31 hpi (Fig. S4B), coinciding with the start of organellar genome replication and division [81][82][83] . In contrast, nuclear encoded genes destined for the apicoplast are transcribed prior to apicoplast encoded genes (Fig. S4C), likely due to their role in transcription and translation of the apicoplast genome 84 . Interestingly, we also observe that the capture of nascent transcription from the apicoplast genome reveals a wide range of expression levels regardless of their genome position (Fig.   S4D).
Based on the statistical model of the data, we found that the abundance profiles of individual genes are dictated by a combination of nascent transcription and mRNA stability as the parasite develops (Fig. 1B) Fig. 2A), likely due to the differences in peak timing of nascent transcription versus stabilization. By calculating the difference in the maximum transcription and stabilization to the total abundance, we estimate that peak transcription precedes the peak abundance (-80 min), while peak stabilization follows approximately 41 minutes after (Fig. 2B). This supports the current notion that the majority of genes are transcribed in a "just-in-time" manner with nascent transcription peaking prior to mRNA abundance, while post-transcriptional regulation likely influences the delay from peak time of mRNA abundance.

Variable rates of transcription and stabilization shape temporal mRNA dynamics throughout the IDC
To determine the stage-specific effects of mRNA dynamics on transcript abundance, we binned genes into six groups based on their peak timing of abundance during the IDC as captured by the total RNA timecourse (Fig.   2C). These six groups reflect the major morphological stages of development during the IDC including early ring  The dominant mode of regulation was determined for each gene by calculating the correlation of the modeled transcription or stabilization (r ≥ 0.5) to the estimated total mRNA abundance profile for each gene throughout the IDC and are displayed in order of the peak timing of abundance (hpi). The percentage of genes regulated by either transcription (red) or stabilization (blue) is summarized for major morphological stage transitions; early ring (790 genes), mid-ring (842 genes), late-ring/early-trophozoite (669), mid-trophozoite (1186 genes), late-trophozoite (799), and schizont stage (1131). Box-whisker plot of D) transcription and E) decay rates calculated for early ring, mid-ring, late ring/early trophozoite, midtrophozoite, late trophozoite, and schizont stages of parasite development (Table S2). The mean rate (+) and Tukey's distribution are displayed for each stage of development. Individual rates that are outliers are displayed as individual points. genes) (Fig. 2C). However, there is a higher proportion of contribution from nascent transcription to mRNA abundance profiles at a majority of developmental stages (early ring = 55%, late ring/early trophozoite = 68%, mid-trophozoites = 61%, late trophozoites = 52%), while stabilization dominates at the mid-ring (64%) and schizont (62%) stage transitions ( Fig. 2C and S5D). These results reveal that there are stage-specific regulatory mechanisms that influence the patterns of mRNA abundance through both transcription and mRNA stability during the IDC.
Previous studies have suggested that both transcriptional activity and mRNA half-life increase as the parasite progresses through the IDC 55,56 . These results were based on methodologies that relied on measuring ex vivo transcription from isolated nuclei 38,55 or inhibition of transcription with Actinomycin D 56 followed by RNA-seq or DNA microarray to quantify mRNA. Because our dataset was generated without physical or chemical perturbations to the parasite, we analyzed our results in the context of these earlier studies 55,56 . To do this, the rate of transcription for each gene was calculated based on the slope of the expression value of the labeled mRNA just prior to the peak of nascent transcript levels, while the rate of decay was determined by the slope of the unlabeled mRNA expression value following the peak of stabilization (Table S2). The transcription and decay rates of genes with peak expression during the early ring, mid-ring, late ring/ early trophozoite, mid-trophozoite, late trophozoite and schizont stages were grouped together. The rate of transcription and decay are represented in Fig. 2D and 2E for each phase of development. These data show that genes expressed during the early and mid-ring stage are transcribed at a faster rate (median = 1.77 and 2.6 gene transcripts/min, respectively), compared to the rest of the life-cycle (median = 4.1 -6.3 gene transcripts/min) (Fig. 2D, Table S2). Conversely, the mean rate of decay varies significantly as the parasite progresses through the IDC ( Fig. 2E; Table S2), suggesting that the regulation of mRNA turnover is stage-specific. Surprisingly, the median rate of decay is slowest during the mid-to late-ring stages and becomes significantly more rapid during the mid-trophozoite stage of parasite development (Fig. 2E). However, there are an increased number of genes that are stabilized during the later stages of parasite development (Fig. 2E positive outliers 36-48 hpi = 21 genes, Table S2). Although the increase in stability of a small number of genes during the late IDC stage confirms earlier findings by Shock et al. 56 , it is important to note that the vast majority of genes with peak abundance during the late trophozoite and schizont stage do not display a significantly lower rate of decay (Fig. 2E).

mRNA regulatory "bursts" coincide with major morphological transitions of asexual development
Although it is conventionally accepted that transcription during the Plasmodium IDC occurs in a "just-in-time" fashion 13 , this is largely based on transcript abundance measurements and annotated gene function. Using the measured mRNA dynamics, we examined the timing of active transcription and stabilization of the parasite as it Histogram plot displaying the number of genes with peak transcription (left) or stabilization (right) timing binned into 30 min intervals throughout the IDC representing the hours in which there are greater than 5 genes enriched (5-45 hpi). The histogram is divided into five distinct groups based on the distribution of the modes and adjacent minima, corresponding with major morphological stage transitions of the IDC. Separate histograms displaying the number of genes which peak in transcription (red) and stabilization (blue) at either 1 hpi or 48 hpi highlighting the reliance on a specific regulatory mechanism at each stage of development. The transcribed (red) or stabilized (blue) genes within each group were analyzed for GO-term enrichment and significant terms (Bonferroni corrected p-value ≥ 0.01) are displayed in pie charts and listed in Table S3. develops throughout the IDC. By binning the genes based on their peak time of nascent transcription or peak time of stabilization every 30 minutes throughout the IDC, our data reveal peaks in the numbers of genes being transcribed and/or stabilized at four major time periods during parasite development ( Fig. 3 and Table S3).
Interestingly, these 'bursts' of mRNA regulatory activity coincide with major morphological transitions during the IDC (Fig. 3). Next, we determined the total number of genes within each 'burst' by placing the boundaries at the nearest minimum trough adjacent to each peak and then performed GO-term enrichment analysis on all genes.
The first peaks of both transcription and stability occurs during the mid-ring stage of development (12 hpi), and is characterized by peak transcription of genes whose products are enriched for RNA processing, translation, and antigenic variation (Fig. 3 and Table S3). During this stage, genes encoding for variant surface antigens and cell adhesion also reach their peak stability ( Fig. 3 and Table S3). As expected, peaks of gene regulation that occur in the early-, mid-and late-trophozoite, are enriched for genes whose functions are necessary and unique for each phase of trophozoite development (Table S3). For example, the mid-trophozoite is known to be the most metabolically active stage of development, and during this phase there is an enrichment in transcription and stabilization of genes which encode for amino acid, tRNA, ncRNA, pyruvate, glycolytic, and carbohydrate metabolic processes (Fig. 3 and Table S3). In contrast, the late-trophozoite phase is characterized by a peak in the regulation of genes involved in DNA metabolism and replication in preparation for mitotic production of daughter cells.
Beyond the identification of genes enriched during developmental transitions, we also identified a number of genes whose peak timing of transcription and stabilization occurred at either 1 or 48 hpi. As expected, at 48 hpi the number of stabilized genes (378 genes) outnumbers those that are actively transcribed (140 genes) (Fig 3 and Table S3). This corroborates earlier studies demonstrating the increased presence of mRNA-interacting proteins during late schizogony 39 and the increased half-life of genes at this stage 56 . As expected, these genes are enriched in processes necessary for parasite invasion and the establishment of a niche within the erythrocyte that will allow for the uptake and utilization of nutrients from the extracellular space (Table S3 Biological process GO-term enrichment; vesicle mediated transport (GO:0006888, p-value = 4.52e-3) and cytoskeletal protein binding (GO:0008092, p-value = 8.59e-10) including kinase signaling, and microtubule formation. Conversely, genes with peak transcription occurring immediately following invasion outnumber any other hour of the IDC (Fig. 3; Peak Transcription 1 hpi, 684 genes; Table S3). These genes are enriched for RNA metabolic processes (GO:0016071, p-value = 7.96e-5), likely in preparation to increase cellular transcription levels necessary for parasite development and aid in post-transcriptional regulatory processes. Identification of genes stabilized prior to or transcribed upon invasion may facilitate the dissection of the cellular and molecular processes that govern the establishment of the parasite within a new erythrocyte and provide molecular targets for future investigation.

Nascent transcription captures clonal variation within the parasite population
One advantage to biosynthetic mRNA capture is the ability to detect and compare the level of transcriptional activity for individual genes. Due to the importance of parasite-derived erythrocyte surface antigens in modulating immunity to malaria infections, transcriptional regulation of these antigens is one of the most highly- studied areas of Plasmodium biology 85,86 . It has long been established that members of surface antigen gene families (vars, rifins, stevors, pfmc-2tms, and surfins) are transcribed during the late ring to early trophozoite stages of development and exhibit variation in their expression among parasite clones 85 . To identify the timing of transcription, we calculated the average normalized expression values (Log2 mean centered) for each gene family to determine the peak time of nascent transcription during the IDC (Fig. 4A). Transcription of the var and rifin gene families peaked at 11 hpi, while stevors reached their peak transcription at 12 hpi (Fig. 4A).
Interestingly, the peak expression of surfins and pfmc-2tm genes occur at 10 and 16 hpi respectively (Fig. S6A), which may be explained by the demonstration that the transcriptional regulation of these two families occurs through different factors 87 . To determine which member(s) of these surface antigen gene families was being transcribed in this clonal population of transgenic parasites, we determined the expression values of each gene at the peak hour of transcription for each family of surface antigens (Fig. 4B). Our data demonstrate transcription of a single dominant var, as well as multiple rifin and stevor genes (Fig. 4B), supporting the mutually exclusive expression of only the var gene family of surface antigens. Although pfmc-2tm and surfin genes are also considered surface antigens, they do not demonstrate clonal variation (Fig. S5B), as reported previously 88 . Taken together, the biosynthetic labeling and capture of nascent transcription detects transcriptional variation within clonally variant gene families.

Regulatory dynamics of genes involved in erythrocyte invasion are predictive of protein function
A closer examination of the mRNAs stabilized during the schizont stage of development revealed that a majority encode proteins that function in the invasion of merozoites into erythrocytes. Because of this observation, we determined if these mRNAs are co-regulated by comparing their similarity in total abundance, transcription, and stabilization (Fig. 5A). As previously demonstrated 13 we found that the total mRNA abundance profiles of invasion genes peak during schizogony (Fig. 5A). However, the levels of active transcription and stabilization vary drastically (Fig. 5A and B), revealing that there are at least two distinct regulatory mechanisms that dictate these patterns of total abundance. For example, a majority of mRNAs that encode merozoite surface proteins (msps) including msp7 and msp2 are predominantly regulated at the level of transcription and appear to be minimally influenced by stabilization ( Fig. 5A and 5B). This may suggest that msps are rapidly transcribed and translated during schizogony for presentation on the surface of the merozoite prior to rupture from the red blood cell 89 . Conversely, although rhoptry genes rhoph2 and rhoph3 are also transcribed during the schizont stage, their peak abundance is contributed by mRNA stabilization (Fig. 5B), suggesting that these mRNAs may be translationally repressed or are slow to turn over until their gene products are necessary. Previous reports have demonstrated a "step-wise" role for proteins involved in invasion which is required after the merozoite has formed a tight interaction with the erythrocyte surface 90 . More recently, invasion proteins (such as RhopH2 and RhopH3) have been shown to assist in the establishment of nutrient transporters across the parasitophorous vacuolar membrane following invasion 89,91-93 . Although these genes are transcribed during the schizont stage, their stabilization guarantees that the mRNAs will be available for translation immediately upon invasion.
Interestingly, the mRNA dynamics of both sera3 and sera4 reveal a balance of transcription and stabilization ( Fig. 6B), potentially due to their dual role in two stages of the IDC, rupture as well as reinvasion. The SERA proteins have been demonstrated to be essential for schizont rupture and remain associated with the merozoite surface until attachment and invasion of a new red blood cell 94 .

Global mRNA dynamics aid in the prediction of cis-regulatory DNA-sequence motifs
High resolution capture of mRNA dynamics provides a more refined set of data for the identification of DNA regulatory motifs associated with co-transcription. Using the FIRE (Finding Informative Regulatory Elements) algorithm 95 , we used co-regulated mRNAs to predict DNA sequence motifs that are enriched in the 5' untranslated regions of these genes (up to -1000bp from ATG start site) (Fig. 1C and Table S1). We found enrichment of 21 DNA motifs associated with cotranscription during the IDC (Fig. 6A). Of these 21 motifs, seven have been previously identified as ApiAP2 cis-interacting motifs 96 . To verify the accuracy of our motif identification, we calculated the average transcription of genes enriched for the known DNA recognition motif (GTGCA), which is bound by the third AP2 domain of PfAP2-I 96 , compared to the expression of pfap2-i (Fig. 6B). As expected for a transcriptional activator, genes containing the GTGCA regulatory motif are actively transcribed following the peak expression of pfap2-i (Fig. 6B). Additionally, our bioinformatic analysis identified 120 genes previously demonstrated to be directly regulated by PfAP2-I via ChIP-seq analysis (Fig. 6C) 19 .
Conversely, transcription of genes enriched for the cis-regulatory motif ACAAC decreases following the transcription of ap2-g2 (Fig. 6C), which has been suggested to act as a transcriptional repressor during the IDC in Plasmodium berghei 18 .
We compared the target genes enriched with the ACAAC DNA-motif with the AP2-G2 ChIP-seq gene targets and found that capture of nascent transcription, combined with FIRE analysis, identified greater than 50% of the reported targets of AP2-G2 18 (623/1222) (Fig. 6C). Taken together, using motif identification analysis, we can predict the potential regulatory role of ApiAP2 proteins based upon the transcriptional patterns of genes containing their cognate DNA-interaction motif compared to the mRNA dynamics of individual ApiAP2s (Fig. S7A). Finally, we used the nascent transcription data to predict the possible regulatory roles for ApiAP2s proteins for which no identified cis-motif has been reported 96 . To do this we compared the timing and intensity of all genes enriched for the orphan AGACA motif (Fig. 6A) to the transcriptional pattern of two functionally uncharacterized ApiAP2 protein-coding genes: pf3D7_1139300 and pf3D7_0404100. The peak timing for both of these apiap2 genes occurs just prior to a decrease in transcription of genes enriched with the AGACA motif ( Fig S7B) (as seen for AP2-G2 (Fig. 6C)).

DISCUSSION:
Cellular RNA levels are subject to extensive regulation involving alterations in the rates of RNA synthesis (transcription), processing (e.g., splicing, polyadenylation, transport), stabilization, and decay 51 . Data on gene expression levels in P. falciparum obtained by microarray or RNA-seq 9,11,13,14,88 reflect steady state quantities of intracellular RNA which are a product of the rate of both genome-wide transcription and stabilization. However, a major constraint of these analyses is the inability of these methods to differentiate between transcription and stabilization which contribute to global abundance profiles throughout the IDC. As we have demonstrated, this constraint can be overcome by 4-TU biosynthetic labeling of newly transcribed mRNA providing direct isolation and analysis of nascent transcripts with minimal perturbations and toxic effects to the cell 37 .
Here we genetically modified P. falciparum to constitutively express the pyrimidine salvage pathway by singlecopy genome integration, enabling the rapid incorporation of 4-TU into nascent mRNA transcripts. Using these parasites we labeled and captured nascent transcription and mRNA stability and determined their contribution to the total abundance profiles of each gene throughout intraerythrocytic development. Earlier transcriptome studies captured timing and mRNA abundance of each gene by DNA microarray analysis 7,13 , revealing a periodicity to gene expression during the 48h IDC. Here we demonstrate that the periodicity of each gene is defined by contributions from both transcription and stabilization. However, mRNA regulation is also time-dependent and gene-specific. Given the complexities of genome-wide mRNA dynamics, we examined the contribution of mRNA regulation to gene expression as the parasite progresses through the IDC revealing that transcription and stabilization do not contribute equally to the patterns of abundance observed for a given gene. Our results identify active transcription throughout the IDC (Fig. 1C), in contrast to previous reports which suggest that there is minimal transcriptional activity during the ring and schizont stages of development 38 . We also find that there are distinct periods of transcriptional and post-transcriptional regulatory bursts which coincide with breaks in protein production at the late-ring and trophozoite stages of development 33 , supporting the suggestion that replenishment of proteins at these stages is regulated at the level of gene expression 31 . Our data also support previous studies which suggest that the half-life of the transcripts increases during the intraerythrocytic developmental cycle 56 .
However, this slow rate of decay applies to less than 25 genes and the majority of genes undergo a fairly consistent rate of decay through the IDC. Therefore, the degradation machinery is consistently active throughout the IDC 40 and it is more likely that post-transcriptional regulation leads to stabilization of this small subset of genes.
Interestingly, during the ring stage of development both the rate of transcription and the rate of decay are the highest, suggesting that ring-stage metabolism is dependent upon mRNA recycling. Perhaps this is due to the absence of active transcription of genes encoding for pyrimidine synthesis. This observation is consistent with reports that ring-stage parasites are more resistant to anti-malarial chemotherapies 97 and supports evidence suggesting that they may be able to respond to stressors more readily than other stages 98 .
Beyond the examination of genome-wide P. falciparum mRNA dynamics throughout the IDC, these global transcriptional dynamics provide gene-specific profiles of nascent transcription and stabilization. From this we distinguished regulatory dynamics of genes with similar patterns of abundance and determine mutually exclusive expression of single genes within clonally variant families. Co-transcription and co-stabilization of genes provide a useful resource for the detection of cis-acting regulatory motifs and, the prediction of motifs with associated trans-factors. These orphan motifs can now be used to identify new DNA-binding proteins which remain uncharacterized. Our data, provides a comprehensive resource that will enable further investigation into the gene regulatory systems and mRNA metabolism of the malaria parasite.
Future studies will need to clarify the mechanisms regulating genome-wide rates of decay by evaluating the RNAinteracting proteins that regulate transcript stability throughout the IDC. By combining biosynthetic mRNA labeling and capture with crosslinking and mass spectrometry 99-104 a complete RNA-protein interactome of developmentally regulated transcripts can now be achieved. Additional future applications of this method will also focus on measuring immediate transcriptional responses to environmental or chemotherapeutic treatments.
This data set not only provides the malaria research community with measurements of the RNA dynamics for all individual genes throughout the IDC, but will also serve to enhance the bioinformatics analysis of transcription models, co-expression networks, and will aid in defining the timing of biological processes.

MATERIALS AND METHODS:
Transgene construction. The open reading frame of the yeast FCU gene was PCR amplified from Plasmodium falciparum vector pCC1 105 , and cloned as a translational fusion into the unique AvrII and BsiWI sites of the pLN-ENR-GFP vector 76 , resulting in pLN-5'cam-FCU-GFP plasmid and placing transcriptional control under the P. falciparum calmodulin promoter (pf3d7_1434200). These plasmids were transformed into, replicated in, and isolated from DH5α E. coli for transfection into P. falciparum.
Strains and culture maintenance. P. falciparum strain 3D7 attB 76 have been described in previous studies and parasite cultures were maintained under standard conditions 106 at 5% hematocrit of O+ human erythrocytes in RPMI1640 containing hypoxanthine, NaH2CO3, HEPES, glutamine and 5 g/L AlbuMAX II (Life Technologies).
The 3D7 attB lines has been genetically altered to contain an acceptor attB site (from M. smegmatis) into the nonessential cg6 locus. This recombinant locus is maintained by continuous selection with 2.5 nM WR99210 (Jacobus Pharmaceuticals, Princeton, NJ) 107 .
Generation of genetically modified parasite lines. Transfection of P. falciparum strains was performed as previously described 107 . Briefly, 5-7% ring-stage parasite cultures were washed three times with 10 times the pellet volume of cytomix. The parasitized RBC pellet was resuspended to 50% hematocrit in cytomix. In preparation for transfection, 100µg of both pLN-5'cam-FCU-GFP and pINT plasmid, expressing a Mycobacterium Bxb1 integrase that mediates attP and attB recombination, were precipitated and resuspended in 100μL cytomix. The plasmid and 250μL of the 50% parasitized RBC suspension were combined and transferred to a 0.2cm electroporation cuvette on ice. Electroporation was carried out using a BioRad GenePulser set at 0.31kV, 960uF. The electroporated cells were immediately transferred to a T-25 flask containing 0.2mL uninfected 50% RBCs and 7mL medium. To select for parasites containing both plasmids, medium containing 1.5ug/µL Blasticidin S (Sigma-Aldrich) and 250 µg/mL G418 Sulfate (Geneticin®, Thermo Fisher Scientific) was added at 48 hours post transfection. Cultures were maintained under constant 1.5ug/µL Blasticidin S (Sigma-Aldrich) pressure, splitting weekly, until viable parasites were observed. Viable transgenic parasites were then cloned by limiting dilution and genomic DNA isolated to detect the presence of the integrated fusion gene of fcugfp was PCR verified using previously published primer pairs 76 .
Protein extracts from both 3D7 attB wild-type parasites and 3D7 attB::fcu-gfp (clone A2) transgenic parasites were run on a SDS-10% polyacrylamide gel and transferred to a nitrocellulose membrane. The membrane was blocked with 5% milk for 30 min. To detect protein, the membrane was exposed to Sheep IgG anti-Cytosine Deaminase primary antibody (Thermo Fisher Scientific) diluted 1:500 in 3% bovine serum albumin and incubated overnight at 4 °C followed by a 1 hour incubation with goat-anti sheep-HRP secondary antibody (Thermo Fisher Scientific) diluted 1:1000 in 3% BSA. The membrane was incubated for 1 min in Pierce® ECL-reagent (Thermo Fisher Scientific) and protein detected by exposure to film.

Assessment of FCU-GFP transgene ability to salvage 4-thioluracil via Northern blot. Function and FCU-
GFP transgene was verified in the 3D7 attB::fcu-gfp compared to wild-type 3D7 attb by the addition of 40 µM of 4thiouracil (4-TU) (Sigma Aldrich) to the culture medium from a 200mM stock concentration prepared in DMSO as previously described 37 . After parasites were grown in the presence of 4-TU for 12 hours, total RNA was prepared from parasites in 5ml of Trizol and resuspended in DEPC-treated water to a final concentration of 0.5 µg/µL. As described in Painter et al. 37 , Total RNA was biotinylated with EZ-link Biotin-HPDP dissolved in formamide to a concentration of (Thermo Fisher Scientific) for three hours at room temperature under RNasefree conditions to reduce the amount of RNA degradation that can occur. Biotinylated RNA (2.5 µg) was run on an Ethidium Bromide 1% agarose gel at 4 °C for 30 min at 200V and was transferred to Hybond-N+ nitrocellulose (Amersham) membrane using traditional northern blotting techniques. The RNA was UV-crosslinked to the membrane and probed with streptavidin-HRP (1:1000) (Thermo Fisher Scientific). Biotinylated RNA was detected via incubation with ECL-reagent and exposure to film.
Thiol-labeling timecourse. P. falciparum transgenic parasites were cultured in human erythrocytes at 5% hematocrit in RPMI1640 complete medium as described above. All transgenic parasites were synchronized with three successive (48 hours apart) L-Alanine treatments to remove late stage parasites 108 . To ensure a sufficient amount of mRNA was labeled and captured for further analysis, we grew 48 individual parasite cultures to 15% parasitemia in 100 mL of medium at 5% hematocrit, one culture flask (150 cm 2 ) for each timepoint. Upon invasion 40 µM 4-TU was added to the medium and parasites were incubated for 10 min followed by total RNA isolation with TriZol. This 4-TU addition, incubation, and RNA extraction was repeated every hour for 48 successive timepoints throughout intraerythrocytic development.
Biosynthetic modification and isolation of labeled mRNA throughout development. Briefly, transgenic P. falciparum was incubated with 4-TU and total RNA was extracted as described above. All RNA was subjected to the biosynthetic modification and magnetic separation as previously published with a few adjustments 60,64,109 .
Specifically, 80 µg of total RNA (at a concentration of 0.4 µg/µl) was incubated at room temperature protected from light for 3 hours in the presence of 160 µl of 1 mg/ml solution of EZ-link Biotin-HPDP (Thermo Fisher Scientific). Biotinylated total RNA was precipitated and resuspended in DEPC-treated water to a final concentration of 0.5 µg/ml. Incorporation of 4-TU and biotinylation was determined by NanoDrop analysis and Northern blot probed with streptavidin-HRP. 4-TU labeled-biotinylated RNA was purified using Dynabeads® MyOne™ Streptavidin C1 magnetic beads (Life Technologies) at a concentration of 2 µl/µg of RNA. Beads were prewashed as per manufacturers protocol to remove any RNases and blocked with 16 µg of yeast tRNA (Life Technologies). 4-TU labeled-biotinylated RNA was added to the bead slurry and incubated at room temperature for 20min with rotation. The RNA-bead slurry was placed on a magnetic stand for 1 min, and liquid carefully removed and saved for RNA precipitation. This sample contained RNA that was not thiolated or biotinylated.
Then the RNA-bound beads underwent five rounds of stringent washes with buffer consisting of 1M NaCl, 5mM Tris-HCL (pH 7.5), 500µM EDTA in DEPC-treated water. 4-TU labeled-biotinylated RNA was eluted from the beads with 5% 2-mercaptoethanol incubated for 10 min, placed back on the magnetic stand, liquid removed and Array scanning and data acquisition. Arrays were scanned on an Agilent G2505B Microarray Scanner (Agilent Technologies) with 5μm resolution at wavelengths of 532 nm (Cy3) and 633 nm (Cy5) using the extended dynamic range (10-100%) setting. Normalized intensities were extracted using Agilent Feature Extractor Software Version 9.5 employing the GE2_1100_Jul11_no_spikein extraction protocol.
Array data normalization and expression value calculation. Each set of 48 arrays for Total, Labeled, and Unlabeled RNA was normalized using the Rnits (v1.2.0) from the Bioconductor project in R 77 . Arrays for 45 hpi contained an abundance of extreme outliers and were excluded from further analysis. Prior to data smoothing and fitting, probes that are Agilent Controls, rRNAs (28 probes), or found to be not unique (133 probes) were removed. The timecourse profiles of each probe were then smoothed using cubic splines and all negative values set to 0.5. For each probe, contribution of unlabeled (stabilized) and labeled (transcribed) RNAs was modeled to sum up to the total abundances, assuming that contributions are naturally non-negative: Total mRNA abundance = α(nascent transcription) + β(mRNA stabilization) In this equation α and β are the coefficients applied to the probe intensity values on the labeled and unlabeled RNA arrays, respectively. This problem of non-negative least squares is solved by the Lawson-Hanson algorithm (the nnls package in R). As a simple verification that the expression values estimated for both transcription and stabilization were accurate, the predicted total mRNA abundance was computed by adding the estimated unlabeled and labeled expression values. The Pearson correlation between the predicted total mRNA abundances and the conventional observed abundances was calculated. Each probe was summarized by gene using Tukey's biweight with a parameter c = 5. Finally, three timecourses for each gene are plotted over 48 hrs. These data are available on http://www.PlasmoDB.org 80 on individual gene pages or for download of the full dataset. The same data are available on Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo) under study accession number GSE66669.
Determination of Peak Timing and Rates of Transcription or Decay. We utilized the Lomb-Scargle periodogram approach to determine the peak time of total abundance, transcription, and stabilization for each gene as described previously 112 using the available LombScargle package for R 113 . The resulting peak times for each gene can be found in Table S1. These peak times were used to order genes based on peak time of total abundance and to bin genes based on their peak times of transcription and stabilization.
Rates of transcription for each gene were calculated based upon the slope of the calculated expression values from the start of transcription to the adjacent peak. Conversely, rates of decay were calculated by determining the slope of the expression values of unlabeled RNA (stabilized fraction) from the peak to the following trough. Calculated rates of transcription and decay for each gene were then binned into groups based upon the peak timing of nascent transcription or stabilization of that gene (Table S2). These groups represent six major morphological stage transitions; early ring (0-10 hpi), mid-ring (11-15 hpi), late ring/ early trophozoite (16-21 hpi), mid-trophozoite (22-26 hpi), late trophozoite (27-32 hpi) and schizont (33-48 hpi). The calculated rates are plotted as a boxwhisker plot with Tukey's Biweight distribution and median represented.
Motif and GO-term enrichment analysis. We identified motifs enriched in the 5' UTRs of genes co-transcribed throughout the IDC using a regulatory element discovery algorithm (FIRE) 95 and these results are represented as a heatmap. The activity profile of each motif was used to identify a refined list of target genes that were both enriched for the motif and similarly transcribed. Gene lists identified by FIRE analysis to be enriched for either the AP2-I or AP2-G2 motif were compared to the previously published target gene listed identified by ChIPseq 18,19 .
GO-term enrichment of select gene groups or clusters was carried out using the Analysis Tool at http://www.PlasmoDB.org 80 and terms that are significantly enriched (Bonferroni corrected p-value ≤ 0.05) are summarized in Table S3.