Dual RNA-seq to catalogue host and parasite gene expression changes associated with virulence of T. annulata-transformed bovine leukocytes: towards identification of attenuation biomarkers

The apicomplexan parasite Theileria annulata is transmitted by Hyalomma ticks and causes an acute lymphoproliferative disease that is invariably lethal in exotic cattle breeds. The unique ability of the schizont stage of T. annulata to transform infected leukocytes to a cancer-like phenotype and the simplicity of culturing and passaging T. annulata-transformed cells in vitro have been explored for live vaccine development by attenuating the transformed cells using lengthy serial propagation in vitro. The empirical in vivo evaluation of attenuation required for each batch of long-term cultured cells is a major constraint since it is resource intensive and raises ethical issues regarding animal welfare. As yet, the molecular mechanisms underlying attenuation are not well understood. Characteristic changes in gene expression brought about by attenuation are likely to aid in the identification of novel biomarkers for attenuation. We set out to undertake a comparative transcriptome analysis of attenuated (passage 296) and virulent (passage 26) bovine leukocytes infected with a Tunisian strain of T. annulata termed Beja. RNA-seq was used to analyse gene expression profiles and the relative expression levels of selected genes were verified by real-time quantitative PCR (RT-qPCR) analysis. Among the 3538 T. annulata genes analysed, 214 were significantly differentially expressed, of which 149 genes were up-regulated and 65 down-regulated. Functional annotation of differentially expressed T. annulata genes revealed four broad categories of metabolic pathways: carbon metabolism, oxidative phosphorylation, protein processing in the endoplasmic reticulum and biosynthesis of secondary metabolites. It is interesting to note that of the top 40 genes that showed altered expression, 13 were predicted to contain a signal peptide and/or at least one transmembrane domain, suggesting possible involvement in host-parasite interaction. Of the 16,514 bovine transcripts, 284 and 277 showed up-regulated and down-regulated expression, respectively. These were assigned to functional categories relevant to cell surface, tissue morphogenesis and regulation of cell adhesion, regulation of leucocyte, lymphocyte and cell activation. The genetic alterations acquired during attenuation that we have catalogued herein, as well as the accompanying in silico functional characterization, do not only improve understanding of the attenuation process, but can also be exploited by studies aimed at identifying attenuation biomarkers across different cell lines focusing on some host and parasite genes that have been highlighted in this study, such as bovine genes (CD69, ZNF618, LPAR3, and APOL3) and parasite genes such as TA03875.


Global gene expression changes following in vitro serial propagation T. annulata infected leukocytes
In total, 3538 parasite transcripts were detected across the six replicates sequenced (three from each passage), 149 of which were significantly up-regulated and 65 were down-regulated in the attenuated passage (Fig. 2A).Analysis of the bovine cell transcriptome revealed a total of 16,514 genes, 284 of which were significantly upregulated, 277 were down-regulated and 15,953 genes were unchanged after attenuation (Fig. 2B).

Broad functional categories represented by the genes that significantly changed expression in the host and the parasite
The gene functions are classified into three groups including biological processes (BP), cellular components (CC) and molecular function (MF).The refined list of 171 differentially expressed bovine genes that exhibited a twofold or greater significant changes between virulent and attenuated cell lines were associated with the cell surface and with functions such as tissue morphogenesis, regulation of leukocyte cell-cell adhesion, regulation of cell adhesion and regulation of leucocyte and lymphocyte cell activation (Fig. 3).
Although there was no significant GO enrichment, the differentially expressed parasite genes were associated with different metabolic pathways such as protein folding, lipid biosynthetic process, cellular lipid metabolic process, chromosome organization, generation of precursor metabolites and energy (Supplementary Fig. 2b).KEGG analysis similarly identified parasite genes enriched in metabolic pathways such as carbon metabolism, oxidative phosphorylation, protein processing in the endoplasmic reticulum and phagosome and biosynthesis of secondary metabolites (Supplementary Fig. 2a).

Transcriptional changes in T. annulata-infected bovine leukocytes during long-term in vitro culture
Because the loss of virulence and clinical pathogenicity of high passaged (attenuated) T. annulata transformed cells could directly be due to changes in host cell gene expression during the long-term process of in vitro passaging 20,24 , the bovine transcriptome of leukocytes of low and high passage were compared resulting in a total  of 561 host genes that were significantly differentially expressed (284 were up-and 277 were downregulated).
Based on the padj value, 168 genes were found with a fold change value greater than 2 with significant p adj values (< 0.05).The most up-regulated genes in high passage attenuated Beja cells were UPK3BL1 (Bos taurus uroplakin 3B-like) and P2RX3 (purinergic receptor), while the most down-regulated genes were ZNF618 (zinc finger protein 618) and the B. taurus CD69 molecule.A list of top 20 up and top 20 down-regulated genes is shown in Table 1.The abscissa in the graph (A) is the ratio of the number of differential host genes on the KEGG pathway to the total number of differential genes, and the ordinate is KEGG pathway.The abscissa in the graph (B) is the ratio of the differential gene number to the total number of differential genes on the GO Term, and the ordinate is GO Term.www.nature.com/scientificreports/ We were wondering if our list of DEGs contain genes that are directly controlled by T. annulata infection and are down-regulated during the attenuation process.To identify this group of genes we merged the lists of genes upregulated upon infection with those downregulated in attenuated Beja.The list of genes significantly de-regulated by T. annulata between non-infected immortalized bovine B lymphocyte lines BL3 and BL20 compared to parasite infected TBL3 and TBL20 was previously published 33 .A comparison of our results with that study revealed that 11 and 13 genes were common with TBL3 and TBL20, respectively, whereas only two genes (ANXA and FAM161A) were common among the three datasets (Fig. 4, panel a).The other category of potentially interesting genes was those that were repressed by T. annulata infection (down-regulated in TBL3/ TBL20) and upregulated in attenuated Beja.This category likely represents tumour suppressors and were more than pro-tumorigenic genes (only two genes) shared between TBL3, TBL20 and attenuated Beja (i.e., nine genes) (Fig. 4, panel b).
To date, comparative differential host cell gene expression of only one attenuated T. annulata (Anand, India) transformed macrophage, termed Ode has been reported 33,34 .For the purpose of finding common differentially expressed host genes between the Beja cell line studied herein and Ode that might be useful as attenuation biomarkers, a comparison of the differential transcriptome between virulent and attenuated passage of the two cell lines was conducted.The comparison showed only five common genes, three of which were up-regulated (CAPN3, CTNND1, IL4I1) and two were down-regulated in attenuated Beja passage (IL12A and TXNIP) (Fig. 4, panel C).Activator protein 1 (AP-1) is a transcription factor whose constitutive activity is essential for T. annulatainduced leukocyte transformation 10 .Down-regulated AP-1 activity has been demonstrated in Indian (Ode) and Tunisian (Jed4) cell lines 10,27,28 .We conducted a bioinformatic screen (PROMO) of our DEG list for the presence of potential AP-1 binding sites on their promoters (1000 base pairs upstream of the start codon).Among the 277 down-regulated genes and 284 up-regulated genes, a total of 30 and 39 genes possesses potential binding sites for AP-1 transcription factor, respectively (Supplementary Table 2).
Nuclear factor (NF-κB) is known to be regulator of Theileria transformed cell proliferation and survival 35 .In our DEGs, putative binding sites of NF-κB were found in the promoters of 13 up-regulated genes and 14 downregulated genes (Supplementary Table 2).

Transcriptional differences in T. annulata parasite between virulent and attenuated passage of Beja strain
As it has been demonstrated that virulence and attenuation of T. annulata transformed leukocytes is a parasite encoded trait 25,26 , the most interesting genes are those that were down-regulated during attenuation especially the ones with a signal peptide (SP) or transmembrane domain(s) (TMDs).These are potential virulence factors as they are likely to be secreted and transferred to either schizont membrane, host cytoplasm or host cell nucleus.These sets of genes hold the potential to interact with host cell signalling pathways that are associated with virulence of the infected host cell.Based on the p value, we selected the top 40 genes which were significantly up-or down-regulated.Using PiroplasmaDB, we found that five of them possessed signal peptides and ten had one to nine TMDs which suggest their involvement in host parasite interaction (Table 2).www.nature.com/scientificreports/We focused our analyses on the significantly down-regulated genes containing SPs or/and TMDs and looked at their orthologs in T. parva or T. orientalis to examine if their function is known.However, these genes were either not annotated (unspecified product or uncharacterized protein) or are known as integral membrane proteins (Supplementary Table 3).Only one gene (TA03875) was predicted to contain a NLS fragment (PKRKNIPG-YNRRRTNNRT) located between position 116 and 133 aa (Supplementary Table 3).
Of the most significantly differentiated Theileria genes (based on the padj value), only three (TA04760, TA04840 and TA09695) were predicted to be involved in regulating different process such as gene expression, transcription, translation, biosynthesis, protein metabolic process and cellular metabolic process.It is interesting to note that these three parasite genes are hypothetical proteins that are associated with 29 common predicted pathways and are down-regulated in attenuated passages (Supplementary Fig. 3).
Using PiroplasmaDB, we looked at the orthologues of these genes in other Apicomplexa and their possible described function(s).TA04760 in T. parva is annotated as MED6 mediator sub complex component family protein and it regulates the transcription using polymerase II, thus its down-regulation following attenuation might affect expression of a subset of parasite genes that are potentially virulence factors.TA04840 and TA09695 are ribosomal proteins with similar functions and are involved in translation and structural constituent of the ribosome (Supplementary Table 4).
Furthermore, given the cancer-like phenotype induced by infection, we screened our dataset for genes that have a role in tumorigenesis.Interestingly, many of the transcripts that are differentially expressed (24 out of 40 DEGs) remain unannotated, despite publication of the T. annulata Ankara genome.As for the genes for which annotation is available, the roles of most of them in T. annulata pathogenesis or attenuation remain uncharacterised and only few are known to possess transcription coregulator activity (TA04760), nucleic acid and protein www.nature.com/scientificreports/binding (TA09685, TA15135, TA03040) and nucleic acid and zinc ion binding (TA15770).All these genes are down-regulated in attenuated passage.
For validation of Illumina RNA-seq determined levels of parasite gene expression by RT-qPCR, we selected 11 parasite genes which exhibited significant differential expression based on their padj values; nine of them were down regulated and two were up-regulated (supplementary Table 1).The comparative RNA seq results were corroborated by the qRT-PCR data (Fig. 5).

Discussion
The mechanisms by which the multinucleated schizont stage of T. annulata hijacks host cell intracellular signalling and transcription pathways to reversibly transform infected leukocytes are not fully understood.Similarly, the complex interplay between T. annulata and the bovine macrophage during the generation of attenuated cell-line vaccines is not unravelled either.The amount of host and parasite genetic information is increasing exponentially and so are the resources with which to investigate the complex host-pathogen interactions.One such resource, RNA-seq analysis, is suited for investigating complex host-pathogen interactions as it allows for the simultaneous analysis of the expression of thousands of genes.Here, we report on an RNA-seq analysis comparing host and parasite mRNA levels in a virulent and attenuated T. annulata cell line from Tunisia.
The analysis identified over 500 bovine host genes and 200 parasite genes that exhibited a statistically significant differential expression, implying that there are differences in transcriptional pathways between virulent and attenuated cell lines.Overall, there was good agreement between the RNA-seq data and the qRT-PCR results for the genes exhibiting cell line-specific differential expression that were selected for validation.A demonstration that the data set represents differences in gene expression profiles between virulent and attenuated cell lines was www.nature.com/scientificreports/provided by PCA where clustering of the samples with passage level can clearly be observed (Fig. 1), and PC1 explained 50.6% of variation in the data.In order to include genes of putatively greater biological relevance, we refined the list of differentially expressed genes to include only those that exhibited a twofold or greater average difference in expression between the virulent and attenuated (Table 1).Based on the fold change value, the largest difference between virulent and attenuated cell line was observed for the host more than parasite genes.A total of 168 host genes showed a fold change value of 2 or more with significant padj values.Among these genes, 99 were up-regulated and 69 were down-regulated.Given the cancer-like transformation of infected macrophages by T. annulata, it is not surprising that a large proportion of the differentially expressed genes with fold change above the twofold cut-off, encode proteins that have been associated with tumours.The most striking host gene expression difference (exceeding sixfold) was observed for UPK3BL1 (Bos taurus uroplakin 3B-like) followed by P2RX3 (purinergic receptor), KCNQ2 (potassium voltage-gated channel subfamily Q member 2) and KRT17 (keratin 17) with a fold change of 5.8, 5.7 and 5.4, respectively.
UPK3BL1 is implicated in stabilizing and strengthening the urothelial cell layer of the bladder, it has been shown to be highly expressed in mesotheliomas and in the urothelial tumours of the urinary bladder 36 .P2RX3, has been reported to be involved in tumour formation and play an important role in malignant transformation of several different cell types and they are highly expressed in cancer 37,38 .The biological relevance of its high expression in attenuated T. annulata infected cell lines has not been investigated.KCNQ2 gene is known to encode for subunits of a potassium channel complex and to be involved in membrane polarisation, they are acting as a suppressors or activator of gastrointestinal cancer 39 .The most down-regulated gene was ZNF618 (zinc finger protein 618) followed by CD69 (Bos taurus CD69 molecule), LPAR3 (Bos taurus lysophosphatidic acid receptor 3) with a fold change of − 6.5, − 6 and − 5.6, respectively.CD69 is a glycoprotein type II known to regulate inflammation through T-cell migration and retention in tissues and has a crucial role in inducing the exhaustion of tumour-infiltrating T cells 40,41 .CD69 is also an AP-1 target gene 42,43 , its role in T. annulata pathogenesis and attenuation might be important to investigate.As already mentioned, the genes that exhibited statistically significant passage differences in expression and are also supported by the highest fold difference in expression can be investigated as candidates underlying attenuation.
A comparison of Beja DEGs with attenuated Ode showed five common genes.Two of them were downregulated (IL12A and TXNIP), which could be potential attenuation markers, and three were up-regulated (CAPN3, CTNND1, IL4I1).The up-regulated genes could be that they are not involved in the attenuation process but rather other mechanisms such as immunosuppression or inflammation.For example, IL4I1 (interleukin 4 induced 1) protein may play a role in immune system escape as it is expressed in tumour-associated macrophages and it was described as a metabolic immune checkpoint which promotes tumour progression 44 .While Calpain-3 (CAPN3) is an intracellular cysteine protease, it has been shown in human melanoma that a variant of this gene called hMp84 increases the intracellular production of ROS (Reactive Oxygen Species) leading to DNA damage 45 which has been also confirmed in Theileria to cause an oxidative stress by elevation of ROS and disruption of the redox balance which are required for parasite transformation 46 .This is interesting as it might explain why in attenuated macrophages H2O2 type of oxidative express accumulates in spite of lower JNK (an antioxidant) activity 47 .It should be noted that H2O2 levels in virulent and attenuated T. annulata-transfomed Ode macrophages inversely correlated with their Matrigel traversal capacity 47 .
Parasite-dependent induction of key host cell transcription factors (AP-1, NF-kB, etc.) in Theileria annulata transformed cells causes massive reprogramming of the host cell transcriptome 33,48 .Thousands of host cell genes increase or decrease upon Theileria annulata-induced cell transformation.We conducted an inversed comparison with previously published T. annulata infected BL3 and BL20 cell lines differential transcriptome in order to find genes that have been induced by infection and are down-regulated in attenuated passage or vice-versa (Fig. 3).For example, MAPK13 (mitogen-activated protein kinase 13) has been found in gynaecological cancer to be preferentially expressed in cancer stem cells and it is implicated in the tumour-initiation 49 .It was induced by infection in TBL20 but down-regulated in attenuated Beja (log2 Fold Change = − 1.2; p adj = 0.0001).AnxA1 (annexin A1) is a phospholipid-binding protein that has been described to have either activator or suppressor role in different cancer types and stages 50 .In TBL3 and 20, this gene was up-regulated by infection.It is found to be significantly down-regulated in attenuated transcript (log2 Fold Change = − 1.08; p adj < 0.001).FAM161A (centrosomal protein) is known to be involved in development of retinal progenitors during embryogenesis and it is a part of microtubule-organizing centres in cultured cells and it has been demonstrated to be involved in the intracellular microtubule network 51,52 , it was down-regulated in attenuated Beja (log2 Fold Change was − 1.005; p adj value = 0.01).
It has also been shown in breast cancer research that apolipoprotein modulates the cholesterol metabolism and stimulates the proliferation, migration, and tumour growth 53 .In our list Apolipoprotein 3 (APOL3) is induced by Theileria infection and down-regulated upon attenuation (log2 Fold Change was − 1.09; p adj value < 0.001) which suggest its implication in the attenuation process.Tumour protein 63 (p63) is known to enhance the migration, invasion, and tumour growth in human cancer 54 .This gene was down-regulated in attenuated passage (log2 Fold Change was − 1.09; p adj value < 0.001).
ARHGAP9 (Rho GTPase-activating protein 9), a tumour suppressing gene in bladder cancer 55 that has been reported to be suppressed by infection in TBL, was significantly up-regulated in the attenuated passage (log2 Fold Change = 1.6; p adj < 0.001) but other ARHGAP were significantly down-regulated such as ARHGAP15 (log2 Fold Change = − 1.05; p adj < 0.001) and ARHGAP29 (log2 Fold Change = − 1.32; p adj < 0.001).These genes could also be evaluated as candidates involved in the attenuation process.
Our results suggest that long term passage of infected cells primarily results in transcriptional down-regulation of some cytokines (e.g., IL6, IL12 and IL21) and genes that may be active in pathways controlling the interaction between the cytokines and their receptors.Remodelling the host cell transcriptome in this way by www.nature.com/scientificreports/attenuation appears consistent with the previous suggestion that upregulation of pro-inflammatory cytokines might play a role in pathology.This was on the basis of the observation that the clinical signs of T. annulata infection, particularly fever, cachexia, leucopoenia and anaemia are the same signs that follow experimental administration of pro-inflammatory cytokines 56 .
A high proportion of the differentially expressed genes (51 host genes in total where 14 of them had fold changes between 2 and 6) are unannotated and the importance of these proteins is unclear at this time.However, identification of these genes may provide vital information pertaining to attenuation of T. annulata infected cell lines.It is also important to mention that besides the reduced expression of pro-inflammatory cytokines that we confirm, attenuation has also previously been associated with reduced expression of the matrix metalloproteinase MMP9 25 , which is linked to the invasive capacity of virulent cell lines 57 .In T. annulata-transformed macrophages, MMP9 transcription is regulated by Activator protein 1 (AP-1) 27 .Down-regulated AP-1 activity has been demonstrated in Indian (Ode) and Tunisian (Jed4) cell lines 10,27,28 .However, our comparative RNAseq transcription profiles did not reveal a downregulation of this candidate attenuation marker.Interestingly, MMP9 had very low level of read counts in our RNA-seq data.Therefore, differential expression of genes other than MMP9 could be associated with diminished virulence of attenuated Beja, such as those presented in Fig. 4 (for example MAPK13, ANXA, etc.…).Also, the fact that Beja cell line is less virulent than Jed4 at low passage should be taken into account as it might be that the expression level of AP-1 is low at virulent passage 58 .Although we did not measure AP-1 activity in virulent versus attenuated Beja, we found potential AP-1 binding sites on promoters of 30 genes that were down-regulated in attenuated Beja.The presence of potential AP-1 binding sites on down-regulated gene promoters and dampened AP-1 activity in attenuated cells suggest that reduced AP-1 activity might explain the attenuated phenotype of high passage Beja leukocytes.It has also been reported that NF-κB (nuclear factor kappaB) is activated in Theileria transformed cells via the IKK (IκB kinase) complex to regulate cell proliferation and protect it against apoptosis 35,59 but its role in the attenuation process is not yet demonstrated.In our DEGs list we found NF-κB binding sites in 13 up-regulated genes and 14 down-regulated genes promoters.The role of NF-κB in the attenuation process of T. annulata need to be investigated.
As regards to the parasite, it has previously been suggested that both altered gene expression and clonal selection of parasite populations may be involved in the loss of pathogenicity of T. annulata during continuous in vitro culture 60 .For the parasite transcript, only five down-regulated genes had a fold change more than − 2, all of which were hypothetical proteins.Availability of Beja genome sequence will likely make it possible to expand the annotation available for these genes, particularly given that analysis of antigen gene sequences, including TaSP, have shown that the Tunisian T. annulata differs from parasites from other locations including Turkey from where the reference genome is derived 61 .There was one up-regulated parasite gene that had a fold change of 3 (cytochrome C oxidase subunit III), while the most of the DEGs have a fold change value less than 2. Nonetheless, we below describe some of the parasite genes such as TA08610, TA03875, TA18945, TA20090 and TA09695 that likely have a role in T. annulata virulence.
TA08610 is a gene coding for the prefoldin subunit (one of the prefoldin complex), a cytoplasmic chaperone involved in protein folding with multiple roles in different tumours.The alpha subunit has a tumour-suppressing role and the beta subunit is involved in tumorigenesis 62,63 .This gene was significantly down-regulated in the attenuated cell line dataset (log2 Fold Change = − 1.5; p < 0.001).TA03875 codes for RNA poly(A)-binding protein which is a post-transcriptional regulator of gene expression controlling mRNA stability, polyadenylation, and other functions.An aberrant expression of this gene has previously been reported to be associated with the development of breast cancer 64,65 , this gene was down-regulated in attenuated passage in the attenuated Beja passage (log2 Fold Change = − 0.9; p < 0.001).We also found that this gene contains a nucleus localization signal (NLS) which suggest its possible transport into the host nucleus, however this needs to be confirmed since previous studies showed that some genes such as TaMISHIP (T.annulata proline-rich microtubule and SH3 domain-interacting protein) presented NLS but they were not detected in the host nucleus 66 .Additionally, we checked some of the previously characterised T. annulata effector proteins (TaSP, TashATs, TaHSP90, TaPIN1, Ta-p104; reviewed by Tajeri and Langsley (2021) 67 for their possible differential expression, all of these genes were unchanged in our list.
It is known that protozoa execute a strict regulation on the expression of genes that are involved in their pathogenicity 68 .In Theileria transcripts, we found three hypothetical proteins (TA04840, TA04760 and TA09695) that have been reported to be involved in gene expression and in translation.TA04760 is involved in the regulation of various process such as cellular process, biological process, transcription process, regulation of gene expression and RNA biosynthetic processes.These genes were down-regulated in the attenuated passage.The role of the mentioned hypothetical proteins needs to be characterized in order to confirm their implication in the attenuation process and if they are involved in regulation of the host gene expression.If this is confirmed, it might be helpful for the control of the T. annulata by manipulating gene expression as it was described in Plasmodium falciparum by using Peptide Nucleic Acids (PNAs) for gene silencing which induced down-regulation of stably expressed transgene and endogenous essential gene led to reduction of plasmodium viability 69 .
In conclusion, our comparative analysis catalogued for the first time differentially expressed host and parasite genes between virulent and attenuated Tunisian cell lines (Beja).A major finding of the study is that both the bovine host and Theileria display changes in gene expression following attenuation.Besides the use of statistical significance and average fold change to prioritize candidates, recent advances in cancer also enabled description of additional genes with known roles in different tumour and their potential role in T. annulata pathogenesis.In particular, the number of host (e.g., ZNF618, LPAR3, APOL3, CD69) and parasite (TA03875, TA04840, TA04760 and TA09695) differentially expressed genes listed herein could be further validated using previously described methods such as the Matrigel chamber assay 33 , Enzyme-Linked Immunosorbent Assay (ELISA) 70 , immunoblotting and immunofluorescence 71 , in order to find potential attenuation markers.Such studies would contribute to a better understanding of the mechanism driving attenuation, reduce the time needed to establish an attenuated cell line and contribute to the application of 3Rs principle in Theileria research by reducing the number of animals required for the validation of attenuation.

Theileria annulata cell line and culture
The T. annulata-infected cell line used in this study is a stock of the Beja strain (also referred to as CL1 in literature).The strain was originally isolated in 1989 from an infected five-year-old crossbred cow with an acute form of TT and it was used to vaccinate cattle in Tunisia against tropical theileriosis 24,72 .Phenotyping of the Beja strain, up to passage 300 using flow cytometry and specific monoclonal antibodies, revealed a dominance of the myeloid lineage (macrophages/monocytes).Furthermore, the markers associated to B cells were not entirely absent relative to the parental lines, which tend to indicate that a minor population of infected B cells (2-8%) may persist during the process of attenuation 58 .The cell line was frozen in liquid nitrogen and was resuscitated and propagated in RPMI supplemented with 10% FBS (RPMI-FBS), 2 mM L-alanyl-l-glutamine, 100 IU/ml penicillin and 100 μg/ml streptomycin.The culture flasks (three replicates for each passage) were incubated at 37 °C in a 5% CO 2 -in-air atmosphere.

RNA extraction
RNA was extracted from three replicates of both virulent (passage 26) and attenuated (passage 296) cells using a Direct-Zol RNA miniprep plus kit (Zymo Research Europe GmbH, Freiburg, Germany) according to the manufacturer's guidelines.The transcriptome of the T. annulata virulent and attenuated Beja cell line from Tunisia was subsequently determined using Illumina RNA sequencing.

RNA quantification, qualification and library preparation for transcriptome sequencing
Three purified RNA replicates from each passage with concentrations from 168 to 361 ng/µl were sequenced at Novogene laboratories (Cambridge Science Park, Cambridge, CB4 0FW, United Kingdom).Messenger RNA was purified from total RNA using poly-T oligo-attached magnetic beads.After fragmentation, the first strand cDNA was synthesized using random hexamer primers, followed by second strand cDNA synthesis using dUTP for directional library or dTTP for non-directional library.The library was checked with Qubit and real-time PCR for quantification and bioanalyzer for size distribution detection.Quantified libraries were pooled and sequenced on Illumina platforms.The clustering of index-coded samples was performed on a cBot Cluster Generation System using TruSeq PE Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions.After cluster generation, the library preparations were sequenced on an Illumina Novaseq platform and 150 bp pairedend reads were generated.Raw data were first processed through in-house perl scripts where clean reads were obtained by removing reads containing adapter, reads containing poly-N and low-quality reads.The RNA seq raw data has been submitted to the SRA database (sequence read archive) under the accession number PRJNA957332.

Mapping to the reference genome and quantification of gene expression levels
Reference genome annotation files were downloaded for the T. annulata Ankara cell line (PRJNA16308) and Bos taurus (PRJNA391427).Index of the reference genome was built using Hisat2 v2.0.5 and paired-end clean reads were aligned to the reference genome using Hisat2 v2.0.5.The mapped reads of each sample were assembled by StringTie (v1.3.3b) 73in a reference-based approach.Feature Counts v1.5.0-p3 was used to count the reads numbers mapped to each gene after which the FPKM (number of Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced) of each gene was calculated based on the length of the gene and reads count mapped to this gene.

Differential expression analysis, GO and KEGG enrichment analysis of differentially expressed genes
Differential expression analysis of the two groups was performed using the DESeq2 R package (1.20.0).The resulting P-values were adjusted using the Benjamini and Hochberg's approach for controlling the false discovery rate 74 .In general, genes with p value ≤ 0.05 were assigned as differentially expressed.A more stringent criteria of an absolute fold change of 2 or more with p adj value ≤ 0.05 was initially applied to both the host and parasite genes.The list of differentially expressed parasite genes was subsequently extended to include genes that exhibited significant differential expression based on the p value only, but additional possessed important features such as transmembrane domains and/or signal peptides.
Gene Ontology (GO) enrichment analysis of differentially expressed genes was implemented by the cluster-Profiler R package, in which gene length bias was corrected.GO terms with corrected p-values less than 0.05 were considered significantly enriched by differentially expressed genes.Cluster Profiler R package was used to test the statistical enrichment of differential expression genes in KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways 75 .

Additional bioinformatic screen: in silico search for differentially expressed T. annulata proteins with signal peptides, transmembrane domains or nuclear localization signals and prediction of AP-1 binding sites on promoters from differentially expressed bovine genes
PiroplasmaDB (https:// pirop lasma db.org/ piro/ app/) was used to further characterize Theileria DEGs including their assignment to GO functions and predictions of the presence of a signal peptide (SP) and transmembrane domains (TMDs).In order to find nuclear localization signals (NLS) we used the novopro lab online tool (https:// www.novop rolabs.com/ tools/ nls-signal-predi ction).We also conducted a bioinformatic screen of our host DEGs list for the presence of AP-1 (activator protein 1) and NF-kappa B potential binding sites using the PROMO website (http:// alggen.lsi.upc.es/ cgi-bin/ promo_ v3/ promo/ promo init.cgi? dirDB= TF_8.3).A promoter region was considered 1000 bp upstream of the start codon and retrieved from Ensembl website (https:// www.ensem bl.org/ bioma rt/ martv iew/ 151e6 b41a7 a359e 264c5 aa1d9 6328f a7).

Validation of RNA-seq results by qRT-PCR
The differential gene expression was validated by quantitative reverse transcription PCR (qRT-PCR).cDNA synthesis was conducted using the ProtoScript® II First Strand cDNA Synthesis Kit (New England BioLabs Inc, Ipswich, US) according to manufacturer´s instructions from the same RNA samples used for the RNA-seq.The specific primer pairs for 11 differentially expressed T. annulata genes were designed using Geneious software 76 (Supplementary Table 1).The Theileria actin gene (TA13410) was used as a reference gene.The qPCR reactions were performed using Luna® Universal qPCR Master Mix (New England BioLabs Inc) according to the manufacturer's protocol.The 2 −ΔΔCT methodology was used to estimate the relative gene expression levels 77 .

Figure 1 .
Figure 1.Principal component analysis (PCA) plots and inter-sample correlation heat map (R 2 : Square of Pearson correlation coefficient(R)) showing the clustering and the correlation between the samples (BT296 is the attenuated and BT26 is the virulent).

Figure 2 .
Figure 2. Differential Gene expression Volcano Map showing the up-and down-regulated genes as well as the unchanged genes in the parasite (A) and in the host (B).The abscissa in the figure is log2 Fold Change, and the ordinate is − log10 p adj or − log10 p value, the dashed line indicates the threshold line for differential gene screening criteria.

Figure 3 .
Figure 3. KEGG (A) and GO (B) enrichment GO enrichment analysis scatter plot of differentially expressed host genes.The abscissa in the graph (A) is the ratio of the number of differential host genes on the KEGG pathway to the total number of differential genes, and the ordinate is KEGG pathway.The abscissa in the graph (B) is the ratio of the differential gene number to the total number of differential genes on the GO Term, and the ordinate is GO Term.

Figure 4 .
Figure 4. Venn diagrams showing the genes inversely differentially expressed in Beja high passage (attenuated) compared to TBL3, TBL20 and attenuated Ode.Panel A showing an inverse comparison of genes downregulated in attenuated Beja vs genes up-regulated by infection; Panel B showing genes up-regulated in Beja vs genes down-regulated in TBL and panel C showing the common genes in attenuated passage between Beja and Ode cell lines.

Table 2 .
Top 40 up-and down regulated genes in Theileria annulata virulent and attenuated strain based on P value.*RPKM = reads Per Kilobase of transcript, per Million mapped reads.