RNA-Seq analysis of ileocecal valve and peripheral blood from Holstein cattle infected with Mycobacterium avium subsp. paratuberculosis revealed dysregulation of the CXCL8/IL8 signaling pathway

Paratuberculosis is chronic granulomatous enteritis of ruminants caused by Mycobacterium avium subsp. paratuberculosis (MAP). Whole RNA-sequencing (RNA-Seq) is a promising source of novel biomarkers for early MAP infection and disease progression in cattle. Since the blood transcriptome is widely used as a source of biomarkers, we analyzed whether it recapitulates, at least in part, the transcriptome of the ileocecal valve (ICV), the primary site of MAP colonization. Total RNA was prepared from peripheral blood (PB) and ICV samples, and RNA-Seq was used to compare gene expression between animals with focal or diffuse histopathological lesions in gut tissues versus control animals with no detectable signs of infection. Our results demonstrated both shared, and PB and ICV-specific gene expression in response to a natural MAP infection. As expected, the number of differentially expressed (DE) genes was larger in the ICV than in the PB samples. Among the DE genes in the PB and ICV samples, there were some common genes irrespective of the type of lesion including the C-X-C motif chemokine ligand 8 (CXCL8/IL8), apolipoprotein L (APOLD1), and the interferon inducible protein 27 (IFI27). The biological processes (BP) enriched in the PB gene expression profiles from the cows with diffuse lesions included the killing of cells of other organism, defense response, immune response and the regulation of neutrophil chemotaxis. Two of these BP, the defense and immune response, were also enriched in the ICV from the cows with diffuse lesions. Metabolic analysis of the DE genes revealed that the N-glycan biosynthesis, bile secretion, one-carbon pool by folate and purine metabolism were significantly enriched in the ICV from the cows with focal lesions. In the ICV from cows with diffuse lesions; the valine, leucine and isoleucine degradation route, purine metabolism, vitamin digestion and absorption and the cholesterol routes were enriched. Some of the identified DE genes, BP and metabolic pathways will be studied further to develop novel diagnostic tools, vaccines and immunotherapeutics.


Results
Summary of the RnA-Seq data. The animals included in our study were tested by multiple diagnostic tests including histopathological analysis of gut tissues, acid-fast stain, specific antibody response to MAP, fecal qPCR and tissue and fecal bacteriological culture. The results of the premortem (at the annual sampling time) and post-mortem (at the time of slaughter) diagnostic tests are showed in Table 1. All control animals (n = 3) showed a negative result for all diagnostic tests. Culling of control animals was due to age, accident, and coxofemoral luxation. All the infected cows included in our study had detectable lesions in gut tissues with distinct severity, focal or diffuse. At the slaughter time, the animals with focal lesions in gut tissues (n = 6) showed a negative fecal culture result but were fecal PCR positive (1/6), ELISA positive (1/6), and/or had evidence of MAP infection in gut tissues by ZN (1/6), PCR (5/6) and/or bacteriological culture (2/6). None of the animals with focal lesions had a heavy bacterial load (>50 cfu/gr) in feces and tissues and none of them was culled due to paratuberculosis-associated clinical signs. In contrast, the cows with diffuse histopathological lesions in gut tissues (n = 5) were culled due to a positive ELISA or PCR result (2/5), or because they showed PTB-associated clinical signs, including low milk production (1/5) or profuse diarrhea and weight loss (2/5). At the slaughter time, all of the cows with diffuse lesions had a positive ZN and ELISA result, a PCR-positive result in feces and tissue samples, and had a positive bacteriological culture from gut tissues. The mean amount of MAP estimated by qPCR in the gut tissues of all the animals with diffuse lesions was >66765 copies of MAP/gr tissue while the mean bacterial load in the gut tissue samples of the animals with focal lesions was ˂ 30528 copies of MAP/gr tissue.
RNA-Seq libraries were prepared from PB and ICV of 14 animals within the three categories; with focal or diffuse histopathological lesions in gut tissues and control. Duplicate RNA extractions and corresponding RNA-Seq libraries were generated from PB and ICV samples of five of the 14 cows (ID 2, 6,8,11,12) to check the reproducibility of the RNA-Seq technology. The RNA-Seq data summary for each biological sample and experimental replicate including number of reads per individual RNA-Seq library at each phase of the analysis is provided in Supplementary Table I. All the libraries were sequenced generating an average of de 22.31 million raw reads per library with a Phred quality score >30 which is considered a benchmark for quality in next-generation sequencing. Following filtering of reads based on quality score, minimum size of 60 bp and ambiguous base N percentage less than 5%, an average of 21.44 million reads remained. Alignment of the filtered RNA-Seq reads to the Bos taurus reference genome yielded mean values per library of 19.96 million reads. From the mapped reads, an average of 5% of the reads mapped to multiple locations in the genome and were excluded for gene expression analysis. A more detailed analysis of the reads mapping to unique locations was obtained using bedtools and featureCounts, two software packages developed for counting reads to genomic features such as upstream and downstream regions of genes, exons, introns and intergenic regions. Analysis of the individual library reads from the PB samples revealed that 57% of the reads aligned to exons, 28% to introns, 3% to intergenic regions, 2% to upstream regions and 10% to downstream regions. In the ICV samples, introns accounted for 24% of reads, exons for 63% of the reads, 2% of the aligned reads mapped intergenic regions, 1% upstream regions and 10% downstream regions.
Analysis of differential gene expression from RNA-Seq data. Although high technical reproducibility has been claimed for the RNA-Seq technology 25,26 , we performed duplicate RNA extractions and corresponding RNA-Seq library preparations from PB and ICV samples of some of the animals (ID 2, 6,8,11,12). Variations in this experiment could arise from the RNA preparation, fragmentation or priming method, ligation efficiencies, amplification and other variables in the technical steps of library construction and amplification. After performing gene expression analysis as describe in materials and methods, no significant differences were observed between each sample and corresponding replicate which support the reproducibility of gene expression studies using RNA-Seq ( Supplementary Fig. 1).
The DE genes (FDR < 0.05) between cows with focal or diffuse histopathological lesions versus control cows are presented as red dots in Fig. 1a. The number of DE genes was higher in the ICV samples than in the PB samples and increased in animals with diffuse histopathological lesions when compared with the group of animals with focal lesions. This was expected since diffuse lesions represent a more advanced stage of the disease. In the PB samples, 109 and 207 genes were DE in the comparisons between animals with focal or diffuse lesions and control cows, respectively (Fig. 1b). Data analysis revealed 51 DE genes that were common in the PB samples regardless of the type of lesion. The transcriptomic analysis of the ICV samples showed that 3189 and 4724 genes were DE in the animals with focal or diffuse lesions when compared with control animals, respectively. Between these two comparisons, 2557 genes were DE in the ICV samples of the MAP-infected cows regardless of the comparison. A total of 19 DE genes were common among the animals with focal lesions regardless of the tested sample, ICV or PB. Similarly, 89 genes appeared DE in the PB and ICV samples of the animals with diffuse lesions.  Gene expression comparison between animals with ptB-associated histophalogical lesions versus control cows. A total of 109 and 207 were DE in the PB samples of the animals with focal or diffuse lesions when compared with the control animals, respectively. Among these DE genes, 21 and 48 were upregulated in each comparison, respectively (Table 2).   www.nature.com/scientificreports www.nature.com/scientificreports/ uncharacterized protein ( Table 3). The top five downregulated genes in this comparison were: the Cathelicidins 3 and 4 (CATHL3 and CATHL4), CD177 molecule (CD177), C-C motif Chemokine ligand 14 (CCL14), and the Multidrug resistance-associated protein 4-like. CATHL3 and CATHL4 exert a potent antimicrobial activity due to an impairment of the function of the respiratory chain and of energy-dependent activities in the inner membrane of susceptible microorganisms. CD177 is a neutrophil G-protein-linked surface glycoprotein.
RNA-Seq analysis revealed that 3189 and 4724 genes were dysregulated in the ICV samples from the animals with focal or diffuse lesions versus de control group, respectively ( Table 2). The top five upregulated genes in the ICV of cows with focal lesions were the Intelectin 2 precursor (ITLN2), Cholecystokinin (CCK), Homeobox B13 (HOXB13), Fatty acid binding protein 6 (FABP6) and Heat shock protein family A member 6 (HSPA6). The upregulated genes mean fold change of expression was 1.5 but, in this case there were larger differences in the expression levels between genes, having the most upregulated gene a fold change of 10.6 and the less upregulated one a fold change of 0.6. CCK is a peptide hormone that induces gall bladder contraction and release of pancreatic enzymes in the gut. HOXB3 is a transcriptional factor previously implicated with inflammation. FABP6 is involved in the gastric acid and pepsinogen secretion and is also required for efficient apical to basolateral transport of conjugated bile acids in ileal enterocytes. The top five downregulated genes were two uncharacterized proteins, the Duodenase-1-like with both trypsin-like and chymotrypsin-like activities, Tumor suppressor candidate 5 (TUSC5), and the Aquaporin 8 (AQP8) ( Table 3). Mean fold change of the downregulated genes was −1.1 with values ranging between −6.7 and −0.6.
In the ICV samples, a total of 4724 DE genes were identified in the comparison between the animals with diffuse lesions versus the control group. From these 4724 DE genes, 1473 genes were upregulated and 3251 were downregulated ( Table 2). The level of gene expression was similar between the up-and downregulated genes as demonstrated by the log 2 fold changes. The upregulated genes on average showed a 1.2 fold increase between the cows with diffuse lesions and the control group (range between 6.8 to 0.5), whereas the downregulated genes averaged a −1.2 fold decrease between groups (range between −8.3 to −0.5). The top five upregulated genes included the Intelectin 2 precursor (ITLN2), Heat shock protein family A member 6 (HSPA6), Immunoglobulin superfamily member 23 (IGSF23), Apolipoprotein B (APOB) and the Carbamoyl-phosphate synthase 1 (CPS1). On the other hand, the top five downregulated genes were the Lysozyme C milk isozyme 1 (LYZ1), Adiponectin C1Q (ADIPOQ), Lysozyme C intestinal isozyme (LYSB), Acyl-CoA synthetase medium chain family member 1 (ACSM1) and the Aquaporin 8 (AQP8). Several of the downregulated genes in this comparison such as the ITLN2, LYZ1, LYSB and ADIPOQ play important roles in the defense response against pathogens. Lysozymes such as LYZ1 and LYSB have primarily a bacteriolytic function. ADIPOQ is an adiponectin involved in the control of the fat metabolism and it has an anti-inflammatory activity by negatively regulating TNF-α expression in macrophages, and also by counteracting its effects.  (Table 2). Similarly, 2557 genes were DE in the ICV gene expression profiles from the infected cows versus the control group. The PB and ICV gene expression profiles from the cows with focal lesions shared 19 DE genes, and 89 DE genes were common in the PB and ICV expression profiles of the cows with diffuse lesions. In Table 4 we show the top three upregulated and downregulated genes in the PB and/or ICV gene expression profiles from MAP-infected animals. The Oxidized low density lipoprotein receptor (ORL1) and the Tweety family protein 2 (TTYH2) appeared both upregulated in the PB samples of the MAP-infected cows, regardless of the severity of their lesions in gut tissues. The CCL14 chemokine, an epithelial-derived chemokine with antibacterial properties, was the most downregulated gene in the PB gene expression profile from MAP-infected cows versus the control group.
In the ICV samples, the ITLN2 precursor showed the highest transcription increase in the cows with both focal and diffuse lesions; log 2 fold-change was 10.6 and 6.8, respectively. Immunohistochemistry provided further evidence of the ITLN2 production and localization in ICV. As seen in Fig. 2, ITLN2 expression increases upon infection with MAP compared to control samples. Goblet cells and Paneth cells along the ICV from the infected animals were intensively labeled with the bovine ITLN2 antibody. FABP6, a gastrotropin involved in the efficient apical to basolateral transport of conjugated bile acids in ileal enterocytes was highly upregulated in the ICV gene  www.nature.com/scientificreports www.nature.com/scientificreports/ expression profile from the infected cows irrespective of the severity of the histopathological lesions. The stress response protein, HSPA6, was also highly expressed in the ICV from MAP-infected cows with log 2 fold-changes of 4.2 and 5.8 in the gene expression profiles of the cows with focal or diffuse lesions versus control cows, respectively. A significant reduction in the expression of the AQP8 gene, a protein involved in water/solute intestine homeostasis, was observed in the ICV gene expression profile from the cows with focal or diffuse lesions versus the control group; log 2 fold-change = −3.2 and −4.5, respectively.
Several genes involved in the control of infectious diseases such as the Basic leucine zipper ATF transcription factor 2 (BATF2) and the C-X-C motif chemokine ligand 10 (CXCL10), were upregulated in the PB and ICV gene expression profiles from the animal with diffuse lesions. BATF2 belongs to the activator protein 1 (AP-1) transcription factor family and interacts with the IFN regulatory factor 1 to mediate downstream proinflammatory immune responses. CXCL10 is chemotactic for monocytes and T-lymphocytes by binding to CXCR3. C4b-binding protein alpha-like, epiregulin (ERG) and the CXCL8/IL8 were downregulated in the PB and ICV gene expression profiles from cows with diffuse lesions versus the control group.
Five genes were DE in the PB and ICV gene expression profiles from the animals with PTB-associated histopathological lesions irrespective of the type of lesion including the CXCL8/IL8, Apolipoprotein L domain

ENSBTAG00000019406
Insulin growth factor 2 binding protein 3 (IGF2BP3)  were downregulated in all the comparisons. The ARAP2 gene is thought to play a major role in the activation of the major histocompatibility complex class II genes expression 27 . The IFI27 was downregulated in the PB samples and upregulated in the ICV samples from the cows with focal (log 2 fold = 1.6) and diffuse lesions (log 2 fold = 2.6) when compared with the control group. The IFI27 (ISG12) mediates IFN-induced apoptosis characterized by a rapid and robust release of cytochrome C from the mitochondria and activation of caspases 2,3,6,8 and 9 which may in turn influence the antiviral activities of the IFN 28 .
Gene ontology (Go) analysis. Functional categorization of the DE genes in each comparison was performed using the Bioconductor GOseq package to identify biological process (BP), cellular component (CC) and molecular function (MF) ( Table 5). While in the PB samples of the cows with focal lesions we did not identify  www.nature.com/scientificreports www.nature.com/scientificreports/ any significantly overexpressed GO, eleven GOs were significantly enriched in the PB samples of the animals with diffuse lesions. Out of these eleven GO; 5 were BP, 4 were CC and 2 were MF (Fig. 3a). The five identified BP were related to the immune response, including killing of cells of other organisms (GO:0031640), defense response (GO:0006952, GO:0050832), immune response (GO:0006955) and positive regulation of neutrophil chemotaxis (GO:0090023). The enriched CC identified in the PB samples of cows with diffuse lesions were the extracellular space and cell membrane.
We identified 83 and 80 overrepresented GO in the ICV samples of the cows with focal and diffuse lesions, respectively. The top 11 GOs identified in the ICV samples of the animals with focal and diffuse lesions were CC. Enriched CC in the ICV samples of the cows with both focal and diffuse lesions included the cytosol and the cytoplasm being these the top dysregulated CC. Among the top five BP enriched in the ICV samples of the cows with focal lesions were the cellular response to DNA damage stimulus (GO:0006974), protein transport (GO:0015031), immune system process (GO:00023746), DNA repair (GO:0006281) and the neutrophil chemotaxis GO (GO:0030593) (Fig. 3b). Three of the top five overexpressed BP in the ICV from the cows with diffuse lesions were associated with the innate immune and inflammatory response (Fig. 3c). Interestingly, the defense response (GO:0006952) and immune response (GO:0006955), were both enriched in the PB and ICV gene expression profiles from the cows with diffuse lesions (Fig. 4). Our results revealed that the chemokines CXCL8/ IL8 and CXCL10 were regulated in opposite directions in cows with the more advanced lesions. While CXCL8/IL8 was downregulated in the PB and ICV samples, CXCL10 was upregulated in both MAP-targeted samples from the cows with diffuse lesions. This differential modulation of CXCL10 and CXCL8/IL8 may suggest a different role of both chemokines in the infection. While CXCL8/IL8 attracts neutrophils, basophils and T-cells to the site of the infection, CXCL10 is chemotactic for monocytes and T-lymphocytes by binding to CXCR3. Expression changes in genes with antibacterial activity such as the Cathelicidin 6 (CATHL6) and β-defensin (DEFβ4A) were also observed. Seven DE genes included in the immune response pathway (GO:0006955) were dysregulated in the PB and ICV gene expression profiles from the cows with diffuse lesions including the CCL14, ENPP2, CD36, CXCL8/ IL8, BOLA-DQβ, MHC class I heavy chain and one uncharacterized protein (ENSBTAG00000040323). While MHC class I heavy chain was upregulated; ENPP2, CD36, CXCL8/IL8, and BOLA-DQβ were downregulated in the PB and ICV expression profiles from the cows with diffuse lesions when compared with the control group.
Metabolic analysis. Enriched metabolic routes related to the DE genes in each comparison were analyzed.
Although no metabolic pathways referenced in the KEGG database were enriched in the PB samples, 78 and 82 metabolic routes were dysregulated in the ICV samples of the cows with focal and diffuse lesions, respectively. Only three of the 78 metabolic pathways dysregulated in the ICV gene expression profiles from the cows with focal lesions were significantly enriched including the N-Glycan biosynthesis (bta00510, P = 0.004), purine metabolism (bta00230, P = 0.015) and the one carbon pool by folate pathway (bta00670, P = 0.021). The metabolic analysis using the DE genes in the ICV from cows with diffuse lesions versus the control group revealed that the DE genes in this comparison affected metabolic processes such as the valine, leucine and isoleucine degradation (bta00280, P = 0.008) and the purine metabolism (bta00230, P = 0.003) witch sequences corresponding to 10 and 34 enzymes, respectively. Therefore, our metabolic analysis revealed bta00230 as a common metabolic route significantly enriched in the ICV from the infected cows versus the control cows. Using the STRING database, the bile secretion metabolic route (bta04976) was significantly enriched in the ICV from the cows with focal lesions (enrichment score = 244.99, FDR = 0.001) with 18 proteins involved in this network. In the ICV from the cows with diffuse lesions, the vitamin digestion and absorption (enrichment score = 599.07, FDR = 0.001) and the cholesterol (enrichment score = 677.38, FDR = 0.001) routes were significantly enriched with six and four proteins matching each route, respectively. protein to protein interaction analysis. Protein to protein interaction analysis using the DE genes in the PB samples from the animals with focal lesions revealed a HBEGF-CXCL8/IL8 functional association (Fig. 5a). Using the DE genes in the PB samples from the cows with diffuse lesions, 11 functional associations and two CXCL8/IL8 and Collagen type I, α2 chain (COL1A2) networks were identified (Fig. 5b). The COL1A2 was linked to a secreted protein, the Acidic and rich in cysteine protein (SPARC), the Syndecan 4 (SDC4) and to the Integrin subunit α9 (ITGA9). When the DE genes in the ICV gene expression profile from the cows with focal lesions were used, 13 functional interactions and one CXCL8/IL8 centered network were retrieved (Fig. 5c). With a confidence cuff of 0.7, 90 functional interactions and a centered mitogen-activated MAP kinase 8 (MAPK8) network were retrieved when the DE genes in the ICV of the animals with diffuse lesions were used (Fig. 5d). Our protein-protein interaction analysis revealed that the interaction between MAPK8 and CXCL8/IL8 might be mediated by two leucine zipper protein members of the AP1 transcription factor complex, FOS and FOSB.

Discussion
Transcriptomic profiling of the host response to MAP infection has the potential to advance our understanding of MAP-host interactions and to reveal the mechanisms involved in the establishment and progression of bovine PTB. This information can help to set up the basis for novel diagnosis tools and for the development of novel preventive and therapeutic approaches which could improve cattle health. Host transcriptomic studies have highlighted the importance to compare whole blood versus other issues. To our knowledge, our study represents the first attempt to study the specificities and common signatures of the ICV and PB gene expression profiles of PTB-infected cattle using RNA-Seq. All the infected cows included in our study had detectable lesions in gut tissues with distinct severity, focal or diffuse. Although the number of control animals is limited, our results showed extensive expression differences between infected and control cows. We found that the number of DE genes was higher in the ICV than in the PB samples and increased with the severity of the lesion. More specifically, our study revealed 19 genes DE in the PB and ICV gene expression profiles from the cows with focal lesions versus the control group. Similarly, 89 genes were DE in the PB and ICV gene expression profiles from the cows with diffuse lesions versus the control group. Interestingly, the most upregulated gene in the ICV samples of the infected animals was the ITLN2 precursor; log 2 fold change = 10.6 and 6.8 in the cows with focal or diffuse lesions versus the control group, respectively. The ITLN2 is an extracellular lectin able to bind Mycobacterium tuberculosis and Mycobacterium bovis bacillus Calmette-Guerin (BCG) 29 . The FABP6, an ileum-specific bile acid (BA) transporter, was also highly upregulated in the ICV gene expression profile from the infected cows versus control cows which suggests a dysregulation of the BA secretion in MAP-infected animals. We also observed that MAP infection downregulated the expression of the BA receptor FXR in the ICV from the cows with focal and diffuse lesions versus control cows; log 2 fold = −0.8 and −1.0, respectively. In agreement with our results, Hempel et al.   www.nature.com/scientificreports www.nature.com/scientificreports/ found the FABP6 transporter as the top upregulated gene in the comparison of clinical versus uninfected control animals 24 . Apart from FABP6, we found other genes highly dysregulated in the ICV gene expression profile from the cows with diffuse lesions such as the Immunoglobulin superfamily member 23 (IGSF23) (log 2 fold = 5.1), Apolipoprotein B (APOB) (log 2 fold = 4.8), Solute carrier family 10 member 2 (SLC10A2) (log 2 fold = 2.7), and the Matrix metallopeptidase 13 (MMP13) (log 2 fold = 2.0) which had been previously reported by Hempel et al. in the comparison of clinical versus control cows. Little is known about the role of IGSF23, but other IGSF members are cell adhesion molecules that perform important immunological functions, including recognizing a variety of counterpart molecules on the cell surface or extracellular matrix 30 .
Several genes involved in the control of infectious diseases such as the basic leucine zipper transcription factor 2 (BATF2) and CXCL10 chemokine were found in our study highly upregulated in the PB and ICV gene expression profiles from the cows with diffuse lesions versus the control group but were unaffected in the cows with focal lesions. BATF2 is predominantly expressed in monocytes and macrophages and has a central role in macrophage activation by regulating inflammatory responses during mycobacterial infection. In human tuberculosis (TB), BAFT2 has been suggested as a biomarker and a potential host directed drug target 31,32 . The CXCL10 chemokine, also known as IFN-γ inducible protein-10 (IP-10), is chemotactic for monocytes and T-lymphocytes by binding to CXCR3 receptor. CXCL10 is expressed mainly by activated CD14+ memory T cells, which produce a Th1 pattern a.

b.
c. www.nature.com/scientificreports www.nature.com/scientificreports/ of cytokine production 33 . Several studies have demonstrated that CXCL10 can be used as a biomarker of TB, where significantly greater levels of mycobacterial antigen-induced CXCL10 protein were detected in whole blood of TB patients compared to healthy controls [34][35][36] . More recently, the simultaneously measurement of CXCL10 and IFN-γ enhanced test sensitivity for bovine TB identification in cattle in clinical stages of the infection 37 .
Our study revealed enrichment of the following GOs in the ICV from the cows with focal lesions: cellular response to DNA damage stimulus (GO:0006974), protein transport (GO:0015031), immune system process (  www.nature.com/scientificreports www.nature.com/scientificreports/ cattle versus healthy control cattle 39,40 . Since the CXCL8/IL8 is the major chemokine responsible for neutrophil recruitment and activation by binding to CXCR3, decreased CXCL8/IL8 expression may reflect impairment of neutrophil recruitment and activation during a natural MAP infection. Other CXCL8/IL8-associated biological functions include plasma exudation, general granulocytophilia, degranulation of neutrophils, respiratory burst response, and mobilization of intra-cellular Ca 2+ 41 . In our study, the Epiregulin (ERG) and CXCL8/IL8 genes were both downregulated in the PB and ICV gene expression profiles from the cows with diffuse lesions versus control cows. ERG belongs to the epidermal growth factor (EGF) family together with the HBEGF; acting both as mitogenic stimulators via binding to EGF receptors (EGFRs). Transactivation of EGFR leads to downstream signaling events including MAPK8 phosphorylation and activation which leads to apoptosis induction 42 . Induced CXCL8/ IL8 release is predominantly dependent on EGF binding to the EFGR leading to activation of MAPK8 43 . Our protein-protein interaction analysis using the DE genes in the cows with diffuse lesions revealed a central role for CXCL8/IL8 and MAPK8 in MAP pathogenesis. This kinase was downregulated in the ICV gene expression profiles from the cows with focal and diffuse lesions when compared with the control group; −1.7 and −2.16938, respectively. Our results support the idea that MAP infection limits signaling via the MAPK8-IL8 expression pathway which may favor MAP survival. Although the role of CXCL8/IL8 in MAP pathogenesis has not been characterized before, the CXCL8/IL8 has been shown to play an important role in the pathogenesis of the human inflammatory bowel disease (IBD) 33 .
Our protein-protein interaction analysis using the DE genes in the PB from the cows with focal and diffuse lesions identified a CXCL8/IL8 network in both stages of the disease. In other hand, using the DE genes in the PB from the cows with diffuse lesions a COL1A2 centered network was identified containing a COL1A2-SPARC functional interaction. Interestingly, a SPARC-centered network was expressed more strongly in Mycobacterium bovis-challenged MDM from bovine TB infected cows that in uninfected cows 44 . SPARC, also known as www.nature.com/scientificreports www.nature.com/scientificreports/ osteonectin, is a matrix protein that binds collagen, and it is required for the development of granuloma-like structures during chronic infections 45 . In our study, SPARC and COL1A2 were upregulated in the PB gene expression profiles from the cows with diffuse lesions which suggests that the expression of both proteins could lead to a bad prognosis in MAP-infected cows.
Recently, metabolomic analysis yielded a clear separation between non-infected and MAP-infected cattle, indicating changes in general metabolism, nutrition uptake and energy balance during the early stage of the disease 46 . Using RNA-Seq, we identified three metabolic pathways significantly enriched in the ICV from the cows with focal lesions including the N-Glycan biosynthesis (map00510, P = 0.004), purine metabolism (map00230, P = 0.015) and the one carbon pool by folate route (map00670, P = 0.021). Altered concentrations of N-glycans, such as mannose, were previously observed by the Buck et al. early after MAP infection 46 . In addition, it has been recently proposed that Mycobacterium tuberculosis infection manipulates the glycosylation machinery and the N-Glycoproteome of human macrophages 47 . The metabolic analysis performed using the DE genes in the ICV from the cows with diffuse lesions revealed enrichment of the branched amino acids (BCAA; valine, leucine and isoleucine) degradation (map00280, p = 0.008) and of the purine metabolism (map00230, p = 0.003). The degradation of the BCAA in advanced stages of the infection is consistent with the fact that PTB eventually leads to intestinal malabsorption and hypoproteinemia in the final stages of the infection 4 . Interestingly, our study identified the purine metabolism as a common enriched route in the ICV gene expression profiles of MAP-infected cows regardless of the type of lesion. In agreement with our findings, a previous study showed that MAP infection increased intracellular ATP in bovine monocytes 48 . Using the STRING database, the bile secretion metabolic route (bta04976) was significantly enriched in the ICV gene expression profiles from the cows with focal lesion (enrichment score = 244.99, FDR = 0.001). In the ICV gene expression profile from the cows with diffuse lesions, the cholesterol and the vitamin digestion and absorption routes were significantly enriched, respectively. In agreement with these findings, recent reports have demonstrated that MAP is able to manipulate the host lipid metabolism and accumulate cholesterol within macrophages which might favor MAP persistence 49 . In addition, decreases in serum 25-hydroxyvitamin D 3 (25OHD 3 ) levels were significantly lower in cows in the clinical stage of disease compared with either cows in the subclinical stage and non-infected control cows 50 . It is generally accepted that cattle in advanced stages of MAP infection are unable to metabolize and absorb vitamin D 3 across the intestinal wall, and therefore suffer from weight loss and cachexia.
This study is the first to simultaneously describe transcriptomic changes in PB and ICV samples of cattle naturally infected with MAP. Several important observations were made. First, by comparing the transcriptomic profiles of two MAP-targeted tissues we described both unique and overlapping changes in the transcriptome of the infected cows versus the control group. Second, our study highlighted the ability of the RNA-Seq technology to reveal roles for genes that have not been previously implicated in the host response to MAP infection. Third, our transcriptomic analysis provided potential biomarkers for the development of future diagnostic tools, vaccines and therapeutics.

Materials and Methods
ethic statement. Experimental procedures performed on the animals used in this study were approved by the Animal Ethics Committee of the Servicio Regional de Investigation y Desarrollo Agroalimentario (SERIDA) and authorized by the Regional Consejeria de Agroganadería y Recursos Autoctonos of the Principality of Asturias (approval code PROAE 29/2015). All the procedures were carried out in accordance with the European Guidelines for the Care and Use of Animals for Research Purposes (2012/63/EU). PB and fecal samples were collected by trained personnel and in accordance with good veterinary practice.
Animal population. Fourteen Holstein Friesian cows from a single commercial dairy farm in Asturias (Spain) were monitored annually by obtaining PB and fecal samples to assess disease status by ELISA, and fecal bacteriological culture and PCR. The mean prevalence of the disease in the farm estimated by ELISA was 6.29% in the sampling period (2016-2018).

tissue and fecal sampling for histopathology, immunohistochemistry and bacterial culture.
For histopathological analysis, samples from ileocecal lymph nodes, distal jejunal lymph node, ICV, and distal jejunum were collected aseptically from each animal and placed in formalin. The collected samples were fixed in 10% neutral buffered formalin, and dehydrated through graded alcohols and xilol before being embedded in paraffin wax. Several sections were cut from each tissue sample using a microtome Leica RM2035 (Leica Mycrosystems, Barcelona, Spain), mounted on treated microscope slides (Fisher Scientific Co) and subsequently stained with hematoxylin-eosin (HE) and Ziehl-Neelsen (ZN). The stained sections were examined under a microscope Olympus BX51 equipped with an Olympus U-CMAD3 digital camera for pathological lesions and for the presence of acid-fast bacteria (AFB). According to their location and extension, inflammatory cell type, and Mycobacterial load, PTB-associated histopathological lesions were classified as focal (focal and multifocal), and diffuse (paucibacillary, intermediate or multibacillary) 3 .
For immunohistochemistry, the Avidin-Biotin kit with peroxidase-based detection was used (Vector Laboratories, California, USA). Briefly, ICV sections were deparaffinised, hydrated and rinsed with tap water. Afterwards, slides were treated to quench the endogenous peroxidase by incubation with methanol containing 3% H 2 O 2 for 10 min at room temperature and washed with water for 10 min. The tissue sections were incubated overnight at 4 °C with an anti-ITLN2 polyclonal antibody (Aviva Systems Biology, San Diego, California), washed with TBS and incubated with a goat biotinylated anti-rabbit IgG (1:200 dilution). The sections were then incubated with VECTASTAIN Elite ABC Reagent for 30 min at room temperature. Finally, the sections were incubated with 3,3-diaminobenzidine tetrahydrochloride (DAB, Sigma, St. Louis, MO, USA) for 5 min and washed with TBS. The stained slides were dehydrated, mounted and studied under light microscopy.
For bacteriological culture, a pool (2 gr) of ileocecal lymph nodes, distal jejunal lymph node, ICV, and distal jejunum were decontaminated with 38 mL of hexa-decyl pyridinium chloride at a final concentration of 0.75% (Sigma, St. Louis, MO) and homogenized in a stomacher blender. After 30 min of incubation at room temperature, 15 mL of the suspension was transferred to a new tube and incubated overnight for decontamination and sedimentation. Approximately, 200 µl of the suspension was taken from the layer near the sediment and inoculated into two slants of Herrolds egg yolk medium (HEYM; Becton Dickinson, Sparks, MD) and into two slants of Lowenstein-Jensen medium (LJ; Difco, Detroit, MI), both supplemented with Mycobactin J (Allied Monitor, Fayette, MO) as previously described 51 . Bacterial load in tissues was classified as low (<10 cfu; estimated average 2 cfu/tube), medium (between 10 to 50 cfu, estimated average 20 cfu/tube), or heavy (>50 cfu; estimated average 200 cfu/tube). At the time of slaughter, feces were taken from the rectum of each animal and processed within 48 h after arrival at the laboratory. The fecal samples (2 g each) were decontaminated, blended in a stomacher, and cultured in HEYM and LJ, as previously described for tissue culture. fecal quantitative real-time pcR (qpcR). PCR-positive samples were also tested by real-time qPCR using the ParaTB Kuanti-VK kit following the manufacturer's instructions (Vacunek, Bizkaia, Spain). The kit uses a F57 TagMan probe labeled with the fluorescent reporter dye 5-carboxyfluorescein (FAM) at the 5′end and primers that specifically amplify the single-copy F57 insertion sequence of MAP. Inhibition of the amplification reaction is ruled out by including in the master mix an internal plasmid control with specific primers and an internal hybridization probe labeled with 6-carboxy-4′,5′-dichloro-2′,7′-dimethosyfluorescein succinimidyl ester (JOE) at the 5′-end. This internal amplification control molecule is co-amplified alongside the F57 diagnostic target in a duplex format. Quantification of MAP titer (F57 copy numbers per gram of feces) was accomplished by preparing a standard curve using serial dilutions of a standard sample containing a known number of MAP DNA copies. Real-time qPCR amplifications were performed using the Step One Plus detection system (Applied Biosystems, Carlsbad, CA) with the following conditions: 1 cycle at 95 °C for 10 min, 45 cycles of denaturation at 95 °C for 15 s, and annealing/extensions at 60 °C for 60 s. The results were analyzed with the ABI Prism software version 1.4.
RnA extraction and RnA-Seq library preparation. At the time of slaughter, PB samples were collected from the coccygeal vein of all the cows included in the study in PAXgene Blood RNA tubes (2.5 ml) (Qiagen, Hilden, Germany). Total RNA was extracted from the PB samples using the PAXgene blood RNA kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). For RNA isolation, 150-200 mg of ICV of all the animals were harvested and immediately submerged in 2 ml of RNA later (Sigma, St. Louis, MO). Purification of RNA was performed using the RNeasy Mini Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Residual genomic DNA was removed using DNase digestion with RNase-free DNase I Amplification grade following the recommended protocol (Invitrogen, Spain). Concentration and quality of the total RNAs were measured using an Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, US). All samples had an RNA integrity value of 7 or greater. Approximately 250 ng of RNA were used for each RNA-Seq library creation using the Illumina NEBNext ® Ultra Directional RNA Library preparation kit following the manufacturer's instructions (Illumina Inc, CA, US). All RNA-Seq libraries were quantified using a Qubit ® Fluorometer and doubled stranded DNA high Sensitivity Assay Kit (Invitrogen, Spain). RNA-Seq libraries quality was assessed using an Agilent Bioanalyzer and Agilent high sensitivity DNA chip to confirm that the insert sizes were 200-250 bp for all the individual libraries.
RnA sequencing and bioinformatic analysis of RnA-Seq data. RNA-Seq libraries were single-end sequenced in a 1 × 75 format using an Illumina NextSeq500 sequencer at the Genomic Unit of the Scientific Park of Madrid, Spain. The raw reads were filtered by their length (minimum size 60 bp long) and percentage of ambiguous base N less than 5% using Prinseq-lite 52 . Trimmed reads were subsequently mapped to the Bos Taurus reference genome (Bos_taurus.UMD3.1. version 87) with TopHat mapper 53 . Reads were assigned to a gene if they were not multi-hit reads. The resulting alignment files were provided to Cufflinks to generate a transcriptome assembly for each condition. These assemblies were then merged together using Cuffmerge, which is included in (2019) 9:14845 | https://doi.org/10.1038/s41598-019-51328-0 www.nature.com/scientificreports www.nature.com/scientificreports/ the Cufflinks package. This merged assembly provided a uniform basis for calculating gene and transcript expression levels in each condition. The reads and merged assembly were fed to Cuffdiff which calculates expression levels and tested the statistical significance of each observed change in expression 54 . The fold change (log 2 scale), P-values and false discovery rates (FDR) for each gene were obtained. The genes with a FDR-adjusted threshold <0.05 were considered differentially expressed (DE) when compared to the control group. To visualize and integrate all the data produced by the Cufflink analysis, CummeRbund was used for cluster analysis and for handling the transformation of Cuffdiff data into R statistical computing environment. RNA-Seq data have been deposited in the NCBI Gene Expression Omnibus (GEO) database under the accession number (GSE137395).
Gene ontology (Go) and metabolic analysis of the De genes obtained using RnA-Seq. To identify GO and metabolic routes involved in MAP infection, genes with a significant DE in each comparison were evaluated using the GOseq R Bioconductor [55][56][57] . To begin the analysis, GOseq quantifies the length bias present in the DE genes by calculating a probability weighting function (PWF) which gives the probability that a gene will be DE based on its length alone. Next, GOseq analysis requires the use of the Wallenius non-central hypergeometric distribution to approximate the null distribution for GO category membership and to calculate representation of GO categories amongst the set of DE genes. Using the Wallenius approximation, P-values for representation of the DE genes in each GO were generated and a P-value threshold <0.05 selected. GO analysis provided categories of genes involved in different biological processes (BP), molecular functions (MF) and those integral for different cell compartments (CC). KEGG pathways enrichment analysis was also performed using the STRING v11.0 58 . protein-protein association networks. The DE genes with log 2 fold changes >2 or < 2 were analyzed using the STRING database v10.5 59 . The STRING database aims to collect, score and integrate all known and predicted protein-protein association data. The basic interaction unit in the STRING database is the functional association, i.e. a link between two proteins that both contribute jointly to a specific function. Network nodes represent proteins and edges protein-protein interactions. For each interaction a combined and final score is computed based on seven evidence channels including neighborhood in the genome, gene fusions, co-occurrence across genomes, co-expression, experimental/biochemical data, association in curated databases, and co-mentioned in PubMed abstracts. Only functional interactions with a high confidence score (>0.7) were included in the networks.

Data Availability
The datasets generated during the current study are available from the corresponding author on reasonable request.