Investigation of the transcriptomic response in Atlantic salmon (Salmo salar) gill exposed to Paramoeba perurans during early onset of disease

Amoebic Gill Disease (AGD), caused by the protozoan extracellular parasite Paramoeba perurans (P. perurans) is a disease affecting Atlantic salmon (Salmo salar). This study investigated the gill transcriptomic profile of pre-clinical AGD using RNA-sequencing (RNA-seq) technology. RNA-seq libraries generated at 0, 4, 7, 14 and 16 days post infection (dpi) identified 19,251 differentially expressed genes (DEGs) of which 56.2% were up-regulated. DEGs mapped to 224 Gene Ontology (GO) terms including 140 biological processes (BP), 45 cellular components (CC), and 39 molecular functions (MF). A total of 27 reference pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and 15 Reactome gene sets were identified. The RNA-seq data was validated using real-time, quantitative PCR (qPCR). A host immune response though the activation of complement and the acute phase genes was evident at 7 dpi, with a concurrent immune suppression involving cytokine signalling, notably in interleukins, interferon regulatory factors and tumour necrosis factor-alpha (tnf-α) genes. Down-regulated gene expression with involvement in receptor signalling pathways (NOD-like, Toll-like and RIG-1) were also identified. The results of this study support the theory that P. perurans can evade immune surveillance during the initial stages of gill colonisation through interference of signal transduction pathways.

Detection of P. perurans DNA. Gill tissue from six naïve control, and six AGD-affected fish were analysed by qPCR for the presence of P. perurans DNA at 0, 4, 7, 14 and 16 dpi. The higher the crossing threshold value (Ct), the less P. perurans DNA that was detected. The qPCR was negative for the naïve control fish at all five sampling time points. The Ct values in the AGD-affected fish at 4 dpi were 34.6 ± 0.47, n = 5 (1 fish had a negative result with no P. perurans DNA detected), at 7 dpi was Ct 31.8 ± 2.7, n = 6, at 14 dpi the Ct was 28.8 + 1.5, and at 16 dpi the Ct was 29.4 + 1.7. The qPCR results indicate an increasing amoebic load, based on the level of P. perurans DNA detected, on the gill from 4 to 16 dpi. Histopathology analysis. Gills were examined microscopically following staining with haematoxylin/ eosin ( Fig. 1). Some small areas of mild epithelial hyperplasia was evident at 4 dpi, while more extensive hyperplasia and lamellar fusion can be seen at 7dpi. Substantial hyperplasia and lamellar fusion with the formation of lamellar vesicles, indicative of AGD, were evident at 14 dpi and 16 dpi.
Mapping RNA-seq reads to reference genome. Information related to the number of reads and alignments to the Salmo salar reference genome ICSASG_v2 (GenBank: GCF_000233375.1) 23 for the 6 fish from each of the 5 sampling time points (0, 4, 7, 14 and 16 dpi) is presented in Supplementary Table S1 online. The sequencing depth ranged from 21.7 to 35.4 M raw reads. The number of clean, paired reads ranged from 19.8 to 32.9 M (88.4% to 94.7%). Assigned fragments in HISAT2 ranged from 18.7 to 31.8 M (91% to 97%) and the number of assigned feature counts ranged from 14.7 to 25.2 M. Sample similarity. Principal component analysis (PCA) was generated to visualise the relationships between all samples, as well as samples by time point (Supplementary Fig. S1 online). The similarity between samples from thirty-six RNA-seq libraries (6 fish × 6 groups, including 2 groups from T0) was analysed. Only one of the control libraries (T0_GCRL) was used in the pairwise comparisons. The variation in the PC1 was 70%, and in PC2 was 4%. From the PCA plot, it was evident that four of the six AGD-affected samples at 7 dpi clustered together, while one sample aligned with the earlier time points, and one sample aligning with the later time points.
Differentially expressed genes (DEGs). Differentially expressed genes were identified at each of the four experimental time points (4, 7, 14 and 16 dpi) relative to naïve control fish (0 dpi). The number of genes differentially regulated in the AGD-affected gill tissue was 19,251 of which 56.2% were up-regulated over the course of the AGD trial. The pattern of gene expression at 4, 14 and 16 dpi identified more genes down-regulated (61.6%, 52.4%, and 54.7% respectively) with the majority of genes having a log 2 fold change (log 2 FC) expression between − 2 and + 2. The pattern of gene expression shifted more toward gene up-regulation at 7 dpi (59.6% vs 40.4%) and a greater increase in the log 2 fold expression with most genes falling between a log 2 FC of − 2 and + 15 ( Validation of RNA-seq data using qPCR. Ten genes from the gill RNA-seq data were selected for qPCR validation across all sampling time points (Table 1). All analysis was carried out on individual fish gill samples (n = 30). The qPCR pairwise comparisons matched those carried out in the RNA-seq study. Genes included acod1: aconitate decarboxylase 1 (also known as irg1), cath1: cathelicidin 1, clra: c type lectin receptor a, cc4: c-c . Some small areas of mild epithelial hyperplasia is evident at 4 dpi, indicated by arrows, (B), also shown at a higher magnification (C). More extensive hyperplasia and lamellar fusion, indicative of AGD, can be seen at 7dpi (D) and is indicated by asterisks. The increasing hyperplasia, lamellar fusion (asterisks) with the formation of lamellar vesicles (arrowhead) is evident at 14 dpi (E). At 16 dpi (F) there is extensive lamellae fusion in adjoining gill filaments (asterisks). Slides were stained with H&E and the scale is indicated on each image.

Figure 2.
Volcano plots illustrating the pattern of differential gene expression over time. Plots represent 4 dpi (A), 7 dpi (B), 14dpi (C) and 16 dpi (D). The X-axis represents the log 2 FC in expression with negative numbers representing down-regulated expression and positive numbers representing up-regulated expression. The Y-axis is the p-value adjusted for or the false discovery rate (FDR < 0.05) to the negative log base 10. The pattern of expression was different at 7 dpi with more genes being differentially expressed, more genes being upregulated and with a higher log 2 FC compared to the other time-points. These genes were selected from the RNA-seq data for validation based on their expression at more than one time point and their involvement in the immune response ( Table 1). The results of the qPCR confirmed that acod1, cc4, il-17a/f1 and nos2, and ptx3 were down-regulated at every time point. Cath1 was down-regulated at 4, 7 and 16 dpi. Two genes, clra and lect2 showed the same pattern of expression being greatly up-regulated at 7 dpi, while being down-regulated at the other time points. Il-8 was down-regulated at 7 dpi and up at 16 dpi. Sap was up-regulated at 4, 14 and again at 16 dpi (Table 1). To use the qPCR data to validate the RNA-seq data, a Spearman correlation analysis was performed using the qPCR log 2 FC average of 6 individual fish for each of the ten genes. The correlation coefficient rho (ρ) and statistical significance (p) was 4 dpi: ρ = 0.91, p < 0.0002, 7 dpi: ρ = 0.89, p < 0.0003, 14 dpi: ρ = 0.78, p < 0.008 and 16 dpi: ρ = 0.91, p < 0.0002.
Gene ontology (GO). GO and pathway enrichment analyses were undertaken to identify the function of genes whose expression was differentially expressed in the gill following exposure to P. perurans. DEGs mapped to 224 Gene Ontology (GO) terms of which 140 were categorised as a biological process (BP), 45 as cellular components (CC), and 39 as having a molecular function (MF). Multiple BP terms were identified for epithelial cell migration, RNA processing, positive regulation of protein catabolic process, canonical Wnt signaling, Genes with a p-value adjusted for a false discovery rate (FDR) < 0.05 were assigned as differentially expressed in DESeq. www.nature.com/scientificreports/ regulation of Type 1 interferon, and the ERK1/ERK2 cascade. The negative regulation of NF-κβ (GO: 0032088) was also identified. Enriched ontology clusters were generated in Metscape 24 at each time point (4, 7, 14 and 16 dpi) for terms associated with both down-regulated and up-regulated genes. In the top 20 ontology clusters enriched with down-regulated genes (Fig. 4), the reactone gene set R-DRE: 168256: Immune system was identified at 7 dpi (Fig. 4B) while in the top 20 ontology clusters enriched with up-regulated genes, GO: 0002768: immune response-regulating cell surface receptor signalling pathway was identified at 4 dpi ( Fig. 5A) and 16 dpi (Fig. 5D), with GO: 0002682: Regulation of immune system process identified at 14 dpi (Fig. 5C). The oxidation-reduction biological process (GO: 0055114, − log10(P) 13.82) was identified as being the most enriched for up-regulated genes (108 genes) (Fig. 5B). The top 20 down-regulated genes and the top 20 up-regulated genes at each for the 4 time points (4 dpi, 7 dpi, 14 dpi and 16 dpi) can be found in Supplementary Table S2 online and Supplementary Table S3 online, respectively.

Mucin gene expression.
As AGD leads to the production of excessive mucus on the gills, the DEGs dataset was also interrogated for the expression of additional mucin gene expression (Supplementary Table S7 online).
Mucins identified in the current study included muc-2-like, 5AC-like, 7-like, 12-like 13-like, 17-like and an intestinal mucin-like gene, with a possible similarity to muc-2. Interestingly, there were 5 paralogues of muc-2-like identified in this study at the earlier times of 4 dpi and 7 dpi, where all but one (Gene ID 106584523) was upregulated (log 2 FC 3.0). Two paralogues of muc-5AC-like were present, with the expression of one (GeneID1 106612949) at 4 dpi up-regulated, while the other paralogue (GeneID1 106577588) was present at 7 dpi where the expression was down-regulated. One gene for muc-7-like was consistently down-regulated at every time point. The expression of muc-12-like was up-regulated (log 2 FC 0.9) at 14 dpi, while the expression of a muc-13like (GeneID 106579100) was substantially up-regulated (log 2 FC 7.8) at 7 dpi.
Immune gene up-regulation. There were 14 immune-related genes up-regulated at times other than 7 dpi associated with various biological processes (  www.nature.com/scientificreports/ Ikzf1.7, kita), GO: 0002768: immune response-regulating cell surface receptor signalling pathway (cd79b, kita, themis2). Genes were also associated with the Reactome gene set R-DRE-983695: Antigen activates B-cell receptor (BCR) leading to generation of second messengers (btk, cd79b, dapp). Immune activation was evident from 7 dpi with complement activation (DRE-166658, FDR 0.0178) involving ten genes (crp, c6, c8a, c8b, c8g, c9, cpb2, hbl3, masp1, masp2) and is consistent with what was found in the proteome analysis of AGD-affected gill in our laboratory 26 . The most up-regulated gene was mannan binding lectin serine peptidase 2 (masp2, 30.0 log 2 FC) gene ID 106572058). Screening of the top 100 genes identified thirteen other complement transcripts also up- . Also up-regulated at 7 dpi were genes involved in the acute phase response, an early innate immune function that is initiated by inflammatory signals, leading to the release of acute phase proteins into the bloodstream to re-establish homeostasis following microbial infection 27 . Genes involved in the acute phase response (APR) markedly up-regulated at 7 dpi in the AGD affected gill included pentraxin, alpha-1-antitrypsin, alpha-2-macroglobulin, fibrinogen (α, β, γ), leukocyte cell-derived chemotaxin-2 (lect2), C-type lectin receptor (clra), and serum amyloid P-component (sap) (data not shown).

Discussion
Despite the growing number of gene expression and proteomic studies being carried out on AGD-affected salmon (reviewed in Marcos-Lopez & Rodger 2020) 4 , there are still knowledge gaps remaining regarding host immune response in early stage disease. The progression of AGD on the gill can be monitored visually and scored based on the absence or presence of mucoid patches and their distribution on the gills 6 . In the current study, AGD mucoid lesions were not evident on the gills in AGD-affected groups at the earlier time points (4 and 7 dpi) and the gill score in the AGD-affected fish remained less than 2 by day 16. Although it is known that long term cultivation of P. perurans results in the loss of parasitic virulence in vivo 28 , the amoeba used in the current study were twelve weeks in culture prior to use in the challenge trial. The progression of the disease based on gill scores observed raised a concern in relation to the virulence of the pathogen. However, at the first sampling point (4 dpi) P. perurans DNA was detected by qPCR and microscopically, there was evidence of epithelial hyperplasia. Amoebic virulence was therefore shown to be adequate and the rate of disease progression appropriate in order to investigate gene expression at early disease stages. Moreover, at 4 dpi, 4838 genes were differentially regulated with 2985 genes down-regulated and 1853 up-regulated. By 7 dpi the pattern of gene expression changed, with more genes up-regulated and to a greater extent (Fig. 2). In addition to hyperplasia, fusion of the lamellae was now evident, indicative of AGD in the absence of visible mucoid patches. At 14 dpi and 16 dpi, the pattern of gene expression returned to that which was seen at 4 dpi. Principal component analysis of the 7 dpi time point showed four of the six AGD-affected samples clustered together and were clearly different from the other fish Table 2. Relative gene expression of Interleukins in AGD-affected gill. GeneID: Salmo salar gene identification number, chr: chromosome location, p-adj: P-values adjusted for false discovery rate, Log 2 FC: log of gene expression fold change to base 2, up-regulated genes are highlighted with bold text, dpi: days post inoculation. www.nature.com/scientificreports/ sampled. With the other samples in the 7 dpi AGD-affected group, one sample clustered with the pre-7 dpi samples, while the other sample clustered with the post-7 dpi samples. The fish were exposed to the P. perurans by immersion. In comparison to intraperitoneal injection methods, the number of amoeba colonising the gill of each fish cannot be controlled, so this finding was not unexpected and highlights the importance of using individual instead of pooled samples for fish studies, and having a minimum of 6 fish per group for RNA-seq AGD studies 29 . For this study, the same gill (gill 2) was taken from each fish with no pre-selection for the presence or absence of mucoid lesions. The RNA-seq data was validated using qPCR analysis of a panel of immune genes. Down regulated genes at each time point included acod1, cc4, il-17a/f1 and nos2, and ptx3 with cath1 being down-regulated at 4, 7 and 16 dpi. Acod1, also known as immune responsive gene 1 (irg1) is a gene coding for an enzyme producing itaconic acid through the decarboxylation of cis-aconitate. The expression levels of acod1/irg1 correlates to itaconic acid production, an indicator of inflammation activation in mammals 30 . The acod1 gene is emerging as a regulator of immuno-metabolism in inflammation and infection 31 . In the current study, the pattern of expression of acod1/irg1 and nos2 were similar with both genes being consistently down-regulated. Furthermore, both nos2 and acod1 genes were also shown to have involvement in the biological process 'GO: 0072593: Reactive oxygen species metabolic process' containing 7 down-regulated genes (nos1, ncf1, cybb, nos2a, noxa1, hace1, acod1) at 7 dpi. The lack of an increase in nos2 expression in the current study, is consistent with the previous finding that there was no detectable increase in the expression of iNos (nos2) between healthy and AGD-affected fish during early disease stage 15 . In a previous study, the differential expression of the superoxide dismutase (sod) and catalase (cat), two genes involved in oxidative stress, were reported to be up-regulated in late stage AGD 20 . SOD catalyses the dismutation of the superoxide (O 2 − ) into O 2 or H 2 O 2 while CAT plays a role in protecting cells from oxidative damage. In the current study, at 7 dpi there were 108 genes identified as being up-regulated in GO: 0055114, the oxidation-reduction biological process (Fig. 5B). Three sod genes were found to be up-regulated including soluble sod1, (GeneID: 30553), the mitochondrial sod2 (GeneID: 335799) and the extracellular sod3b (GeneID: 794006). In the same study, cat gene expression was reported as being significantly upregulated in gill lesion tissue with gill score 2. In the current study, catalase was not found to be up-regulated prior to the appearance of lesions on the gill, at 7 dpi, further confirming this gene as a possible lesion specific marker for AGD. The up-regulation of genes involved in the oxidative-reduction biological process during early stage AGD will be further investigated in a future study. TGF-β1 is a potent suppressor of nos2 by multiple mechanisms in numerous cell types, including macrophages 32 . An up-regulation of tgf-β isoforms (β1, β 2, induced) was evident in the current study from 7 dpi. TGF-β can promote IL-17 cell differentiation however the expression of Il-17a/ f1 was consistently down-regulated. CC4 is a chemokine for natural killer cells, monocytes and a variety of other  www.nature.com/scientificreports/ immune cells 33 . LECT2 is a chemotactic factor attracting neutrophils to the site of infection. The genes lect2 and clra showed the same pattern of expression being substantially up-regulated at 7 dpi, while being down-regulated at the other time points. Indeed it is the interaction between LECT2 and C-type lectin receptor (CLRA) proteins that is thought to be responsible for the ''neutrophil-chemotactic'' characteristic of LECT2 34 . Furthermore, Lect2, along with ptx and sap are acute phase response genes. Interestingly, the expressions of ptx and sap were not as expected. While ptx (GeneID 106581433) was found to be down regulated, and sap (GeneID 106604759) upregulated approximately twofold, paralogues of sap (GeneID 106608633) and ptx (GeneID 100136583) were both substantially up-regulated at 7 dpi. Previous findings reported no differential gene expression of an SAP pentraxin in AGD-affected gill having visible mucoid lesions 13 . It is noteworthy that Il-8, also a chemotactic factor with similar functions to lect2, did not show a similar expression trend, and was down-regulated at 7 dpi, and up-regulated at 16 dpi. From the RNA-seq data set, down-regulated immune genes at 7 dpi were identified in the Reactome gene set R-DRE-68256: Immune system, which were associated with NOD-like receptor (NLR), Toll-like receptor (TLR) and RIG-1 like receptor signaling pathways, and also cytokine and interleukin signalling. NOD-like receptors (NLRs) can initiate or regulate host defence pathways through formation of signalling platforms that subsequently trigger the activation of inflammatory caspases and NF-κβ 35 . A recent study of advanced stage AGD (day 21, gill score 3.3) reported activation of such defence pathways, with upregulation of genes mapped to NLR, TLR signalling, and Herpes simplex virus 1 infection pathways 36 . The current study also identified NLR and TLR signalling pathways, but enriched for down-regulated genes at 7 dpi. Involvement of the Herpes simplex virus 1 infection pathway suggests a viral-like response to parasite invasion in AGD 36 . In the current study the Herpes simplex infection pathway (dre05168: FDR 0.00015) was also enriched with down-regulated genes at 7 dpi, suggesting that the anti-viral response is either not activated, or suppressed, in early stage AGD, a theory that is further supported by down-regulation of genes associated with the biological process GO: 0060337: Type 1 interferon signaling pathway, and interferon regulatory factors. Other pathways enriched with down-regulated genes at 7 dpi included B-cell receptor (BCR) signalling, T-cell receptor (TCR) signalling, the differentiation of Th1/Th2 cells, and Th17 cells. Interestingly, these signalling pathways had 2 genes in common, ikbkb and mapk3. Host invasion by pathogens frequently induces activation of NF-κβ, which plays an important role in initiation of innate immune responses by regulating the expression of many immunological mediators, including chemokines, cytokines, adhesion molecules, and enzymes that produce secondary inflammatory mediators 37 . Based on the pattern of gene expression in the current study, specifically at 7 dpi, P. perurans may have developed an immune evasion strategy to prevent the activation of NF-κβ during the early onset of AGD. Of particular interest is the down regulation of genes involved in NOD-like receptor pathway, perpetuating the inhibition of NF-κβ by Iκβ. Pathogens have previously been reported to have developed strategies to circumvent the activation of the NF-κβ activation, by preventing the inhibitor, Iκβ, from being ubiquitinated and therefore preventing its degradation, causing NF-κβ to remain sequestered in the cell cytoplasm and therefore inactive 38 . Indeed, some viruses for example vaccinia viral protein B14, encode virulence factors, to target IKKβ to inhibit the NF-κβ-mediated antiviral immune response 39 . Therefore it is conceivable that P. perurans virulence factors could potentially have some immunomodulatory effects on their host.
With regard to the down-regulation of genes with involvement in cytokine and interleukin signalling, interrogation of the DEG dataset identified multiple transcripts from nine interleukin (il) gene families with differential expression in AGD-affected gills, with one gene il-17a/f1-like (Gene ID 106600843) was found to be down-regulated at every time point. Only two interleukins were found to be substantially up-regulated, il-27β at 7 dpi, (12.4 log 2 FC), and one of 2 paralogues of il-8 (Gene ID 106577833) at 16 dpi (2.4 Log 2 FC). The up-regulation of Il-1β has been reported as the hallmark of late stage AGD infection 17 associated with larger AGD-lesions. In these cases the observed increase in mucous cell hyperplasia has led to the contention that mucous cells are the potential source of il-1β 13 . Multiple paralogues of il-1β were identified in the current study with down-regulated expression seen at 4 dpi, 7 dpi and 16 dpi, suggestive of an impaired inflammatory response. Two paralogues of Il-12β-like, the p40 subunit for IL-12 and IL-23 were also down-regulated at 7 dpi. IL-12 is a growth factor for Th1 40 while IL-23 is involved in Th17 cell differentiation 41 . IL-17 is a key cytokine produced by Th17 cells, involved in the inflammatory and neutrophil response. IL-17A is produced mainly in T cells, whereas IL-17F is produced in T cells, innate immune cells, and epithelial cells 42 . The role of IL-17F is mainly in the mucosal host defence mechanisms 43 . A recent study reported the expression of Il-17a/f1b and Il-17d to be significantly down-regulated in comparison to the negative control in gills from fish inoculated with a high concentration of P. perurans trophozoites (5000 amoeba /L) 12 . In the current study where fish were infected with 2750 amoeba/L, Il-17a-like was also found to be down-regulated at 4 dpi, while il-17a/f1 was consistently down-regulated at every time point. Interleukins involved in inducing IFN-γ, which promotes Th1 cell differentiation, Il-15 and Il-18, were also down-regulated in the current study. Interestingly, the expression of Il-27β, a component of both interleukins IL-27 and IL-35, was substantially up-regulated at 7 dpi (12.4 log 2 FC). IL-27 is unique in that although it induces Th1 differentiation, it can antagonise the development of the Th17-cell response and limit Th-17 driven inflammation 44 , critical for host defence against bacterial, fungal and viral infections at mucosal surfaces 40 while IL-35 induces proliferation of Treg cell populations but reduces activity of Th17 cell populations 45 . In addition to the down-regulation of interleukins, numerous ifn receptors and regulatory factors were down-regulated throughout the study, with the exception of irf2-like and irf4-like which were both up-regulated at 14 dpi. The down-regulation of tnf-α was also evident.
Mucins are high molecular weight glycoproteins secreted by goblet cells and the main structural component of mucus. Excessive mucus production in the gills is a hallmark of AGD 3 with substantial up-regulation of the secreted MUC5AC detected in clinical AGD 14 . Specialized epithelial (goblet) cells are the major source of MUC5AC, which can be induced by MMP9 through the activation of the epidermal growth factor receptor (EGFR) and mitogen-activated protein kinase 3/2 MAPK 3/2(ERK1/2) cascade 46 Table S7 online). Two paralogues of muc-5AC-like were found to be differentially expressed, with one paralogue at 4 dpi up-regulated, while the other paralogue at 7 dpi was down-regulated. Interestingly, the expression of a muc-13-like was found to be substantially up-regulated (log 2 FC 7.8) at 7 dpi. Muc-13 is associated with mucosal immunity, and is an epithelial and hemopoietic cell surface mucin that protects against inflammation and may also play a role in cell signaling 47,48 . Muc-13 has also been reported as a quantifiable host marker of plasmodium parasite infection which could potentially be used to distinguish infected from uninfected cells 49 . A recent RNAseq study also reported the differential expression of a mucin-13-like gene in the head-kidney of advanced stage AGD-affected fish but not the gill 36 . Further investigation is warranted to determine the role of muc-13-like in the various tissues of AGD-affected fish and if the expression of this mucin is triggered through P. perurans cell attachment.
The present study provides the initial discovery and description of cytokine down-regulation during the amoeba attachment phase of early-stage AGD in the gill of Atlantic salmon, and provides the basis for future, more in-depth studies to elucidate the mechanisms behind this potential immune evasion strategy.

Methods
Study design. This early stage AGD study was designed to compare the transcriptomic profile from individual gill tissue from 6 fish at 4 time points (4, 7, 14 and 16 dpi) to 6 uninfected control fish (0 dpi). As the 0 dpi time point was critical, it was decided to include two groups of 6 control fish (T0_GCRL, T0_GAGD) in the RNA sequencing. As all six fish in both control groups had a similar gene profile, the T0_GCRL group was subsequently selected for the data analysis. The experiment was designed to establish an AGD challenge with a type I error of 5% assuming a success rate of 80% (power analysis) and also taking into consideration a recent paper on biological replicates for RNA-seq 29 .
Fish husbandry. Atlantic salmon smolts (78.8 ± 20.6 g) were sourced from a local freshwater hatchery on the west coast of Ireland and transported to the Marine and Freshwater Research Center, at the Galway Mayo Institute of Technology. Fish are were distributed into 8 circular black 1 m −3 L tanks (45 fish/tank) connected to recirculating aquaculture systems at a stocking density of 3.6 kg m −3 , water temperature 12 °C, artificial seawater (30ppt) and a 12 h/12 h light/dark cycle. Fish were acclimatized for 6 days prior to the start of the trial. Fish were fed a commercial salmon diet (Le Gouessant) at 1% body weight per day. The in-vivo fish trial was carried out according to the ARRIVE guidelines for animal research 50  Paramoeba perurans isolation and culture. Paramoeba perurans trophozoites were collected by gill swabbing from AGD infected Atlantic salmon on a commercial salmon farm on the west coast of Ireland. Amoebae were cultured on marine yeast agar plates (MYA; 0.01% malt, 0.01% yeast, 2% Bacto Agar), 16 °C overlaid with 7 ml sterile sea water 51 , and sub-cultured weekly by transferring free-floating cells to fresh MYA plates. Confirmation of P. perurans identity was performed using qPCR as previously described by Downes (2015) 5 . The amoeba was in culture for 12 weeks prior to challenge. Paramoeba perurans challenge. Six MYA agar plates were scraped gently and the sterile seawater (SSW) overlay and cells collected into a 2 × 50 ml tubes. The volume of each tube was brought up to 40 ml using SSW at 16 °C, centrifuged at 800×g for 10 min at 16 °C. The supernatant, with the exception of the last 5 ml was removed from each tube. The pellet was re-suspended in the remaining 5 ml and then both 5 ml were combined into one 10 ml concentrate of amoeba. In a 15 ml tube, a 1:40 dilution was made using 250 µl of amoeba into 9.75 ml SSW. The tube was closed and inverted several time to ensure proper distribution of the amoeba. The diluted amoeba (1 ml) were placed on a Sedgewick rafter counter chamber. The cells were allowed to settle on the chamber for 10 min prior to counting. Counts were performed in triplicate. After an acclimatisation period of 6 days, 90 fish were challenged with P. perurans (2750 amoebae/L) in 300 L for 4 h in artificial seawater (30ppt), with oxygen saturation, water temperature (12 °C), and fish behaviour and welfare closely monitored. The 90 control fish were also held at 300 L for 4 h under the same conditions without P. perurans. Following the challenge, fish were returned to their respective tanks to be held under the same conditions as prior to challenge. Disease progression. Clinical symptoms of AGD were determined by macroscopic examination of the intact gills in euthanised fish and were scored according to Taylor et al. 6 . Prior to sampling, 3 fish were taken from each duplicate control and AGD-affected tanks, and all 6 fish placed in an anaesthetic bath. Naïve control fish (6) and experimental fish (6) were sampled at each time point (0, 4, 7, 14 and 16 dpi) and all fish were gill scored.
Sample collection. The current study was carried out in parallel with another study investigating the proteomic expression in early stage AGD from fish (6 control, 6 AGD-treated) sampled at 9 time points over 16 days (0, 1, 2, 3, 4, 7, 9, 11, 14, 16 dpi) 26 . Prior to sampling, 3 fish were taken from each duplicate control and AGD-affected tanks, and all 6 fish placed in anaesthetic at the same time. Fish were euthanised by overdose of anaesthetic (400 mg L −1 tricaine methane sulfonate) and the gills examined and scored based on the absence (0) www.nature.com/scientificreports/ or presence (1 to 5) of the mucoid patches on euthanised fish 6 . Prior to the tissue sampling, the fish were bled from the caudal vein. To minimize blood contamination in gill tissue, prior to extraction, the gills were perfused by injection of 2 ml phosphate buffered saline (PBS) into the heart. For RNA sequencing the entire second left gill, irrespective of the presence of absence of mucoid lesions, was removed with the arch and placed in 300ul of RNAlater (Ambion Inc, Austin, Texas), stored overnight at 4 °C before being transferred to − 80 °C. For histopathology, the first right gill with arch was fixed in 10% buffered neutral formalin solution prior to being routinely processed and embedded in paraffin wax blocks. Gill tissue sections (5 µm) were stained with haematoxylin and eosin (H&E), and examined using an Olympus BX41 Microscope and CellSens software (Olympus, Tokyo, Japan) and imaged. For amoeba identification, the fourth right gill with the arch was removed from individual fish and collected into 300ul of RNAlater (Ambion Inc, Austin, Texas) prior to DNA extraction (DNeasy Blood and Tissue Kit, Qiagen) from individual gills. qPCR analysis was performed using a Taqman assay developed for the detection of P. perurans DNA 5 .
Total RNA extraction. Following the removal of the whole, 2nd left gill from RNALater, Total RNA was extracted using the RNeasy Mini Plus Kit (Qiagen, Germany) according to the manufacturer's instructions. Briefly, 30 mg gill tissue from each individual fish gill, irrespective of the presence of absence of mucoid lesions, was homogenized using a bead mill (Fisher Scientific) and 2.8 mm ceramic beads in 350 µl RLT lysis buffer and 1 µl DX antifoam reagent (Qiagen, Germany), 5 pulses/sec, 10 s, repeated 3 times. The optional DNase I (Qiagen, Germany) step was included to ensure complete elimination of gill genomic DNA. RNA was eluted in 70 µL of nuclease -free water and stored at − 80 °C until required. RNA was quantified using the Qubit RNA Assay Kit in Qubit 2.0 Fluorimeter (Thermo Fisher Scientific, USA). RNA integrity (RIN) was assessed using the RNA Nano 6000 Assay Kit of the Bioanalyzer 2100 system (Agilent Technologies CA, USA). Total RNA with RIN ≥ 8.0 or higher were used for library construction.
Library construction and transcriptome sequencing. Library construction and transcriptome sequencing was outsourced (Novogene, Cambridge, UK). Total RNA from 6 control fish (0 dpi) and 6 fish from 4, 7, 14 and 16 dpi dpi was used for the construction of 30 RNA sequencing libraries using NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, USA) according to manufacturer's instructions. Index codes were added to attribute sequences to each sample. The clustering of the index-coded samples was performed on a cBot Cluster Generation System using HiSeq PE Cluster Kit cBot-HS (Illumina) according to the manufacturer's instructions. After cluster generation, library preparations were sequenced on an Illumina Hiseq 2000 platform and 125 bp/150 bp paired-end reads were generated. FastqQc (Version 0.11.8) (www. bioin forma tics. babra ham. ac. uk/ proje cts/ fastqc/) was utilised for quality assessment of reads from each sample and Multiqc (Version 1.7) 52 was used to visualise all FastQc results. No samples were identified as being bad quality, and all samples were included in the next step of the workflow process. Trimmomatic (v0.36) 53 was used to trim paired reads in FASTQ files, using default parameters for paired-end mode and a minimum read length of 50 bp. Differential expression analysis. Differential expression analysis was performed on total RNA from 6 fish from 4, 7, 14 and 16 dpi using the DESeq2 (Version 1.24.0) 54 where each time point was compared to 6 fish from time 0, naïve, pre-AGD samples. DESeq2 provided statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. The resulting p values were adjusted to control for the false discovery rate (FDR) 55 . Genes with an adjusted p-value (p-adj) of < 0.05 were determined to be differentially expressed. No minimum log 2 FC threshold was assigned as a cut-off value for differential expression. Four lists of DEGs were generated: T0 vs T4 (4 dpi), T0 vs T5 (7 dpi), T0 vs T8 (14 dpi) and T0 vs T9 (16 dpi) down. Each list was further divided into down-regulated and up-regulated genes resulting in a total of eight DEGs lists. Volcano plots were created to visualise the DEGs at individual time points (Fig. 2).
Principal component analysis (PCA) plots were generated to visualise the relationships between all samples, as well as samples by time point. The similarity between samples from thirty RNA-seq libraries (6 fish × 5 groups) was analysed ( Supplementary Fig. S1 online). An additional T0 group (6 individual fish) was sent for sequencing as an added precaution as this time point was considered critical to the success of the study. The RNA sequencing from all six fish in both control groups (T0_Ctrl, T0_AGD) was similar with regard to the total number of reads per fish and also in the total number of reads mapped (T0 Ctrl: 58312820 with 88.42% mapped, T0 AGD: (preinoculation) 56224414 with 88.03% mapped). A pairwise comparison between T0_Ctrl and T0_AGD identified only 1 DE gene between the 2 groups. Subsequently, the T0_GCRL fish (n = 6) were selected for the pairwise analysis of the RNA-seq data to the other time points.
Pathway and process enrichment analysis. Gene Ontology (GO) 58,59 and Kyoto Encyclopedia of Genes and Genomes (KEGG) 60 pathway mapping was performed using Metscape 24 . The zebrafish (Danio rerio) database was used to determine GO enrichment as the option to use Salmo salar was not available in Metscape. rerio homologs, was used as the enrichment background. Terms with a p-value < 0.01, a minimum count of 3, and an enrichment factor > 1.5 (the enrichment factor is the ratio between the observed counts and the counts expected by chance) were collected and grouped into clusters based on their membership similarities. More specifically, p-values were calculated based on the accumulative hypergeometric distribution 61 , and q-values were calculated using the Benjamin-Hochberg procedure to account for multiple testings 55 . Kappa scores 62 were used as the similarity metric when performing hierarchical clustering on the enriched terms, and sub-trees with a similarity of > 0.3 were considered a cluster. The most statistically significant term within a cluster was chosen to represent the cluster.
Protein-Protein interaction (PPI). PPI was carried out using STRING (version 11.0, https:// string-db. org) to generate protein association networks based on function for immune genes identified as enriched in KEGG pathways and Reactome gene sets. The default settings were used and Danio rerio selected as the species of interest as STRING does not provide Salmo salar as a species option.
Validation of RNA-seq data using Real-time PCR. A pre-amplification multiplex step was carried out on the target genes using a MiniAmp Plus PCR machine (Applied Biosystems) using the Preamp Master Mix (Cat. No. 100-5581, Fluidigm) as per manufacturer's instructions. The pre-amplified cDNA was treated with Exonuclease I to remove unincorporated primers prior to running on the IFC chip. The PCR assay mix consisted of 0.7 µL of 50 mM primer mix (IDT, Belgium), 3.5 µL of 2X assay loading reagent (Fluidigm) and 2.8 µL of 1X DNA elution buffer (Qiagen, Germany). The sample premix was prepared with 200 µL of 2X SsoFast EvaGreen supermix with low ROX (Bio-Rad, München, Germany) and 20 µL of 20X Binding Dye Sample Loading Reagent (GE 48.48 Dynamic Array™ Sample & Assay Loading Reagent Kit, Item 85000821, Fluidigm), of which 3.85 µL of the sample pre-mix was combined with 3.15 µL of the diluted pre-amplified PCR product for each sample inlet of the IFC chip. The PCR program was 95 °C for 60 s, 30 cycles of 95 °C for 5 s, 58 °C for 20 s followed by a melt curve protocol of 55 °C to 95 °C with a ramp rate of 1 °C/3 s. Each sample was run in triplicate/chip and 3 chips used per validation. Melt curve analysis was preformed to confirm the specificity of the amplified PCR product. PCR amplification efficiency (E) was calculated for each gene of interest and the housekeeping gene by the generation of standard curves using tenfold serial dilutions of the cDNA template (standard curve: R 2 > 0.980, amplification efficiency range 90-105%). Melt curve analysis was used to confirm the amplification of single, PCR products. The relative expression for each gene was calculated using the delta delta CT (ddCT) method 64 by the Real-Time PCR Analysis software on the Biomark HD (Fluidigm, USA).

Statistical analysis.
For pathway and process enrichment analyses, p-values < 0.01 were calculated based on the accumulative hypergeometric distribution, and q-values calculated using the Benjamin-Hochberg correction for multiple testing using Metascape. Kappa scores were used as the similarity metric when performing hierarchical clustering on the enriched terms, and sub-trees with a similarity of > 0.3 were considered a cluster. The most statistically significant term within a cluster was chosen to represent the cluster. For qPCR analysis, the fold change of each gene at each time point was analysed relative to the T0 control using an un-paired t-test with differences considered significant at p < 0.05. Ggplot2_3.2.1 in R studio Version 3.5.1 was used to generate the Spearman correlation data.

Data availability
All data supporting this study are included in the results section, the Supplementary material section online and openly available from the NCBI Geo Database, Accession Number GSE179972.