Genome-scale comparative analysis for host resistance against sea lice between Atlantic salmon and rainbow trout

Sea lice (Caligus rogercresseyi) is an ectoparasite which causes major production losses in the salmon aquaculture industry worldwide. Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss) are two of the most susceptible salmonid species to sea lice infestation. The objectives of this study were to: (1) identify genomic regions associated with resistance to Caligus rogercresseyi in Atlantic salmon and rainbow trout by performing single-step Genome-Wide Association studies (ssGWAS), and (2) identify candidate genes related to trait variation based on exploring orthologous genes within the associated regions across species. A total of 2626 Atlantic salmon and 2643 rainbow trout were challenged and genotyped with 50 K and 57 K SNP panels, respectively. We ran two independent ssGWAS for sea lice resistance on each species and identified 7 and 13 regions explaining more than 1% of the genetic variance for the trait, with the most important regions explaining 3% and 2.7% for Atlantic salmon and rainbow trout, respectively. We identified genes associated with immune response, cytoskeleton function, and cell migration when focusing on important genomic regions for each species. Moreover, we found 15 common orthogroups which were present in more than one associated genomic region, within- or between-species; however, only one orthogroup showed a clear potential biological relevance in the response against sea lice. For instance, dual-specificity protein phosphatase 10-like (dusp10) and dual-specificity protein phosphatase 8 (dusp8) were found in genomic regions associated with lice density in Atlantic salmon and rainbow trout, respectively. Dusp10 and dusp8 are modulators of the MAPK pathway and might be involved in the differences of the inflammation response between lice resistant and susceptible fish from both species. 
Our results provide further knowledge on candidate genes related to sea lice resistance and may help establish better control for sea lice in fish populations.


Results and discussion
The comparative genomics analysis presented here allowed us to identify groups of orthologous genes and several candidate genes among adjacent single nucleotide polymorphisms (SNPs) that explained more than 1% of the genetic variance for resistance to C. rogercresseyi. This is the first study aimed at comparing the genomic basis of sea lice resistance in both Atlantic salmon and rainbow trout.
Sea lice challenge test. There was no difference between the average number of sea lice found on Atlantic salmon and rainbow trout in the experimental challenge (Table 1). An average of 5.9 ± 6.6 and 6.1 ± 4.2 parasites per fish was estimated for Atlantic salmon and rainbow trout, respectively. The maximum number of parasites was 106 in Atlantic salmon and 28 in rainbow trout and, in both species, some animals did not present any parasites. The average weight at the end of the experimental challenge was 278.1 ± 90.3 g (ranging from 104 to 569 g) and 173.1 ± 31.4 g (ranging from 86 to 265 g), for Atlantic salmon and rainbow trout, respectively. Although the average number of parasites was not significantly different between the two species, the difference in the average final weight at the end of each challenge could explain the difference in the maximum number of parasites found (~ 4 times more parasites in Atlantic salmon). The number of parasites counted in each species after the challenge is below the range reported in previous studies. For instance, Ødegård et al. 11 obtained an average of 20.96 ± 19.68, while Robledo et al. 21 reported an average of 38 ± 16 sea lice. The lower number of parasites in comparison to these studies is most likely related to differences in the area of counting (whole body surface versus only fins), parasite species (Lepeophtheirus salmonis versus C. rogercresseyi) and the time of sampling after infestation (8-15 versus 6 days).
To measure resistance to C. rogercresseyi, lice count values were transformed into lice density on the log scale (LogLD), which allows for correction of the number of parasites based on the body weight of each fish 11 . The empirical LogLD distribution for both species is shown in Fig. 1. The range of LogLD for Atlantic salmon was greater than for rainbow trout, varying from − 4.60 to 1.02 and from − 4.06 to 0.02, respectively. The average LogLD was − 2.14 ± 0.9 and − 1.66 ± 0.67 for Atlantic salmon and rainbow trout, respectively, which are similar to those reported in a previous study in a different Atlantic salmon population (between − 1.66 ± 0.73 and − 2.55 ± 0.58) 11 .
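As a minimal sketch of the transformation described above, the following Python function computes LogLD from a lice count and a body weight, following the verbal definition given in the Methods (lice count plus one, divided by the cube root of the squared body weight). The function name and the example values are hypothetical, and the use of the natural logarithm is an assumption, although it is consistent with the LogLD ranges reported here.

```python
import math


def log_lice_density(lice_count: int, body_weight_g: float) -> float:
    """Return the log-scale lice density (LogLD) for one fish.

    LD = (LC + 1) / BW^(2/3), where BW^(2/3) approximates skin surface.
    Assumption: the logarithm is the natural logarithm.
    """
    if lice_count < 0:
        raise ValueError("lice count cannot be negative")
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    lice_density = (lice_count + 1) / body_weight_g ** (2.0 / 3.0)
    return math.log(lice_density)


if __name__ == "__main__":
    # Illustrative inputs only; these are not records from the study.
    for lice_count, body_weight_g in [(0, 250.0), (6, 278.1), (28, 173.1)]:
        value = log_lice_density(lice_count, body_weight_g)
        print(f"LC={lice_count:3d}  BW={body_weight_g:6.1f} g  LogLD={value:.2f}")
```

Adding one to the lice count keeps the logarithm defined for fish without parasites, which both challenges contained.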
Genotyping and genomic heritabilities. A total of 2040 (77.6%) animals, and 45,117 (96.7%) SNPs passed the genotyping quality control for Atlantic salmon. In the case of rainbow trout, 2466 (93.3%) fish and 27,146 (67.4%) SNPs remained for subsequent analyses. In both species, significant genetic variation for resistance to C. rogercresseyi was estimated by using genomic information, with heritability values of 0.19 ± 0.03 and 0.08 ± 0.01 for Atlantic salmon and rainbow trout, respectively (Table 2).  12 , estimated genomic heritability values for resistance against L. salmonis of 0.22 ± 0.08 and 0.33 ± 0.08, while Ødegård et al. 11 , found heritability values of 0.14 ± 0.03 and 0.13 ± 0.03. Similarly, Yañez et al. 10 and Correa et al. 7 estimated heritability values ranging from 0.10 to 0.12 when defining resistance as the total number of parasites found on all fins using pedigree and genomic information, respectively, and Lhorente et al. 9 estimated heritability values of 0.22 ± 0.06 in Atlantic salmon for total count of sessile sea lice per fish, corrected by body weight in the statistical analysis.

GWAS.
In Atlantic salmon, we found genomic regions explaining more than 1% of the genetic variance for sea lice resistance on five different chromosomes (Fig. 2). Two of these chromosomes (Ssa3 and Ssa9) showed two QTL peaks associated with the trait. Only four SNPs located in Ssa3, Ssa11, Ssa14 and Ssa23 were shown to be significantly associated with the trait at a chromosome-wide level. However, none of these QTLs explained more than 1% of the genetic variance of the trait (Supplementary Figure S1). In general, these regions explained a low percentage of the total genetic variation with a maximum of 3% explained by a single locus. The two SNP windows in Ssa3 explained 1% and 1.4% of the genetic variance while those found in Ssa9 explained 1.7% and 3%. Other QTLs found in Ssa6, Ssa20, and Ssa25 explained 1.9%, 1.05% and 1.33% of the genetic variance, respectively. Supplementary Table S1 shows the variance explained by each SNP window in both species.
In the case of rainbow trout, the wssGWAS for LogLD identified 13 regions located in different chromosomes that exceeded 1% of the total genetic variance for the trait (Fig. 3). Similar to Atlantic salmon, these windows explained a low percentage of the total variance with a minimum of 1% and a maximum of 2.7%, for QTLs located in Omy17 and Omy15, respectively. In addition, three SNPs located in Omy3, Omy6, Omy9 showed chromosome-wide significant association with the resistance trait in rainbow trout (Supplementary Figure S2).
Our results suggest that resistance against C. rogercresseyi is mainly under polygenic control (i.e., influenced by several genes with small effects) in both species. These results are in agreement with previous studies on sea lice resistance, where a similar genetic architecture was suggested for resistance against L. salmonis and C. rogercresseyi 7,12,22 . Recently, Robledo et al. 23 described and characterized three QTLs related to sea lice resistance in Atlantic salmon by using GWAS and RNA-sequencing approaches. Since sea lice resistance is a polygenic trait, the acceleration of genetic improvement will most likely be best accomplished by employing genomic selection instead of marker-assisted selection or pedigree-based genetic evaluations. For instance, Correa et al. 12 and Tsai et al. 14 described an increase of over 22% in the accuracy of estimated breeding values (EBVs) using genomic selection over the use of pedigree-based models in Atlantic salmon 24 .
www.nature.com/scientificreports/
Candidate genes. The exploration of the genes within the windows that explained over 1% of the genetic variance for LogLD showed several potential candidate genes that were classified into three groups: related to the immune response, cytoskeleton or metalloproteases. The genes are listed in Tables 3 and 4 for Atlantic salmon and rainbow trout, respectively. A recent study on gene expression with C. rogercresseyi infestation in susceptible and resistant Atlantic salmon indicated that several components of the immune system (inflammatory response, cytokine production, TNF and NF-kappa B signaling and complement activation) and tissue repair are upregulated during infestation 21 . In salmonids, the main response of the immune system to parasites is mediated by T-Helper 1 and T-Helper 2 cells 25 . Thus, genes related to the immune response, either by promoting leukocyte growth or favoring migration or activation, are strong candidate genes.
For instance, in Atlantic salmon we found T-cell activation Rho GTPase-activating protein (tagap) in Ssa6, which participates in the activation and recruitment of T cells by cytokines 26 , and tenascin R (tnr) in Ssa3, which is an extracellular matrix protein present in bone marrow, thymus, spleen and lymph nodes 27 . The latter has been described as having an adhesin function favoring the mobility of lymphocytes and lymphoblasts 27,28 . In rainbow trout, we found candidate genes with similar functions, such as T-box 21 (tbx21), also known as T-bet (T-box expressed in T cells), found in Omy16. This gene belongs to the sub-Tbr1 family 29 , generates type 1 immunity and participates in the maturation and migration of T-helper 1 (Th1) cells, which in turn produce interferon-gamma (IFN-γ). Studies have described T-bet expression in natural killer (NK) cells, dendritic cells and CD8+ T cells 30,31 .
Forkhead box protein N1-like (foxn1), present on Ssa9 of Atlantic salmon, is part of a family of genes widely studied in humans, which are related to various functions including cell growth, lymph node development, and T cell differentiation 32 . It has been proposed that foxn1 has a role in the activation of fibroblast growth factor receptors 32 .
Meanwhile, in rainbow trout on Omy21, serine/threonine-protein phosphatase 2A 56 kDa was identified, which has been described as participating in cell growth and signaling 33 . Robledo et al. 21 recently found that in Atlantic salmon, this protein showed the most significant change in expression between healthy skin and skin where sea lice were attached. In Atlantic salmon, we identified tripartite motif-containing protein 45 (trim45) on Ssa25, which belongs to a large family of proteins present in diverse organisms that can function as ligases and can modify ubiquitin and the interferon-stimulated protein of 15 kDa (isg15) 34 .
Several metalloproteases were found in genomic regions associated with resistance in both species, but for the interest of this study, we focused on GEM-interacting protein, which interacts with rab27a or its effector in leukocytes. Rab is a large family of GTPases responsible for cellular vesicle transport 35 . Deficiency of this molecule is correlated with immune deficiencies due to the malfunction of the cytotoxic activity of T-lymphocytes, natural killer cells and neutrophils 36 .
Considering the importance of cell growth and movement in response to sea lice infestation, the cytoskeleton may play a considerable role in sea lice resistance as well. For Atlantic salmon, genes related to the cytoskeleton, such as epidermal growth factor (egf) in Ssa9, were identified. This gene is part of a superfamily of receptors with tyrosine kinase activity that have been described in a variety of organs, with functions in growth promotion and cellular differentiation 38 , and it could participate in tissue repair by promoting cell growth 29 . In rainbow trout, fibroblast growth factors (fgf11, fgf13), located in Omy10 and Omy29, respectively, are involved in angiogenesis and proinflammatory responses, and were identified as important genes in sea lice resistance in previous transcriptomic studies in Atlantic salmon by Skugor et al. (2009) and Robledo et al. (2018) 21,39 . In addition, ELMO/CED-12 domain-containing protein 1 was identified in Omy10 in rainbow trout. This protein participates in phagocytosis of apoptotic cells, and in mammals, it also has a role in cell migration 40 . Other cytoskeleton-related candidate genes include procollagen galactosyltransferase 1 on Ssa6, collagen alpha-1 (XXVIII) chain-like on Ssa25 and pleckstrin homology domain-containing family H member 1-like on Ssa9 41 . The top ten SNPs that explained the highest variance for sea lice resistance are located on Ssa9 in Atlantic salmon, representing the most important QTL in this species. This QTL harbors the breast carcinoma-amplified sequence 3 (bcas3) gene, which in Atlantic salmon codes for a cell migration factor associated with microtubules that favors cellular mobility 42 . Cell migration is generally induced in response to chemotactic signals, which induce changes in the cytoskeleton and extracellular matrix 43 .
We also found the tripartite motif-containing protein 16-like on Omy15, which is part of the trim superfamily and has functions related to cell differentiation, apoptosis, regulation of transcription and signaling pathways 34 . This gene is similar to tripartite motif-containing protein 45 present on Ssa25. In this region, we also found a locus that codes for interferon-γ 2 (ifng2), which is a cytokine that participates in type 1 immune responses and that favors the presentation of antigens and activation of macrophages 44 . On this same chromosome (Omy15), we also identified putative ferric-chelate reductase 1 (frrs1), which functions in the fixation of iron in teleosts 45 . Robledo et al. 21 identified heme-binding protein 2 (hebp2), which has an iron-binding function, as a gene involved in Atlantic salmon sea lice resistance. Different authors 46,47 have stated that decreasing the availability of iron can be part of a nutritional defense mechanism against sea lice infestation.

Comparative genomics. The comparative genomic analyses performed show regions of synteny between chromosomes associated with sea lice resistance in Atlantic salmon and rainbow trout (Fig. 4). Thus, there are homologous regions which are associated with the trait and share similarities between chromosomes from both species. However, there were no obvious shared regions associated with sea lice resistance within species (i.e. homeologous regions). The examined populations shared homeologous regions harboring genes controlling resistance, which might suggest similar genomic regions involved in the regulation of resistance in the two species. For example, Ssa3 (Atlantic salmon) shares extensive homology with Omy28 (rainbow trout) and Ssa25  Table S2), we found uncharacterized proteins in both species, which shared functionality identified by gene ontology. We determined 15 orthogroups that were present in QTLs for sea lice resistance and were shared both within and between species (Supplementary Table S2). These orthogroups were classified according to gene ontology annotations 48 . One of the most interesting groups is orthogroup 12, which contained lysophosphatidic acid receptor 2-like (lpa2) in Atlantic salmon, and a G-protein coupled receptor 12-like in rainbow trout. This orthogroup shares the same GO categories (GO:0004930, GO:0007186, GO:0016021, GO:0070915, GO:0007165, GO:0016020), related to the G protein-coupled receptor signaling pathway. The activation of LPA 2 participates in multiple biological processes, such as cytoskeleton modification via actin fiber formation 49 , and has a role in the activation of related adhesion focal tyrosine kinase (raftk) 50 , which in turn acts as a stimulating factor for monocytes and macrophages 51 . In orthogroup 13, we identified dual-specificity protein phosphatase 10-like (dusp10) in Atlantic salmon and dual-specificity protein phosphatase 8 (dusp8) in rainbow trout.
These genes might have a similar function in both species, which is most likely related to modulating p38 52 within the MAPK cascade 53 , a pathway of pro-inflammatory regulators. It has been previously shown that lice-resistant individuals show higher expression of pro-inflammatory genes than more susceptible fish 54 . The other orthogroups found here did not show a clear relationship with sea lice resistance (Supplementary Table S2).

Figure 4.
A circos plot for genomic regions explaining more than 1% of the genetic variance for sea lice resistance in Atlantic salmon and rainbow trout. The inner ribbons mark syntenic regions between Atlantic salmon (green and labeled Ssa) and rainbow trout (orange and labeled Omyk) chromosomes. The genetic variance explained by 20-SNP windows obtained from the wssGWAS analysis is plotted on the outer ring, with the most important windows plotted in red (windows explaining ≥ 1% of the genetic variance).

Conclusion
The GWAS performed here for Atlantic salmon and rainbow trout made it possible to compare the genetic basis of sea lice resistance in both species. We present novel information about resistance to sea lice in both species. Our results suggest that resistance might be mediated by genes controlling leukocyte response and the cytoskeleton, which promote cell mobility and repair of the wound. The analysis of orthologous proteins provided few characterized proteins. Therefore, further investigations are needed to better annotate genes and generate advances in the elucidation of the genetics behind resistance to Caligus rogercresseyi and other important traits in salmonids. We found uncharacterized common genes classified under similar mechanisms by GO terms that could explain resistance in both species. These results suggest that similar mechanisms may regulate sea lice resistance in Atlantic salmon and rainbow trout. Our results provide further knowledge to help establish better control and treatment measures for one of the most important parasitic diseases affecting Atlantic salmon and rainbow trout aquaculture.

55 . For this study, a total of 2588 passive integrated transponder (PIT)-tagged rainbow trout, originating from 105 maternal full-sib families from the 2012 year-class, were used. For the challenge, the fish were separated into three different tanks so that each family was equally represented in each tank. The C. rogercresseyi infestation was conducted with a total of 105,600 copepodites, i.e., an infestation pressure of ~ 40 copepods/fish, which were produced in vitro from ovigerous females. The infestation consisted of depositing the copepodites in each one of the three test tanks, stopping the water flow and keeping the room in darkness for 6 h. On the sixth day after infestation, parasite counting on all fins was performed and caudal fins were sampled for genetic analysis.
All fish were euthanized and fins were examined for parasite count using a stereoscopic magnifying glass. Body weight was also recorded for each animal at the end of the challenge.

Sea lice challenge in Atlantic salmon.
A total of 2559 Atlantic salmon smolts belonging to 118 maternal full-sib families from the 2010 year-class of Salmones Chaicas S.A. (Puerto Montt, Chile), were challenged with C. rogercresseyi. The fish were PIT-tagged, acclimated and distributed into three tanks as described in previous studies 9, 10 . Infestation with the parasite was carried out using 13-24 copepods per fish, stopping the water flow for 6 h after the infestation. The challenge lasted 6 days, then the fish were euthanized and the sea lice were counted on all of the fins. A sample of the caudal fin was taken for genetic analysis and the body weight of each fish was measured at the end of the challenge.
Genotyping. Genomic

Genomic association analysis. Resistance to C. rogercresseyi was defined as follows, according to Ødegård et al. 11 :

$$\mathrm{LogLD} = \log_e(\mathrm{LD}), \qquad \mathrm{LD} = \frac{\mathrm{LC} + 1}{\mathrm{BW}^{2/3}}$$

where LD is the C. rogercresseyi density defined as the lice count (LC) on each fish at the end of the experimental challenge plus one, divided by the cube root of the squared body weight of the fish on the same day (BW), which is an approximation of the surface of the skin of each fish. The logarithm of LD was used as it has an approximately normal distribution. A weighted single-step genomic association study (wssGWAS) 58 was used to identify associations between SNPs and resistance to C. rogercresseyi in both species, using the BLUPF90 family of programs 59 . This approach uses a combination of both genomic and pedigree matrices. Genotype and pedigree information were used to generate the H kinship matrix 60 , whose inverse is:

$$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} - A_{22}^{-1} \end{bmatrix}$$

where $A^{-1}$ is the inverse of the relationship matrix for all the animals, constructed from the pedigree, $A_{22}^{-1}$ is the inverse of the pedigree relationship matrix for the genotyped animals, and $G^{-1}$ is the inverse of the genomic relationship matrix for the genotyped animals. To perform the single-step GWAS method, all SNPs were weighted equally and assigned the constant 1. For the weighted single-step GWAS method, the markers were assigned weights estimated by the previous method. The association analyses for both species were performed using the following mixed linear model:

$$y = Xb + Za + e$$

where y is the vector of phenotypic values (LogLD); b is the vector of fixed effects (tank); a is the vector of random animal effects, considering the structure of covariance between individuals established by matrix H; e is the vector of random residuals; and X and Z are the incidence matrices for fixed and random animal effects, respectively.
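The construction of the single-step relationship matrix described above can be sketched as follows. This is only an illustration, assuming the usual single-step form in which the difference between the inverse genomic matrix and the inverse pedigree matrix of the genotyped animals is added to the corresponding block of the inverse pedigree matrix. The function name, the plain nested-list representation and the toy matrices are hypothetical; this is not how the BLUPF90 programs are implemented internally.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def single_step_h_inverse(
    a_inv: Matrix,
    g_inv: Matrix,
    a22_inv: Matrix,
    genotyped_idx: Sequence[int],
) -> Matrix:
    """Assemble H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]].

    a_inv         : inverse pedigree relationship matrix, all animals (n x n)
    g_inv         : inverse genomic relationship matrix, genotyped animals (m x m)
    a22_inv       : inverse pedigree matrix of the genotyped animals (m x m)
    genotyped_idx : row/column positions of the genotyped animals within a_inv
    """
    m = len(genotyped_idx)
    if len(g_inv) != m or len(a22_inv) != m:
        raise ValueError("g_inv and a22_inv must match the number of genotyped animals")
    h_inv = [row[:] for row in a_inv]  # copy so the input is not modified
    for i, row_pos in enumerate(genotyped_idx):
        for j, col_pos in enumerate(genotyped_idx):
            h_inv[row_pos][col_pos] += g_inv[i][j] - a22_inv[i][j]
    return h_inv


if __name__ == "__main__":
    # Toy example: three unrelated animals, the last two genotyped.
    a_inv = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    a22_inv = [[1.0, 0.0], [0.0, 1.0]]
    g_inv = [[2.0, 0.0], [0.0, 2.0]]
    for row in single_step_h_inverse(a_inv, g_inv, a22_inv, [1, 2]):
        print(row)
```

When the genomic and pedigree relationships of the genotyped animals agree, the added block is zero and the result reduces to the inverse pedigree matrix, so the genomic information only changes the entries for genotyped animals.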
To identify the regions of the genome associated with resistance, we generated windows of 20 adjacent SNPs. Thereafter, if a window explained more than 1% of the genetic variance, it was considered associated with the trait. We also estimated the p-values for individual SNP-trait associations by using BLUPF90 software 61 .
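The window rule above can be illustrated with a short Python sketch. It assumes that a share of the genetic variance has already been computed for each SNP, that windows slide one SNP at a time along a chromosome, and that the share of a window is the sum of its SNP shares. These are simplifying assumptions made for illustration (summing ignores covariance between neighboring SNPs), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def window_variance_shares(snp_shares: Sequence[float], window_size: int = 20) -> List[float]:
    """Sum per-SNP shares of genetic variance over windows of adjacent SNPs.

    Assumption: windows slide by one SNP and SNPs are ordered by position
    on a single chromosome.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    n_windows = len(snp_shares) - window_size + 1
    return [sum(snp_shares[start:start + window_size]) for start in range(max(n_windows, 0))]


def associated_windows(
    snp_shares: Sequence[float],
    window_size: int = 20,
    threshold: float = 0.01,
) -> List[Tuple[int, float]]:
    """Return (start index, share) for windows explaining more than the threshold."""
    shares = window_variance_shares(snp_shares, window_size)
    return [(start, share) for start, share in enumerate(shares) if share > threshold]


if __name__ == "__main__":
    # Illustrative shares only: 40 SNPs with small effects and one larger one.
    shares = [0.0001] * 40
    shares[10] = 0.02
    for start, share in associated_windows(shares):
        print(f"window starting at SNP {start}: {share:.2%} of genetic variance")
```

With sliding windows, one strong SNP raises every window that contains it, so adjacent flagged windows are better read as a single region than as independent signals.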
Genome comparison. The rainbow trout (GCF_002163495.1) 62 and Atlantic salmon (GCF_000233375.1) 63 genomes were downloaded from the NCBI database and the subset for chromosomes associated with sea lice resistance was aligned with the software Samtools 64 . Synteny between the chromosomes was identified by aligning sequences using the program Symap v3.4 65 . Circos 66 was used to visualize the relationships between genomic regions associated with sea lice resistance in rainbow trout and Atlantic salmon chromosomes.
Candidate genes. The 71 bp flanking sequences surrounding SNPs associated with sea lice resistance were aligned to the most recent reference genomes of Atlantic salmon and rainbow trout using BLASTn 67 . Sequences covering 1 Mb, flanking the associated SNPs (0.5 Mb downstream and 0.5 Mb upstream), were saved in the FASTA format. BLASTx was then used to identify coding sequences for proteins in these 1 Mb associated windows. Blast2GO 68 was used in parallel with the FASTA file to identify proteins and classify them by function. For both species, the reference genome of Danio rerio (GenBank Assembly Accession: GCA_000002035.4) was used to annotate proteins that were not characterized in the rainbow trout or Atlantic salmon reference genomes. To identify orthologous proteins/genes between species, the OrthoFinder 69 program was used with the FASTA sequences obtained with BLASTx.
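The 1 Mb windows around associated SNPs can be expressed as simple coordinate arithmetic, sketched below in Python. The function name, the 1-based inclusive coordinates and the clipping of windows at chromosome ends are assumptions made for illustration; the text does not state how SNPs near chromosome ends were handled.

```python
from typing import Tuple


def flanking_window(snp_position: int, chrom_length: int, flank_bp: int = 500_000) -> Tuple[int, int]:
    """Return (start, end) of the region spanning flank_bp on each side of a SNP.

    Assumptions: coordinates are 1-based and inclusive, and the window is
    clipped so that it never extends past either end of the chromosome.
    """
    if not 1 <= snp_position <= chrom_length:
        raise ValueError("SNP position must lie within the chromosome")
    start = max(1, snp_position - flank_bp)
    end = min(chrom_length, snp_position + flank_bp)
    return start, end


if __name__ == "__main__":
    # Illustrative positions on a hypothetical 5 Mb chromosome.
    for position in (100, 1_000_000, 4_900_000):
        print(position, flanking_window(position, chrom_length=5_000_000))
```

The resulting intervals could then be passed to a sequence extraction tool to write the FASTA files used for the BLASTx and Blast2GO steps.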