## Introduction

Sea lice are currently the most harmful parasites for salmon farming worldwide1. Economic losses due to the different sea lice species are mainly associated with reduced feed conversion rate and growth, indirect mortalities and loss of product value. Furthermore, the global cost of sea lice control has been estimated at \$480 million (USD) annually2. The two sea lice species with the greatest negative impact on salmon farming are Lepeophtheirus salmonis and Caligus rogercresseyi3.

Caligus rogercresseyi, first described in 2000 by Boxshall and Bravo4, is the main sea lice species affecting salmon aquaculture in Chile5. C. rogercresseyi primarily affects Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss), while coho salmon (Oncorhynchus kisutch) has an innate lower susceptibility to the parasite6. The consequences of sea lice infestation include skin lesions, osmotic imbalance and greater susceptibility to bacterial and viral infections through the suppression of the immune response by the damage generated in the skin of the host7. The parasite life cycle comprises eight developmental stages8: two nauplius stages, one copepodid stage, four chalimus stages and the adult stage. The nauplius (I and II) and copepodid (infective) stages are planktonic. The four chalimus stages (I–IV) are sessile, while the adult is mobile9.

Recent studies have estimated low to moderate genetic variation for resistance to sea lice in Atlantic salmon populations, with heritabilities ranging from 0.12 to 0.32 and from 0.13 to 0.33 when resistance was defined as the number of parasites fixed in all the fins7,9,10, or as the logarithm of the parasite density11,12, respectively. A recent study reported a heritability value of 0.09 for sea lice resistance in a rainbow trout breeding population13. These results indicate that it is feasible to improve resistance to sea lice in Atlantic salmon and rainbow trout populations by utilizing selective breeding9,10,14.

Comparative genomic approaches15 allow the identification of genomic similarities between different species, including conserved genes and motifs, traces of genome duplication and gene functions16. Traditionally, these analyses are focused on orthologous genes, which are homologous genes present in different species resulting from direct transmission from a common ancestor16. To date, comparative genomic studies between salmonids have mainly focused on finding evolutionary similarities in the genetic basis of body and sex related traits, including growth, development and sexual differentiation17,18,19.

A recent study compared results from genome-wide association studies (GWAS) on three salmonid species and identified functional candidate genes involved in resistance to the infection caused by Piscirickettsia salmonis, an intracellular bacterium20. To date, no studies have aimed at comparing genomic regions associated with resistance to sea lice in salmonid species.

The main objectives of the present study were to: (1) identify genomic regions associated with resistance to Caligus rogercresseyi in Atlantic salmon and rainbow trout through GWAS, and (2) identify functional candidate genes potentially related to trait variation through a comparative genomics approach, based on exploring orthologous genes within the associated regions between species.

## Results and discussion

The comparative genomics analysis presented here allowed us to identify groups of orthologous genes and several candidate genes among adjacent single nucleotide polymorphisms (SNPs) that explained more than 1% of the genetic variance for resistance to C. rogercresseyi. This is the first study aimed at comparing the genomic basis of sea lice resistance in both Atlantic salmon and rainbow trout.

### Sea lice challenge test

There was no difference between the average number of sea lice found on Atlantic salmon or rainbow trout in the experimental challenge (Table 1). An average of 5.9 ± 6.6 and 6.1 ± 4.2 parasites per fish was estimated for Atlantic salmon and rainbow trout, respectively. The maximum number of parasites varied from 106 in Atlantic salmon to 28 in rainbow trout and, for both species, some animals did not present any parasites. The average weight at the end of the experimental challenge was 278.1 ± 90.3 g (ranging from 104 to 569 g) and 173.1 ± 31.4 g (ranging from 86 to 265 g), for Atlantic salmon and rainbow trout, respectively. Although the average number of parasites was not significantly different between the two species, the difference in the average final weight at the end of each challenge could explain the difference in the maximum number of parasites found (~4 times more parasites in Atlantic salmon). The number of parasites counted in each species after the challenge is below the range determined in previous studies. For instance, Ødegård et al.11 obtained an average of 20.96 ± 19.68, while Robledo et al.21 reported an average of 38 ± 16 sea lice. The lower number of parasites in comparison to these studies is most likely related to differences in the area of counting (whole body surface versus only fins), parasite species (Lepeophtheirus salmonis versus C. rogercresseyi) and the time of sampling after infestation (8–15 versus 6 days).

To measure resistance to C. rogercresseyi, lice count values were transformed into lice density on the log scale (LogLD), which allows for correction of the number of parasites based on the body weight of each fish11. The empirical LogLD distribution for both species is shown in Fig. 1. The range of LogLD for Atlantic salmon was greater than for rainbow trout, varying from −4.60 to 1.02 and from −4.06 to 0.02, respectively. The average LogLD was −2.14 ± 0.9 and −1.66 ± 0.67 for Atlantic salmon and rainbow trout, respectively, which are similar to those reported in a previous study in a different Atlantic salmon population (between −1.66 ± 0.73 and −2.55 ± 0.58)11.

### Genotyping and genomic heritabilities

A total of 2040 (77.6%) animals, and 45,117 (96.7%) SNPs passed the genotyping quality control for Atlantic salmon. In the case of rainbow trout, 2466 (93.3%) fish and 27,146 (67.4%) SNPs remained for subsequent analyses. In both species, significant genetic variation for resistance to C. rogercresseyi was estimated by using genomic information, with heritability values of 0.19 ± 0.03 and 0.08 ± 0.01 for Atlantic salmon and rainbow trout, respectively (Table 2).

In previous studies in Atlantic salmon populations, Tsai et al.12 estimated genomic heritability values for resistance against L. salmonis of 0.22 ± 0.08 and 0.33 ± 0.08, while Ødegård et al.11 found heritability values of 0.14 ± 0.03 and 0.13 ± 0.03. Similarly, Yañez et al.10 and Correa et al.7 estimated heritability values ranging from 0.10 to 0.12 when defining resistance as the total number of parasites found on all fins using pedigree and genomic information, respectively, and Lhorente et al.9 estimated a heritability of 0.22 ± 0.06 in Atlantic salmon for the total count of sessile sea lice per fish, corrected by body weight in the statistical analysis.

### GWAS

In Atlantic salmon, we found genomic regions explaining more than 1% of the genetic variance for sea lice resistance in five different chromosomes (Fig. 2). Two of these chromosomes (Ssa3 and Ssa9) showed two QTL peaks associated with the trait. Only four SNPs, located in Ssa3, Ssa11, Ssa14 and Ssa23, were significantly associated with the trait at a chromosome-wide level. However, none of these QTLs explained more than 1% of the genetic variance of the trait (Supplementary Figure S1). In general, these regions explained a low percentage of the total genetic variation, with a maximum of 3% explained by a single locus. The two SNP windows in Ssa3 explained 1% and 1.4% of the genetic variance, while those found in Ssa9 explained 1.7% and 3%. Other QTLs found in Ssa6, Ssa20, and Ssa25 explained 1.9%, 1.05% and 1.33% of the genetic variance, respectively. Supplementary Table S1 shows the variance explained by each SNP window in both species.

In the case of rainbow trout, the wssGWAS for LogLD identified 13 regions located in different chromosomes that exceeded 1% of the total genetic variance for the trait (Fig. 3). Similar to Atlantic salmon, these windows explained a low percentage of the total variance with a minimum of 1% and a maximum of 2.7%, for QTLs located in Omy17 and Omy15, respectively. In addition, three SNPs located in Omy3, Omy6, Omy9 showed chromosome-wide significant association with the resistance trait in rainbow trout (Supplementary Figure S2).

Our results suggest that resistance against C. rogercresseyi is mainly under polygenic control (i.e., influenced by several genes with small effects) in both species. These results are in agreement with previous studies on sea lice resistance, where a similar genetic architecture was suggested for resistance against L. salmonis and C. rogercresseyi7,12,22. Recently, Robledo et al.23 described and characterized three QTLs related to sea lice resistance in Atlantic salmon by using GWAS and RNA-sequencing approaches. Since sea lice resistance is a polygenic trait, genetic improvement will most likely be accelerated more effectively by genomic selection than by marker-assisted selection or pedigree-based genetic evaluations. For instance, Correa et al.12 and Tsai et al.14 described an increase of over 22% in the accuracy of estimated breeding values (EBVs) using genomic selection over the use of pedigree-based models in Atlantic salmon24.

### Candidate genes

The exploration of the genes within the windows that explained over 1% of the genetic variance for LogLD showed several potential candidate genes that were classified into three groups: related to the immune response, cytoskeleton or metalloproteases. The genes are listed in Tables 3 and 4 for Atlantic salmon and rainbow trout, respectively.

A recent study on gene expression with C. rogercresseyi infestation in susceptible and resistant Atlantic salmon indicated that several components of the immune system (inflammatory response, cytokine production, TNF and NF-kappa B signaling and complement activation) and tissue repair are upregulated during infestation21. In salmonids, the main response of the immune system to parasites is mediated by T-Helper 1 and T-Helper 2 cells25. Thus, genes related to the immune response, either by promoting leukocyte growth or by favoring migration or activation, are strong candidate genes. For instance, in Atlantic salmon we found T-cell activation Rho GTPase-activating protein (tagap) in Ssa6, which participates in the activation and recruitment of T cells by cytokines26, and tenascin R (tnr) in Ssa3, which is an extracellular matrix protein present in bone marrow, thymus, spleen and lymph nodes27. The latter has been described as having an adhesin function favoring the mobility of lymphocytes and lymphoblasts27,28. In rainbow trout, we found candidate genes with similar functions, such as T-box 21 (tbx21), also known as T-bet (T-box expressed in T cells), found in Omy16. This gene belongs to the sub-Tbr1 family29, generates type 1 immunity and participates in the maturation and migration of T-helper 1 (Th1) cells, which in turn produce interferon-gamma (IFN-γ). Studies have described T-bet expression in natural killer (NK) cells, dendritic cells and CD8+ T cells30,31.

Forkhead box protein N1-like (foxn1), present on Ssa9 of Atlantic salmon, is part of a family of genes widely studied in humans, which are related to various functions including cell growth, lymph node development, and T cell differentiation32. It has been proposed that foxn1 has a role in the activation of fibroblast growth factor receptors32.

Meanwhile, in rainbow trout, serine/threonine-protein phosphatase 2A 56 kDa was identified on Omy21; this protein participates in cell growth and signaling33. Robledo et al.21 recently found that, in Atlantic salmon, this protein showed the most significant change in expression between healthy skin and skin where sea lice were attached. In Atlantic salmon, we identified tripartite motif-containing protein 45 (trim45) on Ssa25, which belongs to a large family of proteins present in diverse organisms that can function as ligases and can modify ubiquitin and the interferon-stimulated protein of 15 kDa (isg15)34.

Several metalloproteases were found in genomic regions associated with resistance in both species, but for the purposes of this study, we focused on GEM-interacting protein, which interacts with rab27a or its effector in leukocytes. Rab is a large family of GTPases responsible for cellular vesicle transport35. Deficiency of this molecule is correlated with immune deficiencies due to the malfunction of the cytotoxic activity of T-lymphocytes, natural killer cells and neutrophils36.

Considering the importance of cell growth and movement in response to sea lice infestation, the cytoskeleton may play a considerable role in sea lice resistance as well. For Atlantic salmon, genes related to the cytoskeleton, such as epidermal growth factor (egf) in Ssa9, were identified. This gene is part of a superfamily of receptors with tyrosine kinase activity that have been described in a variety of organs with growth-promoting and cellular differentiation functions38, and it could participate in tissue repair by promoting cell growth29. In rainbow trout, fibroblast growth factors (fgf11, fgf13), located in Omy10 and Omy29 respectively, are involved in angiogenesis and pro-inflammatory responses, and were identified as important genes in sea lice resistance in previous transcriptomic studies by Skugor et al. (2009) and Robledo et al. (2018) in Atlantic salmon21,39. In addition, ELMO/CED-12 domain-containing protein 1 was identified in Omy10 in rainbow trout. This protein participates in phagocytosis of apoptotic cells, and in mammals, it also has a role in cell migration40. Other cytoskeleton-related candidate genes include procollagen galactosyltransferase 1 on Ssa6, collagen alpha-1 (XXVIII) chain-like on Ssa25 and pleckstrin homology domain-containing family H member 1-like on Ssa941. The top ten SNPs that explained the highest variance for sea lice resistance are located on Ssa9 in Atlantic salmon, representing the most important QTL in this species. This QTL harbors the breast carcinoma-amplified sequence 3 (bcas3) gene, which in Atlantic salmon codes for a microtubule-associated cell migration factor that favors cellular mobility42. Cell migration is generally induced in response to chemotactic signals, which induce changes in the cytoskeleton and extracellular matrix43.

We also found the tripartite motif-containing protein 16-like on Omy15, which is part of the trim superfamily and has functions related to cell differentiation, apoptosis, regulation of transcription and signaling pathways34. This gene is similar to tripartite motif-containing protein 45, present on Ssa25. In this region, we also found a locus that codes for interferon-γ 2 (ifng2), which is a cytokine that participates in type 1 immune responses and that favors the presentation of antigens and activation of macrophages44. On this same chromosome (Omy15), we also identified putative ferric-chelate reductase 1 (frrs1), which functions in the fixation of iron in teleosts45. Robledo et al.21 identified heme-binding protein 2 (hebp2), which has an iron-binding function, as a gene involved in Atlantic salmon sea lice resistance. Different authors46,47 have stated that decreasing the availability of iron can be part of a nutritional defense mechanism against sea lice infestation.

### Comparative genomics

The comparative genomic analyses show regions of synteny between chromosomes associated with sea lice resistance in Atlantic salmon and rainbow trout (Fig. 4). Thus, there are homologous regions which are associated with the trait and share similarities between chromosomes from both species. However, there were no obvious shared regions associated with sea lice resistance within species (i.e. homeologous regions). The examined populations shared homologous regions harboring genes controlling resistance, which might suggest that similar genomic regions are involved in the regulation of resistance in the two species. For example, Ssa3 (Atlantic salmon) shares extensive homology with Omy28 (rainbow trout) and Ssa25 with Omy3. In addition, when performing the search for orthologous genes (Supplementary Table S2), we found uncharacterized proteins in both species, which shared functionality identified by gene ontology.

We determined 15 orthogroups that were present in QTLs for sea lice resistance and were shared both within and between species (Supplementary Table S2). These orthogroups were classified according to gene ontology annotations48. One of the most interesting groups is orthogroup 12, which contained lysophosphatidic acid receptor 2-like (lpa2) in Atlantic salmon and a G-protein coupled receptor 12-like in rainbow trout. The members of this orthogroup share the same GO categories (GO:0004930, GO:0007186, GO:0016021, GO:0070915, GO:0007165, GO:0016020), which are related to the G protein-coupled receptor signaling pathway. The activation of LPA2 participates in multiple biological processes, such as cytoskeleton modification via actin fiber formation49, and has a role in the activation of related adhesion focal tyrosine kinase (raftk)50, which in turn acts as a stimulating factor for monocytes and macrophages51. In orthogroup 13, we identified dual-specificity protein phosphatase 10-like (dusp10) in Atlantic salmon and dual-specificity protein phosphatase 8 (dusp8) in rainbow trout. These genes might have a similar function in both species, which is most likely related to modulating p3852 within the MAPK cascade53, a pathway of pro-inflammatory regulators. It has been previously shown that lice-resistant individuals have a higher expression of pro-inflammatory genes than more susceptible fish54. The other orthogroups found here did not show a clear relationship with sea lice resistance (Supplementary Table S2).

## Conclusion

The GWAS performed here for Atlantic salmon and rainbow trout made it possible to compare the genetic basis of sea lice resistance in both species. We present novel information about resistance to sea lice in both species. Our results suggest that resistance might be mediated by genes controlling leukocyte response and the cytoskeleton, which promote cell mobility and repair of the wound. The analysis of orthologous proteins provided few characterized proteins. Therefore, further investigations are needed to better annotate genes and generate advances in the elucidation of the genetics behind resistance to Caligus rogercresseyi and other important traits in salmonids. We found uncharacterized common genes classified under similar mechanisms by GO terms that could explain resistance in both species. These results suggest that similar mechanisms may regulate sea lice resistance in Atlantic salmon and rainbow trout. Our results provide further knowledge to help establish better control and treatment measures for one of the most important parasitic diseases affecting Atlantic salmon and rainbow trout aquaculture.

## Material and methods

### Experimental animals

All experiments were performed under relevant guidelines and regulations and were approved by the Institutional Committee for Animal Care and Use of the University of Chile (Certificate N 17,041-VET-UCH).

### Sea lice challenge in rainbow trout

The fish for this study belong to a rainbow trout breeding population established in 1998 by Aguas Claras S.A., at Quetroleufu, IX Region, Chile, and currently owned by EFFIGEN S.A. (Puerto Montt, Chile). The fish used in this study were from the year-class 2011, which has undergone three generations of selection for growth, carcass quality and other traits of interest. The details of the population management and breeding program were described by Yoshida et al.55. For this study, a total of 2588 PIT (Passive Integrated Transponder)-tagged rainbow trout, originating from 105 maternal full-sib families from the 2012 year-class, were used. For the challenge, the fish were separated into three different tanks so that each family was equally represented in each tank. The C. rogercresseyi infestation was conducted with a total of 105,600 copepodites, i.e., an infestation pressure of ~40 copepodites/fish, which were produced in vitro from ovigerous females. The infestation consisted of depositing the copepodites in each one of the three test tanks, stopping the water flow and keeping the room in darkness for 6 h. On the sixth day after infestation, parasite counting on all fins was performed and caudal fins were sampled for genetic analysis. All fish were euthanized and fins were examined for parasite count using a stereoscopic magnifying glass. Body weight was also recorded for each animal at the end of the challenge.

### Sea lice challenge in Atlantic salmon

A total of 2559 Atlantic salmon smolts belonging to 118 maternal full-sib families from the 2010 year-class of Salmones Chaicas S.A. (Puerto Montt, Chile), were challenged with C. rogercresseyi. The fish were PIT-tagged, acclimated and distributed into three tanks as described in previous studies9,10. Infestation with the parasite was carried out using 13–24 copepods per fish, stopping the water flow for 6 h after the infestation. The challenge lasted 6 days, then the fish were euthanized and the sea lice were counted on all of the fins. A sample of the caudal fin was taken for genetic analysis and the body weight of each fish was measured at the end of the challenge.

### Genotyping

Genomic DNA was extracted from the caudal fin of each challenged fish using the DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer's instructions. The 2628 Atlantic salmon samples were genotyped using a custom Affymetrix® 50 K Axiom® myDesign™ Genotyping Array designed by AquaInnovo and the University of Chile56, while the 2643 rainbow trout samples were genotyped with a 57 K SNP array developed by the United States Department of Agriculture (USDA)57. Quality control of the genotypes was carried out in PLINK v1.90b3.34. SNPs with a call rate ≤ 0.95, a Minor Allele Frequency (MAF) < 0.05 and those that were not in Hardy–Weinberg equilibrium (p < $$1\times 10^{-6}$$) were discarded. Individuals were filtered if they had a call rate ≤ 0.95. All the SNPs and fish that passed quality control were used for downstream analysis.
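The SNP-level call rate and MAF filters described above can be illustrated with a minimal Python sketch. This is not the PLINK implementation used in the study: the function names, the 0/1/2 genotype coding and the use of `None` for missing calls are assumptions made for illustration, and the Hardy–Weinberg test and the per-animal call rate filter are left out for brevity.

```python
def snp_call_rate(genotypes):
    """Fraction of non-missing calls for one SNP (None marks a missing call)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for one SNP, with genotypes coded as 0, 1 or 2 copies of an allele."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passing_snps(matrix, min_call_rate=0.95, min_maf=0.05):
    """Return the column indices of SNPs that pass both filters.

    `matrix` holds one row per animal and one column per SNP. The defaults
    mirror the thresholds in the text: a SNP is discarded when its call
    rate is <= 0.95 or its MAF is < 0.05.
    """
    kept = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        if snp_call_rate(column) <= min_call_rate:
            continue
        if minor_allele_frequency(column) < min_maf:
            continue
        kept.append(j)
    return kept
```

In a toy matrix of four animals, a fully called polymorphic SNP is kept, whereas a monomorphic SNP or a SNP with one missing call (call rate 0.75) is discarded.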

### Genomic association analysis

Resistance to C. rogercresseyi was defined as follows, according to Ødegård et al.11:

$$LogLD=\log_{e}\left(\frac{LC+1}{\sqrt[3]{BW^{2}}}\right)$$

where the C. rogercresseyi density (LD) is defined as the lice count (LC) on each fish at the end of the experimental challenge plus one, divided by the cube root of the squared body weight (BW) of the fish on the same day, which approximates the skin surface area of each fish. The logarithm of LD was used because it is approximately normally distributed.
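As a worked illustration of the LogLD definition above, the following Python sketch computes the trait for a single fish. The function and parameter names are hypothetical; body weight is assumed to be in grams, as reported for the challenge.

```python
import math


def log_lice_density(lice_count, body_weight_g):
    """Natural log of (LC + 1) divided by the cube root of BW squared."""
    # BW^(2/3) is the cube root of the squared body weight, used in the
    # text as a proxy for the skin surface of the fish.
    surface_proxy = body_weight_g ** (2.0 / 3.0)
    return math.log((lice_count + 1) / surface_proxy)
```

Adding one to the lice count keeps the logarithm defined for fish without parasites, and for a given count heavier fish receive a lower LogLD.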

A weighted single-step genomic association study (wssGWAS)58 was used to identify associations between SNPs and resistance to C. rogercresseyi in both species, using the BLUPF90 family of programs59. This approach uses a combination of both genomic and pedigree matrices. Genotype and pedigree information were used to generate the H kinship matrix60, as defined in the following equation:

$${H}^{-1}={A}^{-1}+\left[\begin{array}{cc}0& 0\\ 0& {G}^{-1}-{A}_{22}^{-1}\end{array}\right]$$

where $${A}^{-1}$$ is the inverse of the relationship matrix for all the animals, constructed from the pedigree, $${A}_{22}^{-1}$$ is the inverse of the pedigree relationship matrix for the genotyped animals, and $${G}^{-1}$$ is the inverse of the genomic relationship matrix for the genotyped animals. For the single-step GWAS method, all SNPs were given an equal weight of 1. For the weighted single-step GWAS method, the markers were assigned the weights estimated with the previous method. The association analysis for both species was performed using the mixed linear model y = Xb + Za + e, where y is the vector of phenotypic values (LogLD); b is the vector of fixed effects (tank); a is the vector of random animal effects, with the covariance structure between individuals given by matrix H; e is the vector of random residuals; and X and Z are the incidence matrices for fixed and random animal effects, respectively.
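The construction of $${H}^{-1}$$ can be sketched numerically as follows. This is only an illustration under stated assumptions, not the BLUPF90 code: it uses NumPy, a hypothetical function name, and takes the positions of the genotyped animals as an index list rather than assuming they are ordered last, as in the block notation above.

```python
import numpy as np


def h_inverse(a_inv, g_inv, a22_inv, genotyped_idx):
    """Build H^-1 by adding (G^-1 - A22^-1) to the genotyped block of A^-1.

    a_inv: inverse pedigree relationship matrix for all animals.
    g_inv, a22_inv: inverse genomic and pedigree relationship matrices for
        the genotyped animals, in the same order as genotyped_idx.
    genotyped_idx: positions of the genotyped animals within a_inv.
    """
    h_inv = np.array(a_inv, dtype=float, copy=True)
    idx = np.asarray(genotyped_idx)
    # Only the rows and columns of genotyped animals are modified.
    h_inv[np.ix_(idx, idx)] += np.asarray(g_inv) - np.asarray(a22_inv)
    return h_inv
```

When the genomic and pedigree relationships among genotyped animals coincide, the correction term vanishes and $${H}^{-1}$$ reduces to $${A}^{-1}$$.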

To identify the regions of the genome associated with resistance, we generated windows of 20 adjacent SNPs. Thereafter, if a window explained more than 1% of the genetic variance, it was considered associated with the trait. We also estimated the p-values for individual SNP-trait associations by using BLUPF90 software61.
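The window criterion above can be sketched in Python as follows. The sketch assumes that the variance explained by a window is the variance, across animals, of the breeding value contributed by the SNPs in that window, expressed as a percentage of the variance of the total genomic breeding value, and that windows slide one SNP at a time. Both are assumptions for illustration, since the text does not specify these details, and all names are hypothetical.

```python
from statistics import pvariance


def window_variance_share(genotypes, snp_effects, window_size=20):
    """Percent of genetic variance explained by each window of adjacent SNPs.

    genotypes: one row per animal with allele counts (0/1/2), SNPs in map order.
    snp_effects: one estimated effect per SNP.
    Returns a list of (start_index, percent) tuples, one per sliding window.
    """
    n_snps = len(snp_effects)

    def breeding_values(start, stop):
        # Sum of genotype x effect over SNPs start..stop-1 for each animal.
        return [sum(row[j] * snp_effects[j] for j in range(start, stop))
                for row in genotypes]

    total_var = pvariance(breeding_values(0, n_snps))
    shares = []
    for start in range(0, n_snps - window_size + 1):
        window_var = pvariance(breeding_values(start, start + window_size))
        shares.append((start, 100.0 * window_var / total_var))
    return shares


def associated_windows(shares, threshold_percent=1.0):
    """Keep the windows explaining more than the threshold (1% in the text)."""
    return [(start, pct) for start, pct in shares if pct > threshold_percent]
```

Because adjacent windows overlap in this sketch, neighbouring windows around a strong signal would all be flagged; a real analysis would typically report the peak window per region.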

### Genome comparison

The rainbow trout (GCF_002163495.1)62 and Atlantic salmon (GCF_000233375.1)63 genomes were downloaded from the NCBI database, and the subset of chromosomes associated with sea lice resistance was aligned with the software Samtools64. Synteny between the chromosomes was identified by aligning sequences using the program Symap v3.465. Circos66 was used to visualize the relationships between genomic regions associated with sea lice resistance in rainbow trout and Atlantic salmon chromosomes.

### Candidate genes

The 71 bp flanking sequences surrounding SNPs associated with sea lice resistance were aligned to the most recent reference genomes of Atlantic salmon and rainbow trout using BLASTn67. Sequences covering 1 Mb, flanking the associated SNPs (0.5 Mb downstream and 0.5 Mb upstream), were saved in FASTA format. BLASTx was then used to identify coding sequences for proteins in these 1 Mb associated windows. Blast2GO68 was used in parallel with the FASTA file to identify proteins and classify them by function. For both species, the reference genome of Danio rerio (GenBank Assembly Accession: GCA_000002035.4) was used to annotate proteins that were not characterized in the rainbow trout or Atlantic salmon reference genomes. To identify orthologous proteins/genes between species, the OrthoFinder69 program was used with the FASTA sequences obtained with BLASTx.
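The definition of the 1 Mb window around an associated SNP can be sketched in Python as follows. The function names are hypothetical, positions are assumed to be 1-based, and clipping the window at the chromosome ends is an assumption that the text does not state.

```python
def flanking_window(snp_position, chromosome_length, flank=500_000):
    """1-based inclusive (start, end) spanning `flank` bp on each side of a SNP."""
    start = max(1, snp_position - flank)
    end = min(chromosome_length, snp_position + flank)
    return start, end


def extract_window(sequence, snp_position, flank=500_000):
    """Slice the window out of a chromosome sequence held as a string."""
    start, end = flanking_window(snp_position, len(sequence), flank)
    # Convert the 1-based inclusive interval to a 0-based Python slice.
    return sequence[start - 1:end]
```

The extracted string could then be written to a FASTA file and used as the query for the protein search described above.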

### Ethics approval and consent to participate

All the experimental challenges were approved by the Institutional Committee for Animal Care and Use of the University of Chile (Certificate N 17,041-VET-UCH). We also confirm that the study was carried out in compliance with the ARRIVE guidelines.