Introduction

Determining the role of the host genetic architecture in driving variation of fitness-related traits is critical to determining a population’s evolutionary trajectory (Carlson and Seamons 2008). In the past decade, the microbiota, defined as a community of microbes that live in or on a defined environment (Berg et al. 2020), has emerged as a factor associated with host fitness (Rosshart et al. 2017; Suzuki 2017; Gould et al. 2018; Berg et al. 2020). Particular interest has been given to the microbiota of the gastrointestinal tract (gut microbiota), where it plays important roles both in the health and development of the host (Nayak 2010; Romero et al. 2014; Ghanbari et al. 2015). In fish, the gut microbiota (hereafter, microbiota) plays many beneficial roles, such as aiding in metabolism (Semova et al. 2012; Tremaroli and Bäckhed 2012), immunity (Galindo-Villegas et al. 2012; Milligan-Myhre et al. 2016), and development (Bates et al. 2006; Nikouli et al. 2019). As such, the microbiota is generally shaped by host species (Ye et al. 2014), life stage (Llewellyn et al. 2016), diet (Bolnick et al. 2014a; Bolnick et al. 2014b; Webster et al. 2018), physiology (Bolnick et al. 2014c; Ye et al. 2014), geographical isolation (Ye et al. 2014; Webster et al. 2018), and genetic divergence (Sullam et al. 2015; Webster et al. 2018; Riiser et al. 2020). While the gut microbiota for many fish species has been characterized, the role of host genetics is not fully characterized (Nayak 2010; Llewellyn et al. 2014; Ghanbari et al. 2015; Sullam et al. 2015; Li et al. 2018; Xiao et al. 2021), especially with regards to the host genetic effects acting among and within populations.

The genetic architecture of a trait encompasses the mapping of an organism’s genotype to phenotype, accounting for environmental influences and resulting in variation in quantitative traits (Hansen 2006). The genetic architecture of a phenotype may be described in terms of its variance components, such as additive genetic variance or maternal effects without reference to specific genes (Falconer and MacKay 1996; e.g., Aykanat et al. 2012a). We use the term “quantitative genetic architecture” to distinguish this approach. Various components of the quantitative genetic architecture have been shown to affect microbiota variation as shown by differences between monozygotic and dizygotic human twins (Goodrich et al. 2016); recombinant strains of mice (Mus musculus, Snijders et al. 2016); among-family effects in pigs (Chen et al. 2018), and maternal and individuality effects in red squirrels (Tamiasciurus hudsonicus, Ren et al. 2017). In fish, however, the contribution of the host genetics, including its interaction with the environment, is not fully characterized following a quantitative genetics approach, despite a wealth of published fish microbiota studies (Wong and Rawls 2012; Bolnick et al. 2014b; Ghanbari et al. 2015). With over 32,000 described species (Eschmeyer and Fong 2015), fish comprise more than half of the known vertebrate species and encompass a wide range of phenotypes, life histories, and ecologies (Nelson et al. 2016). Therefore, the host quantitative genetic architecture and its effects on the composition and diversity of the microbiota presents a significant knowledge gap in our characterization of the microbiota in fish. The host controls intestinal mucosa and immune factors that play essential roles in the establishment and maintenance of the microbiota (Spor et al. 2011; Romero et al. 2014; Ghanbari et al. 2015). Therefore, it is expected that host genome variation may play a role in shaping the microbiota. 
Using high-throughput sequencing technology, studies have shown the effects of host genetic variation on the microbiota at three levels of host biological organization in fish. First, at the among-species level, it is known that microbiota variation occurs among taxonomically related species reared in the same or related environments, suggesting a role of host genetics in microbiota community structure (e.g. between bluegill (Lepomis macrochirus), channel catfish (Ictalurus punctatus) and largemouth bass (Micropterus salmoides), Larsen et al. 2014; silver carp (Hypophthalmichthys molitrix) and gizzard shad (Dorosoma cepedianum), Ye et al. 2014; bighead carp (Hypophthalmichthys nobilis), crucian carp (Carassius cuvieri) and grass carp (Ctenopharyngodon idellus), Li et al. 2015; Argyrosomus regius, Dicentrarchus labrax, Diplodus puntazzo, Sparus aurata and Pagrus pagrus, Nikouli et al. 2020).

Second, at the among-populations, within-species level, studies have demonstrated strong population-level effects on the microbiota of fish reared in either wild or artificial environments in zebrafish (Danio rerio, Roeselers et al. 2011), Trinidadian guppies (Poecilia reticulata, Sullam et al. 2015), threespine stickleback (Gasterosteus aculeatus; Smith et al. 2015; Milligan-Myhre et al. 2016), Atlantic salmon (Salmo salar, Webster et al. 2018) and Atlantic cod (Gadus morhua, Riiser et al. 2020). These studies illustrate that alpha diversity estimates are generally higher in fish from wild than from reared populations (Roeselers et al. 2011; Webster et al. 2018) and that microbial community differences may be present among populations at the OTU level (Roeselers et al. 2011; Sullam et al. 2015; Webster et al. 2018). To explain the interpopulation microbiota differences, these studies invoke both neutral (Roeselers et al. 2011; Sullam et al. 2015) and selection-based (Sullam et al. 2015; Webster et al. 2018) evolutionary processes; however, evidence for both remains inconclusive. Moreover, the lack of within-species population effects on the microbiota of Atlantic cod highlights the main remaining question at this level: are population-level microbiota differences driven by host genetics or by the environment (Riiser et al. 2020)? Population-level studies are therefore important to understanding the host-microbiome co-evolutionary relationship, as they have been used to deduce putative microbial roles in host adaptation (Sullam et al. 2015; Webster et al. 2018). Investigating microbiota variation due to population-level host quantitative genetic architecture effects provides an exciting platform to advance microbiota research.

Third, at the within-population, within-species level, studies have shown the interaction effects of diet with genotype (based on sex, Bolnick et al. 2014b) or major histocompatibility complex class IIb (MHC class IIb) polymorphism on microbiota diversity and function (Bolnick et al. 2014c); however, studies utilizing related families within a population to estimate within-population genetic effects are scarce. Such studies are critical to characterizing the role of host genetics on the microbiota, as they allow us to estimate the heritability of microbial diversity and differential abundance (Dvergedal et al. 2020), where heritability is defined as the proportion of phenotypic variance in a population attributable to additive genetic variance (Visscher et al. 2008). Collectively, the literature shows that the host genome plays a pivotal role in determining the composition of the microbiota in various fish species (Roeselers et al. 2011; Bolnick et al. 2014b; Sullam et al. 2015; Nikouli et al. 2018; Webster et al. 2018), but there is still a gap in our knowledge regarding the relative contribution of among- and within-population genetic effects on their microbiotas. Overall, measuring the extent of gut microbiota variation among and within populations is therefore important in efforts pertaining to the sustainability, optimization (Dvergedal et al. 2020), management and conservation of salmonids (Garcia de Leaniz et al. 2007).

Environmental contributions to microbiota diversity often complicate the interpretation of microbiota divergence among locally sourced populations (Sullam et al. 2015; Riiser et al. 2020). For example, some studies attributed the observed interpopulation effects on microbiota variation to environmental variation (Roeselers et al. 2011; Sullam et al. 2015; Webster et al. 2018), even though a controlled, common rearing environment, which is rarely used in microbiota studies (Dvergedal et al. 2020), was lacking. Given that variation in the environment contributes strongly to microbiota variation (e.g., due to diet, Navarrete et al. 2012; Wong et al. 2013; Smith et al. 2015; or rearing environments, Smith et al. 2015; Webster et al. 2018; Parshukov et al. 2019), this is not a surprising outcome. These environmental contributions may be strong in the wild, as these environments offer more diverse feeding options (Smith et al. 2015; Sullam et al. 2015), ecosystem-specific ecological interactions (Sullam et al. 2015) and variation across different life stages and their associated habitats (Llewellyn et al. 2016). In addition, some quantitative genetic architecture components, such as maternal effects, include both environmental and additive genetic effects (Aykanat et al. 2012a). Such effects are known to contribute substantially to among-population phenotypic variation in Chinook salmon (Oncorhynchus tshawytscha) for life history and fitness-related traits (Aykanat et al. 2012a; Aykanat et al. 2012b). These environmental effects can act as confounders in microbiota studies (Goodrich et al. 2014a, 2014b; Dvergedal et al. 2020) even with attempts to statistically account for their effects (Smith et al. 2015), underscoring the need to control for rearing environments to explicitly determine host quantitative genetic architecture effects (Goodrich et al. 2014a; Ghanbari et al. 2015; Dvergedal et al. 2020).

Perhaps among the best-studied quantitative genetic architectures of non-model animals, and of fish in general, are those of salmonids, including the Pacific salmon (Waples et al. 2019). Here, we focus on the microbiome of Chinook salmon (Oncorhynchus tshawytscha), an anadromous and semelparous salmonid recognized as the largest species in its genus (Rounsefell 1958; Quinn 2018; Ohlberger et al. 2018). Through aquaculture production of over 14,800 tonnes, the economic contribution of Chinook salmon topped 190 million USD in 2017, with an additional estimated 5,750 tonnes from capture (FAO 2017). Furthermore, Chinook salmon are a key species playing critical roles in nutrient and bioenergy cycling (Cederholm et al. 1999; Koehler et al. 2006), trophic ecology (Bernatchez and Dodson 1987) and maintaining biodiversity (Waples et al. 2004). Small sample-size studies have recently sequenced the mid-gut (n = 30, Ciric et al. 2019) and distal-gut microbiota of Chinook salmon (n = 4, Booman et al. 2018; n = 30, Ciric et al. 2019), but no studies have investigated the role of host genetics on the Chinook salmon microbiome.

The aim of this study was to quantify the relative contribution of among-population and within-population genetic effects on gut microbiota variation in Chinook salmon while minimizing known confounding environmental and maternal effects. Salmonids are known to lend themselves to traditional breeding designs, permitting us to partition genetic and environmental sources of variance (Lynch and Walsh 1998). In addition, salmonids are characterized by a stable population genetic structure and low gene flow (Vähä et al. 2008; Palstra and Ruzzante 2010; Narum et al. 2008; Gomez-Uchida et al. 2012), allowing us to test for population effects on the microbiota. Capitalizing on these characteristics, we reared half-sib families from a single fully domesticated and seven wild-domestic hybrid crosses of Chinook salmon in replicated pens. In creating the hybrid crosses, we chose wild populations that are partially reproductively isolated (Quinn et al. 2000; Unwin et al. 2000; Heath et al. 2006), allowing for the inference of among hybrid-cross effects, reflective of population and domestication effects (Fig. 1). Further, the half-sibling relatedness among individuals corresponding to various sires within each hybrid cross allows us to test within-population additive genetic effects on gut microbiota diversity and composition. Finally, by using a common dam in our breeding design (Semeniuk et al. 2019), we limited the potential for maternal environmental effects.
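
The link between a sire-based breeding design and additive genetic effects can be written compactly. The notation below is assumed for illustration and is not taken from the study: $V_A$ is the additive genetic variance, $V_P$ the total phenotypic variance, and $\sigma^2_S$ the among-sire variance component. The factor of four is the textbook expectation when offspring of a sire are treated as paternal half-sibs and epistatic contributions are negligible; it is shown for orientation and is not necessarily the estimator applied here.

```latex
h^2 = \frac{V_A}{V_P},
\qquad
V_A \approx 4\,\sigma^2_S \quad \text{(paternal half-sibs)}
```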

Fig. 1: Map showing the stock source of the male Chinook salmon used for fertilization of eight pure and hybrid crosses used in this study.

Crosses included pure (YIAL) and hybrid (CAP, CHILL, NIT, PUNT, RC, BQ, QUIN) crosses of Chinook salmon. Robertson Creek “RC”, Big Qualicum River “BQ”, Capilano River “CAP”, Chilliwack River “CHILL”, Nitinat River “NIT”, Puntledge River “PUNT”, Quinsam River “QUIN”.

Materials and Methods

Field collections, breeding design, and rearing environment

Fish husbandry was carried out at Yellow Island Aquaculture Ltd. (YIAL), a Chinook salmon hatchery and organic-based farm located east of Campbell River on Quadra Island, Vancouver Island, British Columbia, Canada (Fig. 1). Wild sourced eggs from Robertson Creek and milt from Big Qualicum were used to produce the seven-generation, fully-domesticated, stock of Yellow Island (YIAL), which has been in production since 1985 (Details of fish husbandry in Semeniuk et al. 2019). To create the hybrid stocks used in this study, mature male fish (sires) were collected by seine, and then milt was collected from each of 10 random sires at salmon enhancement facilities representing each wild population, cryopreserved following a commercially available protocol (Canada Cryogenetics Services; www.cryogenetics.com), and tested for sperm density uniformity before storing in Square Packs® until time of fertilization (Semeniuk et al. 2019). Then, eggs from 17 highly inbred females (offspring from a self-crossed hermaphrodite) were mixed, and 80 subsets of the mixed eggs (~600 eggs) were fertilized using a 0.25 mL sample of thawed milt from each of 10 sires representing the fully domestic and each of the wild stocks of Chinook salmon on November 1st 2013. This produced 80 full- and half-sib families belonging to seven outcrossed hybrid stocks (YIAL × Wild) and a fully inbred domesticated stock (YIAL × YIAL). Due to the large geographical distance among the sire-sourced natural populations, it is likely that sires come from partially reproductively isolated populations. Consequently, the maternal line is identical for all hybrid crosses (herein “crosses”), but the paternal line for those crosses varies depending on the geographical origin of the paternal line.

Eggs from each family were incubated in replicate cells in divided vertical incubation trays until the swim-up stage, once they had accumulated 1000 Accumulated Thermal Units (ATUs) (19 weeks post-fertilization). To prevent potential density effects, a random subset of no more than 120 alevins per family was transferred to 200 L freshwater barrels for rearing. In the barrels, individuals were fed an organic standard diet ad libitum four times daily (Taplow Feeds, BC, Canada), and the temperatures and dissolved oxygen levels were maintained at 10–12 °C and above 90% saturation, respectively. Thirty-two weeks post-fertilization, the individuals were PIT tagged for later cross and family identification and were vaccinated for vibriosis (Vibrogen 2: Vibrio anguillarum-ordalii; Novartis Animal Health Canada, Inc. Charlottetown, PEI). Smoltifying fish were separated by stock and divided randomly between replicate, newly built saltwater net pens (dimensions: 4.5 m × 4.5 m × 3.0 m deep) until 80 weeks post-fertilization. All the rearing nets were purpose-built for this study; therefore, they were made of brand-new mesh and rope with no prior contact with marine water. Fish were fed pre-weighed commercially available salmon food four times a day (Taplow Feeds, BC, Canada), and care was taken to prepare and maintain identical pen rearing conditions. To maintain the organic standard of fish rearing, we avoided using antibiotics or chemotherapeutants, hormones, growth enhancers or any other compounds that the fish would not consume or be exposed to in their natural state; in addition, we maintained a stocking density that mimics the wild situation by keeping the maximum density in the sea pens much lower than the maximal densities of wild schools of salmon (5 kg per cubic meter).

Sample collection, DNA extraction, and next generation sequencing

In June 2015, a subset of 10–25 of the 80-week-old fish was randomly selected from each pen and individually weighed (mean size = 182 g; Supplementary Table 1), with the minimum sample size allowing us to minimize individual microbiota effects (Panteli et al. 2020). To obtain gut content samples for DNA extraction, the fish were humanely euthanized, the body cavity was cut open with a sterile scalpel and the hindgut of each offspring was collected. Given that all the fish were fed the same diet, the use of gut content from each individual allowed us to test for differences in allochthonous microbiota composition attributable primarily to host quantitative genetic architecture. Individual gut samples were immediately stored in RNAlater™ for transport to the research facility, where they were stored in the freezer at −20 °C until DNA extraction.

We extracted DNA from gut content (and a negative control, containing only nuclease-free water) of the distal intestine using the commercially available E.Z.N.A. Stool DNA Kit (OMEGA Bio-tek) following the manufacturer’s protocol. Amplicon libraries were prepared for individual gut samples in two steps as previously described (He et al. 2018). In the first step, each reaction contained 2 µL gDNA, 15.4 μL of ddH2O, 2.5 μL of 10 × buffer (including Mg2+), 3.5 μL of MgSO4 (2 μM), 0.5 μL dNTPs (10 mM), 0.1 µL of Taq polymerase and 0.5 µL of each of the universal primers 787F (V5F) (ATTAGATACCCNGGTAG) and 1046R (V6R) (CGACAGCCATGCANCACCT), which target the 16S rRNA encoding gene sequences containing the V5-V6 hypervariable regions. These primers were modified by adding a short sample-code specific sequence (“barcode”) and an Ion Torrent adapter sequence to the 5’ end of the forward (acctgcctgccg) and reverse (acgccaccgagc) primers. The PCR program consisted of an initial denaturation step at 95 °C for 60 s, followed by 28 cycles of denaturation, annealing and elongation (95 °C for 15 s, 55 °C for 30 s, and 72 °C for 30 s, respectively) and a final elongation stage at 72 °C for 7 min.

The PCR product was visualized for amplification success on a 2% agarose gel, and PCR product purification was then carried out using Agencourt AMPure XP beads (Beckman Coulter Genomics GmbH, Mississauga, ON, Canada). A second short-cycle ligation PCR was conducted to ligate the adapter and barcode sequences to the amplicon using 10 µL of purified PCR product from the first-round PCR, in addition to the following: 2.3 μL of ddH2O, 2.5 μL of 10 × buffer (including Mg2+), 3.5 μL of MgSO4 (2 μM), 0.5 μL dNTPs (0.10 mM), 0.2 μL of Taq, 0.5 μL of UniA (CCATCTCATCCCTGCGTGTCTCCGACTCAGXXXXXXXXXXGATacctgcctgccg) forward primer and 0.5 μL of UniB (CCTCTCTATGGGCAGTCGGTGATacgccaccgagc) reverse primer, where the run of X characters in UniA stands for the unique 10–12 bp barcode sequence necessary for sample demultiplexing in sequence analysis and the lower-case sequences were the reverse complement of the sequence added in the first primer set. Barcoded samples were combined in amounts selected based on PCR band intensity, and a commercially available kit (GenCatch™, Epoch Life Science, Inc., Sugar Land, TX, USA) was used to purify the PCR product (~360 bp) from incomplete amplicons and primer dimers. The final library was sequenced with an Ion Torrent™ Personal Genome Machine (PGM; Thermo Fisher Scientific, Inc., Mississauga, Canada) on a 318 chip capable of sequencing fragments of up to 400 bp in length.
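
As a quick consistency check on the two PCR recipes, the component volumes and the first-round cycling program can be tallied. The sketch below only re-adds the numbers stated in the text; the dictionary keys and function names are illustrative, and the cycling time counts programmed hold times only (ramping between temperatures is ignored).

```python
# Component volumes in microlitres, copied from the two PCR recipes in the text.
FIRST_ROUND_UL = {
    "gDNA": 2.0, "ddH2O": 15.4, "10x buffer": 2.5, "MgSO4": 3.5,
    "dNTPs": 0.5, "Taq polymerase": 0.1,
    "787F primer": 0.5, "1046R primer": 0.5,
}
SECOND_ROUND_UL = {
    "purified first-round product": 10.0, "ddH2O": 2.3, "10x buffer": 2.5,
    "MgSO4": 3.5, "dNTPs": 0.5, "Taq": 0.2,
    "UniA primer": 0.5, "UniB primer": 0.5,
}


def total_volume_ul(components):
    """Total reaction volume, rounded to avoid floating-point noise."""
    return round(sum(components.values()), 2)


def program_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times for a PCR program (no ramp time)."""
    return initial_s + cycles * sum(step_seconds) + final_s


first_total = total_volume_ul(FIRST_ROUND_UL)    # 25.0
second_total = total_volume_ul(SECOND_ROUND_UL)  # 20.0
# 60 s initial denaturation, 28 cycles of 15 s + 30 s + 30 s, 7 min final elongation
first_round_seconds = program_hold_seconds(60, 28, (15, 30, 30), 7 * 60)  # 2580
```

The totals come to 25 µL for the first round and 20 µL for the second, and the first-round program holds for 2580 s (43 min).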

Sequence processing and data analysis

Sequence quality checks were initially conducted using personal genome machine (PGM) software (Torrent Suite™ v5.6) using default parameters to: 1) remove mixed clonal libraries on Ion Sphere Particles (ISPs) known as polyclonals, 2) remove low-quality sequences, and 3) remove sequences with low quality data at the 3’ end of the read. Prior to sequence processing, the PGM-generated FASTQ file was split into multiple FASTQ files based on exact matches of barcode sequences (mismatches = 0) corresponding to the samples in study using FASTX Barcode Splitter (Gordon and Hannon 2017).
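
The exact-match demultiplexing step can be illustrated with a short sketch. This is not the FASTX Barcode Splitter implementation; it assumes the barcode sits at the very start of each read, and the function name, the sample names and the barcode sequences in the usage example are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign reads to samples by an exact (zero-mismatch) barcode prefix.

    reads    : iterable of (read_id, sequence) tuples
    barcodes : dict mapping sample name -> barcode sequence
    Returns a dict of sample name -> list of (read_id, sequence without barcode).
    Reads that match no barcode are collected under the key "unmatched".
    Assumes no barcode is a prefix of another barcode.
    """
    bins = defaultdict(list)
    for read_id, seq in reads:
        for sample, barcode in barcodes.items():
            if seq.startswith(barcode):
                bins[sample].append((read_id, seq[len(barcode):]))
                break
        else:
            # No barcode matched exactly, so the read is set aside.
            bins["unmatched"].append((read_id, seq))
    return dict(bins)


# Hypothetical 10 bp barcodes and reads, for illustration only.
example_barcodes = {"sample_1": "AAAACCCCGG", "sample_2": "TTTTGGGGCC"}
example_reads = [
    ("r1", "AAAACCCCGGTTACGT"),   # exact match to sample_1
    ("r2", "TTTTGGGGCCAAACGT"),   # exact match to sample_2
    ("r3", "AAAACCCCGATTACGT"),   # one mismatch, so it is not assigned
]
example_bins = demultiplex(example_reads, example_barcodes)
```

With zero mismatches allowed, a single sequencing error in the barcode sends the read to the unmatched bin, which trades some read loss for a lower risk of assigning reads to the wrong sample.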

Unless otherwise stated, all sequence processing and statistical analyses were performed using the Quantitative Insights into Microbial Ecology (QIIME2) pipeline (version 2020.6; Bolyen et al. 2019). Primer and NGS-adapter sequences were removed using Cutadapt (Martin 2011) and reads with no primer-sequence matches were discarded. In total, 8,820,813 raw reads contained the primer sequence and were retained for processing. Sequencing reads were denoised, dereplicated, and chimera-filtered using DADA2 (Callahan et al. 2016) with default parameters, including a maximum expected error (maxEE) of 2 and consensus chimera removal using a minimum parent-sequence fold-count of 1. Based on visual inspection of per-bp position quality score plots, reads were trimmed to 265 bp using DADA2 to maintain high quality scores across reads. The DADA2 sequence quality-check process generated 3,909,887 high-quality reads, corresponding to 3,139 unique amplicon sequence variants (ASVs) across 330 samples. Although no band was observed for our negative control, sequencing resulted in 123 reads, representing 6 ASVs. We followed three steps to accurately assign taxonomy to ASVs across all samples. First, full-length 16S non-redundant small subunit (NR-SSU) sequences clustered at 99% sequence identity from the SILVA database (Version 138.1; Quast et al. 2012; Yilmaz et al. 2014) were used to extract reference reads matching our high-quality filtered ASVs, which were then trimmed to 265 bp. Second, the extracted reference reads were used to train a Naïve Bayes classifier using default Scikit-learn parameters (Quast et al. 2012; Bokulich et al. 2020). Finally, the trained classifier was used to assign SILVA-based taxonomic annotations (Quast et al. 2012; Yilmaz et al. 2014; Bokulich et al. 2020) to all distinct ASVs using a confidence level of 80%.
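
The read-level quality filter can be sketched as follows, assuming the threshold of 2 refers to DADA2's maximum number of expected errors per read. The expected number of errors is the sum of the per-base error probabilities implied by the Phred scores. The function names are illustrative, and this is a simplified stand-in for DADA2's filtering, not its code.

```python
def expected_errors(phred_scores):
    """Expected number of errors in a read: sum of 10 ** (-Q / 10) over bases."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_filter(phred_scores, trunc_len=265, max_ee=2.0):
    """Truncate a read to trunc_len bases, then apply the expected-error cutoff.

    Reads shorter than trunc_len are rejected, as are reads whose truncated
    portion has more than max_ee expected errors.
    """
    if len(phred_scores) < trunc_len:
        return False
    return expected_errors(phred_scores[:trunc_len]) <= max_ee


# A 300-base read at Q30 has 0.001 expected errors per base, so the first
# 265 bases carry about 0.265 expected errors and the read is kept.
keeps_q30 = passes_filter([30] * 300)
# At Q20 (0.01 per base) the same length carries about 2.65 and is dropped.
keeps_q20 = passes_filter([20] * 300)
```
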
The assigned taxonomy labels were used to filter ASV-table sequences classified as Eukaryota, Archaea, Chloroplasts, Mitochondria, and those identified as “Ambiguous taxa” based on the SILVA classifications. In total, 21,084 sequences were removed based on these classifications, accounting for 0.54% of all sequences. ASVs detected in the negative control were assessed for prevalence using the R (version 4.0.2, R Core Team 2016) package decontam (Davis et al. 2018) with a prevalence value of 20%, and 3 contaminant ASVs were subsequently removed from our sequencing library. Two ASVs identified with decontam belonged to genera known as common reagent contaminants: Ralstonia and Burkholderia (de Goffau et al. 2018). To maintain sufficient sequencing depth for statistical analysis, samples with 3000 reads or fewer were removed from the analysis. To visually determine sequencing depth sufficiency, rarefaction curves of observed ASVs were plotted with 10 iterations at each of 10 increments (steps) leading to the smallest sample sequencing depth (3190, Supplementary Fig. 1). The final processed ASV table contained 3,751,832 reads corresponding to 220 samples (Supplementary Table 2) and 2869 unique ASVs.
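
The rarefaction-curve procedure (repeated random subsampling at increasing depths) can be sketched in a few lines. This is a plain-Python illustration and not the QIIME2 implementation; the function names, the seed handling and the toy counts are assumptions made for the example.

```python
import random


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from a dict of ASV -> read count."""
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    subsample = {}
    for asv in rng.sample(pool, depth):
        subsample[asv] = subsample.get(asv, 0) + 1
    return subsample


def rarefaction_curve(counts, max_depth, steps=10, iterations=10, seed=0):
    """Mean number of observed ASVs at `steps` evenly spaced depths."""
    rng = random.Random(seed)
    curve = []
    for i in range(1, steps + 1):
        depth = max_depth * i // steps
        observed = [len(rarefy(counts, depth, rng)) for _ in range(iterations)]
        curve.append((depth, sum(observed) / iterations))
    return curve


# Toy sample with three ASVs and 100 reads in total (hypothetical counts).
toy_counts = {"asv_a": 50, "asv_b": 30, "asv_c": 20}
toy_curve = rarefaction_curve(toy_counts, max_depth=100)
```

A curve that flattens before the chosen depth suggests that additional reads would add few new ASVs, which is the visual criterion described above.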

Diversity analyses

To visualize microbial community composition across hybrid crosses, relative abundances were computed for all ASVs, and ASVs with a relative abundance of at least 0.5% in at least one hybrid cross were used to create representative barplots of major taxa across the hybrid crosses. To achieve uniform sequencing depth across samples, a rarefied table was generated in QIIME2 (version 2020.6) using the minimum library sequencing depth (n = 3,190 reads), and this table was used to compute alpha and beta diversity metrics. To visualize differences among crosses, boxplots were created using the R (v4.0.2; R Core Team 2016) package ggplot2 (v3.3.2; Wickham 2016). To perform alpha diversity analyses, the Chao1, Shannon, and Simpson indices were measured using the rarefied table, and nested-ANOVA models were constructed for each diversity metric in the R (v4.0.2) package lme4 (v1.1–23; Bates et al. 2006). ANOVAs were constructed as follows: “Cross” was used as a main factor, “sire” was nested in cross, and “replicate pen” was nested in “sire” nested in “cross”. Reduced models were constructed for each factor, and likelihood ratio tests (using maximum likelihood) were run to compute χ2 and probability values. Multiple t-tests were conducted post-hoc in the emmeans package (version 1.4.8; Lenth et al. 2018) in R (v4.0.2) to test for pairwise hybrid-cross differences, and corrections for multiple tests were made using Benjamini and Hochberg adjusted p values (Benjamini and Hochberg 1995).
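
For readers unfamiliar with the three alpha diversity indices, the sketch below computes each from a vector of per-ASV read counts for one sample. It is an illustration, not the QIIME2 code: the bias-corrected form of Chao1 and the base-2 logarithm for Shannon are assumptions about the defaults, and Simpson's index is taken as one minus the sum of squared proportions.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of ASVs observed exactly once and twice.
    """
    present = [c for c in counts if c > 0]
    singletons = sum(1 for c in present if c == 1)
    doubletons = sum(1 for c in present if c == 2)
    return len(present) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts, base=2):
    """Shannon entropy of the relative abundances (log base is an assumption)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def simpson(counts):
    """Simpson's index expressed as 1 - sum(p_i ** 2)."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


# Hypothetical counts for one rarefied sample.
sample_counts = [10, 5, 2, 2, 1, 1, 1, 0]
richness = chao1(sample_counts)   # 7 observed + 3 * 2 / (2 * 3) = 8.0
```

Chao1 responds to rare ASVs through the singleton and doubleton counts, whereas Shannon and Simpson weight ASVs by relative abundance, so the three indices can rank the same samples differently.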

To explore differences in the microbiota community composition, the rarefied ASV table was used to measure pairwise distances using an abundance-based metric (Bray-Curtis) and a presence/absence-based metric (Jaccard). Patterns of differences among crosses were visualized by plotting 2D principal coordinate analyses (PCoA) using the two most variance-explaining eigenvectors for Bray-Curtis and Jaccard distance matrices. To simplify visualization, principal coordinate scores were extracted using PAST (v3.25; Hammer et al. 2001), averaged for each cross and plotted as a centroid, and a 95% confidence interval was then computed to draw dispersion error bars. To visualize ASV presence/absence patterns among crosses, we listed ASVs present in the rarefied table for each cross and then created an intersection plot in the UpSetR package (v1.4.0; Lex et al. 2014) in R (v4.2.0; R Core Team 2016). The intersection plot shows the ASVs found exclusively in certain stocks, and ASVs found commonly among multiple stocks. To test whether sample size (number of fish per family) has an impact on the number of ASVs generated, we sub-sampled an equal number of fish per family and compared the number of ASVs generated.
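
The two distance metrics differ in what they use from the ASV table, which the following sketch makes concrete. It computes each distance for a pair of samples given as aligned count vectors (same ASV order in both). The function names and example vectors are illustrative, and this is not the QIIME2 implementation.

```python
def bray_curtis(u, v):
    """Abundance-based dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def jaccard(u, v):
    """Presence/absence distance: 1 - shared ASVs / ASVs present in either sample."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1 - len(present_u & present_v) / len(union)


# Two hypothetical samples over four ASVs.
sample_x = [10, 0, 5, 5]
sample_y = [5, 5, 0, 10]
bc = bray_curtis(sample_x, sample_y)  # (5 + 5 + 5 + 5) / 40 = 0.5
jd = jaccard(sample_x, sample_y)      # 2 shared of 4 present = 0.5
```

Because Jaccard ignores abundances, two samples that share the same ASVs at very different proportions are identical under Jaccard but can be far apart under Bray-Curtis.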

To measure mean differences in community composition, permutational multivariate analysis of variance (PERMANOVA; Anderson 2001) tests were conducted using an R vegan package wrapper in QIIME2. First, global nested-PERMANOVAs were run to test for differences in pairwise distance matrices (Bray-Curtis and Jaccard) among crosses, among sires (nested in crosses) and between replicate pens (nested in sires, with sires nested in crosses). To further explore specific patterns in community composition differences in the microbiota among crosses, post-hoc pairwise PERMANOVAs were conducted for pairs of crosses to determine hybrid cross-specific differences. Pairwise PERMANOVA tests were corrected for multiple comparisons among pairs of crosses using the false discovery rate (Benjamini and Hochberg 1995). Models were also run for each cross separately to determine the statistical significance of microbiota compositional differences due to sire and pen effects (nested within sire). To determine whether significant statistical differences among crosses were attributable to differences in location (mean composition) or in dispersion (spread), ad-hoc permutational homogeneity of dispersion (PERMDISP; Anderson 2004) tests were conducted globally and pairwise among crosses. Each non-parametric model was run with 9999 permutations, and pairwise tests were corrected for multiple comparisons among pairs of crosses using the false discovery rate (Benjamini and Hochberg 1995).
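
The logic of a permutation test on a distance matrix, and of the subsequent false discovery rate correction, can be sketched as follows. This is a deliberately simplified one-way PERMANOVA (a single grouping factor, no nesting), so it does not reproduce the nested models described above. The function names and the toy matrix are assumptions for illustration, and the sketch assumes non-zero within-group variation.

```python
import random


def pseudo_f(dist, groups):
    """One-way pseudo-F from a square distance matrix and a list of group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, permutations=9999, seed=0):
    """Observed pseudo-F and permutation p value obtained by shuffling labels."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        # Small tolerance so relabelings equivalent to the observed one count.
        if pseudo_f(dist, shuffled) >= f_obs - 1e-12:
            hits += 1
    return f_obs, (hits + 1) / (permutations + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: two groups of two samples, close within and far between groups.
toy_dist = [[0, 1, 3, 3], [1, 0, 3, 3], [3, 3, 0, 1], [3, 3, 1, 0]]
toy_f, toy_p = permanova(toy_dist, ["A", "A", "B", "B"], permutations=199)
```

For the toy matrix the observed pseudo-F is 17.0. The p value is the share of label shufflings with a pseudo-F at least as large, and adding one to both numerator and denominator keeps it above zero.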

To determine the presence and the nature of taxa driving community composition differences among crosses, linear discriminant analysis effect size (LEfSe; Segata et al. 2011) tests were run in MicrobiomeAnalyst (Dhariwal et al. 2017) on relative abundances of ASVs. To prevent rare ASV effects, we filtered out ASVs occurring in 5 samples or fewer, and then removed ASVs with 200 reads or fewer across all remaining samples. The remaining ASVs corresponded to 96.4% of all reads in our filtered ASV table.
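
The two-stage rare-ASV filter can be expressed as a small function over an ASV table. The sketch below uses the thresholds stated above; the table layout (a dict of ASV name to per-sample counts), the function names and the toy table are assumptions for illustration and do not reflect MicrobiomeAnalyst's internals.

```python
def filter_rare_asvs(table, max_samples=5, max_reads=200):
    """Drop ASVs found in max_samples or fewer samples, then ASVs with
    max_reads or fewer reads in total.

    table: dict mapping ASV name -> list of read counts, one per sample.
    """
    kept = {}
    for asv, counts in table.items():
        prevalence = sum(1 for c in counts if c > 0)
        if prevalence > max_samples and sum(counts) > max_reads:
            kept[asv] = counts
    return kept


def fraction_of_reads(table, kept):
    """Share of all reads in `table` that belong to the ASVs in `kept`."""
    total = sum(sum(counts) for counts in table.values())
    return sum(sum(counts) for counts in kept.values()) / total


# Hypothetical table with seven samples.
toy_table = {
    "asv_1": [100, 50, 30, 10, 5, 5, 5],  # 7 samples, 205 reads: kept
    "asv_2": [300, 0, 0, 0, 0, 0, 0],     # 1 sample: dropped
    "asv_3": [1, 1, 1, 1, 1, 1, 1],       # 7 samples, 7 reads: dropped
    "asv_4": [40, 40, 40, 40, 40, 40, 0], # 6 samples, 240 reads: kept
}
toy_kept = filter_rare_asvs(toy_table)
toy_fraction = fraction_of_reads(toy_table, toy_kept)
```

Reporting the retained fraction of reads, as done above, shows how much of the data survives the filter even when many low-count ASVs are removed.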

Results

The gut microbiota in pure and domesticated hybrid crosses of Chinook salmon and within-group diversity

The gut microbiota of Chinook salmon is dominated by family Vibrionaceae, genus Mycoplasma, and family Comamonadaceae (Fig. 2). Five major taxa occurred commonly across the hybrid crosses, including family Vibrionaceae, order Alderbacteria, and the following genera: Mycoplasma, Bacillus, and Lactococcus. The mean microbial community Chao1 index across all crosses was 59.7, ranging from 41.7 (PUNT) to 69.1 (YIAL) (Fig. 3); for Shannon diversity, the mean across all crosses was 4.33, ranging from 3.97 (PUNT) to 4.54 (YIAL) (Fig. 3); for Simpson’s index, the mean across all crosses was 0.91, ranging from 0.90 (PUNT) to 0.92 (NIT) (Fig. 3). Large variation was observed in all alpha diversity indices across the crosses in the study (Fig. 3).

Fig. 2: Relative abundance of the major taxa of the gut microbial community of Chinook salmon.

The taxa shown represent microbial taxa assigned to ASVs comprising 0.5% or more of the relative abundance across all hybrid crosses. Other Taxa represents all ASVs comprising less than 0.5% of relative abundance.

Fig. 3: Alpha diversity using Chao1, Shannon, and Simpson indices across all hybrid-crosses.

Boxplots represent 25th percentile (bottom-end) to 75th percentile (top-end) values. Whiskers represent values 1.5 times above or below interquartile range. Dots represent values > 1.5 times and <3 times the interquartile range.

Our likelihood ratio tests showed no significant differences due to cross or sire effects using the Chao1 and Shannon indices, but a significant difference was found between replicate pens for the Chao1 index (F = 1.53, P = 0.038; Table 1) and not for Shannon’s index. Post-hoc pairwise t-tests showed no statistically significant differences in means between pairs of crosses for the Chao1, Shannon, or Simpson indices (Table 1), corroborating the overall analysis results.

Table 1 Analysis of Variance using alpha diversity indices: Chao1, Shannon, and Simpson indices. Nested ANOVAs were constructed using Cross, Sire (nested within Cross), and Pen (nested within Sire within cross) as main factors. Significant differences were found for Pen effects (P < 0.05) using Shannon’s index.

Partitioning of overall microbial community composition variance according to hybrid cross, sire, and netpen effects

Overall, the first two principal coordinates accounted for approximately 37% of all variance in Bray-Curtis distances across all sample pairs and ~22% (PC1 = 15.81%, PC2 = 6.31%) for Jaccard distances. The multidimensional clustering revealed YIAL as an outlier cross, CHILL as intermediate, and the remaining stocks clustering more closely together on the axes (Fig. 4).

Fig. 4: Principal coordinate analysis (PCoA) plot with the principal coordinates (PCs) explaining the highest percent variance.

Pairwise Bray-Curtis distances were used to perform PCoA across all gut microbiota samples from the Chinook salmon hybrid crosses used in the study. Each open circle represents the average PCoA coordinates for a hybrid cross, and error bars represent 95% confidence intervals (CI). Cross abbreviations are defined in Fig. 1.

Using the overall PERMANOVA model, significant cross effects were found in the microbial community structure using Bray-Curtis distances (P = 0.0025, pseudo-F = 1.74) and Jaccard distances (P = 0.00040, pseudo-F = 1.55); however, no sire or replicate pen effects were found on community composition (Table 2). Cross-specific nested-PERMANOVA models showed significant sire effects within CHILL on Bray–Curtis distances (P = 0.028, pseudo-F = 1.48), and within CHILL and NIT on Jaccard distances (P = 0.0017, pseudo-F = 1.35 and P = 0.009, pseudo-F = 1.32, respectively). No significant sire effects were found within other crosses, and no replicate pen effects were found for any cross for Bray-Curtis and Jaccard distances. PERMDISP showed no significant differences of within-group dispersions among crosses for Bray–Curtis (P = 0.83, pseudo-F = 0.42) and Jaccard distances (P = 0.35, pseudo-F = 0.99). Further, no pairwise PERMDISP differences were found among crosses.

Table 2 Microbiota community composition analysis results using nested-PERMANOVA. Sires were nested in crosses, and replicate pens were nested in sires (nested in crosses).

Pairwise PERMANOVA tests using Bray–Curtis distances (BCD) and Jaccard distances (JD) showed that YIAL was statistically different from all hybrid crosses except CHILL (P > 0.05). Further, CHILL differed from CAP (P = 0.0028) with both BCD and JD, and from BQ (P = 0.040) with JD; no other pairwise PERMANOVA comparisons were significant (Supplementary Table 3). Finally, PERMANOVA using only wild crosses showed no overall significant differences among wild crosses (Jaccard P = 0.104, pseudo-F = 1.12; Bray–Curtis P = 0.24, pseudo-F = 1.13), although there were pairwise differences between CHILL and certain populations before FDR correction for multiple tests (Supplementary Table 4).
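
The FDR correction applied to sets of pairwise tests can be sketched as follows, assuming the Benjamini-Hochberg (BH) step-up procedure. The function name is illustrative, and the sketch is not taken from the analysis code.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted P-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A comparison that is significant before correction can lose significance afterwards because each raw P-value is scaled by the number of tests divided by its rank.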

Detecting differentially abundant ASVs among hybrid crosses

In total, 124 ASVs were common to all crosses, and the largest numbers of unique ASVs per cross were found in YIAL (429), CHILL (293), and QUIN (206) (Fig. 5a). YIAL had 429 unique ASVs and 1109 total ASVs when all 38 fish were included (Supplementary Table 4). When the number of fish was reduced to 21 per cross to equalize the number of fish per family, YIAL had 354 unique ASVs and 863 total ASVs (Supplementary Fig. 2 and Supplementary Table 4). Similarly, the number of ASVs common to all crosses fell to 112. Although the number of fish per cross affects the total number of ASVs, YIAL and CHILL still had the highest numbers of observed and unique ASVs regardless of the number of fish tested, indicating that YIAL and CHILL guts have high bacterial diversity (Supplementary Fig. 2). We therefore used data from all fish in further analyses.
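
The counts of shared and unique ASVs reduce to set operations on the ASVs detected in each cross. A minimal sketch follows, assuming a hypothetical mapping from cross name to the set of ASV identifiers detected in that cross after rarefaction.

```python
def shared_and_unique(asvs_by_cross):
    """Return (ASVs found in every cross, {cross: ASVs found only in that cross})."""
    shared = set.intersection(*asvs_by_cross.values())
    unique = {}
    for cross, asvs in asvs_by_cross.items():
        # Pool the ASVs seen in every other cross
        others = set().union(*(s for c, s in asvs_by_cross.items() if c != cross))
        unique[cross] = asvs - others
    return shared, unique
```

Because both quantities depend on how many fish contribute to each set, repeating the calculation on an equal number of fish per cross is a reasonable check on whether the ranking of crosses is robust.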

Fig. 5: Representation of unique ASVs across the hybrid crosses used in the study, accompanied by their corresponding taxonomic designation.

A In the UpSetR plot, the dots on the horizontal lines represent the hybrid crosses in which a unique ASV was detected based on rarefied data, and the bar graph above the dots represents the number of unique ASVs found for the indicated crosses. B The vertical bubble plots present breakdowns of the assigned family-level taxonomy of unique ASVs corresponding to the indicated crosses in (A). The amplicon sequence variant table was rarefied to the minimum library size prior to analysis. Associations with size 9 or less were removed from the plot. Taxonomic classifications are presented for unique ASVs with 10 or more counts across all hybrid crosses.

Considering the taxonomic families of unique ASVs occurring at least 10 times across the hybrid crosses, YIAL showed the largest number of unique ASVs for nine families (Sphingomonadaceae; Moraxellaceae; Corynebacteriaceae; Chitinophagaceae; Pseudomonadaceae; Beijerinckiaceae; Paracaedibacteraceae; Pirellulaceae; Caulobacteraceae), with CHILL showing the highest number of unique ASVs for Lactobacillaceae (Fig. 5b). Furthermore, the ASVs common to all hybrid crosses were most often classified as Bacillaceae (15 ASVs), Comamonadaceae (12 ASVs), Streptococcaceae (14 ASVs), Vibrionaceae (14 ASVs), or Mycoplasmataceae (11 ASVs).

Using multiple (FDR-corrected) Kruskal-Wallis tests implemented in the LEfSe algorithm, 17 ASVs showed significant differences among crosses (Fig. 6). Of these 17 differentially abundant ASVs, 15 ASVs belonging to Bacillus, Bdellovibrio, Hydrogenophaga, Lactobacillus, Lactococcus, Sphingopyxis, and Comamonadaceae were classified as potentially beneficial taxa (Soltani et al. 2019; Dwidar et al. 2012; Minich et al. 2020; He et al. 2018; Oh et al. 2019; Willems 2014), and 2 ASVs belonging to Renibacterium salmoninarum and Yersiniaceae were identified as potentially pathogenic (Evenden et al. 1993; Wrobel et al. 2019), although the Yersiniaceae family also includes non-pathogenic strains. Multiple non-parametric pairwise comparisons within each ASV showed significant differences in the mean abundance for fourteen ASVs between pairs of hybrid crosses (Fig. 7).

Fig. 6: Identification of LEfSe Biomarkers associated with statistically significant ASVs (indicated by FDR P-values of KW sum-rank test) across all hybrid crosses and their associated effect size (represented by LDA scores).

The right-hand mini-heatmap represents abundances associated with each ASV across the hybrid crosses. Filtering was performed to remove ASVs occurring in 5 samples or less, and with 200 reads or less across all remaining samples.

Fig. 7: Histograms showing relative frequencies of candidate gut microbiota taxa across all eight Chinook salmon hybrid crosses.

Shown are the 17 ASVs that differed significantly among the crosses in the LEfSe biomarker analysis, with their associated taxonomic classification. Error bars represent 95% confidence intervals. Letters above error bars represent post-hoc pairwise statistical differences among crosses, based on multiple Student's t-tests of the mean (P < 0.05, adjusted for multiple comparisons with BH). Hybrid cross abbreviations are defined in Fig. 1.

Discussion

This study provides key evidence for two critical host quantitative genetic architecture component effects on the gut microbiota composition of Chinook salmon reared in a common environment. The first component is characterized by significant hybrid cross effects acting among partially reproductively isolated populations at the bacterial 16S rRNA community and ASV levels, implying strong inter-population genetic divergence effects. Based on the published literature for fish, we expected to find among-population gut microbiota differences, indicative of previously reported host genetic divergence effects and the known role of the microbiota in assisting the hosts to cope with their environment (Sullam et al. 2015; Webster et al. 2018). The second component constitutes cross-specific significant sire effects, representing the first report of additive genetic effects acting within populations on fish microbiota composition at the community level. The sire effects observed in the microbiota composition were surprising: although there is no literature on the effects of additive genetics in fish, studies in humans show that they contribute minimally (Yatsunenko et al. 2012; Kurilshikov et al. 2017; Rothschild et al. 2018; but see Goodrich et al. 2014b). While we did not find hybrid-cross or sire effects on alpha-diversity, we did find significant, but small and rare, pen effects on alpha-diversity reflective of environmental effects. Pen effects were not expected as the replicate pens were designed to be as similar as possible (size, water quality, feeding regime, etc.); however, these differences are likely due to the generally reported high magnitude of environmental drivers on the microbiota (Wu et al. 2013; Goodrich et al. 2014a; Sullam et al. 2015; Rothschild et al. 2018). Overall, this study presents a rare but much-needed approach to studying quantitative genetic architecture and putative host-genotype effects on the fish microbiota.

A surprising and key aspect of this study is the presence of strong hybrid-cross effects on microbiota composition despite the use of replicate net pens to control for environmental variation. By eliminating previously reported environmental confounders, such as diet, as contributors to microbiota variation (Sullam et al. 2015; Webster et al. 2018), we further highlight the importance of quantitative genetic architecture effects on the microbiota. Even after limiting potentially confounding rearing and maternal environmental effects (Goodrich et al. 2014a; Aykanat et al. 2012b), among-hybrid-cross effects accounted for the largest amount of variation explained for microbiota compositional differences. For instance, hybrid effects accounted for over 36% of variance explained for microbiota composition using pairwise Bray-Curtis distances. Therefore, we show that the divergent microbiotas among hybrid crosses reflect primarily additive among-population quantitative genetic architecture effects. Such a pattern of effects is consistent with domestication selection (in YIAL) and local selection pressures among the native environments of each of the contributing populations.

Two main drivers of phenotypic differences between animals reared in captivity and their wild counterparts are genetic selection for favorable traits (domestication) and a direct effect of the environment on phenotypic variation (Metcalfe et al. 2003; Webster et al. 2020). This study provides motivation for further exploration of the role played by selection-driven phenotypic variation (Kawecki and Ebert 2004) among hybrid crosses at the putative functional sequence (ASV) level. For instance, CHILL showed the most extreme counts of lactic acid bacteria (LAB), frequently showing significantly higher relative abundances in pairwise contrasts with other crosses. Lactic acid bacteria are known to contribute favorably to host health in fish (Ingerslev et al. 2014; He et al. 2018). If higher LAB abundance is indeed adaptive, this may explain why CHILL had the highest survival observed at the sampling (saltwater) incubation phase (Semeniuk et al. 2019). Moreover, YIAL and CHILL harbored higher counts of ASVs known to exhibit biochemical and ecological versatility, such as Comamonadaceae (Willems 2014). Finally, three hybrid crosses (BQ, CAP, and PUNT) showed higher levels of Bacillus sp. bacteria, thought to play beneficial growth, immune, and probiotic roles in fish health (Soltani et al. 2019). However, it is critical to note that the potential disconnect between putative pathogenicity and actual health and disease means that there may be non-beneficial or non-pathogenic taxa in some of the groups we discuss. Therefore, while not conclusive, these results point towards non-random microbiota differences due to host genetic effects. Further work is needed to characterize the effects culminating in the formation of divergent microbiota community compositions among population crosses.

At the hybrid-cross level, the fully-domesticated cross, YIAL, exhibited the most divergent community composition at the ASV level. Indeed, YIAL consistently showed significant compositional differences from all other crosses bar CHILL. As all fish were reared in a common environment, it is likely that strong domestication selective pressures experienced within the YIAL production stock led to rapid divergence in both the host and gut microbiota community. Although differing captive-rearing pressures are known to affect the microbiota (Roeselers et al. 2011; Webster et al. 2020), it is not known whether microbiotas reared under domesticated conditions would yield individuals with higher fitness. Determining the competitive performance of hybrid-cross versus fully domesticated microbiota phenotypes will have important consequences for salmonid conservation programs aiming to supplement declining populations of Chinook salmon (Janowitz‐Koch et al. 2018), or for determining the impact of unintentional introgression following salmon aquaculture escapes (Glover et al. 2017; Wringe et al. 2018).

The most unexpected wild-hybrid microbiota results were observed for CHILL, as its microbial community was compositionally different from BQ (Jaccard distances) and CAP (Jaccard and Bray-Curtis), but not from any other hybrid crosses, including YIAL. These patterns indicate that CHILL possesses an intermediate microbiota composition similar to YIAL but less divergent from other populations. Interestingly, using the same study system and hybrid crosses described here, CHILL was shown to vary from the other hybrid crosses in related studies. For example, in a study designed to detect gene expression differences among and within the hybrid crosses, CHILL exhibited a marked difference in gene transcription profile relative to the other hybrid cross stocks (including YIAL), consistent with the divergence pattern observed in this study (Toews et al. 2019). Furthermore, over the entire production period, CHILL exhibited the lowest survival relative to the other crosses (Semeniuk et al. 2019). Here, we consider potential evolutionary processes contributing to CHILL’s intermediate microbiota. If the CHILL microbial community reflects population additive genetic effects, then its differing composition could be explained by forces of selection, including those of anthropogenic stressors, or genetic drift (Yeaman and Otto 2011). The native environment of the CHILL hybrid cross, the Chilliwack River (CHILL) channels, has historically experienced a wide range of anthropogenic and natural stressors, which may have contributed to these processes. The known anthropogenic stressors include extensive forest harvesting (Boyle et al. 1997) and road building (Blackwell et al. 1999), introducing woody debris and sediments into the river, respectively. In addition, there are known natural stressors, including large floods experienced in the Chilliwack River between 1952 and 1980 (Ham 1996), but the impact of those floods on the Chinook salmon stocks is unknown (Bradford 1995).

This study presents the first report of within-population additive genetic variance effects on the microbiota composition in fish. Additive genetic variation is a critical component of the overall quantitative genetic architecture for any trait, as it defines the scope for traditional evolutionary response to selection (Gjedrem 1983; Garcia de Leaniz et al. 2007; Visscher et al. 2008; van Oppen et al. 2015). Although within-population microbiota variation was previously found among unrelated families of rainbow trout (Navarrete et al. 2012), estimates of additive genetic variation for fish gut microbiotas are lacking. Inferring additive genetic effects was made possible by the breeding design used in this study: using a common egg source (i.e. highly inbred females combined) for all crosses meant that all additive genetic variance is assumed to be contributed by sires. Using individuals from multiple families (represented by sires) within each hybrid cross allowed for the estimation of additive genetic variance using their known relatedness as half-siblings. Given this breeding design and previous reports of low additive genetic variance, we expected that no additive genetic effects would be observed in this study. Indeed, compared to additive among-cross variance, sire effects were low, indicating that additive genetics contribute little to overall microbiota composition. This finding is in agreement with studies in other vertebrates demonstrating a small influence of additive genetic variance on microbiota variation in cows (Difford et al. 2018), mice (Leamy et al. 2014), and humans (Yatsunenko et al. 2012; Kurilshikov et al. 2017; Rothschild et al. 2018; Brüssow 2020). Despite the overall low additive genetic variance contribution to microbiota variation across all study crosses, we found significant, cross-specific, additive genetic effects on microbiota composition at the community level. This is observed for CHILL (using Bray-Curtis and Jaccard distances) and NIT (Jaccard), indicating that the natal environments of sires from these population crosses select for more diverse microbiotas. As few studies have been conducted on these effects, their underlying causes, and their role as genetic factors in microbiota variation, remain a knowledge gap. A previous study showed that microbial quantitative trait loci (mbQTLs) interact with host immunity to shape the gut microbiota in humans (Kurilshikov et al. 2017). Additionally, MHC class II genotypes are known to contribute to the regulation of the microbiota composition among hosts in a sex-dependent manner in three-spine stickleback (Gasterosteus aculeatus; Bolnick et al. 2014c). Variation in underlying quantitative genetic architecture factors such as additive genetic variance among populations is critical to predicting a population’s immediate response to selection and is requisite for artificial selection-based commercial (e.g. aquaculture) and non-commercial (e.g. conservation and restoration) breeding applications (Gjedrem 1983; Falconer and Mackay 1996; Visscher et al. 2008; van Oppen et al. 2015).

Pairs of replicate net pens for each hybrid cross were used to allow the partitioning of possible environmental effects; however, our use of common rearing environments and matched net pens made strong environmental effects on gut microbiota unlikely, mirroring recently published findings on select most abundant OTUs in domestic juvenile Atlantic salmon (Dvergedal et al. 2020). Nonetheless, replicate pen (i.e., environmental) effects were found for the Chao1 index, corroborating findings from previous studies (Schmidt et al. 2016). Given that the Chao1 index gives more weight to low-abundance species (Kim et al. 2017), this may indicate that microbiota alpha diversity is strongly influenced by the environment. These environmental effects may be explained by fine-scale environmental heterogeneity. Such effects can drive subtle phenotypic differences, often complicating the study of local adaptation, or genetics, in host-microbe systems (Kaltz and Shykoff 1998; Savolainen et al. 2013). Furthermore, we suspect that uncontrollable variation in social interactions among individuals may exist within pens (Gilmour et al. 2005) and drive microbiota differences between replicates. Another pattern observed with the use of a common hatchery/netpen environment is that our alpha diversity indices across all population crosses matched those of salmonids of similar age/size sampled from hatchery environments more closely than those caught in the wild (for Chao1 and Shannon indices, Llewellyn et al. 2016 and Fogarty et al. 2019; for Chao1 but not for Shannon’s index, Webster et al. 2018; for Shannon and Simpson indices, Villasante et al. 2019), regardless of the hybrid status of the fish. This emphasizes the challenge in minimizing the effect of the environmental factors driving the gut microbiota, which have been shown to dominate host-related factors in humans (Wu et al. 2013; Rothschild et al. 2018).

In conclusion, our study shows a rarely reported pattern of population-level variation in the gut microbiota community in fish. Such a pattern is consistent with local adaptation, perhaps due to selection associated with seven generations of domestication combined with local selection forces acting to create divergent microbiota community compositions. Inter-population effects were the largest and most consistent drivers of gut microbiota variation among the hybrid cross stocks. Additive genetic variance contributed to microbiota community variation in a cross-specific manner, and insignificantly to overall microbiota composition. Although pen effects contributed insignificantly to community composition, rare significant effects were found for alpha diversity. Microbiota ASV-level effects were found to be population-specific, further supporting the role of local population effects driving microbiota structure, despite rearing in a common environment with a common dam. Our findings highlight the effects of host-genetic variation in determining Chinook salmon microbiota composition, which will have important consequences for conservation programs aiming to protect the endangered species from potential escapes and enhance its survival. Future studies measuring genetic divergence among hybrid crosses and microbiota gene expression could allow us to test host-microbiota codivergence patterns and their significance.

Data accessibility

Raw sequences that support the findings of this study are publicly available from the National Center for Biotechnology Information, under BioProjectID PRJNA680597 and SubmissionID SUB8621326. Code used for microbial data analysis is provided on GitHub: https://github.com/Raochaganti/Aquatic-Microbiome.