One of the major challenges of today is how to secure the supply of healthy and nutritious foods for the rapidly expanding human population. Aquaculture, as the fastest growing sector of animal protein production (FAO 2018), is constantly challenged by outbreaks of infectious diseases that inflict considerable damage on the industry and impede its further development (Lafferty et al. 2015). Diseases are a major problem partly because measures for their prevention and control under aquaculture conditions are very limited. Therefore, alongside vaccine development (Gudding and van Muiswinkel 2013; Sommerset et al. 2005), breeding of disease-resistant strains is a sustainable solution to a range of diseases in various aquaculture species (Chevassus and Dorson 1990; Fjalestad et al. 1993; Stear et al. 2012). Accordingly, studies to identify sources of resistance, understand resistance mechanisms and improve resistance in susceptible stocks have been conducted for several fish species and different diseases (Palti et al. 2015; Tadmor-Levi et al. 2017; Vallejo et al. 2017; Yáñez et al. 2014).

Despite the severity of problems caused by infectious diseases and the research efforts invested in addressing them, successful incorporation of resistant strains in aquaculture production is limited. The notable successful examples are of strains in which a major disease resistance quantitative trait locus (QTL) has been identified in cultured stocks (Fuji et al. 2006; Houston et al. 2010; Moen et al. 2009). A few difficulties impede the breeding of disease-resistant strains. Having a reliable disease challenge model that resembles infection conditions in the field is necessary for accurate trait measurement. Since an effective challenge model is not easy to establish (Gjedrem and Rye 2018), estimates of breeding value and heritability can be compromised (Bishop and Woolliams 2010; Ødegård et al. 2011; Stear et al. 2012). In addition, improvement by breeding requires trait variation, which is often more limited in cultured stocks as they suffer from the disease and thus, lack the resistance. Finally, uncovering the genetic basis, the results of which can enhance breeding, can be a daunting challenge for multigenic disease resistances, especially given the difficulties in phenotypic measurements mentioned above and more so in non-model organisms.

Common carp (Cyprinus carpio), one of the five most-produced fish species, is an important source of animal protein in the diet of people in many parts of the developing world. A major threat to aquaculture of both food strains and ornamental Koi varieties is a disease caused by the cyprinid herpes virus type 3 (CyHV-3), also called Koi herpes virus (KHV), a double-stranded DNA virus of the Alloherpesviridae family. Reports of outbreaks of this disease started in the 1990s and since then, the disease has spread to almost all places where common carp is cultured (Adamek et al. 2014; Ilouze et al. 2011; Rakus et al. 2013). The spread of outbreaks indicates that cultured carp strains in many countries are susceptible to the disease. Several studies noted the susceptibility of cultured strains in contrast to the resistance of a feral strain, which is unsuitable for aquaculture (Dixon et al. 2009; Ødegård et al. 2010; Piačková et al. 2013; Rakus et al. 2009; Shapira et al. 2005; Tadmor-Levi et al. 2017; Zak et al. 2007).

To address the problems caused by CyHV-3 disease and develop disease-resistant strains, our group has been using introgression crosses between the feral strain Amur-Sassan, as a source for resistance, and susceptible cultured food strains. Many families of both F1 and backcross1 (BC1) generations were produced, holding large variation for CyHV-3 resistance, which was heritable from one generation to another. Furthermore, a reliable and robust disease challenge protocol was developed allowing accurate measurements of resistance levels (Tadmor-Levi et al. 2017). The current study investigates the genetic basis of CyHV-3 survival. QTL regions affecting CyHV-3 survival were identified and main QTL effects, dominance and epistasis within and between QTLs were validated in multiple families. Analyses highlighted the utility of a feral strain as a source for resistance and some of the difficulties in applying the associated markers to assist selection for improved resistance.

Materials and methods

Carp families production and phenotyping

Families were produced and reared as described by Bar et al. (2013). Three-way crosses were used to produce mapping families 2 (MF2) and 3 (MF3). Female parents were an F1 cross between a feral strain Amur-Sassan (“S”) (Pokorný et al. 1995), and Našice, a cultured Yugoslavian food strain (“Y”) (Wohlfarth et al. 1975). Male parents were from the cultured strain Dor-70 (“D”) (Wohlfarth et al. 1975). At 12 months old and weight of 30–50 g, 110 fish from each family were tagged and sampled for DNA extraction. Fish were infected by cohabitation with CyHV-3 according to our disease challenge model (Tadmor-Levi et al. 2017). Throughout the challenge trial, each fish was scored for end status (survivor/dead) and for day to mortality (post-infection). Survivor fish received the score of 100 days. One additional mapping family (MF1) [described in Kongchum et al. (2011)] and 11 additional families from a backcross generation 1 (BC1) background were used [described in Tadmor-Levi et al. (2017)]. The BC1 families were of two similar, yet somewhat variable, genetic backgrounds [(S × Y) × Y or (S × D) × D]. Since some parents were used more than once, the 14 families of this study were produced by 6 “F1 – S × Y”, 5 “F1 – S × D”, 2 “Y” and 4 “D” parents (Table S1).

Genotyping by sequencing analysis

DNA was extracted from fin clips of individuals using a standard phenol–chloroform protocol (Sambrook and Russell 2006). Because mortalities during the challenges reached 65–70%, all survivor fish (29 from MF2 and 25 from MF3) were selected for genotyping by sequencing (GBS) analysis and these were augmented by dead fish to a total of 91 per family. GBS analysis was done at the Genomic Diversity Facility of the Cornell University Biotechnology Resource Center. PstI restriction and 96-plex multiplexing were used as in Elshire et al. (2011). Each 96-plex, comprising 91 individuals from one family, four parents of both families, and one blank control, was sequenced on a single lane of an Illumina HiSeq 2500 flow cell to obtain 100 bp single-end reads. A total of 337,192,585 and 284,402,761 raw sequence reads were obtained for MF2 and MF3, respectively. Out of these, 294,307,761 and 260,706,711 were usable barcoded reads that were combined to 26,022,077 and 17,435,366 sequence tags for MF2 and MF3, respectively. For all samples, except one from MF3, sufficient reads (above 0.5 million) were obtained. After merging sequence tags from all samples, a total of 4,918,724 unique tags were aligned to the carp genome assembly produced by Henkel et al. (2012). Out of these, 3,968,777 tags (80.7%) aligned with unique positions, 434,454 (8.8%) with multiple positions, and 515,493 (10.5%) could not be aligned. Tags of similar sequence (up to 5% mismatch allowed) were stacked for single nucleotide polymorphism (SNP) calling in individuals. All analysis steps were performed using default settings of the GBS analysis pipeline in TASSEL3 software (Glaubitz et al. 2014) and resulted in 64,606 HapMap format SNPs.

Genetic map construction

For each family, markers deviating from the expected Mendelian segregation ratio (χ2, P < 0.05) and markers for which progeny genotypes did not match parental genotypes were filtered out using JMP13 software (SAS Institute, NC). After filtering, a total of 27,128 markers, polymorphic in at least one of the four parents (Table 1), were used to construct a single sex-averaged map for both families, using LepMap2 software (Rastas et al. 2015). First, the filtering module of the software was used to exclude markers with >10% missing data in each family separately. Afterwards, the SeparateChromosomes module assigned markers to linkage groups (LGs). All LG solutions with logarithm of the odds (LOD) scores of 5–16 were considered and the solution with LOD score 12 was chosen, since it gave both high confidence in the linkage among markers and the expected number of LGs (n = 50). This solution excluded about 3000 markers with lower confidence. Marker order was calculated using the OrderMarkers module. All “duplicated” markers (markers sharing identical genotypes in all individuals) were concatenated by the software and one representative was used for calculations of order. For each LG, 10 order solutions were correlated with each other using marker LG position and order. Markers that were consistently mapped to different positions were removed to retain only consistent map locations. After removal of inconsistent markers, 10 more order solutions were made and the best order solution (highest likelihood) was selected and refined by running the EvaluateOrder module 10 times for each LG. The highest likelihood solution was picked for the final map, which contained 50 LGs numbered in descending order of the number of markers included. Table S2 details marker information including polymorphism type, physical position, and map position. The map was charted using MapChart 2.3 (Voorrips 2002).
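The Mendelian segregation filter described above can be illustrated with a short sketch. It assumes a marker that is heterozygous in only one parent, so that two genotype classes are expected at a 1:1 ratio among progeny (1 degree of freedom). The function names and the example counts are hypothetical; the study itself performed this step in JMP13.

```python
import math

def chi2_p_df1(chi2):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def segregation_test(n_a, n_b):
    """Chi-square test of two progeny genotype classes against a 1:1 ratio."""
    expected = (n_a + n_b) / 2.0
    chi2 = (n_a - expected) ** 2 / expected + (n_b - expected) ** 2 / expected
    return chi2, chi2_p_df1(chi2)

def keep_marker(n_a, n_b, alpha=0.05):
    """Keep a marker unless it deviates from 1:1 at the chosen alpha."""
    _, p_value = segregation_test(n_a, n_b)
    return p_value >= alpha

# Hypothetical genotype counts among 91 genotyped progeny.
print(keep_marker(48, 43))  # mild deviation from 1:1
print(keep_marker(60, 31))  # strong deviation from 1:1
```

Markers heterozygous in both parents segregate into three classes (1:2:1) and would need a 2-degree-of-freedom version of the same test.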

Table 1 Polymorphic GBS markers by parental backgrounds and families

QTL identification

For QTL identification, a file containing information on segregation type of mapped markers (paternal or maternal; markers heterozygous in both parents were removed by the software) of each family was analyzed using a generalized linear model (GLM) in TASSEL5 software (Bradbury et al. 2007). P-values were Bonferroni adjusted for multiple comparisons. Significant markers with [–Log10(P-value)] of 4 and higher were grouped by their map position to identify QTL regions. QTL boundaries were calculated using the LOD-1.5 support intervals method (Dupuis and Siegmund 1999). The left boundary of QTL2 was extended to the start of the LG since this QTL was identified by paternal markers, none of which could be found upstream of the peak.
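The thresholding and interval steps above can be sketched as follows. This is a simplified illustration and not the TASSEL5 procedure: the number of tests, the score profile, and the function names are hypothetical, and the support interval is taken as the contiguous run of map positions around the peak whose score stays within 1.5 units of the maximum.

```python
import math

def bonferroni_neglog10_threshold(alpha, n_tests):
    """Genome-wide threshold on the -log10(P) scale after Bonferroni adjustment."""
    return -math.log10(alpha / n_tests)

def lod_drop_interval(positions, scores, drop=1.5):
    """Return (left, right) map positions of the contiguous region around the
    peak whose score is at least (peak score - drop).

    positions: map positions in cM, sorted in ascending order
    scores:    one score per position (LOD-like, higher = stronger support)
    """
    peak = max(range(len(scores)), key=scores.__getitem__)
    cutoff = scores[peak] - drop
    left = peak
    while left > 0 and scores[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(scores) - 1 and scores[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]

# Hypothetical example: 1000 tests and a toy score profile along one LG.
print(round(bonferroni_neglog10_threshold(0.05, 1000), 3))
print(lod_drop_interval([0, 5, 10, 15, 20, 25], [1.0, 3.0, 5.5, 6.0, 4.6, 2.0]))
```

When no marker is available on one side of the peak, as for the paternal markers of QTL2, the loop simply stops at the end of the linkage group, which mirrors the decision to extend that boundary to the start of the LG.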

Extracting candidate gene lists from QTLs

All markers within the QTL boundaries were selected and their sequence tags were aligned with genomic contigs of the common carp genome assembly (Kolder et al. 2016). A list was made of all annotated carp genes within these contigs. Since the common carp genome assembly is still discontinuous, some genes were potentially skipped. Therefore, using the carp gene list, orthologous regions on the zebrafish (Danio rerio) (ZF) genome (version GRCz11) were identified. Each putative carp gene was assigned a map location (cM) according to the closest marker on the same genomic scaffold. Given the good co-linearity between species, the carp lists were complemented by genes from the orthologous regions of the well-annotated ZF genome. Based on the ZF gene IDs and annotations, gene ontology (GO) term analysis was used to assign carp genes to GO term categories and identify carp immunity-related genes.

QTL validation

Primer pairs (Table S3) were designed for high-resolution DNA melting (HRM) analysis of selected GBS markers at the peak of each QTL. Each marker was tested on a panel of eight progeny from each mapping family to validate the originally called GBS genotypes. Validated markers were then tested for segregation and association with survival in the validation BC1 families. For the same purpose, HRM markers were developed within the coding regions of IL10a (QTL3) and IL10b (QTL4). Hence, QTLs were validated in families based on genotypes of one or more markers per QTL (Tables S4 and S5), depending on whether the markers were polymorphic. For some markers, a nested HRM PCR was done using a larger PCR amplicon (diluted 1:50) as a template. Nested HRM increased locus specificity, especially for markers where amplification of both paralogous loci was suspected. For analysis, the genotype per QTL of each individual was determined by consolidating data from all QTL markers. For QTL2 and QTL3, markers were only a few hundred base pairs apart and thus, genotypes of the few individuals with marker discrepancies within the QTL were removed. For QTL1, the two markers were further apart and a few recombinants were found. Therefore, kvhrm4, which was polymorphic in more families and had a stronger association with survival, was chosen as the representative marker for this QTL. To validate the main effect of each of the QTLs, association between individual genotypes and phenotypes was tested by a likelihood ratio χ2 test using JMP13 software. P-values of 0.05 or lower were considered significant, whereas values between 0.05 and 0.1 were considered suggestive.
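As an illustration of the association test described above, the following sketch computes a likelihood ratio χ2 (G) statistic for a genotype-by-survival contingency table and applies the significance categories used in this study. It is a minimal stand-in for the JMP13 analysis: the function names and the example counts are hypothetical, and P-values are computed in closed form only for 1 or 2 degrees of freedom (two or three genotypes against survivor/dead).

```python
import math

def likelihood_ratio_chi2(table):
    """G-test of independence for a genotype x status table of counts.

    table: list of rows (genotypes), each a list of counts [survivors, dead].
    Returns (G, degrees of freedom, P-value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing to G
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df == 1:
        p_value = math.erfc(math.sqrt(g / 2.0))
    elif df == 2:
        p_value = math.exp(-g / 2.0)
    else:
        raise ValueError("closed-form P-value implemented for df = 1 or 2 only")
    return g, df, p_value

def classify(p_value):
    """Categories used in the text: <= 0.05 significant, 0.05-0.1 suggestive."""
    if p_value <= 0.05:
        return "significant"
    if p_value <= 0.1:
        return "suggestive"
    return "not significant"

# Hypothetical family: two genotypes, counts of [survivors, dead].
g, df, p = likelihood_ratio_chi2([[20, 10], [10, 20]])
print(round(g, 3), df, classify(p))
```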

PCR procedures

A template of 50 ng purified DNA was used in a total reaction volume of 20 µL, as in Shapira et al. (2014). For SNP genotyping by HRM analysis, the PCR mixture was supplemented with 50 µM of SYTO-9, a fluorescent DNA intercalating dye. PCR primers were designed to produce short amplicons of 100–200 bp in size. PCR was followed by a melting profile increasing by 0.1 °C per second from 65 °C to 95 °C. HRM analysis to determine genotypes was done using LightCycler® 96 software (Roche).

Determination of QTL alleles and genotypes

Different HRM melting patterns were observed in different families, indicating that HRM amplicons included more polymorphisms than originally identified in the mapping families. Since the melting pattern of PCR products does not directly translate into the underlying sequence alleles and genotypes, sequencing was required to determine how many distinct alleles were in each QTL and which alleles were segregating in each family. Thus, primer pairs (Table S3) were designed for sequencing the genomic region containing the HRM marker amplicon. For each polymorphic HRM marker, parents and progeny representing all different HRM melting patterns in each family were sequenced to define QTL genotypes. For families MF1 and BC1-D1, QTL1 and QTL3 genotypes, respectively, were not determined due to repeated failure of sequencing reactions; therefore, individuals were divided only into genotypic groups according to HRM patterns. Sequencing of the amplified QTL region was done as in Shapira et al. (2014). Sequences were aligned to a genomic reference (Kolder et al. 2016) using SeqScape v2.7 software (Applied Biosystems). The number of distinct alleles and genotypes was determined by matching sequences with HRM patterns.

Dominance and epistasis analyses

To test for interactions between alleles (dominance), families having three genotypes per QTL were selected. The difference in proportion of survival between the three genotypes in each family was tested by the Steel-Dwass all-pairs test. To test for QTL–QTL interactions, a full factorial GLM analysis, including QTLs as main fixed effects and their pairwise interaction, was used separately for each pair of QTLs in each of the families. The binomial (survivor/dead) phenotype of fish was analyzed by a Logit link function test (JMP13 software). The P-values of main effects from previous χ2 tests and those of interactions from full factorial tests were used for comparisons across families. P-values of 0.05 or lower were considered significant, whereas values between 0.05 and 0.1 were considered suggestive.
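What a QTL–QTL interaction means on the logit scale can be shown with a small sketch. For two QTLs with two genotypes each, the interaction term of a saturated logistic model (under 0/1 dummy coding) equals a contrast of the four log-odds of survival; a value near zero corresponds to parallel lines in an interaction plot. The function names and survival proportions below are hypothetical, and this sketch only computes the contrast; it does not reproduce the significance test run in JMP13.

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def interaction_contrast(p00, p01, p10, p11):
    """Interaction on the logit scale for a 2 x 2 genotype layout.

    pXY is the survival proportion of fish carrying genotype X at the first
    QTL and genotype Y at the second QTL.
    """
    return logit(p11) - logit(p10) - logit(p01) + logit(p00)

# Hypothetical proportions. Additive on the logit scale: no interaction.
print(interaction_contrast(0.5, 0.5, 0.5, 0.5))
# The second QTL matters only on one background of the first: interaction.
print(round(interaction_contrast(0.2, 0.2, 0.2, 0.8), 3))
```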

Analyzing effects of survival QTLs across families

Individuals of all families together were combined to represent the study population and QTL effects were tested at two levels. First, the effect of different genotypes at QTL2 (the most common QTL) on survival of all individuals of all families together was tested by a Logit link function test (JMP13 software) incorporating family % survival as a covariate to account for family differences. Secondly, the difference in mean % survival between families with different frequency of QTL alleles was tested for each allele of each QTL. To do so, the frequency of QTL alleles in a family was calculated as the expected frequency based on parental genotypes. For progeny of two heterozygous parents for the same alleles, the expected allele frequency was 0.5. For progeny of one homozygous and one heterozygous parents sharing one allele, the expected allele frequency was 0.75 for the shared allele and 0.25 for the other. Actual deviations from the expected frequencies were random and this was validated by a χ2 test (χ2 likelihood ratio; P > 0.05). Then, families with the same allele frequency were grouped, their family % survival values were arcsine transformed [Y’ = Arcsine (√Y)] and a t-test (JMP13 software) was used to test the difference in mean % survival between families with different QTL allele frequencies.
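The two calculations described above, the expected allele frequency in a family and the arcsine transformation of family survival, can be sketched as follows. The function names and genotype labels are hypothetical; survival is assumed to be expressed as a proportion (0 to 1) rather than a percentage before the transformation is applied.

```python
import math
from collections import Counter

def expected_allele_freq(parent1, parent2):
    """Expected allele frequencies among progeny of two diploid parents.

    Each parent is a pair of allele labels, e.g. ("1", "2"). Every parental
    allele is transmitted with equal probability, so the expected progeny
    frequency of an allele is its count among the four parental alleles / 4.
    """
    counts = Counter(parent1 + parent2)
    return {allele: count / 4 for allele, count in counts.items()}

def arcsine_sqrt(proportion):
    """Arcsine square-root transformation, Y' = arcsin(sqrt(Y)), in radians."""
    return math.asin(math.sqrt(proportion))

# Two heterozygous parents sharing the same alleles: 0.5 each.
print(expected_allele_freq(("1", "2"), ("1", "2")))
# One homozygous and one heterozygous parent: 0.75 and 0.25.
print(expected_allele_freq(("1", "1"), ("1", "2")))
# A family with 30% survival, given as a proportion.
print(round(arcsine_sqrt(0.30), 4))
```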


Results

Mapping of CyHV-3 survival QTLs

Two relatively susceptible mapping families, MF2 and MF3 (mean survival of 30 and 35%, respectively), were CyHV-3 challenged (Fig. 1a). Genotyping these progeny by GBS yielded 27,128 polymorphic markers. Of these polymorphisms, 76% were contributed by the crossbred females (Y × S), 16% by the D males and 8% by both (Table 1), consistent with the feral “S” strain being divergent from the cultured “D” and “Y” food strains. One sex-averaged genetic linkage map was constructed from markers of both families and anchored by the 17% of markers shared between families (Fig. 2). The map contained 15,829 non-redundant markers grouped into 6495 unique map positions on 50 LGs with a total length of 3453.43 cM. The number of unique positions per LG ranged from 83 to 240 with an average of 129.9. LG size ranged from 42.04 cM to 121.37 cM with an average of 69.07 cM. Spacing between marker clusters (unique positions) ranged from 0.001 to 18.486 cM with a mean of 0.54 cM, and 84.76% of the markers were <1 cM apart.

Fig. 1

Phenotypic and genetic variation in CyHV-3 survival. a, b Cumulative mortality curves of families challenged by CyHV-3 for a three mapping families used for QTL analysis [data for MF1 was adapted from Kongchum et al. (2011)] and b BC1 families used for QTL validation [phenotypic data adapted from Tadmor-Levi et al. (2017)]. In common for panels c-e, each dot is a polymorphic GBS marker and the genome-wide significance [–Log10(P-value)] (GLM-TASSEL) of the association between marker and survival (survivor/dead) was plotted against linkage map position. Red lines mark the genome-wide significance (Bonferroni adjusted) threshold. c Manhattan plots showing all mapped GBS markers ordered according to their positions in linkage groups 1 to 50 (different dot colors) for the two mapping families (MF2 and MF3). QTL1 and QTL2 are marked next to the relevant plot peaks. d, e Zoom in on the QTL containing LGs. Markers on LG30 (d) or LG46 (e) are colored by mapping families and have different shapes by origin (maternal or paternal) to indicate what parent contributed the QTL allele polymorphisms

Fig. 2

Sex-averaged genetic linkage map for Cyprinus carpio based on GBS markers from two mapping families (MF2 and MF3). Each bar is one of 50 linkage groups. Unique marker positions (clusters of markers mapping to the same genetic position) are presented by a line across the LG bar. All LGs are scaled uniformly and a cM scale is shown on the left. Positions of the four CyHV-3 survival QTLs are shown next to their respective LGs and the mapping intervals of QTLs 1 and 2 are marked by black lines

Applying a GLM analysis using all mapped markers, 92 markers in MF2 and 11 markers in MF3 were significantly associated (Bonferroni-adjusted P < 0.05) with survival (Fig. 1c). Ten of these associated markers were shared by both families. Similar markers were also associated with days to mortality and thus, only the survival measure (survivor/dead) was further analyzed as the phenotype. Of these markers, 67 were clustered on the map to form QTL1 on LG30 and 26 to form QTL2 on LG46 (Fig. 2). QTL1 significantly affected survival in MF2, whereas QTL2 did so in both families. The boundaries for QTL1 were between 4.838 cM and 24.374 cM (Fig. 1d), and for QTL2, between 0 cM (left end) and 38.789 cM (Fig. 1e). In a previous collaborative study (Kongchum et al. 2011) using a candidate gene approach, IL10a was reported as a potential candidate gene for CyHV-3 survival in a different family (MF1, Fig. 1a). Therefore, markers developed specifically for IL10a (QTL3) and its paralog IL10b (QTL4) were used to genotype the mapping families, which allowed them to be mapped to LG14 and LG43, respectively (Fig. 2). This mapping step verified that the IL10 copies were QTLs independent of those newly identified in MF2 and MF3.

From the sequence scaffolds of the carp genome assembly, 112 and 341 putative genes were extracted for the QTL1 and QTL2 regions, respectively. Based on the carp gene lists, orthologous regions to QTL1 and QTL2 were identified on ZF chromosomes 2 and 21, respectively. These regions showed good co-linearity between species, except for one possible inversion in the QTL1/chromosome 2 regions (Fig. 3). Due to better assembly and improved annotation of the ZF genome and based on the syntenic relationship, carp gene lists were complemented with 652 and 629 additional ZF genes for QTL1 and QTL2, respectively (Table S6). All genes in these regions are potential candidates; however, further prioritization by gene ontology term analysis identified 17 and 16 genes for QTL1 and QTL2, respectively, with immunity-related annotations and these were highlighted in the list of candidate CyHV-3 survival genes.

Fig. 3

Co-linearity between carp and zebrafish genomes in the QTL regions. Carp genetic map position was plotted against ZF physical gene position (gene start, bp) for a LG30 (QTL1)—ZF chr. 2 region and b LG46 (QTL2)—ZF chr. 21 region. Note the possible inversion in a

Validating QTLs in multiple families

The four QTLs were identified in three families and not all QTLs had an effect in all families. This might be expected for a quantitative trait but, to test whether this was the case, HRM assays were developed for genotyping 1–3 SNPs per QTL. HRM markers were selected from GBS markers located at the peaks of the QTL1 and QTL2 regions (Fig. 1d, e) and from the coding regions of IL10a (QTL3) and IL10b (QTL4). Survivor and dead individuals from 11 additional families (Table S1) were genotyped with these markers. These BC1 families contained a mix of feral and food strain alleles and showed a range of resistance levels (Fig. 1b and Table S1).

In each family, main effects of the four QTLs, as well as pairwise QTL–QTL interactions, were tested and their significance levels were tabulated (Fig. 4). Significant main effects (χ2 likelihood ratio; P < 0.05) were found in about 21.5% of the family–QTL combinations. QTL1 had a significant main effect only in MF2, where it was identified, and suggestive effects in three additional families (P = 0.064, 0.054, and 0.059). QTL2 had a significant effect in 7 out of the 14 families tested, more than any other QTL. QTL3 was tested on a subset of individuals from MF1, where it was originally identified (Kongchum et al. 2011), and the HRM marker genotypes perfectly co-segregated with the genotypes of the markers used in that previous study. However, the effect found here for MF1 was only suggestive (P = 0.1), most likely due to the smaller sample size used here. Nevertheless, the QTL3 main effect was validated in two more families (P < 0.05). The main effect of QTL4 was suggestive in two families (P = 0.06 and 0.08).

Fig. 4

A matrix of QTL effects contributing to CyHV-3 survival in 14 families. Rows are for families (mapping families or BC1 families belonging to either Y or D backgrounds). First four columns are for main QTL effects, followed by columns for QTL–QTL interaction effects, and by a column for family % survival in a descending order. Numbers in cells are the significance level of the association test as an approximation for effect size. P-values above 0.1 are not shown. NP non-polymorphic families

In addition to main effects, 8.8% of the pairwise QTL–QTL interactions were significant. For some families, interactions were significant between QTLs for which the main effect of one or both was insignificant. QTL4, for which only suggestive main effects were found, was validated by its significant interactions with QTL1 and QTL3. QTL3 had more significant interactions than significant main effects. Interestingly, the paralogous QTLs interacted significantly in two families. Taken together, all QTLs were validated and therefore, CyHV-3 survival was found to have a multigenic basis, with main additive effects and epistatic interactions contributing to trait variation in different families.

Interactions within and between QTLs in multiple families

Interactions between alleles (aka dominance) and between loci (aka epistasis) are non-additive effects contributing to trait variation. Since the specific alleles and genotypes segregating for each marker and family were determined, our data allowed studying these effects. Four families, having three genotypes in QTL2 (1/1, 1/2, and 2/2), were analyzed for allelic interactions. Interestingly, although all families shared the same alleles, the allelic interactions were of different nature in each family. In family BC1-D2, the mean survival proportion of heterozygotes was similar to the mean of the two homozygotes, indicating a co-dominant or additive mode of inheritance (Fig. 5a). In families BC1-D1, BC1-D3, and MF3, a dominant mode of inheritance was found (Steel-Dwass all-pairs test, P < 0.05). However, allele 2 was dominant in families BC1-D3 and MF3 (Fig. 5b, c), whereas in family BC1-D1 allele 1 was (Fig. 5d). Furthermore, although homozygotes for allele 1 consistently had better survival, the mean survival of these homozygotes varied between 63 and 100% in different families, indicating that QTL2 effect also depended on the genetic background.
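The distinction between additive and dominant modes of inheritance drawn above can be made concrete with the classical degree-of-dominance ratio, which compares the heterozygote with the midpoint of the two homozygotes. This is a descriptive sketch only: the function name and survival proportions are hypothetical, and the study itself tested differences between genotypes with the Steel-Dwass all-pairs test.

```python
def degree_of_dominance(surv_11, surv_12, surv_22):
    """Degree of dominance d/a from genotype survival proportions.

    a = half the difference between the two homozygotes (1/1 minus 2/2)
    d = deviation of the heterozygote (1/2) from the homozygote midpoint
    d/a near 0 suggests additivity, near +1 dominance of allele 1,
    near -1 dominance of allele 2.
    """
    a = (surv_11 - surv_22) / 2.0
    if a == 0:
        raise ValueError("homozygotes do not differ; d/a is undefined")
    d = surv_12 - (surv_11 + surv_22) / 2.0
    return d / a

# Hypothetical survival proportions for genotypes 1/1, 1/2 and 2/2.
print(round(degree_of_dominance(0.8, 0.5, 0.2), 2))  # heterozygote at midpoint
print(round(degree_of_dominance(0.8, 0.8, 0.2), 2))  # heterozygote like 1/1
print(round(degree_of_dominance(0.8, 0.2, 0.2), 2))  # heterozygote like 2/2
```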

Fig. 5

Examples of interactions between alleles (dominance) and between QTLs (epistasis) affecting CyHV-3 survival. In all panels, proportion of survival is plotted against specific QTL genotypes of a certain family. Family names are above each plot. a–d Different types of interactions were found between the same QTL2 alleles in different families. P-values are given for the QTL effect test (χ2 likelihood ratio) and different lowercase letters above represent significant differences between genotypes (Steel-Dwass all-pairs test, P < 0.05). Three different modes of inheritance are shown: a co-dominant or additive (no allelic interactions), b, c dominance of allele 2, and d dominance of allele 1. e–g Examples of QTL–QTL interactions. Genotypes of one QTL are on the X axis, whereas genotypes of the other QTL are presented by different dashed and full lines. P-values are given for the interaction effect test (Full factorial GLM, Logit function). Three different types of QTL–QTL interactions are shown: e no interaction, f a synergistic interaction where the genotype at one QTL enhances the survival effect of another QTL, and g a nullifying interaction where both QTLs have an effect but in opposite directions, thereby canceling out each other’s main effects

Epistatic interactions between QTLs had noteworthy effects on survival (Fig. 4) and explained some of the differences found between families. For instance, in MF2, QTL1 and QTL2 did not interact and therefore, their effects were consistent, as reflected by the two nearly parallel lines in Fig. 5e. However, in MF3, differences between genotypes at QTL3 were found only for 1/1 homozygotes at QTL2 (Fig. 5f). This significant interaction (P = 0.015) further complicated what previously seemed like a recessive state of the 1/1 homozygotes at QTL2 in MF3 (Fig. 5c). In fact, the recessive state at QTL2 is pronounced only when the genotype at QTL3 is also 1/1. When the genotype at QTL3 is 1/2 (heterozygous), the QTL2 genotypes are co-dominant (Fig. 5f). QTL–QTL interactions could also explain why QTLs had main effects in some families but not in others. In family BC1-D1, interactions between QTL3 and QTL1 averaged out their main effects (Fig. 5g). Since genotypes G1 and G3 for QTL3 had opposite effects depending on the genotype at QTL1, on average, no effect was evident when QTL3 was considered independently (Fig. 4). However, much like the main effects of three QTLs, each of the interactions within or between QTLs was significant in no more than two out of 14 families.

Prevalence and source of survival alleles

All families used in this study had a general BC1 genetic architecture. Thus, all cultured strain alleles together are expected to be at a proportion of 0.5 in the F1 parents and 0.75 in the BC1 families. Since cultured strains were generally susceptible to the virus while the feral strain was rather resistant (Tadmor-Levi et al. 2017), feral alleles were expected to improve survival. To address this expectation, the frequency of different alleles was tested at different levels.
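The expected proportions quoted above follow from the standard backcross expectation, in which each backcross to the recurrent (here, cultured) parent halves the remaining share of donor (feral) alleles. A minimal sketch, with a hypothetical function name, assuming no selection and unlinked loci on average:

```python
def recurrent_parent_fraction(n_backcrosses):
    """Expected genome-wide proportion of recurrent-parent (cultured) alleles.

    n_backcrosses = 0 corresponds to the F1 and 1 to the BC1 generation.
    """
    return 1.0 - 0.5 ** (n_backcrosses + 1)

print(recurrent_parent_fraction(0))  # F1 parents
print(recurrent_parent_fraction(1))  # BC1 families
```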

First, allele frequency was calculated for the parental backgrounds (cultured or F1), excluding Koi that was represented by only one parent. For each of QTL1, 3, and 4, cultured parents were almost exclusively homozygotes for one allele per QTL, whereas the F1 parents as a whole were more polymorphic, with seven, four, and three alleles, respectively (Fig. 6a). Notably, the frequency of the commonly found cultured strain allele was approximately 0.5 in the F1 background, indicating that most likely the source of the other rarer alleles was the feral background. This suggests that not only were the cultured strains divergent from the feral strain, but they were also less polymorphic. For QTL2, the two parental backgrounds had similar allelic distributions (Fig. 6a).

Fig. 6

Prevalence and source of CyHV-3 survival QTL alleles. All plots depict allele frequency distribution (Y axis) by different fish groups (X axis). Different QTL alleles are marked by different colors (legend on top for each QTL). Significant differences (χ2 likelihood ratio, P < 0.05) between groups are marked by asterisks and suggestive differences (χ2 likelihood ratio, 0.1 ≥ P ≥ 0.05) are marked by plus signs. a Distribution of alleles by genetic background (cultured or F1) in the parents of all families together. b Distribution of alleles by survival phenotype (survivor/dead) in progeny of all families together. c Example distributions of alleles by survival phenotype in specific families for which a significant QTL effect was found. For each QTL, individuals are grouped by family and survival phenotype. Note that for common alleles (QTL2), the difference between genetic backgrounds was insignificant (in a) whereas differences between survivor and dead were significant over all progeny together (in b) and in half of the families (examples in c). For rare alleles (QTLs 1, 3, and 4), differences between genetic backgrounds were significant (in a) whereas differences between survivor and dead were insignificant over all progeny (in b)

Second, for each QTL, allele frequencies were compared between survivor and dead fish of all BC1 families together, a situation that resembles the population level (Fig. 6b). Only for QTL2, for which common alleles were found, the difference between survivor and dead was significant (χ2, P < 0.0001). Allele 1 of QTL2 that was more frequent in survivors was shared by both backgrounds. Since the expected proportion of the common allele from the cultured backgrounds in BC1 families is 0.75, the frequency of rare alleles found in QTLs other than QTL2, was similar between survivor and dead fish. Thus, assuming that the 17 parents approximate the study population, feral alleles that were also potentially beneficial for survival were rare in three out of four QTLs.

Consequently, identification of which alleles were beneficial for survival was done within families for which QTL effects were significant (examples are shown in Fig. 6c). The main effect of QTL4 was insignificant in all families and therefore, families with suggestive effects were analyzed. In these families, allele 3 of QTL4 was shared by both F1 and cultured backgrounds and was more frequent in dead fish. For QTL1, two different rare alleles, 7 and 2, were more frequent in survivors of MF2 and BC1-D7, respectively. Interestingly, for the same QTL, a different rare allele (allele 6) was more frequent in dead fish. Since all these rare alleles at QTL1 were only present in the F1 parent, these were likely feral alleles with opposite effects. For QTL3, one common and one rare allele were more frequent in survivors. In family BC1-D4, two common alleles segregated; allele 1 (the cultured allele) was more frequent in survivors while allele 2 (probably a feral allele) was more frequent in dead fish. Conversely, in family BC1-D2, the same cultured allele 1 was more frequent in dead fish while the rare allele 3 was more frequent in survivors. QTL2 was the exception, with the same allele (allele 1) significantly more frequent in survivors of seven families. Taken together, both rare and common alleles contributed to survival, sometimes in opposite directions, and some of these alleles also originated from the cultured background.

Effects of survival QTLs across families

Since the QTLs were identified and validated within full-sib families, the predictive value of these QTLs at the population level was estimated by combining individuals across all families together. By analyzing the effect of alleles, only QTL2 had a significant effect across all families (Fig. 6b). Consistently, by analyzing the effect of QTL2 genotypes, after correcting for % survival of families, the proportion of survivors differed significantly (Logit link function test, P < 0.0001) also among the genotypes (Fig. 7a).

Fig. 7

Effects of CyHV-3 survival alleles across families. a The proportion of survival phenotypes (survivor/dead) over progeny of all families together by genotype in QTL2. Distributions were significantly different from each other (Logit link function test, P < 0.0001). b Means and standard error bars of family % survival, by the frequency of allele 1 of QTL2. No significant difference (t-test, P = 0.292) in mean family % survival was found between groups of families with different frequencies of allele 1 of QTL2

Furthermore, since the trait for which QTLs were identified was survival of individuals (survivor/dead) within families, while breeding for resistance seeks to improve survival across families (population % survival), the effect of the QTLs on cumulative survival across families was estimated. To do so, the expected allele frequency for each QTL in each family was deduced from the parental genotypes, and families with similar QTL allele frequencies were grouped to calculate their mean % survival. Then, mean % survival was compared between family groups with different frequencies of QTL alleles. No significant differences (t-test, P > 0.05) in mean % survival were found for any of the QTL alleles. In this test, even the survival allele of QTL2 had no significant effect across all families (Fig. 7b), or even across only the subset of families in which the QTL was originally found to be significant. Therefore, we could not find an indication that QTLs or alleles, which explained some of the variation in survival of individuals within families, had an effect on variation in cumulative % survival across families.
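The grouping procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code of this study: the family names, parental genotypes, and % survival values are hypothetical, the function names are ours, and the Welch (unequal variance) form of the t statistic is our choice because the variant of the t-test is not specified in the text.

```python
from collections import defaultdict
from itertools import product
from math import sqrt
from statistics import mean, variance


def expected_allele_freq(sire, dam, allele):
    """Expected frequency of `allele` in a full-sib family, from parental genotypes."""
    offspring = list(product(sire, dam))  # four equally likely allele combinations
    return sum(g.count(allele) for g in offspring) / (2 * len(offspring))


def welch_t(x, y):
    """Welch t statistic and its approximate degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical families: parental genotypes at one QTL and family % survival.
families = [
    {"name": "fam_A", "sire": (1, 2), "dam": (1, 1), "survival": 60.0},
    {"name": "fam_B", "sire": (1, 3), "dam": (1, 1), "survival": 40.0},
    {"name": "fam_C", "sire": (1, 2), "dam": (1, 1), "survival": 50.0},
    {"name": "fam_D", "sire": (1, 2), "dam": (1, 2), "survival": 45.0},
    {"name": "fam_E", "sire": (2, 3), "dam": (1, 1), "survival": 55.0},
    {"name": "fam_F", "sire": (1, 2), "dam": (1, 3), "survival": 35.0},
]

# Group families by the expected frequency of allele 1.
groups = defaultdict(list)
for fam in families:
    freq = expected_allele_freq(fam["sire"], fam["dam"], 1)
    groups[freq].append(fam["survival"])

print(sorted(groups))  # [0.5, 0.75]

# Compare mean family % survival between the two frequency groups.
t, df = welch_t(groups[0.75], groups[0.5])
print(round(t, 3), round(df, 1))  # 0.612 4.0
# A P value would then be read from the t distribution with `df` degrees of
# freedom (for example via scipy.stats.ttest_ind with equal_var=False).
```

Note that each family contributes a single data point here, so the number of observations equals the number of families rather than the number of fish. This is one reason such a comparison has less power than a test on pooled individuals.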


The utility of a wild-type strain for improving disease resistance

Variation in disease resistance is necessary for selective breeding of resistant strains. Disease outbreaks are most troubling in the production of farmed stocks, indicating that farmed stocks are generally susceptible and thus have limited genetic and phenotypic variation for resistance. Natural populations often carry more variation than farmed stocks, including but not limited to disease resistance alleles. In our studies, we used a feral strain (S) as a source of resistance. Our mapping panels were three-way crosses with over 70% of the polymorphic markers heterozygous in the F1 parents, indicating that the feral strain introduced many polymorphisms relative to the cultured strains. Consequently, a high-density linkage map of great utility to this and further studies was generated, with features such as marker number, LG number, length, and coverage comparable to those of recent studies focusing on other traits (Feng et al. 2018; Palaiokostas et al. 2018; Peng et al. 2016). Particularly for resistance, we reported earlier a larger phenotypic variation in family % survival after CyHV-3 challenges for F1 and BC1 families compared with the three parental strains (Tadmor-Levi et al. 2017). Similarly, in terms of genetic variation, for markers in three out of four QTLs, F1 parents had considerably more alleles than the cultured strain parents did, and in particular, alleles that increased survival were more likely to originate from the feral strain. Given the above, there are clear advantages in utilizing a feral strain since it introduces variation, including for resistance, into the study framework.

In practice, in commercial animal stocks with advanced breeding programs, crosses with wild-type stocks for introgression of desirable traits are considered rarely and with great caution because of the potential tradeoffs in other selected traits. However, in aquaculture, long-standing and elaborate breeding programs are not practiced for most species and stocks, so the tradeoffs of introducing wild-type variation might not be as costly as one might fear, whereas the benefits are clear, as described above. Furthermore, in those species that do have advanced breeding schemes and in which the tradeoffs might be too costly, experimental crosses introducing disease resistance from wild stocks into the susceptible cultured strains hold great potential. Such experimental crosses, as demonstrated here, can lead to identification of beneficial alleles that can be targets for improving resistance in susceptible stocks by marker-assisted selection or genome editing and, in addition, are a prime route to enhancing our understanding of disease resistance mechanisms in animals.

The genetic architecture of CyHV-3 survival

In this study, SNP markers were used to identify QTLs associated with CyHV-3 survival. By comparing genotypes of dead fish vs. survivors and by analyzing days to mortality of individuals, we identified similar survival QTLs, suggesting that both measures share a similar genetic basis. The associated markers clustered into two distinct genetic regions, neither of which was on the same LG as IL10a (QTL3), a candidate gene previously identified in our collaborative study (Kongchum et al. 2011), or its paralog IL10b (QTL4). Another study suggested that polymorphisms in Cyca-DAB1-like, one of the major histocompatibility class II B genes of common carp, were associated with differences in CyHV-3 survival (Rakus et al. 2009). Notably, our list of genes in the QTL regions does not include any major histocompatibility class II B genes (Table S6), suggesting that these are independent QTLs. A quantitative mode of inheritance has been argued before for CyHV-3 resistance, based mainly on phenotypic variation analysis in a set of various strains, most of European origin (Ødegård et al. 2010; Piačková et al. 2013). In our previous study, we reported large variability in resistance among families of the F1 generation and demonstrated for the first time that this variability was heritable to the BC1 generation (Tadmor-Levi et al. 2017). Here, unbiased identification and validation of multiple QTLs provided empirical evidence that CyHV-3 survival is controlled by multiple loci. Although several immune system genes were found in these QTLs, other genes might also be candidate survival genes and it is too early to suggest causative variation.

In each family, different main effects significantly contributed to survival. QTL2 had a significant effect in half of the families tested, and in each of those, the same allele was consistently more frequent in survivors. Although its effect was not significant in the other half of the families, the effect of this QTL was significant at the population level, that is, in progeny from all families together. Interestingly, based on the genotypes of the parents, it appears that the allele that increases survival can be found in both the feral and cultured backgrounds, although the cultured strains were generally susceptible. Unlike the common survival allele at QTL2, a few rare alleles of likely feral origin at the other three QTLs increased survival in certain families.

Besides the main effects of QTL alleles, the results showed how dominance and epistatic interactions within and between these QTLs further contributed to trait variation. Dominance effects are better studied when all three genotypes of a locus segregate in the population, and in BC1 crosses this is often not the case. Therefore, the contribution of such dominance effects to variation in survival might have been underestimated in this study design. Nevertheless, QTL2 did show three segregating genotypes in four families, revealing that the relationship (dominance or additivity) between the same alleles can vary among families. This result indicates epistatic relationships between QTL2 and other loci in the genome, as directly measured by the QTL–QTL interaction analysis. Besides QTL2, further QTLs interacted with each other, even in families where the main effects of one or both QTLs were not significant. One example of how this can happen was demonstrated for QTLs 1 and 3 in BC1-D1, where additive effects of the same alleles, but in opposite directions, canceled out the main effects of each QTL alone.
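To make the cancellation argument concrete, the following toy calculation in Python shows how two loci can interact strongly while neither shows a main effect. The counts are hypothetical and chosen for clarity; they are not the BC1-D1 data, and the genotype labels are placeholders of our own.

```python
# Hypothetical (survivors, total) counts per two-locus genotype class
# in a single backcross family; each locus segregates two genotype classes.
counts = {
    ("q1_A", "q3_A"): (40, 50),
    ("q1_A", "q3_B"): (10, 50),
    ("q1_B", "q3_A"): (10, 50),
    ("q1_B", "q3_B"): (40, 50),
}


def survival(classes):
    """Proportion of survivors pooled over the given genotype classes."""
    survivors = sum(counts[c][0] for c in classes)
    total = sum(counts[c][1] for c in classes)
    return survivors / total


def marginal(locus_index, genotype):
    """Survival for one genotype at one locus, ignoring the other locus."""
    return survival([c for c in counts if c[locus_index] == genotype])


# Main effects: difference in survival between the two genotypes of each locus.
main_q1 = marginal(0, "q1_B") - marginal(0, "q1_A")
main_q3 = marginal(1, "q3_B") - marginal(1, "q3_A")

# Effect of the first locus within each genotype background of the second locus.
effect_q1_in_q3A = survival([("q1_B", "q3_A")]) - survival([("q1_A", "q3_A")])
effect_q1_in_q3B = survival([("q1_B", "q3_B")]) - survival([("q1_A", "q3_B")])
interaction = effect_q1_in_q3B - effect_q1_in_q3A

print(round(main_q1, 2), round(main_q3, 2))                    # 0.0 0.0
print(round(effect_q1_in_q3A, 2), round(effect_q1_in_q3B, 2))  # -0.6 0.6
print(round(interaction, 2))                                   # 1.2
```

In this toy case, the same genotype at the first locus lowers survival in one background of the second locus and raises it by the same amount in the other, so a single-locus test sees no difference while a two-locus test does.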

Considering the above, from the detailed analysis of four QTLs in 14 families, our understanding of the genetic architecture of CyHV-3 survival goes beyond just demonstrating a complex basis to portraying some of the intricate effects found within and between QTLs. Clearly, this study also implies that more QTLs contributing to the trait are present and can be identified by a similar approach.

Alleles affecting survival within families but not across families

The significance of having disease-resistant strains available for aquaculture cannot be overstated. Disease resistance is commonly defined as the capability of the host to control the burden of pathogen infection (Bishop and Woolliams 2010; Doeschl-Wilson et al. 2012). However, measuring pathogen load in the body of the host is rarely used to estimate resistance and instead, it is common to measure survival from a pathogen challenge. Here, as in many other studies, we analyzed survivors vs. dead fish within families, but the goal of breeding is to improve % survival across families at the population level. Family selection (i.e., individuals are selected from better families) has proven to be more effective than mass selection (i.e., individuals are selected based on their own performance) in selective breeding of multigenic traits in fish (Gjedrem and Rye 2018; Sonesson et al. 2010; Stear et al. 2012). A family-based value represents the estimated breeding value of an individual more accurately than the individual's own performance value does. This advantage, which is common knowledge for other quantitative traits, probably also applies to family-based estimates of disease resistance (i.e., family cumulative survival). Our experience confirmed the superiority of family-based selection for improving CyHV-3 resistance (Tadmor-Levi et al. 2017), and for other diseases, this advantage has been demonstrated before (Leeds et al. 2010; Stear et al. 2012). Nevertheless, we suggest that for disease resistance, assigning a family-based value to individuals might bring an additional advantage beyond its better accuracy. This additional advantage is reflected by the intuitive notion that there should be some genetic variants in a family with 90% survival that increase the survival chances of its individuals beyond the survival chances of individuals from a 10% survival family, despite the fact that survivors from both families are phenotypically identical.

Since breeding seeks to improve the overall survival in the population, we combined the within-family survival phenotype of individuals across families to test the potential of these QTLs to improve cumulative survival across families. Although allele 1 of QTL2 had a significant effect on survival of individuals, both in specific families and across all families together, a higher frequency of this common survival allele did not significantly explain higher family % survival. This difference in effect between combining individual phenotypes (dead/survivor) across families and comparing mean family % survival as a function of allele frequency could be because our study design has less power to detect across-family effects of mild QTLs of a multigenic trait. However, it could also be because families with higher % survival might contain QTLs other than those we tested. If the latter explanation holds, it suggests that at least in some cases the genes and markers identified by contrasting dead fish vs. survivors within families, or even across families, might not be so useful in assisting selection for improved cumulative survival. Instead, in cases of multigenic resistance, QTLs and markers identified by contrasting families with high and low cumulative survival might be more useful in assisting selection for higher cumulative survival. We acknowledge that our analysis is limited to four QTLs when the trait is probably conditioned by more QTLs and that analyzing more families would have improved the power of our analyses. Hence, our results here are suggestive rather than conclusive, yet if this indication is true, it has implications for breeding and therefore should be further explored.

In conclusion, this study provided empirical evidence that CyHV-3 survival is a multigenic trait and demonstrated some of the complexities in its genetic basis. By further analyzing the phenotypic and genotypic data, this study also addressed two key points of general interest. First, utilizing crosses involving wild-type strains offers considerable advantages in multiple aspects and thus should be considered in other species too. Second, the initial indication that uncovering genetic variation conditioning survival of individuals within families may not explain the variation conditioning cumulative survival across families should be further tested. Finally, this study paves the way for future studies on the genetic basis of CyHV-3 resistance, studies that will assist breeding and enhance our understanding of immunogenetics and disease resistance mechanisms in fish.

Data archiving

All mapped markers with their map position and physical scaffold position, as well as SNP genotypes from the GBS analysis on MF2 and MF3, are detailed in Table S2. Phenotypes and genotypes at all four QTLs for all fish from all families used in this study are detailed in Table S5. Raw sequence reads from the GBS analysis have been deposited in the NCBI Sequence Read Archive under BioProject accession number PRJNA494509. Supplementary information is available at Heredity’s website.