Introduction

The Iberian pig is an autochthonous fatty breed characterized by a high adipogenic potential, high appetite and outstanding meat quality. These features are a consequence of its genetics and of the traditional extensive production system, both of which contribute to the deposition of subcutaneous and intramuscular fat with a high content of oleic acid and antioxidants1,2. Iberian × Duroc crossbred pigs are the main genotype employed to produce Iberian meat products in intensive production systems, although their sensory meat quality is considered lower than that of pure Iberian pigs3,4,5. Furthermore, crossbred Iberian pigs are characterised by very heterogeneous developmental, productive and quality traits6, which offers great potential to identify the underlying genes. Knowledge of the molecular genetic basis of relevant traits would help in the design of marker-assisted selection programs.

Several candidate gene studies have been performed in Iberian pig populations, mostly in purebred animals or experimental crosses and with low-scale structural approaches, in order to gain insight into the genetic architecture of productive traits. Relevant genes and mutations have been identified with this approach, mainly those detected in the IBMAP experimental populations, such as LEPR (refs. 7,8), ELOVL6 (ref. 9) and ACSL4 (refs. 10,11). Recently, Fernandez-Barroso et al.12 found significant associations of mutations in PRKAG3, CAPN1 and several other genes with meat quality traits in pure Iberian animals. Functional genetic studies have also been performed in Iberian pigs, providing information on functional candidate genes and pathways potentially involved in phenotypic variation5,13,14,15. On the other hand, the use of commercially available high-density SNP arrays for genome-wide association studies (GWAS) provides a systematic and powerful approach to dissect the genetic basis of complex traits. This approach has led to the identification of many interesting genomic regions and candidate genes in different populations and breeds, including the Iberian16,17,18,19,20,21,22. Nevertheless, the applicability of commercial SNP arrays is limited because they capture only a fraction of the genetic variation. This is because they were designed from SNPs detected in a few cosmopolitan breeds, which results in limited informativity and power, especially for the analysis of local breeds22,23. In contrast, whole-genome sequencing (WGS) can potentially detect all genetic variants, including causal ones24, and may allow the design of customized, highly informative SNP panels that are useful to uncover the genetic basis of relevant traits in local breeds. Such advances make genetic improvement achievable for relevant productive and quality traits in non-cosmopolitan pig breeds.

The objective of this work was to discover and evaluate a wide panel of DNA markers for association with traits related to muscle growth, fat deposition and composition, metabolism and meat quality in Iberian crossbred pigs. For this purpose, a medium density custom genotyping protocol was used, designed with SNPs obtained and selected from a combination of whole genome sequencing data, transcriptome sequencing data and a candidate gene approach.

Material and methods

Ethics statement

The study was performed according to the Spanish Policy for Animal Protection RD53/2013, which meets the European Union Directive 2010/63/EU on the protection of animals used in research. The experiment was specifically assessed and approved (report CEEA 2012/036) by the INIA Committee of Ethics in Animal Research, which is the named Institutional Animal Care and Use Committee (IACUC) for the INIA. The study was carried out in compliance with the ARRIVE guidelines (https://arriveguidelines.org/arrive-guidelines).

Animals and phenotypes

We used an Iberian × Duroc crossbred population. The animals were housed at a commercial farm, Ibéricos de Arauzo 2004 S.L. (Zorita de la Frontera, Salamanca, Spain). A total of 477 crossbred piglets (50% females and 50% males) born from 47 third- and fourth-parity Iberian sows were involved in this study. The Iberian sows were inseminated with cooled semen from Duroc PIC boars (Genus plc, UK). The sows were individually identified with an electronic ear tag and housed in groups until day 101 of pregnancy, when they were moved to individual pens until the end of the suckling phase. Sows and offspring were fed standard diets for the Iberian breed, adjusted to fulfil individual daily requirements, based on recommendations from De Blas et al.25. The piglets were individually tagged at birth and remained with the sows until weaning. Within 2 days of birth, litters were equalized by cross-fostering and male piglets were surgically castrated. Surgical orchidectomy was performed after sedation with azaperone (1 mg/kg LW; Stresnil®, Ecuphar Veterinaria, S.L.U, Barcelona, Spain) and local anesthesia (lidocaine hydrochloride 2% solution; Lidocaina Normon®, 20 mg/mL, Laboratorios Normon, Madrid, Spain) applied with a 25G needle inserted directly into the spermatic cord through the skin. Piglets were weaned at an average age of 24 days and housed in groups of 12 piglets/pen, distributed by sex and body weight (BW), during the transition phase. In the growing-fattening phase (from 70 days-old until slaughter), animals were housed by sex and BW in groups of 40 pigs/pen. Females were immunocastrated using VACSINCEL (Zoetis Inc., New Jersey, USA) with two vaccinations, at 120 and 148 days-old. From day 240 until day 340 of age, pigs were marketed whenever they reached the minimum market weight, which for Iberian crossbred pigs is set at 115 kg of carcass weight. At day 340 of age, all remaining pigs were sent to market, independently of their weight.
The phenotype evaluation followed the methods already described in a previous paper using part of the animals involved in this study6.

Evaluation of growth pattern and fatness

Measurements were taken at the following time points: birth, weaning (24 days-old), days 70, 110, 150, 180, 215 and 240 of average age, and at slaughter. Body weight was determined individually at all these time points. Morphological measures (occipito-nasal length, biparietal diameter, trunk length, abdominal and thoracic perimeter and maximum thoracic diameter) were recorded at birth and at weaning for all piglets with a measuring tape. Backfat thickness and loin diameter were determined at weaning and at 110 and 215 days-old. Weight and length of carcasses and backfat thickness were recorded at the slaughterhouse.

The average daily weight gain (ADWG) was calculated during the suckling phase (1–24), the transition phase (25–70), and during the five selected time periods of the growing-fattening phase until the day 240 of age (70–110, 111–150, 151–180, 181–215 and 216–240 days of age; each period is named by its last day). The feed conversion ratio (FCR) was calculated by pen during the five selected periods of the growing-fattening phase. Average daily weight gain and FCR were also calculated for the whole periods from birth to slaughter and from weaning to slaughter, respectively.
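
The daily-gain calculation described above can be sketched as follows. This is only a minimal illustration, assuming that the gain in each period is the weight difference between two consecutive weighings divided by the days elapsed and that birth is taken as day 0; the function name `average_daily_gain` and the example weights are hypothetical and are not taken from the study data.

```python
def average_daily_gain(ages_days, weights_kg):
    """Return {end_age: ADWG in kg/day} for consecutive weighings.

    Each period is named by its last day, as in the text.
    Assumption: gain = weight difference / days elapsed.
    """
    if len(ages_days) != len(weights_kg):
        raise ValueError("ages and weights must have the same length")
    gains = {}
    for i in range(1, len(ages_days)):
        days = ages_days[i] - ages_days[i - 1]
        if days <= 0:
            raise ValueError("ages must be strictly increasing")
        gains[ages_days[i]] = (weights_kg[i] - weights_kg[i - 1]) / days
    return gains


# Hypothetical weights (kg) for one pig at the recording ages (birth = day 0).
ages = [0, 24, 70, 110, 150, 180, 215, 240]
weights = [1.4, 6.2, 22.0, 42.0, 68.0, 90.0, 118.0, 138.0]

adwg = average_daily_gain(ages, weights)
# Gain over the whole recorded period, computed the same way.
overall = (weights[-1] - weights[0]) / (ages[-1] - ages[0])

for end_age, gain in adwg.items():
    print(f"period ending at day {end_age}: {gain:.3f} kg/day")
print(f"birth to day {ages[-1]}: {overall:.3f} kg/day")
```

The feed conversion ratio is not included because it was recorded by pen and requires feed intake data that are not described in this section.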

Tissue sample collection and drip-loss analysis

Blood samples were obtained in vivo (110 and 215 days-old) and at slaughter, and were used for DNA extraction and metabolic status evaluation. Samples of longissimus dorsi (LD) muscle, right liver lobe and subcutaneous fat were taken at slaughter and biobanked at −20 °C until fatty acid (FA) composition was analysed. Muscle drip-loss analysis was also carried out26.

Evaluation of metabolic status

Metabolic status was assessed at the time points of 110 and 215 days-old and at slaughter. Blood samples were assayed for determination of parameters related to metabolism of glucose (glucose and fructosamine) and lipid profiles [total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c) and triglycerides], by means of a clinical chemistry analyzer (Saturno 300 plus, Crony Instruments s.r.l., Rome, Italy).

Evaluation of fat content and FA composition of tissue samples

Liver fat (LF) and intramuscular fat (IMF) were extracted as described by Segura and Lopez-Bote27 and expressed as a percentage of dry matter. These lipids were fractionated into the two main fractions of the fat tissue: neutral lipids (NL) and polar lipids (PL)1,27. Subcutaneous fat was extracted and separated into outer and inner layers, which were analysed individually. From individual FA values, the proportions of saturated, monounsaturated and polyunsaturated FA (SFA, MUFA and PUFA), the unsaturation index (UI), the sums of total n-3 FA (n-3) and n-6 FA (n-6) and their ratio (n-6/n-3) were calculated28. Moreover, the activity of stearoyl-CoA desaturase enzyme 1 (SCD1) was estimated using two desaturation indexes, the ratios C18:1/C18:0 and MUFA/SFA29.
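
The FA summary measures listed above can be illustrated with a short sketch. It assumes that each FA is given as a percentage of total FA under a name such as "C18:1n-9", that the n-3 and n-6 sums include only polyunsaturated FA, and that the C18:1/C18:0 index sums all C18:1 isomers; these choices, the function name `summarize_fatty_acids` and the example profile are hypothetical. The unsaturation index is left out because its formula is not stated in the text.

```python
import re

# Assumption: names follow the pattern used in the text, e.g. "C18:1n-9".
FA_NAME = re.compile(r"^C(\d+):(\d+)(?:n-(\d+))?$")


def summarize_fatty_acids(profile):
    """profile maps FA names to percentages of total FA."""
    totals = {"SFA": 0.0, "MUFA": 0.0, "PUFA": 0.0, "n-3": 0.0, "n-6": 0.0}
    c18_1 = 0.0
    for name, pct in profile.items():
        match = FA_NAME.match(name)
        if match is None:
            raise ValueError(f"unrecognised fatty acid name: {name}")
        carbons, double_bonds, family = match.groups()
        double_bonds = int(double_bonds)
        if double_bonds == 0:
            totals["SFA"] += pct
        elif double_bonds == 1:
            totals["MUFA"] += pct
            if carbons == "18":
                c18_1 += pct  # assumption: all C18:1 isomers are summed
        else:
            totals["PUFA"] += pct
            if family in ("3", "6"):
                totals["n-" + family] += pct
    # The ratios below assume non-zero denominators in the profile.
    totals["n-6/n-3"] = totals["n-6"] / totals["n-3"]
    totals["C18:1/C18:0"] = c18_1 / profile["C18:0"]
    totals["MUFA/SFA"] = totals["MUFA"] / totals["SFA"]
    return totals


# Hypothetical backfat profile (percentages summing to 100).
example = {
    "C14:0": 1.0, "C16:0": 24.0, "C16:1n-7": 3.0, "C18:0": 12.0,
    "C18:1n-9": 45.0, "C18:1n-7": 3.0, "C18:2n-6": 10.0,
    "C18:3n-3": 1.0, "C20:4n-6": 1.0,
}
summary = summarize_fatty_acids(example)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```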

SNP discovery

Out of the crossbred Iberian pig population, 18 male pigs with divergent postnatal growth patterns were selected for whole genome sequencing. Birth weight and BW at 240d were considered in order to select animals corresponding to three different growth categories: low birth weight and low final weight (LL; 6 animals), low birth weight but normal final weight (LN; 6 animals) and normal values for both birth and final weights (NN; 6 animals) (Fig. 1). Selected animals came from different litters.

Figure 1

Growth patterns of the 18 animals selected for WGS. Six animals are included in each growth group: low birth weight and low end weight (LL; pink); low birth weight and normal end weight (LN; purple) and normal values for both birth and end weights (NN; green). Population mean is shown in black. BW body weight.

Genomic DNA was obtained from the blood samples of all animals using the NucleoSpin® Blood Kit (Macherey-Nagel, Düren, Germany). A NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA) was used to measure the concentration and quality of the DNA. Whole-genome resequencing of the 18 selected animals was performed on an Illumina HiSeq 2000 platform by CNAG-CRG (Barcelona, Spain). About 1 μg of genomic DNA was randomly fragmented. The genomic library was prepared according to the manufacturer’s protocol (Illumina, TruSeq DNA preparation guide) using the TruSeq DNA sample preparation kit. The paired-end library was sequenced on an Illumina HiSeq 2000 using the v4 chemistry and 2 × 125 bp reads.

FastQC software was employed for quality control of the raw data (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The Trimmomatic package30 (version 0.38) was used for paired-end read trimming. Reads were trimmed to remove adapters, eliminate low quality or N-bases at the start or end of the read and scan the reads from the 3′-end with a 4-base sliding window, cutting when the average Phred quality dropped below 20. After trimming, reads shorter than 40 bp were removed. Filtered paired reads were aligned to the porcine reference genome31 (Sscrofa11.1) with the Burrows-Wheeler Aligner (BWA version 0.7.10)32, employing the “bwa-mem” algorithm. SAMtools33 (version 1.6) was used to filter out unmapped or improperly mapped reads and for sorting, indexing and adding read groups. Picard tools version 2.18.9 (http://broadinstitute.github.io/picard/) was employed to mark duplicates. From this step until the generation of the Variant Call Format (VCF) file, the reads were processed using the Genome Analysis Toolkit (GATK version 4.0.6.0)34,35. Base quality was recalibrated using BaseRecalibrator and variant calling was carried out for each sample using HaplotypeCaller, resulting in Genomic Variant Call Format files (GVCFs), which were combined into a single file using CombineGVCFs with default parameter values. Finally, variants were called using GenotypeGVCFs with default parameters. A total of 14,871,405 raw SNPs and 3,613,138 raw INDELs were detected. The SNPs were quality filtered using the GATK VariantFiltration tool, excluding those matching the expression “QUAL < 40 || QD < 5.0 || FS > 60.0 || MQ < 40.0 || DP < 30”, which resulted in 13,324,328 filtered SNPs. VCFtools36 was employed to remove SNPs with a minor allele frequency (MAF) lower than 0.05, leading to a final set of 11,756,179 filtered SNPs.
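
To make the logic of the hard-filter expression explicit, the exclusion rule quoted above can be written as a plain Python predicate. This is only an illustrative sketch and not the GATK implementation: it assumes that every annotation is present for every site (GATK handles missing annotations in its own way), and the site records and the function name `fails_hard_filter` are hypothetical.

```python
def fails_hard_filter(site):
    """True if the SNP matches the exclusion expression quoted in the text:
    QUAL < 40 || QD < 5.0 || FS > 60.0 || MQ < 40.0 || DP < 30

    Assumption: all five annotations are present in `site`.
    """
    return (
        site["QUAL"] < 40
        or site["QD"] < 5.0
        or site["FS"] > 60.0
        or site["MQ"] < 40.0
        or site["DP"] < 30
    )


# Hypothetical sites, each failing (or passing) for a different reason.
sites = [
    {"id": "snp1", "QUAL": 812.4, "QD": 18.2, "FS": 1.3, "MQ": 60.0, "DP": 151},
    {"id": "snp2", "QUAL": 35.0, "QD": 12.0, "FS": 0.0, "MQ": 60.0, "DP": 140},  # low QUAL
    {"id": "snp3", "QUAL": 500.0, "QD": 15.1, "FS": 75.2, "MQ": 59.0, "DP": 160},  # high FS
    {"id": "snp4", "QUAL": 90.0, "QD": 9.0, "FS": 2.1, "MQ": 58.0, "DP": 22},  # low DP
]

kept = [site["id"] for site in sites if not fails_hard_filter(site)]
print(kept)
```

A site is excluded as soon as any single condition holds, so the thresholds act as independent lower or upper bounds rather than as a combined score.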

Ensembl Variant Effect Predictor (VEP ver. 99)37 was used to annotate the variants from our WGS dataset with information from the VEP database (the merged cache file for Sscrofa11.1, ver. 99). The VEP default of --distance 5000 (5 kb) was applied to define upstream and downstream variants. We also applied the --sift b option to obtain the SIFT prediction and score38 for missense variants.

Selection of SNPs for genotyping

We focused on missense SNPs located in annotated genes and showing differences in allele frequencies between groups differing in growth patterns. A total of 32,688 markers were classified as missense out of the 11.7 million, according to VEP. LD-pruning was implemented with PLINK39 version 1.9, removing a total of 20,049 SNPs with r² > 0.6 in 1000 kb windows. Finally, a total of 8419 SNPs mapped on autosomes were used for downstream analyses. Allele frequencies within growth groups were calculated with PLINK 1.9. We kept those SNPs with an allele frequency difference between groups higher than 0.25 in at least one comparison (LL vs LN, LL vs NN or LN vs NN). As a result, a total of 2259 SNPs remained.
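
The frequency-difference criterion can be sketched as below. The sketch assumes that per-group frequencies of the same allele are already available (for example, exported from PLINK) and that "higher than 0.25" is a strict inequality; the SNP names, the frequencies and the function names are hypothetical.

```python
from itertools import combinations


def max_frequency_difference(group_freqs):
    """Largest absolute allele frequency difference over all group pairs."""
    return max(
        abs(group_freqs[a] - group_freqs[b])
        for a, b in combinations(sorted(group_freqs), 2)
    )


def select_divergent_snps(freq_table, threshold=0.25):
    """Keep SNPs whose difference exceeds the threshold in at least one
    pairwise comparison (LL vs LN, LL vs NN or LN vs NN)."""
    return [
        snp for snp, freqs in freq_table.items()
        if max_frequency_difference(freqs) > threshold
    ]


# Hypothetical frequencies; with 6 animals per group they are multiples of 1/12.
freq_table = {
    "snp_a": {"LL": 3 / 12, "LN": 6 / 12, "NN": 7 / 12},  # max difference 0.333
    "snp_b": {"LL": 6 / 12, "LN": 7 / 12, "NN": 8 / 12},  # max difference 0.167
    "snp_c": {"LL": 3 / 12, "LN": 6 / 12, "NN": 6 / 12},  # max difference exactly 0.25
}

selected = select_divergent_snps(freq_table)
print(selected)
```

With only 12 alleles per group, a single allele changes the group frequency by about 0.083, so the 0.25 threshold corresponds to a difference of more than three alleles between two groups.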

Among these preselected SNPs, 1023 were technically adequate for Agriseq genotyping. These were included in the design and genotyped by Genotyping by Sequencing (GBS) in the Iberian × Duroc population, from which phenotypes were also available. The list of markers genotyped by GBS is included as Supplementary Table S1. For GBS, libraries were made using Agriseq HTS library kit, and sequenced in Ion 540 chips on an Ion GeneStudio S5 Sequencer.

In addition, we complemented this panel with 192 candidate SNPs obtained from literature mining and from pure and Duroc-crossbred Iberian muscle RNAseq data13,14,40 (Supplementary Table S2). These SNPs were genotyped in the same Iberian × Duroc population using a custom TaqMan® OpenArray™ (OA) Genotyping platform (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Genotyping was carried out in a QuantStudio™ 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, Massachusetts, USA) at the Veterinary Molecular Genetics Service (SVGM, UAB, Barcelona, Spain). DNA samples were loaded and amplified on the arrays following the manufacturer's instructions. Detection of allele-specific signal intensities was performed using OpenArray NT Imager, and the genotypes were called using OpenArray SNP Genotyping analysis software. In addition, SNP images were visually inspected to detect any clustering issues.

Statistical association analyses

The total SNP dataset (1215 markers) was filtered for association analyses using PLINK 1.9, applying a MAF threshold of 0.1 and excluding markers with genotyping errors; 1005 SNPs remained after filtering. These analyses were carried out on 477 individuals with genotyping and phenotyping data. SNP effects were analysed with the leave-one-chromosome-out option of the GCTA software41. This option implements a mixed linear model-based association analysis using the following general model:

$$\boldsymbol{y} = \boldsymbol{a} + \boldsymbol{b}\boldsymbol{X} + \boldsymbol{f}\boldsymbol{F} + \boldsymbol{g}^{-} + \boldsymbol{e}$$

where y is the phenotypic value corresponding to each trait, a is the mean term, b includes the additive effects (fixed term) of the analysed SNP, X includes the genotype indicator variable, coded as 0, 1 or 2, f includes the remaining fixed effects and covariates, which differed depending on the group of traits analysed (Supplementary Table S3), and F is its corresponding incidence matrix, g- is the accumulated effect of all the SNPs (captured by the Genomic Relationship Matrix, GRM) except those mapped on the same chromosome as the SNP analysed, and e is the residual. The genetic variance var(g-) is estimated based on the null model y = a + g- + e and then fixed while the association between each SNP and the trait is tested. This variance is re-estimated each time a chromosome is excluded from the calculation of the GRM. The false discovery rate (FDR) was controlled using the qvalue library42 in RStudio43. SNPs with a p-value lower than 0.05 and a q-value lower than 0.10 were considered significantly associated with the trait tested.
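
The leave-one-chromosome-out idea behind g- can be illustrated with a small sketch that builds a GRM from all SNPs except those on the chromosome of the tested SNP. It assumes the commonly used allele-frequency-standardised estimator, A_jk = (1/N) Σ_i (x_ij − 2p_i)(x_ik − 2p_i) / (2p_i(1 − p_i)), and complete genotypes; GCTA's own implementation, including its handling of missing data, may differ. The genotypes and the function names `grm` and `loco_grm` are hypothetical.

```python
def grm(snp_genotypes):
    """Genomic relationship matrix from per-SNP lists of 0/1/2 codes.

    Assumptions: no missing genotypes; monomorphic SNPs are skipped.
    """
    if not snp_genotypes:
        raise ValueError("at least one SNP is required")
    n = len(snp_genotypes[0])
    matrix = [[0.0] * n for _ in range(n)]
    used = 0
    for codes in snp_genotypes:
        p = sum(codes) / (2 * n)       # frequency of the counted allele
        scale = 2 * p * (1 - p)        # expected genotype variance
        if scale == 0:
            continue
        used += 1
        centred = [x - 2 * p for x in codes]
        for j in range(n):
            for k in range(n):
                matrix[j][k] += centred[j] * centred[k] / scale
    if used == 0:
        raise ValueError("no segregating SNPs")
    return [[value / used for value in row] for row in matrix]


def loco_grm(snps, excluded_chromosome):
    """GRM built from every SNP not located on `excluded_chromosome`."""
    kept = [codes for chrom, codes in snps if chrom != excluded_chromosome]
    return grm(kept)


# Hypothetical data: (chromosome, genotype codes for four animals).
snps = [
    ("1", [0, 1, 2, 1]),
    ("1", [2, 1, 0, 1]),
    ("2", [0, 0, 1, 2]),
]

# GRM used when testing any SNP located on chromosome 1.
A = loco_grm(snps, "1")
for row in A:
    print([round(value, 3) for value in row])
```

Excluding the chromosome of the tested SNP prevents that SNP, and markers in linkage disequilibrium with it, from also being fitted through the random genetic term, which would otherwise reduce the power of the test.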

Results

SNP discovery and genotyping

An average of 83 million paired-reads were generated per sample by Illumina sequencing, ranging from 67 to 100 million, with a mean coverage of 8.4 ×. After quality control, 98% of the reads were mapped, on average, to the reference genome (Sscrofa11.1), and over 14 million variants were detected. After the application of several quality filters and functional and technical selection criteria, 1023 missense polymorphisms were finally genotyped by GBS.

GBS quality metrics confirmed the reliability of the genotyping procedure. The average sample and marker call rates were both 93.9%, and 90.2% of the markers had a call rate above 90%. Average coverage was 183 ×, well above the Agriseq recommended coverage of 100 × for genotyping purposes. Uniformity (percentage of target bases that have at least 0.2 times the mean depth) averaged 92.9%.
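
The two quality metrics mentioned above can be computed as in the following sketch. It assumes that missing genotypes are represented as `None` and that depth is available per target base; the example values and the function names are hypothetical, and the vendor software may compute these figures differently.

```python
def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def uniformity(depths, fraction=0.2):
    """Percentage of target bases with depth >= fraction * mean depth,
    following the definition given in the text."""
    mean_depth = sum(depths) / len(depths)
    covered = sum(d >= fraction * mean_depth for d in depths)
    return 100.0 * covered / len(depths)


# Hypothetical marker genotyped in four animals, one call missing.
marker_calls = ["AA", "AB", None, "BB"]
rate = call_rate(marker_calls)

# Hypothetical depths at five target bases (mean depth = 180).
depths = [200, 180, 150, 30, 340]
unif = uniformity(depths)

print(f"call rate: {rate:.2f}")
print(f"uniformity: {unif:.1f}%")
```

The same `call_rate` function applies to a sample (across markers) or to a marker (across samples), depending on which list of genotypes is passed.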

Out of the 1023 SNPs included in the GBS marker panel, 120 polymorphisms were discarded due to lack of segregation (MAF < 0.10) or genotyping errors, resulting in 903 GBS markers with useful data for the association study.
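
The segregation filter can be sketched as follows, assuming genotypes coded as 0, 1 or 2 copies of one allele, `None` for missing calls, and exclusion of markers with MAF below 0.10 as stated above. The marker names, the genotypes and the function name `minor_allele_frequency` are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 genotype codes, ignoring missing calls (None)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


# Hypothetical genotypes for three markers.
panel = {
    "m1": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # MAF = 0.05, discarded
    "m2": [0, 1, 2, 1, None, 2],           # MAF = 0.40, kept
    "m3": [2, 2, 2, 2],                    # monomorphic, discarded
}

MAF_THRESHOLD = 0.10
informative = [
    name for name, codes in panel.items()
    if minor_allele_frequency(codes) >= MAF_THRESHOLD
]
print(informative)
```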

In parallel, 192 markers in candidate genes were genotyped by OpenArray. Of these, 60 were monomorphic and a further 30 were discarded due to low variability (0 < MAF < 0.10) or genotyping errors; thus, 102 OA markers remained informative.

Together, GBS and OA genotyping provided 1005 markers for the association study (903 from GBS and 102 from OA). The minor allele frequency (MAF) distribution of the genotyped markers in the Iberian crossbred population is shown in Fig. 2, which reflects a good proportion of markers with intermediate frequencies in the GBS marker panel derived from WGS data, and a much lower informativity in the OA panel derived from candidate genes proposed from either the literature or RNAseq data.

Figure 2

MAF distribution for markers genotyped by Genotyping by Sequencing (obtained from WGS data) or OpenArray (obtained from bibliography and RNAseq data).

Association study

The association study was performed for the 1005 successfully genotyped SNPs showing segregation within our population and all growth, fatness, FA composition and metabolic traits available (descriptive statistics included in Supplementary Table S4). Significant association results (q < 0.10) are included in Supplementary Table S5 and a graphical representation of those results is shown in Fig. 3. Supplementary Table S6 includes association results with significant nominal p-values, for those genes showing at least one significant association (q < 0.10).
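
The double threshold used to declare significance (p < 0.05 and q < 0.10) can be illustrated with a short sketch. Note that it uses Benjamini-Hochberg adjusted p-values as a stand-in for q-values: these coincide with Storey q-values when the proportion of true null hypotheses is fixed at 1, whereas the qvalue package used in the study estimates that proportion from the data, so its q-values can be smaller. The p-values and the function name `bh_adjust` are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five SNPs tested against one trait.
pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
qvalues = bh_adjust(pvalues)

significant = [
    i for i, (p, q) in enumerate(zip(pvalues, qvalues))
    if p < 0.05 and q < 0.10
]
print([round(q, 5) for q in qvalues])
print(significant)
```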

Figure 3

Summary of significant association results (q < 0.10) obtained for the different traits, grouped by trait type/stage (x axis) and source of polymorphisms (y axis). Colored bars denote the number of traits significantly associated with each gene. ADG average daily gain, FCR feed conversion ratio, WGS whole genome sequencing.

Growth and fatness

Regarding growth and fatness traits, 72 significant associations were observed, most of them involving candidate genes genotyped on the OA platform (n = 59), for which previous biological or functional hypotheses were available.

Candidate genes with SNPs affecting weight and morphological measures at early developmental stages (birth and weaning) included ACACA, EPHX1 and FREM2 (Supplementary Table S5). The two SNPs analysed within the ACACA gene consistently affected all the body measures recorded at birth (Table 1). At weaning, isolated associations reaching the significance threshold were detected between EPHX1 and abdominal circumference and between FREM2 and weaning weight. Several nominal-level associations were observed for other related traits in both genes (Supplementary Table S6): FREM2 showed several suggestive associations with early development traits, while EPHX1 showed suggestive associations with lifelong growth and development traits. From 150 days-old onwards, the main genes affecting weight, ADG and FCR were NMNAT1, SOWAHB, EPHX1, TFRC and SCD. Among them, the three SNPs analysed within the SCD gene showed the effects of largest magnitude, reaching, for instance, 29 kg (3.2 SD) for slaughter weight, 10 kg (1.25 SD) for carcass weight and 0.125 kg/d (1.9 SD) for global ADG (Table 2). A total of 13 associations involved SNPs detected by WGS and selected based on their potential effect on the encoded protein and their difference in allele frequency between the growth groups. The SNPs derived from WGS data pointed to new genes related to weight and body measures, only at birth, including LRIG3, DENND1B, TMCC1 and MFRP. Among the genes associated with birth morphological traits, ACACA (Table 1) and TMCC1 were those with the largest effects.

Table 1 Association results for polymorphism located in ACACA gene (75_ACACA, position 38,825,225). Traits with p-value < 0.01 are shown and those with q < 0.10 are highlighted in bold.
Table 2 Association results for polymorphisms in SCD gene (10_SCD, position 111,461,631). Traits with p-value < 0.01 are shown and those with q < 0.10 are highlighted in bold.

Backfat was affected by the LEPR, NFE2L2, FTO and LIPE candidate genes (Supplementary Table S5). LEPR showed the clearest impact on backfat, with significant or suggestive effects on the depth of the different fat layers during the growing stage and on several other traits (Table 3), and with the effect of largest magnitude at slaughter (0.7 SD). In addition, a SNP located in CFAP157/STXBP1, derived from WGS data, was associated with backfat thickness at slaughter, with an effect of relevant magnitude (0.6 SD).

Table 3 Association results for LEPRc.1987C/T polymorphism (001_LEPR, position 146,829,589). Traits with p-value < 0.01 are shown and those with q < 0.10 are highlighted in bold.

Fatty acid composition

Fatty acid composition was determined in backfat (inner and outer layers) and in Longissimus dorsi muscle and liver tissues (neutral and polar lipid fractions). A total of 100 significant associations were found for these traits (Supplementary Table S5), out of which 45 associations corresponded to backfat FA composition (13 for the inner layer and 32 for the outer layer), 36 corresponded to loin FA (25 for neutral and 11 for polar lipid fractions) and 19 corresponded to liver FA profile (12 for neutral and 7 for polar lipid fractions). As for growth and fatness traits, most associations corresponded to markers located in biological and functional candidate genes genotyped by OA (n = 93) in comparison to those derived from WGS (n = 7). The 100 significant associations were related to 42 SNPs, located in 35 genes, and showing between 1 and 18 significant effects on different traits or tissues. Results for those SNPs having at least 2 significant effects are shown in Table 4. Genes with the highest number of significant associations with FA traits were KRT10 (Tables 4 and 5), with 18 significant associations; and NLE1, with 9 significant associations (Tables 4 and 6), both genes affecting the FA profile of backfat and muscle neutral lipids, mainly regarding short chain saturated and monounsaturated FA. Variation in KCNH2 gene was significantly associated with several main PUFA measures (C18:2n-6, C18:3n-3, PUFA, n-3, n-6) only in the backfat outer layer; while other genes such as AHNAK, LIPE, EPHX1 or TNF had effects on different locations affecting similar traits (Table 4).

Table 4 Significant association results for FA composition traits, including SNPs with at least two significant effects on different traits or different tissues (q < 0.10).
Table 5 Association results for polymorphism in KRT10 gene (175_KRT10, position 21,641,370). Traits with p-value < 0.01 are shown and those with q < 0.10 are highlighted in bold.
Table 6 Association results for polymorphism in gene NLE1 (168_NLE1, position 40,037,354). Traits with p-value < 0.01 are shown and those with q < 0.10 are highlighted in bold.

Three different SNPs located in the SCD gene were associated with palmitoleic acid and the oleic/stearic desaturation ratio only in the neutral lipid fraction of liver (C16:1n-9, C18:1/C18:0) (Tables 2, 4). However, effects significant at the nominal level were observed for inner backfat FA traits, although they did not reach the multiple-test corrected significance threshold (Supplementary Table S6). Also, the key genes of lipogenesis, FASN and ACACA, were associated with myristic acid (C14:0) and palmitoleic acid (C16:1n-7), respectively, in both the backfat outer layer and the neutral lipids from muscle (Table 4).

Besides, some other relevant genes known to influence fatness related traits, such as IRX3, ADIPOQ, MSTN or MYOD1, had significant effects on just one trait (Supplementary Table S5).

Blood biochemical parameters

In relation to metabolic traits, only two significant associations were found for HDL concentration in plasma at 110 and 215 days-old, both related to WGS markers (Supplementary Table S5). The association found at 110 days old involves a SNP located in TBC1D16 gene, while the association found at 215 days-old involves the gene CD4.

No significant effect was observed for any of the other meat quality traits included in the study (intramuscular fat content, pH and drip loss) after the multiple test correction.

Discussion

In this study, a medium-throughput genotyping approach was performed after an initial SNP discovery step, in order to uncover DNA polymorphisms related to relevant productive traits. SNPs analysed were mostly obtained from WGS data, and complemented with a panel of biological and functional candidate genes/SNPs, obtained from literature and RNAseq data.

Genotyping results showed a good level of informativity of the markers derived from WGS, with a high proportion of markers showing intermediate frequencies (56% showed 0.3 < MAF < 0.7) and a small proportion being discarded due to low segregation (10.6%). These results support the validity of the SNP discovery approach performed from WGS data. In contrast, markers obtained from the literature and RNAseq data were less informative (33% showed 0.3 < MAF < 0.7) and a high proportion was found to be monomorphic or scarcely informative (50%). This finding was not unexpected, as WGS markers were obtained from sequencing data from animals of the same Iberian crossbred population employed for the association study, while candidate genes/SNPs were previously identified in different pig breeds and populations.

Although the previously known candidate SNPs were fewer and less informative than the WGS-derived markers (102 vs 903), a much higher number of significant associations was observed for the SNPs with previous evidence of association or differential expression. This is not surprising, as those SNPs fulfill more functional criteria than the ones discovered in the present work from WGS data. In fact, out of 174 significant associations, 152 involved previously identified SNPs in candidate genes and only 22 involved WGS-derived markers. Moreover, out of the 152 associations related to previously characterized SNPs, 62 corresponded to well-known markers with previous evidence of association, and 90 corresponded to markers discovered from RNAseq data, within genes showing differential expression between Iberian pig genotypes or with a potential regulatory role. This finding supports the usefulness of mining deep-sequencing transcriptome data to uncover structural variants with interesting effects on phenotype.

Regarding growth and development traits, two main genes, ACACA and SCD, were found to affect, respectively, birth measures and lifelong performance traits. These genes are involved in fat synthesis and metabolism and, indeed, significant effects were also found for both genes on FA composition. However, their potential role in growth traits is not clear so far. Acetyl-CoA carboxylase (ACACA) is a multifunctional enzyme system that catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in FA synthesis. In agreement, our results show a significant effect of this gene on palmitoleic acid abundance in backfat and muscle and suggestive effects on many main FA and FA ratios. Besides this main role, the ACACA gene influences mitochondrial membrane potential, and thus it might condition cell proliferation and tissue growth44. In fact, a deficiency in ACACA activity was associated with defects in early growth and muscle development in humans45. Stearoyl-CoA desaturase (SCD) catalyzes the ∆9-cis desaturation of palmitoyl- and stearoyl-CoA, which are converted into palmitoleoyl- and oleoyl-CoA, respectively. In addition to being components of tissue lipids, MUFAs also serve as mediators of signal transduction, cellular differentiation, food intake, apoptosis and mutagenesis; therefore, variation in SCD function in mammals would be expected to have an effect on a variety of key physiological events, including growth46. Thus, it has been proposed that BW gain may be associated with increased SCD activity. In fact, a different polymorphism in the SCD gene was associated with ADG and FCR in lean pigs47. Nevertheless, the three SNPs analysed in this study (g.2108C>T; g.2228T>C; g.2281A>G), located in the promoter region and in full linkage disequilibrium, had not been previously associated with performance traits48,49,50.
This novel finding is also striking because of the size of the effect, especially on BW traits, with an estimated additive effect close to 30 kg. This finding should be interpreted cautiously, as the frequency of the homozygous genotype was low and the effect could be overestimated. Interestingly, Estany et al.48 found that the C-T-A haplotype was additively associated with an enhanced C18:1/C18:0 desaturation index both in muscle and subcutaneous fat, but not in liver, in a purebred Duroc line. According to our results, the main effect of the SCD gene on FA composition is detected in the liver neutral lipid fraction, with the C-T-A haplotype enhancing the desaturation index as well as C16:1n-9 content. Also, a potential effect on IMF was reported for this mutation49, which has not been validated in the present work.

Other interesting genes and SNPs that have not yet been established as candidates for growth traits were also identified. Some of them were obtained from RNAseq data and were differentially expressed between Iberian and Iberian × Duroc crossbred or Duroc purebred pigs, such as FREM2, EPHX1, SOWAHB or TFRC (refs. 5,13,14,15). Thus, those genes fulfill a functional criterion to be considered as involved in the phenotypic differences observed between Iberian and Duroc genotypes. The FREM2 gene (FRAS1 related extracellular matrix protein 2) encodes an extracellular matrix protein with a role in morphogenetic processes51. In agreement with our findings, the FRAS/FREM complex is particularly important during early development. In humans, mutations in this gene cause Fraser syndrome, a multisystem disorder affecting the structural adhesion of the skin epithelium to its underlying mesenchyme52, and in pigs this gene has been proposed as a candidate for developmental morphological traits such as the presence of wattles53. The genes SOWAHB and EPHX1 showed somewhat similar association results, both affecting lifelong growth traits and some fat deposition traits. The gene SOWAHB (sosondowah ankyrin repeat domain family member B) is part of the Ankyrin family, involved in membrane skeleton organisation, ionic transport, protein recognition and cell–cell adhesion regulation54. Moreover, SOWAHB is activated in response to insulin and is possibly involved in insulin resistance55. Besides its effect on growth traits, significant nominal p-values were observed in the tests of association with fatness traits at different ages (backfat thickness, cholesterol; Supplementary Table S6), suggesting a role in lipid metabolism. This gene showed 4 × higher expression in the loin of purebred Iberian pigs than in that of crossbred growing pigs14.
The EPHX1 (microsomal epoxide hydrolase) gene showed significant association results for growth as well as FA composition traits, specifically for C20:1(n-9) and C20:3(n-6) contents. The enzyme encoded by this gene is able to either detoxify or bioactivate a wide range of substrates56, including epoxides derived from endogenous polyunsaturated FA, thus mediating several biological processes such as inflammation or angiogenesis, and regulating crucial signaling pathways for cellular homeostasis, adipocyte differentiation and insulin response57,58,59. This gene was shown to be upregulated in biceps femoris, longissimus dorsi and backfat of pure Iberians in comparison to Duroc genotypes13,14,15. The upregulation of the SOWAHB and EPHX1 genes in different tissues of pure Iberians may be related to their potential role in insulin response and adiposity. This is in agreement with the suggestive association of their structural variants with fatness traits and their significant association with weight at growing stages, when Iberian pigs are known to undergo compensatory growth and more intense development than Duroc genotypes5. The TFRC gene (Transferrin Receptor) encodes a cell surface receptor necessary for cellular iron uptake and has been studied as a positional candidate gene for disease susceptibility60. In addition, it acts as a lipid sensor that regulates mitochondrial function by modulating activation of the JNK pathway61. Interestingly, TFRC was upregulated in loin muscle of Duroc × Iberian crossbred pigs at birth14, but it was upregulated in biceps femoris and backfat of Iberian purebred growing pigs5,15. According to our results, the SNP in this gene affects BW and ADG.
Thus, the opposite regulation observed at birth and at growing stages may be related to the different pace of muscle developmental processes observed in Iberian vs Duroc genotypes, with the latter showing higher prenatal development and birth weight and the Iberian purebreds showing increased muscle development at growing stages.

Newborn developmental traits were also affected by several SNPs detected through WGS, located in relatively unknown genes such as LRIG3, DENND1B or TMCC1. Although none of these genes has been studied for association with any productive trait, all have roles or evidence suggesting involvement in early developmental processes. LRIG3 (leucine-rich repeats and immunoglobulin-like domains 3) plays a role in embryo development, including cranio-facial morphogenesis and neural crest formation62. A direct relationship between LRIG3 and early growth has also been observed, since KO mice were smaller than wild-type ones63. In agreement, our results show significant effects on birth weight and thoracic circumference and suggestive effects on all remaining traits recorded in neonates, including those related to head development. The DENN Domain Containing 1B (DENND1B) gene has an important role in cytokine production and regulation of T cell receptor signaling, and mutations or loss of this factor have been associated with immune diseases in neonates64. Interestingly, methylation of the DENND1B gene in venous umbilical cord blood at delivery has been associated with birth weight in humans65. According to our results, this gene influences birth weight, abdominal circumference and biparietal diameter and is suggestively associated with most early traits recorded at birth and at weaning. TMCC1 (Transmembrane And Coiled-Coil Domain Family 1) is an integral endoplasmic reticulum (ER) membrane protein that has been associated with central adiposity and waist circumference66 and is thought to be involved in the regulation of myogenesis67. Our results indicate effects on biparietal and thoracic diameter as well as thoracic circumference and, at a suggestive level, on several other birth traits and later fatness traits such as backfat.

Fattening traits were mainly affected by SNPs located in known candidate genes such as LEPR, LIPE or FTO, with the clearest results detected for LEPR. The leptin receptor is known to have a key role in energy homeostasis and has been widely studied as a functional and positional candidate gene for fat deposition traits. Iberian pigs are known to carry a fixed mutation in this gene, which is considered a causal mutation contributing to their characteristic tendency towards adiposity and high appetite as well as their leptin-resistant phenotype7,8. This mutation has been evaluated in different genetic backgrounds, mainly in experimental pig populations, but also in other breeds and commercial populations50,68,69,70. According to the present results, the effects of this SNP on fatness are further validated in our Iberian × Duroc commercial population, with significant effects on different backfat thickness measures and global ADG, and suggestive associations with BW, FA composition and cholesterol, HDL and LDL levels. As expected, the Iberian T allele increases fattening, with an additive effect close to 0.5 cm for backfat, which means an estimated difference of almost 1 cm between animals homozygous for the alternative alleles (20% of the trait mean value).
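The arithmetic behind the last sentence can be sketched as follows. This is only an illustration under a purely additive model: the function names are hypothetical, and the trait mean of 5 cm is an assumed round value implied by the statement that a difference of about 1 cm corresponds to roughly 20% of the mean, not a figure reported in the text.

```python
def homozygote_difference(additive_effect):
    """Expected difference between the two homozygous genotypes.

    Under a purely additive model each copy of the allele shifts the
    trait by `additive_effect`, so the homozygotes differ by twice it.
    """
    return 2.0 * additive_effect


def fraction_of_mean(difference, trait_mean):
    """Express a genotype difference as a fraction of the trait mean."""
    return difference / trait_mean


# Approximate additive effect of the LEPR T allele on backfat (cm), from the text.
additive_effect_cm = 0.5
# ASSUMPTION: illustrative trait mean (cm), inferred from "about 1 cm = 20% of the mean".
assumed_trait_mean_cm = 5.0

diff_cm = homozygote_difference(additive_effect_cm)
share = fraction_of_mean(diff_cm, assumed_trait_mean_cm)
print(f"Homozygote difference: {diff_cm:.1f} cm ({share:.0%} of the assumed mean)")
```

The same two steps apply to any additive effect reported here, provided dominance is negligible and the corresponding trait mean is available.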

Besides the known candidate genes, the NFE2L2 gene (nuclear factor erythroid 2-related factor 2) was associated with fatness. This gene encodes a transcription factor whose main role is to regulate the expression of antioxidant proteins that protect against oxidative damage. Moreover, this transcription factor is key to the maintenance of homeostasis, participating in the regulation of metabolism, inflammation, autophagy, proteostasis, mitochondrial physiology, and immunity71. This gene was selected from RNAseq data for SNP discovery, despite not being differentially expressed, due to its potential role as a predicted regulator of the differential expression observed between Iberian pig genotypes14. Our results show a significant effect on backfat thickness and suggestive effects on intramuscular fat content and FA composition, especially in loin, in agreement with its known role in muscle mitochondrial biogenesis and its predicted role as a regulator of muscle metabolism.

The main genes affecting FA composition were KRT10 and NLE1, both affecting the proportions of SFA and MUFA in backfat and loin. KRT10 (keratin 10) was studied because it was overexpressed in muscle from pure Iberian in comparison to crossbred pigs13, although it does not have a known role in relation to fat metabolism. However, it has recently been shown that KRT10 expression is suppressed by serum lipids72 and, interestingly, different keratin genes have been associated with FA composition traits in a chromosome 12 survey in Iberian × Landrace backcrossed pigs73. In fact, the KRT10 gene maps on SSC12, where several QTL for FA composition traits have been discovered. Moreover, KRT10 was found to be the most significant differentially expressed gene according to ACACA genotype regarding polymorphism ALGA0066302A, not analysed here, which in turn is associated with FA composition. On the other hand, the NLE1 gene (Notchless homolog 1) plays a role in regulating the Wnt pathway, which is known to play a central role in adipocyte differentiation and lipid metabolism74. In fact, the Wnt pathway plays a dual function in adipocytes, including a well-known repressive effect on adipogenesis and the stimulation of leptin production in mature adipocytes75. In agreement with the antiadipogenic effect, a higher expression of this gene was observed in loin of crossbred Iberian pigs, which are leaner than purebreds14. Our association results do not relate this gene to adiposity traits, but clearly demonstrate its involvement in relevant traits related to FA metabolism and profile, which influence meat quality, in the two main productive tissues, fat and muscle, and especially in the latter.
The allele that increases oleic acid content and monounsaturated to saturated fat proportions, with effects ranging from 0.5 to 1 SD in magnitude, also shows suggestive positive effects on BW and ADG at late growing stages, meaning it could be an interesting marker to improve meat quality without negative pleiotropic effects on other relevant traits.

Several other SNPs showed significant effects limited to a few FA. The KCNH2 gene (Potassium Voltage-Gated Channel Subfamily H Member 2) was associated with the main PUFA contents in backfat, although suggestive effects were also observed on the muscle FA profile and on some growth and carcass traits. This gene was previously found to be overexpressed in loin and biceps of pure Iberians in comparison to crossbreds and Duroc animals5,14. Our results agree with previous findings in humans showing that mutations in the KCNH2 gene are associated with alterations of insulin homeostasis and of glucose and lipid metabolism76, and that its methylation is related to obesity77. The AHNAK gene (Neuroblast differentiation-associated protein) encodes a nucleoprotein involved in adipogenesis and lipid metabolism. AHNAK KO mice have reduced fat accumulation and decreased serum triglyceride levels as well as increased expression of genes involved in lipolysis and FA oxidation78. In agreement, its expression was higher in pure Iberian animals14, which are characterised by a higher adipogenic potential than crossbreds. In addition, our results show effects of this SNP on different FA in backfat and liver and a suggestive effect on backfat thickness at weaning.

On the other hand, the genes FASN, LIPE and TNF had similar effects, mainly on C14:0 in backfat and muscle. As mentioned previously, the LIPE gene (hormone sensitive lipase) was one of the genes significantly affecting backfat depth at slaughter. The LIPE allele with a positive effect on backfat also positively affected C14:0 and C16:0 in backfat and loin and had suggestive negative effects on several MUFA parameters and glucose metabolism indicators. LIPE is a key enzyme in the mobilization of FA from acylglycerols79, and has been related to different fat deposition and FA profile traits in different species. In turn, the FASN gene (fatty acid synthase) encodes a multifunctional enzyme that catalyzes the synthesis of FA and has been repeatedly associated with FA composition traits, especially affecting short chain saturated FA21,80, as in our results. This gene, together with ACACA, regulates de novo synthesis of FA. Thus, FASN, ACACA and LIPE are involved in the regulation of fat deposition by balancing lipogenesis and lipolysis, and significant association results were obtained for all three of them. Finally, tumor necrosis factor (TNF) is an adipokine that promotes insulin resistance and is associated with obesity-induced type 2 diabetes81. In agreement, our results indicate a significant effect on C14:0 in backfat and loin, but many other suggestive associations were also found for other saturated and monounsaturated FA, as well as for carcass and meat quality traits (backfat thickness at slaughter, drip loss) and BW records along growth.

Few significant associations were found for metabolic traits. Only HDL concentration was significantly associated with two different SNPs, both derived from WGS data, at two different ages. The association found at 110 days of age is interesting as it involves the TBC1D16 gene, which has a role in membrane trafficking and molecule transport and is known to be dysregulated in obesity82. Besides the significant effect on HDL, this SNP showed suggestive effects on many different FA traits related to MUFA contents and MUFA/SFA ratios in loin and liver, which are known to be correlated with physical markers of obesity such as body mass and adiposity indexes83, in agreement with the role of this gene, although no effect was observed on fatness.

Conclusions

A combined approach of candidate SNP recovery from the literature and SNP discovery from WGS and RNAseq data has been successful in validating the effects of different SNPs in candidate genes previously associated with fatness, growth and FA composition traits (such as LEPR, SCD, ACACA, FASN or LIPE). Moreover, our approach allowed for the discovery of interesting new effects of SNPs in lesser-known genes (such as LRIG3, DENND1B, NMNAT1, SOWAHB, EPHX1) affecting BW and ADG at different ages, as well as fatness and FA composition (NFE2L2, KRT10 or NLE1). Most significant associations came from SNPs identified through structural mining of transcriptome deep-sequencing data. In spite of the relatively small number of significant associations detected for WGS-derived markers, the approach employed here is proposed as a useful genome-wide strategy for the discovery of the untapped genetic basis of productive traits. Our results contribute to a better understanding of the molecular genetic basis of relevant traits in pigs, including both growth and yield-related traits as well as traits involved in the sensory and technological quality of meat, which are essential for Iberian pig cured products. Besides, our results provide interesting SNPs and new candidate genes for association studies in different pig breeds and populations and for future implementation of marker-assisted selection strategies. Different markers can be highlighted as most promising, such as KRT10, NLE1, TFRC, NFE2L2, SCD or LEPR, and some of them may allow the improvement of meat quality without negatively affecting growth efficiency, or vice versa.