Joint QTL mapping and gene expression analysis identify positional candidate genes influencing pork quality traits

Meat quality traits have an increasing importance in the pig industry because of their strong impact on consumer acceptance. Herewith, we have combined phenotypic and microarray expression data to map loci with potential effects on five meat quality traits recorded in the longissimus dorsi (LD) and gluteus medius (GM) muscles of 350 Duroc pigs, i.e. pH at 24 hours post-mortem (pH24), electric conductivity (CE) and muscle redness (a*), lightness (L*) and yellowness (b*). We have found significant genome-wide associations for CE of LD on SSC4 (~104 Mb), SSC5 (~15 Mb) and SSC13 (~137 Mb), while several additional regions were significantly associated with meat quality traits at the chromosome-wide level. There was a low positional concordance between the associations found for LD and GM traits, a feature that reflects the existence of differences in the genetic determinism of meat quality phenotypes in these two muscles. The performance of an eQTL search for SNPs mapping to the regions associated with meat quality traits demonstrated that the GM a* SSC3 and pH24 SSC17 QTL display positional concordance with cis-eQTL regulating the expression of several genes with a potential role on muscle metabolism.

performed an additional analysis where we have investigated the co-localization between GM QTL and expression QTL in cis (cis-eQTL).

Materials and Methods
Ethics approval. The manipulation of Duroc pigs followed Spanish national guidelines and it was approved by the Ethical Committee of Institut de Recerca i Tecnologia Agroalimentàries (IRTA).

Measurement of phenotypic and expression data.
Phenotypic records were collected in a commercial Duroc line of 350 barrows distributed in five half-sib families (Lipgen population). A detailed description of the management conditions of this commercial line has been previously reported 9 . Meat quality analyses were performed 24 h after slaughter at the IRTA-Centre of Food Technology by using 200 g samples of the LD and GM muscles. Electrical conductivity was estimated with a Pork Quality Meter (Intek GmbH) while pH 24 was measured with a pH-meter equipment with a Xerolyte electrode (Crison). Meat L*, a* and b* color parameters were determined with a Minolta Chroma-Meter CR-200 (Konica Minolta) equipment (light source C and aperture 2). Microarray expression data of GM samples from 104 Duroc pigs were obtained in a previous study (data can be found in the Gene Expression Omnibus public repository, accession number: GSE19275) based on the use of GeneChip Porcine Genomic arrays (Affymetrix, Inc., Santa Clara, CA) 10 . A detailed description of the techniques and methods used to perform the RNA purification and microarray hybridization steps can be found in Canovas et al. 10 . Briefly, GM samples from 104 pigs were grinded in liquid nitrogen and homogenized with a mechanical rotor. Total RNA was purified with an acid phenol protocol 11 and it was subsequently used as a template to synthesize double stranded cDNA with the One Cycle cDNA Synthesis Kit (Affymetrix, Inc.). cRNAs were purified with the GeneChip Sample Cleanup Module (Affymetrix, Inc.), fragmented and added to a hybridisation cocktail 10 . The GeneChip Porcine Genome Array was equilibrated to room temperature and prehybridised with 1× hybridisation buffer at 45 °C for 10 min 10 . The hybridisation cocktail was heated to 99 °C for 5 min in a heat block and cooled to 45 °C for 5 min. Subsequently, a hybridization step was carried out at 45 °C for 16 hours. GeneChips were washed and labeled with streptavidin phycoerythrin in a Fluidics Station 450 (Affymetrix, Inc) and they were scanned in an Agilent G3000 GeneArray Scanner (Agilent Technologies, Inc.). The "Affy" and "Sympleaffy" packages from the Bioconductor project 12 were employed to establish a set of quality control metrics to assess the quality of RNA samples and the efficiencies of the labelling and hybridisation steps. Data pre-processing and normalization were carried out with the BRB-ArrayTools software version 3.7.1 13 . Genes displaying more than 20% of expression values over ± 1.5 times the median expression of all arrays were retained for further analysis.
Genome-wide association analysis for meat quality and expression data. Genotyping was performed with the Porcine SNP60 BeadChip (Illumina, San Diego, CA) which contains 62,163 single nucleotide polymorphisms (SNPs). Quality genotyping analyses were carried out with the GenomeStudio software (Illumina), as previously reported 14 . We removed SNPs (a) mapping to the X chromosome, (b) with a rate of missing genotypes higher than 5%, (c) that did not conform Hardy-Weinberg expectations (threshold set at a P-value ≤ 0.001), (d) that had a minor allele frequency below 0.05, (e) that had a GenCall score < 0.15, (f) that had a call rate < 95% or (g) that could not be mapped to the pig genome (Sus scrofa 10.2 assembly). After filtering the raw data, a GWAS was carried out with 36,710 SNPs. Single-SNP association analyses were performed with the Genome-wide Efficient Mixed-Model Association (GEMMA) software 15 under an additive genetic model that included the genomic kinship matrix to account for relatedness. The statistical model assumed in this analysis was: where y ijklm is the vector of phenotypic observations i.e. pH 24 , CE, L*, a* and b* measured at the GM and LD muscles of the i th individual; μ is the population mean of each trait; batch j is a systematic effect of the j th fattening batch, with 4 categories; β is the regression coefficient on the covariate weight at slaughter (weight k ); δ is the SNP allelic effect, estimated as a regression coefficient on the corresponding g l genotype (values − 1, 0, 1) of the l th SNP; and e ijklm is the residual effect. The statistical relevance of the systematic environmental sources of variation and the covariates included in the model were previously reported by Gallardo et al. 8 and Casellas et al. 16 . Correction for multiple testing was implemented with a false discovery rate approach 17 . Microarray data were available exclusively for GM muscle samples 10 . Following the strategy employed in the Genotype-Tissue Expression (GTEx) pilot analysis 18 , we primarily searched for cis-eQTL because they are expected to have larger effects than their trans-counterparts. We used two different strategies: Analysis 1, we retrieved 12 genes localized within GM QTL regions and we looked for cis-eQTL that might regulate their expression and Analysis 2, we made a search for cis-eQTL at a whole genome scale and we analyzed if there was a positional concordance between GWAS signals and cis-eQTL identified in this way. This second strategy made possible to identify cis-eQTL that might be located in the vicinity of GWAS signals. Genes corresponding to each probe included in the GeneChip Porcine Genomic array (Affymetrix, Inc., Santa Clara, CA) were identified in the BioMart database 19 . The statistical model assumed in this analysis was: where y ijklm is the vector that defines the expression of each gene in the GM muscle of the i th individual; μ is the mean expression of each gene in the population; batch j and lab k are the systematic effects i.e. batch j of fattening (with 4 categories) and lab k (microarray data were generated in two different laboratories); δ is the SNP allelic effect estimated as a regression coefficient on the corresponding g l genotype (values − 1, 0, 1) of the l th SNP; and Scientific RepoRts | 7:39830 | DOI: 10.1038/srep39830 e ijklm is the residual effect. Correction for multiple testing was implemented with a false discovery rate approach 17 . The threshold of significance in Analysis 1 took into consideration the number of SNPs contained within 2 Mb windows around each one of the 12 genes under consideration, while in Analysis 2 such threshold was established by taking into account the 36,710 SNPs typed in the Duroc population.

Results and Discussion
The SNPs arrayed in the Porcine SNP60 BeadChip explain a limited amount of the phenotypic variance of meat quality traits. By using the GEMMA software, we have estimated the proportion of phenotypic variance explained by the 36,710 SNPs (h 2 SNP ) genotyped with the Porcine SNP60 BeadChip (Table 1). In general, estimates of h 2 SNP ranged from low to moderate and differed between muscles. Discrepancies in the genealogic heritability (h 2 ) estimates of meat quality traits recorded in different skeletal muscle samples were previously reported by Larzul et al. 20 . In this way, these authors found h 2 of 0.03 and 0.23 for L* measured in the gluteus profundus and longissimus muscles, respectively. Similarly, the h 2 values of pH 24 measured in 4 different muscles oscillated between 0.17 (longissimus) and 0.39 (biceps femoris) 20 . When Gallardo et al. 8 performed a QTL scan for meat quality traits in the Lipgen population, they also found that QTL maps differed markedly amongst traits recorded in the GM and LD muscles. As a whole, these results suggest that there are muscle-specific factors that modulate the genetic determinism of meat quality traits. Indeed, Quintanilla et al. 21 identified remarkable differences in the gene expression patterns of the LD and GM muscles, a feature that was especially prominent for genes involved in muscle tissue development, cell proliferation and migration and muscle contraction.
Several h 2 SNP values obtained by us were comparable to genealogic heritabilities estimated for porcine meat quality traits in previous studies. For instance Gjerlaug-Enger et al. 22 reported heritabilities for a* of 0.43 and 0.46 in Duroc and Landrace pigs, respectively. Similarly, Van Wijk et al. 23 and Gjerlaug-Enger et al. 22 described heritabilities of 0.11 (crossbred pigs) and from 0.12 (Landrace) to 0.27 (Duroc) for pH 24 . More unexpected were the null h 2 SNP values obtained in the current work for traits such as b* (in GM) and L* (in both muscles). We attribute these null heritabilities to our inability to detect genetic variants that may have small effects or that segregate at very low frequencies 24 .
Environmental variables may also obscure the contribution of genetic factors. Indeed, meat quality traits can be affected by poor on-farm handling, mixing of unfamiliar animals and high pig density and long travel distance during transportation 25 . Such events may increase the stress of the swine brought to the abattoir and, consequently, they may have negative consequences on meat quality 25 . At the abattoir, extended lairage time can increase the incidence of dark, firm and dry (DFD) meat, while a short lairage time has been associated with an increased proportion of pale, soft and exudative (PSE) meat 25 . Electrical stunning induces a more rapid pH fall early post mortem and an inferior water-holding capacity than CO 2 stunning, while an accelerated chilling may have negative consequences on meat tenderness and water-holding capacity 25 . In summary, all these factors, and others that are not mentioned, can have a strong impact on the post-mortem pH, electrical conductivity and color of pig meat and "dilute" the contribution of polygenes 25 .
Genome-wide and chromosome-wide associations with meat quality traits in Duroc pigs. At the genome-wide level, we found significant associations between CE of LD and three genomic regions on SSC4, SSC5 and SSC13 ( Table 2). The SSC4, 104 megabase (Mb) region, lies close to a previously reported QTL for CE identified by Cepica et al. 26 . We also found positional concordance between the SSC13 (137.0 Mb) region associated with LD CE and a semimembranosus CE QTL reported by Evans et al. 27 . At the chromosome-wide level, a coincidence was detected between a a* QTL on SSC3 (50-57 Mb, Table 3) and a QTL for the same trait reported by Li et al. 28 on SSC3 (55 Mb). Overall, our results confirm the existence of differences in the genetic determinism of meat quality traits recorded in the GM and LD muscles. The only exception was a region on SSC5 that significantly affected CE in both LD and GM muscles (Table 3). When we compared these data with the set of QTL previously reported by Gallardo et al. 8 in the same Lipgen population we found one coincidence i.e. the GWAS signal identified on SSC4 (132 Mb) for CE in LD overlapped the confidence interval of a LD CE QTL (S0097 marker, ~133 Mb) detected by these authors 8 .
In general the positional coincidence between GWAS signals detected by us and those reported in previous studies was weak, indicating that the majority of associations reported in the current work are new. For instance, when we compared our a*, b* and pH 24 data with those described in six additional GWAS studies 4,6,29-32 we only found one positional coincidence between the SSC10 (70.6 Mb) genomic region associated with LD a* in the Lipgen  4 as associated with the same trait in the semimembranosus muscle of White Duroc × Erhualian F 2 pigs.
The level of coincidence of trait-associated regions between these six GWAS for a*, b* and pH 24 traits was also quite low. Only about 20% of the regions identified as significantly associated with any of these phenotypes were shared between two studies or more, indicating that the majority of associations are population-specific. These shared regions were: (a*) SSC4 (80-85 Mb) 6 29,32 . This latter region on SSC15 (133-136 Mb) appeared to be associated with a*, b*, pH 24 , shear force and cook loss in many independent studies 29-32 but not in ours. Interestingly, this SSC15 region contains the protein kinase AMP-activated non-catalytic subunit gamma 3 (PRKAG3) gene, whose polymorphism has causal effects on muscle glycogen depletion, a parameter that can have a strong influence on meat quality traits 33 .
Besides technical and methodological reasons, a probable cause for the lack of positional concordance between GWAS studies would be genetic heterogeneity 34 . Indeed, Yang et al. 34 performed a GWAS for blood lipid traits in 2,400 Laiwu, Erhualian and Duroc × (Landrace × Yorkshire) pigs and they identified a total of 22 QTL. Notably, only six regions were identified in more than one population, and 16 were detected in a single population.
Positional concordance between cis-eQTL for genes expressed in the GM muscle and QTL for GM traits. In general, eQTL are highly enriched in variants with causal effects on phenotypic variation and they can provide valuable information about candidate genes to be further investigated. Integrative analyses of QTL and eQTL data have been performed in pigs, making possible to combine the power of recombination with    expression studies in order to identify promising candidate genes 35 . For instance, multiple associations between SNPs mapping to porcine chromosomes 4 and 6 and meat quality traits have been detected 30 . Through an eQTL approach, it was possible to identify several genes on SSC4 (ZNF704, IMPA1 and OXSR1) and SSC6 (IH1D1, SIGLEC10, TBCB, LOC100518735, KIF1B, LOC100514845) whose variation is concomitantly associated with gene expression and phenotype data 30 . Similarly, Ma et al. 36 used a genetical genomics approach to demonstrate that a splice mutation in the PHKG1 gene is the causal mutation for a glycolytic potential QTL mapping to SSC3. We have used this integrative strategy to identify potential candidate genes for meat quality traits in a dataset of 12 loci that mapped to GM QTL regions (Analysis 1). In doing so, we have detected 3 cis-eQTLs ( Table 4) that co-localize with three chromosome-wide QTLs. One of them maps to SSC3 (16.6-17.06 Mb) and displays associations with a* (Fig. 1a); while the other two are located on SSC17 (53.1-57.2; 64.5-65.3) and show significant associations with GM pH 24 (Fig. 1b and c). Interestingly, two of the three cis-regulated genes encode lysosomal enzymes, i.e. cathepsin A (CTSA) and glucuronidase β (GUSB), that might be released during the post-mortem maturation of meat 37,38 . Cathepsin A is a lysosomal serine protease that can also protect galactosidase β from intralysosomal proteolysis 38 , while glucuronidase β is mainly involved in the degradation of glycosaminoglycans 39 . Interestingly, there are evidences that galactosidase β and glucuronidase β might affect the degradation of the collagen mucopolysaccharide, thus having a potential impact on meat ultrastructural properties 40 .
In Analysis 2, we have identified three additional cis-eQTL that map near to the SSC3 QTL for a* and the SSC17 QTL for pH 24 (Table 5). The ADCY3 locus, that co-localizes with the SSC3 QTL for GM a* (Fig. 2a), encodes an adenylate cyclase catalysing the conversion of ATP into cyclic adenosine-3′ ,5′ -monophosphate (cAMP), a secondary messenger that can have broad effects on muscle metabolism 41 . Indeed, AMPc is an activator of the cAMP-dependent protein kinase, a molecule involved in the phosphorylation of enzymes that promote the conversion of glycogen into glucose 41 . Noteworthy, the amount of glycogen stored in the muscle determines the post-mortem production of lactic acid, a molecule that has strong effects on meat color. Another eQTL of interest is the one influencing the mRNA levels of the secretory leukocyte peptidase inhibitor (SLPI) gene. This cis-eQTL co-localizes with the SSC17 QTL for GM pH 24 (Fig. 2b). The SLP1 gene encodes a serine-protease that inhibits protein-degrading enzymes with strong effects on meat tenderization i.e. when the skeletal muscle is being degraded and transformed into meat, SLPI attenuates muscle proteolysis by binding to proteases and rendering them inactive 42 . Finally, the co-localization of the IGKC cis-eQTL and the SSC3 QTL for a* (Fig. 2c) does not have an obvious biological interpretation because this gene is mainly related with humoral immunity.

Conclusions
We have detected genome-wide and chromosome-wide significant QTL for meat quality traits recorded in a Duroc commercial line with a population size that was moderate but comparable to the ones used in other porcine GWAS [43][44][45] . The limited positional concordance between the set of QTL detected by us and those reported by other authors in purebred populations suggests the existence of a significant amount of genetic heterogeneity   Table 5. List of significant cis-eQTLs mapping close to QTL regions for gluteus medius meat quality traits. a*: Minolta redness, pH 24 : pH at 24 hours post-mortem, N: number of significant SNPs, SNPs: marker displaying the most significant association with the trait under study, Location (Mb): region containing SNPs significantly associated with the trait under study, P-value: nominal P-value, q-value: q-value calculated with a false discovery rate approach, B: P-value corrected for multiple testing with the Bonferroni method, δ: allelic effect and its standard error (SE), A1: minority allele, MAF: frequency of the minority allele.  for meat quality traits in porcine breeds. We have found remarkable differences between the QTL maps for the LD and GM muscles, suggesting that meat quality is determined to a great extent by genetic factors that are muscle-specific. Finally, we have observed a number of cis-eQTL that co-localize with meat quality QTL regions. Several of these cis-eQTL regulate the expression of genes which may play important roles in muscle physiology and post-mortem meat maturation. Sequencing of the regulatory regions of these loci might be useful to uncover the identity of the causal mutations explaining the existence of these QTLs.