Pork quality is important both to the meat processing industry and consumers’ purchasing attitude. Copy number variation (CNV) is a burgeoning kind of variants that may influence meat quality. In this study, a genome-wide association study (GWAS) was performed between CNVs and meat quality traits in swine. After false discovery rate (FDR) correction, a total of 8 CNVs on 6 chromosomes were identified to be significantly associated with at least one meat quality trait. All of the 8 CNVs were verified by next generation sequencing and six of them were verified by qPCR. Only the haplotype block containing CNV12 is adjacent to significant SNPs associated with meat quality, suggesting the effects of those CNVs were not likely captured by tag SNPs. The DNA dosage and EST expression of CNV12, which overlap with an obesity related gene Netrin-1 (Ntn1), were consistent with Ntn1 RNA expression, suggesting the CNV12 might be involved in the expression regulation of Ntn1 and finally influence meat quality. We concluded that CNVs may contribute to the genetic variations of meat quality beyond SNPs, and several candidate CNVs were worth further exploration.
As a global food source, pork has unquestionable significant implications on the livelihood of human beings. Pork quality is important to the meat processing industry and consumers’ purchasing attitude1, thus, it is one of the main selection objective in breeding plans for most of the pig breeding organizations and enterprises2. Meat quality traits are complex and usually influenced by multiple genes or QTLs, therefore, the genetic improvement of these traits is rather slow. However, genome-wide association studies (GWAS), linkage mapping, and selective sweep analysis have been successfully employed to investigate significant gene markers for pH, tenderness, meat color, intramuscular fat (IMF) content1,2,3,4,5,6,7,8,9.
To date, a total of 12,618 QTLs from 461 publications have been recorded in PigQTLdb (http://www.animalgenome.org/cgi-bin/QTLdb/SS/index, released at Feb 11, 2015)10. Among these QTLs, more than half (7,014) were associated with meat and carcass quality traits. However, the limited density of microsatellite (SSR) markers resulted in inaccurate QTL mapping (Some segments are larger than 20 cM11, even cover a whole chromosome12). Long-term fine mapping experiments are needed to refine their locations and investigate causative variants13. Additionally, although genome wide association studies identified significant SNPs associated with meat quality traits, these SNPs explain a small portion of the genetic variance2,4. Alternative variances, which could explain the “missing heritability” of meat quality traits, were urgently needed14.
Copy number variations (CNVs) are currently accepted as a common source of genetic variation, and the “missing heritability” could partially be explained by CNVs as reported in several human studies15,16,17,18,19. In animals, there is strong evidence of the effects CNVs may have on disease resistance and economically important traits, such as milk production13, gastrointestinal nematodes resistance20, residual feed intake21, Marek’s disease-resistance22, late-feathering23, and squamous cell carcinoma of the digit24. Min pig is well known in China for its excellent meat quality (with 5% IMF in LM at 240-day-old) and good cold resistance capacity, and Large-white pig is a normal lean meat type breed with normal meat quality. The intercross of Min × Large-white is a good model to analyze the genetics of meat quality2. The objective of this work was to perform systematic CNV association analysis with meat quality traits using the Porcine SNP60 Genotyping BeadChip (Illumina, San Diego, CA, USA), analyze the joint or independent effects of CNVs and SNPs, and provide some helpful information to identify genetic markers that may be suitable for inclusion in genetic improvement program.
Trait properties and correlations
Two IMF traits (IMF content and marbling), 6 meat color traits (color L6*, color a6*, color b6*, color L24*, color a24*, and color b24*), 2 pH traits (pH24 and pH6), moisture, and shearing force (SFN) were analyzed in this paper. The summary of means, standard deviations (SD), genetic and phenotype correlations of the traits are provided in Table 1. From the result, there was a small difference between genetic and phenotype correlations. As respect, the genetic correlations between marbling and IMF, between each trait for meat color, between 2 pH values were high. The correlations between different kinds of traits such as color with IMF, pH with IMF, were low except the moisture and IMF.
CNV segmentation and genotyping
In the 678 samples of the three-generation pigs, a total of 32,544 distinct segments were detected using the multivariate method of CNAM in SVS. After merging across samples, 48 nonredundant CNVs were left for subsequent association test (Table S1). Within these 48 segments, each sample was genotyped (i.e., called as a loss, neutral or gain event) according to a three state model with strict threshold levels of marker mean ±0.5. Since the multivariate CNAM method was developed to identify common CNVs, only those segments with frequencies above 0.4% were retained for further analysis in order to filter away false positive calls. A total of 15 CNVs ranging in size from 34,076 bp to 1.10 Mb were retained (frequency > 0.04) (Table 2). These 15 CNVs have an estimated average size and SNP count of 208.53 Kb and 3.27 Mb, respectively.
Quality assessment of CNVs by using qPCR and NGS data
We first compared all of the 15 CNVs with the results from 7 previously published reports25,26,27,28,29,30,31. Twelve out the 15 CNVs (80%) were found previously reported (Table 2 and Table S2), and the remaining 3 non-overlapping CNVs (CNV2, 6 and 13) had high frequencies (ranging from 26 to 83%) and large marker mean changes (ranging from −0.39 to −1.24, deviated from 0), suggesting they are probably real CNVs (Table 2). Two complementary methods, quantitative PCR (qPCR) and next generation sequencing (NGS), were also performed to confirm the existence of detected CNVs. In the qPCR validation, we systematically assessed the overall agreement rate of detected CNVs with qPCR results. All the primers and results of qPCR are listed in Table S3 and Table S4. Totally, the detection power for the qPCR validation is 73.3% (11/15). In the NGS validation, we chose 12 F0 pigs for CNV calling, the sequencing depth of coverage for each animal varied from 4.7× to 8.4×. All of the NGS-based CNVs which were overlapped with significantly associated CNVs are listed in Table S5. All of the CNVs were confirmed with the Read depth (RD) method ranged from 0 to 4.5332. Detailed NGS analyses will be presented in a separate manuscript.
CNV association analysis
All of the 602 F2 generation pigs were employed to test association between CNVs and rEBV data. We identified a total of 8 CNVs that were significantly associated with at least one trait using a linear regression model (Fig. 1 & S1 and Table 2). Among those 8 associated CNVs, the CNV with the highest frequency (87.46%) was found at chr1: 242,457,549–242,519,391, while the CNV with the lowest frequency (35.10%) was localized at chr18: 46,776,812–46,983,072 (Table 2 and Table S6). Among the 12 traits, only four traits (b24*, marbling, ph6 and IMF) had significantly associated CNVs. Two CNVs: CNV9 (chr10: 9,369,752–9,462,206), and CNV12 (chr12: 56,893,678–57,020,468) had p values < 0.05 after FDR correction for more than two traits. Their frequencies were 46.76% and 79.35%, respectively.
Annotation of QTLs and genes within and near CNVs
QTL locations on the updated porcine genome sequence assembly (Sscrofa10.2) were retrieved from the PigQTLdb to compare our 8 associated CNVs with previously reported QTL locations. When considered together, 3 of the 8 significant CNVs overlapped with at least one of the known QTLs for meat quality (Table S7). For CNV1 (chr1: 242,457,549–242,519,391), which was significantly associated with IMF, two QTLs for percentage type I fibers (QTL 2794)33 and diameter of type IIb muscle fibers (QTL 2795)33 were overlapped. For CNV10 (chr10: 49,173,528–49,255,139), which was significantly associated with pH6 and IMF, one QTL for Marbling (QTL 3280)34 was overlapped. For CNV12 (chr12: 56,893,678–57,020,468), which was significantly associated with pH6, IMF and Marbling, three QTLs for CIE- color a* (QTL 21403)35 and Percentage type IIb fibers (QTL 7036 and 7021)36 were overlapped. The CNV9 (chr10: 9,369,752–9,462,206) was also near the previously reported meat quality QTL regions (pH 24 hr post mortem (ham QTL 18702)9, pH 24 h post mortem (ham QTL 18681)9 and CIE-color a* (QTL 3067)37).
Based on the Sscrofa10.2 sequence assembly, pig gene annotations within the 8 CNVs and flanking regions (500 Kb in both downstream and upstream directions) were summarized in Table S8. Among the 80 genes retrieved from the 8 regions, 22 cases were uncharacterized protein coding genes, 7 cases were RNA coding genes, and 51 cases were known protein-coding genes. Since only a limited number of genes in the pig genome have been annotated, we converted the pig official gene symbols to orthologous human genes by BioMart before Gene Ontology (GO) and pathway analysis. Ten statistically significant GO terms (P < 0.05, Table S9) and none significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were identified. A total of 12 detected genes were associated with GO terms consisting of secretion by cell, G-protein signaling, coupled to cAMP nucleotide second messenger, cAMP-mediated signaling, secretion, G-protein signaling, coupled to cyclic nucleotide second messenger, cyclic-nucleotide-mediated signaling, neuromuscular junction development, positive regulation of peptide secretion, positive regulation of cell proliferation, cell death, and death.
All of the 51 protein coding genes were searched on GenBank for their other functional information. NOD1 (the nucleotide-binding oligomerization domain-containing protein 1 gene) was found overlapped with CNV15 (chr18: 46,776,812–46,983,072), and was expressed increasingly in the adipose tissue of women with gestational diabetes38. Approximately 25 Kb downstream of the same CNV, pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 8 gene (PLEKHA8) encodes the protein of the four-phosphate adaptor protein 2 (FAPP2) which could connect vesicular transport with lipid synthesis39. Netrin-1 (Ntn1) gene which covers CNV12 (chr12: 56,893,678-57,020,468), has been reported to be highly expressed in obese but not lean adipose tissue of humans and mice40.
SNP association analysis
The result of GWAS based on SNPs was shown in Table S10. A total of 24 SNPs were genome-wide significantly associated with IMF, Marbling, Moisture, SFN, Ph6, a6* and a24*. Among the 24 significant SNPs (Table 3 and Fig. S2), 18 SNPs were located on Chromosome 12, one SNP located on Chromosomes 2 and 3 respectively. Eight SNPs were significantly associated with more than one trait. There were many overlapping significant SNPs for IMF and moisture (5/5 of moisture) and for IMF and Marbling (8/10 of IMF), which was in concordance with the results of significant associated CNVs.
Relationship between associated CNVs and associated SNPs
For all of the 8 possible combinations of CNVs and traits, we found no significant SNPs directly overlapping with them. Haplotype analysis for the CNV regions which included 25 SNPs both downstream and upstream of associated CNVs were used to detected the linkage relationship between CNVs and neighboring SNPs. And the results (Figs 2 and 3) showed four cases (CNV1, CNV12, CNV13, and CNV15) were enclosed in a haplotype block with other SNPs, none cases where CNVs directly overlapped with significantly associated SNPs, and only the block containing CNV12 is adjacent to IMF and marbling significantly associated SNPs (about downstreams of 300 kb).
In conventional CNV discovery studies, researchers usually try to identify as many CNV regions as possible. But in this CNV-based GWAS, the algorithm intended to identify the common CNVs shared among samples in order to detect associations with meat quality traits. Thus, only 48 CNVs were detected and only 15 CNVs were retained after quality control. However, the validation results indicate that all the 15 CNVs may be real. Our results indicated that there is a small discrepancy (27.6%) between qPCR and SVS 8.2 CNV callings. As we know, small variations such as SNPs, small indels may influence the hybridization of the qPCR primers and finally influence the amplification efficiency. Moreover, in this study, the overlapping rate between NGS RD-based CNV and SVS 8.2 CNV callings (100%) is better than previous NGS reports, and this is probably because we use the F0 generation pigs.
Illumina Porcine SNP60 BeadChip is designed using Duroc, Landrace, Pietran, and Large White SNP information. GWAS research using this chip have been successfully carried out in intercross population such as Large white × Min, Large white × Erhualian, Iberian × Landrace2,4,41. In the results of CNV-based association, we found 8 CNVs were significantly associated with at least one meat quality trait. These results indicate that CNVs may contribute to the difference of meat quality. The results reveal alternative variations can explain the missing heritability of complex traits.
Among these 8 CNVs, one of the interesting CNVs is CNV15. When analyzing genes within and near CNVs, we found two genes, one gene (NOD1) within the CNV and one gene (PLEKHA8) near the CNV. The PLEKHA8 gene encodes the protein of the four-phosphate adaptor protein 2 (FAPP2) which could connect vesicular transport with lipid synthesis39. FAPP2 has a pleckstrin homology (PH) domain, which can bind selectively phosphatidylinositol-4-phosphate42. And as CNV15 was significantly associated with pH24, we proposed that it might have some relationship with FAPP2.
Another interesting CNV was CNV12. After retrieving the sequence of CNV12 and blast the sequence with NCBI database, we found that CNV12 was located in the predicted gene of Ntn1. As the Blast results match one 853 bp pig EST (BW980235 full-length enriched swine cDNA library, adult intestine Sus scrofa cDNA clone ITT010048D02 5′, mRNA sequence), we inferred that there may be some problems in the genome assembling. In order to explore the relationship between CNV12 and Ntn1, we first carried out qPCR to investigate the relationship between copy number and EST expression of CNV12 and then investigate the correlation between the expressions of CNV12-EST and Ntn1-RNA. The primers were shown in Table S11, and the results were shown in Table 4 and Fig. 4. The expression pattern was similar between the DNA dosage and EST expression, and similar pattern appeared in the expressions of CNV12-EST and Ntn1-RNA, suggesting CNV12 might be involved in the regulation of Ntn1 directly. Moreover, 20 individuals from high (>5.3) and low (<1.4) IMF groups were used to explore the relationship between Ntn1-RNA expression and IMF. Figure 4 also showed that the fold changes of Ntn1-RNA expression is significant different between the two groups (with means of 1.49 and 0.76, P values < 0.05). In previous reports, Ntn1 was highly expressed in obese but not lean adipose tissue of humans and mice40 which is consistent with our results, and one of Ntn1 receptors, adenosine a2b receptor (a2bR) could inhibit adipogenesis43. Thus, we inferred that increased Ntn1 may bind with a2bR, and the decreasing of a2bR may lead to less inhibition effects on adipogenesis.
In current study, we carried out the first genome wide CNV association analysis using CNVs in pig population. However, as the Multi-variate algorithm are limited to detect small, common CNVs, some CNVs beyond the detection of Multi-variate method and the rare CNVs involved in complex traits are remain largely unknown44. Further, the estimation effect of CNV contributing to complex trait by integrating both SNPs, CNVs and other genomic variants will further help comprehensively understand the mechanism underline complex quantitative trait45. In the analysis of relationship between associated CNVs and associated SNPs, we found only one block is adjacent to tagged SNPs. Our results revealed a relatively lower CNV tagging rate comparing to cattle study13, this finding may be due to the limited sample size for common CNVs detection in CNV analysis. Thus, further analyses with a large sample size are needed to explore the precise relationship between CNVs and neighboring SNPs in pig.
In summary, previous SNP-based GWAS have been successfully used to identify significant genes or loci for complex traits. In this CNV-based GWAS study, our results indicate that meat quality traits were probably influenced by CNVs, and some of the CNVs may independently play their roles in determining the traits. These results provide us with an added source of variation to explain the missing heritability of complex traits, and the joint analysis of CNVs and SNPs could be a powerful way to potentially identify the causes of complex traits. The several candidate CNVs associated with meat quality such as CNV12 in SSC12 reported in this paper are worth further exploration.
The methods were carried out in accordance with the approved guidelines.
All animals used in this study were handled and kept following the standard guidelines of experimental animals established by Ministry of Science and Technology (Beijing, China). All of the animal experiments were approved by the Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS) (Beijing, China).
Phenotypic and rEBV values
In this study, a three-generation Large White × Min resource population (678 pigs, including 602 F2 individuals) was genotyped using Illumina PorcineSNP60 arrays. Phenotypic data of twelve meat quality traits, including IMF, marbling, moisture, shearing force, pH6, color L6*, color a6*, color b6*, Ph24, color L24*, color a24*, and color b24* in the longissimus muscle (LM) were measured for all of the 602F2 individuals following the standard guidelines of the US National Pork Producers Council (NPPC). The pH values were measured by a HANNA HI8424NEW pH Meter (HANNA, Cluj Napoca, Romania) at 6 h and 24 h postmortem. Meat color L* represented lightness, color a* represented redness and color b* represented yellowness on the cut surface of the LM were evaluated at 6 h post mortem and 24 h postmortem using a CR-410 Minolta Chroma Meter (Konica Minolta, Tokyo, Japan). Marbling scores were analyzed by a standard NPPC photographic reference (1, 2, 3, 4, 5, 6, with 1 = devoid, 6 = overly abundant) to determine of LM at 24 h post-mortem and IMF content were measured using an ether extraction method (Soxtec Avanti 2055 Fat Extraction System, Foss Tecator, Denmark). After removing fixed non-genetic effects using DMU software (v6), rEBVs estimated by breeding values (EBV) plus residual which are used as the phenotypes in association testing of the 12 meat quality traits.
CNV segmentation and genotyping and PCA-corrected association testing
The CNV segmentation and genotyping, and principal component analysis (PCA) corrected association testing were performed using Golden Helix SVS 8.2 (Golden Helix, Inc., Bozeman, MT, USA). The method and parameters are all follow the description of Xu et al.13. The array we used was Illumina Porcine SNP60 Beadchip, which contains 62,163 SNP probes. The GC correlation file was GC Reference sus_sscrofa_10.2 gc_digest.dsf. And finally, significant CNVs were considered at the level of P-value < 0.05 after FDR correction.
Association studies based on SNPs
Mix Model and Regression—Genomic Control (GRAMMAR-GC) method was used for the associated analysis between SNPs and meat quality traits46,47. The genome-wide significance threshold was decided using Bonferroni correction, in which P-value (0.05) was divided by the number of effective SNPs (12,039) estimated using simple M method48.
Haplotype block analysis and relationship between significantly combinative CNVs and SNPs
Haplotype block was employed to investigate linkage disequilibrium (LD) patterns in the regions containing significant associated CNVs using Haploview (v 4.2)49,50. In previous research, Du et al. presented a linkage disequilibrium (LD) map in pig population, and found the maximum segment was 1 cM (about 1 Mb)51. And as the average distance between SNPs was about 40 K in the PorcineSNP60 chip, 25 SNPs on the both upstream and downstream were selected directions of each CNV. All of the significant combinative CNVs (P-values < 0.05 after FDR correction) and SNPs (P-values < 4.15E-6) were used to analyze the relationship between significantly combinative CNVs and SNPs following the method described by Xu et al.13.
Validation of qPCR and next generation sequencing
The qPCR amplification was performed using an ABI 7900HT instrument (Applied Biosystems, Inc., Foster City, CA) in 384-well optical PCR plates. SYBR® Green primers were designed to query CNVs using the Primer 6 software. A 15-mlsystem containing 15 ng of genomic DNA, 150 nM each for the primers, and SYBR® Select Master Mix (ABI part number 4472908) were used for the reaction. The glucagon gene (GCG)52 was used as control of single copy control. Copy number was calculated by the method of 2−ΔΔCT 53,54, where Δ CT is the differential value of target region cycle threshold (CT) of and the control region CT. And moreover, 2−ΔΔCT stand for the comparison of the Δ CT value of samples with CNV to those without CNV. The PCR cycle was: 2 min at 50 °C, 10 min at 95 °C, 40 cycles of 15 sec at 95 °C and 1 min at 60 °C. A list of these 16 pair of primer sequences (CNVs and GCG) is shown in Table S3.
Twelve pigs (8 Min pigs and 4 Large White pigs) in the F0 generation were sequenced using Illumina HiSeq 2000. Every pig was sequenced with paired-end reads (100 bp) in two 500 bp insert size genomic DNA libraries. All paired-end reads were mapped to the genome Sscrofa10.2 (ftp://ftp.ensembl.org/pub/release-77/fasta/sus_scrofa/dna/Sus_scrofa.Sscrofa10.2.dna.toplevel.fa.gz) using BWA v0.5.955 with the parameter(−t 4 −k 32 −M and −R), and the BAM data were merged and sorted using Samtools v0.1.1856. The CNVs were detected using CNVnator v0.3 software according to the previous studies with the parameter (-call 100)32. CNV calls were filtered by the criteria of P-value < 0.01and size > 1 Kb. Calls overlapping with gaps which are larger than or equal to 5 bp in the reference genome were also excluded.
Gene content and functional analysis
The pig annotated genes were downloaded from BioMart (http://www.biomart.org/). Genes overlapping with detected CNVs and near the detected CNVs (<100 Kb) were picked out for further analysis. Annotation analysis were performed with the DAVID (http://david.abcc.ncifcrf.gov/)57 for Gene Ontology (GO) terms58 and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis59 to provide insight into the functional enrichment of copy number variable genes.
How to cite this article: Wang, L. et al. Copy number variation-based genome wide association study reveals additional variants contributing to meat quality in Swine. Sci. Rep. 5, 12535; doi: 10.1038/srep12535 (2015).
This research is support by the Agricultural Science and Technology Innovation Program (ASTIP-IAS02), National Key Technology R&D Program of China (No.2011BAD28B01), National Natural Science Foundation of China (31201781), earmarked fund for Modern Agro-industry Technology Research System, National Technology Program of China (No.2011ZX08006-003) and Chinese Academy of Agricultural Sciences Foundation (No.2014ZL006, 2011cj-5, No.2012ZL069 and 2014ywf-yb-8). G.E.L. was partially supported by Agriculture and Food Research Initiative competitive grants no. 2011-67015-30183 from National Institute of Food and Agriculture (NIFA) of the US Department of Agriculture (USDA). Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture.