Multi-strategy genome-wide association studies identify the DCAF16-NCAPG region as a susceptibility locus for average daily gain in cattle

Scientific Reports volume 6, Article number: 38073 (2016)


Average daily gain (ADG) is the most economically important trait in the beef cattle industry. Using genome-wide association study (GWAS) approaches, previous studies have identified several causal variants within the PLAG1, NCAPG and LCORL genes for ADG in cattle. In this study, multi-strategy GWASs were implemented to improve detection and to explore the causal genes and regions, based on the genotypes of 1,173 Simmental cattle. In the SNP-based GWAS, the most significant SNPs (rs109303784 and rs110058857, P = 1.78 × 10−7) were identified in the NCAPG intron on BTA6 and explained 4.01% of the phenotypic variance, and the independent and significant SNP (rs110406669, P = 5.18 × 10−6) explained 3.32% of the phenotypic variance. Similarly, in the haplotype-based GWAS, the most significant haplotype block, Hap-6-N1416 (P = 2.56 × 10−8), spanned 12.7 kb on BTA6 and explained 4.85% of the phenotypic variance. In the gene-based GWAS, seven significant genes were identified, including DCAF16 and NCAPG. Moreover, analysis of the transcript levels confirmed that the transcript abundances of NCAPG (P = 0.046) and DCAF16 (P = 0.046) were significantly correlated with the ADG trait. Overall, our results from the multi-strategy GWASs revealed the DCAF16-NCAPG region to be a susceptibility locus for ADG in cattle.


With the recent emergence of genome-wide association studies (GWASs)1, major advances have been made in the understanding and practice of functional gene discovery and quantitative trait locus (QTL) mapping2,3,4. Although the Single Nucleotide Polymorphism (SNP)-based GWAS has been useful for identifying causal variants5, this strategy has its limitations. This approach overlooks the interaction between SNPs within a gene, misses weak signals that aggregate within related SNP sets, and incurs a severe penalty for multiple testing6.

To increase the statistical power and limit the false discovery rate (FDR) associated with GWAS analyses, GWASs have been improved using haplotype-based7,8,9,10 and gene-based11,12,13,14 strategies to assess complex and quantitative traits in humans and domestic animals. The haplotype-based GWAS has high statistical power15,16 and aims to identify causal haplotypes with specific combinations7. Haplotype-based GWASs have recently identified susceptibility haplotypes or blocks for coronary artery disease7, low-density lipoprotein cholesterol8, triglyceride levels9, and boar taint10.

Because the gene-based GWAS analysis involves all variants within a gene, it reduces the number of required tests and is more powerful than the simple SNP-based GWAS17,18,19. Several gene-based GWAS methods have been developed, including the genetic similarity gene-based GWAS20, entropy-based joint analysis21, and extended Simes procedure association analysis17. Using these methods, risk genes have been successfully identified for several human diseases, including multiple sclerosis11, hypertension13, and Alzheimer’s disease14. However, there has been little gene-based GWAS research on quantitative traits in domestic animals.

In the beef cattle industry, average daily gain (ADG) is an economically important growth trait that contributes to the production efficiency and economic benefits of graziery. Table 1 lists the ADG-associated QTL positions and candidate genes that have been reported in cattle. Notably, PLAG1 and NCAPG-LCORL, known loci that are linked to adult human height22,23,24, have been associated with growth traits and body size in cattle25,26,27,28,29. The dissection of a QTL and the fine mapping of QTNs involved in bovine stature have been reported for the PLAG1 gene28. The mechanism of the effect of PLAG1 on growth and fertility has been clearly illustrated, and PLAG1 knockout mice have highlighted the importance of PLAG1 in postnatal growth and reproduction30.

Table 1: Average daily gain (ADG)-associated quantitative trait loci (QTL) in cattle.

Using 1,173 samples genotyped with the Illumina BovineHD BeadChip, multi-strategy GWASs were performed to explore candidate genes or QTL regions for the ADG trait in Simmental cattle. The transcript abundance of candidate genes was also examined and validated as being associated with the ADG trait in this study. Identifying promising candidate genes for further study will help dissect the molecular mechanisms underlying the ADG trait in cattle and has practical value in breeding programs for improving carcass weight.

Materials and Methods

Ethics statement

All animal procedures were conducted in strict accordance with the guidelines proposed by the Chinese Council on Animal Care, and all protocols were approved by the Science Research Department of the Institute of Animal Science, Chinese Academy of Agricultural Sciences (Beijing, China). The use of animals and private land in this study was approved by their respective owners.

Phenotype Data

The resource population consisted of 1,173 Simmental cattle that were born between 2008 and 2013 in Ulagai, Inner Mongolia. After weaning, all calves were transferred to a fattening farm in Beijing and fattened in the same pens for 8–12 months. All cattle were fed the same diet, which consisted of silage, brewer’s grain, bean dregs, breadcrumbs, and maize. We measured each bull’s body weight at the following five time points: birth, upon entering the fattening farm, 12 months of age, 18 months of age, and before slaughter. Growth during the fattening period was closely approximated by a linear regression of body weight on age (see Supplementary Fig. S1), and the slope of the regression line was therefore taken as the average daily gain (ADG) during the fattening period.
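As an illustration of the slope-based definition above, the following minimal sketch computes ADG as the ordinary least-squares slope of body weight on age. The function name and the example records are hypothetical and are not study data; the actual analysis may have handled time points differently.

```python
from statistics import mean


def adg_from_weights(ages_days, weights_kg):
    """Ordinary least-squares slope of body weight (kg) on age (days)."""
    if len(ages_days) != len(weights_kg) or len(ages_days) < 2:
        raise ValueError("need at least two paired (age, weight) records")
    x_bar, y_bar = mean(ages_days), mean(weights_kg)
    sxx = sum((x - x_bar) ** 2 for x in ages_days)
    if sxx == 0:
        raise ValueError("all ages are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages_days, weights_kg))
    return sxy / sxx


# Hypothetical fattening-period records for one animal (not study data):
# the animal gains exactly 1 kg per day in this toy example.
ages = [240, 365, 540, 600]
weights = [250.0, 375.0, 550.0, 610.0]
adg = adg_from_weights(ages, weights)  # slope in kg/day
```

Because the slope uses every weighing in the fattening period, it is less sensitive to a single noisy measurement than a simple (final weight − initial weight) / days calculation.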

Genotype Data

The genotypes of the 1,173 beef cattle were obtained by Illumina BovineHD BeadChip, which included 774,660 SNPs. Quality control procedures were carried out using PLINK 1.7 software31 to remove SNPs with a call rate less than 95%, a minor allele frequency (MAF) less than 0.05 and a significant deviation from the Hardy-Weinberg equilibrium (P < 10−5); moreover, animals with more than 10% missing genotypes were removed from the dataset. Missing alleles were imputed using Beagle 4.1 software32 to guarantee the accuracy and effectiveness of the statistics33,34.
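The SNP-level filters described above can be sketched as follows. This is an illustrative re-implementation, not the PLINK code path: genotypes are assumed to be coded as 0/1/2 copies of one allele with None for missing calls, the function name is hypothetical, and a chi-square goodness-of-fit test stands in for whatever Hardy-Weinberg test PLINK applied. The animal-level filter (more than 10% missing genotypes) is not shown.

```python
import math


def snp_qc(genotypes, min_call_rate=0.95, min_maf=0.05, hwe_p=1e-5):
    """Return (passes, stats) for one SNP coded 0/1/2, with None for missing."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return False, {"call_rate": 0.0, "maf": 0.0, "hwe_p": 1.0}
    call_rate = n / len(genotypes)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)  # frequency of the counted allele
    maf = min(p, 1 - p)
    # Chi-square goodness of fit to Hardy-Weinberg proportions (1 df).
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_hwe = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    passes = call_rate >= min_call_rate and maf >= min_maf and p_hwe >= hwe_p
    return passes, {"call_rate": call_rate, "maf": maf, "hwe_p": p_hwe}


# Toy SNPs (not study data).
ok, ok_stats = snp_qc([0] * 49 + [1] * 42 + [2] * 9)        # common, in HWE
rare, _ = snp_qc([0] * 98 + [1] * 2)                        # MAF = 0.01
sparse, _ = snp_qc([0] * 40 + [1] * 40 + [2] * 10 + [None] * 10)  # 90% called
```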

Gene Annotation

A total of 24,596 genes were downloaded from the Ensembl Genes database (UMD3.1), including the coding and non-coding RNA. To address the regulatory regions and linkage disequilibrium in SNPs18,35, we defined the gene boundary as extending 50 kb upstream and downstream of the gene. Only genes covered by three or more SNPs on the genotyping BeadChip were retained, leaving 23,856 genes for analysis.
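The gene-boundary rule can be sketched as a simple interval lookup. The data structures, names and coordinates below are hypothetical; the sketch only assumes that SNP positions are sorted within each chromosome and that a gene is kept when at least three SNPs fall inside its boundary extended by 50 kb on each side.

```python
from bisect import bisect_left, bisect_right

FLANK_BP = 50_000  # 50 kb on each side of the gene
MIN_SNPS = 3       # genes with fewer SNPs are dropped


def snps_per_gene(genes, snp_positions, flank=FLANK_BP, min_snps=MIN_SNPS):
    """genes: {name: (chrom, start, end)}; snp_positions: {chrom: sorted bp list}."""
    kept = {}
    for name, (chrom, start, end) in genes.items():
        positions = snp_positions.get(chrom, [])
        lo = bisect_left(positions, max(0, start - flank))
        hi = bisect_right(positions, end + flank)
        if hi - lo >= min_snps:
            kept[name] = positions[lo:hi]
    return kept


# Hypothetical coordinates (not taken from UMD3.1).
genes = {
    "geneA": ("6", 1_000_000, 1_020_000),
    "geneB": ("6", 5_000_000, 5_010_000),
}
snps = {"6": [940_000, 960_000, 1_010_000, 1_069_000, 1_080_000, 5_005_000]}
kept = snps_per_gene(genes, snps)  # geneB has a single SNP and is dropped
```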

SNP-based GWAS

A standard MLM for GWAS was performed by extending the Henderson notation as follows:

$$y = \mu + Wv + X\beta_i + Zu + e$$

where y represented a vector of ADG, μ represented the population mean, v represented a vector of fixed effects, βi denoted the effect of the ith SNP, u represented a vector of the polygenic effects and e represented the residual. W, X and Z represented the incidence matrices for v, βi and u. Z was the genetic additive matrix constructed by SNPs, termed as kinship. As described by Lopes36, we built kinship using 50,000 random SNPs across autosomes. In this model, we considered sex, birth year, calving season and population stratification as fixed effects.

The percent phenotypic variance that was explained by a single significant SNP was calculated as follows:

$$V_{p_i} = \frac{2 p_i q_i \beta_i^2}{\sigma_p^2} \times 100\%$$

where pi and qi represented the allele frequencies for the ith SNP, βi denoted the effect of the ith SNP, and σp² represented the phenotypic variance. The R package heritability was used to estimate the ADG-associated heritability and genetic variance.
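As a worked example of this calculation, the sketch below assumes the usual single-locus form, 2·p·q·β² divided by the phenotypic variance and expressed as a percentage. The allele frequency and effect size are hypothetical; the phenotypic variance is taken as the square of the 0.16 kg/day standard deviation reported in the Results.

```python
def percent_variance_explained(allele_freq, effect, phenotypic_variance):
    """Percent of phenotypic variance explained by one biallelic SNP.

    Assumes the additive single-locus form 2 * p * q * beta**2 / sigma_p**2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie in [0, 1]")
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic_variance must be positive")
    p = allele_freq
    q = 1.0 - p
    return 100.0 * 2.0 * p * q * effect ** 2 / phenotypic_variance


sigma_p2 = 0.16 ** 2  # (kg/day)^2, from the reported SD of 0.16 kg/day
# Hypothetical SNP: allele frequency 0.30, allele effect 0.05 kg/day.
vp = percent_variance_explained(0.30, 0.05, sigma_p2)  # roughly 4.1 percent
```

With these illustrative inputs, an allele effect of only 0.05 kg/day at a frequency of 0.30 already accounts for about 4% of the phenotypic variance, which is the same order of magnitude as the values reported for the peak SNPs.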

Haplotype-based GWAS

Haplotype-based GWAS was performed using the method proposed by Gregersen et al.10. Haplotype blocks were established based on pairwise measures of linkage disequilibrium (LD)37 and implemented using the PLINK 1.7 software with a block window of less than 100 kb. The haplotype block estimation option was --blocks --ld-window-kb 100. After haplotype block partitioning, haplotypes for each sample were inferred using a standard expectation-maximization (EM) algorithm implemented in the R package haplo.stats. Haplotype association analyses were implemented in the R package lme4 using the following MLM equation:

$$y = \mu + Wv + \sum_{i=1}^{t} H_i \beta_{ij} + Zu + e$$

where y represented a vector of ADG, μ represented the population mean, v represented a vector of fixed effects, βij represented the effect of the ith haplotype in the jth block (which contained t haplotypes), u denoted polygenic effects for each individual, and e represented the residual. W, Hi and Z were the incidence matrices for v, βij, and u. A Chi-square hypothesis test with df = 1 was used to calculate the significance level of the haplotype block as follows:

$$\chi^2 = \frac{\hat{\beta}_{\max,j}^{2}}{\mathrm{Var}(\hat{\beta}_{\max,j})}$$

where $\hat{\beta}_{\max,j}$ denoted the estimated effect of the maximum-effect haplotype at the jth block, and $\mathrm{Var}(\hat{\beta}_{\max,j})$ represented the variance of $\hat{\beta}_{\max,j}$ obtained via the mixed model equations.
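A one-degree-of-freedom test of this kind can be sketched as a Wald-type statistic, the squared effect estimate divided by its sampling variance, which is one common reading of the description above. The function name and the numerical inputs are hypothetical.

```python
import math


def wald_chi2_test(effect, effect_variance):
    """Return (chi-square statistic, P-value) for H0: effect = 0, with 1 df."""
    if effect_variance <= 0:
        raise ValueError("effect_variance must be positive")
    chi2 = effect ** 2 / effect_variance
    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical estimate: effect 0.20 kg/day with sampling variance 0.0016.
chi2, p_value = wald_chi2_test(0.20, 0.0016)
```

In this toy case the statistic is 25, equivalent to a two-sided normal test at five standard errors.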

The percent phenotypic variance (Vpj) explained by the jth block was calculated using a two-step approach. First, the effects of the haplotypes at the jth block were estimated using the least squares (LS) method, and all jth block haplotypes were clustered into two groups (G1 and G2) based on the estimated effects. Each sample was then coded as 0, 1, or 2 (G1/G1, G1/G2 and G2/G2) according to the EM results. We then calculated Vpj as follows:

$$V_{pj} = \frac{2pq\beta^2}{\sigma_p^2} \times 100\%$$

where β represented the regression coefficient of the phenotype on the indicator (0, 1 and 2), and p and q indicated the frequencies of G1 and G2, respectively.
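The two-group recoding step can be illustrated as follows. The clustering rule used in the study is not stated, so the sketch assumes, purely for illustration, that haplotypes are split at the mean of their estimated effects; the haplotype labels and effect values are hypothetical.

```python
def code_diplotypes(hap_effects, diplotypes):
    """Code each animal as the number of G2 haplotypes it carries (0, 1 or 2).

    Haplotypes are split into G1/G2 at the mean estimated effect, which is an
    assumed rule for this sketch, not the rule used in the study.
    """
    cutoff = sum(hap_effects.values()) / len(hap_effects)
    g2 = {hap for hap, effect in hap_effects.items() if effect > cutoff}
    return [sum(hap in g2 for hap in pair) for pair in diplotypes]


# Hypothetical least-squares haplotype effects (kg/day) within one block.
effects = {"H1": 0.08, "H2": 0.10, "H3": 0.40, "H4": 0.45}
# Hypothetical phased haplotype pairs for three animals.
animals = [("H1", "H2"), ("H1", "H3"), ("H3", "H4")]
codes = code_diplotypes(effects, animals)  # G1/G1, G1/G2, G2/G2
```

The resulting 0/1/2 indicator can then be treated like a biallelic SNP, with p and q as the group frequencies and β as the regression coefficient of the phenotype on the indicator.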

Gene-based GWAS

We conducted a gene-based GWAS using a principal component analysis (PCA) according to the method proposed by Kai Wang et al.38. First, principal components (PCs) were constructed based on an intragenic SNP indicator, and we selected the PCs based on a cumulative contributed proportion >85%. Second, the estimated breeding value (EBV) was calculated based on genomic best linear unbiased prediction (GBLUP) with fixed effects (sex, birth year, calving season and population stratification) and random effects (polygenic effects). Third, the effect of each PC was estimated and a statistical hypothesis test was performed. The general linear model was:

$$y = X b_i + e$$

where bi represented the regression coefficient of the phenotype on the PC, X represented the vector of the PC, and e represented the residual. The following Chi-square hypothesis testing (df = 1) formula was used:

$$\chi^2 = \frac{\hat{b}_i^{2}}{\mathrm{Var}(\hat{b}_i)}$$

For each gene, we selected the minimum P-value for the PCs when the PC number exceeded two. The significance threshold was set by permutation testing to control false-positive discoveries39. We performed 1,000 permutation cycles (23,856,000 tests in total), and the 240,000th highest value represented the cut-off point for the 1% level of significance.
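Two steps of the gene-based procedure lend themselves to a short sketch: choosing how many PCs to keep under the 85% cumulative-variance rule, and reading a cut-off from pooled permutation results. The function names are hypothetical, the eigenvalues and null P-values are toy inputs, and the cut-off function assumes that null P-values are ranked from smallest to largest.

```python
def n_pcs_for_threshold(eigenvalues, threshold=0.85):
    """Smallest number of leading PCs whose cumulative variance share exceeds threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)


def permutation_cutoff(null_p_values, alpha=0.01):
    """P-value at the alpha quantile of the pooled permutation null (assumed rule)."""
    ordered = sorted(null_p_values)
    rank = max(1, int(alpha * len(ordered) + 1e-9))
    return ordered[rank - 1]


# Toy eigenvalues for one gene: shares 0.50, 0.30, 0.10, 0.06, 0.04.
n_pcs = n_pcs_for_threshold([5.0, 3.0, 1.0, 0.6, 0.4])  # three PCs pass 85%

# Size of the permutation null described in the text: 1,000 cycles x 23,856 genes.
n_tests = 1_000 * 23_856
rank_1pct = n_tests // 100  # 238,560, close to the 240,000th value quoted above

# Toy null distribution of 1,000 evenly spaced P-values.
cutoff = permutation_cutoff([i / 1000 for i in range(1, 1001)])
```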

Gene expression level

To validate whether the genes identified by the three GWAS methods were associated with the ADG trait, transcript abundance in longissimus dorsi muscle tissue was measured. We randomly selected 28 steers in 2014. Longissimus dorsi muscle samples were collected from the steers at slaughter and stored in liquid nitrogen. Total RNA was isolated using the TRIzol Reagent total RNA extraction kit (Invitrogen, Carlsbad, CA, USA) and precipitated with ethanol.

Primers were designed using the Primer 5 software and were approximately 200 bp in length (Supplementary Table S1). Real-time PCR was performed to examine the expression level of selected genes using the SYBR® Fast qPCR Mix (Takara Bio, Otsu, Japan) with the Applied Biosystems® 7500 Real-Time PCR Systems (Applied Biosystems, Foster City, CA, USA). Expression values were normalized to GAPDH as the internal control. The mean fold change in expression of the target genes was calculated using the 2−ΔΔCt method.
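The relative-expression calculation can be written out as a short sketch of the standard 2^(−ΔΔCt) formula, with GAPDH as the reference gene as stated above. The Ct values and the choice of calibrator sample are hypothetical.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method."""
    delta_sample = ct_target_sample - ct_ref_sample              # dCt, sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt, calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: target gene vs GAPDH in a sample and a calibrator.
fc = fold_change(24.0, 18.0, 26.0, 18.5)  # ddCt = -1.5, so about 2.83-fold
```

A negative ΔΔCt means the target reaches threshold earlier in the sample than in the calibrator after normalization, that is, higher relative expression.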

Correlation analyses were conducted using R version 3.2.2 (18/3/2016). Correlations between candidate gene expression and phenotypic data were derived for the 28 randomly selected steers from the same year. A general linear model (GLM) was used, with calving season and population stratification as fixed effects.

Results and Discussion

Phenotype description and genetic parameters

The phenotypic distribution followed a Gaussian distribution with a mean of 0.98 kg/day, a maximum of 1.87 kg/day, a minimum of 0.54 kg/day, and a standard deviation (SD) of 0.16 kg/day. The heritability (h2) of the average daily gain (ADG) was 0.48, with an additive genetic variance (Va) equal to 0.012.

Following the quality control and imputation, 1,141 samples with 669,742 SNPs remained. Cleaned SNPs were uniformly distributed over the whole bovine genome with a mean inter-marker space of 4.52 kb.

SNP-based GWAS results

In this study, we used three strategies to perform a genome-wide association study (GWAS) for the ADG trait in beef cattle (Fig. 1). In the SNP-based association, we identified 40 distinct SNPs (Supplementary Table S2) that exceeded the suggested significance threshold (P < 10−6), 38 of which were located within BTA6 (Fig. 1a). The most significant SNPs, rs109303784 and rs110058857, were located on BTA6 and had identical P-values of 1.78 × 10−7. These two SNPs were 680 bp apart, were in complete linkage disequilibrium (r2 = 1), and explained 4.01% of the phenotypic variance. Rs109303784 and rs110058857 were both located upstream of NCAPG and downstream of the DCAF16 gene according to the Ensembl genome database. Figure 2 shows the regional −Log10 (P-value) of the significant SNPs that surround the DCAF16-NCAPG locus on BTA6. We also calculated the LD levels relative to the two peak SNPs, denoted by different colors. Notably, we found that rs110406669 (P = 5.18 × 10−6) had a low LD with the two peak SNPs and independently explained 3.32% of the phenotypic variance. Moreover, two other prominent SNPs, rs109028700 (BTA5:43111315) and rs137683327 (BTA5:84944556), were located on BTA5 and explained 2.59% and 2.87% of the phenotypic variance, respectively.
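For readers unfamiliar with the r2 measure used above, a minimal sketch follows. It computes the squared Pearson correlation between the genotype dosages (0/1/2) of two SNPs, which is a common approximation of pairwise LD for unphased data; whether the study used exactly this estimator is an assumption, and the genotype vectors are toy data.

```python
def r_squared(dosage_a, dosage_b):
    """Squared Pearson correlation between two SNP dosage vectors (0/1/2)."""
    if len(dosage_a) != len(dosage_b) or not dosage_a:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    n = len(dosage_a)
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov ** 2 / (var_a * var_b)


# Toy genotypes for six animals (not study data).
snp1 = [0, 1, 2, 0, 1, 2]
snp2 = [0, 0, 1, 1, 2, 2]
complete_ld = r_squared(snp1, snp1)  # identical genotypes give r2 = 1
partial_ld = r_squared(snp1, snp2)   # 0.25 for these toy vectors
```

Two SNPs with r2 = 1, such as the two peak SNPs, carry the same association information, which is why they share a P-value and a single variance-explained estimate.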

Figure 1: Results of the multi-strategy GWAS for average daily gain.

(a) Manhattan plots for the SNP-based GWAS. (b) Manhattan plots for the haplotype-based GWAS. (c) Log10 (P-value) values of 23,856 genes in the gene-based GWAS.

Figure 2: Regional −Log10 (P-value) plot of the SNP-based and haplotype-based association around the DCAF16-NCAPG locus on BTA6: 38.6–39.0(Mb).

The yellow bar represents the block position. The purple triangle represents the two most significant SNPs (rs109303784 and rs110058857, r2 = 1). SNPs were colored based on their LDs with two most significant SNPs as follows: red SNPs with LDs at r2 > 0.9, pink SNPs with LDs at r2 > 0.7, orange SNPs with LDs at r2 > 0.5, yellow SNPs with LDs at r2 > 0.3 and grey SNPs with LDs at r2 < 0.3. The size of the plots indicates the significance level of SNPs in the SNP-based GWAS. The positions of all RefSeq genes were downloaded from the ENSEMBL database.

Haplotype-based GWAS results

A total of 93,732 blocks were identified, and these blocks comprised 615,355 SNPs. The maximum block length was 99.9 Kb, and the minimum length was 0.4 Kb. Fourteen significant haplotype blocks (shown in Table 2) were obtained at the suggested threshold (P < 10−5) across 5 chromosomes (BTA3, BTA6, BTA7, BTA12, and BTA19). Similar to the SNP-based GWAS, 7 associated haplotype blocks that surrounded rs109303784 and rs110058857 were found on BTA6 (Fig. 2). The most significant block, Hap-6-N1416 (P = 2.56 × 10−8), spanned 22.8 Kb and was located upstream of NCAPG, 12.7 Kb from rs109303784. The Hap-6-N1416 block explained 4.85% of the phenotypic variance and had 7 distinct haplotypes (GTGGATA, GTGAATA, GTAAATA, ACAGGCG, ACAAGCG, ACAAATA and ATAAATA, referred to as Haplo1, Haplo2, Haplo3, Haplo4, Haplo5, Haplo6 and Haplo7) with frequencies of 0.13%, 2.67%, 4.94%, 19.36%, 5.74%, 1.34% and 65.82%, respectively. The average effect was 0.24 kg/day, with a minimum of 0.08 kg/day for Haplo3 and a maximum of 0.45 kg/day for Haplo5.

Table 2: Significant haplotypes from the haplotype-based GWAS.

In contrast to the SNP-based GWAS results, no prominent block was found on BTA5, but 5 blocks were identified on BTA3, 7, 12 and 19. However, no gene regions or coding domains coincided with these blocks. Notably, Hap-3-N3218 (P = 1.7 × 10−7) on BTA3 contained 3 extragenic SNPs (rs109934393, rs43349539 and rs43348574) that explained 6.22% of phenotypic variance. These results indicated that unknown functional regions or regulatory elements may exist around this identified block.

Gene-based GWAS results

A total of 24,616 genes were annotated in the Ensembl database. For the gene-based association, 23,856 genes with an average of 34.7 SNPs per gene were analyzed. The other 760 genes were excluded because they contained fewer than three SNPs or were not located on autosomes (sex chromosomes or mitochondrial DNA). The 1,000 permutation-cycle results suggested a P-value threshold of 10−3 with an FDR < 1%. Seven genes were identified for ADG in this study (Table 3). Specifically, DCAF16 and NCAPG were also implicated by the SNP- and haplotype-based association results. We also found two small nucleolar RNAs, SNORD50 and SNORD87, with identical functions in the modification process of other small nuclear RNAs (snRNAs). Additionally, two uncharacterized proteins (ENSBTAG00000038625 and ENSBTAG00000024272) were obtained. These results indicated that the gene-based method can identify functional genes or loci that were previously unverified and can provide a possible structural basis for further gene functional validation studies.

Table 3: Seven significant ADG-associated genes based on the gene-based GWAS.

DCAF16-NCAPG locus associated with ADG

Taken together, 163 significant SNPs were identified by the three GWAS strategies (for the gene-based set, the SNPs counted were those within significant genes). A Venn diagram summarizing the results of the three strategies is shown in Fig. 3. Here, the SNP- and haplotype-based GWAS approaches returned distinct sets of 8 and 44 prominent SNPs, respectively. Five genes (PTPRR, LMNTD1, FAM114A2, C8A and STARD13) were proximal to these 52 significant SNPs, suggesting associations for some of these genes with the ADG trait. We focused on the intersection of candidate SNPs identified by all three GWAS methods, which had the highest ADG trait-associated accuracy and included 28 significant SNPs located at 38.6–39.0 Mb on BTA6. Figure 2 shows a schematic diagram of the region, which contains four annotated genes (FAM184B, DCAF16, NCAPG, and LCORL) from the Ensembl genome database.

Figure 3: Venn diagram summarizing the association analyses results of the three strategies.

The numbers represent the intersections and the remaining significant SNPs identified by the three GWAS methods.

DCAF16, which was near the peak SNPs of the SNP-based GWAS, was the most significant gene (P = 6.45 × 10−5) in the gene-based GWAS analysis. Similarly, the most significant block, Hap-6-N1416 (P = 2.56 × 10−8), was also located downstream of DCAF16 (physical distance = 19,663 bp) according to the Ensembl database. DCAF16 may function as a substrate receptor for the CUL4-DDB1 E3 ubiquitin-protein ligase complex, which is involved in two pathways that promote protein modifications and ubiquitination. NCAPG, which was also identified by all three GWAS methods, encodes a subunit of the condensin complex, which is responsible for the condensation and stabilization of chromosomes during mitosis and meiosis. The associated pathways involve the cell cycle, mitosis and the mitotic prometaphase. Numerous studies25,26,40,41,42,43,44,45,46,47,48,49,50,51,52 have confirmed that NCAPG has strong effects on the body sizes and growth traits of humans and domestic animals. In the association analyses of Lindholm-Perry et al.40, 47 SNPs within or near the gene boundaries of three candidate genes (NCAPG, LCORL and LAP3) were genotyped. Figure 4 shows a comparison of those association results with our SNP-based GWAS results. In contrast to our results, the most significant SNPs in that study were located in the LCORL gene. However, most of the significant SNPs from the two analyses were located around 38.78 Mb on BTA6, near the downstream region of DCAF16, suggesting that this region might be a more effective QTL for the ADG trait in cattle.

Figure 4: Regional plot of our GWAS results versus association analysis results by Lindholm-Perry40.

The black circles represent the −Log10 (P-value) of our SNP-based GWAS, and the red squares represent the −Log10 (P-value) of the previous association analysis. The purple triangle represents SNP c.1326 T > G, which is the Ile442 to Met442 amino acid change, in exon 9 of NCAPG.

Additionally, a missense mutation (c.1326 T > G, indicated in Fig. 4 by a purple triangle) was identified in exon 9 of NCAPG by several association26,45 and linkage analyses25,41. The resulting amino acid change of Ile442 to Met442 in the encoded protein has been shown to be a candidate causative variation for growth traits in cattle. Significant selection regions that affect the statures of European and African cattle cohorts were identified in NCAPG by multiple selection signal analyses49. GWAS analyses in horses42,43,46,47 and cattle48,51 indicated that the NCAPG-LCORL locus or nearby regions were significantly associated with body size and growth traits.

Based on our results and previous reports44,52, we tested DCAF16, NCAPG, and LCORL expression in muscle tissues. Longissimus dorsi muscle samples from 28 steers with ADG phenotypes were collected. General linear model (GLM) results showed that DCAF16 and NCAPG expression was significantly associated with the ADG trait (Table 4); correlations between ADG and gene expression are presented in Supplementary Fig. S3. No significant difference was detected for the LCORL gene. Our results were concordant with those of Perry et al.44, in which the abundance of NCAPG was associated with ADG in muscle tissue from cows.

Table 4: Target gene expression in muscle tissue and estimated effects for ADG.

In the NCBI database, the NCAPG gene has one reference transcript (GenBank accession number: NM_001102376) and two predicted transcripts (XM_005207785 and XM_015471561), which were derived by a computational analysis using transcriptome data from 11 Hereford cattle. The differences between the three transcripts occur in exon 1 (Supplementary Fig. S4). Primers for the three transcripts were designed using the Primer 5 software (Supplementary Table S3). We demonstrated the existence of the three transcripts in Simmental cattle using reverse transcription polymerase chain reaction (RT-PCR) (Supplementary Fig. S2), and the PCR product sequences were consistent with those reported in the NCBI database. To assess the association between the abundance of each transcript and the ADG trait, we also tested the expression levels of the three transcripts. GLM results showed that XM_005207785 (P = 0.050) expression was significantly associated with ADG, while no significant correlation was found for the NM_001102376 (P = 0.597) and XM_015471561 (P = 0.074) transcripts (Table 4).

Overall, DCAF16 and NCAPG were identified by all three GWAS methods, and statistical analyses showed that the abundance of DCAF16 and of one NCAPG transcript (XM_005207785) was associated with the ADG trait, indicating that the DCAF16-NCAPG region is a susceptibility locus for the ADG trait in cattle.

Furthermore, we noticed that the independent and significant SNP (rs110406669) from the SNP-based GWAS was located 30,695 bp upstream (5′) of XM_005207785. The two peak SNPs were located in intron 1 of XM_005207785 and upstream of DCAF16 at distances of 6,970 and 7,650 bp. We then searched for transcription factor-binding (TF) sites around the candidate regions using the Tfsitescan software on the MIRAGE WWW server. The regions, which contained ±5 Kb flanking sequences of the significant SNPs (rs109303784, rs110058857 and rs110406669), were analyzed, and Table 5 shows the Tfsitescan results. The distances between the two most significant TF sites identified here (Nmp4-COL1A1-sit and AT2-VIRE) and the significant SNPs were 130 bp and 178 bp, respectively. It has been shown that Nmp4-COL1A1-sit influences cell structure and function during extracellular matrix remodeling in osteoblasts53. The protein product of AT2-VIRE, the AT2 receptor, is widely and abundantly expressed in fetal tissues and plays a pivotal role in cell differentiation and growth31. Moreover, similar TF site sequences were found upstream of the NCAPG gene in various species (Supplementary Table S4). Taken together, we propose that Nmp4-COL1A1-sit, AT2-VIRE or other TF sites are probably involved in the regulation of DCAF16 or NCAPG transcript expression in association with the ADG trait.

Table 5: List of transcription factor-binding (TF) sites around the NCAPG-LCORL locus.


In this study, we performed multi-strategy GWASs to investigate average daily gain (ADG) in Simmental beef cattle. Forty significant SNPs in the SNP-based GWAS, 14 significant haplotype blocks in the haplotype-based GWAS, and 7 prominent genes in the gene-based GWAS were identified. Two genes, DCAF16 and NCAPG, were shown to be associated with ADG by all three GWAS methods. Most importantly, the significant SNPs within the NCAPG-DCAF16 region were strongly associated with the ADG trait, explaining approximately 4% of the phenotypic variance, which suggests the existence of causal variants in this region. Moreover, we have also shown that DCAF16 and NCAPG expression was significantly associated with ADG. Our findings provide insights into the genetic mechanisms underlying the ADG trait in cattle, and these results inform future NGS-GWAS analyses of causal variants for the ADG trait. Moreover, multi-strategy GWASs represent a powerful approach for identifying and analyzing trait-associated susceptibility loci.

Additional Information

How to cite this article: Zhang, W. et al. Multi-strategy genome-wide association studies identify the DCAF16-NCAPG region as a susceptibility locus for average daily gain in cattle. Sci. Rep. 6, 38073; doi: 10.1038/srep38073 (2016).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  1. 1.

    et al. Complement factor H polymorphism in age-related macular degeneration. Science 308, 385–389 (2005).

  2. 2.

    et al. Genome-Wide Association Study Identifies Nox3 as a Critical Gene for Susceptibility to Noise-Induced Hearing Loss. Plos Genet 11, doi: 10.1371/journal.pgen.1005094 (2015).

  3. 3.

    et al. Genome-Wide Association Studies in Dogs and Humans Identify ADAMTS20 as a Risk Variant for Cleft Lip and Palate. Plos Genet 11, doi: 10.1371/journal.pgen.1005059 (2015).

  4. 4.

    et al. A genome wide association study for backfat thickness in Italian Large White pigs highlights new regions affecting fat deposition including neuronal genes. Bmc Genomics 13, doi: 10.1186/1471-2164-13-583 (2012).

  5. 5.

    , , & Five years of GWAS discovery. Am J Hum Genet 90, 7–24 (2012).

  6. 6.

    et al. A powerful and efficient set test for genetic markers that handles confounders. Bioinformatics 29, 1526–1533 (2013).

  7. 7.

    et al. Genome-wide haplotype association study identifies the SLC22A3-LPAL2-LPA gene cluster as a risk locus for coronary artery disease. Nat Genet 41, 283–285 (2009).

  8. 8.

    , & Genome-wide haplotypic testing in a Finnish cohort identifies a novel association with low-density lipoprotein cholesterol. Eur J of Hum Genet 23, 672–677 (2015).

  9. 9.

    et al. Genome-Wide Association Studies Using Haplotypes and Individual SNPs in Simmental Cattle. Plos One 9, doi: 10.1371/journal.pone.0109330 (2014).

  10. 10.

    et al. Genome-wide association scan and phased haplotype construction for quantitative trait loci affecting boar taint in three pig breeds. Bmc Genomics 13, doi: 10.1186/1471-2164-13-22 (2012).

  11. 11.

    , , & Susceptibility Genes for Multiple Sclerosis Identified in a Gene-Based Genome-Wide Association Study. J Clin Neurol 11, 311–318 (2015).

  12. 12.

    et al. Voxelwise gene-wide association study (vGeneWAS): Multivariate gene-based association testing in 731 elderly subjects. Neuroimage 56, 1875–1891 (2011).

  13. 13.

    et al. Identification of IGF1, SLC4A4, WWOX, and SFMBT1 as Hypertension Susceptibility Genes in Han Chinese with a Genome-Wide Gene-Based Association Study. Plos One 7, doi: 10.1371/journal.pone.0032907 (2012).

  14. 14.

    et al. Genome-wide and gene-based association implicates FRMD6 in alzheimer disease. Hum Mutat 33, 521–529 (2012).

  15. 15.

    The role of haplotypes in candidate gene studies. Genet Epidemiol 27, 321–333 (2004).

  16. 16.

    Evaluating associations of haplotypes with traits. Genet Epidemiol 27, 348–364 (2004).

  17. 17.

    , , & GATES: A Rapid and Powerful Gene-Based Association Test Using Extended Simes Procedure. Am J Hum Genet 88, 283–293 (2011).

  18. 18.

    et al. A Versatile Gene-Based Test for Genome-wide Association Studies. Am J Hum Genet 87, 139–145 (2010).

  19. 19.

    , & Gene-based Genomewide Association Analysis: A Comparison Study. Curr Genomics 14, 250–255 (2013).

  20. 20.

    et al. A new gene-based association test for genome-wide association studies. BMC Proc 3 Suppl 7, S130, doi: 10.1186/1753-6561-3-S7-S130 (2009).

  21. 21.

    & Entropy-based joint analysis for two-stage genome-wide association studies. J Hum Genet 52, 747–56 (2007).

  22. 22.

    et al. Association, effects and validation of polymorphisms within the NCAPG-LCORL locus located on BTA6 with feed intake, gain, meat and carcass traits in beef cattle. Bmc Genet 12, doi: 10.1186/1471-2156-12-103 (2011).

  23. 23.

    et al. Identification and fine mapping of quantitative trait loci for growth traits on bovine chromosomes 2, 6, 14, 19, 21, and 23 within one commercial line of Bos taurusi. J Anim Sci 82, 3405–3414 (2004).

  24. 24.

    et al. Primary genome scan to identify putative quantitative trait loci for feedlot growth rate, feed intake, and feed efficiency of beef cattle. Journal of Animal Science 85, 3170–3181 (2007).

  25. 25.

    et al. Polymorphisms and haplotypes in the bovine neuropeptide Y, growth hormone receptor, ghrelin, insulin-like growth factor 2, and uncoupling proteins 2 and 3 genes and their associations with measures of growth, performance, feed efficiency, and carcass merit in beef cattle. Journal of Animal Science 86, 1–16 (2008).

  26. 26.

    et al. Genetic association between GHSR1a 5′UTR-microsatellite and nt-7(C > A) loci and growth and carcass traits in Japanese Black cattle. Animal Science Journal 82, 396–405 (2011).

  27. 27.

    et al. Genome-wide association analysis for feed efficiency in Angus cattle. Anim Genet 43, 367–374 (2012).

  28. 28.

    et al. The identification of common haplotypes on bovine chromosome 5 within commercial lines of Bos taurus and their associations with growth traits. Journal of Animal Science 80, 1187–1194 (2002).

  29. 29.

    Bayesian genome wide association analyses of growth and yearling ultrasound measures of carcass traits in Brangus heifers (vol 90, pg 3398, 2012). Journal of Animal Science 91, 1522–1522 (2013).

  30. 30.

    et al. Genome-wide association analyses for growth and feed efficiency traits in beef cattle. Journal of Animal Science 91, 3612–3633 (2013).

  31. 31.

    et al. Bivariate Genome-Wide Association Analysis of the Growth and Intake Components of Feed Efficiency. PLoS One 8, doi: 10.1371/journal.pone.0078530 (2013).

  32.

    et al. A critical functional missense mutation (H173R) in the bovine PROP1 gene significantly affects growth traits in cattle. Gene 531, 398–402 (2013).

  33.

    et al. Haplotypes in the promoter region of the CIDEC gene associated with growth traits in Nanyang cattle. Scientific Reports 5, doi: 10.1038/srep12075 (2015).

  34.

    et al. Identification of 15 loci influencing height in a Korean population. J Hum Genet 55, 27–31 (2010).

  35.

    et al. Many sequence variants affecting diversity of adult human height. Nature Genetics 40, 609–615 (2008).

  36.

    et al. Genome-wide association analysis identifies 20 loci that influence adult height. Nature Genetics 40, 575–583 (2008).

  37.

    et al. Dissection of Genetic Factors Modulating Fetal Growth in Cattle Indicates a Substantial Role of the Non-SMC Condensin I Complex, Subunit G (NCAPG) Gene. Genetics 183, 951–964 (2009).

  38.

    et al. The SNP c.1326T > G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Animal Genetics 42, 650–655 (2011).

  39.

    et al. Genome-Wide Association Study Identifies Two Major Loci Affecting Calving Ease and Growth-Related Traits in Cattle. Genetics 187, 289–297 (2011).

  40.

    et al. Variants modulating the expression of a chromosome domain encompassing PLAG1 influence bovine stature. Nature Genetics 43, 405–413 (2011).

  41.

    et al. A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nature Genetics 41, 527–534 (2009).

  42.

    , , , & Emerging role of PLAG1 as a regulator of growth and reproduction. Journal of Endocrinology 228, R45–R56 (2016).

  43.

    et al. Stimulation of different subtypes of angiotensin II receptors, AT1 and AT2 receptors, regulates STAT activation by negative crosstalk. Circ Res 84, 876–82 (1999).

  44.

    et al. Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics 194, 459–71 (2013).

  45.

    et al. Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics 12, 443–451 (2011).

  46.

    et al. Haplotype phasing: existing methods and new developments. Nature Reviews Genetics 12, 703–714 (2011).

  47.

    et al. Classic Selective Sweeps Revealed by Massive Sequencing in Cattle. PLoS Genet 10, doi: 10.1371/journal.pgen.1004148 (2014).

  48.

    et al. Improved estimation of inbreeding and kinship in pigs using optimized SNP panels. BMC Genet 14, doi: 10.1186/1471-2156-14-92 (2013).

  49.

    , , & Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–5 (2005).

  50.

    & A principal components regression approach to multilocus genetic association studies. Genetic Epidemiology 32, 108–118 (2008).

  51.

    & Empirical threshold values for quantitative trait mapping. Genetics 138, 963–71 (1994).

  52.

    et al. Cross-breed comparisons identified a critical 591-kb region for bovine carcass weight QTL (CW-2) on chromosome 6 and the Ile-442-Met substitution in NCAPG as a positional candidate. BMC Genet 10, doi: 10.1186/1471-2156-10-43 (2009).

  53.

    et al. Four loci explain 83% of size variation in the horse. PLoS One 7, e39929 (2012).

  54.

    et al. A genome-wide association study reveals loci influencing height and other conformation traits in horses. PLoS One 7, e37282 (2012).

  55.

    et al. Adipose and muscle tissue gene expression of two genes (NCAPG and LCORL) located in a chromosomal region associated with cattle feed intake and gain. PLoS One 8, e80882 (2013).

  56.

    et al. Comparison of the effects explained by variations in the bovine PLAG1 and NCAPG genes on daily body weight gain, linear skeletal measurements and carcass traits in Japanese Black steers from a progeny testing program. Anim Sci J 84, 529–34 (2013).

  57.

    , , & A genome-wide association study indicates LCORL/NCAPG as a candidate locus for withers height in German Warmblood horses. Anim Genet 44, 467–71 (2013).

  58.

    et al. Genomic analysis establishes correlation between growth and laryngeal neuropathy in Thoroughbreds. BMC Genomics 15, doi: 10.1186/1471-2164-15-259 (2014).

  59.

    , , & Large-effect pleiotropic or closely linked QTL segregate within and across ten US cattle breeds. BMC Genomics 15, doi: 10.1186/1471-2164-15-442 (2014).

  60.

    , , & Composite Selection Signals for Complex Traits Exemplified Through Bovine Stature Using Multibreed Cohorts of European and African Bos taurus. G3 (Bethesda) 5, 1391–401 (2015).

  61.

    et al. Systems biology analysis merging phenotype, metabolomic and genomic data identifies Non-SMC Condensin I Complex, Subunit G (NCAPG) and cellular maintenance processes as major contributors to genetic variability in bovine feed efficiency. PLoS One 10, e0124574 (2015).

  62.

    , , & Loci associated with adult stature also affect calf birth survival in cattle. BMC Genet 16, 47 (2015).

  63.

    , , , & NCAPG is differentially expressed during longissimus muscle development and is associated with growth traits in Chinese Qinchuan beef cattle. Genet Mol Biol 38, 450–6 (2015).

  64.

    et al. Cloning and functional analysis of a family of nuclear matrix transcription factors (NP/NMP4) that regulate type I collagen expression in osteoblasts. J Bone Miner Res 16, 10–23 (2001).



This work was funded in part by the National Natural Science Foundation of China (31402039, 31372294, and 31472079), the Beijing Natural Science Foundation (6154032), the Chinese Academy of Agricultural Sciences Foundation (2014ywf-yb-4), the Cattle Breeding Innovative Research Team (cxgc-ias-03), the National Beef Cattle Industrial Technology System (CARS-38), and the Project of College Innovation Improvement under Beijing Municipality (PXM2016_014207_000012). The authors thank the staff at the cattle experimental units in Beijing and Ulagai for caring for the animals and collecting biological samples.

Author information


  1. Cattle Genetics and Breeding Group, Institute of Animal Science (IAS), Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, China

    Wengang Zhang, Junya Li, Lupei Zhang, Lingyang Xu, Xue Gao, Bo Zhu, Huijiang Gao & Yan Chen

  2. Animal Science and Technology College, Beijing University of Agriculture (BUA), Beijing 102206, China

    Yong Guo & Hemin Ni



J.Y.L. and Y.C. conceived and designed the experiments. W.G.Z., H.M.N., and Y.G. performed the experiments and wrote the manuscript. L.P.Z. and X.G. performed GWAS analysis. L.Y.X., H.J.G., and B.Z. collected phenotype records and fixed effects data. All the authors have read and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Hemin Ni or Yan Chen.
