Phenotypic records have been routinely collected in dairy cattle to facilitate selective breeding for more than one hundred years. The phenotype of a bull can be highly accurately calculated from thousands of phenotypic records of his daughters and other relatives1. A comprehensive spectrum of phenotypes has been recorded in dairy cattle, including production, reproduction, health, and body type traits2. GWAS on these traits simultaneously in the same population can provide a better understanding of the effects of underlying QTLs. Because of the intensive use of artificial insemination and strong selection in dairy bulls, there are a much smaller number of males than females in the cattle population3, and chromosome segments can be quickly traced back to an ancestral bull. The high relatedness in the cattle population can facilitate accurate imputation4, especially with the availability of many important ancestor bulls sequenced by the 1000 Bull Genomes project5,6,7,8. These unique features of the cattle population make a large-scale GWAS with imputed sequence variants possible and valuable.

Fine-mapping of complex traits to single-variant resolution has started in human studies, e.g., ref. 9,10. Because of the high levels of linkage disequilibrium (LD) in the livestock population11, fine-mapping of GWAS signals is still difficult in cattle. Additionally, existing fine-mapping methods are not easily applicable to large-scale cattle GWAS and fine-mapping studies. Some methods, e.g., CAVIARBF12 and PAINTOR13, generally use a logistic model with a binary response and categorical functional annotations as covariates. Such a logistic model is then incorporated into a model search scheme that often limits the maximum number of causal variants (e.g., 3) and is computationally impractical for a locus containing thousands of sequence variants. When multiple functional data sets are to be tested, model-searching needs to be conducted separately for each set of functional annotation data, further increasing the computational burden. In cattle, Bayes and BayesRC methods have been applied to incorporate sequence data into genomic selection models, but the large amount of computation from MCMC prohibits their direct application to large-scale fine-mapping studies14,15. Although GCTA-COJO is capable of fast conditional analysis for fine-mapping in cattle16, the use of summary statistics and LD data from a reference population can be suboptimal when direct genotype and phenotype data are available. To address these problems, we develop a fast Bayesian Fine-MAPping method (BFMAP) that can efficiently integrate functional annotations with fine-mapping. Specifically, BFMAP can re-use initial model search results for various functional annotations and can be employed for both fine-mapping and functional enrichment analyses. More importantly, the functional enrichment estimated from BFMAP is, by definition, the enrichment of causal effects, in contrast to the enrichment of heritability by the well-known stratified LD score regression17.

In our study, the large number of bulls with highly reliable phenotype and imputed sequence variants can facilitate powerful GWAS and fine-mapping of major GWAS signals. Although the high LD in the cattle genome makes fine-mapping and functional enrichment studies difficult, the large sample size and improved methods can help identify candidate genes of complex traits as well as biologically informative enrichment of candidate variants in functional annotation data. Specifically, we seek to use BFMAP to identify and incorporate functional annotation into the fine-mapping of 35 production, reproduction, and conformation traits in dairy cattle. The fine-mapped genes and variants can provide candidates readily testable in functional studies. The functional data enriched with variants associated with complex dairy traits will be useful for future cattle GWAS and genomic prediction studies. Additionally, the initial model search results can be reused for estimating enrichment of causal effects of dairy traits for additional functional annotations that are being generated by the FAANG and related projects in cattle18.


Data description

We imputed over 3 million selected sequence variants to 27,214 Holstein bulls after quality control edits, using the 1000 Bull Genomes data as reference. These bulls were selected to have highly reliable breeding values (predicted transmitting abilities; PTA) for 35 production, reproduction, and body conformation traits, with an average reliability of 0.71 across traits (Table 1). The number of bulls available for individual traits ranged from 11,713 to 27,161, with >20,000 animals having data for 32 traits (Table 1). The 27,214 bulls had over 31.6 million daughters with records for milk production, and the counts were lower for other traits. This large, high-quality bull data set enables our following GWAS and fine-mapping studies with great power and precision.

Table 1 Number of Holstein bulls, mean and standard deviation (SD) of PTAs, and reliabilities for 35 dairy traits

Single-trait GWAS

We used a mixed-model approach implemented in the software MMAP19 that can incorporate reliability variation across individual bulls. The mixed-model used in our GWAS was robust against population structure and familial relatedness. As shown in Supplemental Data 1, 27 of the 35 traits had a genomic control factor between 0.95 and 1.05.

Using a genome-wide significance level of P < 5E−8, we found many clear association signals for the 35 dairy traits (Supplementary Fig. 1). In total, there were 286 unique QTL regions associated with the 35 traits, and the number of associations for individual traits ranged from <3 for leg and foot traits to 23 for protein percentage (Supplemental Data 1 and 2). As compared to the Cattle QTLdb release 3520, we found that 123 associations (43%) had been previously reported while 163 associations (57%) were newly discovered in this study. We identified 15 new association signals (out of 68) even for the five production traits that had been extensively studied previously, and 92 new associations (out of 125) for type traits that drew less attention in previous studies (Fig. 1 and Supplemental Data 2). While a proportion of these newly discovered QTLs were identified to be associated with new traits, these results demonstrated the superior power of our GWAS in dairy cattle.

Fig. 1
figure 1

Number of association signals newly discovered in our single-trait GWAS versus previously reported. There are in total 30 traits listed. Three leg traits were excluded since we found no associations passing genome-wide significance. DFB and final score were not listed because there were no matched traits in the Cattle QTLdb release 35

Multi-trait association analysis

Consistent with trait definition, hierarchical clustering of the 35 traits based on the absolute correlation coefficients identified three trait clusters: production, reproduction, and body type (Fig. 2). Interestingly, rump angle, teat length, and dairy form were clustered into reproduction traits, although they are type traits by definition, indicating a close genetic correlation between these three traits and cattle reproduction. Even after removing the potential distortion from net merit, rump angle, and teat length were still clustered in the reproduction group while dairy form was clustered in production traits (Supplementary Fig. 2).

Fig. 2
figure 2

Hierarchical clustering of 35 traits in Holstein cattle. a Cluster dendrogram. b PCA clusters

From multi-trait association analyses of the three trait clusters, we identified 33, 21, and 39 associations for production, reproduction, and type traits using P < 5E−8, respectively (Fig. 3 and Supplemental Data 3). While multi-trait analysis is generally more powerful than single-trait GWAS for pleiotropic QTLs21,22, we found fewer associations from the multi-trait analyses than in single-trait results (76 vs 286 unique QTLs). This is likely due to the proportion of QTLs with pleiotropic effects on related traits is less than expected, and/or the limited benefit of including additional traits in cattle studies where individual traits are already highly accurate (Table 1). Although the majority of the multi-trait associations were already identified from single-trait GWAS, we found ten associations that were missed by single-trait analyses (Supplemental Data 4). Interestingly, we noticed that the top variant in multi-trait analysis could be >1 Mb away from the top variants in single-trait GWAS (Supplementary Fig. 3), so the multi-trait results were combined with single-trait analyses to refine candidate QTL regions for fine-mapping.

Fig. 3
figure 3

Manhattan plots for multi-trait association analyses. a Production traits. b Reproduction traits, excluding four calving traits (calving ease and stillbirth traits). c Type traits. d All 29 dairy traits, excluding DFB, net merit, and four calving traits


To facilitate fast fine-mapping analyses, we developed a fast Bayesian Fine-MAPping method (BFMAP) that calculates a posterior probability of causality (PPC) for variants in candidate regions. We picked QTL regions for fine-mapping from both single- and multi-trait GWAS results. Initially, we fine-mapped 434 association signals for 282 QTLs using a significance threshold of 5E−7 (Supplemental Data 5). The observed number of fine-mapped signals in a QTL is approximately exponentially distributed, consistent with our expectation of more causal mutations with a lower probability in a QTL region (Fig. 4). After further quality control edits, we finally fine-mapped 308 association signals for 32 traits (Supplemental Data 6). Specifically, there were more than 20 independent association signals identified on chromosomes 5, 6, 14, 18, and 29, while very few were identified on chromosomes 12, 22, and 27.

Fig. 4
figure 4

Distribution of number of fine-mapped signals in a candidate locus for a trait. Signals were filtered by a significance threshold of 5E−7

We investigated the impacts of incorporation of SnpEff-inferred effect impact (commonly used functional annotation) on fine-mapping performance. First, incorporating variant impacts resulted in a substantial change of PPC for variants in the 308 fine-mapped association signals. Variants with moderate impact had a considerable increase in PPC when functional information was included in the calculation, while modifier variants generally had a decreased PPC (Fig. 5a). Second, fine-mapping by incorporating variant impacts generated significantly smaller 95% credible variant sets than that using an equal prior for all variants (P = 0.01, Wilcoxon signed-rank test; Fig. 5b). These two features make the incorporation of functional annotation favored in our fine-mapping analyses.

Fig. 5
figure 5

Effect of incorporation of SnpEff-inferred impact on fine-mapping performance. a PPC with incorporation of SnpEff impacts versus PPC with an equal prior for each variant. b Size of 95% credible variant set generally decreased after incorporation of SnpEff-inferred impact

Enrichment analysis

To verify the quality of our fine-mapped variants and characterize their distribution on the cattle genome, we investigated the enrichment of fine-mapped variants with different functional annotation data available to cattle, including location in protein-coding gene, effect predicted by SnpEff23, and evolutionary constrain predicted by GERP24. Our enrichment analysis estimated the probability of a causal variant being in a functional category and the probability of a non-causal variant being in the category. The ratio of the two probabilities was used to measure the enrichment of causal variants for this functional category25, with a value larger than one indicating higher enrichment than the genome background. This enrichment analysis has also been implemented in BFMAP.

We first categorized variants into five groups based on their locations regarding protein-coding genes, i.e., CDS, 5′ UTR+2 kb upstream, intron, 3′ UTR+2 kb downstream, and other (intergenic or non-protein-coding genic regions). Despite the strong LD levels in the cattle genome26, we observed distinctive enrichment patterns across these five categories (Fig. 6a). Using bootstrapping, we calculated 95% confidence intervals for the enrichment levels, showing significant enrichment of fine-mapped variants in CDS (4.52×) and 5′ UTR (2.39×), but not in intron (0.93×) or 3′ UTR (0.77×). We also analyzed a group of non-protein-coding genes but found significant depletion with \(\widehat {E_C}\) = 3.2E−04 (Supplemental Data 7), suggesting a lacking of functional impacts in these genes on dairy cattle traits.

Fig. 6
figure 6

Enrichment of fine-mapped variants across various functional annotations. a Locations of variants regarding protein-coding genes. b SnpEff predicted impact. c GERP-constrained elements. d SnpEff impact and GERP-constrained elements

We further investigated the enrichment of fine-mapped variants regarding their genomic locations and protein-coding effects (High, Moderate, Low or Modifier) predicted by SnpEff23. When modeling these four categories, we found severe depletion of variants with high impact (\(\widehat {E_C}\) = 2.51E−05; Supplemental Data 8). This is strikingly different from a previous study on human complex traits and diseases that reported an enrichment of >100 for this category25. As shown in Fig. 5b, we observed a significant enrichment in moderate-impact variants (\(\widehat {E_C}\) = 8.7; P = 0.01). Low-impact variants also showed an enrichment (2.0×), though it was not statistically significant (Fig. 6b). As expected, a minor depletion was seen in modifier variants (0.87×).

We also used constrained elements on the cattle genome to categorize variants into two groups (inside of or outside of constrained elements), as highly conserved DNA sequences may imply functional importance. As shown in Fig. 6c and Supplemental Data 9, fine-mapped variants were significantly enriched in constrained elements (3.72×; P = 0.02). When further categorizing variants into six groups based on both constrained elements and variant impacts (Moderate, Low or Modifier), we found the highest enrichment in moderate-impact variants inside constrained elements (25.56×; P = 0.005). For the other categories, we observed no enrichment of fine-mapped variants (Fig. 6d and Supplemental Data 10).

When comparing different trait groups, we observed little difference in the pattern of enrichment regarding SnpEff-inferred effect impact (Fig. 7 and Supplemental Data 11). Moderate-impact variants had a clearly higher enrichment of being causal for production traits than for reproduction and type traits. We further used permutation to generate the null distribution of EC(Production)/EC(Reproduction+Type) and showed that the difference was statistically significant (P = 0.01; Supplementary Fig. 4A). However, the enrichment for low-impact variants was similar between the three trait groups (Supplementary Fig. 4B).

Fig. 7
figure 7

Enrichment estimates for SnpEff predicted impact by three groups of traits

Candidate genes

Based on the PPCs of variants after incorporation of SnpEff impact, we calculated PPC for each gene in each independent association signal. In total, there were 564 gene-trait association pairs with PPC >0.01 (Supplemental Data 12). Most of the genes had either a large (>0.95) or small PPC (<0.05) (Supplementary Fig. 5). We further obtained a short list of the most promising candidates by applying conservative criteria: PPC >0.9 if a gene is associated with only one trait and PPC >0.5 for all traits if a gene affects multiple traits.

This short list had 69 unique genes including both previously reported genes and newly discovered ones for cattle traits (Table 2). For example, ABCG2 and DGAT1 are known to affect milk production in dairy cattle27,28. The ARRDC3 gene has been associated with body confirmation traits and calving traits in beef and dairy cattle21,29,30. Our fine-mapping study also revealed novel gene/association combinations for dairy traits. A previous study reported that the ABCC9 gene was associated with fat yield, protein yield, and calving to first service interval in Holstein cattle31. In our study, we found a pleiotropic effect of this gene on body type traits (fore udder attachment and udder depth), milk production (milk and protein yields), and daughter pregnancy rate, with a PPC of almost 1 for all the associated traits. In addition, we found that there were no common variants among the credible variant sets for these traits (Table 2), suggesting that ABCC9 might have different causal mutations for the associated traits. TMTC2 has been associated with teat length30, and our fine-mapping showed that it had an effect on six type traits (including teat length, fore udder attachment, front teat placement, rear teat placement, rear udder height, and final score), with PPC being ≥0.95 for all those traits. Abo-Ismail et al. reported CCND2 was associated with stature30. Our fine-mapping results determined its association with four type traits (PPC > 0.95 for body depth, rump width, and stature). It is worth noting that our fine-mapping study not only discovered association of a gene with a trait, but also provided the posterior probability of being causal for a gene.

Table 2 Candidate genes with high posterior probability of causality

Candidate variants

Because our stringent quality control filtering during and after imputation removed many variants (~20%, mostly intergenic with some genic), fine-mapping of the QTL regions to single-variant resolution could not always be achieved. Nevertheless, we obtained 95% credible variant set for each independent signal and merged them into one table. This resulted in a total of 1582 unique variants (Supplemental Data 13). We generated a short list of those variants with a moderate impact on protein coding and PPC >0.2 (Table 3). Among the list, some variants have been previously reported, e.g., Chr6:38027010 in ABCG227 and Chr26:21144708 in SCD32. We also found other promising candidate variants, e.g., Chr7:93244933 in ARRDC3 with an average PPC of 0.608 on 9 traits, Chr8:83581466 in PTH1 with an average PPC of 0.68 on two type traits (body depth and strength), Chr1:69673871 in KALRN with an average PPC of 0.46 on two reproduction traits (cow conception rate and daughter pregnancy rate), Chr17:70276788 in CHEK2 with an average PPC of 0.39 on two calving traits (sire calving ease and daughter calving ease).

Table 3 Missense variants with largest posterior probabilities of causality (>0.2)


In this study, we performed GWAS for 35 production, reproduction, and type traits in dairy cattle with a uniquely large data set, and then fine-mapped the GWAS signals to single-gene resolution. With the fast computing method that we developed (BFMAP), we attempted to find causal effects in hundreds of loci each of which contained thousands of variants. We also investigated the functional enrichment patterns of several functional annotation data available in the cattle genome, and incorporated useful functional information into the final fine-mapping. In sum, we provided not only a credible candidate gene list for follow-up functional validation, but also a unique resource that can be easily employed by future functional enrichment studies.

In the single-trait GWAS, we found many association signals that have not been discovered (Fig. 1), clearly demonstrating the benefits of using large dairy cattle data for GWAS of complex traits. Reliabilities of deregressed PTAs were modeled for most of the traits. For the traits with small variation of reliability, we observed similar results for the models with and without reliability; e.g., QTLs found when not modeling reliability were largely the same as those by incorporating reliability for fat percentage and daughter pregnancy rate (Supplementary Fig. 6). Interestingly, we observed some deflations in the GWAS of production traits, which could be due to the large QTL effects on these traits including the DGAT1 gene. Minor inflations were observed in GWAS for calving traits (i.e., calving ease and stillbirth) and final score (Supplementary Fig. 1). Although there were many sporadic variants passing the threshold of genome-wide significance (P < 5E−8), we could still locate a few credible GWAS peaks where there were a cluster of significant variants.

Initially, our fine-mapping discovered as many as 19 signals in a candidate region for a trait, as we applied a variant inclusion threshold accounting for only the effective number of independent variants (meff) at the locus-by-trait level. We also noticed that there were more locus-by-trait association pairs with multiple signals than with only one signal. By examining those with multiple signals, we found the models often contained a strong signal and several much weaker ones. Those weak signals might result from imperfect model fitting of the lead variants in other signals, instead of being true positives. Nevertheless, filtering out these weak signals with genome-wide significance levels did little harm to the discovery of strong ones.

The enrichment results for SnpEff-inferred variant impact in our study were very different from those reported in human studies25. The differences among the four categories in the human study are more distinctive than ours. This is consistent with our anticipation that high LD in cattle genome makes such enrichment difficult to detect. In addition, high-impact variants generally have a lower frequency than other variants and are thus harder to impute in cattle where the number of reference sequences is small and the original genotype data are of moderate density. Nevertheless, we found a considerable enrichment of candidate causal effects in moderate-impact variants. Incorporation of this enrichment into fine-mapping facilitated the discovery of more candidate causal variants (Fig. 5). The discovery of biologically meaningful enrichment patterns will be valuable for the development of new methods to incorporate functional information into fine-mapping and genomic prediction.

Different functional annotations are often related, so we analyzed the enrichment of each functional annotation separately. Although single-annotation analysis does not resolve confounding of multiple annotations, the enrichment estimates can still provide informative priors for fine-mapping. We analyzed various functional annotations by single-annotation enrichment analysis and determined the ones that provide highly differential priors. LDSC may be able to dissect heritability enrichment between multiple functional annotations, and BFMAP can incorporate these outputs into fine-mapping simultaneously.

It is widely acknowledged that population structure and relatedness need to be properly accounted for in GWAS via a linear mixed model33 or a linear model with principal components extracted from genomic relationship matrix34. Similarly, we need to account for population structure and relatedness in fine-mapping analyses as proposed in our BFMAP. However, existing fine-mapping approaches have not fully addressed this issue. For instance, BIMBAM models only intercept and SNPs of interest, and thus only works for independent samples35. BayesFM can include principal components as covariates, but it does not have a random component to fully account for relatedness36. piMASS applies Bayesian variable selection regression to modeling genome-wide variants to control for population structure and relatedness, but it uses a Markov chain Monte Carlo (MCMC) algorithm that is computationally impractical for large studies37. CAVIARBF, PAINTOR, and FINEMAP38, use summary test statistics and are approximately equivalent to BIMBAM. In theory, these methods work for independent samples by using summary statistics from linear model analyses and genotype correlations between variants. Further studies are warrant to investigate how these summary-statistics methods perform for structured or related samples when using summary statistics from linear mixed models.

Using BFMAP, we pinpointed some promising candidate genes for economically important traits in dairy cattle. It is promising to validate those genes with high posterior probability of causality (Table 2) in future functional studies. In addition, with our new method of functional enrichment analysis in BFMAP, our fine-mapping result of hundreds of QTLs (Supplemental Data 6) can be readily used to estimate enrichments of causal effects for additional functional annotation data. Thus, we provided an easy-to-use enrichment analysis resource to test the functional annotations that are being generated by the on-going FAANG and related projects for cattle18.


Genotype and phenotype data

Genotype data have been described in more details previously5. Here we provide a brief summary. SNP and insertion-deletion (InDel) calls (sequence variants) from Run 5 of the 1000 Bull Genomes Project6 were released in July 2015. After stringent quality control edits and removal of intergenic and intronic SNPs, 3,148,506 sequence variants were retained for 444 Holstein bulls. The sequence variants and high-density SNP genotypes of 312K markers for 26,949 progeny-tested Holstein bulls (and 21 Holstein cows) were combined by imputation using the FindHap software (version 3)39. Finally, we had genotypes of 3,148,506 sequence variants for 27,214 Holstein bulls (179 bulls had both sequence and high-density genotypes) and 21 cows. Imputation quality from FindHap was assessed with 404 of the sequenced animals as the reference population and 40 randomly selected animals for validation. The sequence genotypes of the validation animals were reduced to high-density SNP genotypes and then imputed back to sequence variants. The average imputation accuracy was 96.7% for the 3,148,506 variants5. After excluding high-density SNPs, we found an average accuracy of 96.4% for the newly imputed sequence variants. Chromosome-specific imputation accuracy was >95% for all autosomes except Chromosome 12.

All of the 27,214 Holstein bulls used in this study had highly reliable (average reliability >71% across traits) PTAs for 35 production, reproduction, and type traits (Table 1). Transmitting ability is basically the additive genetic values of cattle. Reliability quantifies the amount of information available in a PTA and measures its accuracy40. Deregressed PTAs were used as phenotype in all our analyses, which excludes parent information and reduces the dependence among animals41. Because each of the bulls had many phenotyped daughters, their PTAs were generally of high reliability, even for low-heritability traits (Table 1). The trait definitions are shown in Table 1 and Supplementary Note 1. We categorized the 35 traits into three groups, i.e., production, reproduction, and body type, based on a clustering analysis.

Single-trait GWAS

The software MMAP19 was used for all single-trait GWAS analyses ( Basically, MMAP efficiently implements a mixed-model approach for association tests that is similar to GEMMA42 but different from EMMAX;43 that is, variance component is estimated uniquely for each marker. We used the following model

$${\boldsymbol{y}} = {\boldsymbol{\mu }} + {\boldsymbol{X}}b + {\boldsymbol{g}} + {\boldsymbol{e}}\,{\mathrm{with}}\,{\boldsymbol{g}}\sim N\left( {0,\sigma _g^2{\mathbf{G}}} \right){\mathrm{and}}\,{\boldsymbol{e}}\sim N\left( {0,\sigma _e^2{\mathbf{R}}} \right),$$

where y is deregressed PTAs, μ is global mean, X is genotype of a candidate variant (coded as 0, 1 or 2) and b is its effect, g is a polygenic effect accounting for population structure, and e is residual. The genomic relationship matrix (G)44 was built using ~312K high-density SNP markers (filtered by MAF >1%). R is a diagonal matrix (Rii = 1/r2−1), which is used to model differential reliability among animals.

We disregarded variants on the X chromosome. We also filtered out variants with an MAF of <1% or failing Hardy–Weinberg Equilibrium test (p < 1E−6). After QC, there were ~2.7 million variants to be tested for association. We used a genome-wide significance level of P < 5E−8. QTLs were located by finding GWAS peaks where there were a cluster of significant variants. We used a custom Perl script to find all GWAS peaks and further examined each of the peaks based on the Manhattan plots to filter out suspicious ones (i.e., sporadic significant variants). Subsequently, we determined a total of 286 QTLs (Supplemental Data 2) that were further analyzed in fine-mapping studies.

To find which ones are novel among the 286 QTLs, we compared our result with Cattle QTLdb (release 35 published on April 29, 2018) that contains 113,256 QTLs/associations from 848 publications20. To ensure correct physical positions of QTLs on UMD 3.1, we first extracted the rs identifiers (rs#) of flanking SNPs for each term from the Cattle QTLdb data, and then used the identifiers to find flanking SNPs’ positions on UMD 3.1 in the Ensembl genome variation database. These SNP positions were used as QTL positions. This procedure can rule out QTL terms whose physical positions are inaccurately converted from genetic maps. The Cattle QTLdb release 35 covers 599 different traits, in which we extracted those with the (almost) same definition as the 35 traits in our study (Supplemental Data 14). For each of the QTLs that we detected, we determined that it is either previously reported if it is within ± 500 kb of any QTL/association for the same trait(s) in the Cattle QTLdb or is newly discovered otherwise (Supplemental Data 2).

Multi-trait association analysis

Following a previous study21, our multi-trait association tests were based on a chi-square statistic with multiple degrees of freedom. For each variant, the chi-square statistic for the multi-trait association test was calculated by:

$${\mathrm{Multi - trait}}\,\chi _{df = n}^2 = {\mathbf{t}}_i^\prime {\mathbf{V}}^{ - 1}{\mathbf{t}}_i,$$

where ti is a n × 1 vector of the signed t-values of variant i for n traits, and V is an n × n correlation matrix for the n traits which is calculated using signed t-values of genome-wide variants. In our analysis, the signed t-values were obtained from single-trait GWAS for 2,619,418 variants passing QC, and the correlations between traits were calculated using all the variants. To test the robustness of the estimated correlation using all sequence variants45, we also computed the correlation matrix using two variant subsets obtained by selecting every 10th and every 100th variant. The three variant sets produced similar correlation estimations (Supplementary Fig. 7).

We performed hierarchical clustering based on the absolute correlation coefficients, and then did multi-trait association analysis for each of the three resulting clusters of traits as shown in Fig. 2. Specifically, we excluded net merit and days to first breeding (DFB) in production and reproduction clusters, respectively, because these traits are linear combinations of other traits and the number of bulls for DFB was much smaller compared to other traits. We also excluded the four calving traits to avoid sporadic significant variants. Additionally, all the traits except for the six traits aforementioned were analyzed as a whole in a separate multi-trait association test.

Bayesian Fine-MAPping (BFMAP)

We developed the following Bayesian model for fine-mapping:

$$\begin{array}{*{20}{l}} {{\boldsymbol{y}} = {\boldsymbol{X}}{\boldsymbol{b}} + {\boldsymbol{Z}}{\boldsymbol{a}} + {\boldsymbol{g}} + {\boldsymbol{e}}} \hfill \\ {{\boldsymbol{b}}\sim N\left( {0,\varphi \sigma _e^2{\boldsymbol{I}}} \right)} \hfill \\ {{\boldsymbol{a}}\sim N\left( {0,\gamma \sigma _e^2{\boldsymbol{I}}} \right)} \hfill \\ {{\boldsymbol{g}}\sim N\left( {0,\eta \sigma _e^2{\boldsymbol{I}}} \right)} \hfill \\ {{\boldsymbol{e}}\sim N\left( {0,\sigma _e^2{\boldsymbol{R}}} \right)} \hfill \\ {P\left( {\sigma _e^2} \right) \propto 1/\sigma _e^2} \hfill \end{array}$$

where y is a phenotype vector of size n for a complex trait, b is a vector of covariate (other than genomic variants) effects and X is corresponding design matrix, a is a vector of variant effects and Z is corresponding genotype coding matrix (e.g., genotype coding for additive, dominance, or imprinting effects46), g is a vector of polygenic effect for controlling population structure, G is a corresponding variance structure matrix (e.g., genomic relationship matrix), and e is the residual with variance structure R for modeling reliability or accuracy of phenotypic records as in model (1). The common variance component (\(\sigma _e^2\)) is given by a non-informative Jeffrey’s prior. Other variance parameters (φ,γ,and η) are treated as known. Generally, we can set φ to a large value (e.g., 1E8) to make a act like fixed effects. A genomic variant is usually considered to have a small but noticeable effect, so we can set γ at 0.01 or 0.0447,48. When Za only accounts for a tiny proportion of phenotypic variance (this is true when modeling variants from a small genomic region), we can set η based on the heritability (h2) by η = h2/(1−h2). In practice, we can instead use heritability estimate (\(\widehat {h^2}\)) in the null model without variants to determine η.

We can easily compute P(D|M) (data D, and model M regarding variant inclusion) by integrating out \(\sigma _e^2\) based on model (2). To allow easy calculation, we use a linear transformation to the model (Supplemental Note 2). We can further obtain the null distribution of Bayes factors (H0: a = 0) in model (2) by an extension of the results by Zhou and Guan48 (Supplemental Note 3). Based on the null distribution, scaled Bayes factor48 and corresponding p-value can be computed for our model.

We seek to identify independent association signals within a QTL region and to assign a posterior probability of causality (PPC) to each variant with fine-mapping. Following the method by Huang et al10., our fine-mapping approach includes three steps: forward selection49 to add independent signals in the model, repositioning signals, and generating credible variant set for each signal. Although our approach uses the same framework as Huang et al.10, there are a few notable differences (Supplemental Data 15). While they only provided some R scripts for disease data, we provide a fast, general-purpose software tool for fine-mapping analysis of complex traits.

We set φ = γ = 1E8 in model (2) for fine-mapping, which enables easy calculation of p-value for a newly added variant conditional on variants already in model (Supplemental Note 4). We use a Bonferroni-corrected threshold49 as stopping criterion in forward selection; that is, forward selection stops when (2logBF+1) < 2logmeff, where meff is the effective number of independent variants calculated using the method by Li and Ji50. Suppose that we select p independent signals in forward selection and determine a set of lead variants (Sl) for the p signals after repositioning. Then, for signal i with lead variant (li), we have a variant set (Si) containing variants that have substantial LD with li but weak LD with lead variants in other signals Sl /{ li }. Accordingly, we can compute PPC of variant j (vij) in Si conditioning on Sl /{ li }:

$$\begin{array}{l}P\left( {M_i = v_{ij}|y,X,Z,S_l\backslash \left\{ {l_i} \right\}} \right){\mathrm{ = }}\\ \frac{{P\left( {y\left| {X,Z,M_i = v_{ij},S_l\backslash \left\{ {l_i} \right\}} \right.} \right)P\left( {M_i = v_{ij}} \right)}}{{\mathop {\sum }\nolimits_j P\left( {y\left| {X,Z,M_i = v_{ij},S_l\backslash \left\{ {l_i} \right\}} \right.} \right)P\left( {M_i = v_{ij}} \right)}}\end{array}$$

Where Mi = vij denotes that the causal variant in signal i is variant j in Si (i.e. vij). Weöó can easily get a credible variant set passing a given confidence level (e.g., 95%) for a signal, by sorting variants in a descending order of PPC and including them in the set from top to bottom. We can also calculate PPC of a gene by summing up PPCs of all variants within the gene.

In the study by Huang, et al.10, an equal prior for each variant was used; that is, P(Mi=vij)=1 vijSi. Here, we propose a method to apply differential prior probabilities by integrating functional annotation, following a previous study on adjusting significance threshold based on functional annotation in GWAS25. With our fine-mapping procedure, it is usually safe to assume one and only one causal variant in each independent signal. For a functional annotation with several categories, we denote the probability of a causal variant being in category C as pC and the probability of a non-causal variant being of category C as qC. We can accordingly obtain:

$$P\left( {M_i = v_{ij}} \right) = P\left( {c_{ij}\left| {M_i = v_{ij}} \right.} \right)\mathop {\prod}\nolimits_{j\prime \ne j} {P\left( {c_{ij\prime }\left| {M_i \ne v_{ij\prime }} \right.} \right)} = p_{cij}\mathop {\prod}\nolimits_{j\prime \ne j} {q_{cij}}$$

where cij denotes the category of variant j in Si (i.e. vij).

We estimate qC with the genome-wide frequencies of the categories25. To estimate pC, we can use all available independent signals (Mi):

$$\begin{array}{l}L\left( {\left\{ {p_C} \right\}|y,Z} \right) \propto \mathop {\prod }\limits_i P\left( {M_i,y,Z|\left\{ {p_C,q_C} \right\}} \right)\\ \propto \mathop {\prod }\limits_i \mathop {\prod }\limits_j P\left( {y|X,Z,M_i = v_{ij}} \right)P\left( {M_i = v_{ij}|\left\{ {p_C,q_C} \right\}} \right)\end{array}$$

When the signals identified in fine-mapping are independent of each other, we can get:

$$P\left( {y|X,Z,M_i = v_{ij}} \right) \approx P\left( {y|X,Z,M_i = v_{ij},S_l\backslash \left\{ {l_i} \right\}} \right)$$

Taking Eqs. 4 and 6 into Eq. 5, we obtain a likelihood function regarding {pC} and then get their maximum likelihood estimates (MLEs), \(\left\{ {\hat p_C} \right\}\). By taking the estimates of {pC, qC} and Eq. 4 to Eq. 3, we get updated PPCs with incorporation of function annotation, which is actually an empirical Bayes approach.

When setting an equal prior for each variant, we find:

$$P\left( {M_i = v_{ij}|y,X,Z,S_l\backslash \left\{ {l_i} \right\}} \right) \propto P\left( {y|X,Z,M_i = v_{ij},S_l\backslash \left\{ {l_i} \right\}} \right)$$

Thus, to estimate {pC} by Eq. 5, we can use PPCs from the computation assuming an equal prior for each variant. Accordingly, incorporation of functional annotation includes three steps: computing PPCs given an equal prior for each variant, estimating {qC} with the genome-wide frequencies of the categories and estimating {pC} with these PPCs, and updating PPCs with {pC, qC}. These features make our approach easier to use compared with PAINTOR13 and CAVIARBF12.

Fine-mapping of dairy cattle traits

Genomic regions for find-mapping were determined by lead variants in single-trait and multi-trait GWAS results. We first determined a minimal region that covered each lead variants (either in single- or multi-trait QTLs), and then extended it 1 Mb upstream and downstream, resulting in a ≥ 2 Mb candidate region for fine-mapping. The 1-Mb extension allowed the region to cover most variants that have an LD r2 of >0.3 with the lead variants26.

We obtained a total of 125 loci from single- and multi-trait GWAS results (Supplemental Data 16). Three loci without enough high-density SNPs were removed to ensure imputation quality, thus leaving 122 loci for fine-mapping. A total of 57 loci were associated with more than one trait. Fine-mapping was performed for individual traits, and these 122 loci represented 282 locus-by-trait pairs for 32 traits (three leg traits were excluded for lack of significance). When fine-mapping identified multiple signals in a candidate locus, we kept the strongest one and filtered the rest. The effective number of independent tests was 54,403 for the 282 locus-by-trait pairs (Supplemental Data 17). Considering that our effective number estimates were already conservative51, we used 5E−7 (<0.05/54,403) as the significance threshold. Subsequently, we found 434 association signals (Supplemental Data 5).

We found that the locus-by-trait association pairs with more than three signals identified were mostly from still birth and final score (Supplemental Data 5). We also noticed slight inflation of GWAS results of these two traits (Supplementary Fig. 1). Therefore, we removed the 16 QTLs with >3 fine-mapped signals from all following analyses. We further removed 15 signals whose variant set had ≤ 10 variants of distinct genotypes, as a small cluster of highly linked variants could indicate inaccurate imputation. Additionally, if there were multiple QTL on a chromosome for a trait, all lead variants in these loci were modeled jointly in fine-mapping. Accordingly, 13 association signals whose lead variant had a p-value > 5E−7 were removed. After all these edits, we determined a total of 308 association signals (Supplemental Data 6).

Besides assuming an equal prior for each variant, we further applied differential prior probabilities based on SnpEff-inferred impacts23. Since using Eq. 5 requires independent association signals, we removed all the association signals for protein, cow conception rate, rear teat placement, udder depth and strength, because they have high correlation (r2 > 0.5) with other traits. We also removed another six association signals, since these signals have a substantial LD with another signal (measured by LD r2 between lead variants >0.25). These edits reduced the number of association signals from 308 to 249. We estimated {pC, qC} for variant impact categories based on the 249 association signals, and updated PPCs for all 308 signals by integrating the estimated functional enrichment. Effect impact-incorporated PPCs were used for determining candidate variants or genes. When computing PPC of a gene, all variants within its 2-kb upstream and downstream ranges were included.

Enrichment analysis in BFMAP

Our enrichment analysis was based on our 249 fine-mapped association signals to estimate pC (the probability of a causal variant being in category C) and qC (the probability of a non-causal variant being in category C). The two probabilities can be estimated using the models described in BFMAP. The enrichment for category C is defined as EC = pC/qC25, for which a value larger than one indicates that candidate causal variants are more enriched in category C than across the whole genome. Functional annotations investigated included locations of variants regarding protein-coding genes, effect impact inferred by SnpEff23, and constrained elements predicted by GERP24. Confidence intervals of the enrichment estimates were derived by percentile bootstrap as in ref. 25. The association signals were resampled 1,000 times to calculate the confidence intervals. We removed very small categories (like HIGH in SnpEff-inferred effect impacts) in bootstrapping to avoid non-convergence of the maximum likelihood estimation.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.