Sex differences have been documented in many phenotypes and diseases [1]. Across quantitative traits, men and women typically have overlapping distributions with different means; examples include height and body mass index (BMI) [2]. Previous studies have demonstrated that some of this difference is due to sex-specific genetic factors [1, 3]. Genome-wide association studies (GWAS) are increasingly used to identify variants that contribute to sex differences, and recently, gene-by-sex interactions have been identified across many phenotypes, including anthropometric traits [4], irritable bowel syndrome [5], and glioma [6].

Examination of blood and urine laboratory biomarker levels reveals sex differences [7]; however, it is unknown to what extent these sex differences are related to underlying differences in genetic architecture versus environmental differences. Heritability, or the fraction of phenotypic variability explained by genetic variance, was initially estimated from family studies; now, with the increasing availability of genome-wide data, common genetic variants (such as single nucleotide polymorphisms (SNPs)) are used for this estimation [8]. Methods for estimating SNP-based heritability include LD-score regression, restricted maximum likelihood estimation, Haseman–Elston regression, and the moment-matching approach [9,10,11,12]. These methods are applied to a sample of unrelated individuals in order to quantify the proportion of phenotypic variance explained by all genetic variants in the GWAS.

At a trait level, we can use the sex-specific heritabilities and the between-sex genetic correlation to examine what fraction of the genetics of that trait is shared. The UK Biobank is a prospective population-based study of 500,000 individuals that includes both genetic and phenotypic data, allowing for rich SNP-based estimation of heritability [13]. While most traits do not show sex effects on heritability [14], previous studies have documented these differences in a subset of traits, including fat distribution and other anthropometric traits [4, 15]. However, there has yet to be an analysis of these sex differences across biomarkers.

Here we present an approach for estimating the extent to which genetic effects are correlated between sexes and identifying the proportion of relevant variants that have shared effects versus effects that are specific to each sex. We apply this approach to blood and urine biomarker data from the UK Biobank to examine sex differences in genetic effects, and find differences primarily in the genetic determinants of testosterone level. Furthermore, we use these identified sex differences to provide hypotheses about biological mechanisms including (1) examination of protein-altering variants and tissues where these genes are selectively expressed, (2) causal inference using Mendelian randomization (MR) to assess relationships between testosterone and other traits, and (3) improved genetic risk prediction models for testosterone.

Materials and methods

Genotype data

We used genotype data from the UK Biobank dataset release version 2 and the hg19 human genome reference for all analyses in the study [16]. To minimize the variability due to population structure in our dataset, we restricted our analyses to unrelated White British individuals (as indicated by self-reported ethnicity, UKBB field ID 21000) without missing data (see Supplementary methods for details). Variant annotations, filtering, and LD pruning were performed as previously described [17, 18]. We additionally filtered for variants with Hardy–Weinberg equilibrium <10−7 and <1% missingness, and used plink --xchr-model 2.

Anthropometric traits

To demonstrate the utility of our method, we applied SEMM to previously examined anthropometric traits [4, 15] (field IDs in Table S1; Supplementary methods).

Selection and processing of biomarker traits

We focused on 33 of 38 biomarkers previously described and covariate-adjusted [7] (field IDs in Table S2; Supplementary methods).

Menopause phenotype definition

We used a stringent definition to divide individuals into pre- and postmenopausal groups, so that the two categories were clearly separated and perimenopausal individuals were excluded (see Supplementary methods for the definition and relevant field IDs).

Summary statistic generation

Genome-wide association summary statistics were generated separately for males and females using PLINK v2.00aLM (2 April 2019). Age, genotyping array used, and the first four PCs were included as covariates. Variants with missing standard errors or standard errors >0.2 in either sex were also removed.
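
The standard-error filter described above can be illustrated with a minimal sketch. This is not the pipeline used in the study: the record layout, the variant identifiers, and the helper name `keep_variant` are hypothetical, and only the rule itself (drop a variant if its standard error is missing or >0.2 in either sex) is taken from the text.

```python
import math

def keep_variant(se_female, se_male, max_se=0.2):
    """Return True only if both sex-specific standard errors are present and <= max_se."""
    for se in (se_female, se_male):
        # A missing value may be encoded as None or NaN in this toy example.
        if se is None or math.isnan(se) or se > max_se:
            return False
    return True

# Hypothetical per-variant records: (variant id, SE in females, SE in males).
records = [
    ("rs_a", 0.01, 0.02),
    ("rs_b", 0.25, 0.03),          # SE > 0.2 in females, so removed
    ("rs_c", 0.05, float("nan")),  # missing SE in males, so removed
    ("rs_d", None, 0.04),          # missing SE in females, so removed
]

kept = [rsid for rsid, se_f, se_m in records if keep_variant(se_f, se_m)]
```

Because the filter is applied jointly, a variant is retained only when it passes in both sexes, which keeps the male and female summary statistics aligned on the same variant set.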

Sex Effects Mixture Models (SEMM)

We used a two-component mixture model consisting of a point mass centered at zero and a multivariate normal distribution to estimate the variance–covariance matrix, from which we estimate the genetic correlation and heritability. We extended this model to a four-component model with male- and female-specific components in order to identify variants with sex-specific genetic effects (see Supplementary methods).
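
One way to write down the two-component model is sketched below. The exact parameterization and priors are given in the Supplementary methods; the notation here is an assumption made for illustration. In particular, we assume that the female and male effect estimates for variant $i$ have independent sampling errors with standard errors $s_{i,f}$ and $s_{i,m}$, and that $\pi_0$ and $\pi_1$ denote the mixture weights of the null and non-null components.

```latex
\hat{\boldsymbol{\beta}}_i =
\begin{pmatrix} \hat{\beta}_{i,f} \\ \hat{\beta}_{i,m} \end{pmatrix}
\sim \pi_0 \, \mathcal{N}\!\left(\mathbf{0}, \mathbf{S}_i\right)
 + \pi_1 \, \mathcal{N}\!\left(\mathbf{0}, \boldsymbol{\Sigma} + \mathbf{S}_i\right),
\qquad \pi_0 + \pi_1 = 1,

\mathbf{S}_i = \begin{pmatrix} s_{i,f}^2 & 0 \\ 0 & s_{i,m}^2 \end{pmatrix},
\qquad
\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_f^2 & \sigma_{fm} \\ \sigma_{fm} & \sigma_m^2 \end{pmatrix},
\qquad
r_g = \frac{\sigma_{fm}}{\sigma_f \, \sigma_m}.
```

Under this sketch, the between-sex genetic correlation $r_g$ is read directly from the off-diagonal element of the estimated variance–covariance matrix, scaled by the two standard deviations.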

Sex-specific multivariate polygenic prediction

To construct sex-stratified polygenic risk score (PRS) models using multivariate penalized regression, we randomly split the White British individuals in UK Biobank into 70% training, 10% validation, and 20% test sets. We used covariate-adjusted testosterone residual values as described previously [7]; however, we did so in both sex-separated and combined cohorts (see Table S14; Supplementary methods).
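
A 70/10/20 random split of this kind can be sketched as follows. The function name `split_ids`, the seed, and the integer identifiers are hypothetical and are not taken from the study; the sketch only illustrates a reproducible, non-overlapping partition.

```python
import random

def split_ids(ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle ids reproducibly and cut them into train/validation/test lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder, about 20% of individuals
    return train, val, test

# Hypothetical individual identifiers.
train, val, test = split_ids(range(1000))
```

Splitting on individual identifiers once, and reusing the same split for the sex-specific and combined models, keeps the held-out test set comparable across models.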

Mendelian randomization (MR)

We used MR-Base to test for evidence for causal associations between testosterone and ten outcomes of interest using the sets of female- and male-specific testosterone variants [19]. Variants were pruned for LD with clumping and the analysis was performed with MR Egger, inverse variance weighted (IVW), and IVW with fixed effects. For each of the outcomes, we used summary statistics from both a UK Biobank and non-UK Biobank source and sex-divided outcomes where available. The traits include: waist circumference (WC) [20], hip circumference (HC) [20], height [21, 22], BMI [23], age at menarche [24], age at menopause [25], prostate cancer [26], heart disease [27], type 2 diabetes (T2D) [28], and stroke [29] (Table S12). A Bonferroni correction was used to account for multiple tests (p value < 0.05/168 = 2.98 × 10−4).
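
For intuition, the fixed-effect IVW estimator and the Bonferroni threshold can be sketched in a few lines. This is a textbook formulation rather than the MR-Base implementation used in the study, and the per-variant effect sizes below are made up for illustration.

```python
import math

def ivw_fixed(beta_exp, beta_out, se_out):
    """Fixed-effect IVW: weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_out**2."""
    weights = [1.0 / se ** 2 for se in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exp))
    beta = num / den
    se = math.sqrt(1.0 / den)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return beta, se, p

# Hypothetical instruments: variant effects on the exposure and on the outcome.
beta_exp = [0.10, 0.20, 0.15]
beta_out = [0.05, 0.10, 0.075]   # each is half the exposure effect
se_out = [0.01, 0.02, 0.015]

beta, se, p = ivw_fixed(beta_exp, beta_out, se_out)

# Bonferroni threshold for the 168 tests reported in the text.
bonferroni_threshold = 0.05 / 168
```

In this toy example every outcome effect is exactly half the corresponding exposure effect, so the IVW estimate recovers a causal effect of about 0.5; an association would be called significant only if `p` falls below `bonferroni_threshold`.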


Sex-Effect Mixture Models

We built a two-component Bayesian SEMM for estimating the contributions of sex to genetic variance using GWAS summary statistics (Fig. 1). The model contains a null component, for variants that do not contribute to the trait, and a non-null component, for variants that represent the genetic contribution to that trait. Variants driving male and female traits in the non-null component are modeled as two-dimensional vectors drawn from a multivariate normal distribution with a variance–covariance matrix that can be used to estimate the genetic correlation between sexes. To assess whether our approach obtains reliable estimates in real data, we applied SEMM to traits from Rawlik et al. [15] and obtained overlapping genetic correlations and similar but not identical heritability estimates (Table S1 and Figs. S1, S2).

Fig. 1: Schematic overview of Sex-Effect Mixture Model.

We prepared a dataset of 33 serum and urine biomarkers from 337,199 individuals in UK Biobank. We calculated GWAS summary statistics from males (light purple) and females (orange) separately, so that for every trait, we had an effect estimate and standard error for each variant in each sex. We use a two-component Bayesian Sex-Effect Mixture Model (SEMM), with no-effect and nonzero effect components, to estimate SNP-based heritability and the genetic correlation between males and females for each biomarker. A four-component extension of the SEMM contains two additional components for separate male and female effects. This model allows us to distinguish between four cases: genetic variants that have no effect (illustrated as M0), genetic variants that have a stronger association with the trait in females or males (M1, orange, or M2, light purple), and genetic variants that have similar effects in females and males (M3, gray).

We extended our two-component SEMM to a four-component model to identify genetic variants with different effects in males and females (Fig. 1). To do so, we add two components for detecting genetic variants that have stronger effects in one sex. Similar to the two-component model, the four-component SEMM also contains a no-effect and shared-effect component. Through fitting this model, we are able to separate genetic variants that have “shared” effects, where the variant or set of variants have the same effect in males and females, and “sex-specific” effects, where the variants have different effects in males than in females (e.g., this variant is associated with higher lab values in females but not in males).
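
The assignment of a variant to one of the four components can be illustrated with a simplified sketch. It rests on assumptions that are not taken from the study: mixture weights and variances are treated as known (in SEMM they are inferred), the sex-specific components are modeled as having an effect in one sex only, and all numeric values and names (`component_posteriors`, `M0_none`, and so on) are hypothetical.

```python
import math

def bvn_logpdf_zero_mean(x, cov):
    """Log density of a zero-mean bivariate normal with 2x2 covariance `cov` at x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form x^T cov^{-1} x, using the closed-form 2x2 inverse.
    quad = (d * x[0] ** 2 - (b + c) * x[0] * x[1] + a * x[1] ** 2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def component_posteriors(beta_f, beta_m, se_f, se_m, weights,
                         var_f_only, var_m_only, shared_cov):
    """Posterior probability of each component given (female, male) effect estimates."""
    noise = [[se_f ** 2, 0.0], [0.0, se_m ** 2]]  # independent sampling error
    signal = {
        "M0_none": [[0.0, 0.0], [0.0, 0.0]],
        "M1_female": [[var_f_only, 0.0], [0.0, 0.0]],
        "M2_male": [[0.0, 0.0], [0.0, var_m_only]],
        "M3_shared": shared_cov,
    }
    log_post = {}
    for name, sig in signal.items():
        cov = [[sig[r][c] + noise[r][c] for c in range(2)] for r in range(2)]
        log_post[name] = math.log(weights[name]) + bvn_logpdf_zero_mean((beta_f, beta_m), cov)
    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(log_post.values())
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_post.values()))
    return {name: math.exp(v - log_norm) for name, v in log_post.items()}

# Hypothetical mixture weights and variances.
weights = {"M0_none": 0.90, "M1_female": 0.03, "M2_male": 0.03, "M3_shared": 0.04}
shared_cov = [[0.010, 0.009], [0.009, 0.010]]

# A variant with no estimated effect in females and a clear effect in males.
post = component_posteriors(0.0, 0.3, 0.02, 0.02, weights, 0.01, 0.01, shared_cov)
best = max(post, key=post.get)
```

With these toy inputs, a variant whose estimated effect is near zero in females but large in males is assigned to the male-specific component, whereas a variant with similar estimates in both sexes would be assigned to the shared component.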

To demonstrate the efficacy of the four-component SEMM, we applied the model to four traits with previously identified sex-specific genetic effects: waist–hip ratio, arm-fat ratio, leg-fat ratio, and trunk-fat ratio. We identified 367, 560, 832, and 1158 genetic variants that had significantly stronger associations in females in waist–hip ratio, arm-fat ratio, leg-fat ratio, and trunk-fat ratio, respectively. In males, only 12 variants were found in arm-fat ratio (estimated false discovery rate 4.9–6.8% across all traits, see Table S3a–c and Fig. S3). Included in the female-specific waist–hip ratio variants were genetic variants proximal to four of six previously reported genes (COBLL1/GRB14, VEGFA, PPARG, and HSD17B4). Fat ratio variants were proximal to one of the male and 48 of the female-specific genes previously identified, indicating that we capture known sex-specific signal (see Table S3d for the overlap and Table S4 for the full lists). In addition, we validate these sex-specific variants by showing they have similar effect sizes in a held-out cohort (see Supplementary methods, Fig. S6 and Table S11).

Sex-differential heritability

We applied the two-component SEMM to 33 UK Biobank biomarkers [7] in order to estimate the sex-specific heritability and genetic correlation for each trait (see Table S2 for a full list of these traits). While a large fraction of biomarkers had overlapping heritability estimates, we found sex differences in the heritability of 17 of 33 biomarkers, including testosterone, IGF-1, non-albumin protein, SHBG, and total protein (higher in males), and apolipoprotein B, C-reactive protein, cholesterol, creatinine, cystatin C, eGFR, gamma glutamyltransferase, HDL-C, LDL-C, potassium in urine, sodium in urine, and urate (higher in females; Fig. 2a). Of these, cholesterol, creatinine, sodium in urine, LDL-C, testosterone, and urate showed >1.3-fold differences. For the majority of traits, the between-sex genetic correlations were close to 1.0, indicating shared additive genetic effects between males and females (Fig. 2b). By contrast, for testosterone, we estimated a genetic correlation of only 0.120 (2.5–97.5 percentile interval: 0.0805–0.163), indicating largely nonoverlapping genetic effects between males and females (see Table S5; these estimates are consistent across priors: Fig. S4a and Table S6a, b).

Fig. 2: Heritability and genetic correlations of biomarkers between females and males and related to menopausal status.

a SNP-based heritability estimates of 33 biomarkers for females (orange) and males (light purple). b Correlation of genetic effects between males and females for 33 biomarkers. c The genetic correlation within women (pre- vs. postmenopausal, in pink) was higher than or equal to both that between postmenopausal women and men (green) and that between premenopausal women and men (tan). Error bars in all three plots indicate the 2.5–97.5 percentile interval from Stan sampling.

The heritability of a particular trait can vary across the lifetime, as genetics may explain more or less of the variation in that particular trait. Previous studies have found that pre- and postmenopausal women have different heritability for BMI, waist and hip measures, and lipid biomarkers [30]. To examine this across biomarkers in the UK Biobank population, we applied our two-component SEMM to summary statistics for pre- and postmenopausal women. We found that genetic correlations between pre- and postmenopausal women were close to 1.0, and that all traits had higher or equivalent within-sex (between pre- and postmenopausal women) than between-sex (between either group and men) genetic correlations (Fig. 2c and Table S7).

Identification of genetic variants with sex-specific effects

We applied our four-component SEMM to all 33 biomarkers to identify genetic variants with sex-specific and shared effects. In total, our analysis found 26,561 variants with effects on the traits of interest (Table S8). As expected, the majority (25,950) of these variants showed shared effects between sexes, and most traits had few or no sex-specific variants. We identified 148 and 463 genetic variants with sex-specific effects in females and males, the bulk of them associated with testosterone (80.4% and 96.1%, respectively; see Figs. 3a, S5 and Tables S9, S10). Of the testosterone variants, 54 male-specific, one female-specific, and one shared variant are located on the X chromosome, indicating enrichment of X chromosomal variants in male testosterone genetics, consistent with previous reports [31]. In addition, using tissue-specific enrichment analysis [32], we find enrichment of liver genes in the genes proximal to male- but not female-specific variants (p = 6.21 × 10−7 and p > 0.1, respectively; Fig. S5, Supplementary methods).

Fig. 3: Identification of genetic variants with sex-specific effects on testosterone levels.

a Estimated effect sizes for genetic variants with nonzero effects on testosterone are shown (x-axis, estimated effect size in females; y-axis, estimated effect size in males). Light purple dots correspond to variants that belong to the “male-specific” effect component; orange corresponds to the “female-specific” effect component; and gray dots correspond to genetic variants that belong to the “shared” effect component of SEMM. Polygenic risk score (PRS) predictions for testosterone in male (b) and female (c) individuals for male-specific (light purple), female-specific (orange), and combined (gray) PRS models. Plots show predicted stratified risk bins for testosterone levels (x-axis) versus mean covariate-adjusted testosterone for those individuals (y-axis). These values were calculated on a held-out test set of unrelated White British individuals. The error bars represent standard errors.

Previous testosterone genetics studies have focused on males. In our study, in addition to identifying male-specific variants in known testosterone-related genes (AR, JMJD1C, and FAM9B), we also identify multiple female-specific variants with strong positive or negative effects on testosterone. In females, these include missense variants in STAG3 (rs149048452, β = −0.33 and p = 8.97 × 10−9), which encodes a meiosis cohesin complex protein in which variants are associated with premature ovarian failure [33], and POR (rs17853284, β = −0.23 and p = 8.98 × 10−15), which encodes a cytochrome P450 oxidoreductase whose deficiencies are associated with amenorrhea, disordered steroidogenesis, and congenital adrenal hyperplasia [34] (Table S10a). Many female-specific missense variants are located in genes associated with steroid hormone production (LIPE, POR, UGT2B7) or gamete formation (STAG3, MCM9, TSBP1, ZAN), although ZAN and TSBP1 encode the sperm zonadhesin protein and testis-expressed protein 1, respectively. To our knowledge, these associations are novel and may help with understanding testosterone genetics in women.

Mendelian Randomization of sex-specific genetic effects

After identifying genetic variants with sex-specific effects on testosterone levels, we used Mendelian Randomization (MR) to examine whether these biomarkers are causally related to disease outcomes or other commonly measured traits. The intuition is that if a genetic variant is associated with differing levels of a biomarker, this provides a natural experiment, and we can examine whether the predicted variance in that biomarker based on the genetic variant is associated with the outcome variance, which indicates a causal effect. Recently, MR studies have found evidence for causal links between testosterone and cardiovascular disease [35] but not cognition [36] or BMI [37]. We aggregated a total of ten outcomes (Table S12), including anthropometric traits (height, BMI, WC, and HC), disease outcomes (heart disease, stroke, and type 2 diabetes), and sex-specific traits (ages at menarche and menopause, prostate cancer), and used the IVW method [38] to assess the causal effects of the sex-specific variants identified in our analysis (Fig. 4 and Table S13). We found that testosterone levels showed evidence of a causal association with BMI and WC using female-specific variants as instruments and HC using male-specific variants, with estimated effects consistent with higher testosterone increasing BMI, WC, and HC (p = 1.3 × 10−12, 1.1 × 10−4, 2.6 × 10−5; β = 0.081, 0.04, 0.036; SE = 0.011, 0.01, 0.0086, respectively; Fig. S8). A previous MR study [35] examined testosterone for causal effects on HC, WC, and BMI, but did not find evidence of an association; however, it is possible we are able to find these associations because we used sex-specific genetic instruments. Both female and male variants showed evidence of a causal association with height (p = 6.1 × 10−6 and 9.8 × 10−9), with higher testosterone associated with decreased height (β = −0.093 and −0.11, SE = 0.021, 0.020). This is in contrast to evidence of a positive relationship between height and testosterone levels at a population level and in a previous MR study [35, 39]. For all of these associations, we observe similar effects in the UK Biobank and GIANT datasets, with MR Egger and IVW (Table S13).

Fig. 4: Results of Mendelian randomization tests with sex-specific testosterone variants as instruments.

This plot shows the Mendelian randomization results for all traits, with each trait shown separately. Effect sizes (betas) are estimated using either the female or male-specific variants as instrumental variables for testosterone exposure. Ninety-five percent confidence intervals are shown for each estimate. Points are colored by the sex of the outcome population (light purple for males, orange for females, and gray for combined), with size indicating the −log10 p value, and shape showing the source of the GWAS statistics for the trait (triangles for UKBB = UK Biobank, circles for all others). Trait/exposure pairs that are significant after multiple hypothesis correction are indicated with an asterisk. The list of non-UK Biobank consortia traits is included in Supplementary methods and Table S10.

Male-specific testosterone levels show evidence of an association with type 2 diabetes (T2D) (p = 3.1 × 10−5); higher testosterone is related to T2D risk reduction (β = −0.54, SE = 0.13) using data from the combined DIAGRAM and MetaboChip study [28]. This association was found using the IVW method; MR Egger estimates indicate that the relationship is in the reverse direction and is not significant (β = 1.1, SE = 0.61, p = 0.10). Several longitudinal studies have shown that low levels of testosterone predict the later development of T2D or metabolic syndrome [40].
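
IVW and MR Egger can disagree in this way because MR Egger fits an intercept, which absorbs directional pleiotropy, whereas IVW forces the regression line through the origin. The following is a minimal sketch of the MR Egger point estimate as a weighted regression with an intercept; it is a textbook formulation rather than the MR-Base implementation, standard errors are omitted, and the instrument effects are made up.

```python
def mr_egger(beta_exp, beta_out, se_out):
    """MR Egger point estimates: weighted regression of outcome effects on
    exposure effects with an intercept, using weights 1 / se_out**2."""
    # Orient every variant so that its exposure effect is positive.
    bx = [abs(b) for b in beta_exp]
    by = [y if b >= 0 else -y for b, y in zip(beta_exp, beta_out)]
    w = [1.0 / se ** 2 for se in se_out]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, bx)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, by)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, bx))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, bx, by))
    slope = sxy / sxx                       # causal effect estimate
    intercept = mean_y - slope * mean_x     # average pleiotropic effect
    return intercept, slope

# Hypothetical instruments constructed as outcome = 0.02 + 0.5 * exposure.
beta_exp = [0.1, 0.2, 0.3, 0.4]
beta_out = [0.07, 0.12, 0.17, 0.22]
se_out = [0.01, 0.01, 0.01, 0.01]

intercept, slope = mr_egger(beta_exp, beta_out, se_out)
```

In this toy example the nonzero intercept represents pleiotropy that an origin-constrained IVW fit would fold into the slope. MR Egger estimates are typically less precise than IVW estimates, which is consistent with the wider standard error reported above.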

Sex-specific multivariate polygenic risk prediction

Motivated by the sex differences in testosterone genetics, we tested whether sex-specific PRS would better predict testosterone levels than a sex-combined model. We applied batch screening iterative lasso [41] to train multivariate penalized regression models for males and females. While the two sex-specific and combined models are consistent on a held-out test set ( = 0.59 and 0.60, p < 2.2 × 10−16; Fig. S9), the sex-specific models have improved performance in the sexes they were trained on over the combined model, and low performance in the opposite sex (R2 = 0.31 vs. 0.21 vs. 0.020 and 0.18 vs. 0.13 vs. 0.023 for male and female vs. combined vs. opposite sex). Overall, these results highlight the benefits of sex-specific polygenic prediction for testosterone (Fig. 3b, c and S10).


We sought to examine how genetics relates to sex differences in biomarker levels. To answer this question, we studied the genetics of 33 biomarkers in UK Biobank males and females using SEMM, a two- and four-component Bayesian Mixture Model. SEMM has the benefit of both estimating the underlying genetic architecture and identifying genetic variants with shared and sex-specific effects. For the majority of the traits we analyzed, we do not see strong sex differences in genetic effects, which is expected and previously documented in the literature [14]. Namely, the genetics of these traits are shared (as indicated by genetic correlations close to one and similar heritabilities), and the traits have few or no variants with sex-specific effects.

By contrast, we found little overlap between males and females in the genetics of testosterone levels. In addition to finding significant sex differences in genetic architecture, we also identified over five hundred genetic variants with male- or female-specific effects. Because of the male-female differences in testosterone genetics, we examined the subset of these sex-specific variants that are protein-altering and the tissue-specific expression patterns of proximal genes. The protein-altering variants with female-specific effects on testosterone include variants in genes associated with steroid hormone production and gamete production. Tissue-specific enrichment analysis reveals that the genes proximal to these sex-specific variants are enriched in liver in males but not females. It is hypothesized that the relationship between testosterone and liver disease may have different etiology in men and women [42]. In addition, we built sex-specific polygenic risk models, which showed improved predictive performance over a sex-combined model.

We used MR to assess whether testosterone may be causally implicated in a broad range of diseases and phenotype measurements, and found associations with BMI, WC, HC, height, and T2D. The relationships with BMI, WC, and HC are novel MR associations, and it is possible that we are able to identify them because we are using a novel set of genetic instruments and most previous MR studies in testosterone excluded women. Our analysis shows a decreasing effect of testosterone on height, which is surprising, as previous studies indicated that higher testosterone is associated with higher stature [35]. However, testosterone is sometimes used as a therapy for tall males with delayed puberty, and results in accelerated initial growth but overall stunting of stature [43]. Further work is required to understand this association. Finally, the potential causal relationship with T2D supports the hypothesis that testosterone treatment for reducing diabetes risk in men may be a worthwhile approach, and matches the recent MR findings of Ruth et al. [44] that higher testosterone levels reduce T2D risk in men and increase risk in women. While we did not use sex-separated T2D data in our MR analysis, the associations between T2D and testosterone levels were only found with male-specific testosterone variants as instruments, which is consistent with this effect.

SEMM is made publicly available as an R package. In addition, the inference results are available for visualization as a web application in the Global Biobank Engine [45], and all the sex-specific genetic variants we identified are included in the Supplementary Tables (Tables S4 and S10). We focused on identifying sex-specific effects in biomarkers; however, there are many other traits and diseases that also show phenotypic sex differences and may have underlying sex-specific genetic effects, so future work may involve application of our model to those traits. While we applied this method to examine sex differences in genetic effects, SEMM could also be used for any other type of binary gene-covariate analysis.

Testosterone is frequently thought of as a male sex hormone because of its higher levels in men and its involvement in the development of the male reproductive tract and secondary sex characteristics. However, females also produce testosterone, albeit at lower levels, and elevated testosterone is associated with polycystic ovarian syndrome and metabolic disorders [46, 47]. Previous work examining the genetics of testosterone in females did not find associations [48], and previous MR studies have been limited by the lack of known testosterone variants in women [35]. Our analysis addresses this gap by using a larger population, carefully adjusted biomarkers, and our SEMM method to identify variants. Our results demonstrate that the genetics of testosterone levels is complex and highly polygenic in both males and females. Further, our work highlights the importance of also examining female variability in testosterone levels, and of considering sex as a variable. It is important to note that our analysis was limited to White British individuals and, as such, may not generalize to other populations. In addition, our analysis binarized sex and did not consider gender, so our results do not consider gender-related effects or include intersex, transgender, and non-binary individuals [49]. Future analyses should include more diverse population cohorts, an expanded sex-gender spectrum, rare genetic effects, and additional reproductive health outcomes; we anticipate that this will lead to better understanding of the translational impact of these findings.