Introduction

The three most common types of skin cancer are named for their presumed cell of origin: the keratinocyte-derived cancers (KC) basal cell carcinoma (BCC) and squamous cell carcinoma (SCC), and the melanocyte-derived melanoma. Exposure to ultraviolet radiation (UVR) is a key risk factor for skin cancer, and cutaneous melanocytes produce and distribute pigment-containing melanosomes to surrounding keratinocytes to protect against UV damage [For review see ref. 1]. In some Western countries such as Australia and the United States, cutaneous melanoma incidence is higher in men than in women2,3,4,5. While this may in part derive from differences in sun exposure behaviours and reduced healthcare engagement amongst men, these are insufficient to fully explain the difference2. Differences between men and women in immune responses and hormone levels may also influence the risk of melanoma2 and KCs6.

Male-pattern baldness (MPB), also known as androgenetic alopecia, is characterised by progressive hair follicle miniaturisation, leading to hair loss in men. This condition is often linked with changes in dihydrotestosterone levels, with several studies showing that patients with MPB had higher endogenous testosterone levels than controls7,8,9. Recent genetic studies10,11,12 of endogenous testosterone levels revealed genes that are associated with both testosterone levels and MPB. MPB has been associated with a greatly increased risk of scalp melanoma (hazard ratio [HR] = 7.2, 95% confidence interval [CI]: 1.3–39.4) and SCC (HR = 7.1, 95% CI: 3.8–13.1)13. This is of interest as the number of head and neck melanomas increased by over 50% between 1994 and 2015, primarily in males of European descent in the United States and Canada14. Melanomas of the head and neck, and particularly the scalp, are associated with higher mortality than those at other sites15,16. Across all studied populations, men experience higher rates of head and neck melanoma5. Cutaneous melanoma proliferation may be influenced by androgens either directly or through immune system suppression2,17,18, providing a potential link between MPB and cutaneous melanoma risk. A recent observational study reported a positive association between free testosterone and melanoma risk19, but whether this relationship is potentially mediated through MPB remains unknown. For KCs, a previous Canadian study revealed that the incidence of KC is higher in men20. Another retrospective cohort study from Australia showed potential sex differences in the incidence of KC at different body sites, where men are more likely to develop SCC in the scalp region21. To the best of our knowledge, there have been no large-scale studies to date evaluating the link between testosterone and KCs to dissect sex differences in KC incidence in these populations.

The increased incidence of skin cancers amongst men with MPB may: a) reflect higher chronic UV damage to the exposed scalp; b) result from a direct causal role for male-specific factors, including testosterone (consistent with the proposed androgen basis of melanoma hypothesis13); or c) be driven by other factors related to androgenic regulation of immune response. Leveraging the availability of large-scale genetic data, Mendelian randomization (MR)22, a powerful genetic-based instrumental variable technique relying on the use of strong genetic proxies, can be used to detect and dissect causal pathways between MPB and skin cancers. As the allocation of genetic variants (namely single nucleotide polymorphisms, SNPs) is randomized at meiosis, the estimated causal effect is less likely to be biased by reverse causality and confounding factors23. When used in combination with multivariable techniques24, MR frameworks can be extended to disentangle the complex (and potentially mediated) relationship between endogenous testosterone, MPB and the risk of skin cancers.
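To make the instrumental variable logic concrete, the following is a minimal sketch of a fixed-effect inverse variance weighted (IVW) MR estimate computed from SNP summary statistics. The function name, the input values and the first-order standard error (which ignores uncertainty in the SNP-exposure effects) are illustrative assumptions and are not taken from the analysis pipeline used in this study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate (hypothetical helper, for illustration).

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)  # first-order approximation
    return beta, se

# Made-up summary statistics for three SNPs (not real data).
bx = [0.10, 0.20, 0.30]   # SNP effects on the exposure (SD units)
by = [0.02, 0.04, 0.06]   # SNP effects on the outcome (log odds ratio)
se = [0.01, 0.01, 0.01]   # standard errors of the SNP-outcome effects

beta_ivw, se_ivw = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta_ivw)  # OR per SD increase in the exposure
```

Exponentiating the IVW slope gives an odds ratio per SD increase in the genetically predicted exposure, which is the scale on which the MR estimates in this study are reported.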

In this study, we examined the link between endogenous testosterone levels, MPB, and skin cancers through a series of univariable and multivariable MR (MVMR) analyses. Using skin cancer outcome data from the UK Biobank (UKB)25, the QSkin Sun and Health Study (QSkin)26, and a recent large-scale cutaneous melanoma meta-analysis27, we comprehensively assessed the relationship between genetic instruments for MPB, sex hormones, and skin cancer susceptibility, followed by an exploratory analysis of whether these associations differed by anatomical location (body site) of the primary cancer site.

Results

Methodology overview

In brief, we adopted a two-sample MR framework22 to investigate the relationship between testosterone, MPB and risk of KCs (all KC, SCC only and BCC only) and melanoma. The skin cancer genome-wide association studies (GWASs) were derived from a recent melanoma GWAS meta-analysis27 (participating studies in Supplementary Table 1) and, for KCs, from the Australian QSkin cohort26 and UKB25. GWAS analyses of risk factors (sex hormones) were conducted in subsets of UKB independent of those used in the melanoma GWAS meta-analysis to increase the robustness of MR findings. We first applied standard univariable MR techniques to examine the direct association between hormonal risk factors, MPB and skin cancers, followed by a multivariable approach to fit all risk factors simultaneously whilst including proxy traits captured by heterogeneous genetic outliers (see Fig. 1 for the complete study schematic). Finally, we evaluated site-specific cancers in a subset of cases from the UKB and Melanoma Institute Australia (MIA) cohorts to assess whether these MR associations differed by primary tumour anatomic site (i.e. head and neck), as reported in previous observational studies21 (distribution of cases by anatomic site provided in Supplementary Tables 2–4; classifications in Supplementary Fig. 1).

Fig. 1: Schematic diagram outlining the overall study approach of modelling genetic outliers via MVMR.

Each panel (a), (b), (c) and (d) is listed in chronological order of the analysis procedure. a Schematic MR diagram. b MR scatter plot. c Selection of candidate traits for inclusion into MVMR via PheWAS findings. d Modelling the candidate traits into the MVMR analysis to obtain the marginal effect of MPB on skin cancer risk, by conditioning on endogenous testosterone levels and other candidate traits.

Assessment of statistical power for Mendelian randomization

We identified a total of 444 SNP instruments for male-pattern baldness, explaining 13.95% of the phenotypic variance. Similarly, 118, 83 and 204 SNPs were used as genetic instruments for endogenous totalT, (estimated) freeT and SHBG in the univariable MR analyses, with these SNPs cumulatively explaining 8.13, 4.94 and 14.5% of the phenotypic variance in these traits, respectively. The comparison of our testosterone SNP associations (estimated in the non-overlapping samples) with those obtained in the testosterone GWAS reported in Ruth et al.11 is shown in Supplementary Figs. 2–4, revealing highly concordant genetic effect sizes. Even when we restricted the analysis to SNPs with an association of p < 1×10−5 with the exposure of interest within our independent UKB subset (which halved the original discovery GWAS sample size), the cumulative variances explained by SNPs were largely unaffected (Table 1). With the relatively high proportion of variance tagged by our SNPs, our MR study has very good power (at least 90%) to detect associations at OR > 1.2 for a one SD change in the aforementioned risk factors (Supplementary Fig. 5). The conditional F-statistics (to quantify instrument strength of our combined instrument) in the MVMR setting for each tested combination of models are shown in Supplementary Table 5.
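As a rough illustration of how variance explained translates into instrument strength, the sketch below applies a commonly used approximation, F = R²(n − k − 1) / (k(1 − R²)), for k instruments and n participants. The function name is hypothetical, and pairing the reported 13.95% with the MPB discovery sample size of 227,354 is an assumption made for illustration only; the values actually used are those in Table 1.

```python
def approx_f_statistic(r_squared, n_samples, n_instruments):
    """Approximate F-statistic for a set of instruments (illustrative only)."""
    if not 0.0 < r_squared < 1.0:
        raise ValueError("r_squared must lie strictly between 0 and 1")
    return (r_squared * (n_samples - n_instruments - 1)) / (
        n_instruments * (1.0 - r_squared)
    )

# Assumed inputs: 444 MPB SNPs explaining 13.95% of variance,
# evaluated at the MPB discovery sample size stated in the Methods.
f_mpb = approx_f_statistic(0.1395, 227_354, 444)
is_strong = f_mpb > 10  # conventional rule-of-thumb threshold
```

Under these assumed inputs the approximate F-statistic is far above the conventional threshold of 10, in line with the strong instruments described above.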

Table 1 Assessment of instrument strength on major exposures of interest for the two-sample MR analyses

Primary MR analyses evaluating the MPB association with risk of skin cancers

For melanoma, none of the risk factors were associated with risk, with OR point estimates close to one (estimated IVW OR between 0.96 and 0.99) (Table 2). For KCs, the estimated OR per SD increase in MPB score was 1.17 (95% confidence interval [CI] 1.08−1.27) for risk of all KC, 1.15 (95% CI 1.05−1.26) for BCC and 1.31 (95% CI 1.17−1.46) for SCC in the UKB. Estimates from QSkin yielded slightly smaller effect sizes, albeit with largely overlapping confidence intervals (Table 2). The fixed-effect meta-analysed MR estimates for MPB on KC outcomes across both cohorts were: KC 1.15 (95% CI 1.06−1.23), BCC 1.15 (95% CI 1.06−1.25), and SCC 1.28 (95% CI 1.15−1.43). We detected no strong association between endogenous sex hormone levels (totalT, freeT, SHBG) and any KC outcomes (Table 2). Results derived using alternative MR models are shown in Supplementary Tables 6–9 for each exposure, respectively. Estimates derived using only SNP instruments that were more robustly associated (p < 1×10−5) with the exposure of interest measured in the independent UKB subset were not meaningfully different (Table 2; scatter plot in Fig. 2).
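The fixed-effect meta-analysis of cohort-specific MR estimates can be sketched as an inverse variance weighted average on the log-OR scale, with standard errors recovered from the reported 95% confidence intervals. In the sketch below the UKB SCC estimate is taken from the text above, whereas the second cohort's values are hypothetical placeholders (the QSkin estimates are reported in Table 2 and are not reproduced here); the helper names are likewise illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile

def log_or_and_se(odds_ratio, ci_lower, ci_upper):
    """Convert an OR and its 95% CI to (log OR, standard error)."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    return math.log(odds_ratio), se

def fixed_effect_meta(estimates):
    """Inverse variance weighted pooling of (beta, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return beta, math.sqrt(1.0 / sum(weights))

ukb_scc = log_or_and_se(1.31, 1.17, 1.46)       # from the text above
other_cohort = log_or_and_se(1.20, 0.95, 1.52)  # hypothetical values

pooled_beta, pooled_se = fixed_effect_meta([ukb_scc, other_cohort])
pooled_or = math.exp(pooled_beta)
pooled_ci = (math.exp(pooled_beta - Z_95 * pooled_se),
             math.exp(pooled_beta + Z_95 * pooled_se))
```

Because the weights are inverse variances, the pooled estimate sits closer to the more precise cohort, which is why the meta-analysed estimates above remain close to the UKB values.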

Table 2 Validation of MR association between risk factors and skin cancer risk using SNP instruments derived from independent non-overlapping subset of UKB
Fig. 2: Comparison of MR effect sizes for MPB on skin cancer risk.

beta_IVW refers to the (fixed effect) inverse variance weighted MR effect estimate; p = corresponding two-sided z-test P-value of the beta_IVW estimate; MPB Male pattern baldness. Each panel illustrates the MR scatter plot for the association between genetically predicted MPB score and the log(OR) on various types of skin cancer. Each point represents a single MPB SNP instrument, with the corresponding horizontal and vertical error bars reflecting its standard error on the genetic association with MPB and skin cancers, respectively. Points in light blue are MPB variants that showed stronger evidence of association in the UKB subset independent of those used to derive the SNP-skin cancer association. The annotated SNPs on each panel refer to the SNPs identified as outliers via the MR-PRESSO outlier test. Apart from melanoma (bottom-right panel, which shows predominantly null findings), the IVW effect estimates derived using MPB variants after excluding the genetic outliers show strong attenuation of effect sizes towards the null. Source data are provided as a Source Data file.

MVMR analyses combining MPB and testosterone levels revealed no evidence that the marginal association between MPB and KCs is influenced by endogenous testosterone levels; marginal OR on KC: 1.16 (95% CI 1.06−1.28); BCC: 1.16 (95% CI 1.04−1.29); SCC: 1.30 (95% CI 1.13−1.48) per SD increase in MPB score (Supplementary Table 10).
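The multivariable estimate can be thought of as a weighted regression of SNP-outcome effects on the SNP effects for several exposures at once, so that each coefficient is conditional on the others. The sketch below solves this for two exposures using the weighted normal equations; it is a simplified illustration with made-up inputs and a hypothetical function name, and it omits the standard errors and conditional F-statistics that a full MVMR analysis would report.

```python
def mvmr_two_exposures(bx1, bx2, by, se_y):
    """Weighted least squares of by on (bx1, bx2), no intercept.

    Weights are 1 / se_y^2. Returns the two conditional effect estimates.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear")
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

# Made-up SNP effects: exposure 1 stands in for MPB, exposure 2 for totalT.
bx_mpb = [0.10, 0.20, 0.30, 0.15]
bx_tt = [0.05, -0.02, 0.10, 0.20]
by_kc = [0.10 * a + 0.30 * b for a, b in zip(bx_mpb, bx_tt)]
se_kc = [0.01, 0.02, 0.01, 0.015]

beta_mpb, beta_tt = mvmr_two_exposures(bx_mpb, bx_tt, by_kc, se_kc)
```

In this toy example the outcome effects are constructed exactly from the two exposures, so the regression recovers the coefficients used to generate them.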

Sensitivity analyses to model the potential pleiotropic role of pigmentation-related SNPs

It is possible that some MPB SNPs are pleiotropically associated with skin cancer risk factors. To address this, we first adopted a non-parametric approach to identify potential pleiotropic variants captured by the MR-PRESSO outlier test28. We detected four potential SNP-outliers in the association between MPB and KC phenotypes (rs2669871 [near KRT75], rs3847069 [near CUX1], rs1805007 [non-synonymous functional SNP in MC1R], rs12203592 [functional SNP in an enhancer for IRF429]); outliers were also confirmed via manual inspection of MR scatter plots (Fig. 3) and funnel plots (Supplementary Figs. 6–9).

Fig. 3: Comparison of MR-derived association between MPB and various univariable and MVMR models.

UniMR − MPB SNPs with p < 1e-5 only: Univariable MR model for MPB on skin cancer risks using only MPB SNPs with association (z-score) two-tailed P value < 1e-5 on MPB in the independent UK Biobank subset. UniMR − excl. outliers: Univariable MR model for MPB on skin cancer risks using all MPB SNPs excluding pleiotropic SNPs detected by MR-PRESSO. MVMR Model 1: MVMR model incorporating MPB, freeT and totalT. MVMR Model 2: MVMR model incorporating MPB, totalT, skin colour and hair colour. The error bars reflect the 95% confidence intervals around each OR estimate. For all MVMR models, the reported OR estimates are the marginal OR estimates for skin cancer per 1 SD increase in MPB upon conditioning on the genetic effect sizes from other traits included in the model. Note that for MVMR Model 2, including freeT in the model weakened the combined instrument for MVMR (conditional F-statistics < 10 for some traits), which might result in weak instrument bias; freeT was therefore not included in the main analysis. However, these findings can be accessed in the Supplementary Tables. Source data are provided as a Source Data file.

PheWAS revealed strong associations between these SNPs and pigmentation traits, including skin colour and ease of skin tanning (Supplementary Table 11; Supplementary Figs. 10–13). Univariable MR analyses support a positive relationship between lighter skin or hair colour and MPB (beta for lighter skin colour on MPB score = 0.28 [95% CI 0.14–0.42] per SD unit increase in MPB; beta for lighter hair colour on MPB = 0.13 [95% CI 0.08–0.18], Supplementary Table 12), suggesting a potential common biological pathway. In our MVMR model incorporating pigmentation variables (hair and skin colour), the marginal association between MPB and KCs (including BCC and SCC separately) showed signs of attenuation towards the null (e.g. OR on KC 1.05 [95% CI 0.98–1.12] adjusted for pigmentation compared with original MVMR OR 1.16 [1.06–1.28]; see Fig. 3). These results suggest that the association between genetic variants associated with MPB and skin cancer body-wide might be driven by a pleiotropic effect on pigmentation. MVMR results for all tested combinations of the five traits (each model satisfying the minimal conditional-F > 10 requirement30) are shown in Supplementary Tables 13–15, each revealing similar marginal effect sizes from MPB.

In our second approach, we removed the four pleiotropic SNPs detected via MR-PRESSO from our instrument set and repeated our analyses. The revised findings from both the univariable and MVMR models revealed no clear evidence for an overall genetic association between MPB and skin cancers, apart from the association with SCC (MPB-SCC univariable MR estimate: OR 1.17 [95% CI 1.07–1.28], p = 8.06×10−4; MVMR estimate: OR 1.15 [95% CI 1.02–1.28], p = 0.02); see Figs. 2 and 3. We observed a large reduction in the univariable MR model's Cochran's Q statistics following exclusion of the four pleiotropic SNPs, indicating that the genetic heterogeneity among the SNP effect sizes on KC was much smaller in the revised model (see Supplementary Tables 16 and 17). The revised MR estimates from alternative MR models had 95% CIs that largely overlapped with those from the IVW MR model excluding the SNP-outliers detected by MR-PRESSO (Supplementary Table 18), indicating minimal evidence of horizontal pleiotropy bias in our findings.
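The heterogeneity statistic referred to above can be sketched as follows: Cochran's Q sums the weighted squared deviations of each SNP's Wald ratio from the IVW estimate, so a single strongly pleiotropic SNP can inflate it substantially. The function name and the data below are made up for illustration, and the weights use the same first-order approximation as a fixed-effect IVW model.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q for SNP-specific Wald ratios around the IVW estimate.

    Returns (Q, degrees of freedom). Illustrative helper only.
    """
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - beta_ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

# Three concordant SNPs plus one outlier with a much larger outcome effect.
bx = [0.10, 0.20, 0.30, 0.10]
by = [0.02, 0.04, 0.06, 0.10]
se = [0.01, 0.01, 0.01, 0.01]

q_all, df_all = cochran_q(bx, by, se)
q_trimmed, df_trimmed = cochran_q(bx[:3], by[:3], se[:3])  # outlier removed
```

In this toy example, removing the outlying SNP brings Q down to essentially zero, mirroring the qualitative pattern described for the revised model.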

Stratified MR analysis between MPB and skin cancers by body sites

Melanoma

For cutaneous melanoma, we obtained primary body-site-specific cancer data from the UKB and the Melanoma Institute Australia to investigate evidence for a primary site-specific association between MPB and melanoma. A higher genetic predisposition towards MPB was associated with increased risk of head and neck melanoma (meta-analysed OR 1.31 [95% CI 1.07–1.61] in the original model; OR 1.23 [95% CI 1.00–1.50] in the outlier-robust model). MPB was not associated with cutaneous melanoma at other body sites (Table 3). Similar positive findings were obtained when we evaluated melanoma at the scalp region specifically (estimated OR 1.66 [95% CI 0.99–2.80]), but with signs of attenuation towards the null in the outlier-robust model estimate (OR 1.33 [0.79–2.25]) (Supplementary Table 19). Further splitting the association analysis by evaluating scalp melanoma and head and neck melanoma excluding the scalp region separately yielded very similar findings, indicating that the overall MR association between MPB and head and neck melanoma is primarily driven by melanoma on the scalp region (Supplementary Table 19). We also found no evidence that the MPB-melanoma association differed by Breslow thickness, with the 95% confidence intervals on the OR estimates for thick (OR 1.12 [0.89–1.39]) and thin (OR 1.04 [0.82–1.33]) melanoma largely overlapping (see Supplementary Table 20 for the ORs derived from alternative MR models).

Table 3 Comparison of estimated MR body-site-specific association between MPB and skin cancer in the UK and Australian population

Keratinocyte cancers

Combining data for site-specific BCC and SCC separately from the UKB and the QSkin cohort, we found limited evidence for an association between MPB and head/neck BCCs (OR 1.08 [0.99–1.17]) or SCCs (OR 0.92 [0.80–1.05]) using pleiotropy-robust MR models. The estimates for all other body sites were largely consistent with a null effect. MR estimates derived using alternative MR models show consistent effect sizes with the IVW estimate, albeit with much lower precision (Supplementary Table 19).

Discussion

Melanomas and skin cancers occur with unequal distributions across the body surface and, depending on the country, at higher rates amongst men. While this distribution may be explained by patterns of sun exposure, it has also been postulated that differences in hair covering5,31, and possibly other hormonal factors5, may contribute. We took advantage of several large genetic cohorts that collected site-specific melanoma and skin cancer data, and first showed that testosterone does not play a role in skin cancer susceptibility, a finding supported by our site-specific MR analyses indicating a potentially increased risk of scalp melanoma among individuals with a high genetic risk for balding. We presented evidence for an association between MPB and risk of SCC, and finally used MVMR to provide insights into the mechanisms mediating the link between MPB and these skin cancers.

In Australia, the 2015 age-standardised rates of melanoma per 100,000 were higher for men (63.1) than for women (42.0)32. Whilst this trend is also observed in the United States, New Zealand and Canada, the differences were lower or negligible in other countries (e.g. United Kingdom, Denmark and Sweden), and varied by primary body site5. Arguments for the involvement of sex hormones in melanoma have primarily stemmed from the observed difference in disease prevalence between men and women, prompting an investigation into potential sex-specific mechanisms that stimulate the proliferation of melanoma cells. Animal studies and in-vivo experiments have suggested that products of testosterone (i.e. progesterone) inhibit melanoma cell growth in a dose-dependent fashion, but there are very few large-scale epidemiological studies exploring this. Of note, our genetic-based approach failed to replicate the association between free testosterone and melanoma susceptibility reported in a recent cross-sectional study19 using data from the UK Biobank. Here we found no evidence for a causal role between endogenous sex hormone levels (SHBG, freeT, totalT) and skin cancers. Interestingly, our multivariable MR model which enables the estimation of marginal effect sizes for these risk factors on cancer outcomes (upon conditioning on the other competing trait) yielded similar conclusions of a null association between testosterone and skin cancers.

Among the three skin cancers evaluated, SCC showed the strongest association with MPB, followed by BCC. However, we found that the positive association between MPB and risk of KCs was almost completely driven by a functional SNP in the IRF4 locus (rs12203592). This variant was also previously known to be associated with pigmentation and immune response29,33,34,35,36, thus may influence risk of skin cancers through pathways other than its impact on MPB. Disentangling the observed association between MPB with skin cancers requires careful consideration of both potential causal and pleiotropic mechanisms in play. While the practice of excluding heterogeneous variants and SNP outliers to mitigate potential horizontal pleiotropy bias in the outcomes is a valid approach, we adopted an alternative strategy. Specifically, we chose to model these pleiotropic associations with other traits using MVMR37. This decision was motivated by our aim to gain a more comprehensive understanding of potential mediators and genetic confounders that may underlie the relationship between MPB and the development of skin cancers.

In our investigation, we demonstrate how pleiotropic variants, such as the detected SNP-outlier rs12203592 located in an enhancer for the IRF4 gene, may help explain the observed relationship between MPB and skin cancers (see Fig. 2). For instance, the role that IRF4 plays in transforming pigmented terminal hair into unpigmented vellus hair is widely established29,34, suggesting a potential causal role for pigmentation on MPB. rs12203592 has been consistently identified by prior GWAS of nevus count, hair colour, development of freckles and skin pigmentation, all of which are established risk factors for skin cancers10,29,38,39. Due to the strong magnitude of association between variants in IRF4 and both hair loss and pigmentation, it is difficult to determine the type of genetic pleiotropy (vertical or horizontal) exerted by this variant on MPB and skin cancers (i.e. whether the association of IRF4 with skin cancers acts through changes in baldness). When pigmentation variables were included in our MVMR model, the association between MPB and KC weakened, though this was also observed when we excluded IRF4 from the revised univariable MR models. Hence, the association between MPB and overall KC is likely capturing an indirect influence of IRF4 on both pigmentation-related variables (e.g. skin colour, nevus counts) and risk of balding (although the mode of pleiotropy cannot be reliably determined), through potential pleiotropic effects on autoimmune functions33, which were not characterised in the present analysis.

As our MR instruments for MPB explain large proportions of phenotypic variance, we had reasonable power to revisit previous observational findings on primary site-specific skin cancer associations with MPB, or at the very least, to exclude very large OR effect sizes. Our site-specific MR analysis revealed two key findings. Firstly, the estimated effect size on melanoma located in the head and neck region (based on the IVW model excluding SNP outliers) was larger than those at other body sites for both the MIA and the UKB cohorts (Table 3), supporting the role of balding in exposing the scalp area of the skin to UVR. Taken together with our MR findings on testosterone, this suggests that testosterone has no direct role in skin cancer formation (i.e. its only role is to remove hair, the natural protection from UV). We observed the same trend for head and neck BCC (albeit with lower precision), where the magnitude of association was slightly higher for the QSkin cohort compared to the UK Biobank study (Table 3), making it unclear whether these MPB-BCC associations varied between populations in low-UVR (UK) and high-UVR (Australia) geographical regions. For melanoma, we found a strong association between MPB and melanoma at the head and neck region, which, combined with MPB being specific to men, might help explain the higher prevalence of head and neck melanoma in men as previously reported. However, the estimated effect size between balding and melanoma/KC at the head and neck region (predominantly scalp melanoma) in our datasets (e.g. equivalently head and neck melanoma OR 1.72 [1.14–2.59] per 2 SD increase in MPB score) appeared to be much more modest than that previously reported by Li and colleagues (scalp melanoma HR 7.15 [1.29–39.42] for highest balding category vs. none)13, though our estimates had much greater precision. Another potential explanation may be that observational associations between balding and skin cancers are amplified by ascertainment and detection bias, where tumours at the scalp or forehead region are more apparent among patients with MPB.
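The per-2-SD figure quoted above appears to follow from rescaling the per-SD estimate reported in the Results (head and neck melanoma OR 1.31 [95% CI 1.07–1.61]) on the log-odds scale, which amounts to squaring the OR and its confidence limits. The short sketch below reproduces this arithmetic; the function name is hypothetical, and the assumption is that a simple linear rescaling of the log OR was used.

```python
import math

def rescale_or(or_per_sd, sd_units):
    """Rescale an OR per 1 SD to an OR per `sd_units` SD (log-linear)."""
    return math.exp(sd_units * math.log(or_per_sd))

# Per-SD estimate and 95% CI limits from the Results section.
per_sd = (1.31, 1.07, 1.61)
per_2sd = tuple(round(rescale_or(value, 2), 2) for value in per_sd)
# per_2sd == (1.72, 1.14, 2.59), matching the values quoted in the text
```

Expressing the MR estimate over a wider contrast in MPB score makes it somewhat more comparable to the categorical hazard ratio (highest balding category vs. none) reported in the observational study, although the two scales are not strictly equivalent.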

This study has several notable strengths compared with previous studies of a similar nature. Firstly, the proportion of phenotypic variance explained by our genetic instruments was high (r2 estimates 0.04–0.15; Table 1), attributable to the largely heritable nature of both MPB and sex hormone phenotypes. Combined with the very large sample sizes for melanoma and KCs (approximately 2–4 times larger than those previously reported in any MR findings40,41,42), our statistical power to detect even subtle effect sizes (e.g. OR < 1.3 per SD change in exposure trait) was greatly improved. The polygenic basis of both exposures also enables better assessment of horizontal pleiotropy through various alternative MR techniques, ensuring that our main findings were not severely biased by violation of MR assumptions. In addition, MVMR enables triangulation of a mediation effect for the association between testosterone, or other putative biological pathways examined in our sensitivity analyses, and skin cancers. Finally, we used MR to interrogate whether the association between MPB and skin cancers differed by body site, to compare with previous observational findings.

There are also some limitations to our study. The grading of MPB in the UKB was acquired through participants' self-report; hence we cannot exclude the possibility of self-reporting biases arising from negative social stigma43. In practice, the misclassification of MPB scores is unlikely to be directly associated with skin cancer outcomes, and would therefore generally only reduce the statistical power for (genetic) instrument identification. Hence, technical differences arising from alternative classifications of MPB are unlikely to be a concern in our MR study. The melanoma GWAS meta-analysis used to derive our MR findings on melanoma included participants from the UKB (n = 7782 men), making up ~25% of the total number of cases included in the meta-analysis. However, given that the proportion of overlapping samples is less than 30%, any bias generated in the MR estimate is likely limited44. Moreover, we repeated our analyses using only subsets of the UKB participants not involved in the skin cancer GWASs and showed that our findings remained broadly unchanged. Our MR findings assume a linear relationship between balding and cancers, and hence might under-estimate the true effect size if the dose-response relationship violates the linearity assumption45. Whilst our analysis of the MIA cohort revealed no evidence that the MPB-melanoma association differed by Breslow thickness, we were unable to repeat this analysis in the UK Biobank as information on tumour thickness has not been recorded. Our subgroup analyses revealed evidence that this association is driven by melanoma in the scalp. Whilst this supports our inference on increased UV exposure, we did not have the necessary information or sample size to replicate this stratified MR analysis in the UK Biobank and QSkin cohort. Finally, further external validation will be required to assess whether these genetic instruments can be applied to probe the balding-skin cancer relationship in non-European ancestries.

Our approach of efficiently incorporating outlier information as candidate traits in our MVMR model is inspired by the MR-TReasure Your eXceptions (MR-TRYX) framework37. The key difference here, however, is that we restricted our candidate traits to those involved in pigmentation, whilst the MR-TRYX model exhaustively examines all possible risk factors through a generic PheWAS platform. There may be more optimal candidates, such as immunological factors (e.g. eosinophil count, see Supplementary Table 11), which we did not consider in our approach. In practice, we are largely limited by the number of traits we can simultaneously model in an MVMR framework, because the conditional F-statistic requirement becomes harder to satisfy as the number of traits in the model increases (the distribution of trait-specific conditional F-statistics in all our tested MVMR models is shown in Supplementary Table 5). Finally, our sample sizes in the body site-stratified analyses were limited and were only drawn from regions at the two extremes of very low (UK) and very high (Australia) ambient UV radiation. It remains unclear whether our findings can be generalised to other populations. Hence, replicating our site-specific findings in populations with moderate UV radiation will help establish their generalisability.

In conclusion, genetic evidence in this study provides minimal support to the androgen-driven hypothesis linking sex hormones to the development of melanoma and keratinocyte cancers. Pigmentation-related factors very likely mediate the genetic relationship between balding and KCs at all body sites, evident through MVMR findings. Finally, we observed a modest body-site specific association between MPB and both the risk of melanoma and KCs involving the head and neck region greater than other body sites, suggesting that balding might increase susceptibility for melanoma around the head and neck region through reduced hair covering, a potential explanation for sex-differences in head and neck melanoma risk between men and women.

Methods

Ethical approval and patient consent

The UK Biobank study has been formally approved by the UK Biobank Ethics Advisory Committee. The QSkin and AGDS studies (the latter used as controls for the MIA case-control GWAS) have been formally approved by the QIMR Berghofer Human Research Ethics Committee. The MIA study was approved by the Sydney Local Health District Ethics Review Committee at the Royal Prince Alfred Hospital in Sydney, Australia. This research project is approved by the QIMR Berghofer Human Research Ethics Committee. The complete list of HRECs involved in the individual studies contributing to the melanoma GWAS meta-analysis can be found in Landi et al.27 (see Supplementary Notes 1). Written informed consent was obtained from all participants in all studies. The authors confirm that patient data have been obtained according to the terms and conditions of the databases from which the data were sourced. More specifically, we have obtained specific permission from the MIA investigators to utilise anonymised genetic and patient data from the MIA study for this project.

Construction of genetic instruments from the UKB

UKB is a large population-based cohort consisting of predominantly middle-aged (at the time of recruitment) white British participants recruited in the United Kingdom. Participants were genotyped (genetic QC and details available elsewhere25) and had extensive phenotypic information collected ranging from self-reported diet and lifestyle behaviour to measurements of disease biomarkers, mental health, medical history and cancer diagnosis. GWAS findings for both MPB and endogenous testosterone levels derived from UKB have been previously reported10,11,12.

MPB

Data for MPB (UKB-field ID 2395) were available for 227,354 male participants. In the UKB, MPB was self-reported and defined using a 4-point scale of increasing hair loss (see Supplementary Notes 2). We rank-transformed the ordinal MPB scales into standardised Z-scores. Analyses were restricted to only male participants of white British ancestry inferred through ancestral principal component (PC) clusters. We performed two GWASs on the rank-transformed MPB score using: i) all 227,354 available participants for instrument discovery; and ii) the subset of only 90,577 participants not overlapping the UKB KC GWASs.
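One plausible way to rank-transform an ordinal score into Z-scores is a rank-based inverse normal transformation, sketched below. The exact transformation used in this study is not specified in the text, so the averaging of tied ranks, the (rank − 0.5) / n offset and the function name are all assumptions made for illustration.

```python
from statistics import NormalDist

def rank_based_z_scores(values):
    """Map values to normal quantiles of their (tie-averaged) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]

# Toy example on a 4-point scale (1 = no hair loss, 4 = most hair loss).
mpb_scale = [1, 1, 2, 3, 4, 4]
mpb_z = rank_based_z_scores(mpb_scale)
```

With only four categories, participants in the same category share a single Z-score, so the transformed variable remains ordinal in nature and is only approximately standardised.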

Sex hormones

Endogenous serum total testosterone level (totalT) and sex-hormone-binding globulin (SHBG) were measured as part of the 2019 UKB biochemistry data release. Although publicly available GWAS summary statistics for total and free testosterone for UKB participants are available11, we adopted a similar phenotype definition to those from Ruth et al.11 for the derivation of testosterone levels (including bioavailable/free-testosterone [freeT]). The derivation for serum freeT using serum totalT, albumin and SHBG in UKB is described in Supplementary Notes 3. Each GWAS on sex hormones was performed in individuals of white British ancestry using BOLT-LMM v2.3, a linear mixed model GWAS framework that accounts for population structure and cryptic relatedness among samples46. We fitted recruitment age, genotype array and the first 20 ancestral PCs as standard covariates for all analyses.

Re-estimation of SNP effect sizes free from sample overlap

To obtain MR estimates that are not biased by sample overlap in the two-sample setting44, we repeated the GWAS analyses for MPB and sex hormones using only the subset of UKB participants independent of those used for the KC GWASs (see below, also Supplementary Notes 4).

Sex-stratified GWAS summary statistics for melanoma

All cutaneous melanoma

We used summary statistics from a recent male-only fixed-effects GWAS meta-analysis of cutaneous melanoma27. All cases were clinically confirmed cutaneous melanoma. While the majority were invasive melanoma, specific histological subtype data were not available for all sets, so the GWAS meta-analysis will include a small subset of cases with in situ melanomas. Full details of the participating studies, analyses, collection of informed consent and ethical approval have been previously reported27. Briefly, the following standard GWAS cleaning procedures were performed: exclusion of samples that (i) had >3% missingness, (ii) had abnormal genotype heterozygosity, (iii) were European ancestry outliers or (iv) were related to another sample at identity-by-descent (IBD) PI_HAT > 0.15. SNPs with a minor allele frequency (MAF) < 0.01 or with Hardy-Weinberg equilibrium (HWE) P < 5 × 10−4 in controls or 5 × 10−10 in cases were removed prior to imputation; post-imputation analyses were restricted to SNPs with a MAF > 0.005 and imputation quality score > 0.3. A fixed-effects inverse-variance weighted meta-analysis of the melanoma risk GWASs was then performed (each GWAS was modelled via logistic regression, including PCs or equivalent control for residual population stratification). The final male-only melanoma risk GWAS included 12,232 cases and 20,566 controls. The distribution of cases across each study is shown in Supplementary Table 1.
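For readers unfamiliar with the fixed-effects inverse-variance weighted model, a minimal sketch for a single SNP is given below. It shows only the standard textbook formula, not the pipeline of the original meta-analysis27; the function name and the example effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study log(OR) estimates for one SNP.

    Each study is weighted by the inverse of its sampling variance; the
    pooled standard error is the square root of the inverse total weight.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se


# Hypothetical per-study estimates for one SNP (log odds ratios).
pooled_beta, pooled_se = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
```

Because the two hypothetical studies are equally precise, the pooled estimate is their simple mean and the standard error shrinks by a factor of the square root of two.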

Site-specific primary cutaneous melanoma

For the site-specific analyses, we obtained site-specific histopathologically confirmed primary cutaneous melanoma diagnoses for men from the UKB and the MIA. In UKB, we derived site-specific primary melanoma diagnoses from ICD-10 codes C43.0-9 (UKB field-ID: 41270). For MIA, melanoma cases with primary site-specific diagnoses and GWAS data were drawn from two complementary data repositories, the MIA Biospecimen Bank (protocol HREC/10/RPAH/530) and the Melanoma Research Database (protocol HREC/11/RPAH/444). Among the 2,236 men diagnosed with melanoma, more than 99.5% (2227/2236) of the cases were (or had progressed to) invasive melanoma. Site-specific primary melanoma cases were then matched against 3,176 melanoma-free controls from the Australian Genetics of Depression Study (AGDS)47 for the site-specific melanoma GWAS analysis. Details of the definition of anatomical primary site (body site) categories, data consent and site-specific GWAS analysis in both the UKB and MIA study are available in Supplementary Notes 1, with case distribution across body sites reported in Supplementary Tables 2 and 3. To investigate whether the association differed by Breslow thickness, we additionally performed a stratified MR analysis contrasting thick and thin melanoma (see Supplementary Notes 1 under study description on the MIA cohort for the adopted definitions for thick and thin melanoma).

Sex-stratified GWAS summary statistics for keratinocyte cancers

KCs in UK Biobank

We performed sex-stratified GWAS analyses for KC using participants from UKB. Using the case definitions as per previous work48, we identified 3,483 invasive SCC and 10,718 invasive BCC cases confirmed through ICD-10 and histology records (UKB Field-ID: 40006, 40013) among UKB male participants ascertained through linkage of participant health records to national cancer registries. Men with no prior history of KC or any other cancer diagnosis (n = 96,620) were used as controls. Our analyses were restricted to individuals of white British ancestry identified through self-report and clustering approaches on ancestral principal components49. We performed the GWAS analyses for BCC and SCC using SAIGE50, a software package that implements generalised linear mixed models for binary traits/outcomes, accounting for case-control imbalance and cryptic relatedness. We also performed a GWAS of a combined KC phenotype (cases = 13,463; controls = 96,620) to leverage the shared genetic architecture between SCC and BCC for improved power48.

KCs in QSkin

In total, 4,049 men with post-quality-control genetic data were clinically diagnosed with KC, with 1,064 and 502 cases identified to have invasive BCC and invasive SCC based on pathological records, respectively. Healthy controls were selected from participants screened/self-reported to have no history of KC or actinic keratoses at the time of recruitment26. The cleaning and quality control of the genetic data for the QSkin cohort have been previously described27. We performed a sex-stratified GWAS for KC, BCC and SCC, including only male participants of white European ancestry using SAIGE50, adjusting for recruitment age and the first ten ancestral principal components.

Site-specific KCs

Additional body-site-specific GWASs for KCs in men were conducted in the UKB and QSkin cohort. Similarly, these analyses were restricted to those of white European ancestry. For UKB, body-site-specific diagnoses of BCCs and SCCs were obtained through ICD-10 codes C44.0-7. For the QSkin cohort, site data of the KCs were manually recovered through histopathological records. To enable comparison between the two cohorts, we collapsed the site-specific cancers into 4 broad categories (similar to those used for the melanoma analysis): head and neck, upper limb, lower limb and trunk region. Details of the defined categories and the number of cases and controls included for each body site are also presented in Supplementary Notes 5 and Supplementary Tables 24.

Univariable Mendelian randomisation analysis

Prior to the MR analysis, we harmonised the exposure (totalT, freeT, SHBG, MPB) and outcome (skin cancer) datasets to align alleles and discard palindromic SNPs whose strand could not be inferred from allele frequencies (MAF > 0.3). Multi-allelic variants and variants with inappropriate standard errors (i.e. >1 decimal place smaller than those approximated via sample size and minor allele frequency51, for outcome datasets) were also excluded.
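The harmonisation logic can be sketched for a single biallelic SNP as below. This is a simplified illustration rather than the routine used in our analysis: the dictionary keys (`ea`, `oa`, `eaf`, `beta`), the function name and the example values are hypothetical, and only A/C/G/T alleles are handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonise(exp, out, maf_cutoff=0.3):
    """Return the outcome beta aligned to the exposure effect allele.

    exp and out are dicts with keys 'ea' (effect allele), 'oa' (other
    allele) and 'eaf' (effect allele frequency); out also has 'beta'.
    Returns None when the SNP should be discarded.
    """
    ea, oa = exp["ea"], exp["oa"]
    beta, eaf = out["beta"], out["eaf"]
    if COMPLEMENT[ea] == oa:  # palindromic SNP (A/T or C/G)
        if min(exp["eaf"], 1 - exp["eaf"]) > maf_cutoff:
            return None  # frequency too close to 0.5 to infer the strand
        if (out["ea"], out["oa"]) == (oa, ea):
            beta, eaf = -beta, 1 - eaf  # relabel to the exposure allele
        elif (out["ea"], out["oa"]) != (ea, oa):
            return None  # incompatible alleles
        # Allele labels cannot reveal the strand here, so compare frequencies.
        if (eaf < 0.5) != (exp["eaf"] < 0.5):
            beta = -beta
        return beta
    # Non-palindromic SNP: try the reported strand, then the opposite strand.
    reported = (out["ea"], out["oa"])
    flipped = (COMPLEMENT[out["ea"]], COMPLEMENT[out["oa"]])
    for a1, a2 in (reported, flipped):
        if (a1, a2) == (ea, oa):
            return beta
        if (a1, a2) == (oa, ea):
            return -beta
    return None  # incompatible alleles


# Hypothetical example: outcome GWAS reports the opposite effect allele.
exposure_snp = {"ea": "A", "oa": "G", "eaf": 0.20}
outcome_snp = {"ea": "G", "oa": "A", "eaf": 0.80, "beta": 0.10}
aligned_beta = harmonise(exposure_snp, outcome_snp)
```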

MPB SNP instruments were clumped to remove variants in linkage-disequilibrium (LD) (window = 10 megabase [Mb], max LD r2 = 0.001 in PLINK v1.96b52) to ensure strict independence. To control for bias from potentially weak instruments, we additionally curated instrument sets for MPB, totalT, freeT and SHBG to consist only of SNPs that were robustly associated with the corresponding traits at p < 1×10−5 in the independent UKB subset (i.e. no sample overlap with UKB KC GWAS). Statistical power for MR was assessed based on the proportion of variance explained by these instruments (Supplementary Notes 6). The inverse variance-weighted (IVW) estimator was then used to derive the log(odds ratio) (log[OR]) estimates for skin cancers for a standard deviation (SD) increase in the exposure (e.g. genetically predicted endogenous testosterone levels). For SCC, BCC and KC outcomes (and the corresponding site-stratified analyses), we combined the MR log(OR) estimate derived from QSkin and UKB through a fixed-effect inverse variance-weighted model, using the rmeta package (v3.0) in the statistical software R.
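To make the IVW estimator concrete, the sketch below computes the fixed-effect IVW estimate from harmonised summary statistics. It uses the standard first-order weights (inverse variance of the SNP-outcome association) and is not a substitute for the R packages used in our analyses; the function name and the input values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW MR estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical instruments: exposure effects in SD units, outcome in log(OR).
log_or, log_or_se = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
odds_ratio = math.exp(log_or)  # odds ratio per SD increase in the exposure
```

The per-cohort log(OR) estimates obtained this way can then be pooled across QSkin and UKB with an inverse-variance weighted fixed-effect model.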

Alternative MR techniques (MR-PRESSO, MR weighted median, MR weighted mode, MR-Egger, MR-Robust) that relax assumptions regarding horizontal pleiotropy and invalid instruments were also applied to ensure the robustness of our MR inferences28,53,54,55. The specific strengths and limitations of each of these methods are reported in Supplementary Notes 7. All primary MR analyses were performed using the MendelianRandomization R package and TwoSampleMR R package v0.4.23 curated by the MR-Base platform56,57. MR scatter plots were used to inspect statistical outliers, alongside the outlier-test implemented directly via MR-PRESSO28 (see below).
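As one example of how these estimators limit the influence of invalid instruments, a sketch of the weighted median point estimate is shown below. It assumes first-order inverse-variance weights and linear interpolation of the weighted empirical distribution of per-SNP ratio estimates; the function name and inputs are hypothetical, and standard errors (usually obtained by bootstrapping) are omitted.

```python
def weighted_median(beta_exposure, beta_outcome, se_outcome):
    """Weighted median of per-SNP ratio estimates (point estimate only)."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    r = [ratios[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    # Position of each ordered ratio on the weighted cumulative scale.
    positions, running = [], 0.0
    for wi in w:
        running += wi
        positions.append((running - 0.5 * wi) / total)
    if positions[0] >= 0.5:
        return r[0]
    for k in range(1, len(r)):
        if positions[k] >= 0.5:
            # Interpolate linearly between the two ratios bracketing 50%.
            frac = (0.5 - positions[k - 1]) / (positions[k] - positions[k - 1])
            return r[k - 1] + (r[k] - r[k - 1]) * frac
    return r[-1]


# Hypothetical data: two concordant instruments and one pleiotropic outlier.
wm = weighted_median([0.1, 0.1, 0.1], [0.01, 0.02, 0.50], [0.01, 0.01, 0.01])
```

In this toy example the outlying third instrument has almost no effect on the weighted median, whereas it would dominate an IVW estimate.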

Modelling the influence of androgenic and other pleiotropic pathways

We explored the possibility that some MPB-associated SNPs might exert pleiotropic effects that modify the risk of developing skin cancers. However, simply excluding variants with potentially large (horizontally) pleiotropic effects on the outcome is less efficient when information linking the outliers to other pleiotropic pathways can instead be incorporated via MR. Incorporating this information allows us to understand potential mediators and/or genetic confounders of the exposure-outcome relationship in MR37. Here we outline two complementary approaches; full details of these models are in Supplementary Notes 5. We first applied the MR-PRESSO28 outlier test on MPB vs. skin cancers to identify potential outliers that can bias the MR-IVW results. We then performed phenome-wide association studies (PheWAS) to evaluate the association between the set of potentially pleiotropic SNPs and a series of pigmentation/skin-related phenotypes likely to confound any associations between MPB and skin cancer (see Supplementary Notes 8). This was done by querying the candidate SNPs against two GWAS databases, MRC-IEU OpenGWAS and the OpenTarget platform56,58,59. Based on the potential candidate SNP-trait association, we then assessed putative mediation mechanisms driving the association between MPB and skin cancer via the candidate trait. We first attempted to validate the relationship in a univariable MR framework, followed by a multivariable MR (MVMR) analysis incorporating the candidate trait alongside testosterone and MPB on skin cancer (see Fig. 1).

In our second approach, we manually excluded the set of pleiotropic variants identified in the MR-PRESSO outlier test altogether and repeated the univariable and MVMR analyses. Curation of the genetic instruments for the MVMR analysis and the selection criteria for traits to achieve strong instrument strength for the analysis are described in Supplementary Notes 9. The MVMR association analyses were performed using the mv_multiple() function from the TwoSampleMR R package24,56.
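The idea behind the multivariable IVW model can be sketched as a weighted regression of SNP-outcome effects on the SNP effects for several exposures, without an intercept. The two-exposure example below illustrates that regression only and does not reproduce the internals of mv_multiple(); the function name and all numbers are hypothetical.

```python
def mvmr_ivw_two_exposures(bx1, bx2, by, se_y):
    """Direct effects of two exposures on the outcome via weighted least squares.

    Solves the 2x2 normal equations of a no-intercept regression of by on
    (bx1, bx2) with weights 1 / se_y^2.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear across instruments")
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


# Hypothetical SNP effects on two exposures (e.g. MPB and a candidate trait).
bx_a = [0.10, 0.20, 0.05, 0.30]
bx_b = [0.20, 0.10, 0.30, 0.05]
# Outcome effects constructed so the true direct effects are 0.3 and -0.2.
by = [0.3 * a - 0.2 * b for a, b in zip(bx_a, bx_b)]
direct_a, direct_b = mvmr_ivw_two_exposures(bx_a, bx_b, by, [0.01] * 4)
```

Each coefficient is interpreted as the effect of one exposure on the outcome conditional on the other exposure in the model.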

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.