A multi-phenotype analysis reveals 19 susceptibility loci for basal cell carcinoma and 15 for squamous cell carcinoma

Seviiri, Mathias; Law, Matthew H.; Ong, Jue-Sheng; Gharahkhani, Puya; Fontanillas, Pierre; Olsen, Catherine M.; Whiteman, David C.; MacGregor, Stuart

doi:10.1038/s41467-022-35345-8

Download PDF

Article
Open access
Published: 10 December 2022

A multi-phenotype analysis reveals 19 susceptibility loci for basal cell carcinoma and 15 for squamous cell carcinoma

Nature Communications volume 13, Article number: 7650 (2022) Cite this article

2938 Accesses
10 Citations
11 Altmetric
Metrics details

Subjects

Abstract

Basal cell carcinoma and squamous cell carcinoma are the most common skin cancers, and have genetic overlap with melanoma, pigmentation traits, autoimmune diseases, and blood biochemistry biomarkers. In this multi-trait genetic analysis of over 300,000 participants from Europe, Australia and the United States, we reveal 78 risk loci for basal cell carcinoma (19 previously unknown and replicated) and 69 for squamous cell carcinoma (15 previously unknown and replicated). The previously unknown risk loci are implicated in cancer development and progression (e.g. CDKL1), pigmentation (e.g. TPCN2), cardiometabolic (e.g. FADS2), and immune-regulatory pathways for innate immunity (e.g. IFIH1), and HIV-1 viral load modulation (e.g. CCR5). We also report an optimised polygenic risk score for effective risk stratification for keratinocyte cancer in the Canadian Longitudinal Study of Aging (794 cases and 18139 controls), which could facilitate skin cancer surveillance e.g. in high risk subpopulations such as transplantees.

Multi-ancestry genome-wide meta-analysis identifies novel basal cell carcinoma loci and shared genetic effects with squamous cell carcinoma

Article Open access 05 January 2024

Genome-wide association meta-analyses combining multiple risk phenotypes provide insights into the genetic architecture of cutaneous melanoma susceptibility

Article 27 April 2020

Genome-wide association study of actinic keratosis identifies new susceptibility loci implicated in pigmentation and immune regulation pathways

Article Open access 21 April 2022

Introduction

Keratinocyte cancers (KC), including basal cell carcinoma (BCC) and squamous cell carcinoma (SCC), are the most commonly diagnosed cancers globally. KC resulted in over 5.4 million diagnoses and $8 billion dollars in expenditure in the US in 2011 alone¹, while in Australia, they account for >24% of all-cancer diagnoses², and impose a huge economic burden on the health sector costing over AUD $700 million for treatment annually³. KC is responsible for up to 8700 deaths a year in the United States⁴. The relative rates and morbidity, from KC, is even higher in Australia⁵. BCC and SCC share many common risk factors, including sun exposure, skin and hair pigmentation and immunosuppression.

Skin cancers, and pigmentation traits and autoimmune diseases have several susceptibility genes overlapping^6,7,8,9. For example, several variants in pigmentation genes ASIP/RALY, IRF4, MC1R, OCA2, SLC45A2 and TYR, are associated with BCC, SCC and melanoma^8,10. Shared immune-regulatory genes in the HLA and LPP regions have been found to influence susceptibility to BCC, SCC, melanoma and autoimmune diseases such as rheumatoid arthritis, vitiligo, type 1 diabetes and psoriasis^6,7,8,9. There are also some tumour-genesis-related genes, which are expressed in both KC and other non-skin cancers. For example, oncogene TNS3, which is overregulated in BCC, is also associated with breast, lung and prostate cancers^8,11,12. Furthermore, HAL at 12q23.1 has been found to be associated with KC risk¹³ as well as vitamin D levels¹⁴. However, standard single GWAS meta-analysis approaches are unable to utilise this multi-trait genetic overlap to further explore the genetic risk for BCC, and SCC.

Multivariate GWAS approaches, such as multi-trait analysis of GWAS (MTAG)¹⁵, can draw on this overlapping genetics to identify new risk regions (here for BCC or SCC). MTAG is a generalisation of inverse-variance-weighted meta-analysis that importantly accounts for incomplete genetic correlation, and sample overlap, between GWAS. A key property of MTAG is that it outputs estimates of trait-specific effect sizes and p-values for each of the input traits—in this case BCC or SCC. We have previously used MTAG to identify loci for KC based on the genetic correlation between BCC and SCC only¹³. BCC and SCC are different in terms of polygenicity and aetiology and therefore, we sought to identify susceptibility genetic loci for BCC and SCC by exploring their genetic overlap with melanoma, pigmentation traits, autoimmune diseases, and blood biochemistry biomarkers in a multi-phenotype analysis of GWAS.

In this work, we show that BCC and SCC have a high genetic correlation with melanoma, pigmentation traits, autoimmune diseases, and blood biochemistry biomarkers. We use MTAG to leverage this genetic overlap and identify 78 and 69 independent genome-wide significant susceptibility loci for BCC and SCC, respectively; 19 BCC and 15 SCC loci are both previously unknown and replicated in a large independent cohort. The previously unknown risk loci are implicated in BCC/SCC development and progression, pigmentation, cardiometabolic pathways, and immune-regulatory pathways, including; innate immunity, HIV-1 viral load modulation and disease progression. We also report a optimised BCC polygenic risk score (PRS) that enables effective risk stratification for KC.

Results

Genetic correlation

Using linkage disequilibrium score (LDSC) regression¹⁶, 20 phenotypes were significantly genetically correlated (P < 0.05, rg > 10%) with either BCC or SCC (Fig. 1 and Supplementary Data 1). In the first instance, 35 phenotypes that we considered as possibly correlated with skin cancer (including body mass index) were excluded for not meeting the aforementioned criteria above (Supplementary Data 2). Using the same selection criteria, no additional new phenotypes were included following analysis using collated GWAS summary statistics (over 700 phenotypes) in the LD hub database¹⁷. In total, subsequent analyses included 22 genetically correlated traits; cancers; BCC and SCC GWAS from the UK Biobank (UKB)^18,19, a cutaneous melanoma GWAS meta-analysis²⁰, KC from the QSkin Sun and Health Study (QSkin)²¹, KC from the Electronic Medical Records and Genomics Network (eMERGE) cohort^22,23 and all-cancer from the Resource for Genetic Epidemiology Research on Aging (GERA) cohort;²⁴ skin and hair pigmentation related traits; skin burn type (QSkin), red hair (QSkin), hair colour excluding red hair (UKB), skin colour (UKB), and mole count excluding melanoma cases (QSkin), autoimmune conditions; type 1 diabetes and hypothyroidism²⁵, and vitiligo²⁶, lifestyle-related traits; educational attainment in years spent in school²⁷ and smoking (cigarettes per day)²⁸, and biochemistry blood biomarkers from the UKB; aspartate aminotransferase, C-reactive protein, albumin, and gamma-glutamyl transferase, glucose and vitamin D (adjusted for monthly variation). The sample sizes and phenotype measurements for all the included and excluded traits are presented in Supplementary Data 3 and 2, respectively.

Discovery of genome-wide significant susceptibility loci for BCC and SCC

Adding 20 traits genetically correlated with either BCC or SCC (r_g > 0.1, P < 0.05) (from UKB) increased the effective sample sizes for BCC and SCC by 2.6 and 8.3 times, respectively. Using the MTAG approach, we identified 78 and 69 independent genome-wide significant (P < 5 × 10⁻⁸) susceptibility loci for BCC (Fig. 2 and Supplementary Data 4) and SCC (Fig. 3 and Supplementary Data 5), respectively. Although the results for the peak single nucleotide polymorphisms (SNPs) were more significant following the MTAG analysis due to the greater statistical power, the log (odds ratio) effect sizes for the MTAG output and the respective UKB BCC or SCC GWAS inputs were highly concordant. For BCC the Pearson’s correlation of effect sizes was 0.93 (95% confidence interval [CI] = 0.89–0.96, P < 2.20 × 10⁻¹⁶; Fig. 4a). Similarly, concordance was high for SCC loci (Pearson’s correlation = 0.71, 95% CI = 0.57–0.81, P = 7.34 × 10⁻¹²; Fig. 4c).

**Fig. 2: Manhattan plot for basal cell carcinoma susceptibility.**

**Fig. 3: Manhattan plot for squamous cell carcinoma susceptibility.**

**Fig. 4: Concordance of the log (OR) effect estimates for MTAG versus UK single-trait GWAS and 23andMe replication.**

In the 23andMe, Inc replication sample (252,931 cases and 2,281,246 controls), 71 of the 78 susceptibility loci for BCC replicated at the genome-wide level (P < 5 × 10⁻⁸), 74 replicated after Bonferroni correction (P = 6.49 × 10⁻⁴), and 77 loci replicated at a nominal P = 0.05 (Supplementary Data 4). There was high concordance with the BCC effect estimates between the MTAG and the replication set with Pearson’s correlation = 0.97 (95% CI = 0.95–0.98, P = 2.20 × 10⁻¹⁶; Fig. 4b). Of the 69 susceptibility loci for SCC, 25 replicated at the genome-wide level (P = 5 × 10⁻⁸), 31 replicated after Bonferroni correction (P = 7.24 × 10⁻⁴) and 38 loci replicated at a nominal P = 0.05 in the 23andMe cohort (135,214 cases and 2,404,735 controls) (Supplementary Data 5). For SCC, there was also high concordance with the effect estimates between the MTAG and the replication set with Pearson’s correlation = 0.69 (95% CI = 0.55–0.80, P = 3.48 × 10⁻¹¹; Fig. 4d).

Description of the previously unknown loci for BCC and SCC

A locus was considered previously unknown for BCC or SCC if it had not been significantly associated with either BCC, SCC or KC at the genome-wide level (P < 5 × 10⁻⁸) before, and if it replicated at minimum P < 0.05) in the 23andMe replication cohort. By this criterion, we identified 19 and 15 previously unknown loci for BCC (Table 1) and SCC (Table 2), respectively. The previously unknown loci were annotated to the pigmentation, cardiometabolic, cancer development/progression and immune-regulatory pathways (Figs. 5, 6), whilst others are known loci for cutaneous melanoma susceptibility (ATM, and SOX6 for BCC, and GPR98, and DSTYK for both BCC and SCC). More details on these loci and the broader biological groups have been discussed in the Supplementary Information (Supplementary Note 1). For loci that are unique to BCC or SCC, or overlap between BCC and SCC, refer to Tables 1, 2.

Table 1 BCC susceptibility novel loci that replicated at P < 0.05 in 23andMe cohort

Full size table

Table 2 SCC susceptibility novel loci that replicated at P < 0.05 in 23andMe cohort

Full size table

**Fig. 5: Basal cell carcinoma loci and biological pathways.**

**Fig. 6: Squamous cell carcinoma loci and biological pathways.**

Gene-set pathways

After multiple correction testing (P = 0.05/18,188 genes; 2.75 × 10⁻⁶), gene-set analysis revealed curated and gene ontology (GO) pathways that are important in the development of keratinocyte cancer (Supplementary Table 1). A number of pathways are involved in melanogenesis (e.g. melanin biosynthesis, melanin biosynthetic process and melanosome membrane); a process which influences the nature of pigmentation traits and response to UV exposure. Genes in the “response to trabectedin” pathway are likely to play an important role in DNA damage response. Trabectedin is an alkylating agent used to treat certain cancers resulting in DNA damage. Other pathways are important in the downregulation of the immune response (e.g. GO negative regulation of regulatory T cell differentiation), and enhancement of the immune response (IL2-PI3K pathway, MHC class II receptor activity, and nuclear factor of activated T cells (NFAT) pathway for development and function of regulatory T cells).

BCC MTAG-derived polygenic risk score for KC prediction in the Canadian longitudinal study on aging (CLSA)

During the validation of the PRSs, S5 (i.e. P < 10⁻⁴ with 273 SNPs for the MTAG_PRS and 462 SNPs for the UKB_PRS) was the optimal PRS models for both MTAG_PRS and UKB_PRS with Nagelkerke R² of 10.65 and 9.55% respectively (Fig. 7a). The total number of SNPs in both PRS was different because the MTAG results have more power than the single BCC analysis and therefore it has more SNPs reaching significance. However, based on 'the nearest gene' analysis, 154 SNPs (Supplementary Data 8) overlapped between the MTAG_PRS and UKB_PRS. The correlation of the effect size for the PRS SNPs across the two sets was consistent or high (e.g. for the overlapping 154 SNPs, Pearson’s correlation = 0.94, 95% CI = 0.92–0.96, P < 2.2 × 10⁻¹⁶), meaning the extra MTAG SNPs are consistent but just better powered.

**Fig. 7: Validation and application of the basal cell carcinoma MTAG_PRS and UKB_PRS in participants in the Canadian Longitudinal Study on Aging (CLSA).**

The SNPs for the optimal models are presented in Supplementary Data 6 and Supplementary Data 7 for the UKB_PRS and MTAG_PRS, respectively. When we tested the performance for both the UKB_PRS and MTAG_PRS in the CLSA (N = 18,933), the MTAG_PRS outperformed the UKB_PRS in terms of association with KC risk, KC risk prediction, and stratification. For example, after adjusting for age at recruitment, sex and the first ten PCs, the MTAG_PRS outperformed the UKB_PRS for association with KC risk i.e. MTAG_PRS OR = 1.66, 95% CI = 1.55–1.79, P = 1.95 × 10⁻⁴¹ versus UKB_PRS OR = 1.56, 95% CI = 1.45–1.67, P = 3.38 × 10⁻³³ (Fig. 7b). In addition, the net reclassification index for KC risk was greater for MTAG_PRS than the UKB_PRS (Fig. 7c), when added to the base model containing age, sex and ten PCs. Consequently, the MTAG_PRS compared to the UKB_PRS reclassified more participants for KC risk to the appropriate risk group (low risk, moderate risk and high risk) (i.e. percentage of people reclassified; MTAG_PRS = 36.57%, 95% CI = 35.89–37.26% versus UKB_PRS = 33.23%, 95% CI = 32.56–33.91%) (Fig. 7d).

Discussion

In this large multi-trait GWAS analysis, we show that cutaneous melanoma, 'any-cancer', pigmentation traits, autoimmune diseases and other serum metabolic biomarkers are genetically correlated with BCC and SCC. We have leveraged this genetic correlation using the MTAG approach to identify 78 and 69 independent genome-wide significant loci for BCC and SCC risk, respectively, the most common skin cancers among fair-skinned people. Nineteen BCC and 15 SCC loci were previously unknown for any KC and replicated in the 23andMe cohort, indicating our study uncovers important findings relevant to KC biology.

First, we identify previously unknown loci in the pigmentation pathways for both BCC and SCC susceptibility. Due to the importance of sun exposure in keratinocyte cancer biology²⁹, several new loci for BCC and SCC were linked to pigmentation traits, including skin colour, red hair, skin tanning response and sunburns. The gene-set analysis results also confirmed we identified biological pathways involved in melanin biosynthesis and DNA damage response.

Second, our study affirms the role of immune-regulatory processes and pathways in BCC and SCC susceptibility. We show that the previously unknown loci for BCC and SCC are implicated in immune-regulatory processes (Supplementary Information), including; HIV viral load modulation^30,31, innate immune response (through IFIH1)^32,33,34 and autoimmunity. These cellular immune responses are important in cancer initiation and progression³⁵. We also highlight a previously known locus (CTLA4) which is an immunotherapy target (anti-CTLA4 medication) in melanoma treatment³⁶. Therefore, our identified loci implicated in immune response may be potential targets to improve immunotherapy for skin cancer. However, further functional genomic studies will be needed to establish their potential role in skin cancer prevention and treatment.

Third, immunosuppressive medication, including azathioprine and cyclosporin A have been implicated in BCC and SCC risk^37,38. While we uncovered KC loci linked to immune-related medication use, including; anti-asthmatic inhalants and thyroid preparations³⁹, it is likely that medication-related loci underpinned here are just a proxy indicator for the autoimmune disease. Thus, these medications are unlikely to cause BCC or SCC. In addition, even if these diseases were all treated with drugs that greatly increased the risk of KC, they are (a) too rare to lead to a cryptic genetic correlation as large as what we see here e.g. for hypothyroidism (rg = −0.19, P = 1.05 × 10⁻⁴) (Supplementary Data 1) and (b) the genetic correlation e.g. for hypothyroidism was negative with BCC where a drug-induced cryptic overlap would give a positive genetic correlation.

Fourth, our study also highlights the potential role of cardiometabolic biomarkers in BCC/SCC risk. Besides the PUFA levels, whose causal association link with the BCC risk has been established through a Mendelian randomisation study⁴⁰, our results highlight a potential causal relationship between cardiometabolic biomarkers, including; diastolic and systolic blood pressure, lipids, serum glucose, cholesterol and adiposity, and the risk of BCC and SCC. As is the case for PUFA, downstream metabolism of these cardiometabolic biomarkers, such as lipids and cholesterol, results in oncogenic inflammatory biomarkers (e.g. prostaglandins E, thromboxane A2 and leukotriene B). However, some risk genetic variants or loci for the cardiometabolic pathway could be influencing BCC and SCC risk through already known pigmentation and immune-regulatory biological pathways e.g. rs1136165 in CKB and rs10774625 in ATXN2^41,42,43,44.

Fifth, we also unveil important genes with a potential role in BCC and SCC initiation and progression e.g. FAP, CDKL1, MARK3, RAB11FIP2, GAB2, SUOX and SOX6. Although some genetic variants within these genes have pleiotropic effects with pigmentation traits, the aforementioned genes have established roles in cancer cell proliferation, migration and invasion, and downregulation of apoptosis in melanoma, colorectal cancer and breast cancer^{45,46,47,48,49,50,51}. Some of these loci are potential drug targets. For example, a previous study identified a potential drug 'PCC0208017' as an inhibitor of MARK3, suppressing glioma progression both in vitro and in vivo⁵². Fostamatinib, a drug used for treatment of chronic immune thrombocytopenia⁵³, is an inhibitor of MARK3⁵⁴. Further studies are warranted to test these drugs for any anti-tumour activity in KC.

Our results further emphasise the shared biology between cutaneous melanoma and KC. In total, four previously unknown loci for BCC and SCC at ATM, DSTYK, GPR98 and SOX6 are known for CM^20,55,56. Our MTAG results have also highlighted shared biology between BCC and SCC whereby almost half (7) of the previously unknown loci are shared between BCC and SCC. However, our work also highlights loci distinct to either BCC (12) and SCC (8), indicating unique biological pathways (see results) for each cancer.

We also note the difference in the replication success between BCC and SCC. Given the relatively high genetic correlation between the two traits, similar replication results are expected. However, at a subset of loci, the input data may suggest that a particular SNP is only strongly associated with say, BCC but no SCC. Given we have substantially more input data on BCC than SCC, power may also play a part in the strength of the results, and replication success. We have previously shown that BCC is twice as heritable as SCC (SNP-heritability estimates for BCC = 13.1%, 95% CI = 9.7–16.5% versus 6.8%, 95% CI = 0.9–12.7% for SCC)¹³, and it is more polygenic^8,57. We believe the reasons contributed to the differences in replication success.

One strength of the MTAG method is the increase in statistical power to identify several loci that a standard single-trait GWAS would not have done. For example, using MTAG, we increased our sample size by 2.6 times and 8.3 times for BCC and SCC, respectively. Owing to the great improvement in statistical power, our MTAG-derived BCC PRS outperformed (for KC risk stratification) the one derived from a single-trait BCC GWAS. We and others have previously shown that the KC PRS generated from the general population effectively stratify them for KC risk and multiplicity^58,59,60. The optimised MTAG-derived PRS is likely to improve KC risk stratification in high-risk subpopulations, as previously shown in solid organ transplant recipients.

One caveat with the MTAG approach is that it assumes that the genetic variants have a homogeneous effect across all the included traits so that the results are not driven by a certain trait to result in false positives¹⁵. Firstly, when we compared the genetic correlation (Supplementary Fig. 1), and the MTAG results (Supplementary Fig. 2) before and after excluding genomic regions (HLA, ASIP, IRF4, MC1R, SLC45A2 and CDKN2A) with very large effect sizes for skin cancers and pigmentation traits, and there was a high concordance (Supplementary Fig. 2). Secondly, there was good replication of our results in an independent cohort, which counters concerns of false positives. In addition, in order to minimise biases arising from using several cohorts which might have phenotypes with different measures¹⁵, we selected only traits where the magnitude of the genetic correlation was larger than 0.1 (or less than −0.1 for negatively correlated traits); we also required the correlation to at least reach nominal significance (P < 0.05), as a priori. Also, studies with small sample size were not considered, as including such traits would only negligibly increase our effective sample size.

In conclusion, leveraging the genetic correlation between skin cancers, autoimmune diseases, pigmentation traits and serum biochemistry biomarkers revealed previously unknown susceptibility loci for SCC and BCC, implicated in KC development and progression, pigmentation, cardiometabolic and immune-regulatory pathways. We also report an optimised PRS for effective risk stratification for KC, which could facilitate skin cancer surveillance in high-risk subpopulations such as transplantees.

Methods

Cohorts

Discovery cohorts

Participants that contributed to the phenotype-specific genome-wide association studies were of homogenous European ancestry drawn from different cohorts from Australia, Europe and America. While there was sample overlap across the included GWAS, MTAG adjusts and corrects for biases due to sample overlap¹⁵. The major cohorts used included; the UK Biobank (UKB)^18,19, QSkin Sun and Health Study (QSkin) (Olsen et al. 2012), eMERGE (dbGaP, study accession: phs000360.v3.p1) and GERA (dbGaP, study accession: phs000674.v3.p3), a melanoma meta-analysis consortium (Supplementary Information; Supplementary Table 2)²⁰ (dbGaP accession study code: phs001868.v1.p1), as well as publicly available GWAS summary statistics from international cohorts and consortium. Details for each cohort, including ethics oversight, are described in the Supplementary Information.

Replication cohort: 23andMe Research Cohort

23andMe, Inc. is a direct-to-consumer genetic company that collected both self-reported phenotypes and genetic data from participants who provided informed consent and participated in the research online, under a protocol approved by the external Association for the Accreditation of Human Research Protection Programme (AAHRPP)- accredited Institutional Review Board (IRB), Ethical & Independent Review Services (E&I Review). The BCC cohort included 2,523,630 participants of European ancestry; 251,963 BCC cases and 2,271,667 controls, and 44.65% males. The SCC dataset included 2,529,399 participants of European ancestry; 134,700 SCC cases and 2,394,699 controls, and 44.65% males. Further details on data collection, validation, genotyping, imputation and quality control have been published before^8,57.

BCC PRS application cohort: the Canadian Longitudinal Study on Aging (CLSA)

The Canadian Longitudinal Study on Aging (CLSA) is a prospective large population-based cohort in Canada comprising about 50,000 participants (45–85 years) randomly recruited between 2010 and 2015 from ten provinces^61,62. More information about the cohort has been published elsewhere^61,62 and summarised here. It consists of two cohorts; the 'Tracking cohort' of ~20,000 participants recruited through a telephone questionnaire in ten provinces, and the “Comprehensive cohort” with ~30,000 individuals who provided data through an in-person questionnaire, clinical/physical tests and biological samples (e.g. for genetic data) in seven provinces.

In general, at baseline, information on relevant variants, including age and sex, were recorded, and participants were also asked whether they had been diagnosed with any cancer, including KC (yes/no), by a health professional. Between 2015 and 2018, the first follow-up assessment was conducted and participants were asked again if they had been diagnosed with cancer, and KC during the follow-up period. Thus, the CLSA dataset we used included the 'Baseline Comprehensive Dataset version 4.0' and 'Follow-up 1 Comprehensive Dataset version 1.0'. At the time of analysis, ~30,000 individuals had genetic data available, genotyped using 820 K UK Biobank Axiom Array (Affymetrix)⁶¹, and imputed using the TopMed imputation server⁶³. The CLSA is overseen by the Canadian Institutes of Health Research (CIHR) and its protocol has been reviewed and approved by 13 research ethics boards in Canada. All participants provided written informed consent.

Firstly, for purposes of validation and selection of the optimal PRS models (as described below in Stage 6 analysis) we randomly selected 1523 cancer-free controls and 388 prevalent KC cases at the baseline. Thus, our validation sample included 1911 participants with a mean age of 65.81 years (sd = 10.25) and 52.75% males.

Secondly, we tested the BCC PRSs in a second sample (unrelated to the validation dataset) of 18,933 participants of European ancestry, with a mean age of 61.80 years (sd = 9.84), followed up for a mean duration of 2.9 years (sd = 0.3) and 49.63% males. Only participants with complete data on age, sex, cancer status and KC diagnosis were included. Thus, 18,139 controls with no history of any cancer (at follow up 1) and 794 participants who developed KC during follow-up.

Statistical analysis

Stage 1: GWAS for BCC, SCC and related traits

We conducted two case-control GWAS using UKB data for BCC, N = 307,684 (20,791 cases and 286,893 controls) and SCC, N = 294,294 (7402 SCC cases and 286,892 controls) of European ancestry. We adjusted for age and sex as well as the first ten ancestral principal components (PCs) in order to control for biases from population stratification. We used Scalable and Accurate Implementation of GEneralised mixed model (SAIGE) software for the analysis since it controls for sample relatedness and case-control imbalance²⁵. Analysis was restricted to single nucleotide polymorphism (SNPs) with minor allele frequency (MAF) >1% and an imputation quality score of 0.3. BCC/SCC cases were drawn from UK cancer registries. Further details on case ascertainment and definition are described in Supplementary Information.

In addition, we conducted GWAS for pigmentation traits (e.g. skin colour, hair colour, tanning response, skin burn, sunburn, etc.), all-cancer, autoimmune conditions, and blood biochemistry biomarkers (e.g. C-reactive protein, vitamin D, glucose, albumin, aspartate aminotransferase, gamma-glutamyl transferase, etc) using data from international cohorts including; UKB, QSkin, and GERA as described in Supplementary Information, Supplementary Data 2, 3. We also conducted GWAS on KC and all-cancer after accessing data from eMERGE (dbGaP, study accession: phs000360.v3.p1) and GERA (dbGaP, study accession: phs000674.v3.p3) cohorts respectively (Supplementary Information). We also accessed publicly available GWAS summary statistics e.g. for cutaneous melanoma²⁰, smoking²⁸, education attainment²⁷, body mass index⁶⁴, hypothyroidism, type 1 diabetes, rheumatoid arthritis²⁵ and vitiligo²⁶ (Supplementary Information, Supplementary Data 2, 3).

Stage 2: Genetic correlation between BCC, SCC and related traits

We used LDSC version 1.0.1⁶⁵, to compute the genetic correlation (r_g)¹⁶ between BCC and a range of other traits, including; other skin cancer types, pigmentation traits, autoimmune traits and biochemistry biomarkers (recently released in the UKB). We then repeated this process for SCC instead of BCC. We used data from publicly available GWAS, as well as GWAS data from international cohorts of participants of European ancestry (conducted in stage 1 above). Traits with a statistically significant (P < 0.05) r_g greater than 10% with either BCC or SCC were selected and included in the MTAG model (Fig. 1 and Supplementary Table 1). We further sought additional traits that were genetically correlated with BCC or SCC using data from the LD hub catalogue¹⁷. Out of about 700 phenotypes, no additional phenotypes were selected to be included in the final MTAG model.

In total, 22 traits, including the initial input BCC and SCC GWAS from different cohorts of European ancestry, met the inclusion criteria. The 22 genetically correlated traits included; BCC, SCC, skin colour, hair colour excluding red hair, hypothyroidism, type 1 diabetes, gamma-glutamyl transferase, aspartate aminotransferase, serum vitamin D levels, albumin, C-reactive protein and glucose in the UK Biobank¹⁹, KC, red hair and mole count in the QSkin²¹, KC in eMERGE (dbGaP, study accession: phs000360.v3.p1), all-cancer in GERA cohort (dbGaP, study accession: phs000674.v3.p3), melanoma risk as measured by the latest and largest melanoma risk gwas meta-analysis²⁰, vitiligo²⁶, education attainment²⁷ and smoking²⁸. All the above studies excluded 23andMe, to enable us to utilise the 23andMe data as a replication set. Details on the phenotypic measurements and definitions are described in Supplementary Information and Supplementary Data 2, 3.

Stage 3: Multi-trait analysis of GWAS summary statistics

Next, using a total of 22 genetically correlated traits, we conducted a multi-phenotype analysis of GWAS summary statistics (generated at stage 1 analysis and selected in stage 2) using MTAG software version 1.0.8¹⁵. MTAG default settings were used. MTAG combines GWAS summary statistics for genetically correlated traits into a meta-analysis while accounting for genetic correlation, sample overlap, maximising power to identify loci associated with the trait(s) of interest (here BCC and SCC)¹⁵. MTAG generates trait-specific results for each phenotype included in the model. BCC and SCC GWAS summary data from UKB from stage 1 were included as trait 1 and 2, respectively in the model below;

$${{{\mathrm{MTAG}}}}\,{{{\mathrm{model}}}}\!\!: \, {{{\mathrm{BCC}}}}+{{{\mathrm{SCC}}}}+{{{\mathrm{melanoma}}}}+{{{\mathrm{pigmentation}}}}\,{{{\mathrm{traits}}}} \\ +{{{\mathrm{autoimmune}}}}\,{{{\mathrm{traits}}}}+\ldots \ldots .+{{{\mathrm{trait}}}}\,n.$$

After the quality control measures, the analysis was restricted to 5,301,239 SNPs common in all the 22 GWAS with a minor allele frequency of >1%, and no ambiguous alleles. MTAG boosts the statistical power of the single-trait GWAS¹⁵. We assessed the increase in the statistical power/effective sample size or the GWAS-equivalent sample size when MTAG was applied to the single-trait GWAS, by comparing the average chi-squared before and after MTAG for BCC and for SCC using the following formula recommended by the MTAG authors:¹⁵

$$({1}-{{{\mathrm{average}}}}\,{\chi }^{2}\,{{{\mathrm{MTAG}}}}\,output)/({1}-{{{\mathrm{average}}}}\,{\chi }^{2}\,{{{\mathrm{MTAG}}}}\_input)$$

Where MTAG input corresponds to the input for either BCC or SCC GWAS in the UKB dataset, and χ2 is chi-squared.

We took forward the a) BCC and b) SCC MTAG output summary statistic results for further post-GWAS analysis in stage 4 and replication in stage 5. BCC and SCC Manhattan plots are presented in Figs. 2 and 3, respectively.

Stage 3.1: Sensitivity analyses

MTAG assumes a homogeneous effect across all the included traits¹⁵. However, due to their strong association with some input traits, the following genomic regions were removed; CDKN2A, SLC45A2, IRF4 and HLA for autoimmune, and ASIP and MC1R for pigmentation or CM, violate this assumption. We conducted sensitivity analyses excluding these regions before implementing our MTAG model. Using the stage 1 BCC GWAS summary statistics, we removed extended regions for ASIP on chromosome 20 (30–36 megabases (mb)), MHC regions on chromosome 6 (25−36 mb), and MC1R on chromosome 16 (87–90.3 mb). We also removed 2 mb around the most significant SNP in the following regions; rs12203592 (6:396321) in the IRF4 region on chromosome 6, rs3731239 (9:21974218) in the CDKN2A region on chromosome 9, and rs16891982 (5:33951693) in the SLC45A2 region on chromosome 5. We compared the genetic correlation between BCC/SCC before and after removing the genomic regions with known strong associations and high LD (Supplementary Fig. 2), before running the full MTAG model of 22 traits described above. The MTAG results with and without the above genomic regions were also compared (Supplementary Fig. 1).

Stage 4: Post-GWAS analysis

We used FUMA v.1.3.6⁶⁶, to identify independent, genome-wide significant SNPs and the genomic risk loci, and performed annotation of candidate SNPs in the genomic loci and functional gene mapping. We also conducted gene-based and pathway analyses using MAGMA v.1.7, as implemented in FUMA v.1.3.6⁶⁷. For the gene pathway analysis, gene ontology (GO) and curated gene sets from MSigDB (v5.2)⁶⁸ were used and corrected for multiple testing. GWAS catalogue⁶⁹ and Open Targets platform⁷⁰ were used to annotate the loci and their relationship with other traits.

Stage 5: Replication of the BCC and SCC MTAG results

Next, we sought to replicate the BCC and SCC susceptibility loci in a large independent cohort using data from the 23andMe research cohort. For BCC, the replication cohort included 251,963 self-reported cases and 2,271,667 controls, while the SCC replication comprised 134,700 cases and self-reported cases and 2,394,699 controls of European ancestry filtered to remove close relatives.

Previous studies have shown high accuracy of 23andMe BCC/SCC self-reported cases⁸ and high genetic correlation (r_g > 0.9) between the histologically confirmed UKB BCC/SCC data and 23andMe data¹³. Age, sex, and population stratification using five PCs were adjusted for in both analyses in a logistic regression i.e.

$$ {{{\mathrm{BCC}}}}\,{{{\mathrm{or}}}}\,{{{\mathrm{SCC}}}} \sim {{{\mathrm{genotype}}}}+{{{\mathrm{age}}}}+{{{\mathrm{sex}}}}+pc.{0}+pc.{1}+pc.{2}+pc.{3}+pc.{4} \\ +v{2}\_{{{\mathrm{platform}}}}+v{3}\_{0}\_{{{\mathrm{platform}}}}+v{3}\_{1}\_{{{\mathrm{platform}}}}+v{4}\_{{{\mathrm{platform}}}}.$$

The V2 genotyping platform was a variant of the Illumina HumanHap550 + BeadChip with ~560,000 SNPs, including about 25,000 custom SNPs selected by 23andMe. The V3 platform included Illumina OmniExpress + BeadChip with ~950,000 SNPs and custom content SNPs. The V4 is the current and fully custom array of ~950,000 SNPs and includes a lower redundancy subset of V2 and V3 SNPs⁷¹.

The BCC results were adjusted for a genomic control inflation factor λ = 1.286. The equivalent inflation factor for 1000 cases and 1000 controls λ1000 = 1.001, and for 10000, λ10000 = 1.006. In a similar way, the SCC results were adjusted for a genomic control inflation factor λ = 1.172. The equivalent inflation factor for 1000 cases and 1000 controls λ1000 = 1.001, and for 10000, λ10000 = 1.007. Thus, this inflation factor was not concerning as it is proportional to the large sample size⁷². We also explored any evidence of inflation in the discovery GWAS by assessing the LDSC intercept⁷³, which showed no inflation (not substantially above 1) for both BCC (LDSC intercept = 0.96, 95%CI = 0.94−0.99) and SCC (LDSC intercept = 0.77, 95%CI = 0.75−0.79).

We also compared the concordance of the effect sizes (log OR) for the MTAG results versus the replication results (Fig. 4b, d). We further analysed the number of loci that replicated at a genome-wide significant level (P = 5.0 × 10⁻⁸), after multiple testing correction (i.e. Bonferroni correction P = 6.49 × 10⁻⁴ for BCC; correcting for 77 loci, and P = 7.24 × 10⁻⁴ for SCC; correcting for 69 loci) and at a nominal P = 0.05.

Stage 6: Development and validation of the BCC Polygenic Risk Score in a selected sample of participants in CLSA

To construct two comparable polygenic risk scores (PRSs) for BCC, we separately used the BCC MTAG output (generated in stage 3) and the UKB BCC single-trait GWAS (generated in stage 1) summary statistics as the discovery data sets. MTAG¹⁵ drops SNPs with extremely significant associations with any input trait, which resulted in a number of previously reported pigmentation-associated SNPs being dropped from the model. Hence in both the MTAG and UKB discovery GWAS summary statistics, we also included two functional SNPs (rs1805007 for MC1R, and rs12203592 for IRF4) that would otherwise have been dropped in the PRS using the weights from a previously published BCC PRS⁷⁴. They are removed during the MTAG analysis as it filters out SNPs strongly (P < 10 × 2.22⁻³⁰⁸) associated with input traits, but this same strong association confirms they are important for a PRS for BCC. A sensitivity analysis results excluding these SNPs, and still, the MTAG BCC PRS reclassified skin cancer cases to a higher risk group (41.27%) better than the single BCC PRS (37.95%).

Next, using autosomal, non-ambiguous, and bi-allelic SNPs overlapping in the CLSA cohort (MTAG discovery = 5,300,872 SNPs and UKB discovery = 5,300,868 SNPs), we performed LD clumping based on (r² = 0.005 and LD window = 5000 kb, P = 1) to yield 62,494 and 62,884 independent SNPs for MTAG_PRS and UKB_PRS models respectively. PLINK 1.90b6.8⁷⁵ for clumping. Using the clumped independent SNPs above, we generated PRS models at varying p value thresholds i.e. S1 (P < 5 × 10⁻⁸), S2 (P < 10⁻⁷), S3 (P < 10⁻⁶), S4 (P < 10⁻⁵), S5 (P < 10⁻⁴), S6 (P < 10⁻³), S7 (P < 10⁻²) and S8 (P < 10⁻¹) in validation sample of 1911 participants split from the CLSA cohort using log odds ratio (from the respective discovery GWAS; MTAG or UKB) as weights. PLINK2 (v2.00a3LM 5 May 2021 release)⁷⁵ was used for generating the PRS scores.

For both MTAG and UKB PRS models, we used Nagelkerke’s R²⁷⁶^,, a metric for model fitness used for selecting the optimal model. We computed the R² by comparing the model fitness between models with PRSs (BCC~MTAG_PRS or UKB_PRS + age + sex + 10 Pcs) and a null model using predictABEL package⁷⁷ in R software version 4.0.2⁷⁸.

Stage 7: Applying BCC polygenic risk score and keratinocyte cancer risk prediction in the Canadian longitudinal study of aging

To determine the ability of our MTAG GWAS data to predict skin cancer, we used 18,933 participants of European ancestry with data on KC risk in the Canadian Longitudinal Study of Aging (CLSA). We included 18,139 controls with no history of any cancer (both at baseline and follow-up) and 794 cases who developed KC during the 2.9 years (on average) follow-up following baseline recruitment. Separate BCC and SCC data were unavailable in this cohort, and as ~80% of KC cases are BCC cases⁷⁹, we tested the performance MTAG_PRS vs UKB_PRS derived for BCC to predict the risk of KC.

Using PLINK2 (v2.00a3LM 5 May 2021 release)⁷⁵, we generated individual scores for CLSA participants for both the BCC MTAG_PRS and UKB_PRS weighted by their respective effect sizes (log odds ratios). The genetic scores were standardised to a variance of 1 in order to interpret the associations as odds ratio per standard deviation increase in the PRS. We compared the performance of the two BCC PRSs (MTAG_PRS vs UKB_PRS) based on the magnitude of the association (odds ratios) and the net reclassification improvement for KC risk using R version 4.0.2⁷⁸. For net reclassification improvement, we compared the net reclassification index and the percentage of the participants who got reclassified to an appropriate risk group/tertile i.e. the low risk (bottom tertile), moderate risk (middle tertile), and high risk (top tertile) after adding the MTAG_PRS vs UKB_PRS to the base model containing age, sex and the ten PCs.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The full GWAS summary statistics generated in this study have been deposited in the NHGRI-EBI GWAS Catalogue under accession code GCST90137411 for BCC and GCST90137412 for SCC. The PRS generated in this paper are provided with this paper in Supplementary Data 6, 7. The BCC and SCC independent genome-wide significant SNPs (both for the discovery of MTAG results and replication from 23andMe) generated in this study are provided with this paper in Supplementary Data 4, 5 files, respectively. Source data for the figures are provided with this paper in the Source Data File. Genotype and phenotype data for the UK Biobank are available through application via https://www.ukbiobank.ac.uk/, for the Canadian Longitudinal Study on Aging (CLSA) at www.clsa-elcv.ca, and for QSkin through application to Prof. David Whiteman (David.Whiteman@qimrberghofer.edu.au), the principal investigator. Data from the Resource for Genetic Epidemiology Research on Aging (GERA) Cohort (dbGap accession phs000674.v3.p3), and the Electronic Medical Records and Genomics Network (eMERGE) (dbGaP, study accession: phs000360.v3.p1) can be accessed from dpGAP. The cutaneous melanoma GWAS summary statistics used in this paper were from ref. 20, are publicly available from dbGap (accession study code: phs001868.v1.p1). The following trait-specific GWAS summary statistics were also used in this paper and are publicly available through the following consortia or resources; educational attainment by ref. 27 (downloadable from the SSGAC website http://ssgac.org/Data.php), smoking (cigarettes per day) by ref. 28 (downloadable from the GSCAN Consortium website https://genome.psych.umn.edu/index.php/GSCAN), and autoimmune traits from by ref. 25 (available at ftp://share.sph.umich.edu/UKBB_SAIGE_HRC/). Source data are provided with this paper.

Code availability

The custom code used to generate the key results in this study can be freely accessed at https://github.com/mathiasS-hub/KC_MTAG_NatComm_Code.

References

Guy, G. P. Jr, Machlin, S. R., Ekwueme, D. U. & Yabroff, K. R. Prevalence and costs of skin cancer treatment in the U.S., 2002-2006 and 2007-2011. Am. J. Prev. Med. 48, 183–187 (2015).
Article Google Scholar
Australian Institute of Health and Welfare. Cancer in Australia: actual incidence data from 1982 to 2013 and mortality data from 1982 to 2014 with projections to 2017. Asia Pac. J. Clin. Oncol. 14, 5–15 (2018).
Article Google Scholar
Fransen, M. et al. Non-melanoma skin cancer in Australia. Med. J. Aust. 197, 565–568 (2012).
Article Google Scholar
Nagarajan, P. et al. Keratinocyte carcinomas: current concepts and future research priorities. Clin. Cancer Res. 25, 2379–2391 (2019).
Article Google Scholar
Pandeya, N., Olsen, C. M. & Whiteman, D. C. The incidence and multiplicity rates of keratinocyte cancers in Australia. Med. J. Aust. 207, 339–343 (2017).
Article Google Scholar
Zhu, K.-J. et al. Psoriasis regression analysis of MHC loci identifies shared genetic variants with vitiligo. PLoS ONE 6, e23089 (2011).
Article ADS CAS Google Scholar
Coenen, M. J. H. et al. Common and different genetic background for rheumatoid arthritis and coeliac disease. Hum. Mol. Genet. 18, 4195–4203 (2009).
Article CAS Google Scholar
Chahal, H. S. et al. Genome-wide association study identifies 14 novel risk alleles associated with basal cell carcinoma. Nat. Commun. 7, 12510 (2016).
Article ADS CAS Google Scholar
Hinks, A. et al. Investigation of type 1 diabetes and coeliac disease susceptibility loci for association with juvenile idiopathic arthritis. Ann. Rheum. Dis. 69, 2169–2172 (2010).
Article Google Scholar
Roberts, M. R., Asgari, M. M. & Toland, A. E. Genome-wide association studies and polygenic risk scores for skin cancer: clinically useful yet? Br. J. Dermatol. 181, 1146–1155 (2019).
Article CAS Google Scholar
Al Olama, A. A. et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat. Genet. 46, 1103–1109 (2014).
Article CAS Google Scholar
Qian, X. et al. The Tensin-3 protein, including its SH2 domain, is phosphorylated by Src and contributes to tumorigenesis and metastasis. Cancer Cell 16, 246–258 (2009).
Article CAS Google Scholar
Liyanage, U. E. et al. Combined analysis of keratinocyte cancers identifies novel genome-wide loci. Hum. Mol. Genet. 28, 3148–3160 (2019).
Article CAS Google Scholar
Manousaki, D. et al. Genome-wide association study for vitamin D levels reveals 69 independent loci. Am. J. Hum. Genet. 106, 327–337 (2020).
Article CAS Google Scholar
Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).
Article CAS Google Scholar
Ni, G., Moser, G., Schizophrenia Working Group of the Psychiatric Genomics Consortium, Wray, N. R. & Lee, S. H. Estimation of genetic correlation via linkage disequilibrium score regression and genomic restricted maximum likelihood. Am. J. Hum. Genet. 102, 1185–1194 (2018).
Article CAS Google Scholar
Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 33, 272–279 (2017).
Article CAS Google Scholar
Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Article Google Scholar
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Article ADS CAS Google Scholar
Landi, M. T. et al. Genome-wide association meta-analyses combining multiple risk phenotypes provide insights into the genetic architecture of cutaneous melanoma susceptibility. Nat. Genet. 52, 494–504 (2020).
Article CAS Google Scholar
Olsen, C. M. et al. Cohort profile: the QSkin sun and health study. Int. J. Epidemiol. 41, 929–929i (2012).
Article Google Scholar
McCarty, C. A. et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med. Genomics 4, 13 (2011).
Article Google Scholar
Kho, A. N. et al. Electronic medical records for genetic research: results of the eMERGE consortium. Sci. Transl. Med. 3, 79re1 (2011).
Article Google Scholar
Banda, Y. et al. Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort. Genetics 200, 1285–1295 (2015).
Article Google Scholar
Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Article CAS Google Scholar
Jin, Y. et al. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants. Nat. Genet. 48, 1418–1424 (2016).
Article CAS Google Scholar
Okbay, A. et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533, 539–542 (2016).
Article ADS CAS Google Scholar
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Article CAS Google Scholar
D’Orazio, J., Jarrett, S., Amaro-Ortiz, A. & Scott, T. UV radiation and the skin. Int. J. Mol. Sci. 14, 12222–12248 (2013).
Article Google Scholar
Alvarez, V., López-Larrea, C. & Coto, E. Mutational analysis of the CCR5 and CXCR4 genes (HIV-1 co-receptors) in resistance to HIV-1 infection and AIDS development among intravenous drug users. Hum. Genet. 102, 483–486 (1998).
Article CAS Google Scholar
Carrington, M., Dean, M., Martin, M. P. & O’Brien, S. J. Genetics of HIV-1 infection: chemokine receptor CCR5 polymorphism and its consequences. Hum. Mol. Genet. 8, 1939–1945 (1999).
Article CAS Google Scholar
Zheng, Y. et al. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) membrane (M) protein inhibits type I and III interferon production by targeting RIG-I/MDA-5 signaling. Signal Transduct. Target Ther. 5, 299 (2020).
Article CAS Google Scholar
Yin, X. et al. MDA5 governs the innate immune response to SARS-CoV-2 in lung epithelial cells. Cell Rep. 34, 108628 (2021).
Article CAS Google Scholar
Maiti, A. K. The African-American population with a low allele frequency of SNP rs1990760 (T allele) in IFIH1 predicts less IFN-beta expression and potential vulnerability to COVID-19 infection. Immunogenetics 72, 387–391 (2020).
Article CAS Google Scholar
Gonzalez, H., Hagerling, C. & Werb, Z. Roles of the immune system in cancer: from tumor initiation to metastatic progression. Genes Dev. 32, 1267–1284 (2018).
Article CAS Google Scholar
Rotte, A. Combination of CTLA-4 and PD-1 blockers for treatment of cancer. J. Exp. Clin. Cancer Res. 38, 255 (2019).
Article Google Scholar
Jiyad, Z., Olsen, C. M., Burke, M. T., Isbel, N. M. & Green, A. C. Azathioprine and risk of skin cancer in organ transplant recipients: systematic review and meta-analysis. Am. J. Transplant 16, 3490–3503 (2016).
Kuschal, C. et al. Skin cancer in organ transplant recipients: effects of immunosuppressive medications on DNA repair. Exp. Dermatol. 21, 2–6 (2012).
Article CAS Google Scholar
Wu, Y. et al. Genome-wide association study of medication-use and associated disease in the UK Biobank. Nat. Commun. 10, 1891 (2019).
Article ADS Google Scholar
Seviiri, M. et al. Polyunsaturated fatty acid levels and the risk of keratinocyte cancer: a Mendelian randomisation analysis. Cancer Epidemiol. Biomarkers Prev. 30, 1591–1598 (2021).
Morgan, M. D. et al. Genome-wide study of hair colour in UK Biobank explains most of the SNP heritability. Nat. Commun. 9, 5271 (2018).
Article ADS Google Scholar
Kichaev, G. et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 104, 65–75 (2019).
Article CAS Google Scholar
Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
Article CAS Google Scholar
Plagnol, V. et al. Genome-wide association analysis of autoantibody positivity in type 1 diabetes cases. PLoS Genet. 7, e1002216 (2011).
Article CAS Google Scholar
Song, Z., Lin, J., Sun, Z., Ni, J. & Sha, Y. RNAi-mediated downregulation of CDKL1 inhibits growth and colony-formation ability, promotes apoptosis of human melanoma cells. J. Dermatol. Sci. 79, 57–63 (2015).
Qin, C. et al. CDKL1 promotes tumor proliferation and invasion in colorectal cancer. Onco. Targets Ther. 10, 1613–1624 (2017).
Article CAS Google Scholar
Tang, L., Gao, Y., Yan, F. & Tang, J. Evaluation of cyclin-dependent kinase-like 1 expression in breast cancer tissues and its regulation in cancer cell growth. Cancer Biother. Radiopharm. 27, 392–398 (2012).
CAS Google Scholar
Chen, L., Qiu, X., Wang, X. & He, J. FAP positive fibroblasts induce immune checkpoint blockade resistance in colorectal cancer via promoting immunosuppression. Biochem. Biophys. Res. Commun. 487, 8–14 (2017).
Article CAS Google Scholar
Wen, X. et al. Fibroblast activation protein-α-positive fibroblasts promote gastric cancer progression and resistance to immune checkpoint blockade. Oncol. Res. 25 629–640 (2017).
Dong, W., Li, H. & Wu, X. Rab11-FIP2 suppressed tumor growth via regulation of PGK1 ubiquitination in non-small cell lung cancer. Biochem. Biophys. Res. Commun. 508, 60–65 (2019).
Article CAS Google Scholar
Kondreddy, V., Magisetty, J., Keshava, S., Vijaya Mohan Rao, L. & Pendurthi, U. R. Gab2 (Grb2-Associated Binder2) plays a crucial role in inflammatory signaling and endothelial dysfunction. Arterioscler. Thromb. Vasc. Biol. 41, 1987 (2021).
Article CAS Google Scholar
Li, F. et al. PCC0208017, a novel small-molecule inhibitor of MARK3/MARK4, suppresses glioma progression in vitro and in vivo. Acta Pharm. Sin. B 10, 289–300 https://doi.org/10.1016/j.apsb.2019.09.004 (2020).
Connell, N. T. & Berliner, N. Fostamatinib for the treatment of chronic immune thrombocytopenia. Blood 133, 2027–2030 (2019).
Rolf, M. G. et al. In vitro pharmacological profiling of R406 identifies molecular targets underlying the clinical effects of fostamatinib. Pharm. Res. Perspect. 3, e00175 (2015).
Article Google Scholar
Barrett, J. H. et al. Genome-wide association study identifies three new melanoma susceptibility loci. Nat. Genet. 43, 1108–1113 (2011).
Article CAS Google Scholar
Ransohoff, K. J. et al. Two-stage genome-wide association study identifies a novel susceptibility locus associated with melanoma. Oncotarget 8, 17586–17592 (2017).
Article Google Scholar
Chahal, H. S. et al. Genome-wide association study identifies novel susceptibility loci for cutaneous squamous cell carcinoma. Nat. Commun. 7, 12048 (2016).
Article ADS CAS Google Scholar
Stapleton, C. P. et al. Polygenic risk score as a determinant of risk of non-melanoma skin cancer in a European-descent renal transplant cohort. Am. J. Transpl. 19, 801–810 (2019).
Article Google Scholar
Seviiri, M. et al. Polygenic risk scores stratify keratinocyte cancer risk among solid organ transplant recipients with chronic immunosuppression in a high ultraviolet radiation environment. J. Invest. Dermatol. 141, 2866–2875.e2 (2021).
Article CAS Google Scholar
Seviiri, M. et al. Higher polygenic risk for melanoma is associated with improved survival in a high ultraviolet radiation setting. J. Transl. Med. 20, 403 (2022).
Article Google Scholar
Raina, P. et al. Cohort profile: the Canadian longitudinal study on aging (CLSA). Int. J. Epidemiol. 48, 1752–1753j (2019).
Article Google Scholar
Raina, P. S. et al. The Canadian longitudinal study on aging (CLSA). Can. J. Aging 28, 221–229 (2009).
Article Google Scholar
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed program. Nature 590, 290–299 (2021).
Article ADS CAS Google Scholar
Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
Article CAS Google Scholar
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Article CAS Google Scholar
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Article ADS Google Scholar
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Article Google Scholar
Liberzon, A. et al. The molecular signatures database hallmark gene set collection. Cell Syst. 1, 417–425 https://doi.org/10.1016/j.cels.2015.12.004 (2015).
MacArthur, J. et al. The new NHGRI-EBI catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
Article CAS Google Scholar
Ghoussaini, M. et al. Open targets genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).
Article CAS Google Scholar
Tian, C. et al. Genome-wide association and HLA region fine-mapping studies identify susceptibility loci for multiple common infections. Nat. Commun. 8, 599 (2017).
Article ADS Google Scholar
Yang, J. et al. Genomic inflation factors under polygenic inheritance. Eur. J. Hum. Genet. 19, 807–812 (2011).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Article CAS Google Scholar
Seviiri, M. et al. Polygenic risk scores allow risk stratification for keratinocyte cancer in organ-transplant recipients. J. Invest. Dermatol. 141, 325–333.e6 (2021).
Article CAS Google Scholar
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Article Google Scholar
Nagelkerke, N. J. D. A note on a general definition of the coefficient of determination. Biometrika 78, 691–692 (1991).
Kundu, S., Aulchenko, Y. S., van Duijn, C. M. & Janssens, A. C. J. W. PredictABEL: an R package for the assessment of risk prediction models. Eur. J. Epidemiol. 26, 261–264 (2011).
Article Google Scholar
R Core Team. R: a language and environment for statistical computing. (R Foundation for Statistical Computing, 2020).
Ciążyńska, M. et al. The incidence and clinical analysis of non-melanoma skin cancer. Sci. Rep. 11, 4337 (2021).
Article ADS Google Scholar

Download references

Acknowledgements

This study was supported by a programme grant (APP1073898) and a project grant (APP1063061) from the Australian National Health and Medical Research Council (NHMRC). S.M. and D.C.W. are supported by Research Fellowships from the NHMRC. M.S. was supported by the Australian Government Research Training Programme (RTP) and the Faculty of Health Scholarship at Queensland University of Technology, Australia. This study was conducted using data from UK Biobank (application number 25331), QSkin Sun and Health Study (Australia), Canadian Longitudinal Study on Aging (CLSA), 23andMe (USA), the Electronic Medical Records and Genomics Network (eMERGE) (USA), the Resource for Genetic Epidemiology Research in Adult Health and Aging (GERA) (USA) and other publicly available data from other consortia. We thank the research participants and the employees of 23andMe for partly making this work possible. The eMERGE data used in this study included samples from Vanderbilt University, the University of Washington, Marshfield Clinic, Mayo Clinic, and Northwestern University. We acknowledge each contributing cohort separately in the Supplementary Information (Supplementary Note on Acknowledgments). The GERA data were accessed from dbGaP (dbGaP accession phs000674.v3.p3). We acknowledge the submitter(s) of this study. Data came from a grant, the Resource for Genetic Epidemiology Research in Adult Health and Aging (RC2 AG033067; Schaefer and Risch, PIs), awarded to the Kaiser Permanente Research Programme on Genes, Environment, and Health (RPGEH) and the UCSF Institute for Human Genetics. The RPGEH was supported by grants from the Robert Wood Johnson Foundation, the Wayne and Gladys Valley Foundation, the Ellison Medical Foundation, Kaiser Permanente Northern California and the Kaiser Permanente National and Northern California Community Benefit Programmes. The opinions expressed in this manuscript are the author’s own and do not reflect the views of the Canadian Longitudinal Study on Aging or any affiliated institution. This research was made possible using the data/bio-specimens collected by the Canadian Longitudinal Study on Aging (CLSA). Funding for the Canadian Longitudinal Study on Aging (CLSA) is provided by the Government of Canada through the Canadian Institutes of Health Research (CIHR) under grant reference: LSA 94473 and the Canada Foundation for Innovation. This research has been conducted using the CLSA dataset [Baseline Comprehensive Dataset version 4.0, Follow-up 1 Comprehensive Dataset version 1.0], under Application Number 190225. The CLSA is led by Drs. Parminder Raina, Christina Wolfson and Susan Kirkland.

Author information

Authors and Affiliations

Statistical Genetics Lab, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Mathias Seviiri, Matthew H. Law, Jue-Sheng Ong, Puya Gharahkhani & Stuart MacGregor
School of Biomedical Sciences, Faculty of Health, Queensland University of Technology, Brisbane, QLD, Australia
Mathias Seviiri, Matthew H. Law & Stuart MacGregor
Center for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD, Australia
Mathias Seviiri
23andMe, Inc, Sunnyvale, CA, USA
Pierre Fontanillas, Stella Aslibekyan, Adam Auton, Elizabeth Babalola, Robert K. Bell, Jessica Bielenberg, Katarzyna Bryc, Emily Bullis, Daniella Coker, Gabriel Cuellar Partida, Devika Dhamija, Sayantan Das, Sarah L. Elson, Teresa Filshtein, Kipper Fletez-Brant, Will Freyman, Pooja M. Gandhi, Karl Heilbron, Barry Hicks, David A. Hinds, Ethan M. Jewett, Yunxuan Jiang, Katelyn Kukar, Keng-Han Lin, Maya Lowe, Jey McCreight, Matthew H. McIntyre, Steven J. Micheletti, Meghan E. Moreno, Joanna L. Mountain, Priyanka Nandakumar, Elizabeth S. Noblin, Jared O’Connell, Aaron A. Petrakovitz, G. David Poznik, Morgan Schumacher, Anjali J. Shastri, Janie F. Shelton, Jingchunzi Shi, Suyash Shringarpure, Vinh Tran, Joyce Y. Tung, Xin Wang, Wei Wang, Catherine H. Weldon, Peter Wilton, Alejandro Hernandez, Corinna Wong & Christophe Toukam Tchakouté
Cancer Control Group, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Catherine M. Olsen
Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
Catherine M. Olsen & David C. Whiteman

Authors

Mathias Seviiri
View author publications
You can also search for this author in PubMed Google Scholar
Matthew H. Law
View author publications
You can also search for this author in PubMed Google Scholar
Jue-Sheng Ong
View author publications
You can also search for this author in PubMed Google Scholar
Puya Gharahkhani
View author publications
You can also search for this author in PubMed Google Scholar
Pierre Fontanillas
View author publications
You can also search for this author in PubMed Google Scholar
Catherine M. Olsen
View author publications
You can also search for this author in PubMed Google Scholar
David C. Whiteman
View author publications
You can also search for this author in PubMed Google Scholar
Stuart MacGregor
View author publications
You can also search for this author in PubMed Google Scholar

Consortia

The 23andMe Research Team

Stella Aslibekyan
, Adam Auton
, Elizabeth Babalola
, Robert K. Bell
, Jessica Bielenberg
, Katarzyna Bryc
, Emily Bullis
, Daniella Coker
, Gabriel Cuellar Partida
, Devika Dhamija
, Sayantan Das
, Sarah L. Elson
, Teresa Filshtein
, Kipper Fletez-Brant
, Pierre Fontanillas
, Will Freyman
, Pooja M. Gandhi
, Karl Heilbron
, Barry Hicks
, David A. Hinds
, Ethan M. Jewett
, Yunxuan Jiang
, Katelyn Kukar
, Keng-Han Lin
, Maya Lowe
, Jey McCreight
, Matthew H. McIntyre
, Steven J. Micheletti
, Meghan E. Moreno
, Joanna L. Mountain
, Priyanka Nandakumar
, Elizabeth S. Noblin
, Jared O’Connell
, Aaron A. Petrakovitz
, G. David Poznik
, Morgan Schumacher
, Anjali J. Shastri
, Janie F. Shelton
, Jingchunzi Shi
, Suyash Shringarpure
, Vinh Tran
, Joyce Y. Tung
, Xin Wang
, Wei Wang
, Catherine H. Weldon
, Peter Wilton
, Alejandro Hernandez
, Corinna Wong
& Christophe Toukam Tchakouté

Contributions

Conceptualisation: M.S., M.H.L. and S.M.; Data curation: M.S., M.H.L., D.C.W., C.M.O. and S.M.; Formal analysis: M.S.; Funding acquisition: S.M., M.H.L. and D.C.W.; Investigation: M.S., M.H.L., D.C.W., C.M.O. and S.M.; Methodology: M.S., M.H.L., and S.M.; Project administration: M.S., M.H.L. and S.M.; Resources: M.H.L., D.C.W. and S.M.; Software: M.S.; Supervision: S.M. and M.H.L.; Visualisation: M.S.; Writing—original draft preparation: M.S.; Writing—revision of subsequent drafts: M.S., S.M., M.H.L., J.-S.O., P.G., D.C.W., C.M.O. and P.F. Replication of the results: P.F., 23andme Research Team. All authors contributed to the final version of the manuscript.

Corresponding author

Correspondence to Mathias Seviiri.

Ethics declarations

Competing interests

Pierre Fontanillas, Chao Tian and the 23andMe Research Team are employed by and hold stock or stock options in 23andMe, Inc. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Rafael de Cid, Maria Concetta Fargnoli and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer review file

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 3

Supplementary Data 2

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Seviiri, M., Law, M.H., Ong, JS. et al. A multi-phenotype analysis reveals 19 susceptibility loci for basal cell carcinoma and 15 for squamous cell carcinoma. Nat Commun 13, 7650 (2022). https://doi.org/10.1038/s41467-022-35345-8

Download citation

Received: 21 October 2021
Accepted: 29 November 2022
Published: 10 December 2022
DOI: https://doi.org/10.1038/s41467-022-35345-8

This article is cited by

CDKL1 potentiates the antitumor efficacy of radioimmunotherapy by binding to transcription factor YBX1 and blocking PD-L1 expression in lung cancer
- Zixuan Li
- Huichan Xue
- Shuangbing Xu
Journal of Experimental & Clinical Cancer Research (2024)
Association between inflammatory bowel disease and cancer risk: evidence triangulation from genetic correlation, Mendelian randomization, and colocalization analyses across East Asian and European populations
- Di Liu
- Meiling Cao
- Youxin Wang
BMC Medicine (2024)
Multi-ancestry genome-wide meta-analysis identifies novel basal cell carcinoma loci and shared genetic effects with squamous cell carcinoma
- Hélène Choquet
- Chen Jiang
- Maryam M. Asgari
Communications Biology (2024)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.