Sex-specific genome-wide association study in glioma identifies new risk locus at 3p21.31 in females, and finds sex-differences in risk at 8q24.21

Incidence of glioma is approximately 50% higher in males. Previous analyses have examined exposures related to sex hormones in women as potential protective factors for these tumors, with inconsistent results. Previous glioma genome-wide association studies (GWAS) have not stratified by sex. Potential sex-specific genetic effects were assessed in autosomal SNPs and sex chromosome variants for all glioma, GBM and non-GBM patients using data from four previous glioma GWAS. Datasets were analyzed using sex-stratified logistic regression models and combined using meta-analysis. There were 4,831 male cases, 5,216 male controls, 3,206 female cases and 5,470 female controls. A significant association was detected at rs11979158 (7p11.2) in males only. Association at rs55705857 (8q24.21) was stronger in females than in males. A large region on 3p21.31 was identified with significant association in females only. The identified differences in effect of risk variants do not fully explain the observed incidence difference in glioma by sex.


Introduction
Glioma is the most common type of primary malignant brain tumor in the United States (US), with an average annual age-adjusted incidence rate of 6.0/100,000 [1]. Glioma can be broadly classified into glioblastoma (GBM, 61.9% of gliomas in adults 18+ in the US) and lower-grade glioma (non-GBM glioma, 24.2% of adult gliomas) with tumors such as ependymoma (6.3%), unclassified malignant gliomas (5.1%), and pilocytic astrocytoma (1.9%) making up the majority of other cases [1]. Many environmental exposures have been investigated as sources of glioma risk, but the only validated risk factors for these tumors are ionizing radiation (which increases risk), and history of allergies or other atopic disease (which decreases risk) [2]. These tumors are significantly more common in people of European ancestry, in males and in older adults [1]. The contribution of common low-penetrance SNPs to the heritability of sporadic glioma in persons with no documented family history is estimated to be ~25% [3]. A recent glioma genome-wide association study (GWAS) meta-analysis validated 12 previously reported risk loci [4], and identified 13 new risk loci. These 25 loci in total are estimated to account for ~30% of heritable glioma risk. This suggests that there are both undiscovered environmental risk (which accounts for ~75% of incidence variance) and genetic risk factors (accounting for ~70% of heritable risk) [3,4].
Population-based studies consistently demonstrate that incidence of gliomas varies significantly by sex.
Most glioma histologies occur with a 30-50% higher incidence in males, and this male preponderance of glial tumors increases with age in adult glioma (Figure 1) [1]. Several studies have attempted to estimate the influence of lifetime estrogen and progestogen exposure on glioma risk in women [5,6]. Results of these analyses have been mixed, and it is not possible to conclusively determine the impact of hormone exposure on glioma risk. Male predominance in incidence occurs broadly across multiple cancer types and is also evident in cancers that occur in pre-pubertal children and in post-menopausal adults [7,8].
Together these observations suggest that other mechanisms in addition to acute sex hormone actions must be identified to account for the magnitude of sex difference in glioma incidence.
Though sex differences exist in glioma incidence, sex differences have not been interrogated in previous glioma GWAS. Sex-specific analyses have the potential to reveal genetic sources of sexual dimorphism in risk, as well as to increase power for detection of loci where effect size or direction may vary by sex [9,10]. The aim of this analysis is to investigate potential sex-specific sources of genetic risk for glioma that may contribute to observed sex-specific incidence differences.

Study population
There were 4,831 male cases, 5,216 male controls, 3,206 female cases, and 5,470 female controls (Table 1). A slightly larger proportion of male cases were GBM (58.7% of male cases vs 52.5% of female cases).
Controls were slightly older than cases. GBM cases had a higher mean age than non-GBM cases, which was consistent with known incidence patterns of these tumors. Male and female cases within histology groups had similar age at diagnosis. The proportion of non-GBM cases varied by study due to differing recruitment patterns and study objectives (see original publications for details of recruitment patterns and inclusion criteria [4,11-14]).
Table 1 footnote (partial): e. Data from CGEMS prostate study (Yeager et al. [15]).
[...] (Figure 3, Table 2). This association was further explored in a case-only analysis, where there was a significant difference between males and females overall (p=0.0012) and in non-GBM (p=0.0084) (Table 3). [...] with p_D=6.60x10^-5. Oligoastrocytic tumors were not included in sub-analyses because recent research suggests that these tumors are not an entity that is molecularly distinct from oligodendrogliomas or astrocytomas [17].
Table 4. Sex-specific odds ratios (OR), 95% confidence intervals (95% CI), and p values from meta-analysis for rs11979158, rs55705857 and rs9841110 by specific non-GBM histologies.

Genome-wide scan of nominally significant regions
In a previous eight-study meta-analysis, ~12,000 SNPs (INFO>0.7, MAF>0.01) were identified as having a nominally significant (p<5x10^-4) association with all glioma, GBM, or non-GBM [4]. A sex-stratified genome-wide scan was conducted within this set of SNPs, and results were considered significant at p_D<1.4x10^-6 (adjusted for 12,000 tests in each of three histologies [36,000 tests]; see Figure 2A for a schematic of the study design). Similar genome-wide peaks were observed in males and females (Figures 4-6). One large region within 3p21.31 (49,400-49,600 kb, ~200 kb) was significantly associated with glioma and GBM in females only (Supplemental Figure 1). There were 243 SNPs with nominally significant associations within this region in the previous eight-study meta-analysis (p<5x10^-4), and 32 of these had nominally significant sex-specific associations (p_F<5x10^-6 or p_M<5x10^-6) in all glioma or GBM. The strongest association in females within this region was at rs9841110, in both all [...] (Figure 3). No SNPs in this region were significantly associated with non-GBM. In a case-only analysis, a marginally significant difference was detected between males and females overall (p=0.0520) and in GBM (p=0.0428) (Supplemental Table 1).
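The multiple-testing threshold quoted above can be checked with simple arithmetic. The sketch below assumes a Bonferroni correction at a family-wise alpha of 0.05; the alpha level and the function name are assumptions, since the text states only the number of tests and the resulting threshold.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# ~12,000 SNPs tested in each of three histology groups (all glioma, GBM, non-GBM)
autosomal_tests = 12_000 * 3

# alpha = 0.05 is an assumed family-wise error rate, not stated in the text
autosomal_threshold = bonferroni_threshold(0.05, autosomal_tests)
print(f"{autosomal_threshold:.2e}")  # 1.39e-06, reported as 1.4x10^-6 after rounding
```

Under the same assumption, the same function gives the thresholds reported for the X chromosome scan (250,000 tests) and the TCGA case-only analysis (15 tests).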

Agnostic scan of sex chromosome loci
SNPs on the sex chromosomes were analyzed in GICC only. There were 245,746 SNPs with INFO>0.7 and MAF>0.01 on the X chromosome after quality control and imputation, and results were considered significant at p<2x10^-7 (corrected for 250,000 tests; see Figure 2B for a schematic of the study design). No SNPs met this significance threshold. After quality control procedures were complete, there were 300 SNPs remaining on the Y chromosome. No significant signals were detected on the Y chromosome.

Combined analysis of germline variants and somatic characterization
Due to the lack of molecular classification data included in the GICC, MDA-GWAS, SFAGS-GWAS, and GliomaScan datasets, glioma data obtained from TCGA datasets (GBM and LGG) were used to explore the potential confounding due to molecular subtype variation within histologies. There were 758 individuals from the TCGA dataset available for analysis with available germline genotyping, molecular characterization, sex and age data (Supplemental Table 2). Overall, slightly more females (53.2%) as compared to males (47.2%) had IDH1/2 mutant glioma, but this difference was not statistically significant (p=0.1104) (Figure 7). When tumors were stratified by histological type, approximately equal proportions of males and females had IDH1/2 mutations present in their tumors (GBM: 6.0% in males, and 5.2% in females; LGG: 17.9% in males, and 17.7% in females). There were also no significant differences by sex in IDH/TERT/1p19q subtype (Supplemental Figure 2, overall p=0.2859), or pan-glioma methylation subgroup (Supplemental Figure 3, overall p=0.4153).
SNPs found to be nominally significant (p<5x10^-4) in a previous eight-study meta-analysis, with imputation quality (r^2) ≥0.7, were identified within the TCGA germline genotype data, and D' and r^2 values in CEU were used to select proxy SNPs (Supplemental Table 3) [18].
A case-only analysis was conducted using sex as a binary phenotype for proxy SNPs in the TCGA dataset. For the proxy SNP in 3p21.31, there was a nominally significant signal in GBM in the overall case-only meta-analysis (Table 5). There was no significant association in the TCGA set, but RAF was elevated in females as compared to males in the GBM set, as well as in all IDH1/2 wild type gliomas (Table 5). MAF in LGG and IDH1/2 mutant glioma was similar among males and females. For the proxy SNP at 7p11.2, there was a nominally significant signal in the case-only meta-analysis and no significant association in the TCGA set, but RAF was elevated in males as compared to females in the GBM set, as well as in all IDH1/2 wild type gliomas (Table 5). There was no significant signal for the proxy SNP at 8q24.21 in the overall case-only meta-analysis or within the TCGA set. Among both LGG and IDH1/2 mutant gliomas, RAF was elevated in females as compared to males.
Median URS, URS-GBM, and URS-NGBM were significantly different (p<0.0001) between cases and controls in both males and females in all histology groups (Supplemental Figure 4). There was no significant difference in median risk scores between male and female cases for any histology group.
Glioma risk increased with increasing number of alleles in both males and females for the 10 SNPs included in the overall URS, as well as the 6 SNPs in the URS-GBM and 6 SNPs in URS-NGBM (Figure 9, Supplemental

Discussion
This is the first analysis of inherited risk variants in sporadic glioma focused specifically on sex differences, and the first agnostic unbiased scan for glioma risk variants on the X and Y sex chromosomes. Like many other types of chronic disease, there is a male preponderance of glioma. This incidence difference is not currently explained by known environmental or genetic risk factors.
One SNP at the 7p11.2 locus (rs11979158) showed significant association in males only, in both all glioma and GBM (Table 2). Effects were similar in all studies included in the analysis (Supplemental Table 5, Supplemental Figure 5). This variant is within one of two previously identified independent glioma risk loci located near epidermal growth factor receptor (EGFR) and is most strongly associated with risk for GBM [4,19]. Though EGFR is implicated in many cancer types and is a target for many anticancer therapies, this risk locus has not been previously associated with any other cancer type. Estrogen has been demonstrated to interact with EGFR as well as other growth factors [20]. Previous studies have not been definitive about the role of endogenous estrogen exposure in glioma risk, so it was not possible to determine the biological plausibility of this association [20]. Alternatively, cell-intrinsic, hormone-independent sex differences in EGF effects have been observed in a murine model of gliomagenesis in which EGF treatment was transforming for male but not female astrocytes that had been rendered null for neurofibromin and p53 function [21]. While this specific SNP was not genotyped on the germline genotyping array used for TCGA, a SNP in strong LD with rs11979158 (rs7785013, D'=1, r^2=1 in CEU [18]) was evaluated. The association in the case-only analysis in TCGA was not statistically significant in any histology group, but a trend in sex-specific RAF similar to that observed in the overall meta-analysis was seen in both the overall GBM group and the IDH1/2 wild type group.
The association at 8q24.21 (rs55705857) is the strongest that has been identified by glioma GWAS to date [4], with an odds ratio of 1.99 (95% CI=1.85-2.13, p=9.53x10^-79) in glioma overall, and an odds ratio of 3.39 (95% CI=3.09-3.71, p=7.28x10^-149) in non-GBM. Effects were similar in all studies included in the analysis (Supplemental Table 5, Supplemental Figure 6). The identified SNP, rs55705857, is located in an intergenic region near coiled-coil domain containing 26 (CCDC26, a long non-coding RNA). This analysis found a stronger association in females than males in all glioma and non-GBM, where female odds ratio estimates are ~2x those of males (Table 2). ORs were higher in women than men in all studies included in the analysis, but the magnitude of the ORs varied between studies (Supplemental Table 5). Furthermore, the MAF for rs55705857 in the SFAGS-GWAS differed from the other three studies (see Supplemental Table 6 for MAF by study). Consequently, a sensitivity analysis was conducted to assess the effect of study heterogeneity on this estimate in non-GBM using only the GICC, MDA-GWAS, and GliomaScan datasets. The results of this analysis did not substantially change from the main analysis (main analysis p_D=1.20x10^-6 and sensitivity p_D=1.49x10^-5).
A histology-specific analysis found similar sex differences in ORs for rs55705857 for both non-GBM astrocytoma and oligodendroglioma (Table 4, see Supplemental Table 7 for study-specific estimates).
Previous analyses have shown that this variant is strongly associated with IDH1/2 mutant tumors, particularly those that have 1p/19q deletions [22,23]. Data on IDH1/2 mutation and 1p/19q codeletion were not available for the combined four GWAS datasets used here. Hence, to assess potential differences in frequency of IDH1/2 mutation, the frequency of these mutations by sex was assessed within the combined TCGA GBM and LGG datasets [24-26]. Approximately the same proportion of males as females with histologically confirmed GBM had IDH1/2 mutations (5.2% vs 6.0%, respectively), so females may not be more likely than males to present with IDH1/2 mutant GBM (Figure 7). While this specific SNP was not genotyped on the germline genotyping array used for TCGA, a SNP in weak LD with rs55705857 (rs4636162, D'=1, r^2=0.104 in CEU [18]) was evaluated. There was no significant association in the overall case-only meta-analysis for this SNP, and the association in the case-only analysis in TCGA was not statistically significant in any histology group. Sex-specific RAF for this SNP was slightly higher in females as compared to males in the overall LGG group as well as the IDH1/2 mutant group.
A large region in 3p21.31 was identified that was associated with all glioma and GBM in females only (Table 2). Effects were similar in all studies included in the analysis (Supplemental Table 5, Supplemental Figure 7). The strongest association in this region was rs9841110, an intronic variant located upstream of dystroglycan 1 (DAG1) within an enhancer region. While this specific SNP was not genotyped on the germline genotyping array used for TCGA, a SNP in strong LD with rs9841110 (rs9814873, D'=1, r^2=1 in CEU [18]) was evaluated. The association in the case-only analysis in TCGA was not statistically significant in any histology group, but a trend in sex-specific RAF similar to that observed in the overall meta-analysis was seen in both the overall GBM group and the IDH1/2 wild type group. The identified risk allele at rs9841110 (C) is associated with significantly [...] as compared to normal tissue [25,28], and increased expression of GPX1 and MST1R has been associated with poor prognosis in multiple cancer types [29,30].
Though this region has not previously been associated with glioma, previous GWAS have detected associations at 3p21.31 for a large variety of traits, including several autoimmune diseases as well as increased age at menarche [31-34]. Three variants previously associated with increased age at menarche [...] [18,32]. If lifetime estrogen exposure modifies glioma risk, it is reasonable that variants which increase age at menarche, and which may therefore decrease total lifetime estrogen exposure, may also be related to glioma risk in females. Due to the complexity of measuring lifetime estrogen exposure (which is affected by age at menarche, age at menopause, parity, breast feeding patterns, and estrogen replacement therapy post-menopause), it is difficult to determine the 'true' effect that this exposure might have on glioma risk.
As compared to a model containing age at diagnosis and sex alone, the three SNPs (rs55705857, rs9841110 and rs11979158) identified as having sex-specific effects explain an additional 1.4% of trait variance within the GICC set. The variance explained by these SNPs varies by histology (0.6% in GBM, and 3.3% in Non-GBM). The variance explained by the addition of these three SNPs was higher in females for all glioma (1.3% in males and 2.2% in females), and non-GBM glioma (2.3% in males and 5.3% in females), and slightly higher in males for GBM (0.9% in males and 0.7% in females).
In order to compare the cumulative effects of glioma risk variants by sex, unweighted risk scores (URS) were generated by summing all risk alleles using the 10 SNPs found to be significantly associated with glioma in this analysis. GBM-specific (URS-GBM) and non-GBM-specific (URS-NGBM) URS were calculated using sets of 6 SNPs from this set that were significantly associated with these histologies.
Individuals with lower numbers of risk alleles had significantly lower risk of glioma, and those with higher numbers of alleles had increased risk for glioma, with statistically significant trends in each histology group. Males and females with low risk scores had similar odds of glioma, while females had increased odds in the upper strata of scores as compared to males. Development of risk scores that weight alleles by effect size, and use sex-specific estimates for variants for which effect size varies by sex (such as 7p11.2 and 8q24.21), may lead to better predictive values for risk scores.
This is the first sex-specific analysis of germline risk variants for glioma. It identifies three loci with sex-specific effects and leverages multiple existing glioma GWAS datasets. While often not included in GWAS, sex-stratified analyses can reveal genetic sources of sexual dimorphism in risk [9,10]. Sex variation in genetic susceptibility to disease is likely not due to sex differences in DNA sequence, but is likely to be related to sex-specific regulatory functions [35-37]. These analyses may not only contribute to understanding of sources of sex difference in incidence, but may also suggest mechanisms and pathways that vary by sex in contributions to gliomagenesis.
In addition to genetic sources of difference, there are likely several additional factors acting in combination which contribute to sex differences in glioma incidence. Sex differences in disease can also be linked to in utero development, during which time gene expression and risk phenotypes are patterned through the action of X alleles that escape inactivation and genes on the non-pseudo-autosomal component of the Y chromosome, as well as the epigenetic effects of in utero testosterone [38]. A previous analysis estimating heritability of brain and CNS tumors by sex using twins attempted to estimate sex-specific relative risks, but these analyses were limited by a small sample size [39]. Further investigation of the inheritance patterns of familial glioma by sex may also provide additional information about sex differences in this disease.
There are several limitations to this analysis. Individuals included in these datasets were recruited during different time periods from numerous institutions, with no central review of pathology. Molecular tumor markers were unavailable for all four GWAS datasets, and as a result classifications are based on the treating pathologist's diagnosis using the prevailing histologic criteria at the time of diagnosis. The variant at 8q24.21 has been shown to have significant association with particular molecular subtypes, and without molecular data it was not possible to determine whether the observed result is an artifact of varying molecular features by sex. Oligodendroglioma as a histology is highly enriched for IDH1/2 mutant and 1p/19q co-deleted tumors (117/174, or ~67%, within the TCGA glioma dataset [24]), and it is therefore likely that the analysis using only tumors classified as oligodendroglioma captured most of this molecular subtype. Males and females within histology groups have different frequencies of IDH1/2 mutation [24], which may have confounded the estimates for 8q24.21. The TCGA dataset was used to explore sex differences in allele frequency within molecular groups, but none of the identified SNPs could be directly validated within this set; instead, SNPs in strong LD were evaluated, except at 8q24.21. The 8q24.21 region is not well characterized on the array used for the TCGA genotyping, and as a result this region imputed poorly. No proxy SNP in strong LD with rs55705857 could be identified. Similar trends in RAF to those observed in the overall meta-analysis were seen in the TCGA set, though these differences were not statistically significant. Further interrogation in datasets with molecular classification and direct genotyping of these regions is warranted in order to confirm the sex-specific associations observed in this analysis.

Conclusions
Sex and other demographic differences in cancer susceptibility can provide important clues to etiology, and these differences can be leveraged for discovery in genetic association studies. This analysis identified potential sex-specific effects in two previously identified glioma risk loci (7p11.2 and 8q24.21) and one newly identified autosomal locus (3p21.31). Odds ratios for the highest strata of an unweighted risk score calculated by summing total risk alleles were higher in females as compared to males in all three histology groups. These significant differences in effect size may be a result of differing biological function of these variants by sex due to biological sex differences, or of interaction between these variants and unidentified risk factors that vary in prevalence or effect by sex.

Study cohorts.
This study was approved locally by the institutional review board (IRB) at University Hospitals Cleveland Medical Center and by each participating study site's IRB. Written informed consent was obtained from all participants. In this study, data was combined from four prior glioma GWAS: Glioma

Genotyping and imputation of GWAS datasets.
GICC cases and controls were genotyped on the Illumina Oncoarray [41]. The array included 37,000 beadchips customized to include previously identified glioma-specific candidate single nucleotide polymorphisms (SNPs). SFAGS-GWAS cases and some controls were genotyped on Illumina's HumanCNV370-Duo BeadChip, and the remaining controls were genotyped on the Illumina HumanHap300 and HumanHap550. MDA-GWAS cases were genotyped on the Illumina HumanHap610 and controls using the Illumina HumanHap550 (CGEMS breast [16,40]) or HumanHap300 (CGEMS prostate [15]). GliomaScan cases were genotyped on the Illumina 660W, while controls were selected from cohort studies and were genotyped on Illumina 370D, 550K, 610Q, or 660W (see Rajaraman et al. for specific details of genotyping) [14]. Details of DNA collection and processing are available in previous publications [4,12-14]. Individuals with a call rate (CR) <99% were excluded, as well as all individuals who were of non-European ancestry (<80% estimated European ancestry using the FastPop [42] procedure developed by the GAMEON consortium). For all apparent first-degree relative pairs (identified using estimated identity by descent [IBD] ≥0.5), the control was removed from a case-control pair; otherwise, the individual with the lower call rate was excluded. SNPs with a call rate <95% were excluded, as were those with a minor allele frequency (MAF) <0.01 or displaying significant deviation from Hardy-Weinberg equilibrium (HWE) (p<1x10^-5). Additional details of quality control procedures have been previously described in Melin et al. [4]. All datasets were imputed separately using SHAPEIT and IMPUTE using a merged reference panel consisting of data from the 1,000 genomes project and the UK10K [43-47].
TCGA cases were genotyped on the Affymetrix Genomewide 6.0 array using DNA extracted from whole blood (see previous manuscripts for details of DNA processing [25,26]) and underwent standard GWAS QC; duplicate and related individuals within datasets were excluded [4]. Ancestry outliers were identified in TCGA using principal components analysis in plink 1.9 [48]. Resulting files were imputed using Eagle 2 and Minimac3 as implemented on the Michigan imputation server (https://imputationserver.sph.umich.edu) using the Haplotype Reference Consortium Version r1.1 2016 as a reference panel [49-51]. Somatic characterization of TCGA cases was obtained from the final dataset used for the TCGA pan-glioma analysis [24], and classification schemes were adopted from Eckel-Passow, et al. [52] and Ceccarelli, et al. [24].
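The sample-level and SNP-level exclusion criteria above can be expressed as simple filters. The sketch below is illustrative only: the record type, field names, and function names are hypothetical, the per-SNP statistics are assumed to be precomputed, and it does not reproduce the actual pipeline, which is described in Melin et al. [4].

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Precomputed per-SNP QC statistics (hypothetical record layout)."""
    snp_id: str
    call_rate: float  # fraction of samples with a genotype call
    maf: float        # minor allele frequency
    hwe_p: float      # p value for deviation from Hardy-Weinberg equilibrium


def passes_sample_qc(call_rate: float, european_ancestry: float) -> bool:
    """Keep individuals with call rate >= 99% and >= 80% estimated European ancestry."""
    return call_rate >= 0.99 and european_ancestry >= 0.80


def passes_snp_qc(snp: SnpQc) -> bool:
    """Keep SNPs with call rate >= 95%, MAF >= 0.01, and HWE p >= 1x10^-5."""
    return snp.call_rate >= 0.95 and snp.maf >= 0.01 and snp.hwe_p >= 1e-5


# Made-up example records, for illustration only
snps = [
    SnpQc("snp_a", call_rate=0.99, maf=0.20, hwe_p=0.50),
    SnpQc("snp_b", call_rate=0.99, maf=0.005, hwe_p=0.50),  # fails MAF
    SnpQc("snp_c", call_rate=0.90, maf=0.20, hwe_p=0.50),   # fails call rate
    SnpQc("snp_d", call_rate=0.99, maf=0.20, hwe_p=1e-8),   # fails HWE
]
kept = [s.snp_id for s in snps if passes_snp_qc(s)]
print(kept)  # ['snp_a']
```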

Sex-stratified scan of the autosomal chromosomes
The data were analyzed using sex-stratified logistic regression models in SNPTEST for all SNPs on autosomal chromosomes within 500kb of previously identified risk loci, and/or those found to be nominally significant (p<5x10^-4) in a previous meta-analysis (Figure 2A) [4,53]. Sex-specific betas (β_M and β_F), standard errors (SE_M and SE_F), and p-values (p_M and p_F) were generated using sex-stratified logistic regression models that were adjusted for the principal components that significantly differed between cases and controls within each study.

Estimation of sex difference and test of statistical significance
β_D and SE_D were estimated using the sex-specific betas and standard errors separately for each dataset, as follows: [...] The difference between the groups was then tested using a z test [54,55]. Sex-stratified results and difference estimates from the four studies were separately combined via inverse-variance weighted fixed-effects meta-analysis in META [56]. See Figure 2A for a schematic of autosomal analysis methods. Case-only analyses were performed for SNPs found to be significant in agnostic analyses, using sex as the outcome for all glioma, GBM, and non-GBM by study, and betas and standard errors were combined via inverse-variance weighted fixed-effects meta-analysis in META [56].
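The equation defining β_D and SE_D is not recoverable from the text above. A common formulation for comparing two estimates from independent samples, assumed here rather than taken from the source, is β_D = β_M - β_F with SE_D = sqrt(SE_M^2 + SE_F^2), tested with z = β_D / SE_D. The sketch below illustrates that assumed form together with a generic inverse-variance weighted fixed-effects combination; the function names and the direction of the difference (male minus female) are illustrative choices, and META itself is not called.

```python
import math
from typing import Sequence, Tuple


def sex_difference(beta_m: float, se_m: float,
                   beta_f: float, se_f: float) -> Tuple[float, float, float, float]:
    """Difference between sex-specific log odds ratios and a two-sided z test.

    Assumes the male and female estimates come from independent samples.
    """
    beta_d = beta_m - beta_f
    se_d = math.sqrt(se_m ** 2 + se_f ** 2)
    z = beta_d / se_d
    # Two-sided p value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta_d, se_d, z, p


def fixed_effects_meta(betas: Sequence[float],
                       ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted fixed-effects pooled estimate and standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se


# Made-up per-study difference estimates, for illustration only
study_diffs = [sex_difference(0.10, 0.05, 0.30, 0.06),
               sex_difference(0.12, 0.07, 0.25, 0.08)]
pooled_beta_d, pooled_se_d = fixed_effects_meta([d[0] for d in study_diffs],
                                                [d[1] for d in study_diffs])
```

Each study contributes its own β_D and SE_D, and the pooled difference can then be tested with the same z statistic.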

Sex chromosome analysis
X and Y chromosome data were available from the GICC set only. Males and females were imputed separately for the X chromosome using the previously described merged reference panel. X chromosomes were analyzed using a logistic regression model in the SNPTEST module 'newml', assuming complete inactivation of one allele in females, with males treated as homozygous females (Figure 2B). For prioritized SNPs in the combined model, sex-specific effect estimates were generated using stratified logistic regression models. Y chromosome data were analyzed using logistic regression in SNPTEST (Figure 2B) [57]. Figures were generated using R 3.3.2, GenABEL, qqman, and ggplot [58-61].
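The X chromosome coding described above, in which hemizygous males are treated as homozygous females, can be illustrated with a small dosage function. This is a sketch of the coding as stated in the text, not of SNPTEST's internals; the function name and the "M"/"F" sex labels are hypothetical.

```python
def x_chromosome_dosage(allele_count: int, sex: str) -> int:
    """Code an X chromosome genotype on a 0-2 scale.

    Females carry 0, 1, or 2 copies of the allele and are coded as-is.
    Males carry 0 or 1 copy and are coded as 0 or 2, i.e. as homozygous females.
    """
    if sex == "F":
        if allele_count not in (0, 1, 2):
            raise ValueError("female allele count must be 0, 1, or 2")
        return allele_count
    if sex == "M":
        if allele_count not in (0, 1):
            raise ValueError("male allele count must be 0 or 1")
        return 2 * allele_count
    raise ValueError("sex must be 'M' or 'F'")


print(x_chromosome_dosage(1, "M"))  # 2
print(x_chromosome_dosage(1, "F"))  # 1
```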

Analysis of TCGA germline and somatic data
Only newly diagnosed cases from TCGA GBM and LGG with no neo-adjuvant treatment or prior cancer were used. Demographic characteristics, molecular classification and somatic alterations data were obtained from Ceccarelli, et al. [24]. Chi-square tests were used to compare the frequency of somatic alterations between age groups. SNPs found to be nominally significant (p<5x10^-4) in a previous eight-study meta-analysis [4], with imputation quality ≥0.7, were identified within the TCGA genotype data, and D' and r^2 values in CEU were used to select proxy SNPs [18]. Using these SNPs, a case-only analysis using sex as a binary phenotype was conducted using logistic regression in SNPTEST, assuming an additive model, to estimate betas, standard errors, and p values [53]. Results were considered significant at p<0.003 (Bonferroni correction for 15 tests, for the three assessed loci in each of five histology groups).

Calculation of unweighted genetic risk scores
In order to estimate the cumulative effects of significant variants by sex, histology-specific unweighted risk scores were calculated using the SNPs found to be significantly associated with each outcome. Data from all four studies were merged, and any imputed genotypes with genotype probability >0.8 were converted to hard calls. An overall unweighted risk score (URS) was generated by summing, for each individual, the risk alleles at rs12752552, rs9841110, rs10069690, rs11979158, rs55705857, rs634537, rs12803321, rs3751667, rs78378222, and rs2297440. As risk alleles are known to have histology-specific associations [4], histology-specific scores were generated for GBM and non-GBM using only the SNPs found to have a significant association with each histology. The GBM-specific URS (URS-GBM) was calculated by summing the number of risk alleles at rs9841110, rs10069690, rs11979158, rs634537, rs78378222, and rs2297440. The non-GBM-specific URS (URS-NGBM) was calculated by summing the number of risk alleles at rs10069690, rs55705857, rs634537, rs12803321, rs78378222, and rs2297440. Differences in median scores between groups were tested using Wilcoxon rank sum tests. Scores were compared against the median score for each set (URS: 10 alleles, URS-GBM: 6 alleles, URS-NGBM: 4 alleles). Odds ratios and 95% confidence intervals for each level of the score were estimated using sex-stratified logistic regression adjusted for age at diagnosis (for controls where only an age range was available, the mean value of the range was used), where each score was compared to the median score within the entire population, as described in Shete et al. [13].
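The score construction described above reduces to a hard-call step followed by a sum of risk-allele counts over a fixed SNP list. The sketch below uses the SNP lists given in the text; the function names, the dictionary layout, and the assumption that genotype probabilities are ordered as 0, 1, and 2 copies of the risk allele are illustrative and not taken from the source.

```python
from typing import Mapping, Optional, Sequence

# SNP sets as listed in the text
URS_SNPS = ["rs12752552", "rs9841110", "rs10069690", "rs11979158", "rs55705857",
            "rs634537", "rs12803321", "rs3751667", "rs78378222", "rs2297440"]
URS_GBM_SNPS = ["rs9841110", "rs10069690", "rs11979158",
                "rs634537", "rs78378222", "rs2297440"]
URS_NGBM_SNPS = ["rs10069690", "rs55705857", "rs634537",
                 "rs12803321", "rs78378222", "rs2297440"]


def hard_call(genotype_probs: Sequence[float], threshold: float = 0.8) -> Optional[int]:
    """Return the most probable genotype if its probability exceeds the threshold.

    Assumes probabilities are ordered as (0, 1, 2) copies of the risk allele;
    returns None when no genotype is confident enough to call.
    """
    best = max(range(len(genotype_probs)), key=lambda i: genotype_probs[i])
    return best if genotype_probs[best] > threshold else None


def unweighted_risk_score(risk_allele_counts: Mapping[str, int],
                          snps: Sequence[str]) -> int:
    """Sum risk-allele counts (0, 1, or 2 per SNP) over the given SNP set."""
    return sum(risk_allele_counts[snp] for snp in snps)


# Hypothetical individual who is heterozygous at every SNP
person = {snp: 1 for snp in URS_SNPS}
print(unweighted_risk_score(person, URS_SNPS))       # 10
print(unweighted_risk_score(person, URS_GBM_SNPS))   # 6
print(unweighted_risk_score(person, URS_NGBM_SNPS))  # 6
```

A missing SNP raises a `KeyError` here rather than being silently skipped, since a score summed over fewer SNPs would not be comparable across individuals.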

Calculation of trait variance explained by SNPs with sex-specific effects
In order to determine whether the identified SNPs with sex-specific effects more accurately estimate odds of glioma than sex alone, logistic regression models were used to estimate odds of all glioma, GBM, and non-GBM glioma based on sex using the GICC data only. Proportion of variance in odds of glioma explained by sex-specific SNPs was calculated using R^2 estimated from the log likelihood of the null model (sex, age at diagnosis, and the first two principal components only) and the full model (including the identified SNPs rs9841110, rs11979158, and rs55705857) [62], calculated as follows: [...] Proportion of variance explained was also calculated separately by sex for each histology (null model adjusted for age at diagnosis and the first two principal components only).
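The formula that followed "calculated as follows" is not recoverable from the text, and the specific measure from [62] is not known here. As one possibility only, the sketch below assumes a McFadden-style likelihood ratio form, R^2 = 1 - LL_full / LL_null, where the null model is the covariate-only model described above; the measure actually used may differ, and the function name and example values are hypothetical.

```python
def likelihood_ratio_r2(loglik_null: float, loglik_full: float) -> float:
    """McFadden-style pseudo-R^2 from two model log likelihoods (assumed form).

    loglik_null: log likelihood of the covariate-only model
    loglik_full: log likelihood of the model that also includes the SNPs
    """
    if loglik_null >= 0:
        raise ValueError("log likelihood of a binary outcome model should be negative")
    return 1.0 - loglik_full / loglik_null


# Made-up log likelihoods, for illustration only
r2 = likelihood_ratio_r2(loglik_null=-1000.0, loglik_full=-986.0)
print(f"{r2:.3f}")  # 0.014
```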

Supporting information captions
Supplemental Table 1. Case-only odds ratios (OR), 95% confidence intervals (95% CI), and p values from meta-analysis and individual studies for rs11979158, rs55705857 and rs9841110 overall and by histology groupings.
Supplemental Table 2. Characteristics of individuals in The Cancer Genome Atlas, by study and sex.
Supplemental Table 3. Linkage disequilibrium measures, sex-stratified odds ratios, 95% confidence intervals (95% CI), and p values from meta-analysis for marker SNPs selected within the Cancer Genome Atlas genotyping data.
Supplemental Table 4. Odds ratios and 95% confidence intervals for unweighted scores in all glioma, GBM, and non-GBM overall and by sex.
Supplemental Table 5. Info score, sex-specific odds ratios (OR), 95% confidence intervals (95% CI), and p values from meta-analysis and individual studies for rs11979158, rs55705857 and rs9841110 overall and by histology groupings.
Supplemental Table 6. Risk allele frequencies (RAF), for meta-analysis and individual studies for rs11979158, rs55705857 and rs9841110 overall and by histology groupings.
Supplemental Figure 1. P values of SNPs between 48.8mb and 50mb on chromosome 3 in males for A) all glioma, B) GBM, and C) non-GBM, and in females for D) all glioma, E) GBM, and F) non-GBM
Supplemental Figure 2. Proportion of samples by glioma subtype (based on IDH1/2 mutation, 1p19q, and TERT mutation) in the TCGA GBM and LGG datasets by sex, overall and stratified by study
Supplemental Figure 3. Proportion of samples by pan-glioma methylation subgroups [23] in the TCGA GBM and LGG datasets by sex, overall and stratified by study
Supplemental Figure 4. Density of histology-specific unweighted risk score by sex and case/control status for A) URS in all glioma, B) URS in GBM, C) URS in non-GBM, D) URS-GBM in GBM only, and E) URS-NGBM in non-GBM only
Supplemental Figure 5. Sex-specific odds ratios and 95% CI from meta-analysis and by study for rs11979158 (7p11.2) for all glioma, GBM, and non-GBM
Supplemental Figure 6. Sex-specific odds ratios and 95% CI from meta-analysis and by study for rs55705857 (8q24.21) for all glioma, GBM, and non-GBM
Supplemental Figure 7. Sex-specific odds ratios and 95% CI from meta-analysis and by study for rs9841110 (3p21.31) for all glioma, GBM, and non-GBM