Novel pleiotropic risk loci for melanoma and nevus density implicate multiple biological pathways

The total number of acquired melanocytic nevi on the skin is strongly correlated with melanoma risk. Here we report a meta-analysis of 11 nevus GWAS from Australia, Netherlands, UK, and USA comprising 52,506 individuals. We confirm known loci including MTAP, PLA2G6, and IRF4, and detect novel SNPs in KITLG and a region of 9q32. In a bivariate analysis combining the nevus results with a recent melanoma GWAS meta-analysis (12,874 cases, 23,203 controls), SNPs near GPRC5A, CYP1B1, PPARGC1B, HDAC4, FAM208B, DOCK8, and SYNE2 reached global significance, and other loci, including MIR146A and OBFC1, reached a suggestive level. Overall, we conclude that most nevus genes affect melanoma risk (KITLG an exception), while many melanoma risk loci do not alter nevus count. For example, variants in TERC and OBFC1 affect both traits, but other telomere length maintenance genes seem to affect melanoma risk only. Our findings implicate multiple pathways in nevogenesis.


Supplementary
. Meta-analysis heterogeneity and meta-regression results for nevus association using the R metafor package. The metaregression included mean age in the study, mean absolute latitude and nevus measurement method as moderators. In the meta-regression, I 2 is the estimated percentage of sampling variance due to heterogeneity between studies, R 2 the percentage explained by the moderator variables, and H 2 the percentage unexplained residual heterogeneity. Q M P is the P-value from the test for the contribution of moderators, and Q E P, the P-value for the test for residual heterogeneity.
Supplementary Table 6. Genome-wide complex trait analysis (GCTA) estimates of the variance explained by all autosomal SNPs for nevus count for TwinUK, QIMR and a combined sample (using all family members or only unrelateds (1/family)).    While nevus counts or density assessments are available for melanoma cases from a number of studies, in the meta-analysis of nevus count we included only samples of healthy individuals without melanoma, all of European ancestry.

ALSPAC:
The Avon Longitudinal Study of Parents and Children is a population-based birth cohort study consisting of over 13,000 women and their children recruited from the county of Avon, UK, in the early 1990s 14,15 . Both mothers and children have been extensively followed from the eighth gestational week onward with a combination of self-reported questionnaires, medical records, and physical examinations. Biological samples including DNA were collected for 10,121 of the children from this cohort, and genome-wide SNP typing has been performed. The number of large and small nevi on the arms and legs was self-counted by 3,257 participants at age 15, and validated by observer count of nevi >5 mm diameter on one arm in 325 individuals 16 . The total number for whom both nevus count and genome-wide association studies (GWAS) were available was 3,309.
Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees and written and informed consent was provided by the parents. The study website contains details of all the data that are available through a fully searchable data dictionary (http://www.bris.ac.uk/alspac/researchers/data-access/datadictionary/). Because of a quite skewed distribution of counts, the inverse Gaussian rankit transformation was used for the counts, and regression adjusted for sex (all participants are the same age). Coding of phenotype and statistical analysis: Both the NHS and HPFS prospectively collected self-reported information on the number of melanocytic nevi on the arms (larger than 3 mm in diameter) using similar wording. Participants were asked to choose from the following categories: 1 = none, 2 = 1-2, 3 = 3-5, 4 = 6-9, 5 = 10-14, 6 = 15-20, and 7 = 21+. This ordinal variable is a highly significant predictor of melanoma risk in these cohorts 17 . SNP genotype data are available for 32,975 participants who took part in seven case-control studies nested within the NHS and HPFS.

Leeds:
The Leeds case-control study 3 covers a geographically defined area of Yorkshire and the northern region of the United Kingdom. Controls were identified by the cases' family doctors as not having cancer and were randomly invited from individuals with the same sex and within the same 5-year age group as a case. Participants were examined by specifically trained research nurses, who counted nevi on exposed skin (excluding the genitalia and breasts). Total body nevus count was defined as the sum of all nevi >2 mm. We include phenotype and genotype data from 397 controls (mean age 57 years, range 21-80). Studies were approved by the UK Multi-Centre Research Ethics Committee (MREC), and the Patient Information Advisory Group (PIAG) and informed consent was obtained from all subjects.

QIMR Brisbane Twin Nevus Study (BTNS):
As described in detail elsewhere4,18, adolescent twins, their siblings and parents have been recruited since 1992 into an ongoing study of genetic and environmental factors contributing to the development of pigmented nevi and other risk factors for skin cancer. The proband twins are recruited via public appeal and through schools around Brisbane. The total body nevus count (excluding the genital area and breasts) was defined as the sum of all nevi >0 mm in diameter. Total body nevus counts (excluding the genital area and breasts) summed across three size classes, <2 mm, 2-5 mm, >5 mm, were obtained from the adolescent twins on two occasions by a trained nurse (at ages 12 and 14), and in near-age singleton siblings on one occasion (the first visit of the twins). For simplicity, we use only the first measurement from the twins, and 3,261 individuals from 1,309 families were included. The mean age of both cohorts at the time of counting was 12.6 (range 10-23).

Parents of BTNS Twins:
Mole counts in the parents of the twins were assessed by a 4-point questionnaire item with levels: none, a few, moderate, many; the item was accompanied by a cartoon showing 0, 6, 14 and 28 spots visible on a silhouette of a human in the anatomical position. Of the parents of the twins, 2,248 from 1,299 families had self-reported nevus score and genotypes available. While the parents in themselves may be viewed as a sample of unrelated individuals, their relatedness to the twins/sibs in Cohorts 1 and 2 above must be taken into account in genetic analysis (see below).

QIMR Adult Twins:
In the context of an online survey whose primary focus was childhood trauma of adult twins who had been already genome-wide genotyped 6  Harvard: There were 36 GWAS datasets from the NHS, NHS2, and HPFS with cleaned genotype data available (Tables S1.2-1-S1.2-3). These studies were genotyped using different arrays that fall into five categories: AffyMetrix, Illumina HumanHap series, Illumina OmniExpress, OncoArray, and HumanCoreExome. Standard quality control filters for call rate, Hardy-Weinberg equilibrium, and other measures were applied to the genotyped SNPs and/or samples. We combined these datasets into five compiled datasets based on their genotype platforms using the same methods described by Lindström and coworkers 19 . Briefly, datasets were imputed separately, with the 1000 Genomes Project Phase 3 Integrated Release Version 5 as the reference panel. SNP genotypes were imputed in two steps: genotypes on each chromosome were phased using ShapeIT (v2.r837), and these phased data submitted to the Michigan Imputation Server, which used Minimac3 to impute the phased genotypes to approximately 47 million markers in the 1000 Genomes Project mixed population. We regressed nevus counts (0 = none, 1.5 = 1-2, 4 = 3-5, 7.5 = 6-9, 10 = 10+) on the dosage of each SNP, adjusted for age, sex, and the top three principal components (PCs). All the analyses were first conducted within each of the platform-specific datasets, and then combined by inverse-variance-weighted meta-analysis if results were not significantly different. ProbABEL package and R-3.0.2 were used to perform these tests. Ethical approval was obtained from appropriate IRBs and informed consent was obtained from all subjects. Analysis of the nevus counts in the adolescent BTNS and TEST studies was carried out following cube-root transformation of the counts, and including age, age 2 , year studied, body surface area, estimated cumulative UV exposure, ancestry (first 5 PCs as well as reported grandparental ancestry, coded as northern vs. southern European), hair and skin colour as covariates. For these counts from adolescents, we found that cube-root transformation was significantly better than log transformation based on the Box-Cox analysis, and it closely matched the results of a negative binomial generalized linear mixed model analysis of the BTNS data, which also improved significantly in terms of model-fit diagnostics Association of the GWAS SNP genotypes to quantitative phenotypes was carried out using a conventional linear mixed model as implemented in GEMMA 21 which takes account of relatedness. For some analyses fine-scale follow-up and multivariate mixed model association analysis was carried out using MENDEL 14.0 22 , and WOMBAT 23 (see Table  S3.2-1). Descriptive statistics and plots were produced using the R statistical programming environment. 24 TwinsUK: The TwinsUK sample was genotyped using a combination of Illumina arrays (HumanHap3001'2, HumanHap610Q, 1M-Duo and 1.2MDuo 1M) as described previously12. Stringent QC measures were implemented, including minimum genotyping success rate (>95%), Hardy-Weinberg equilibrium (P > 10 -6 ), minimum MAF (>1%) and imputation quality score (>0.7). Subjects of non-Caucasian ancestry were excluded from the analysis. Whole genome imputation of the genotypes was performed using the Phase I, Integrated Variant Set from the 1000 Genomes Project (March 2012). Pre-phasing was done using ShapeIt and the imputed genotypic probabilities were calculated using IMPUTE2.

Nevus GWAS meta-analysis
The assessment of nevus counts varies considerably between the 11 studies in three respects: (a) nevus counts vs. density ratings; (b) whole body vs. only certain body parts; (c) all moles (>0 mm diameter) or only moles greater than 2 mm, or 3 mm, or 5 mm. These differences could contribute considerable heterogeneity of phenotypes to our analyses, so we have done considerable preliminary work to convince ourselves that all assessments are measuring the same biological dimension of "moliness". Results of these analyses are found in Fig. S3.3-1.
Given this, we combined results from each study as regression coefficients and associated standard errors in standard fixed and random effects meta-analyses using the METAL program 26  To declare a locus as genome-wide significant we have chosen the fixed-effects meta-analysis association P-value for SNPs when the P value for study homogeneity test P Q > 0.05, and selected the conventional random effects mean effect P value when P Q ≤ 0.05. It is well known that the conventional random effects test is less powerful in the presence of betweenstudy heterogeneity, 93-96 so for fine mapping in regional association plots etc, we present the fixed-effects P values, even when heterogeneity is present.
We perfomed a REML-based meta-regression for the 21 most significant nevus-associated regions from Table 2 (main text) using the R metafor package 88 . The included covariates were mean age of participants for that study, mean absolute latitude for the location of the study, and study nevus measurement method (categorised as "partial body count", "total body count", "ordinal score").
In the case of extremely significant between-study heterogeneity, most notably rs12203592 in IRF4, we would strongly argue against intepreting an across-studies mean effect of this SNP of zero as being evidence that this locus is not truly associated to nevus count and melanoma risk. In a population genetic context, if there is statistically significant evidence of differences between populations in strength of association, then we have detected a real biological phenomenon, providing we can exclude methodological artefact as the cause. In this case the appropriate null hypothesis is no association within any subgroup. [93][94][95][96] To this way of thinking, the detection of marked heterogeneity for IRF4 gives strong support for association of rs12203592 genotype with nevus count in some but not all populations.
Given that many of the nevus-associated loci are involved in pigmentation, it is not surprising these exhibit between-population differences in association -both allele frequency and linkage disequilibrium pattern differences can lead to this, quite aside from gene by environment interaction.
Han and Eskin 93 suggested an alternative random effects test, as implemented in the METASOFT package (http://genetics.cs.ucla.edu/meta/), which tests the composite hypothesis that the mean effect and heterogeneity are zero. The Han and Eskin P-values are compared to those from other approaches in Table S3.2.2 below. In most cases, the Han and Eskin test agrees with the conventional fixed effects analysis P, except for rs12203592, where it is much more significant.

Melanoma GWAS meta-analysis
We combined the results from the nevus meta-analysis above with results from Stage 1 of a recently published meta-analysis of cutaneous malignant melanoma (CMM) 13 . Stage 1 of the CMM study consisted of 11 GWAS data sets totalling 12,874 cases and 23,203 controls from Europe, Australia and the United States; this stage included all six published CMM GWAS and five unpublished ones. We do not utilize the results of stage 2 of that study, where a further 3,116 CMM cases and 3,206 controls from three additional data sets were genotyped for the most significantly associated SNP from each region, reaching P < 10 -6 in stage 1. As a result, certain melanoma association peaks are not genome-wide significant in their own right in the present bivariate analyses. Further details of these studies can be found in the Supplementary Note to Law et al. 13

Meta-analysis of nevus GWAS meta-analysis and melanoma GWAS meta-analysis
Combination of the nevus and melanoma P-values was performed using the Stouffer method.
We weighted the P-values using the square root of the sample sizes for nevi and melanoma. Generally, the fixed effects P-value was used unless the Cochran Q heterogeneity P-value was below 0.05, uncorrected for multiple testing. We also estimated a q value 27 for the combined association P values using the R qvalue package (https://bioconductor.org/packages/release/bioc/html/qvalue.html ). A Manhattan plot for the combined nevus GWASMA plus melanoma GWASMA is shown in Fig. S3.3-2.

S2.2 Gene-based tests using VEGAS and PASCAL
We applied the VEGAS program 28 to our combined nevus-melanoma meta-analytic P values to test for gene association to our traits. We used 1,000,000 iterations to generate empirical P values for each gene, and regarded the usual Bonferroni-corrected critical threshold of 3x10 -6 as genome-wide significant. We also estimated q values and local FDRs (lfdr) for the gene based P values29-32 using the R qvalue package (https://bioconductor.org/packages/release/bioc/html/qvalue.html). The local FDR is an estimate of the probability one will falsely declare a locus significant if this level P value is used as the critical threshold in a test of statistical significance. It is estimated by fitting a 2distribution mixture to the P values, most of which are taken as being samples from the unassociated gene set.
The PASCAL program 33 has very similar functionality to VEGAS, but calculates gene-based P values using an analytic approach. It also extends this to pathway-based analysis.

S2.3 Estimating SNP heritability (h 2 s) and r g using bivariate REML, GCTA analyses and LD score regression
We have performed multiple supplementary univariate and multivariate REML analyses of those studies where there are nevi counts carried out by trained observers (Australian BTNS and TwinsUK). For one sub-analysis, we have also analyzed melanoma and self-rated nevus count in Australian case families, combining these with control pedigrees from the BTNS to contrast the estimates of the genetic correlation between these traits as estimated by the LD score regression approach.

S2.3.1 GCTA and LDAK heritability analysis of BTNS and TwinsUK (classical twin analysis, partitioning by chromosome)
We used GCTA 34 to estimate the SNP heritability (h 2 s ) in the combined BTNS plus TwinsUK sample, given that nevi were counted using the same standardised protocol. This was done in two ways: first, including all subjects regardless of relatedness (n = 5,608); and second, and more conventionally, using only one subject per family (n = 2,863) to preclude confounding of results by close relatives. Contributions by chromosome were also assessed.
For some analyses (see S ection S 3.4.2 ), we also used the LDAK 5.0 program. 35 This allowed us to calculate LD-adjusted weights for the empirical kinship matrices. We also estimated contributions of sets of our peak associated regions as an additional region kinship matrix within the same mixed model framework.

S2.3.2 Bivariate heritability analysis of melanoma and nevus count
We also performed analyses examining the overall architecture of the relationship between nevus count and melanoma risk. First, a bivariate REML analysis using the GCTA package of a melanoma Queensland case-control family-based sample36 that overlaps one study used by Law and coworkers; there were 5,210 individuals from 3,520 melanoma case and control families, including 1,137 melanoma cases, and nevus assessment was a 4-point self-rating questionnaire item. We compared these results to those of a bivariate LD score regression analysis of summary results from the entire nevus and melanoma meta-analyses (https://github.com/bulik/ldsc).

S2.4 GCTA conditional and joint analysis
The GCTA package provides an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of GWAS and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. We have performed two analyses using the QIMR adolescent twin sample as our reference population. The first is a stepwise model selection procedure selecting independently associated SNPs "-cojo-slct --cojo-p 5e-8 --maf 0.01 --cojo-actual-geno"), obtaining 8 SNPs. The second was a joint analysis of these top SNPs estimating the genetic variance explained.

S2.5 Bivariate analysis using GWAS-PW
Pickrell et al. 37 recently described an interesting extension of the approach of Giambartolomei et al. 38 to the combination of summary results from individual trait GWAS in a Bayesian bivariate analysis that can examine the genetic relationship between the traits. We used their program, GWAS-PW, on the meta-analysis results from nevus count (the present study), melanoma 13 , telomere length 39 , and a GWAS of pigmentation as measured by hair darkness using the BTNS twins (hair color as self-reported on a 5-point scale). The genome was segmented into approximately independent LD blocks (European 1000 Genomes populations) using the file provided by Pickrell et al. 37 . The program tests the hypothesis that each region contains either: (a) no associated trait loci; (b) one measured locus associated with the first trait; (c) one measured locus associated with the second trait; (d) one locus associated with both traits; (e) two loci, each associated with one trait. The analysis provides posterior probabilities for each of these hypotheses for each region. It also provides similar statistics for each SNP within the region. The latter results were sometimes surprising for our data, in that the SNP with the smallest observed individual P values often had very low posterior probabilities supporting the hypothesis that it was associated. As a result, we have relied in almost all cases on the binned analyses.
To summarize the results for this analysis, we have plotted the posterior probabilities for the four hypotheses (b)-(e) ( Fig. S3.6-1).
In the case of the telomere length GWAS results, we have imputed the association Z values up to the same SNP density as the nevus and melanoma GWAS using the DISTmix pack-age40, and interpolating for the necessary variances ( Fig. S3.6-2a-c).

S2.6 Pathway analysis
We assessed contributions of entire pathways by estimating a genomic relationship matrix (GRM) for all SNPs within the genes making up the pathway as indicated either by gene product function or GWAS result. For example, the telomere length maintenance pathway GRM included SNPS from ACYP2, ASCC2, CSNK2A2, CTC1, DKK2, NAF1, OBFC1, RTEL1, SYT16, TERC, TERT, TMPRSS7,TRDMT1, ZNF208. Mixed model analysis was undertaken using the entire Australian twin samples including both pedigree-based and genomic GRMs and a shared family environment correlation matrix, along with the candidate pathway GRM, using LDAK 5.0 (as we encountered a few numerical problems using the GCTA package).

S2.7 Bioinformatic methods
Multiple methods were used to annotate trait associated SNPs. HaploReg Version 4.1 41 was used to obtain summary functional data for individual SNPs. The Roadmap Epigenomics dataset were mined for locus annotations for our SNPs and genes for skin related cell types: melanocytes, skin keratinocytes and fibroblasts. We also assessed all candidate SNPs for evidence that they were co-localized and affected the function of regulatory elements, or were known eQTLs. Specifically, we tested co-localization of candidate SNPs with melanocyte H3K27Ac sites (http://www.roadmapepigenomics.org/), MITF and SOX10 binding sites, Hi-C, FANTOM5, PRESTIGE42 (Table S2.7-1).

S3.2 Nevus GWAS meta-analysis
As described earlier, we combined results from each study as regression coefficients and associated standard errors in standard fixed and random effects meta-analyses using the METAL program 26 1-11). Meta-regressions including age, latituted and nevus counting method as moderators suggest this is largely due to interactions with age and method in the case of the most extreme SNP, rs12203592 in IRF4 (Table S3.2-1). Different nevus subtypes predominate at different ages, with the dermoscopic globular type most common before age 20 43 . The QIMR and Leeds studies are the only ones that have adolescents assessed, so enabling detection of this interaction 5 . The other epidemiological driver of heterogeneity is intensity of sun exposure, which is obviously mainly high in the Australian studies and lower in the European ones 5 . The contribution of family environment in the twin analyses was twice as high in the low UV environment of Britain -we would speculate that exposures are uniformly high in the Queensland environment. A methodological cause of heterogeneity that should affect all loci equally would be differences in nevus assessment methods, since attenuation leads to downward biasing of effect estimates; however, the method could covary with other confounders producing differential errors between loci. The results in Table S3.3.2 demonstrate the shortcomings of the conventional random effects test of overall association in the presence of heterogeneity.

S.3.3.1 Plots for combined meta-analysis of the combined nevus GWASMA plus melanoma GWASMA
The Manhattan and Quantile-Quantile plot for the combined nevus GWASMA plus melanoma GWASMA is shown in Supplementary Figs   Colours been laid down sequentially, so the colour of lesser SNPs from a "more significant" chromosome are overwritten. The genomic inflation factor λ=1.36, standardized genomic inflation factor λ 1000 =1.004

S3.3.2 Gene-based tests using VEGAS and PASCAL
The VEGAS and PASCAL programs 28,44 were applied to our combined nevus-melanoma meta-analytic P values to test for gene association to our traits. We used 1,000,000 iterations to generate empirical P values for each gene, and regarded the usual Bonferroni-corrected critical threshold of 3x10 -6 as genome-wide significant. We also estimated q values and local FDRs (lfdr) for the gene based P-values 29-32 using the R qvalue package (https://bioconductor.org/packages/release/bioc/html/qvalue.html). The local FDR is an estimate of the probability one will falsely declare a locus significant if this level P value is used as the critical threshold in a test of statistical significance. It is estimated by fitting a 2distribution mixture to the P values; most of which are taken as being samples from the unassociated gene set.

S3.4.1 GCTA heritability analysis of BTNS and TwinsUK (classical twin analysis, partitioning by chromosome)
We used GCTA 41 to estimate the SNP heritability (h 2 s ) in the combined BTNS plus TwinsUK sample, given that nevi were counted using the same standardized protocol. We did this in two ways, first including all subjects regardless of relatedness (n = 5,608), and second, and more conventionally, using only one subject per family (n = 2,863) to preclude confounding of results by close relatives. The first analysis in many senses just recapitulates the standard variance components analysis of twin data with its high statistical power arising from the large magnitude of the empirical kinship coefficients but with the added advantage that it partitions total twin heritability by chromosome. Since it is essentially the classical twin method, we may also include a variance component for shared family environment (C). It should be noted that this analysis will, with equal power, detect effects of both common and unmeasured rare variants. In contrast, the one per family analysis is only powered to detect common variants tagged by the genotyping array. We can, in principle, therefore compare the individual chromosome h 2 s estimates under the two methods to infer the relative genomic distributions of common and rare variants. In practice, standard errors may be too large to make this a useful comparison with current power constraints (sample size, chip design). Table S3.4-1 shows, for the combined BTNS plus TwinsUK sample, the "all subjects" genetic component G, which is equivalent h 2 twin = 58%, while C = 34.2% and E = 7.8% (unique environment, which includes measurement error). (The smaller estimate for G, and correspondingly larger estimate for C in the UK sample may be due to a much more dispersed age structure of the UK sample  compared to the Oz sample (9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23).) The "one per family" GCTA analysis of the combines sample estimates h 2 s = 29.4%, which is about half our h 2 twin = 58%. This ratio (from which the "missing heritability" is often invoked) is an almost ubiquitous finding from current GWAS results for many complex traits, and is almost certainly due to imperfect tagging of variants with the chips employed.
The contributions by chromosome to trait genetic variance were also assessed. Every autosome contributed some variance, but the largest single contribution was from one-sixth of chromosome 9 (Table S3.4-2). We can, in principle, compare the individual chromosome h 2 s estimates from pedigree-based and SNP regressions to infer the relative genomic distributions of common and rare variants. In practice, standard errors may be too large to make this a useful comparison with current power constraints (sample size, chip design). Given that the entire chromosomal contribution of chr 9 in the twin analysis is close to that of the individual top SNPs at MTAP and 9q32.1 suggests that the responsible variants are well captured by the chip. A more complex case is for chr6 where we see sizable contributions for Australia but negligible ones for the UK. We have previously shown the effects of the IRF4 locus on 6pter are large in the young BTNS sample but absent in the older TwinsUK sample, and this is reflected in the Forest plot [ Supplementary Fig. 5.1-4]. Another interesting pattern (albeit not significant) is on chr12 where we see h 2 c (using all family members) = 3.3% versus 0.0% for 1 per family. Were these differences significant we might infer that most variation on this chromosome is rare rather than common and tagged by SNPs. The fact that no differences are strikingly great or significant is supporting evidence that polygenic effects are small and distributed across the genome in proportion to chromosome length.

S3.4.2 Variance components based estimates of contributions of genes and gene sets to variation in total nevus count.
Using the same BTNS adolescent and Twins UK twins, we estimated the proportional contribution of the most strongly associated loci, as well as total contribution from loci in pathways implicated by the SNP association analysis. We carried this out using the LDAK 5.0 program with the complete twin families, adjusting for age, body surface area. The contribution of the top associated SNPs was represented by the empirical kinship matrix based on 1000 SNPs covering the regions of the loci in the bivariate melanoma and nevus count analysis (all SNPs in our associated regions with an combined association P value less than 10 -3 ; a list of these SNPs is in the attached spreadsheet). Genomic effects are represented by the (overall) pedigree-based kinships. In this analysis, again a significant effect of family environment (Table S3.4-3) was detectable in the older UK sample. The contributions of our chosen genomic regions to the total genetic variance were 12% in the UK twin families, and 25% in the Australian twin families.

S3.4.3 Bivariate heritability analysis of melanoma and nevus count
We have also performed analyses examining the overall architecture of the relationship between nevus count and melanoma risk in the QMEGA family study of melanoma 36 . First we performed a bivariate REML analysis in the GCTA package; there were 5,210 individuals from 3,520 case and control families in the analysis, including 1,137 melanoma cases; nevus assessment was a 4-point self-rating.
We also carried out bivariate LD score regression analysis of the nevus and melanoma metaanalyses (https://github.com/bulik/ldsc). For melanoma, the family-based GCTA and total dataset LD score analyses gave somewhat different point heritability estimates of 0.59 (0.22) and 0.16 (0.03) respectively. The latter may be closer to the SNP heritability h 2 s and is consistent with the h 2 s estimate of 19.2% due to significant SNPs in the Law et al metaanalysis 13 , while the former estimate is consistent with the value for melanoma age of onset of 0.45 (0.05) obtained using a more elaborate method on an overlapping data-set 25 .
Both analyses badly underestimated the h 2 of nevus count at 0.07 (0.04) and 0.03 (0.01) respectively. This is probably because of the coarse nevus measure in the family study and the genetic architecture for the LD score analysis (the original authors note that the method does not seem to "work" for some phenotypes, including perhaps melanoma, as above). In both analyses, however, the r g is high at 0.68 (0.40) and 0.66 (0.14), albeit with large SE for the pedigree estimate. We interpret this as reflecting the strong genetic causation of nevus number along with a direct phenotypic causal pathway to melanoma, that is, the genes we have found for nevus number act via different pathways but will all affect melanoma risk. In all examples, the nevus number increasing alleles we have found also proportionately increase risk of melanoma. On the other hand, the converse is not always true -there are some melanoma loci that do not increase nevus count. This is in keeping with the notion that nevus number is the final common pathway to melanoma risk for all these heterogeneous nevus risk pathways.

S3.5 Results of GCTA conditional and joint analysis
The GCTA package provides an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of GWAS and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. We have performed two analyses using the QIMR adolescent twin sample as our reference population. The first is a stepwise model selection procedure selecting independently associated SNPs "-cojo-slct --cojo-p 5e-8 --maf 0.01 --cojo-actual-geno"), obtaining 8 SNPs (Table S3.5-1). The second was a joint analysis of the top SNPs from the bivariate melanoma and nevus analysis estimating the multivariate mutually adjusted contributions from each SNP (see Table  S3.5-2). Comparison of the univariate and multivariate adjusted betas (6th and 11th columns of Table S3.5-2) reveals negligible changes in magnitude.

S3.6 Results of bivariate analyses using GWAS-PW for nevi and melanoma
We have considered bins where the posterior probability (PPA) of an associated locus in that interval exceeded 50% as potentially interesting, and have tabulated (Table S3.6-1) and plotted (Fig. S 3.6-1 ) the PPA from the GWAS-PW bivariate segmental analysis. An attractive feature of the Bayesian approach is that most bins are assigned a low PPA, so that candidate regions stand out clearly. In some cases, multiple adjacent regions are given high PPAs due to linkage disequilibrium, as the LD blocks are only a partial approach to dealing with LD. On chromosome 16, for example, LD with MC1R haplotypes extends considerable distances.
The most striking finding for nevus count and melanoma is that no "pure" nevus loci were detected (Fig. S 3.6-1 and Table S3.6-1). That is, all the significant nevus count loci also acted to increase melanoma risk. By contrast (see below , Fig ure S 3.7-2 ), only TERC appeared as a significant pleiotropic telomere length and nevus count locus, despite the fact that multiple telomere maintenance genes harbour melanoma risk polymorphisms.
The results for the IRF4 region are not as impressive as one would expect, because they ignore the marked heterogeneity between studies for association with both melanoma and nevus count.
We have also plotted the specific effects of risk alleles for the two traits (Fig S3.6-2 and Fig  S3.6-3).  Figure S3.6-1. Results of GWAS-PW analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure melanoma locus, (b) a pure nevus locus, (c) a pleiotropic nevus and melanoma locus, and (d) that the locus contains co-located but distinct variants for nevi and melanoma.

S3.7.1 Bivariate GWAS-PW Bayesian GWAS analyses
As well as using more traditional approaches to assess contribution of loci in particular biological pathways, we have also performed bivariate GWAS-PW analyses using traits that act as "read-outs" for those pathways. In the case of overlap with loci for telomere length, the strength of genetic correlation seems to be stronger with melanoma than with nevus count using this approach (though see section S3.7.2 below). Further to the appearance that there are co-located telomere length and nevus count loci in the region of PLA2G6 on chromosome 22 in Supplementary Fig. 3.7-2, we plotted the contributing results for this region (Supplementary Fig. 3.7-3). The nevus and TL peaks on Supplementary Fig. 3.7-3 are quite distant from one another, and the best telomere length SNP lies in KCNJ4, not a candidate, though DMC1 nearby might be. It is plausible to regard this as a false positive.
In a similar fashion, evidence for overlap with pigmentation pathway loci was also greater for melanoma risk (Supplementary Fig. 3.7-4). Given the known role of pigmentation loci in melanomagenesis, the overlaps in that analysis (Supplementary Fig. 3.7-5) are completely expected, and offer a further positive control for the GWAS-PW method. There was no significant evidence that pigmentation and telomere length loci overlap (Supplementary Fig.  3.7-6). Figure S3.7-1. Combination of melanoma and telomere length GWAS. Results of GWAS-PW analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure melanoma locus, (b) a pure telomere length (TL) maintenance locus, (c) a pleiotropic TL and melanoma locus, and (d) that the locus contains co-located but distinct variants for TL and melanoma. Figure S3.7-2. Combination of nevus count and telomere length GWAS. Results of GWAS-PW analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure nevus locus, (b) a pure telomere length (TL) maintenance locus, (c) a pleiotropic TL and nevus locus, and (d) that the locus contains co-located but distinct variants for TL and nevi.

Figure S3.7-3.
Miami plot of association z scores for telomere length and nevus count near PLA2G6. Since this was the only strong example of putative co-located loci, we plotted the evidence for association for this region.

Figure S3.7-4. Combination of melanoma and pigmentation GWAS. Results of GWAS-PW
analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure melanoma locus, (b) a pure hair color locus, (c) a pleiotropic locus, and (d) that the locus contains co-located but distinct variants for melanoma and hair color. Figure S3.7-5. Combination of nevus and pigmentation GWAS. Results of GWAS-PW analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure nevus locus, (b) a pure hair color locus, (c) a pleiotropic locus, and (d) that the locus contains co-located but distinct variants for nevi and hair color. Figure S3.7-6. Combination of telomere length (TL) and pigmentation GWAS. Results of GWAS-PW analyses which assigns posterior probabilities (PPA) to each of ~1700 genomic regions that it is (a) a pure TL locus, (b) a pure hair color locus, (c) a pleiotropic locus, and (d) that the locus contains co-located but distinct variants for telomere length and hair color.

S4.1 Correlation between different measures of nevus count
The included studies have used different methods to measure nevus count which might raise questions about the propriety of combining them in a meta-analysis. We and others have previously shown that directly counted nevus numbers on one body site -for example, armare well correlated with total body nevus number, and that genetic analyses using either measure can detect the same loci. Questionnaire-based self-assessment on an ordinal scale is understandably less well correlated with total body nevus count and attenuates the strength of association. As a result, we have calculated disattenuated regression coefficients for those studies using coarsened measures of total nevus count, utilizing the estimated correlation between ordinal categorical measures and total nevus count of ~0.5. This value is derived from a number of different sources, both internal to the present study, and published by others, as shown in Table S4.1-1.

S4.2 Bivariate analyses using different nevus count measures
In the BTNS, while adolescent twins have their nevi counted by a nurse, parental nevus counts are measured by self-report on a single questionnaire item as "none", "a few", "moderate", "many" (see S1.1). Though any one individual is only tested by one method, we can estimate the genetic correlation between these two types of measure in a family-based analysis, and combine evidence of association to a given SNP to the different measures. We performed a bivariate maximum analysis of these BTNS data using the MENDEL package 48,49 with the results shown in Table S4.2-1. For most of the SNPs examined, the SNP effect is proportional and in the same direction, with the exception of IRF4 rs12203592, where, as we have previously shown, there is strong age by genotype interaction, with a reversal of sign in the different age groups (here parents and their children).

S4.3 Bivariate mixed model association analysis for melanoma and nevus count in Australian data
Bivariate mixed model association analysis for melanoma and nevus count in Australian data (Table S4.3-1). The P value is for the bivariate model, and so does not preclude the association to one of the traits being not significantly different from zero. This should be borne in mind when interpreting the sign changes (effect allele) between the two traits for several SNPs such as rs1805007 in MC1R, where the main analysis suggests no significant effect on nevus count. A negative effect sign indicates that an allele that increases mole count also increases melanoma risk and the consistency of negative signs across all SNPs indicates that there is always a positive genetic correlation between moliness and melanoma risk.

S4.4 Use of overlapping controls for melanoma and nevus counts
A standard assumption of meta-analysis is that the samples combined should be independent. In our case, a subset (n = 1,320) of the controls for the Australian melanoma cases are the parents of twins counted in the Brisbane Twin Nevus Study. In addition, the Leeds nevus sample are controls from their melanoma case-control study, which also contributes to the melanoma meta-analysis. An additional complication is that the parents of twins also provide self-ratings of whole-body nevus density on a 4-point scale. This has been dealt with by performing analysis of the nurse-counted total nevus count in the children and mole score in the parent as a bivariate mixed model analysis, which correctly takes account of both relatedness and the difference in the metrics.
It is known that performing a secondary analysis of a correlated trait solely with controls from a case-control study is unbiased, unless the primary trait (here melanoma) is common 51 . We carried out simulations to confirm that this was the case in our special situation where we combine evidence of association of a locus to both the primary trait and the correlated secondary trait by meta-analysis. This showed that unless one also incorporates data for the secondary trait in cases (which we have not), then no special ascertainment correction is required. See Supplementary Figs. 4.4-1-2. Figure S4.4-1. Meta-analytic association statistics for simulated case-control data confirming that results reusing control data for the secondary trait is equivalent to that from two independent samples of the same size. This supports our use of parents of BTNS twins both as subjects in the mole meta-analysis and as controls for the Australian melanoma cases in the melanoma meta-analysis. Supplementary Fig. 4.4-1.

S5.3 Data for peak PPARGC1B SNP rs251464 in meta-analysis
Within the Australian sample, we had observed suggestive levels of evidence for association of SNPs in the PPARGC1B gene with nevus count. Since it has been previously reported 52,53 that rs32579 was associated with tanning ability, we carried out joint and haplotypic analyses to determine whether rs251464 or rs32579 was more likely to be the causative SNP for nevus count (Table S5.3-1). This was more supportive of rs251464. In the meta-analysis (Table  S5.3-2), this nevus association was only strongly supported by the Australian sample, but the heterogeneity test was only marginally significant (P = 0.05).

KITLG
Although we obtain genome-wide significant association with a SNP in KITLG and naevus count (best SNP rs7313352), there is no equivalent effect on melanoma risk. KITLG has well characterised effects on melanocyte growth. The "Icelandic" fair hair SNP rs12821256 alters the binding site for the lymphoid enhancer-binding factor (LEF) transcription factor, so reducing LEF responsiveness and enhancer activity by 20% in keratinocytes 54 . Mice carrying ancestral or derived variants of this human KITLG enhancer exhibit significant differences in hair pigmentation. This SNP was associated with fair hair in Icelandic and Dutch populations, and is 350 kbp from KITLG.
By contrast, the functional SNP from Zeron-Medina and coworkers 33 is rs4590952, which lies within a p53-binding site in intron 1 of KITLG, and is in strong LD with known testicular cancer SNPs rs995030 and rs1508595 55,56 . It is a nevus gene, which is not the case for rs12821256 (see Supplementary Tables 5.4-1-3). Invoking a role for a variant in a p53 element in nevogenesis is very plausible, as p53 is important in the acute sunburn response, especially within keratinocytes. Keratinocytes then stimulate melanocytes by release of KITLG, FGF2, EDNs and alpha-MSH. Also, p53 transcriptionally upregulates the proopiomelanocortin (POMC) gene after UV exposure, the precursor to alpha-MSH 57 .
Somatic mutations of KIT are present in melanoma 58 , but are not common (2%) 59 , occurring on sun-exposed sites.

CYP1B1
Common coding variants in CYP1B1 have been implicated in a wide variety of cancers, notably breast, ovarian, prostate and lung. The gene product, dioxin inducible oxidoreductase, is a member of the cytochrome P450 superfamily and is involved in the metabolism of a wide range of hormones and environmental toxins. Several of our associated SNPs are also likely cis-acting eQTLs, notably rs162328, rs2855658 (with P levels 2-6x10 -6 in the GEUVADIS dataset 60 . There is one SNP, rs162558, that sits in the MITF binding site up from CYP1B1 (-1548 bp), and has been tested for association (with colorectal cancer). However, the nevus association P value is 0.03. We have recently reported this gene to be associated to skin colour in Europeans. 91

TMEM38B
TMEM38B encodes an intracellular monovalent cation channel that functions in maintenance of intracellular calcium release. A deletion mutation in TMEM38B is associated with autosomal recessive osteogenesis imperfecta. The peak SNP (rs10739221, EUR MAF 0.25) has also been associated with age at menarche 52 and lies intergenic, or within a putative 303kbp ncRNA BC039487 over 1Mbp distant from TMEM38B.

SYNE2
SYNE2 (spectrin repeat containing nuclear envelope protein 2) on 14q23.2 is a large gene that encodes a nuclear outer membrane scaffolding protein Nesprin-2 that binds cytoplasmic Factin. This binding (within the LINC -linker of nucleoskeleton and cytoskeleton -complex) tethers the nucleus to the cytoskeleton and aids in the maintenance of the structural integrity of the nucleus. As such it is essential to cell motility and mechanotransduction. Gene mutations are a cause of Emery Dreifuss muscular dystrophy. Multiple alternate transcripts exist (with protein products sizes varying between 80 and 800 kD), and some are implicated in the suppression of nuclear ERK1/2 signaling events, "fine-tun[ing] ERK-initiated proliferation in multiple cell types". 61 Intact LINC function is required for efficient DNA double strand break repair 90 , as well as transcriptional regulation via mechanotransduction. 92 The Bayesian association analysis does not give much support to any one SNP from this region.

FMN1
In the genomic region centered on FMN1 (formin 1), the best SNP is rs117648907 (MAF 0.02), which lies in intron 7 (20 exon version) of FMN1, but also in mRNA LF205828 (a polycomb-associated non-coding RNA). One should note this SNP is only present on a subset of arrays, and is poorly imputed (in QIMR imputation pipeline R 2 =0.25), so the association is supported by four (larger) studies in the meta-analysis. The formins act to regulate the cell cytoskeleton via control of actin and microtubule dynamics. Unlike some other members of the family, FMN1 tends to be localized to the nucleus, and may contribute to the regulation p53 transcription, along with FBXO3. 89 Other formins, notably FHOD1, interact directly with giant forms of Nesprin-2 (SYNE2, see above).

GPRC5A
GPRC5A (or RAI3) is an orphan G-protein-coupled receptor, originally identified as a retinoic acid-induced protein. It is a tumor suppressor known to be important in lung 62 , and breast cancer 63,64 , and modulates intracellular cAMP, Gsalpha gene expression, proliferation and apoptosis. Mutation of p53 causes derepression of GPRC5A, and proliferation, but paradoxically GPRC5A knockout mice experience high rates of adenoma and adenocarcinoma, as well as lung inflammation 65 . Expression of the nearby GPRC5D gene is reportedly localized to hard keratin tissues 66 , but more recently, expression in multiple myeloma has been suggested to be a highly specific tumor marker 67,68 . The SNP rs7308443 gave the best signal close to GPRC5D (nevus meta-analysis P = 0.009, melanoma metaanalysis P = 7.8e-5). Returning to GPRC5A, several SNPs are in near perfect linkage disequilibrium with the peak associated SNP, including four with putative effects on enhancer or transcription factor binding sites: rs2111398, rs4763899, rs1642199, rs1642203. The SNP rs1056927 is a known eQTL for HEBP1 (heme binding protein 1, P = 2x10 -11 ) in blood69, but this gene has not been implicated in either melanocyte function or cancer risk.

PPARGC1B
PPARGC1B is the gene encoding PPAR-γ coactivator 1-β (PGC-1β, and also known as estrogen receptor-related receptor ligand 1). The PGC-1 (PPARG coactivator) family of proteins control production and metabolism of mitochondria. A and B are usually expressed in similar tissues. Luo et al. 70 have recently shown that levels of PGC-1-alpha regulate melanoma invasiveness. The PPARGC1B gene has been previously implicated in risk of type 2 diabetes, obesity (the coding SNP rs7732671), bone resorption, breast cancer, and skeletal muscle metabolism. More relevantly, Nan et al. 52 noted suggestive levels of association of rs32579 in PPARGC1B to skin tanning (P = 3.6 x 10 -6 ) in a two-stage GWAS (stage 1 N = 2,287, P = 1.4x10 -6 ; stage 2 N = 870, P = 0.06). We recently replicated this association. 91 Using both mouse and human data, Shoag et al. 53 have shown that both PGC-1β and its orthologue PGC-1α are involved in melanogenesis. They found that α-MSH induces PGC-1α expression and stabilizes both PGC-1α and PGC-1β proteins via activation of the melanocortin-1 receptor and that inhibition of PGC-1α and PGC-1β blocked the α-MSHmediated induction of MITF and melanogenic genes. In a sample of 4,341 individuals, they obtained further evidence of an association with tanning ability with rs32579 genotype (P = 0.01), which when pooled with data from Nan et al. 52 gave a P = 9.9x10 -7 (pooled N = 5,140). This tanning allele was also found to be weakly protective in an analysis comparing 2,380 melanoma cases versus 6,319 controls (trend test P = 0.15). Finally, they identified eQTLs in PPARGC1B 71 have also reported that BRAF*V600E melanomas exhibit reduced expression of both MITF and PPARGC1A, a feature reversible by BRAF treatment.
In our Australian data, we do not see evidence of association between PPARGC1B rs32579 and skin color (N = 3,467, P = 0.91), or with tanning ability, as measured by the difference in skin reflectance (at 550 nm) between outer arm and inner arm skin (N = 1969, P = 0.20). The melanoma meta-analysis found only weak evidence for association of this SNP: P = 0.0017; not quite as impressive as that obtained for our peak nevus SNP, P = 0.00046. For nevus count these were: rs251464, P = 2.3 x 10 -7 ; rs32579, P = 5.5 x 10 -5 . We found only slightly stronger support for association of our peak nevus SNP rs251464 with skin color and with freckling (P = 0.01), but not with tanning per se. Specifically, the rs251464*G allele is more common in darker skinned individuals in our sample, and correspondingly weakly negatively correlated with Northern European ancestry (P = 0.01), and more strongly with membership of the small proportion of our sample with non-European ancestry. The Nan et al. 52 SNP rs32579 is in strong LD with rs251464 in the Australian adolescent twin sample (r 2 = 0.80), but is less strongly correlated with nevus count (Table S5.3-2) -haplotypic analysis suggests that any effects are derived from rs251464 (or a trait allele in stronger LD). The effect of the each additional rs251464*G allele is to decrease total nevus count by 1 (approximately 1%). Importantly, this effect was also detectable using a quantitative trait TDT (P = 0.0014). This implies that the association is not due to ethnic stratification of the sample. 9q31.1-2 In the 9q32.1-2 region, several SNPs have been associated to other traits in GWAS. For example, rs10739221 appears in Perry et al.72 as a locus associated with age at menarche (P = 4e-41), with a stronger signal for rs10453225, which is only very weakly associated with nevi and melanoma (P ~ 1e-3). However, rs10453225 does lie within the proximal (9q31.1) melanoma peak, while rs10739221 is in the distal 9q31.2 peak.

NFIC
There is a very narrow peak of enhanced chromatin access in melanocytes over rs34466956 on 19p13.3 5kbp upstream from NFIC (nuclear factor IC). NFIC is a CCAAT-binding transcription factor. It acts to regulate initiation of transcription and also mediates DNA replication. It controls development in multiple tissues (eg teeth, fibroblasts, liver, prostate), and is expressed in melanocytes. A NFIC-BRAF fusion has been reported in 1 of a series of 46 mucosal melanoma cases. Another member of the NFI family, NFIB coregulates epithelial-melanocyte stem cell behavior in hair follicles 73 . In small cell lung cancer, NFIB overexpression is common in metastatic poorly differentiated tumours, acting via upregulation of EZH274. BRN2 is another transcription factor increased in metastatic melanomas. Overexpressing BRN2 in melanoma cell lines causes NFIB to increase, but significantly decreases NFIC levels (as well as NFIA and NFIX) 75. The ChEA Transcription Factors database records that EZH2 interacts with NFIA, NFIC and NFIX, but not NFIB. It has been noted that EZH2 and HDAC4 levels are reciprocally increased and decreased across various tumour types.

HDAC4
The peak SNP (rs3791540) on 2q37.3 for nevus count is intronic to HDAC4 (histone deacetylase 4). HDAC4 is a member of the 2a class of histone deacetylases 76 , and acts as part of a multiprotein complex with RbAp48 and HDAC3 to repress transcription. It interacts with transcription factors MEF2C and MEF2D, and function varies by phosphorylation status. It is widely expressed, including in the skin, but most studies have concentrated on its roles in coordinating brain and bone development. Germline deletions or mutations in humans lead to brachydactyly mental retardation.
Stark and Hayward 77 have reported somatic deletions of HDAC4 is deleted in 3 of 76 melanoma cell lines. Generally, cancer types with high EZH2 activation consistently also have low HDAC4 activation 78 and vice versa: melanoma is characterized by high EZH2 activation and low HDAC4 activation.
One of the most associated SNPs in the present analysis, rs55875066, lies in an MITF binding region of intron.

DOCK8
Human DOCK8 mutation or deletion leads to a rare combined immunodeficiency and autosomal recessive hyper-IgE syndrome that includes a high risk of epithelial (predominantly squamous cell) and hematological cancers 79 . The gene is quite extensive -the association peak covering ~70 kbp covers just intron 1, but the maximally associated SNPs lie across exon 1 and the first 10 kbp of intron 1, coinciding with an open chromatin region containing numerous regulatory features. The peak SNP rs600951 sits in binding sites for BRG1 and HNF4A in intron 1, and in the GTEx database, is an eQTL for DOCK8 (in pancreas), as well as CBWD1, the neighbouring gene, in multiple tissues. Theoretically, the gene C9orf66 is read off the other strand of DOCK8's exon 1. DOCK8 regulates Cdc42 activation in many immune effector cells -variants in CDC42 have been previously associated with melanoma tumour thickness 86 , though our best association P-value in the region of that latter gene is 3x10 -4 . DOCK8 is expressed in melanocytes. One might propose an immune cell based mechanism for an association with nevi, or possibly a role in melanocyte migration (given Cdc42 involvement in melanoma cell invadosomes 87 ).

LMX1B
SNPs in this gene in earlier rounds of analysis gave quite suggestive results, but with the addition of further samples, the best nevus P value (rs7854658) is now only 3 x 10 -6 . The region around the LIM homeobox transcription factor 1B gene on chromosome 9q34.1 is 40-50 Mbp distal to the 9q region weakly implicated with nevus count 4 and melanoma5 3 in linkage studies. Mutations in LMX1B are the cause of nail-patella syndrome (hereditary osteo-onychodysplasia)51 as well as focal segmental glomerulosclerosis in the absence of extrarenal pathology 80 . The only pigmentary feature associated with nail-patella syndrome is Lester sign, a zone of darkened colour around the central iris, but this may reflect iris thickness rather than changes in iris melanocytes. Although expressed in many cell types and tissues, especially during development, LMX1B transcripts levels are low in melanocytes 81 , and only slightly higher in keratinocytes. The neighbouring genes include ZBTB43, MVB12B, and NRON. ZBTB43 is a widely expressed transcription factor, including in melanocytes and keratinocytes, though not overexpressed in melanomas.

FTO
One possibility for a mechanism linking nevus count and FTO genotype is that this simply represents an increase in body surface area offering more sites for nevi to develop. Body surface area was usually included as a covariate in our nevus count association analyses to allow for this possibility. The BMI-associated SNPs in FTO are thought to act via regulatory effects on IRX3 and IRX5. We can show that rs1421085, which doubles IRX3 and IRX5 expression and increases BMI82, is not associated with either nevus count or melanoma (nevus meta-analysis P = 0.99, melanoma P = 0.22). This was also the case for the other IRX5 eQTLs from the MuTHER Skin study. The four peak nevus/CMM-associated SNPs (rs12596638, rs12599672, rs62034121, rs62034139) do all lie within a 2 kbp long enhancer element, characterized by open chromatin (H3K27ac signal in melanocytes) and putative MITF ESR1, STAT1, FOXA1, CEBPA, TFAP2C and TFAP2A binding sites (see Supplementary Fig. 5.2-19). None of these SNPs have been reported to be associated with BMI.
FTO is moderately expressed in melanocytes, and is a nucleic acid demethylase. In one mouse model experiment, FTO overexpression increased PPARGC1A expression in myocytes, and silencing it the reverse effect. Mutant forms of the FTO protein and the use of rhein, an FTO inhibitor, also diminished PGC-1α (PPARGC1A) levels 83 . We reviewed melanoma and nevus associations with PPARGC1B above.
Below (Table S5. 4-4), we tabulate transcript abundances for our candidate genes. Tyrosinase and SLC45A2 head this list. Several regulatory factors or receptors essential to melanocyte function are present at small absolute levels, for example, MC1R. Melanoma samples: Full acknowledgments for samples used in the melanoma meta-analysis are given in Law et al (2015). 13 Telomere length GWAS results: Veryan Codd kindly provided the association results from the telomere length GWAS of Codd and coworkers 2013. 39