Introduction

Melanoma etiology is complex and involves environmental, phenotypic, and genetic factors. Approximately 10% of melanoma cases occur in a familial context. To date, CDKN2A (NG_007485.1, NM_000077.4 (p16INK4A) and NM_058195.3 (p14ARF), LRG_11) is the main high-risk susceptibility gene, and germline pathogenic variants are detected in around 20% of melanoma-prone families worldwide [1]. The prevalence of CDKN2A pathogenic variants varies across populations (5–72%) [2]. In the Mediterranean population, because of the low incidence of the disease, melanoma-prone families are defined as those with at least two melanoma patients among first- or second-degree relatives [3, 4]. Overall, 14% of Spanish melanoma-prone families carry CDKN2A pathogenic variants, with prevalence increasing with the number of cases in the family: 11% in families with 2 cases, 23% in families with 3 cases, and 36–43% in families with at least 4 melanoma cases [5, 6].

Beyond CDKN2A, other high-risk melanoma genes have been identified, but they account for <3% of the families studied worldwide [1]. Thus, the genetic factors underlying melanoma susceptibility remain unknown in a substantial number of high-risk melanoma families [1].

Previous genome-wide linkage analyses, either using microsatellite marker sets or high-density single-nucleotide polymorphism (SNP) arrays, have been conducted in CDKN2A wild-type melanoma-prone families, mostly from pedigrees of Northern European ancestry [7,8,9]. Altogether, these studies suggest 1p22, 9q21, and 17p12-p11 as melanoma susceptibility loci. Notably, the regions detected in these studies were restricted to each geographic population without overlap between studies, and results typically achieved suggestive evidence for linkage. To date, only one study has been conducted in Mediterranean melanoma pedigrees from Italy [10], which failed to detect results with suggestive or significant linkage evidence.

With the goal of identifying new familial melanoma susceptibility loci, we report the first genome-wide linkage analyses conducted in Spanish melanoma-prone families. This is the first study carried out in Mediterranean melanoma pedigrees that has been able to detect genomic regions reaching significant genome-wide linkage evidence.

Subjects and methods

Samples and pedigrees

The study included 29 melanoma cases and 39 non-affected individuals belonging to 11 Spanish melanoma-prone families (10 CDKN2A-negative families and one family with both CDKN2A-positive cases and two CDKN2A-negative cases), with genome-wide genotyping data available from at least two melanoma cases per family (Figure S1). The family set was enriched for families with a high number of cases for our geographic location: six families with ≥4 melanoma cases, three families with 3 melanoma cases, and two families with 2 melanoma cases. All patients belonged to melanoma-prone families under dermatological follow-up at the Melanoma Unit of Hospital Clinic of Barcelona. To de-identify families and individuals, the families included in the study were numbered consecutively from 1 to 11 and sex was deliberately masked.

The study was approved by the ethics committee of Hospital Clinic of Barcelona. All patients provided written informed consent.

Linkage analysis

Subjects were genotyped on the HumanOmni2.5 (Illumina) array, either version v1.0 (81% of subjects) or v1.1 (19% of subjects). The GEO accession number for the genotyping data reported in this paper is GSE109208. Only SNPs common to both versions were included in the study (2,426,511 SNPs). We also excluded SNPs with missing genotypes in >95% of samples (2,332,767 SNPs remaining). Since linkage disequilibrium between markers can artificially inflate evidence for linkage [8], we reduced the markers to a set without linkage disequilibrium by iteratively removing markers with heterozygosity <0.3 or with r² >0.16 with a previously selected marker, and by requiring a minimum distance of 0.1 cM between markers, which resulted in 24,225 SNPs for analysis [8].
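The marker-thinning step described above can be sketched as a greedy pass over the markers of one chromosome, ordered by genetic position. This is only an illustration and not the pipeline used in the study: the `Marker` record, the `r2` callback, and the choice to compare each candidate against every previously kept marker are assumptions. Only the three thresholds (heterozygosity 0.3, r² 0.16, 0.1 cM) come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Marker:
    """Hypothetical marker record; field names are illustrative."""
    name: str
    cm: float              # genetic position in centimorgans
    heterozygosity: float  # heterozygosity estimated in the sample


def prune_markers(
    markers: List[Marker],
    r2: Callable[[Marker, Marker], float],
    min_het: float = 0.3,
    max_r2: float = 0.16,
    min_gap_cm: float = 0.1,
) -> List[Marker]:
    """Greedily thin the markers of a single chromosome.

    A marker is kept only if it is informative enough, far enough from the
    last kept marker, and not in LD with any marker kept so far.
    """
    kept: List[Marker] = []
    for m in sorted(markers, key=lambda x: x.cm):
        if m.heterozygosity < min_het:
            continue  # too uninformative
        if kept and m.cm - kept[-1].cm < min_gap_cm:
            continue  # too close to the previously selected marker
        if any(r2(m, k) > max_r2 for k in kept):
            continue  # in LD with a previously selected marker
        kept.append(m)
    return kept
```

In practice the `r2` callback would be backed by pairwise LD estimates computed from the genotype data, and the comparison would usually be limited to a window of nearby markers rather than the whole chromosome.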

Mcsim software was used to perform parametric linkage analysis. Mcsim uses Markov chain Monte Carlo techniques to reconstruct haplotypes and extract inheritance information in pedigrees [11]. In addition to standard multipoint logarithm of the odds (LOD) scores, the program calculates robust multipoint LOD scores (referred to as TLODs). The TLOD score is preferable to the standard multipoint LOD score because it incorporates the recombination frequency (theta) in the statistic’s parameterization, preserving the robustness of the two-point LOD statistic to model misspecification while taking advantage of multipoint information [12]. The TLOD statistic follows the same theoretical distribution as other LOD score statistics (e.g., two-point, multipoint, and heterogeneity-LOD (het-LOD) scores) and can be interpreted with the same conventions. Lander and Kruglyak proposed using LOD >0.588 for nominal evidence, >1.9 for suggestive evidence, and >3.3 for significant evidence [13]. Evidence from multiple pedigrees was assessed with the heterogeneity-TLOD statistic (het-TLOD) [14]. Allele frequencies were estimated internally, and general dominant and recessive models were used. We analyzed all pedigrees using an affected-only model that assumed a disease gene frequency of 0.005 for the dominant model and 0.05 for the recessive model. The penetrance estimates for carriers and non-carriers were 0.5 and 0.0005, respectively. Genomic positions refer to genome version GRCh37/hg19. The Reference Sequence (RefSeq) database at NCBI and the GeneCards Human Gene Database (http://www.genecards.org/) were used to obtain information on the genomic features in the regions of interest [15, 16].
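To make the interpretation conventions concrete, the sketch below computes a textbook two-point LOD score for phase-known, fully informative meioses and maps a score onto the Lander and Kruglyak categories quoted above. It is not the TLOD statistic and does not reproduce any computation performed by the linkage software; the function names are illustrative assumptions.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Textbook two-point LOD for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # a recombinant is impossible at theta = 0
    non_recombinants = meioses - recombinants
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood += non_recombinants * math.log10(1.0 - theta)
    return log_likelihood - meioses * math.log10(0.5)


def classify_lod(score: float) -> str:
    """Map a LOD-type score onto the thresholds cited in the text [13]."""
    if score > 3.3:
        return "significant"
    if score > 1.9:
        return "suggestive"
    if score > 0.588:
        return "nominal"
    return "none"
```

For example, ten informative meioses with no recombinants give a LOD of 10 × log10(2) ≈ 3.01 at theta = 0, which falls in the suggestive range, while an eleventh non-recombinant meiosis would push the score just past 3.3.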

Haplotype phasing

Regions of interest linked to multiple families were assessed for the presence of common haplotypes shared between linked families. The software SHAPEIT2 was used for haplotype phasing [17]. All genotyped SNPs in regions of interest were used for phasing.
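One simple way to screen phased haplotypes from different families for a shared segment is to look for the longest run of identical alleles at aligned marker positions. The sketch below illustrates that idea only; it is not SHAPEIT2 functionality and is not claimed to be the comparison procedure used in the study. The function name and the representation of a haplotype as a sequence of allele strings are assumptions.

```python
from typing import Sequence, Tuple


def longest_shared_run(
    hap_a: Sequence[str], hap_b: Sequence[str]
) -> Tuple[int, int]:
    """Return (start_index, length) of the longest stretch of identical
    alleles between two haplotypes typed at the same ordered markers."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if run_len == 0:
                run_start = i  # a new matching stretch begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # mismatch ends the current stretch
    return best_start, best_len
```

A long shared run between haplotypes segregating with disease in two families would be compatible with a common ancestral segment, although with common SNPs short runs are expected by chance and would need to be judged against population haplotype frequencies.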

Results

Genome-wide linkage analysis

The het-TLOD genome-wide analysis, using evidence summed across the pedigrees, revealed a region with significant linkage (het-TLOD >3.3) on chromosome 11 (Table 1 and Fig. 1). This region had a maximum het-TLOD of 3.449 (rs12285365:A>G) and spanned the 11q14.1-q14.3 locus (using a support interval of one het-TLOD unit). The region contains 52 genomic features, of which 38 are protein-coding genes. The strongest linkage evidence at this locus (all markers with het-TLOD >3.3) was detected in two regions: between rs1940085:G>A and rs7108021:T>G (chr11: 84.3–84.6 Mb) and between rs12285365:A>G and rs607530:T>C (chr11: 86.6–87.6 Mb). These regions contain four protein-coding genes: DLG2 (NG_021375.1, NM_001142699.1), PRSS23 (NM_007173.5), FZD4 (NG_011752.1, NM_012193.3), and TMEM135 (NM_022918.3). We phased haplotypes in the linked pedigrees using all available markers to determine whether the same haplotypes appeared in multiple linked families, but did not identify any such haplotype. Other regions showed suggestive linkage evidence (het-TLOD >1.9) on chromosomes 1q, 6p, 7q, and 11q under a dominant model, and on chromosomes 3q, 12p, and 13q under a recessive model (Table 1). SNP descriptions based on a genomic reference sequence are shown in Table S1.
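The support interval mentioned above is conventionally obtained by extending outward from the peak marker for as long as the score stays within one unit of the maximum. A minimal sketch of that convention follows; the function name, the input layout (parallel lists of positions and scores along one chromosome), and the choice to stop at the first marker that falls below the cutoff are assumptions rather than details taken from the study.

```python
from typing import Sequence, Tuple


def support_interval(
    positions: Sequence[float],
    scores: Sequence[float],
    drop: float = 1.0,
) -> Tuple[float, float]:
    """Return (left, right) positions of the contiguous region around the
    peak in which every score is at least `peak score - drop`."""
    if len(positions) != len(scores) or len(scores) == 0:
        raise ValueError("positions and scores must be non-empty and aligned")
    peak = max(range(len(scores)), key=lambda i: scores[i])
    cutoff = scores[peak] - drop
    left = peak
    while left > 0 and scores[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(scores) - 1 and scores[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]
```

With made-up scores `[0.5, 2.2, 3.0, 3.4, 2.5, 1.0]` at positions `[10, 20, 30, 40, 50, 60]`, the cutoff is 2.4 and the interval runs from position 30 to position 50.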

Table 1 Genome-wide suggestive het-TLODs (>1.9) and significant het-TLODs (>3.3)

Fig. 1 Genome-wide het-TLOD scores. Het-TLOD scores under the dominant (continuous line) and recessive (dashed line) models are plotted. The significant linkage evidence threshold (het-TLOD >3.3) is denoted by the horizontal dashed line.

Family-specific genome-wide linkage analysis

Data from previous studies suggest that certain high-risk melanoma factors, such as germline variants in TERT (NG_009265.1, NM_198253.2, LRG_343), may be restricted to a limited number of pedigrees [18, 19]. Thus, we conducted a separate genome-wide analysis for each family. We detected three regions with suggestive evidence for linkage (TLOD >1.9) in family #1 under a dominant model (Fig. 2). This is a family with six CDKN2A-positive melanoma cases, two CDKN2A-negative melanoma cases, and other cancers in blood relatives (liver, lung, cervix, endometrial, and breast cancer cases). The CDKN2A-negative cases developed melanoma at a young age (32 and 33 years old), similar to the CDKN2A-positive cases (27, 34, 37, and 37 years old). Since the CDKN2A-negative cases did not carry medium-risk melanoma variants such as MC1R (NG_012026.1, NM_002386.3) red-hair color variants or the MITF (NG_011631.1, NM_000248.3, LRG_776) variant c.952G>A (p.(Glu318Lys)), we hypothesized that the melanoma risk observed in CDKN2A-negative cases may result from other melanoma susceptibility variants. We genotyped the CDKN2A-negative melanoma cases along with two CDKN2A-positive melanoma cases. The analyses identified three regions that segregate with all melanoma cases independently of CDKN2A status. The first region was in 1q31.1-q32.1 (chr1: 187.5–205.3 Mb) with a maximum TLOD of 2.447 at markers rs2246083:G>A and rs11590469:C>T (Figure S2). This region spans 17.8 Mb and contains 133 genomic features, of which 103 are protein-coding genes. The second region was in 6p24.3-p22.3 (chr6: 8.2–19.5 Mb) with a maximum TLOD of 2.409 at marker rs4712415:T>C (Figure S3). This region spans 11.3 Mb and contains 63 genomic features, of which 44 are protein-coding genes. The third region was in 11q13.3-q21 (chr11: 68.7–95.5 Mb) with a maximum TLOD of 2.654 spanning >100 markers (Figure S4). This region spans 26.8 Mb and contains 239 genomic features, of which 171 are protein-coding genes.

Fig. 2 Genome-wide TLOD scores for family #1. TLOD scores under the dominant (continuous line) and recessive (dashed line) models are plotted. The suggestive linkage evidence threshold (TLOD >1.9) is denoted by the horizontal dashed line.

Discussion

In Spain, the genetic background of melanoma susceptibility remains unknown in >80% of melanoma-prone families [5, 6]. Linkage analysis is likely to detect regions containing high-risk variants or genetic features segregating with the disease. Here, we report the results of a genome-wide linkage screen performed on 11 melanoma-prone families in which we detected significant linkage to the 11q14.1-q14.3 locus for melanoma susceptibility. Although the number of families included in the study is lower than in previous studies, the set was enriched for highly informative families, since 54.5% of families (6 of 11) had ≥4 melanoma cases.

A previous genome-wide association study (GWAS) performed in melanoma patients reported a melanoma locus at the 11q14.3 region. That study detected the strongest evidence of association near rs1393350:G>A, in a region encompassing the TYR (NG_008748.1, NM_000372.4) gene, which plays a key role in human pigmentation and is a low-risk melanoma gene [20]. In the present study, the two subregions with the strongest linkage evidence within 11q14.1-q14.3 do not include the TYR gene, suggesting that this genomic region is associated with melanoma susceptibility through genetic factors other than pigment-related alleles in the TYR gene. The DLG2, PRSS23, FZD4, and TMEM135 genes are located in the regions with the strongest linkage evidence. The biological information about this set of genes is limited, but they are all plausible candidates for cancer susceptibility [21, 22]. However, further sequencing data and molecular studies are necessary to elucidate the possible role of these genes in melanoma susceptibility.

Moreover, we detected seven additional loci (1q31.1-q32.1, 3q29, 6p24.3-p23, 7q21.11-q21.2, 11q22.1, 12p13.1, 13q12.3-q14.11) with suggestive linkage evidence in the studied families. Notably, the 3q28-q29 locus has previously been detected with suggestive evidence of melanoma linkage in CDKN2A wild-type Swedish families [9]. A subsequent analysis of those families reported a narrower region spanning 3.5 Mb (chr3: 192.1–195.6 Mb) [23], overlapping the linked region detected in Spanish families. The rest of the suggestive regions detected have not been previously reported. The finding of a common region in Spanish and Swedish melanoma-prone families suggests that this region may contain genetic factors associated with melanoma susceptibility. The region of overlap between the two populations contains 20 genomic features, of which 10 are protein-coding genes (Table S2), including plausible candidates involved in proliferation and apoptosis, lipid transport, serine/threonine phosphatase PP1 inhibition, or Notch activation [24,25,26,27].

Melanoma is one of the tumors with the highest heritability [28]. In families with melanoma aggregation, melanoma susceptibility follows an autosomal dominant inheritance pattern with incomplete penetrance. Multiple genes can play a role in melanoma susceptibility in a family, through a combination of one or more high-risk genes and medium- or low-risk variants that modulate the expressivity of the high-risk genes. High-risk variants or genetic features segregate with the disease in most affected cases in the family and may be detected by linkage analysis. We expect to identify one or very few high-risk variants in an individual. However, in >70% of families worldwide, these have still not been identified. Thus, studies such as the present one are needed to point to new genomic regions in which to search for high-risk variants that may explain part of the missing heritability of melanoma susceptibility. The combination of medium- and low-risk variants modulates the penetrance and expressivity of high-risk genes, but it may vary within the family and may be inherited from different ancestors. Multiple medium- and low-risk variants have been described to date [1]. However, their specific role in modulating the expressivity of pathogenic variants in high-risk genes has only been well established for the highly polymorphic pigmentation gene MC1R. CDKN2A variant carriers with melanoma-associated variants in MC1R have a higher risk of developing melanoma than CDKN2A variant carriers with wild-type MC1R [29]. Although co-existence of a CDKN2A pathogenic variant with the rare MITF c.952G>A (p.(Glu318Lys)) variant has also been described [30], the role of MITF in modulating melanoma penetrance in CDKN2A variant carriers is still unknown.

In our study, we included a CDKN2A-positive family in which two melanoma cases carried neither known high-risk nor medium-risk melanoma susceptibility variants. We detected three loci with suggestive linkage evidence, suggesting that, in addition to the CDKN2A pathogenic variant, other genetic factors may underlie the increased melanoma risk observed in the members of this family. Knowing the gene, or combination of genes, involved in melanoma susceptibility is crucial for the identification and better management of at-risk individuals. Furthermore, it allows the refinement of genetic counseling in melanoma, as specific measures can be included when genetic testing detects germline variants in known susceptibility genes [5, 30,31,32].

In conclusion, using linkage evidence from multiple pedigrees, we have identified a familial melanoma susceptibility locus at 11q14.1-q14.3 in Spanish melanoma-prone families. We have also detected suggestive evidence of linkage at 3q29, a region previously described in Swedish families. Future next-generation sequencing studies or targeted sequencing of candidate genes in these regions may allow the identification of new genetic factors implicated in melanoma susceptibility.