Introduction

Renal cancer incidence is increasing1 and Central/Eastern Europe has the highest rates worldwide.2 Most (∼90%) adult renal cancers are renal cell carcinomas (RCC), most of which are clear cell RCC (ccRCC).3 ccRCC can be further classified by genetic features, most notably by the presence/absence of Von Hippel–Lindau (VHL) gene inactivation, which occurs through three mechanisms: sequence alteration affecting the transcript or protein, transcriptional silencing because of promoter CpG hypermethylation and loss of heterozygosity/copy number of the 3p25-26 locus.4

The VHL protein is involved in tissue-specific responses to oxygen concentration and delivery.5, 6 VHL alteration through sequence changes and/or promoter hypermethylation, leading to transcriptional silencing, prevents formation of a protein complex required for hypoxia-inducible factor (HIF) degradation, resulting in excess HIF protein and a gene expression pattern facilitating adaptation to oxygen deprivation, delivery and angiogenesis. VHL inactivation subtypes have been associated with improved prognosis and a favorable treatment response in some but not all studies.7, 8, 9, 10, 11 In contrast, VHL wild-type cases may occur through mechanisms independent of VHL inactivation. Generally, VHL inactivation is considered an early event in ccRCC but other unidentified alterations may accompany VHL alteration at diagnosis.

ccRCC subgroups have demonstrated different frequencies and patterns of VHL inactivation associated with known risk factors, which include obesity, hypertension, smoking, certain occupational exposures and a positive family history of kidney cancer.12, 13 High intake of fruits and vegetables may reduce risk.1

In this study we used array comparative genomic hybridization to evaluate chromosomal alterations among incident ccRCC cases enrolled in a large case–control study of renal cancer. Risk factors, tumor histopathological features and VHL gene inactivation were evaluated for association with the frequency and type of copy number alterations observed.14, 15, 16

Results

Cases with tumor biopsies (N=412) were more likely to be residents from the Czech Republic, have a family cancer history and have a higher education level compared with those without biopsies (N=348) (Supplementary Table 1). Other factors did not differ.

Fraction of genome altered (FGA)

Males had a higher FGA compared with females (P=0.002) (Figure 1). Stage and grade at diagnosis were associated with increasing fraction of the genome lost (FGL), fraction of genome gained (FGG) and total FGA (P<0.0001). Ever smokers had elevated FGG (P=0.05) and FGA (P=0.05) compared with never smokers, but trends with current smoking status were not observed (P=0.62). Individuals without a first degree relative with cancer had a higher FGA compared with those who did (27.9% vs 22.8%; P=0.02). A lower FGA was observed in cases with VHL promoter hypermethylation (20.8%, P=0.03) and sequence alteration (25.9%, P=0.11), compared with wild-type cases (30.5%-referent). FGA was not associated with other factors.

Figure 1

Fraction (%) of the FGL, FGG and FGA among ccRCC case subgroups. P-values between subgroups were as follows: males vs females (P=0.002), age (<50 vs ⩾50 years, P=0.06), any family history of cancer (P=0.02), stage (P<0.00001), grade (P<0.0001), ever vs never smoking (P=0.05), VHL wild-type cases vs those with VHL promoter hypermethylation (P=0.03), VHL wild-type cases vs those with VHL sequence alterations (P=0.17). REF, referent group.

Chromosomal alterations

The exact number and percent of all chromosome p and q arm gains and losses in all ccRCC cases in this study are shown in Supplementary Table 2. Chromosome arms altered in greater than 25% of cases, and those of previous interest, were included in multivariate models (Table 1). The strongest association was observed between 3p loss and VHL gene inactivation (P<0.0001). 3p loss was less frequent among VHL wild-type compared with inactivated cases (70.2% vs 95.3%, respectively). Therefore, 29.8% of VHL wild-type and 4.7% of inactivated cases had 3p present. No other factors with the exception of VHL inactivation were associated with 3p loss. Stage was strongly associated with 9p, 9q and 13q (P<0.0001), and to a lesser extent 1p, 10q and 14q losses. Stage was associated with 12p and 20q gains (P=0.001). Grade was strongly associated with 8p and 14q (P<0.0001) and also 1p, 4q, 6q, 9p, 9q and 13q losses. Gains associated with grade were observed at 5p, 7p, 7q and 12q.

Table 1 Regression analysis of selected chromosomal arm alterations and patient/clinical factors

After adjustment, male cases had more losses of 3q, 9p, 9q, 13q, 14q and gains of 5p, 7q, 12p, 12q, 20q. Most of these alterations were associated with stage and/or grade. Patients 65 years or older had more 4q, 9p and 9q losses compared with younger patients diagnosed before age 50. Individuals who reported a family cancer history had fewer 7q gains and 8p losses, explaining the lower FGA observed. In contrast, there were no chromosome arm alterations associated with smoking status. Interestingly, 5q gain was frequently observed although it was not associated with any of the patient/clinical characteristics or risk factors evaluated. Closer examination revealed that gains occurred distally, specifically at three regions (that is, 5q34–35 (55–58%), 5q31–32 (52–54%) and 5q21–23 (34–49%) (Supplementary Figure 1)).

Loci

Differences in copy number alteration prevalence at specific loci and risk factors were compared using a moderated t-statistic or F statistic with pooled variance, controlling the false discovery rate (FDR). High frequency clonal alterations were sorted initially by FDR-adjusted P-values. Subsequently, significant loci were re-analyzed in adjusted models (Table 2). Grade was associated with loss of large regions on 4q, 9q, 13q, 14q, 18q and gains on 1q, 7p. Stage was associated with large losses on 9p, 9q, 13q, 14q, 17q, 18q and gains on 3q and 7q. Both stage and grade were associated with loss of 1p36; 4q11-12,26; 9p21-22; 9q21-21.3; 14q13-14, 21, 24, 31-32; 18p11.2-12, 18q11-12, 21.3-23, tel; and gains on 3q22-23; 17q24-25; and 20q13. Both stage and grade were associated with losses on 14q23.2, a region harboring the HIF1A gene, and 14q13.2 harboring the PHD1–EGLN3 gene. We examined HIF1A protein expression in tumor tissue microarrays using immunohistochemistry, in relation to 14q and regional (clonal) loss across the HIF1A region (14q23). Although univariate analyses suggested a decrease in expression with 14q and 14q23 loss, neither factor was associated with HIF1A expression after adjustment for stage and grade. Stage and particularly grade remained strongly associated with HIF1A expression, independent of 14q loss (data not shown).

Table 2 Regional clone and copy number alterations and ccRCC tumor stage and grade

Male cases exhibited losses on 6p22, 6p24–25, large regions of 9q and gains at 1q43-44, 12p12.3, 12q12.4, 24-tel, compared with females (Table 3). These alterations were also associated with stage and grade. Among cases reporting a positive family cancer history, fewer cases demonstrated losses at 4q21-23 (11% vs 24%, P=0.002) and gains at 20q13.2 (13% vs 26%, P=0.002). Former and current smokers exhibited more loss at 3q12 (24% vs 40%, P=0.04) and gain at 16p11.2-13 (22% vs 11%, P=0.003). Interestingly, a significant positive trend was observed with current smoking status and gain at 16p11.2-13.2 (P-trend=0.003). Lastly, copy number alteration of specific clones and VHL inactivation status were evaluated. The lowest FDR-adjusted P-values that remained significant in multivariate analyses were losses observed at clones RP11-245E5 (3p24; P<0.0001) and RP11-180G14 (3p14.3; P<0.0001). RP11-180G14 (3p14.3) is located just distal to CTD-2175D15 (3p14.2), a clone harboring the FHIT locus (Figure 2).

Table 3 Regional clone and copy number alterations and patient/tumor characteristics
Figure 2

Frequency of copy number loss of 3p clones among ccRCC cases grouped by absence (wild-type) or presence of VHL gene inactivation (through sequence alteration or promoter hypermethylation). Region A (telomere to RP11-180G14): the frequency of copy number loss of individual 3p clones differed significantly between VHL wild-type and inactivated cases that occurred through both sequence alteration (P<0.00001) and promoter hypermethylation (P=0.03–0.001). Region B (from RP11-180G14 to RP11-154H23): the frequency of clonal copy number loss differed significantly between wild-type cases and those with sequence alterations (P=0.003–<0.00001), but was no longer observed with hypermethylated cases (P=0.08–0.59).

To follow up on these observations, the prevalence of all 3p losses was compared between groups stratified by VHL inactivation characteristics. Clonal alteration prevalence among wild-type cases was compared separately with that among cases demonstrating inactivation through VHL promoter hypermethylation or sequence alteration. Loss of these 3p clones was significantly greater among cases with VHL sequence alterations compared with wild-type cases. Losses were significantly more prevalent at the 3p telomere (93% vs 59%, P<0.00001), and clone CTB110-J24 (3p25-26;VHL) (95% vs 62%, P<0.00001) (Region A-Figure 2), and at 3p14.2 (FHIT) (92% vs 75%, P=0.001) but not at regions proximal to the centromere. Loss of 3p loci among VHL hypermethylated cases was similar in prevalence to cases with sequence alterations but significantly higher compared with wild-type cases at the 3p telomere (83%, P=0.03) and the VHL locus (91%, P=0.0006) (Region A-Figure 2), but not significantly different from wild-type cases at loci centromeric to clone RP11-180G14 (3p14.3; Region B-Figure 2). Loss among hypermethylated cases was significantly less prevalent than among cases with sequence alterations at 3p14.2 (84%, P=0.04) and regions proximal to the centromere (50%, P=0.0002). These findings demonstrate differences in 3p loci loss between VHL inactivation subgroups, in particular clones surrounding the fragile FHIT locus. Lastly, VHL wild-type cases were distinguished by having more gains at clone LLNL-255K9 (20q11-12) compared with inactivated cases (33% vs 20%, P=0.03) (Table 3), a clone that was also more prevalent in cases with higher stage.

Amplifications

High-level amplifications were observed at 154 clones among 24 cases (1–28 amplifications/case). The most frequent amplifications were located at clones RP11-28l11 (11p15.2-15.4; N=6), RP11-10M23, RP11-91H14, RP11-121G20 (6p21.1-21.2; N=5) and CTB-136O14 (12q14; N=3). Two cases (one VHL wild-type with 3p present; one VHL inactivated with 3p loss) demonstrated 2q/6p co-amplifications. The 2q overlapping region (∼179.4–199.8 Mb) covers ∼100 genes; the 6p overlapping region (∼39.1–45.7 Mb) included the VEGF (vascular endothelial growth factor) gene.

Discussion

Previous studies have utilized copy number profiling to classify histologically categorized RCC cases into molecularly defined subtypes (Table 4).17, 18, 19, 20, 21, 22 Consistent with previous studies, the most common alterations were loss of 3p, 14q and gain of 5q.17, 18, 19, 20, 21, 22 Two previous comparative genomic hybridization studies examined associations between RCC copy number variation with stage and grade.18, 20 One study (51 cases) reported two clusters defined solely by the alteration subtypes observed.18 In both clusters, 3p loss and 5q gain were frequently observed, however, one cluster, defined by losses on 1p, 4, 9, 13q and 14q, had a higher stage and grade, increased vascular involvement and lower overall survival that were independent of stage and grade.18 Another study (51 cases) reported a difference in genomic copy number and expression profiles between cases with/without biallelic VHL loss.18 A study of 80 cases observed significant associations with stage and loss of 14q and 18p.20 In the current study, both stage and grade were associated with losses on 14q harboring HIF1A (14q23.2) and PHD1-EGLN genes (14q13.2). Recently, genetic and functional studies reported that HIF1A is a target of 14q loss and that HIF1A activity is diminished in 14q deleted RCC cases.21 Loss of these loci being associated with 14q loss was not corroborated in the current study, perhaps because of differences in type of study design and laboratory methodologies. In the current study, loss of specific loci spanning 14q23 was associated with stage and grade, which is consistent with the poor prognosis reported in 14q deleted cases.18

Table 4 List of published comparative genomic hybridization (CGH) studies of kidney and renal cell carcinoma (RCC)

This is the first study to evaluate 3p loci loss among large numbers of cases that were also defined by VHL status. In multivariate analyses, only VHL gene alteration was associated with 3p loss. Cases with VHL promoter hypermethylation were similar to those with sequence alterations at loci distal to the FHIT locus, and to wild-type cases at loci centromeric to FHIT. This observation suggests that in hypermethylated cases, the FHIT breakpoint could be critical in terms of somatic copy number loss of the chromosome segment distal to 3p14.2. In contrast, the FHIT fragile site was not a significant breakpoint in VHL wild-type or cases with sequence alterations. The FHIT region spans the most common fragile site of the human genome (the FRA3B fragile site) encompassing the previously observed familial kidney cancer breakpoint (t(3;8) p14.2;q24) and a viral integration site.23, 24, 25 Hemizygous, interstitial or terminal 3p deletions involving the FHIT gene and reduced protein expression have been observed in over 90% of ccRCC cases, similar to that observed in the current study (89%).

In addition to heterogeneity across 3p by VHL inactivation subgroups, we found that wild-type tumors had more alterations compared with those with VHL inactivation through promoter hypermethylation; corroborating a previous report that wild-type cases may be less genetically stable than hypermethylated cases, and possibly those with sequence alterations.20 The higher genomic instability among wild-type cases suggests a greater potential for progression, as copy number alterations have been associated with tumor stage, grade and worse prognosis.

Analyses of patient characteristics revealed that male cases had more alterations compared with females. The most prevalent copy number alterations in males occurred independently of stage and grade. This could indicate that incident male ccRCC tumors could be more aggressive compared with female cases, regardless of stage or grade at diagnosis. Male patients might also benefit from more aggressive treatment and follow-up, and their genetic alterations could provide clues for targeted therapies. Several target genes on chromosome 9 include CDKN2A and CDKN2B.19, 20 Other risk factors examined did not explain the sex differences observed. Additional characteristics associated with tumor heterogeneity included having a first degree relative with cancer: fewer copy number alterations were observed in these cases compared with those without a family cancer history. Smoking was associated with higher FGA. Ever smoking was associated with copy number loss of 3q12-14 and gain of 16p11.2-13.2, for which a significant, positive trend with current smoking status was found.

As previously reported, 5q gain was highly prevalent and was not associated with stage, grade or VHL inactivation, suggesting that these loci may provide clues to the location of additional genes that are also altered in the early stages of ccRCC development. Two recent studies that employed SNP arrays19, 20 also observed gains at the distal end of 5q. Expression analysis revealed overexpression of 12 of 22 5q genes in tumor compared with normal cortex, including: GNB2L1, MGAT1, RUFY1, RNF130, MAPK9, CANX, CNOT6, SQSTM1, LTC4S, TBC1D9B, HNRPH1, and FLT4.19 A second study reported focal amplification of 5q35.3 containing the SQSTM1 gene that is overexpressed in other cancers.20 The weak correlation between 5q gain and 3p loss across studies suggests that new targets, perhaps functioning through VHL-HIF independent mechanisms, could be identified on 5q.

This study included a large series of histologically-confirmed ccRCC cases with detailed risk factor and patient/clinical information. Our two-step analytical approach, entailing FDR adjustment and subsequent multivariate analyses of significant loci, reduced possible false-positive reporting. Weaknesses included a low sample size for interaction analyses and potential selection bias from exclusion of almost 50% of cases. We also used a relatively low resolution array compared with some studies. However, the current study utilized available high-quality information on each patient and RCC risk factors to an extent not previously evaluated in earlier studies. Results from reports that employed higher resolution techniques will complement our study findings and enable identification of smaller target regions/genes for follow-up.

In summary, we identified novel associations between patient/clinical characteristics and copy number alterations in a well-defined ccRCC case-series. Considerable heterogeneity spanning the FHIT gene locus suggests that this fragile site could be critical in terms of chromosomal loss of regions distal to 3p14.2 in cases with VHL promoter hypermethylation. Our findings also suggest that tumors in male cases may be less genetically stable compared with those in female cases, and that male patients may benefit from more aggressive follow-up. Lastly, this study provides evidence that loci on 5q may provide clues to RCC carcinogenesis that could be distinct from the VHL-HIF pathway.

Materials and methods

Study population

Incident cases (ICD-02 code C.64) were participants in a case–control study of RCC conducted in seven centers in four Central and Eastern European countries (Moscow, Russia; Bucharest, Romania; Lodz, Poland; and Prague, Olomouc, Ceske-Budejovice, and Brno, Czech Republic).14, 15 Diagnostic slides were re-reviewed by an expert renal cancer pathologist for standardized classification (MM). Of 1097 cases, 763 were ccRCC, and frozen biopsies were available from 412 (54%) cases. Protocols were approved by ethics committees and institutional review boards of participating centers, the International Agency for Research on Cancer and the US National Cancer Institute. All patients and physicians provided written informed consent. Patients were asked about lifestyle habits, height/weight one year before diagnosis, personal/family medical history and diet.13, 14, 15, 16

Array comparative genomic hybridization

Areas of frozen biopsies containing ⩾70% tumor cells were macrodissected and non-tumor tissue was removed.13 A standard protocol (http://cc.ucsf.edu/people/waldman/Protocols) was used to extract DNA. Arrays included 2464 BACs at approximately Mb intervals along the genome, printed in triplicate.22, 26 DNA samples (500 ng) were labeled as described previously.27

VHL

Inactivating alterations in exons 1–3 of the VHL gene and promoter hypermethylation were assessed as described previously.13 Wild-type tumors included those without sequence alterations affecting the protein coding region, splice junctions or promoter hypermethylation.

Data processing

Data were processed using SPROC to filter data points with low DAPI intensity, reference to DAPI signal intensity and correlation between Cy3 and Cy5 within spots. Ratios that were derived from a single clone or those with a triplicate log2 SD >0.2 were considered missing. Samples were hybridized on two versions of human array (3.1, 3.2). Clone data were combined sharing data from both arrays. Post-processing, 2111 clones remained. Data were mapped to the human DNA sequence May 2004 freeze.
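The replicate filter described above can be illustrated with a minimal Python sketch. It is not the SPROC implementation. The function name summarize_clone, the use of the sample standard deviation and the use of the mean as the summary value are assumptions made for illustration; only the 0.2 threshold and the exclusion of single-measurement ratios come from the text.

```python
from statistics import mean, stdev
from typing import Optional, Sequence

MAX_LOG2_SD = 0.2  # threshold stated in the text


def summarize_clone(
    log2_replicates: Sequence[Optional[float]],
    max_sd: float = MAX_LOG2_SD,
) -> Optional[float]:
    """Return one log2 ratio for a clone, or None if it is treated as missing.

    Hypothetical helper: a clone is set to missing when fewer than two
    replicate spots survive upstream filtering, or when the SD of the
    replicate log2 ratios exceeds max_sd.
    """
    usable = [v for v in log2_replicates if v is not None]
    if len(usable) < 2:
        return None  # ratio derived from a single measurement
    if stdev(usable) > max_sd:
        return None  # replicates disagree too much
    return mean(usable)  # assumption: summarize replicates by their mean
```

A clone with replicate log2 ratios of 0.10, 0.12 and 0.08 would be retained under these assumptions, whereas one with 0.0, 0.5 and 1.0 would be set to missing.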

Data were segmented using circular binary segmentation,28 implemented in the DNAcopy package from Bioconductor29 to translate intensity measurements into equal copy number regions. The median absolute deviation, scaled by the factor 1.4826, of the difference between the observed (log2 ratio) and segment values of autosomal clones was used to estimate sample-specific experimental variation. Clone status outcome assignment (gain/loss/normal) was conducted, applying the merge levels procedure to segment values.30 To identify single technical or biological outliers (that is, high-level amplifications), their presence was allowed within each segment. Outlier clones included those with an observed value at least four sample-specific median absolute deviations from the segment value. High-level amplifications included narrow and single outliers with high copy numbers, as described previously.31
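The sample-specific noise estimate and the outlier rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, and the segment values are taken as given (in the study they come from circular binary segmentation followed by merge levels).

```python
from statistics import median
from typing import List, Sequence

MAD_SCALE = 1.4826   # scaling factor stated in the text
OUTLIER_MADS = 4.0   # outlier threshold stated in the text


def scaled_mad(observed: Sequence[float], segment: Sequence[float]) -> float:
    """Scaled MAD of (observed log2 ratio - segment value) across clones."""
    residuals = [o - s for o, s in zip(observed, segment)]
    center = median(residuals)
    return MAD_SCALE * median([abs(r - center) for r in residuals])


def outlier_indices(observed: Sequence[float], segment: Sequence[float]) -> List[int]:
    """Indices of clones lying at least OUTLIER_MADS scaled MADs from their segment."""
    threshold = OUTLIER_MADS * scaled_mad(observed, segment)
    return [
        i
        for i, (o, s) in enumerate(zip(observed, segment))
        if abs(o - s) >= threshold
    ]
```

The factor 1.4826 makes the MAD a consistent estimate of the standard deviation for normally distributed noise, so the four-MAD rule is roughly a four-sigma rule that is robust to the outliers being searched for.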

To quantify the total amount of genome altered, each clone was assigned a genomic distance equal to the sum of one half the distance between its center and that of neighboring clones or to the chromosome end for probes with one neighbor.31 FGA was defined as the proportion of the distance of the autosomal clone regions altered to that of the entire genome. FGG and FGL considered only distances that were gained or lost, respectively.
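The distance weighting behind FGA, FGG and FGL can be made concrete with a short Python sketch. The data layout (a list of chromosomes, each with a length and position-sorted clones) and the function names are assumptions for illustration. The sketch also assumes the first clone's territory starts at position 0 and that the denominator is the summed territory of all autosomal clones.

```python
from typing import Dict, List, Sequence, Tuple

Clone = Tuple[float, str]  # (center position, status: "gain", "loss" or "normal")


def clone_territories(centers: Sequence[float], chrom_length: float) -> List[float]:
    """Distance assigned to each clone: half the gap to each neighbor,
    or the full distance to the chromosome end where no neighbor exists."""
    last = len(centers) - 1
    spans = []
    for i, c in enumerate(centers):
        left = (centers[i - 1] + c) / 2 if i > 0 else 0.0
        right = (c + centers[i + 1]) / 2 if i < last else chrom_length
        spans.append(right - left)
    return spans


def genome_fractions(chromosomes: Sequence[Tuple[float, List[Clone]]]) -> Dict[str, float]:
    """Return FGG, FGL and FGA for one sample (clones sorted by position)."""
    totals = {"gain": 0.0, "loss": 0.0, "normal": 0.0}
    for chrom_length, clones in chromosomes:
        centers = [center for center, _ in clones]
        for span, (_, status) in zip(clone_territories(centers, chrom_length), clones):
            totals[status] += span
    genome = sum(totals.values())
    fgg = totals["gain"] / genome
    fgl = totals["loss"] / genome
    return {"FGG": fgg, "FGL": fgl, "FGA": fgg + fgl}
```

For a single hypothetical chromosome of length 100 with clones centered at 10 (loss), 30 (normal) and 70 (gain), the territories are 20, 30 and 50, giving FGL=0.2, FGG=0.5 and FGA=0.7.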

Clone-wise association tests between copy number alterations and clinical variables were based on the segment values with observed values for outlier clones. Moderated t-tests were used for dichotomous variables and moderated F-statistics32 for variables with more than two groups. P-values were adjusted for multiple testing by controlling the FDR.32, 33 Wilcoxon's rank-sum tests were performed to evaluate differential associations of summary genomic events (that is, FGA, FGG, FGL, amplifications and specific alterations) with dichotomous variables. Kruskal–Wallis rank-sum tests were used for events having more than two groups. Statistical analyses of autosomes were performed using R/Bioconductor.29, 34 P-values <0.05 were considered significant.33
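The text does not state which FDR-controlling procedure was applied. As an illustration only, the widely used Benjamini-Hochberg step-up adjustment is sketched below in Python; the choice of this procedure and the function name bh_adjust are assumptions.

```python
from typing import List, Sequence


def bh_adjust(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    The adjusted value for the P-value of rank r (1 = smallest) among m tests
    is min over ranks k >= r of p_(k) * m / k, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the running minimum enforces monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With four hypothetical clone-wise P-values of 0.01, 0.04, 0.03 and 0.005, the adjusted values are 0.02, 0.04, 0.04 and 0.02, so all four would remain below a 0.05 threshold.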

Copy number alterations

The prevalence of ccRCC cases with/without frozen biopsies was compared by patient characteristics using χ2 and Fisher's Exact tests. Alterations of chromosome arms were coded as ‘0’ (no alterations) and ‘1’ (dominant change), defined as gain or loss in >25% of cases. Also included were those previously reported as important alterations in renal cancer. To compare cases with/without p or q arm alterations, cases with gains/losses were considered as dependent variables in stepwise logistic regression models with patient/clinical characteristics, lifestyle and occupational exposure variables. The criterion for inclusion in multivariate models was a P-value <0.20. Study design matching variables (country, sex, age) were included in all models. All variables were fitted in logistic models to obtain odds ratios (OR) and 95% confidence intervals (95% CI) as risk estimates of each alteration. Individual clone copy number differences between groups were initially tested using a moderated t-statistic or F-statistic with pooled variance, adjusted for multiple comparisons controlling the FDR.32 Clones identified through the first screen, observed in >20% of cases, were included in multivariate analyses conducted using SAS 9.1.3 (SAS Institute Inc.) and STATA 10.0 software. Clone-wise tests were conducted in R;34 all tests were two-sided.
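To show how an OR and 95% CI relate to the 0/1 coding of an arm alteration, the following Python sketch computes the unadjusted estimate from a 2x2 table with a Wald interval. This is a simplification: the study used multivariable logistic regression in SAS and STATA, whereas this sketch corresponds only to a logistic model with a single binary covariate. The function name and the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table.

    a: exposed cases with the alteration      b: exposed cases without it
    c: unexposed cases with the alteration    d: unexposed cases without it
    All four cells must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not study data.
    print(odds_ratio_wald(20, 10, 10, 20))
```

Adjusting for the matching variables (country, sex, age) and other covariates, as done in the study, requires fitting the full logistic model rather than a single table.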