How genetic mutations lead to tissue-specific cancer phenotypes remains a fundamental open question in cancer biology9. Somatic mutations in most cancer driver genes are detected only in a minority of tumour types1,2, and inherited cancer predisposition alleles, both common and rare, are usually associated with cancer risk in a tissue-specific manner3. The strong effect of tissue of origin on carcinogenesis suggests that the transcriptional networks that define normal cellular states may also be crucial for oncogenic processes9. Lineage transcription factors (TFs), such as SOX10 in melanoma10,11, are often needed for cancer cell survival and proliferation12,13. However, whether specific interactions between lineage factors and genetic alterations are needed for the establishment of cancer-type-specific oncogenic programmes has remained unclear. Loss-of-function changes in VHL, which are commonly seen in ccRCC6, are extremely rare in other cancers1, and they lead to the constitutive stabilization of HIF1A and HIF2A (also known as EPAS1), of which HIF2A has a dominant role in ccRCC7. Capitalizing on the particular genetic make-up of ccRCC, we set out to investigate the effect of transcriptional lineage factors on the oncogenic phenotypes that arise downstream of cancer-associated genetic alterations.

PAX8 and HIF2A interact on chromatin

To identify essential TFs in ccRCC, we performed pooled loss-of-function screens for TFs that support the proliferation of two metastatic ccRCC cell lines: OS-LM1 and 786-M1A. These cell lines have been extensively characterized at the molecular and phenotypic levels and display clinically relevant genetic and gene regulatory characteristics, including VHL mutations14,15,16. The non-ccRCC cell lines MDA-MB-231 and HeLa were also used for comparison. Two factors, PAX8 and HNF1 homeobox B (HNF1B), showed strong specificity for ccRCC cells (Fig. 1a, b), a result supported by an analysis of public genome-wide CRISPR–Cas9 and RNAi screening data12,13,17,18,19 (Extended Data Fig. 1a) and by validation experiments in several ccRCC cell lines in vitro and in vivo (Extended Data Fig. 1b–h). The combined high expression of PAX8 and HNF1B was evident in renal cancers and normal tissues of the renal epithelial lineage at different developmental stages (Extended Data Fig. 2a–h). In line with their role as renal reprogramming factors20, inhibition of PAX8 and HNF1B reduced, but did not eliminate, chromatin accessibility at genomic loci, especially distal regulatory elements, enriched for their predicted DNA-binding motifs and characteristic of the renal origin (Fig. 1c and Extended Data Fig. 3a–n).

Fig. 1: Chromatin level interaction between the renal lineage factor PAX8 and oncogenic HIF2A in ccRCC.
figure 1

a,b, Pooled CRISPR–Cas9 loss-of-function screen results of ccRCC cell lines (a) and non-ccRCC cell lines (b). Sensitivity score, log2 of the mean of the top three depleted sgRNAs per gene, two replicates per condition, at the end of the assay compared with the start of the assay. ccRCC dependencies are in red. CTRL, average of non-targeting controls. c, Overlap between cancer-type-specific ATAC-seq peaks in TCGA data and those with reduced accessibility after PAX8 and HNF1B depletion in ccRCC cells. Top axis, odds ratio of overlap (black), 95% confidence interval. Bottom axis, P value, one-sided Fisher’s exact test (red). d, Overlap between PAX8- and HIF2A-interacting proteins as determined by RIME in 786-M1A cells. e, Network presentation of physical connections between 89 shared nuclear proteins from HIF2A and PAX8 interactomes. Protein names are provided in Extended Data Fig. 4a. f, Heatmaps of HIF2A and PAX8 ChIP-seq signals from 786-M1A and OS-LM1 xenografts (three tumours each) across regions with strong PAX8–HIF2A co-binding (red), predominant HIF2A binding (blue) and predominant PAX8 binding (grey). Top panels show the average signal within each of the three categories in the same colours. g, HIF2A and PAX8 co-bound genomic regions with reduced accessibility following PAX8 depletion. Median ATAC-seq signal from 786-M1A cells expressing a control RNAi construct (shRen, N = 6) or PAX8-targeting RNAi constructs (shPAX8, N = 6). Median HIF2A and PAX8 ChIP-seq signals from 786-M1A and OS-LM1 xenografts, three tumours each. Asterisk indicates a region of interest. h, Fraction of PAX8 peaks (red) in all high-confidence open chromatin regions (all) and HIF2A ChIP-seq peaks in 786-M1A and OS-LM1 xenograft tumours. Asterisk indicates P < 1.0 × 10−300, two-sided Fisher’s exact test.

Source data

In parallel, we performed rapid immunoprecipitation mass spectrometry of endogenous proteins (RIME)21 to characterize nuclear complexes occupied by HIF2A. The protein complexes that precipitated in all four replicates of the HIF2A RIME showed strong representation of HIF2A and ARNT (Fig. 1d and Supplementary Table 1), the HIF2A dimerization partner essential for DNA binding22. By contrast, the control IgG RIME experiments showed no signal for these proteins. In addition, we identified PAX8 as a member of the HIF2A nuclear interactome in three out of the four RIME experiments, but not in IgG controls. Conversely, HNF1B was identified in one out of the four HIF2A RIME replicates and in one of the IgG control experiments, which suggests that this could reflect background binding. A reciprocal experiment using PAX8 antibodies gave a strong signal for PAX8 in all four RIME replicates and identified HIF2A and ARNT as members of its nuclear interactome in three and four replicates, respectively (Fig. 1d and Supplementary Table 2). Of the 183 specific proteins identified in the same complexes with HIF2A, 99 also belonged to the same complexes with PAX8, and 89 of these represented nuclear proteins with functions in processes such as chromatin remodelling (the SWI/SNF complex) or mRNA processing (Fig. 1e and Extended Data Fig. 4a).

Chromatin immunoprecipitation with sequencing (ChIP-seq) analysis of xenografted ccRCC tumours revealed that PAX8 and HIF2A colocalized on chromatin substantially more frequently than what would be expected by chance. Specifically, 43% and 65% of the HIF2A binding sites in 786-M1A and OS-LM1 tumours, respectively, showed significant PAX8 binding (Fig. 1f–h and Extended Data Fig. 4b, c). PAX8 motifs were enriched in open chromatin regions that characterize both ccRCCs and papillary RCCs (Extended Data Fig. 3m, n), but VHL mutations are specific to ccRCC. In line with this, the HIF2A motif was the most significantly enriched motif in ccRCC-specific peaks from assay of transposase accessible chromatin sequencing (ATAC-seq) when compared to papillary RCCs in the The Cancer Genome Atlas (TCGA) cohort23 (Extended Data Fig. 4d, e). The orientation of PAX8-binding and HIF2A-binding motifs in ccRCC-specific genomic regions varied, and the distance was more than expected for co-operative DNA binding24 (Extended Data Fig. 4f). The only recurrent HIF2A–PAX8 motif orientation was related to the long terminal repeat sequence of a common endogenous retrovirus, ERV1, the expression of which has been linked to poor patient outcomes in ccRCC25,26. Also, although we observed HIF2A interactions with ARNT, we did not detect HIF2A–PAX8 interactions by co-immunoprecipitation (Extended Data Fig. 4g). These results demonstrate that in ccRCC, the renal lineage factor PAX8 and the oncogenic driver HIF2A interact at the chromatin level, probably through DNA and shared chromatin factor complexes.

PAX8–HIF2A-dependent oncogene activation

In line with the strong effect on proliferation, PAX8 and HNF1B depletion in 786-M1A and OS-LM1 cells led to reduced expression of genes involved in the cell cycle, targets of E2F1 and MYC signalling (Fig. 2a, Extended Data Fig. 4h and Supplementary Tables 3 and 4). The expression of PAX8-dependent and HNF1B-dependent genes also tracked with PAX8 and HNF1B expression, respectively, in fetal human kidney (Extended Data Figs. 2g and 4i, j). Notably, the hypoxia gene set was significantly downregulated in PAX8-depleted, but not in HNF1B-depleted, ccRCC cells (Fig. 2a). Restoration of VHL and consequent inhibition of HIF2A does not have a strong effect on the proliferation of ccRCC cells in vitro7. To identify gene regulatory elements that mediate HIF2A-driven ccRCC formation, we set out to identify transcriptional targets of HIF2A in vivo, map HIF2A-bound regulatory elements in the vicinity of these genes and target these enhancers using a CRISPRi-based loss-of-function screen in vivo.

Fig. 2: PAX8–HIF2A interactions support oncogene activation in ccRCC.
figure 2

a, Gene set enrichment analysis with MSigDB Hallmark gene sets on the effects of PAX8 and HNF1B depletion compared with a control RNAi construct (shRen). Two PAX8-targeting (shPAX8) and HNF1B-targeting (shHNF1B) RNAi constructs and cell lines (786-M1A and OS-LM1) were combined for each gene, respectively. Significantly changed gene sets are in colour (blue or red). EMT, epithelial-to-mesenchymal transition; NES, normalized enrichment score. b, Pooled in vivo CRISPRi screening. Normalized average depletion for the two most depleted constructs per region presented for 30 tumours in two groups (left versus right mouse flank). Essential genes, positive control genes. Control, average of non-targeting constructs. Empirical one-sided P values based on 10,000 permutations. c, Median ATAC-seq signals from shRen control (N = 6) and shPAX8 (N = 6) cells. Median HIF2A and PAX8 ChIP-seq signals from 786-M1A and OS-LM1 xenografts, three tumours each. Asterisk indicates E11:69419. d, Differential DNA accessibility. TCGA ATAC-seq data, 410 human tumours, 562,709 pan-cancer peaks. ccRCCs compared to all other tumour types by DESeq2. e, Normalized DNA accessibility at E11:69419, TCGA ATAC-seq data. ccRCC (KIRC), N = 16; papillary RCC (KIRP), N = 34. f, Normalized DNAse hypersensitivity (DHS) signal for E11:69419, 733 samples from different cell types. g, Tumour-free survival of mice inoculated with 786-M1A cells. iE11:69419, E11:69419 targeted by CRISPRi. log-rank test. N = 8 tumours for each group. h, RT–qPCR results of E11:69419 targeted by CRISPRi in 786-M1A cells. i, Subcutaneous tumour growth, 786-M1A cells. shRen (control RNAi construct), shCCND1-1 and shCCND1-2 (two RNAi constructs that target CCND1), N = 8; shMYEOV-1 and shMYEOV-2 (two RNAi constructs that target MYEOV), N = 10 tumours per group. Mean and s.e.m. Two-sided Kruskal–Wallis test. j, RT–qPCR results. EV, empty vector. For e and f, box plots show the median and interquartile range, and whiskers show the data range. For h and j, data points indicate independent RNA preparations (N = 3). Mean and s.e.m. Two-sided Kruskal–Wallis test.

Source data

We first developed a tumour model in which HIF2A expression could be experimentally controlled by deriving a HIF2A knockout clone from 786-M1A cells (referred to as C-M1AHIF2A–/) and reintroduced HIF2A in these cells using a doxycycline-dependent transgene (Extended Data Fig. 5a). C-M1AHIF2A–/ cells grew independently of HIF2A in vitro (Extended Data Fig. 5b), but they required the DNA-binding domain of HIF2A for tumour formation in vivo (Extended Data Fig. 5c). HIF2A depletion following doxycycline withdrawal from the diet (Extended Data Fig. 5d, e) enabled us to characterize the transcriptomic effects of HIF2A inhibition at different time points by RNA sequencing (RNA-seq) in vivo. Focusing on genes with early and sustained downregulation, we detected 205 strongly HIF2A-dependent transcripts (Extended Data Fig. 5f and Supplementary Table 5). A total of 175 HIF2A ChIP-seq peaks within a 500-kb region flanking the transcription start site of these genes were also identified, which was a significant enrichment over background (empirical P < 0.001 based on 1,000 permutations). We generated a pooled library of 706 single guide RNA (sgRNA) pairs that targeted these peaks and 30 positive and negative controls (Supplementary Table 6) using a tandem design previously shown to effectively inhibit enhancer function16,27. The library was transduced together with dCas9–KRAB28 into a clone of 786-M1A cells that was sensitive to VHL restoration in a tumour-formation assay in vivo. We then transplanted these cells into 15 NSG mice, two tumours each, and measured sgRNA representation in established tumours (Extended Data Fig. 5g). Permutation-based tests showed that the 698 constructs with high representation in the plasmid library contained sgRNAs that were consistently depleted in tumours (Extended Data Fig. 5h). All well-represented non-targeting control constructs were recovered with high efficiency from tumours, whereas constructs targeting essential genes were frequently lost (Extended Data Fig. 5i). In addition, several constructs that targeted HIF2A-bound enhancers were depleted in tumours (Extended Data Fig. 5i). Combining data from constructs with shared target regions, we observed significant (empirical P < 0.01 using 10,000 permutations) depletion of constructs that targeted 21 HIF2A-bound enhancers (Fig. 2b), 16 of which showed binding of both HIF2A and PAX8 (Fig. 2c and Extended Data Fig. 5j). The strongest hit was an intergenic region at chromosome 11:69,419,632-69,420,080 (Fig. 2c). This enhancer, referred henceforth to as E11:69419, overlapped with one of the most strongly ccRCC-specific open chromatin regions in a large clinical ATAC-seq cancer dataset23 (Fig. 2d), with clear activation in ccRCCs but not in papillary RCCs (Fig. 2e). Compared to renal cancer samples, the region covering E11:69419 showed low accessibility in other tissues, including samples representing normal renal lineage, in a large human DNAse I hypersensitivity catalogue29 (Fig. 2f). We validated the in vivo role of E11:69419 in ccRCC formation by inhibiting it using CRISPRi with two independent sgRNA pairs (Fig. 2g).

E11:69419 is flanked by two protein coding genes, MYEOV and CCND1, and it harboured strong HIF2A and PAX8 peaks (Fig. 2c). It also overlapped with the set of genomic loci that showed reduced accessibility after PAX8 depletion in our data, and its activity has been previously linked to HIF2A30,31. CCND1 encodes cyclin D1, a positive cell cycle regulator that is activated in several cancer types32, including ccRCC, in which its expression is controlled by the VHL–HIF2A pathway33. MYEOV is poorly characterized and has only weak homology to other known proteins. On the basis of chromatin interaction data, E11:69419 interacts with the promoter regions of MYEOV and CCND1 (ref. 34). CRISPRi-mediated inhibition of E11:69419 led to downregulation of both of these genesas determined by quantitative PCR with reverse transcription (RT–qPCR) (Fig. 2h)—but only CCND1 was required for ccRCC cell proliferation and tumour formation in vivo (Fig. 2i and Extended Data Fig. 6a, b). Inhibition of PAX8 and HIF2A, but not HNF1B or HIF1A, reduced CCND1 expression (Extended Data Fig. 6c–k and Supplementary Table 4). Notably, combined inhibition of PAX8 and HIF2A did not further reduce CCND1 levels (Fig. 2j and Extended Data Fig. 6l). We did not find consistent evidence of HIF2A affecting PAX8 expression or vice versa (Extended Data Fig. 6m, n), but CCND1 expression correlated more strongly with HIF2A than PAX8 expression in clinical ccRCC specimens (Extended Data Fig. 6o).

PAX8 mediates inherited ccRCC risk

Genome-wide association studies (GWAS) have identified common genetic variants that are associated with RCC risk in humans, the most significant of which is rs7105934 on chromosome 11q13.3 (refs. 4,5). This risk haplotype comprises E11:69419, which covers the linked single nucleotide polymorphisms (SNPs) rs7948643 and rs7939721 (ref. 30). Motif analysis identified two putative binding sites for both HIF2A and PAX8, but not HNF1B, in the E11:69419 sequence (Fig. 3a). As OS-LM1 and 786-M1A cells carry a luciferase transgene, we used 786-O and 2801-LM1 (a metastatic derivative of 786-O) cells in luciferase-based reporter assays, which showed robust enhancer activity for the E11:69419 sequence (Fig. 3b). Mutating one HIF2A and one PAX8 site reduced E11:69419 activity, whereas the other mutations did not have an effect (Fig. 3c). In line with the CCND1 expression data (Fig. 2j), combining mutations that inactivated the functional PAX8 and HIF2A sites did not further reduce reporter activity (Fig. 3c). Pharmacological HIF2A inhibition also reduced E11:69419 activity (Extended Data Fig. 7a). rs7948643 is located exactly at the functionally important PAX8 binding site within E11:69419, in which the more common risk allele T is the nucleotide with the highest information content in the motif derived from our PAX8-depleted ATAC-seq peaks (Fig. 3d). By contrast, the rarer protective allele C, with a ccRCC odds ratio of 0.7 (ref. 5), was predicted to reduce PAX8 affinity for the motif (Fig. 3d). In reporter assays, changing the allele T at rs7948643 for the minor allele C resulted in a reduction in enhancer activity that was comparable to that observed with larger mutations in the predicted PAX8-binding site (Fig. 3c, e).

Fig. 3: The ccRCC risk allele at rs7948643 increases PAX8-dependent activation of an oncogenic enhancer.
figure 3

a, Chromosome 11:69,417,866-69,422,866, median shRen (N = 6) and shPAX8 (N = 6) ATAC-seq signals, and median 786-M1A and OS-LM1 xenograft (N = 3 tumours each) HIF2A and PAX8 ChIP-seq signals. Asterisk indicates E11:69419, with the relative orientation of HIF2A and PAX8 DNA-binding motifs highlighted. b, Reporter assay showing E11:69419 enhancer activity, fold change over control, arbitrary units. N = 8. c, Reporter assay showing the effect of mutated HIF2A and PAX8 sites on E11:69419 activity. 786-O: wild type (WT), N = 10; HIF2A-1, N = 8; HIF2A-2, N = 8; PAX8-1, N = 9; PAX8-2, N = 8; HIF2A-2+PAX8-1, N = 4. 2801-LM1: WT, N = 12; HIF2A-1, N = 9; HIF2A-2, N = 11; PAX8-1, N = 10; PAX8-2, N = 8; HIF2A-2+PAX8-1, N = 4. d, PAX8- and HIF2A-binding motifs at E11:69419, with the ccRCC risk-associated SNP rs7948643 highlighted. e, Reporter assay showing the effect of the T>C change at rs7948643 on E11:69419 activity. N = 4. f,g, Electrophoretic mobility shift assay. Representative image (f) and quantification (g) of independent experiments (N = 4). Protein from MDA-MB-231 cells expressing EV or PAX8, oligonucleotides with the T or C allele at rs7948643. h, Long DNA reads used for phasing of the 11q13.3 RCC risk allele in RCC-JF cells. i, Allele-specific HIF2A, PAX8 or IgG ChIP qPCR in RCC-JF cells at rs7948643, normalized to allele ratio of input control. Data points indicate independent immunoprecipitation reactions. HIF2A, N = 3; other conditions N = 5. j, Allele-specific RT–qPCR results of rs7177 in RCC-JF cells, normalized to the allele ratio in genomic DNA (gDNA). k, Allele-specific RT–qPCR results of rs7177 in RCC-JF cells after PAX8 depletion, normalized to shRen control. l, Subtype-specific RCC risk associated with rs7948643 and rs7105934. Minor allele frequency of 0.07 for both variants. ccRCC, 5,648 cases and 15,010 controls; papillary RCC, 563 cases and 14,840 controls. Odds ratio shown, with whiskers representing 95% confidence intervals. For b, c and e, data points indicate the average of three technical replicates, independent transfections. For j and k, data points indicate independent RNA preparations (N = 4). For b, c, e, g, j and k, mean and s.e.m. shown. Two-sided Kruskal–Wallis test.

Source data

Gel-shift assays demonstrated that PAX8 bound the predicted PAX8 motif within E11:69419, with the allele T at rs7948643 showing higher affinity than allele C (Fig. 3f, g and Extended Data Fig. 7b). ATAC-seq analysis of a heterozygous VHL mutant ccRCC cell line, RCC-JF, suggested that there was equal chromatin accessibility of both the risk and protective E11:69419 alleles, a result supported by an analysis of heterozygous human ccRCC specimens (Extended Data Fig. 7c). In line with this, expression of a VHL-insensitive constitutively stable form of HIF2A in renal epithelial HK2 cells that endogenously express PAX8 led to E11:69419 activation in reporter assays but no accessibility at the endogenous E11:69419 locus or increase in CCND1 expression even after about 40 population doublings over 6 weeks (Extended Data Fig. 7d–g). Long-read whole-genome sequencing of RCC-JF cells resolved the haplotypes of the CCND1 locus, linking the risk allele T at rs7948643 to the allele A at rs7177 in the CCND1 3′ untranslated region (Fig. 3h and Extended Data Fig. 7h). Allele-specific ChIP–qPCR of RCC-JF cells confirmed there was higher binding of PAX8 and HIF2A to the allele T at rs7948643 when compared to allele C in the chromatin context (Fig. 3i). In line with this, PAX8 depletion reduced HIF2A binding at E11:69419 but not at a PAX8-independent HIF2A bound enhancer (E14:34035), and HIF2A depletion did not affect PAX8 binding (Extended Data Fig. 7i, j). Allele-specific RT–qPCR demonstrated higher baseline expression and a strong bias towards reduced expression of allele A at rs7177 following PAX8 depletion when compared to allele C (Fig. 3j, k and Extended Data Fig. 7k). We did not detect PAX8 binding at E11:69419 in papillary RCC cell lines (Extended Data Fig. 7l). A RCC-subtype-specific meta-analysis of human GWAS data5 revealed that rs7948643 was associated with ccRCC (P = 2.4 × 10−10) but not papillary RCC (P = 0.90) (Fig. 3l, Extended Data Fig. 8, Extended Data Table 1 and Supplementary Tables 7 and 8). Accessible E11:69419 therefore integrates the PAX8 and HIF2A signals, both of which are needed for full E11:69419 activity. Moreover, the ccRCC-protective allele C at rs7948643 inhibits PAX8 binding, which consequently reduces the activity of E11:69419 upstream of the oncogenic driver CCND1 and possible other pro-tumorigenic mediators.

A physiological MYC programme in cancer

In contrast to HIF2A inactivation (Extended Data Fig. 5b), PAX8 inhibition compromised ccRCC proliferation in vitro (Extended Data Fig. 1b), which indicated the presence of HIF2A-independent oncogenic PAX8 functions. PAX8 positively regulated HNF1B expression, whereas PAX8 expression was not altered in HNF1B-depleted cells (Extended Data Figs. 4h and 9a–e and Supplementary Tables 3 and 4), and HNF1B expression followed PAX8 expression in the developing kidney (Extended Data Fig. 4j). Reintroduction of exogenous PAX8 or HNF1B rescued the in vitro proliferation defect caused by PAX8 depletion (Fig. 4a and Extended Data Fig. 9f, g). The effect of HNF1B depletion was also rescued by exogenous HNF1B expression (Extended Data Fig. 9h). We identified genes that were downregulated in both PAX8-depleted and HNF1B-depleted cells and that were important for ccRCC proliferation based on genome-wide CRISPR–Cas9 screening data17,18. Only two genes fit these criteria: HNF1B and MYC (Fig. 4b). Of note, CCND1 was not on this list as its expression does not depend on HNF1B. The corresponding analysis of genes that were upregulated following PAX8 and HNF1B depletion did not reveal any genes that inhibited ccRCC proliferation (Extended Data Fig. 10a). We confirmed that MYC was downregulated in PAX8-depleted and HNF1B-depleted cells at the mRNA and protein level (Extended Data Fig. 10b–e). Furthermore, knockdown of MYC expression to the level observed in HNF1B-depleted cells closely phenocopied the effect of HNF1B inhibition on ccRCC proliferation in vitro (Extended Data Fig. 10f, g), and HNF1B restoration in PAX8-depleted cells restored MYC expression (Fig. 4c). Even though HIF2A has been linked to enhanced MYC activity35,36,37,38, the negative effect on MYC expression could explain the antiproliferative phenotype that follows PAX8 and HNF1B inhibition in vitro, and this effect may be independent of the VHL–HIF2A pathway.

Fig. 4: Co-option of a normal lineage factor programme for oncogenesis in ccRCC-associated 8q21.3-q24.3 amplifications.
figure 4

a, Competitive proliferation assay against EV-shRen control cells. Relative proportion of the indicated cells on day 12 compared with day 0. 786-M1A cells. Data points indicate technical replicates (N = 3). Mean and s.e.m. b, Average dependency score (CERES score) in the DepMap dataset for 25 shared genes downregulated by PAX8 and HNF1B inhibition. c, RT–qPCR results of 786-M1A cells. Data points indicate independent RNA preparations (N = 4). Mean and s.e.m. Two-sided Kruskal–Wallis test. d, Pooled CRISPRi-based proliferative screen for putative MYC enhancers in 786-M1A cells. Normalized average depletion for the two most depleted constructs per region presented for the two technical replicates (rep 1 and 2). Control, average of non-targeting control constructs. e, Median ATAC-seq signals in 786-M1A shRen (N = 5) and shHNF1B (N = 6) cells, and PAX8 and HNF1B ChIP-seq signals in 786-M1A and OS-LM1 xenografts (3 tumours each) for the MYC locus. Enhancers that support ccRCC proliferation are highlighted (enhancers containing a HNF1B motif in red, others in green). f, Effect of HNF1B depletion on chromatin accessibility at the enhancers in the MYC locus. Fold changes and adjusted two-sided P values derived by DESeq2. g, RT–qPCR results following CRISPRi-mediated targeting of E8:128132 and E8:128526 in 786-M1A cells. Data points indicate independent RNA preparations (N = 3). Mean and s.e.m. Two-sided Kruskal–Wallis test. h, Normalized DHS signal for E8:128132 and E8:128526, 733 samples from different cell types. Box plots show the median and interquartile range, and whiskers the data range. i, j, Growth of normal human renal epithelial organoids with and without PAX8 depletion. Representative images (i) and quantification (j). Scale bar, 1 mm. N = 21 random growing organoids per condition and time point. Mean and s.e.m. Two-sided Kruskal–Wallis test.

Source data

An increased copy number of MYC and its regulatory regions is associated with ccRCC metastasis8, and fluorescence in situ hybridization (FISH) analysis showed that cells of the metastatic 786-M1A cell line carry six copies of MYC (Extended Data Fig. 11a). Using a functional CRISPRi screen, we identified eight distal MYC enhancers that were important for ccRCC proliferation (Fig. 4d,e and Supplementary Table 9). Two of them, E8:128132 (chromosome 8:128,132,902-128,133,724) and E8:128526 (chromosome 8:128,526,339-128,526,710), showed HNF1B binding and contained the HNF1B motif (Fig. 4e). Unlike the majority of accessible chromatin regions in the MYC locus, both these regions showed significantly reduced accessibility in cells in which HNF1B and PAX8 were knocked down (Fig. 4f and Extended Data Fig. 11b–d) and targeting them with CRISPRi reduced MYC expression (Fig. 4g). In addition to renal cancer, E8:128132 and E8:128526 showed DNA accessibility in normal cells derived from the kidney (Fig. 4h), and PAX8 depletion inhibited the proliferation of HK2 cells, a renal epithelial cell line, and normal human renal organoids (Fig. 4i,j and Extended Data Fig. 11e). PAX8 depletion also inhibited HNF1B and MYC expression in HK2 cells and renal organoids (Extended Data Fig. 11f–h), but did not reduce CCND1 expression (Extended Data Fig. 11i). Furthermore, CRISPRi-based targeting of E8:128132 and E8:128526 reduced MYC expression in HK2 cells (Extended Data Fig. 11j). The cancer-specific 8q21.3-q24.3 amplifications in ccRCC cells therefore co-opt a lineage-factor-dependent physiological programme that supports MYC expression and proliferation that is already present in normal renal epithelial cells.


Tissue-specific factors are major determinants of carcinogenesis, but how they contribute to oncogenic processes remains largely unknown9. We identified the renal lineage factor PAX8 as a requirement for oncogenic signalling by three major genetic drivers of ccRCC, thereby providing support to the hypothesis that transcriptional lineage factors contribute to the tissue-specific manifestation of oncogenic phenotypes downstream of cancer driver mutations (Extended Data Fig. 11k). Chromatin accessibility data from human specimens together with our functional data indicate that in addition to PAX8, other factors are needed for the establishment of accessibility at crucial oncogenic PAX8-dependent enhancers. Moreover, even though VHL mutations are in general only associated with ccRCC in the context of common sporadic cancers, the tumour spectrum of VHL germline mutations is broader7. As-yet to be identified lineage factor programmes in other tissues may also collaborate with VHL loss-induced signals in tumorigenesis. Overall these observations suggest that the interaction between lineage factors and cancer-associated genetic alterations in oncogenesis depends on several layers of epigenetic conditioning.

Multiple known risk loci predispose to renal cancer5,30,36,39. We showed that rs7948643, a common genetic variant linked to the most significant renal cancer risk locus rs7105934 on chromosome 11q13.3, falls under a PAX8-binding site within E11:69419, and that the ccRCC risk allele T favours PAX8 binding. The requirement of both PAX8 and HIF2A for E11:69419 activity and the strong association of rs7948643 with ccRCC, but not papillary RCC, support a model in which the difference in PAX8 binding at rs7948643 is the cause of increased ccRCC risk associated with this locus. In line with the tissue and context-specific expression patterns of PAX8 and HIF2A, respectively, and the restricted accessibility of E11:69419, the rs7948643 genotype does not correlate with CCND1 expression in most normal tissues40.

Our results provide functional insight into the mechanisms that govern the interaction between inherited and somatic genetic alterations with developmental lineage factors in determining cancer risk, specifically in ccRCC. The distal enhancer E11:69419 integrates signals from the most commonly mutated ccRCC pathway and the most significant common ccRCC risk locus in a PAX8-dependent manner. The molecular mechanism uncovered here is therefore likely to have a significant effect on the population-level cancer burden in the kidney. Moreover, PAX8 supports the expression of two canonical oncogenes, CCND1 and MYC, and genetic inactivation of Pax8 is tolerated in the mouse kidney41. This suggests that PAX8 could be a viable therapeutic target in ccRCC. Strategies to inhibit lineage factors beyond nuclear hormone receptors should be of interest across different cancer types.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.