Spatial niche formation but not malignant progression is a driving force for intratumoural heterogeneity

Intratumoural heterogeneity (ITH) is a major cause of cancer-associated lethality. Extensive genomic ITH has previously been reported in clear cell renal cell carcinoma (ccRCC). Here we address the question whether ITH increases with malignant progression and can hence be exploited as a prognostic marker. Unexpectedly, precision quantitative image analysis reveals that the degree of functional ITH is virtually identical between primary ccRCCs of the lowest stage and advanced, metastatic tumours. Functional ITH was found to show a stage-independent topological pattern with peak proliferative and signalling activities almost exclusively in the tumour periphery. Exome sequencing of matching peripheral and central primary tumour specimens reveals various region-specific mutations. However, these mutations cannot directly explain the zonal pattern suggesting a role of microenvironmental factors in shaping functional ITH. In conclusion, our results indicate that ITH is an early and general characteristic of malignant growth rather than a consequence of malignant progression.

I ntratumoural heterogeneity (ITH) is a result of genomic instability and causes clonal variability that drives malignant progression, therapy resistance and fatal disease outcome [1][2][3] . Elucidating the causes and consequences of ITH is therefore of utmost importance not only for cancer diagnosis and prognosis but also for new therapeutic strategies and personalized cancer patient management 4 .
Multiregion exome sequencing studies of clear cell renal cell carcinoma (ccRCC) have recently underscored the extensive genomic ITH and branched clonal evolution of this tumour type 5,6 . In an initial study involving a widely metastatic ccRCC, approximately 31% of the somatic mutations detected were ubiquitous. Approximately 23% were shared between sampling sites of the primary tumour whereas approximately 21% were shared between metastatic lesions, or found exclusively in one of the lesions analysed ('private mutations'; approximately 25%) 5 . Inactivation of the VHL tumour suppressor gene was ubiquitously detected as expected for a truncal driver alteration. In a follow-up study 6 , the high frequency of heterogeneous mutations with most driver mutations arising in spatially separated subclones was confirmed. In addition, multiple subclones in a given tumour region were detected.
These findings raise the possibility that the evolution of tumour subclones leads to enhanced genomic ITH as a tumour progresses and that this is reflected by increased functional ITH. In this scenario, functional ITH as measured by changes in protein expression or protein phosphorylation could be a prognostic marker.
To test this hypothesis, we used a precision quantitative imaging approach as well as regional whole-exome sequencing to characterize ITH in primary ccRCCs of various stages. We did not detect any significant differences in the extent of functional and genomic ITH between ccRCC of the lowest stage in comparison to advanced tumours. However, we unexpectedly found marked and consistent spatial differences in functional ITH between the tumour centre and the tumour periphery with highest proliferation and signalling activities almost exclusively in the tumour periphery. Using whole-exome sequencing, we discovered various region-specific mutations, thus underscoring that genomic ITH also follows a spatial pattern. However, the periphery-specific mutations could not directly explain the functional ITH, which underscores the role of microenvironmental factors in Intratumoural niche formation. Collectively, our results suggest that ITH is an early and general characteristic of primary ccRCCs that does not increase with malignant progression. They support a model in which Intratumoural niches, most notably the tumour peripheral zone and the tumour centre, populated by tumour subclones with unique functional properties and mutations, are drivers of ITH.

Results
Phenotypic and functional ITH in ccRCC is stage-independent. For a quantitative image analysis of phenotypic and functional ITH, four immunohistochemical markers were used as surrogates for genomic alterations in key signalling pathways in ccRCC. These included HIF-1a and HIF-2a (VHL inactivation), phospho-mTOR S2448 as well as phospho-S6RP S235/236 (activating mutations in the PI3K/AKT/mTOR pathway) 5 . In addition, Ki-67 was included to assess the cellular proliferation (Fig. 1a). Primary tumours from 30 patients were analysed (Supplementary Table 1), and staining results were quantified as outlined in Fig. 1b. To detect changes in the extent of functional ITH associated with malignant progression, we analysed a spectrum of tumours ranging from small pT1 (n ¼ 9) tumours to advanced pT3/4 tumours (n ¼ 11) with or without lymphatic or distant metastasis. A very rare subgroup of small T1 tumours with synchronous distant  HIF-2α  Overview of the quantitative IHC analysis process using a representative image of a wholeslide scanned pT1M1 tumour after phospho-mTOR S2448 staining. The image was overlaid with a virtual grid consisting of 1 mm 2 squares that defines the tumour regions to be analysed. Non-cancerous tissue was subtracted manually (dark grey) and a positive pixel count was performed on the corrected area. Positivity per square was expressed as percentage of positive pixels. If a square contained non-tumorous tissue, the positivity was normalized to the corrected square area. Computed values were depicted as a heat map. Coloured squares represent tumour, whereas open squares indicate completely non-tumorous areas. Squares defined as peripheral tumour zone are marked by a black line. Scale bar, 1 cm. metastasis (n ¼ 10) was included with the idea that these tumours would contain features of more advanced ccRCCs. Since our intention was to compare small renal tumours with widely advanced tumours, pT2 RCCs were deliberately excluded.
NATURE COMMUNICATIONS | DOI: 10.1038/ncomms11845 ARTICLE stages did not indicate any major differences in functional ITH between tumour subgroups (Fig. 2).
To corroborate this notion, we first analysed expression of individual markers for each prognostic subgroup (Fig. 3a). No statistically significant differences were detected when mean percentages of positive nuclei or pixels per square in pT1M0 tumours were compared with pT1M1 or pT3/T4 tumours (P40.05, Kruskal-Wallis test for multiple comparisons). This result was not due to patient selection, since the cancer-specific survival analysis confirmed the differences in patient prognosis (Fig. 3b).
Next, we displayed the distribution of the five tissue markers as histogram plots (Fig. 4). These histograms represent the number of 1 mm 2 squares that contained a certain percentage of positive pixels or positive nuclei. For an analysis of individual marker distribution, we used the D'Agostino-Pearson omnibus test. This test measures goodness-of-fit and determines how symmetrical the distribution of the data is and whether the data distribution follows a Gaussian distribution. A homogeneous marker distribution throughout an entire tumour sample would be reflected in a histogram with a normal, or Gaussian distribution, and the test would be passed. The D'Agostino-Pearson test does not only provide information about whether a distribution follows a Gaussian distribution or not but also calculates P values to assess the probability that a negative or positive test result is correct. In total, 72% of the samples were significantly D'Agostino-Pearson negative (Po0.05; D'Agostino-Pearson test) and were hence considered to be heterogeneous. Twentyeight percentage of the samples passed the test; however, not with a significant P value (P40.05; D'Agostino-Pearson test; Supplementary Table 2).
We next sought to directly quantify the degree of functional ITH in the various prognostic subgroups. To this end, we used two quantitative scores, the s.d. of the PNCs and PPCs from all squares of a given tumour sample stained for a specific marker and the maximum-mean (MAX-m) score. The latter is particularly useful to identify outlier regions of positive marker staining within the tumour area analysed. In addition, we used two established biodiversity indices, the Shannon and the inverse Simpson index, to characterize functional ITH.
S.d. and MAX-m scores as well as Shannon and inverse Simpson indices were calculated for all five markers for each of the 30 tumour specimens (Fig. 5). No statistically significant differences between different prognostic subgroups were detected (P40.05, Kruskal-Wallis test for multiple comparisons). In addition, we were not able to corroborate a statistically significant correlation between ITH scores and histological tumour grade (P40.05; Mann-Whitney U-test; Supplementary Table 3).
In summary, our data indicate that functional ITH exists already in primary ccRCCs of the lowest stage and does not increase with malignant progression. ITH is unaffected by changes in signalling cascades. We next performed a correlation analysis of all markers and tumours using directly corresponding virtual squares of the adjacent tissue sections (Supplementary Table 4). In both pT1M0 and pT1M1 tumours, a moderate-to-good positive correlation (r ¼ 0.4-0.7; Spearman's rank correlation coefficient) for HIF-1a and HIF-2a as well as co-expression of phospho-mTOR S2448 and phospho-S6RP S235/236 was found. In addition, there was a negative correlation (r ¼ À 0.51; Spearman's rank correlation coefficient) for HIF-1a and phospho-mTOR S2448 expression in pT1M0 tumours. A negative regulation of mTORC1 by HIF-1a has previously been reported through the HIF-1a transcriptional target REDD1 (ref. 7), and it remains to be determined if this is the underlying mechanism for this observation. Tumours of pT1M1 stage showed a positive correlation between HIF-1a and phospho-S6RP S235/236 expression (r ¼ 0.51; Spearman's rank correlation coefficient). Remarkably, however, none of these correlations was found in pT3/4 ccRCCs. Instead, there was a negative correlation between HIF-2a expression and the proliferation marker Ki-67 (r ¼ À 0.46; Spearman's rank correlation coefficient) 8 . No significant correlations were detected when all tumours combined were analysed. Taken together, these results indicate that apparent changes in tissue marker correlations associated with malignant progression (highlighted in Supplementary Fig. 1) are not reflected by changes in functional ITH as determined by our quantitative image analysis.
Tumour periphery and centre are distinct topological niches. When we analysed the expression of Ki-67 as a marker of cellular proliferation, we noticed that proliferating cells were preferentially found in the tumour periphery (Fig. 6a,b). Markers indicating an activated PI3K/mTOR pathway were found to follow this pattern with highest expression in the periphery of the tumours (Fig. 6c). We therefore systematically analysed all tumours and defined the peripheral zone as the outermost squares of each tumour (Fig. 1) and determined the localization of the square with the peak expression for all markers used (Fig. 6d). A total of 4,475 peripheral zone and 4,425 central zone squares were evaluated. We found that in the vast majority of tumours, the peak marker expression was in the peripheral tumour zone (Po0.05; Bernoulli trial). The only exception was the expression of phospho-S6RP S235/236 in pT1M1 tumours, where the majority of tumours showed a peak expression in the central zone. This result and the fact that tumours were not uniformly negative in the centre (Fig. 2) rule out that non-specific effects of tissue processing are responsible for our findings. When we compared pT1M0 with pT1M1 and pT3/4 tumours, we did not find any significant differences between the prognostic subgroups with respect to the peripheral and central zone expression of markers (P40.05, Fisher exact test, two-tailed). Next, we tested whether an unequal admixture of non-tumorous cells may account for the topological differences that were detected. Centre and peripheral zones of 24 tumours were stained for CD31 to measure microvessel density and CD45 to measure leukocyte infiltration ( Supplementary Fig. 2). No significant differences between the tumour centre and periphery were detected (P40.05; Mann-Whitney U-test), thus underscoring that spatial ITH was not due to the admixture of non-tumorous cells or differences in intratumoural microvessel density (see also analysis of mutant allele frequencies below).
These results indicate that the tumour periphery and centre represent distinct topological niches that differ functionally and that are retained during malignant progression.
Tumour periphery and centre harbour distinct somatic mutations. We next sought to corroborate whether the differences between the tumour periphery and the tumour centre with respect to functional ITH are based on genetic alterations and/or distinct mutational processes. To this end, we performed whole-exome   Table 5). The zonal pattern of functional heterogeneity was maintained in all tumours (Supplementary Fig. 3). Tumour periphery and centre were macrodissected (Fig. 7a), and whole exomes were sequenced with an average coverage of 50.3 Â . We first analysed known truncal driver mutations and recurrently mutated genes and found a total of 28 mutations in seven known ccRCC driver genes and pathways (Fig. 7b) [9][10][11][12] . Driver mutations were typically found in both the peripheral and central tumour areas. For every patient, single-nucleotide variants (SNVs) detected only in the respective peripheral zone sample were designated as peripheral and SNVs detected only in the respective central zone sample were designated as central. Variants that occurred recurrently in at least six of the eight patients or that were dbSNP-associated without clinical context were removed. SNVs were called functional, when they were nonsynonymous, occurred within splice sites or lead to gain or loss of stop codons.
For an analysis of mutational signatures, SNVs were grouped into 96 different categories according to their trinucleotide context and the type of base exchange as previously described 13 .
We detected a total of 88 functional SNVs and indels that were specific for either the peripheral zone or the central zone of the tumours analysed (Fig. 7c,d). There was no correlation between the number of centre-or periphery-specific mutations and tumour stage or grade (Supplementary Table 5).
Whereas the central zone specimens from eight ccRCCs contained on average 8.5 ± 2.6 (mean ± s.e.m.) specific functional SNVs and indels, the eight peripheral zone specimens contained on average 2.5 ± 1.2 (mean ± s.e.m.) specific functional SNVs. There were no recurrently mutated genes in the tumour peripheral zone that could directly explain the consistently higher proliferation rate and signalling activities detected there. Instead, mutations were found, for example, in genes involved in chromatin remodelling (CARM1 and GATA3), DNA replication and repair (POLQ), and cell migration or cytoskeletal organization (TJP1, GSN and ENC1). These results suggest that, despite the presence of unique mutations in the tumour periphery in six of eight tumours, the enhanced cellular proliferation and signalling activity represent mainly functional events. An analysis of the mutant allele frequencies between the centre and periphery showed no major differences, thus further supporting the notion that our findings were not due to unequal admixture of non-tumorous cells (Supplementary Fig. 4).
To test whether specific mutational processes are active in either the central or the peripheral zone of the tumours, the   ARTICLE triplet distribution of the SNVs was analysed per patient (Fig. 8a) and per stratum (Fig. 8b). Pair-wise comparisons of the patientspecific distributions revealed cosine similarities between 0.107 and 0.477, while the cosine similarity between the merged sets of central and peripheral SNVs was 0.514 (Fig. 8c), indicating higher differences in the SNV spectra between the patients than between the tumour peripheral and central areas.
Taken together, whole-exome sequencing analysis demonstrates the presence of central and peripheral zone-specific mutations, while mutational signatures do not differ spatially. Given the presence of driver mutations in both the peripheral and central areas and the absence of recurrent mutations in cell cycle or other signalling-related genes in the peripheral zone specimens, we conclude that the enhanced proliferation rate and signalling activity in the tumour peripheral zone represent mainly functional events.

Discussion
Renal cancer is a prototypical example of a tumour entity that is characterized by extensive genomic ITH 5,6 . In the present report, we use quantitative image analysis and whole-exome sequencing to show that functional as well as genomic ITH can be detected at similar levels in primary ccRCCs of the lowest stage and advanced, metastatic tumours. ITH is hence not a result of malignant progression, but rather represents a feature of malignant growth per se. To explain these findings, we demonstrate that functional ITH follows a zonal pattern with peak proliferative and signalling activities almost exclusively in the tumour periphery. Genomic ITH was found to follow this topological pattern, since we detected various periphery-and centre-specific functional SNVs and indels in our whole-exome sequencing analysis. Functional and genomic niche formation may hence represent an early and general consequence of an uncoordinated, three-dimensional growth of a tumour within the architectural confinement of an established organ. Since such niches, that is, tumour centre and periphery, are maintained during malignant progression, the degree of ITH is comparable between small and large tumours. Our results provide strong support for the notion that genomic analyses of cancer need to take spatial differences into account.
Our whole-exome sequencing results do not contradict previous studies showing branched and converging evolution in ccRCC 5,6 , but they underscore that these evolutionary processes are shaped by topological niches and overlaid by functional heterogeneity. The fact that we found periphery-specific mutations could argue for a model in which tumour subclones with certain mutations that confer a selective advantage are able to populate this region ('clone drives niche'). If the peripheryspecific mutations detected do in fact confer such a growth advantage remains to be determined. However, the absence of recurrently mutated genes that would directly explain the consistently higher proliferative and signalling activity and the absence of periphery-specific mutations in three tumours suggest an important role of microenvironmental factors in niche formation and ITH ('niche drives clone'). In this scenario, the enhanced proliferation and signalling activity in the peripheral zone would represent mainly functional events. A zonal tumour growth pattern has previously been suggested by mathematical modelling as well as three-dimensional in vitro co-culture experiments using breast cancer cells [14][15][16] . Since truncal drivers were predominantly found in both niches, one possible explanation for our results is that proliferation is limited in the tumour centre, for example, by spatial or other constraints, and that these constraints become more relaxed towards the tumour periphery. How this relaxation is achieved is currently unclear, but there are several possible scenarios including differences in the level of oxygen, growth factors, cytokines or interstitial fluid pressure 17 . Although we did not detect any significant differences in microvessel density in vital tumour areas of the centre and the periphery, it is still possible that local oxygen tension and/or cytokine/chemokine levels play a role, since vessel density and nutrient supply are frequently uncoupled in malignant tumours 18 . Another possibility is the 'empty site' model that has recently been proposed and in which tumour cells proliferate preferentially when neighbouring non-tumour cells or the extracellular matrix 14 . A third possibility to explain the intratumoural niche formation shown here is co-evolution of  TASP1  IFT52  COPS7B  DICER1  DNAH8  ABCC6  ACAT2  ADAM19  ASH1L  ATXN2  CARD11  COG6  CPNE3  DSCAML1  HNRNPK  HSPG2  KCNQ3  KRT18  LOXL3  MEGF8  NCOA7  PDK2  RENBP  RTP4  SPTBN5  VTCN1  YME1L1  ZNF462  ZNF780B   AHR  BAP1  NIN  RADIL  SLC4A4  STAG1  XAB2  ACAD11  PDE2A  TMEM92  ABCB8  CLTC  DLAT  NRK  PPWD1  ZNF552  AL133481.1   FAT3  GGT5  IFIT1  LPPR4  MTUS1  NIPAL4  PLEKHO1   RIN1  SETD2  SLC35B2  SLC39A1  SYT10  TRIM72  YWHAE  DPCR1  DRICKLE3   PXDN  SYT8  TRIOBP Driver mutations  Periphery-specific functional SNVs and lndels   RCC001  RCC002  RCC003  RCC004  RCC005  RCC006  RCC007  RCC008   CARM1  GATA3  LDB2  OR52E8  POLQ  PRKG1  PRX  TH  TJP1  ZAN  GSN  AHSG  NEB  ZNF665  KLHL28  PPP1R3F  SLC26A5  ZNF831  ENC1 ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  TTT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  TTT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  TTT   ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ACA  ACC  ACG  ACT  CCA  CCC  CCG  CCT  GCA  GCC  GCG  GCT  TCA  TCC  TCG  TCT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  TTT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  TTT  ATA  ATC  ATG  ATT  CTA  CTC  CTG  CTT  GTA  GTC  GTG  GTT  TTA  TTC  TTG  tumour and stroma cells 19 , which would be favoured in the tumour peripheral zone. It needs to be emphasized that these mechanisms may not function independently, but may rather represent co-operating mechanisms for topological functional and genomic ITH.
Despite an enhanced proliferation in the tumour periphery, the distributions of nucleotide exchanges in their triplet contexts were more similar for the two strata than between the different patients suggesting that the mutational processes are similar in the tumour centre and periphery.
Spatial differences in ITH have been reported not only in ccRCC 5,6 but also in other tumour entities such as prostate [20][21][22][23] , lung 24,25 and breast cancer 26,27 . Although we did not perform multiregion sequencing and focused on relatively large single tumour regions instead, our results are in line with studies showing ITH in ccRCCs of different stage 6 . Interestingly, the application of biodiversity scores such as the Shannon or Simpson index, which are also used in the present study, has previously shown a similar extent of ITH between various breast cancers 26,27 . The global extent of ITH may hence not be a suitable prognostic marker, not only for RCC but also for other tumours. Nonetheless, our results encourage a more refined approach to characterize ITH under inclusion of spatial differences and intratumoural niches. In this context, it is important to point out that there is heterogeneity even within the peripheral zone, as shown by the finding that phospho-mTOR S2448 and phospho-S6RP S235/236 occupied different areas within the peripheral zone, that is, direct expression at the invasion front and expression a few cell layers away from the invasion front as shown in Fig. 6. This is remarkable since both proteins can be phosphorylated by ribosomal S6 kinase 1 (ref. 28). However, mTOR phosporylation has been implicated in a negative-feedback loop 28 , and our finding may represent a reflection of this regulatory mechanism.
Although we used HIF protein expression as surrogate for VHL loss, it is noteworthy that the expression of HIF proteins is also affected by other factors, for example, loss of chromosome 14q that harbours the HIF1A gene. HIF protein expression has been shown to be heterogeneous in the past 8 and, furthermore, high expression of HIF-1a has been reported at the invading edge of malignant tumours similar to the results shown here 29 . HIF overexpression has been linked to the loss of the adhesion molecule E-cadherin 30,31 resulting in an altered migratory behaviour, which has recently been suggested as a key factor for malignant progression 14 . Hence, potential effects of a pharmacological inhibition of migratory behaviour of tumour cells to prevent malignant progression deserve further attention. Additional translational implications of our finding would be that the use of an mTOR inhibitor could specifically target peripheral zone cells with high proliferative capacity.
In summary, our results indicate the presence of topological niches within single regions of primary ccRCCs that not only harbour region-specific mutations but are also characterized by tumour clones with unique functional properties. We provide unexpected insights into the architecture of malignant tumours and clonal evolution that have important implications for the analysis of cancer genomes in general. An integrated analysis of functional and genomic ITH appears to be required for a comprehensive analysis and better understanding of cancer genome evolution and malignant progression.
Image quantification. Stained sections were scanned using a Nanozoomer 2.0-HT Scansystem (Hamamatsu Photonics) to generate digital whole-slide images. Quantification of the digital images was performed using the VIS software suite (Visiopharm, Hoersholm, Denmark). To facilitate the comparison of identical tumour regions stained on separate slides with different antibodies, adjacent tissue sections were aligned and virtually segmented into squares of 1 mm 2 in size. Staining results of each square were computed using either the PNC algorithm (HIF-1a, HIF-2a and Ki-67) or the PPC algorithm (phospho-mTOR S2448 and phospho-S6RP S235/236). The latter algorithm was designed to measure every pixel per 1 mm 2 square based on its spectral characteristics. Non-cancerous regions were excluded from quantification. Non-malignant cells within a cancerous area (for example, endothelial cells or immune cells) were not excluded under the assumption that the proportion of malignant versus non-malignant cells is comparable across tumours. Staining results for each square were expressed as either percentage positive nuclei or percentage positive pixels. If the 1 mm 2 square had to be reduced because of non-cancerous parts, the percentage of positive pixels was normalized to the corrected area. Staining results were then depicted as heat maps.
To assess and compare heterogeneous staining, two different heterogeneity scores were used. First, the s.d. that expresses the distribution of positivity of single squares compared with average and, second, the difference between the maximum and the average positive value of squares (MAX-m) which serves to highlight outliers within a tumour area. For the purpose of validating the used heterogeneity scores, we also applied two established biodiversity indices to our data set, that is, the Shannon and inverse Simpson index. Both indices estimate biodiversity based on the number of data categories s ( ¼ species) and the quantity of squares ( ¼ individuals) of their respective data categories. To compare the biodiversity indices between the five different biomarkers, data categories were defined dependent on the range of % PP or % PN, respectively. % PN of HIF-1a and HIF-2a range from 0 to 100% and 5% intervals were used to reach a maximum of 20 data categories per sample. % PP of phospho-mTOR S2448 and phospho-S6RP S235/236 as well as % PN of Ki-67 range roughly from 0 to 25% and therefore 2% intervals were defined to reach the same maximum of data categories (s ¼ 20).
The Shannon index assumes that all data categories are represented in a given sample and that they are randomly distributed: p is the proportion of squares of one specific data category, divided by the total number of squares found in a given sample, and s is the number of data categories. The inverse Simpson index gives more weight to common data categories. Therefore, rare data categories with only a few individuals ( ¼ outliers) will not affect the diversity.
The peripheral tumour zone was defined as outermost squares of each tumour. Small lesions in which only two layers of squares were present (and hence by definition no central zone; n ¼ 3) were excluded from the analysis.
Whole-exome sequencing. DNA extraction and whole-exome sequencing were performed by GATC Biotech (Konstanz, Germany) on an Illumina HiSeq 2500 platform (InView Human Exome). Fastq files were aligned against the human reference genome (build 37, version hs37d5), using bwa mem (version 0.7.8 with minimum base quality threshold set to zero [-T 0] and remaining settings left at default values), followed by coordinate sorting with bamsort (with compression option set to fast (1)) and marking duplicate read pairs with bammarkduplicates (with compression option set to best (9); both part of biobambam package version 0.0.148). The generated bam files were used to identify SNVs by applying samtools mpileup 33 (version 0.1.19 with parameter settings -REI -q 30 -ug) with output piped to bcftools view (version 0.1.19 with parameters settings -vgN -p 2.0). Determination of high-quality sample-specific SNVs was largely done as described in ref. 34 with additional classification into 96 categories according to their triplet of the base preceding the SNV, the actual transition/transversion and the following base. For each class, it was determined whether its SNVs exhibit a sequencing strand bias. For classes with bias, SNVs with less than two supporting reads on the strand opposite to the bias were filtered out. SNVs were called functional, when they were nonsynonymous, occurred within splice sites or lead to gain or loss of stop codons. For each category, SNV counts were determined and corrected for the triplet distribution of the human exome using the genomic coordinates of the used target capture kit (Agilent 5 without untranslated regions). Normalized counts were grouped into the two different strata (central and peripheral) and analysed both per patient to get eight patient-specific distributions and for all patients together to get two strata-specific distributions. The distributions were compared using pair-wise cosine similarity as previously described 13 .
Statistical analysis. The data are summarized as median±interquartile range. The D'Agostino-Pearson omnibus test was used to determine whether the data within are consistent with a Gaussian distribution. Statistical analyses were performed with the Mann-Whitney, Kruskal-Wallis, Fisher exact test and Bernoulli trial, as appropriate. Bernoulli trial was used with probability of success of 0.51 because the number of squares in the tumour periphery versus the centre were similar (4,475 versus 4,425). Statistical significance was accepted at values of Po0.05. Correlation analyses were performed using Spearman correlation for non-parametric variables. Degrees of correlation were considered as followed: 0.0-0.2 (no correlation), 0.2-0.4 (weak correlation), 0.4-0.7 (moderate-to-good correlation) and 0.7-1.0 (good-to-very good correlation).
Data availability. Genome sequencing data are deposited at the European Genome-phenome Archive (EGA, http://www.ebi.ac.uk/ega/), which is hosted at the EBI, under accession number EGAS#00001001784. All other relevant data supporting the findings of this study are either included within the article and its Supplementary Information files or available upon request from the corresponding author.