Spatial architectures of somatic mutations in normal prostate, benign prostatic hyperplasia and coexisting prostate cancer

This study aimed to identify somatic mutations in nontumor cells (NSMs) in normal prostate and benign prostatic hyperplasia (BPH) and to determine their relatedness to prostate cancer (PCA). From 22 PCA patients, two prostates were sampled for 3-dimensional mapping (50 normal, 46 BPH and 1 PCA samples), and 20 prostates were trio-sampled (two normal or BPH samples and one PCA sample) and analyzed by whole-genome sequencing. Normal and BPH tissues harbored several driver NSMs and copy number alterations (CNAs), including in FOXA1, but the variations exhibited low incidence, rare recurrence, and rare overlap with PCAs. CNAs, structural variants, and mutation signatures were similar between normal and BPH samples, while BPHs harbored a higher mutation burden, shorter telomere length, larger clone size, and more private NSMs than normal prostates. We identified peripheral-zonal dominance and right-side asymmetry in NSMs, but the asymmetry was heterogeneous between samples. In one normal prostate, private oncogenic RAS-signaling NSMs were detected, suggesting convergence in clonal maintenance. Early embryonic mutations exhibited two distinct distributions, characterized as layered and mixed patterns. Our study identified that the BPH genome differed from the normal prostate genome but was still closer to the normal genome than to the PCA genome, suggesting that BPH might be more related to aging or environmental stress than to tumorigenic processes.


Tissue specimen
For two PCAs, gland epithelial cells were microdissected by a pathologist from 58 areas (1 tumor, 46 BPH, and 11 normal areas) of one patient (PCA-28) and from 39 normal areas of the other patient (PCA-49) (fig. S1). We did not analyze the tumor of the latter case because the tumor area was not available in the biobank blocks. These microdissection areas were widely distributed across the zones (peripheral zone (PZ), transition zone (TZ), and central zone (CZ)) and the anatomical positions (horizontal and vertical). The distance between any two areas was at least 0.2 mm. Fresh frozen tissues were cut at 30 µm thickness and stained with hematoxylin for 10 seconds without any fixation. On the assumption that epithelial cells in close proximity are affected by similar mutagenic stimuli, we arbitrarily defined the unit of a microdissection area as a continuous thickness and width of epithelial cells under the microscope.
For this, gland epithelial cells for each microdissection area were procured from nine serial sections by manually controlled microdissection under the microscope 1, which collected 5,000-10,000 cells for each area.
For another 20 cases (20 tumor, 13 BPH, and 27 normal areas), we analyzed epithelial cells from one normal or BPH area (N1) and one tumor area (T) on the ipsilateral side, and from another normal or BPH area on the contralateral side (N2) (i.e., trio samples). The distances between N1 and T, N2 and T, and N1 and N2 were at least 2 mm, 5 mm, and 5 mm, respectively. The microdissected cells were incubated overnight in proteinase K-containing buffer and used for WGS after heat inactivation; 5,000-10,000 cells were collected for each area 1.

Panel sequencing data generation and processing
We performed targeted sequencing of DNA from microdissected tissue samples using the OncoChase cancer panel (ConnectaGen, Seoul, Korea). Sequencing libraries were generated using the Ion AmpliSeq Library Kit 2.0 (Thermo Fisher Scientific) and the Ion Xpress barcode adapter kit (Thermo Fisher Scientific) according to the manufacturer's instructions. The sequencing libraries were normalized for templating on the Ion Chef (Thermo Fisher Scientific) and subsequently sequenced on the Ion S5 system (Thermo Fisher Scientific). Torrent Suite software version 5.12.1 (Thermo Fisher Scientific) was used to align raw sequence reads to the human genome (hg19) and to detect genomic variants.

Somatic copy number analysis
For somatic CN alteration (SCNA) detection, we used the CN value of 2N data as a baseline for 1N and 1T samples, and 1N data as a baseline for SCNA detection in 1N. In the case where the 2N was PIN, we used 1N as a baseline for the analysis of 1T. In cases where multiple regions were sampled, we assigned one diploid genome sample as the baseline data.
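The baseline rules above can be illustrated with a minimal sketch. It assumes per-window read depths are already available for the target and baseline samples; the function names (`pick_baseline`, `log2_ratios`) and the median normalization are illustrative assumptions and are not taken from the pipeline used in this study. Only the two baseline rules that are stated unambiguously for trio samples are encoded.

```python
import math
from statistics import median

def pick_baseline(target, n2_is_pin=False):
    """Return the baseline sample label for a trio sample (hypothetical helper).

    Encodes only the rules stated in the text: 2N is the baseline for 1N and 1T,
    and 1N replaces 2N as the baseline for 1T when 2N is PIN.
    """
    if target == "1T":
        return "1N" if n2_is_pin else "2N"
    if target == "1N":
        if n2_is_pin:
            # The text does not say which baseline applies to 1N in this case.
            raise ValueError("baseline for 1N is not specified when 2N is PIN")
        return "2N"
    raise ValueError(f"unsupported target sample: {target}")

def log2_ratios(target_depth, baseline_depth):
    """Median-normalized log2 depth ratios per window (assumed normalization)."""
    t_med = median(target_depth)
    b_med = median(baseline_depth)
    ratios = []
    for t, b in zip(target_depth, baseline_depth):
        if t == 0 or b == 0:
            ratios.append(None)  # window not usable for a log ratio
        else:
            ratios.append(math.log2((t / t_med) / (b / b_med)))
    return ratios

# Toy example: the third window has half the baseline depth (one-copy loss).
tumor = [30, 30, 15, 30]
normal = [30, 30, 30, 30]
print(pick_baseline("1T"), log2_ratios(tumor, normal))
```

In this toy input, a window with half the normalized depth of the baseline yields a log2 ratio of -1, which is the value expected for a clonal one-copy loss in a diploid genome.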

Genomic rearrangements analysis
To identify initial somatic structural variants (SVs), we utilized Delly v2.0 2. We removed recurrent artifactual SV calls based on a panel-of-normals SV dataset established from the nonneoplastic samples in this study. Discordant read pairs and soft-clipped reads were thoroughly reviewed by screening aligned reads in regions of interest from raw SV calls, following the process of Park et al. 3. We clustered SVs based on spatial proximity, with a cut-off of < 5 Mb for the interbreakpoint distance. A final SV cluster with more than 10 SVs in a clone was further classified into types of complex genomic rearrangements based on the criteria of Park et al. 3.
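The proximity-based clustering step can be sketched as follows. The text specifies only the 5 Mb interbreakpoint cut-off and the threshold of more than 10 SVs per cluster; the single-linkage grouping with a union-find structure, the tuple layout of an SV, and the function name `cluster_svs` are assumptions made for illustration and may differ from the procedure actually applied.

```python
from collections import defaultdict

MAX_GAP = 5_000_000      # interbreakpoint distance cut-off (< 5 Mb), from the text
MIN_COMPLEX_SVS = 10     # clusters with more than 10 SVs are classified further

def cluster_svs(svs, max_gap=MAX_GAP):
    """Group SVs whose breakpoints lie within max_gap on the same chromosome.

    svs: list of (chrom1, pos1, chrom2, pos2) tuples (assumed layout).
    Returns a sorted list of clusters, each a list of SV indices.
    """
    parent = list(range(len(svs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Each SV contributes two breakpoints that carry the SV index.
    breakpoints = []
    for idx, (c1, p1, c2, p2) in enumerate(svs):
        breakpoints.append((c1, p1, idx))
        breakpoints.append((c2, p2, idx))
    breakpoints.sort()

    # Link SVs whose neighbouring breakpoints are closer than the cut-off.
    for (ca, pa, ia), (cb, pb, ib) in zip(breakpoints, breakpoints[1:]):
        if ca == cb and pb - pa < max_gap:
            union(ia, ib)

    groups = defaultdict(list)
    for idx in range(len(svs)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())

# Toy example with made-up coordinates.
example = [
    ("chr1", 1_000_000, "chr1", 2_000_000),
    ("chr1", 4_000_000, "chr5", 10_000_000),
    ("chr5", 12_000_000, "chr5", 13_000_000),
    ("chr8", 50_000_000, "chr8", 60_000_000),
]
clusters = cluster_svs(example)
complex_candidates = [c for c in clusters if len(c) > MIN_COMPLEX_SVS]
print(clusters, complex_candidates)
```

With single linkage, a translocation joins breakpoints on two chromosomes into one cluster, which is why the first three toy SVs are grouped together while the isolated chr8 event remains on its own.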
We manually reviewed several important SVs, such as ETS fusions and SV events in nonneoplastic clones, using IGV software 4. The Circos program was used to visualize SV clusters along with the CN profile of the 100-kb window-averaged coverage depth, with each SV cluster shown in a different color.

Supplementary Figure 1. Microdissection of normal epithelial cells of a prostate gland. a Left column: a prostate gland lined by epithelial cells (red arrows in the box) and subepithelial connective tissue (black arrows). b The epithelial cells are partially detached by microdissection (red arrows), leaving the connective tissue (black arrows).

Supplementary Figure 2. Comparison of somatic profiles by histological types. a-b Comparison with previous studies. The somatic single base substitution (SBS) burden (a) and estimated telomere length (b) analyzed by WGS are compared with those of previously published data. a An increase in the SBS burden during normal, benign prostatic hyperplasia (BPH), and prostate cancer (PCA) progression in the current study. Mutation burden of BPH samples from Liu et al. was extrapolated from whole-exome sequencing. b A decrease in the telomere length during normal, BPH, and PCA progression in the current study. PCAWG: Pan-Cancer Analysis of Whole Genomes; PRAD: prostate adenocarcinoma. c The number of structural variations is higher in PCA than in normal and BPH. d The fraction of genome altered is higher in PCA than in normal and BPH. Age-corrected linear regression analysis was performed.

Supplementary Figure 3. Genomic profiles in two prostates with spatial 3D sequencing. a The somatic single base substitution (SBS) burden and estimated telomere length in multi-region sampling cases (PCA-28 and PCA-49), showing anti-correlation. b Mutational signatures of PCA-28 and PCA-49. Each bar represents one sample. c The proportions of APOBEC and ROS signatures do not differ between normal and BPH samples in PCA-28.

Supplementary Figure 10. Phylogenetic trees reconstructed with somatic mutations. Each phylogenetic tree of three clones, including the PCA clone, was constructed with the maximum likelihood algorithm of the MEGA X program. Age at diagnosis and major driver alteration of the PCA clone are shown. 1N: PCA-close normal or BPH. 1T: PCA. 2N: PCA-away normal or BPH. BPH is shown with an orange dot.

Supplementary Figure 11. Contribution of aging and ROS signatures in PC and SC clusters in PCA samples. Private clonal (PC) mutations of PCA samples showed a significant level of ROS signature.

Supplementary Figure 12. Dissection and pathologic review of two spatial sequencing cases. a 58 areas sampled and sequenced in PCA-28. b 48 areas sampled and sequenced, and 39 areas used for analysis, in PCA-49.