Single-cell analysis supports a luminal-neuroendocrine trans-differentiation in human prostate cancer

Neuroendocrine prostate cancer is one of the most aggressive subtypes of prostate tumor. Although much progress has been made in understanding the development of neuroendocrine prostate cancer, the cellular architecture associated with neuroendocrine differentiation in human prostate cancer remain incompletely understood. Here, we use single-cell RNA sequencing to profile the transcriptomes of 21,292 cells from needle biopsies of 6 castration-resistant prostate cancers. Our analyses reveal that all neuroendocrine tumor cells display a luminal-like epithelial phenotype. In particular, lineage trajectory analysis suggests that focal neuroendocrine differentiation exclusively originate from luminal-like malignant cells rather than basal compartment. Further tissue microarray analysis validates the generality of the luminal phenotype of neuroendocrine cells. Moreover, we uncover neuroendocrine differentiation-associated gene signatures that may help us to further explore novel intrinsic molecular mechanisms deriving neuroendocrine prostate cancer. In summary, our single-cell study provides direct evidence into the cellular states underlying neuroendocrine transdifferentiation in human prostate cancer.


Introduction
Lineage plasticity endows cancer cells with the ability to switch their cellular phenotype [1] and is often associated with more aggressive stages of cancers [2]. In prostate cancer, lineage plasticity contributes to the acquisition of the neuroendocrine (NE) phenotype [3][4][5], with the emergence of a highly aggressive variant, termed neuroendocrine prostate cancer (NEPC) [6]. Current studies support that NEPC tumors arise clonally from prostate adenocarcinoma (PCA) [7], accompanying with a phenotypic transition from acini epithelial tumor cells to NE-like tumor cells [8]. This lineage transition enables tumor cells to evade androgen receptor (AR) pathway inhibitors such as enzalutamide by shedding their dependence on the AR pathway [4,9]. Consequently, tumors develop resistant to the traditional androgen dependent therapy (ADT) and thus became one subtype of castration-resistant prostate cancer (CRPC), leading to the most lethal stages of this disease [10,11]. Therefore, understanding the cellular and molecular basis underlying neuroendocrine differentiation (NED) of prostatic tumor cells is of important clinical significance.
In the last decade, the molecular features of NEPC have gradually come to light, including genomic loss of RB1 and TP53 [12], and amplification of MYCN [9,13,14]. In addition, reprogramming to an NEPC state is also linked to overexpression of neural progenitor-associated genes such as SOX2 [5,15], MYCN [9,13,14], EZH2 [16], POU3F2 [17], FOXA2 [18,19] and SIAH2 [19]. Many NEPC drivers such as SOX2 and MYCN have also been reported to be es-sential for maintaining cell stemness [20,21], raising the possibility that the emergence of NEPC is associated with acquisition of stem-like properties. Interestingly, we previously showed that SOX2 is normally expressed in prostatic basal epithelial cells and a small population of luminal cells [22], highlighting a potential role of normal SOX2-expressing epithelial cells in deriving NED.
Other findings also suggest that acquisition of NE phenotype of the prostate cancer cell is likely to link with epithelial-to-mesenchymal transition (EMT) state that could be both induced upon down-regulation of AR signalling [4,23].
On the other hand, several groups have attempted to uncover the cell of origins of focal NED and even NEPC. The normal prostate gland consists predominantly of cells of the luminal and the basal compartment with a small minority of NE cells that are scattered between the luminal and the basal cell compartment [24]. As normal NE cells share many features with malignant NE cells (for example, expressing SYP and CHGA), it has been proposed that NEPC might arise from transformed NE cells [25]. However, genomic studies seem move supportive of an epithelial origin of NEPC, given that NEPC showed significant genomic overlap with PCA, such as TMPRSS2-ERG fusion [26,27]. Within the prostatic epithelial cell compartments, both luminal and basal epithelial cells have been shown to be capable of generating prostate cancer and even NEPC. For example, Zou et. al. [28] have demonstrated that focal NED, as well as eventual well-differentiated neuroendocrine disease directly arises via transdifferentiation from luminal adenocarcinoma cells. In contrast, Lee et. al. [29] have recently reported that basal cells can directly give rise to NE cells during prostatic tumorigenesis without undergoing an intermediate luminal state. In addition, there are some studies have suggested that NE cells derived from basal cells exhibit a loss of basal features and upregulation of luminal features during NED [13,30]. Overall, there is no consensus on the cellular characteristics during the transition from epithelial tumor cells to neuroendocrine (NE) tumor cells.
Gene expression is a key determinant of cellular phenotypes. Previous population-based RNA sequencing (RNA-seq) method has been performed to compare the transcriptional similarity between prostatic basal and luminal epithelial cells and suggested that metastatic NEPC molecularly resembled stem cell in basal compartment [31,32]. Recent advance in single-cell RNA sequencing (scRNA-seq) technology has greatly empowered us to gain a more comprehensive understanding of the transcriptional signatures of distinct subpopulations of epithelial cells in human and mouse prostate [33][34][35][36].
However, a detailed analysis of the cellular states of NED in primary human prostate cancer at single-cell resolution is still lacking. Herein, we apply scRNA-seq technology to determine the cellular identity associated with NED in human prostate cancer. Our datasets reveal that a luminal epithelial state is highly linked with of NED of prostate cancer cells. Furthermore, we show by intra-tumoral RNA velocity analysis that the NE cells are directly generated by luminal-like adenocarcinoma cells. Finally, we dissect the transcriptomic landscape underlying NED and validate single-cell derived NED-related gene signatures in bulk RNA samples. Altogether, our results support the epithelial-NE transdifferentiation model regarding the NED in human prostate cancer and offer fresh insights into cellular states and molecular features associated with this process.

Single-cell transcriptional profiling of biopsies from 6 CRPC
Given that focal NED can be more frequently detected in patients with advanced prostate cancer undergoing ADT but not in primary prostatic adenocarcinoma [37][38][39], we sought to perform scRNA-seq on tumor biopsies from CRPC patients. In this study, we isolated fresh cells from six CRPC patients, four out of whom were found to have low PSA levels (<20 ng/ml; Table 1, Fig. 1A and supplementary fig. 1), indicating a higher likelihood of having NED. In these patients, three had received the first-line therapy of the LHRH analogue goserelin coupled with the AR inhibitor bicalutamide, two had underwent surgical castration coupled with bicalutamide, while the remaining one was diagnosed as small-cell NEPC at the beginning and treated with chemotherapeutic drug docetaxel. By pathological examination, biopsy tissues from three patients (#2, #5 and #6) displayed cellular morphology resembling small cell carcinoma and biopsies from patient #1 and #4 presented a classical PCA phenotype (Fig. 1B). However, biopsy from patient #3 was characterized as prostatic intraepithelial neoplasia, which may due to the inaccuracy of the biopsy procedure. The clinical and pathologic features of the biopsy samples are summarized in Table1.
Then, single-cell suspension from each tissue was subjected to scRNA-seq by a 10x Genomics-based single-tube protocol with exclusive transcript counting through barcoding with unique molecular identifiers [40].

NE cells present an epithelial phenotype
Next, in keeping with our aim to characterize NED, we defined a NE index using 14 well-known NE markers that have been previously characterized as biomarker or driver genes of NEPC, such as ASCL1, CHGA/B, FOXA2, SOX2 [4,14,16,18,19,41,42]. We scored each cell using this NE gene set (Supplementary table3). In line with the pathological results, this analysis identified obvious NED in three patients (patient #2, #5 and #6; Fig. 1C). Notably, we found that NE high cell population detected in these three patients all belong to the epithelial cells instead of the non-epithelial cell compartments ( Fig. 1C and   supplementary fig. 2C, 2D), supporting an epithelial origin of NED. In addition, we noticed that majority of epithelial cells from patient #2 and #5 were scored for a NE phenotype, while only part of epithelial cells from patient #6 have a NE phenotype (Fig. 1C), manifesting different extent of NED among these three patients. Taken together, single-cell analysis showed that three patients likely have NED and suggested an epithelial origin of NED in human prostate cancer.

NE cells present a malignant luminal-like phenotype
Having characterizing an epithelial phenotype of NED, we next focused on epithelial compartment by computationally removing all non-epithelial cells. In order to gain more insight into the molecular features of NED in each patient, we then scored each cell for epithelial basal/luminal lineage markers [35], AR signature genes [41,43,44], EMT as well as stem cell genes expression [45].
Pairwise correlation analysis of all epithelial cells revealed that nearly all of epithelial cell from two patients (patient #2 and #5) represents an obvious NED phenotype ( Fig.2A, 2B and supplementary table4). Epithelial cells of patient #6 were divided into two main groups: a small population of NE-like cells and the remaining majority of NE -AR high cells, illustrating significant intra-tumoral heterogeneity regarding NED ( Fig. 2A-2C). By analysing cellular phenotypes/states, we found that all NE cells prominently exhibited a luminal phenotype rather than basal phenotype ( Fig.2A, 2B). Of note, AR scores were ex-tremely low in NE cells, which is consistent with previous findings that AR signalling activity is downregulated in NEPC [46]. As introduced earlier, the EMT process and stem cell state have also been proposed to participate in NEPC formation [23]. However, our analysis demonstrated that only NE cells from patient #5 displayed higher EMT and stemness signature scores.
We further interrogated malignant identity of NE cells by performing inferred copy number variation (CNV) analysis on the basis of the average expression of 101 genes in each chromosomal region [47,48]. We used the normal prostate epithelial cells from Henry dataset as "reference" cells [35], such that their average CNV value was subtracted from all cells. Inferred CNV analysis showed that most NE cells exhibited remarkable CNVs, indicating their malignant identity ( Fig.2D and supplementary fig. 3). Interestingly, most basal-like epithelial cells lacked CNVs, and thus likely representing a group of normal epithelial cells (Fig. 2E). In addition, we noticed that the epithelial cells of patient #3 had very few CNVs consistent with its histologically intraepithelial neoplasia characteristic (Supplementary fig. 3). Taken together, preliminary analyses revealed a malignant feature of NE cells and suggested a link between NED and a luminal-like AR low/phenotype.

Intra-tumoral analyses identify different extent of focal NED
To better understand the extent of NED in each individual tumor, we next in- The most interesting observation was from patient #4, a histologically diagnosed adenocarcinoma, in which we found that when compared with other epithelial clusters, cluster 5 preferentially expressed NE markers CHGA and SYP (Fig. 3B), probably representing a population of early NE precursors.
These observations were further validated by IHC assays for two NE markers (SYP and SOX2) and AR in sections from five samples (Fig. 3C). For instance, we detected a minority of scattered SYP + NE cells in section from patient #4， which may corresponded to the cells of cluster 4 revealed by single-cell anal-ysis. In addition, IHC analyses of patient #5 samples also showed an overall good concordance with the single-cell transcriptional profiles that SOX2 was intensively expressed, while another NE marker SYP was almost undetectable (Fig. 3C). Thus, intra-tumoral analysis confirmed NED in three patients (patient #2, #5 and #6) and enabled us to detect NE cells in a PCA (patient #4).

Epithelial cellular relationships in patient #4
We next paid specific attention to patient #4, given that the NE-subpopulation Immunofluorescence analysis of SYP and KRT8 further validated a luminal phenotype of SYP-expressing cells (Fig. 4C). Interestingly, the early NED cells and luminal cells shared almost the same CNV pattern, indicating that they had a common clonal origin ( Fig. 4A and supplementary fig. 4A). In contrast, basal cells in this sample displayed very few CNVs. Thus, the separation of different epithelial subtypes may reflect their marked genomic differences.
A closer relationship between NE cells and luminal-like malignant cells was further supported by visualization using Partition-based approximate graph abstraction (PAGA) [49] (Fig. 4D). To deepen our understanding of the dynamics of epithelial to NE transition, we next applied RNA velocity analysis that predicts the future state of an individual cell by leveraging the fact that newly transcribed, unspliced pre-mRNAs and mature, spliced mRNAs can be distinguished in common single-cell RNA-seq protocols [50]. Notably, unlike many other existing computational methods [51], RNA velocity analysis doesn't rely upon a priori knowledge to set up the starting cell for inferring the trajectory and thus enable us to more unbiasedly and accurately predict the cellular differentiation trajectory. Given the heterogeneous epithelial composition, we utilized scVelo, a likelihood-based dynamical model that has recently be introduced to solves the full gene-wise transcriptional dynamics [52]. This analysis clearly showed positive velocity from luminal malignant cells (cluster 3) towards early NED cells (cluster 5; Fig. 4E). In contrast, KRT5 + basal and UPK1A + urothelial-like cells were clustered far from NED cells and did not show a tendency to progress into SYP + cells. Therefore, this finding suggested that luminal-like malignant cells may serve as the direct progenitor cells responsible for early NED in this patient we analysed here.

TMA analysis confirms the prevalence of luminal-like NED phenotype
We next validated the generality of this observation in a large population using clinical PC TMAs, which contained 297 cancer tissues (280 PCA, 10 CRPC and 7 NEPC) (Supplementary Table6). We carried out triple IF staining for K18, K5 and SYP to evaluate the basal/luminal phenotypes of NE cells (Fig. 4F).

Consequently, we detected SYP-positive cells in 102 tumors, of which 81%
were K18 + K5 -SYP + , and 5% exhibited both K18 + K5 -SYP + and K18 -K5 -SYP + characteristics (Fig. 4G). Notably, no K18 -K5 + SYP + cells were found in any of the 297 cancer tissues. This analysis therefore verified that NED precursors in human prostate cancer had a prevalent luminal phenotype. Of interest, a substantial number of the SYP-expressing tumor specimens came from patients who had not received any therapy (96/102), demonstrating that NED in fact occurred much earlier than the development of castration resistance, which is in line with previous findings that neuroendocrine differentiation is present in 10% to 100% of localized PCAs and increases with disease progression [53,54]. Taken together, TMA analysis confirmed the single-cell results, demonstrating that NED in human prostate cancer was primarily presented as a luminal feature.

Epithelial cellular relationships in patient #6
Similar to patient #4, epithelial cells of patient #6 also showed intra-tumoral NED heterogeneity, which was composed of a small population of NE cells  (Fig. 5C), showing that NE cells in this sample still connected with luminal-like tumor cells. We next inferred cellular dynamics using RNA velocity, which predicated similar cellular processes that NED in this sample was exclusively branched from luminal cells (Fig. 5D). We further sought to identify genes that display pronounced dynamic expression patterns linked to the transition state toward a NE fate (Supplementary table7). As expected, signatures of AR signalling such as KLK2 and KLK3 were notably downregulated along with the emergence of NE phenotype (Fig. 5E). We then paid particular attention to genes that were positively correlated with NED. Within the top-ranked likelihood genes, we found ASCL1, a key transcription factor for neuronal differentiation [55], which has also been associated with NED in prostate cancer [56] (Fig. 5E). In addition, this analysis also illustrated many unknown genes that might serve as the potential drivers or biomarkers of the NED transdifferentiation, for example, VGF, SCGN and PAPPA2, the roles of which in NEPC have not been reported. Altogether, deeper analyses of epithelial cell relationships in this sample also suggested that malignant cells with a luminal phenotype fuels the development of NE cells.

Identifying NED-associated gene meta-programs
We next sought to understand the underlying molecular features associated with NED. For this purpose, we applied non-negative matrix factorization (NMF) to define underlying transcriptional programs specific to the epithelial cells from each tumor [57,58] (Fig. 6A and supplementary table8). To relate these meta-programs to cell phenotypes, we scored these ordered cells according to basal, luminal, NE, EMT, AR and cell cycle marker genes (Fig. 6B). This analysis revealed three meta-programs highly associated with NED (P1, P2, P4). For example, meta-program P1 was characterized by neuroendocrine markers such as CHGB and CHGA and meta-program P2 contained NE-related transcriptional factor (TF) EZH2 and DLX5, a homeobox transcription factor gene. DLX5 has been recently reported to mark delaminating neural crest cells during development [59]. Of note, neural crest cells can differentiate into numerous derivatives including neuroendocrine cells [60,61], implying a potential role of this gene in participating NED of prostate cancer cells. Moreover, we identified a cell cycle-related meta-program (P3) that was obviously upregulated in NE cells of patient #2 and #5) ， likely reflecting well-differentiated NE state of these two tumors. More interestingly, meta-program P2 was specifically associated with patient #2, while meta-program P4 was preferentially expressed in patient #5, suggesting two kinds of NED features.
We next asked whether the NE-related gene meta-programs derived from single-cell data could robustly reflect the NED in bulk expression profiles. Thus, we used three bulk transcriptomic datasets [7,41,62] , which included both CRPC and NEPC patients. We first performed correlation analysis between the expression of all genes from three meta-programs (P1, P2 and P4) and the NE score defined by the average expression of well-established NE markers to screen out genes that were most relevant to NED. This analysis identified 121 genes highly correlated with the NE score (Pearson R ≥ 0.3; Fig. 6C and supplementary table9). Consistently, we found that by plotting their expression in the 5 groups of samples that was defined by the expression patterns of NE and AR activity genes [41], most genes displayed significantly higher expression in the AR -NE + group than in NEgroups (Fig. 6D). Thus, NED-associated gene signatures derived from single-cell data can provide reliable clues for distinguishing human NEPC and searching for new drivers involved in NED.

Identifying NED-associated transcription factor regulatory network
The above NMF analysis revealed that two well-differentiated NEPC displayed distinct NED signatures. To explore the underlying molecular mechanisms driving the distinct NE differentiation phenotypes, we next used single-cell regulatory network inference and clustering (SCENIC) to identify the co-expressed transcription factors and their putative target genes, as an indicator of transcription factor regulatory activity [63]. SCENIC analysis showed that NED from different patients could upregulate the expression of different transcription-factor networks (Fig. 7A). For instance, DLX6 and ASCL1 regulons were highly active in NE cell of patient #2, whereas expression of FOXA2 and SOX21 network was restricted in NE cells of patient #5. In line with reports that SOX2 are essential for NED in prostate cancer, we found that SOX2 regulon was upregulated across almost all NE cells from patient #2 and #5 (Fig.   7A). Thus, single-cell regulatory network analysis provided an explanation for the divergence of NED from our patient cohort. In addition to many well-established NE-related TFs identified in this analysis (such as SOX2 and ASCL1), it also predicted many neuronal differentiation related TFs that might also be involved in NED. For instance, expression of LHX2 has previously been showed to confer neuronal competency for activity-dependent dendritic development of cortical neurons [64], but its role in NED of prostate cancer remains undetermined and need future studies to clarify their specific roles.
Next, we analysed TF regulons of epithelial cells from two patients with intra-tumoral NED heterogeneity. Analysis of patient #4 revealed that NE subpopulation specifically up-regulated transcriptional activities of NKX2-2, HES6, FOXA2 and ASCL1, all of which have been previously reported to be essential for a variety of neural cell types' differentiation ( Fig. 7B, 7C). The intra-tumoral heterogeneity in term of TF activity was also observed in patient #6, showing that NE subpopulation have obviously higher TF activities of SOX2 and FOXA2 (Fig. 7D, 7E). In addition, NE subpopulation strongly upregulated activities of UNCX and CELF5 regulatory networks, which have been both reported to involve in maintaining neural cells survive or promoting some neuron diseases [65,66]. Overall, TF network analysis revealed both known and unknown NED-associated TFs and offered more insight into both inter-tumoral and intra-tumoral heterogeneity regarding NED.

Discussion
In this study, we generated 21,292 single-cell transcriptomes from 6 CRPC patients with a focus on the cellular phenotypes associated with NED. We detected NED in four tumors, in which all of the NE cells exhibited a luminal rather than basal epithelial phenotype. It is important to note that in two tumors that contain both NE cells and non-NE epithelial cells (patient #4 and #6), there is clear cell fate transition tendency from luminal-like adenocarcinoma cells towards NE cells ( Fig. 4E and Fig. 5D). Thus, our finding has identified the transdifferentiation process that has been proposed for a long time in explaining NED in prostate cancer. Although previous genomic analyses have suggested that NEPC are clonally derived from PCA that usually present luminal-like phenotype [7,26,27], this is the first study to our knowledge that has shown the cellular diversity in human CPRC as well as the cellular phenotypes associated with NED at single-cell resolution.
Our current study is limited regarding the total number of samples that contain NED for analyses. To unbiasedly evaluate the cellular phenotypes associated with NED, we next performed triple IF staining against KRT5, KRT8 and SYP on our large cohort of PC TMAs. We found that the luminal-like malignant phenotype of NE cells (K5 -K18 + SYP + ) is mainly detected in adenocarcinomas (Fig. 4G). Therefore, this results further confirmed a closer relationship between NED and a luminal state rather than basal state in human prostate cancer. It should be noted that our results don't exclude the probability that basal cells serve as cell of origin of NEPC. According to in vivo cell lineage tracing studies, both basal and luminal cells are capable of initiating prostate tumorigenesis [67]. In particular, prostate cancer originated from human basal cells gradually loss basal features and upregulation of luminal hallmarks [13,68]. Based on these findings and our current results, we propose a model that PCA can be initiated from both basal and luminal cells, while focal and eventual NEPC is more likely to be made by NE precursors with luminal phenotype (Fig. 8). We also consider the possibility that a direct basal-NE trans- Our next aim is to explore the signature genes driving NE transdifferentiation.
By performing NMF analysis, we further identified three gene meta-programs consisting of many genes highly correlated with NED (Fig. 6A). The bulk datasets analysis has validated the robustness of this result, showing that most genes are significantly expressed in patients with NED (Fig. 6D). Interestingly, we found that two well-differentiated NEPCs (patient #2 and #5) seem to have distinct NED programs. SCENIC analysis highlighted that the heterogenous NED might be determined by distinct TF networks. Nevertheless, the exact role of many identified genes in prostate cancer, especially NEPC, is unknown and needs further comprehensive investigation.
In summary, our single-cell study has disentangled both intra-and intertumoral heterogeneity regarding to NED in human prostate cancer and characterized both cellular phenotypes and molecular features linked with the luminal to NE transdifferentiation. Understanding the progressive trajectory of NED will benefit the development of early diagnosis and even therapeutic treatments for human NEPC.

Methods & materials Patient selection
With a focus on neuroendocrine prostate cancer, the participating patients were required to meet the following requirements: 1) the patients must have developed resistance to castration therapy; 2) CT imaging showed an apparent prostate tumor (Supplementary fig. 1A). In addition, we preferentially selected patients whose circulating PSA level was lower than 20 ng/ml. The patient information is described in detail in Table 1. The present study was ap-

Single-cell data preprocessing and quality control (QC)
Raw sequencing data were converted to FASTQ files with Illumina bcl2fastq, version 2.19.1, and data were aligned to the human genome reference sequence (GRCH38). The CellRanger (10X Genomics, 2.1.1 version) analysis pipeline was used for sample demultiplexing, barcode processing and single-cell 3′ gene counting to generate a digital gene-cell matrix from these data.
Of note, Cell Ranger filters any barcode that contains less than 10% of the 99th percentile of total UMI counts per barcode, as these are considered to be associated with low quality cell barcodes. This processing resulted in an av-

UMAP visualization and cell type annotation
We used UMAP [69]

Multiple datasets integration and Batch correcting
For merging multiple datasets, we applied Harmony integration [75], which has been showed to reduce technical batch effects while preserving biological variation for multiple batch integration. RunHarmony returns a Seurat object, updated with the corrected Harmony coordinates. The manifold was subjected to re-clustering use the corrected Harmony embeddings rather than principal components (PCs), set reduction = 'harmony', with parameters of Seurat analysis.

Differential gene expression analysis
DEGs in a given cell type compared with all other cell types were determined with the FindAllMarkers function from the Seurat package (one-tailed Wilcoxon rank sum test, P values adjusted for multiple testing using the Bonferroni correction). For computing DEGs, all genes were probed provided they were expressed in at least 25% of cells in either of the two populations compared and the expression difference on a natural log scale was at least 0.25.

RNA velocity
RNA velocities were predicted using velocyto in R program (http://velocyto.org, version 0.6) [50,52]. Briefly, spliced/unspliced reads were annotated by velocyto.py with CellRanger (version 2.2.0), generating BAM files and an accompanying GTF; then, they were saved in .loom files. The .loom files were then loaded to R (version 3.6.0) using the read.loom.matrices function, and they generated count matrices for spliced and unspliced reads. Next, the count matrices were size-normalized to the median of total molecules across cells.
The top 3,000 highly variable genes are selected out of those that pass a minimum threshold of 10 expressed counts commonly for spliced and unspliced mRNA. For velocity estimation, we use the default procedures in scVelo (n_neighbors=30, n_pcs=30). In consideration that the assumptions of a common splicing rate and the observation of the full splicing dynamics with steady-state mRNA levels were often violated, we used the function recov-er_dynamics, a likelihood-based dynamical model, to break these restrictions.
Finally, the directional flow is visualized as single-cell velocities or streamlines in the UMAP embedding with the Seurat cluster annotations.

Connectivity of cell clusters
To identify potential developmental relationships of cell clusters in patient #4 and #6, we utilized the partition-based graph abstraction (PAGA) [49] to esti-mate any potential developmental relationships among the three prostate lineages. The computations were performed on the same subset of variable genes as for clustering, using the default parameters.

Identification of epithelial gene meta-programs
Transcriptional programs were determined by applying NMF as previously described [57,58]. Analysis was performed for the epithelial cells only. We set the number of factors to 10 for each tumor. For each of the resulting factors, we considered the 30 genes with the highest NMF scores as characteristics of that given factor (Supplementary table8). All single cells were then scored according to these NMF programs. Hierarchical clustering of the scores for each program using Pearson correlation coefficients as the distance metric and Ward's linkage revealed four correlated sets of programs with our focus.

SCENIC
In order to further investigate the gene regulatory networks (GRNs) in process of NED, we applied SCENIC [63] workflow to reconstruction the GRNs. The input matrices for SCENIC of every single sample was the corrected UMI counts in "SCT assay" of Seurat, in which we removed the variation of mitochondrial mapping percentage. For the combined sample (epithelial cells of 6 patients), Combat [76] were run to correct for "patient of origin" as source of batch effect. Following the standard procedure of SCENIC, we used GENIE3 (for single sample) and GRNBoost (for combined sample) to identify potential TF targets. Besides, the activity of each regulon in each cell is evaluated using AUCell, which calculates the Area Under the recovery Curve, integrating the expression ranks across all genes in a regulon. Finally, we used the default "AUCCellThreshholds" for each regulon as the threshold to binarize the regulon activity scores and created the "Binary regulon activity matrix". The motifs database for Homo sapiens was downloaded from the website https://pyscenic.readthedocs.io/en/latest/.

Bulk dataset analysis
Bulk-transcriptomic data were collected from Morrissey et. al. (https://github.com/cBioPortal/datahub/tree/master/public/prad_su2c_2019) [62]. To estimate the correlation of the P1, P2 and P4 meta-program with NED, we first defined an NE score by gene set variation analysis (GSVA) [77,78] for every sample in these bulk RNA-Seq data, and the NE markers we used are listed in the Supplementary materials. Then, we filtered cell cycle-related genes from the gene list of the three meta-program and performed Pearson correlation coefficient analysis of the remaining genes.

Tissue microarrays
Tissue specimens from 297 patients who underwent radical prostatectomy were collected for the construction of tumor microarrays (TMAs), and then the specimens were cut into 5μm thick sections using a standard microtome.
These tissue cores were assessed by uropathologists to determine tumor stages according to the haematoxylin and eosin staining results (Supplementary table6).

Immunohistochemistry (IHC) and immunofluorescence (IF)
Formalin-fixed and paraffin embedded tissue sections (5μm) were deparaffinized and rehydrated. Antigen retrieval was carried out using 10 mM sodium citrate (pH 6.0) in a microwave oven. For DAB staining, endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide for 10 min and 5% BSA in PBS for 1 h. Slides were incubated overnight at 4°C with a primary antibody, which was followed by incubation with an HRP-linked secondary antibody (CST) at room temperature (30 min). Diaminobenzidine (DAB) was used as chromogen, and the sections were counterstained with haematoxylin. For immunofluorescence staining, the sections were washed with PBS and transferred to a blocking solution (10% normal donkey serum in PBS) for 1 h at room temperature. After blocking, specimens were incubated overnight at 4°C with diluted primary antibodies. The next day, slides were washed with PBS three times for 10 min each, and then they were incubated for 1 h at room temperature with secondary antibodies conjugated to Alexa-488, -555 or -647, which were diluted with PBS containing 1% normal donkey serum (1:1000).
Then, the secondary antibody was rinsed, and the slides were washed three times with PBS before being mounted with Vector Shield mounting medium containing DAPI (Vector Laboratories, H-1200).

Image acquisition
IF images were acquired using a Zeiss LSM 710 confocal microscope and were processed by ZEN Imaging Software. IHC images were acquired using an Olympus BX53 System Microscope.

Data availability
The scRNA-seq data were deposited in the NCBI Gene Expression Omnibus (GEO) database under accession number GSE137829.        Cells are colored by the corresponding cluster and gradient of regulon activity (grey to red).

P a t i e n t # 4 P a t i e n t # 5 P a t i e n t # 6
Ce l l t y p e s NE _ s c o r e P a t i e n t # 1 P a t i e n t # 2 P a t i e n t # 3 P a t i e n t # 4 P a t i e n t # 5 P a t i e n t # 6 mi n ma x P a t i e n t # 5 P a t i e n t # 1 P a t i e n t # 3 P a t i e n t # 2 P a t i e n t # 4  P a t i e n t # 1 P a t i e n t # 2 P a t i e n t # 3 P a t i e n t # 4 P a t i e n t # 5 P a t i e n t # 6