Single-cell RNA-seq integrated with multi-omics reveals SERPINE2 as a target for metastasis in advanced renal cell carcinoma

Tumor growth, metastasis and therapeutic response are believed to be regulated by the tumor and its microenvironment (TME) in advanced renal cell carcinoma (RCC). However, the mechanisms underlying genomic, transcriptomic and epigenetic alternations in RCC progression have not been completely defined. In this study, single-cell RNA-sequencing (scRNA-seq) data were obtained from eight tissue samples of RCC patients, including two matched pairs of primary and metastatic sites (lymph nodes), along with Hi-C, transposable accessible chromatin by high-throughput (ATAC-seq) and RNA-sequencing (RNA-seq) between RCC (Caki-1) and human renal tubular epithelial cell line (HK-2). The identified target was verified in clinical tissue samples (microarray of 407 RCC patients, TMA-30 and TMA-2020), whose function was further validated by in vitro and in vivo experiments through knockdown or overexpression. We profiled transcriptomes of 30514 malignant cells, and 14762 non-malignant cells. Comprehensive multi-omics analysis revealed that malignant cells and TME played a key role in RCC. The expression programs of stromal cells and immune cells were consistent among the samples, whereas malignant cells expressed distinct programs associated with hypoxia, cell cycle, epithelial differentiation, and two different metastasis patterns. Comparison of the hierarchical structure showed that SERPINE2 was related to these NNMF expression programs, and at the same time targeted the switched compartment. SERPINE2 was highly expressed in RCC tissues and lowly expressed in para-tumor tissues or HK-2 cell line. SERPINE2 knockdown markedly suppressed RCC cell growth and invasion, while SERPINE2 overexpression dramatically promoted RCC cell metastasis both in vitro and in vivo. In addition, SERPINE2 could activate the epithelial-mesenchymal transition pathway. The above findings demonstrated that the role of distinct expression patterns of malignant cells and TME played a distinct role in RCC progression. SERPINE2 was identified as a potential therapeutic target for inhibiting metastasis in advanced RCC.


INTRODUCTION
Renal cell carcinoma (RCC) is a major malignancy, causing the most common deaths in kidney cancer [1]. Radical surgery is the mainstay of treatment for early RCC at present. However, many RCC patients are already in the advanced stage or accompanied with metastasis at the time of diagnosis, when treatment becomes difficult [2]. Although targeted therapies or immune inhibitors, such as the use of tyrosine kinase inhibitors or immune checkpoint inhibitors treatments have improved the 5-year survival rate in patients with advanced or metastatic RCC, the overall clinical outcomes remain poor [3]. In addition, challenges still exist when clinicians decide on the option of the most effective treatment [4]. Therefore, investigating the underlying mechanisms will be beneficial to identifying potentially effective therapeutic targets for RCC.
Metastasis is the complex progression involving multiple processes and molecular communications between malignant cells and the tumor microenvironment (TME) [5]. Metastasis of RCC and tumor-niche interactions originating from multi-step genetic alternations may lead to abnormal expression of genes such as VHL [6], mTOR [7] and SOX17 [8]. In addition to changes in gene expression at the transcriptional level, chromatin spatial structures [9] and epigenetic factors [10] also contribute to the regulation of genes and their functions via gaining access to spatially distant regulons and regions, thus bringing DNA to sequence-specific binding proteins. However, there is a lack of multi-omics data about RCC metastasis.
In this study, we performed an integrative multi-omics analysis of single-cell RNA-sequencing (scRNA-seq), bulk RNA-sequencing (RNAseq), 3D high-throughput chromosome conformation capture (Hi-C) and assay for transposable accessible chromatin by high-throughput sequencing (ATAC-seq) for primary and metastatic RCC tumor tissues and compared differences between RCC and normal renal tubular cells. Then, we further validated the level of the identified marker for RCC metastasis by cell functional animal experiments and human tissue microarray. Our data demonstrated that TME played a pivotal role in RCC progression or metastasis and affected the response to immune or targeted therapy in clinical cohorts. More importantly, scRNA-seq identified SERPINE2 as a gene participating in the metastasis process, and hierarchical chromatin organization comparison showed that SERPINE2 was highly-expressed in RCC as a differential expressed gene (DEG) which could potentially predict metastasis, suggesting that it may prove to be a novel target to advanced or metastatic RCC.

MATERIALS AND METHODS Patients and sample collection
All procedures were approved by the ethical review board of Xinhua Hospital and Third Affiliated Hospital of the Second Military Medical University, and written informed consent was obtained from all included patients. The scRNA-seq samples were obtained from fresh surgically removed tissues of patients with pathologically confirmed diagnosis of advanced RCC, including ccRCC_LM3 and ccRCC_LM4, pRCC_LM used by Bi et al. for RCC study [11]. A total of eight samples from primary and metastatic sites were included for scRNA-seq analysis (Fig. 1B). The two tissue microarrays (TMA-30, n = 29 and TMA-2020, n = 293) included 322 patients from Xinhua Hospital and Third Affiliated Hospital of the Second Military Medical University (Shanghai, China). The detailed clinical information is presented in Tables 1 and 2. The TCGA sample data were obtained from http://xena.ucsc.edu/ of UCSC Xena, involving bulk-seq, somatic mutation and clinical data.
Sing-cell RNA-sequencing and data processing Single-cell suspension and droplet-based sequencing were prepared according to the manufacturer's protocol and our previous work [12]. Seurat (version 3.0.1) was used to perform data quality control [13]. Cells with <200 or >5000 genes or with more than 30% mitochondrial genes were considered low-quality and filtered. Markers of each main cell cluster were identified through FindAllMarker function. Cell type markers were obtained from CellMarker website [14] and previous studies [15,16]. The characteristic markers used for labeling are presented in Fig. S2.

InferCNV and cell malignancy evaluation
The InferCNV package was used to investigate the copy number variations (CNVs) by pipeline parameters in scRNA-seq analysis. Each cell was scored based on the range of CNV signals. The CNV signals were defined as previously described [17]. Finally, cells with CNV signal greater than 0.05 and CNV correlation coefficient greater than 0.5 were defined as malignant cells, and cells below these two thresholds were defined as non-malignant cells.

Epithelial identification and scoring
A series of epithelial markers included EPCAM, cytokeratins and SFN. The mean expression of these genes was used to measure the epithelial score (Table S1).

Expression program of malignant cell heterogeneity
The non-negative matrix factorization (source: https://github.com/dylkot/ cNMF) was applied to classify heterogenic expression programs in malignant cells with the parameters as previously described [17]. Finally, six program clusters were identified and applied to define meta-signatures. The first 20 genes of each cluster were defined as meta-signatures and used to define cell scores. The resulting NNMF program was compared with the meta-program defined in our original analysis, with a global Pearson correlation threshold (across all genes) of 0.2.

GSEA and GSVA
The samples in TCGA were divided into SERPINE2 high and SERPINE2 low group. DEGs were calculated by Seurat function. GSEA was used to determine which gene sets were enriched in subgroup comparison. Only a p value of a gene less 0.05 was considered as the target of interest. In addition, GSVA was used depending on the C2 and C5 hallmark gene set from molecular signature database, as the instructions of the GSVA package.

Cell-cell communication analysis
CellphoneDB (based on Python 3.7) [18] and iTALK (R version) [19] are calculation tools for intercellular communication analysis. The primary cell clusters were investigated to establish cell interaction networks. The ligand-receptor pairs with a p value < 0.05 were considered to be the interaction of interest.

Cell line bulk RNA-seq and ATAC-seq
For bulk RNA-seq, total mRNA with poly-A tail was extracted and reverse transcribed to cDNA for sequencing. R software (version 3.6.3) was used for downstream statistical analysis. The Deseq2 (version 1.4.5) R package was used to calculate differential gene expressions. The adjusted p value 0.01 and log2 fold change (logFC) >1 were applied to identify significantly DEGs. For ATAC-seq, sequencing was carried out on an Illumina NovaSeq 6000 generating 2 × 75 bp paired-end reads as previously described [20]. ATAC-seq peaks were called using MACS2 with no shifting model, and BEDTools was used to calculate the coverage within peaks.

3D high-throughput chromosome conformation capture analysis
All Hi-C sequencing reads were mapped to the human reference genome (hg19) using Bowtie. Raw interaction matrices were normalized by using the iterative correction and eigenvector decomposition (ICE) method and HiCNorm [21]. ICE-normalized interaction matrices at 500-kb resolution were used to detect chromatin compartment types by R-package HiTC. The compartment with a higher gene density was assigned as A compartment (active/euchromatic compartments), and the other compartment was assigned as B compartment (inactive/ heterochromatic compartments) [22]. For opologically call associating domains (TADs), ICE-normalized interaction matrices at 40-kb resolution were used by a Perl script matrix2insulation.pl (http://github.com/ blajoie/cranenature-2015). A higher resolution was used because TADs are smaller than A/B compartments. Insulation scores (IS) were calculated for each chromosome bin and valleys of IS identified TAD boundaries. TADs smaller than 200 kb or located in telomeres/ centromeres were filtered out using the previously described methods. When comparing TADs between two cell lines, at least 70% overlap between two TADs was considered as conserved TADs. Bedtools with the option of "intersectBed -f 0.70-r" was used to identify conserved TADs [23].

Immune and targeted therapy analysis
Checkmate 025 cohort data [24], including normalized bulk RNA-seq and clinical data, were obtained to perform treatment response analysis in Nivolumab (n = 181) and Everolimus (n = 130) groups. The TIDE algorithm [25] was used to predict potential ICB response between SERPINE2 high and SERPINE2 low groups.

Animals
All animal experiments were approved by the Animal Care and Ethical Committee of the Second Military Medical University (Shanghai, China). The 6-week-old male nude mice were randomly injected with PBS 200 μl containing 1 × 10 6 786-O-SERPINE2-OE-PR-luc cells or 786-O-PR-luc via the tail vein. After 8-week injection, the mice were sacrificed and the lung tissue was removed and fixed in 10% formalin buffer solution. 786-O-PR cells were transfected with luciferase reporter gene to detect subrenal tumor formation or lung metastasis. Tumor growth was monitored weekly using IVIS Lumina imaging system (PerkinElmer, Hopkinton, MA, USA) in vivo bioluminescent optical imaging.

Lentiviral gene tool establishment
The overexpression or shRNA lentivirus vector SERPINE2 (OE-SERPINE2 and sh-SERPINE2) was all synthesized by GeneChem Biological Technology (Shanghai, China) using the sequences shown in Table S8. The stable Caki-1 and 786-O cell lines were constructed using lentivirus in which SERPINE2 was knocked down or overexpressed. Lipofectamine 3000 reagent W. Chen et al.
(L3000015, Invitrogen) was used for siRNA and plasmid transfection according to the manufacturer's protocol.

Quantitative real-time PCR (qRT-PCR)
Total RNA was extracted using TRIZOL (Invitrogen, USA) and reversed transcribed into cDNA. SYBR Green Real-Time PCR Master Mix (QPK201, Japan) was used to quantify gene transcripts and normalized to the GAPDH expression. The sequences of primers are shown in Table S9.

Western blot
Total protein was extracted using SDS-PAGE and then transferred to the PVDF membrane (Thermo, USA), which was then incubated with the

Cell function assays
For migration and invasion experiments, transwell chambers (Millipore, USA) were used without or with Matrigel (BD Biosciences, USA). Cells were seeded in medium with no FBS into the upper chamber, while the medium with FBS was plated in the lower side. After 3-day seeding, the cells on the lower chamber were fixed with 4% Paraformaldehyde Fix Solution (E672002, Sangon Biotech, Shanghai), stained with crystal violet (E607309, Sangon Biotech), and scanned at ×200 magnification [8]. For clonogenic survival test [23], 100 cells were inoculated in triplicate into each cell of 6-well plates overnight. The transfection portion of the cells was performed as described earlier. After knockdown or overexpression, cells were incubated for 7 days. Subsequently, the colonies were washed with PBS, followed by immobilization with 70% ethanol for 20 min at room temperature and 0.5% crystal violet for 20 min. Colonies with >50 cells were counted under a light microscope. The survival score was calculated as the ratio of plate laying efficiency of treated cells to that of control cells. For CCK8, according to the manufacturer's instructions, the Cell Counting Kit-8 (CCK8 kit, CK-04, Dojindo, Kumamoto, Kyushu, Japan) was used to detect the proliferation of cells under the conditions shown. Optical

Immunohistochemistry (IHC) and H-score
IHC was performed as previously described [26], and the primary antibodies for IHC staining included rabbit anti-SERPINE2 (AB155549, Abcam, USA) and anti-CA9 (AB243660, Abcam, USA). The IHC results were scored as H-score, including semi-quantitative grades by "IHC Profiler (Macro)" of ImageJ (version 1.53a): negative, low-positive, positive, and high-positive. The H-score = 0 * the percentage of negative cells + 1 * the percentage of low positive cells + 2 * the percentage of positive cells + 3 × the percentage of high-positive cells, which ranges from 0 to 300 [27].

Statistical analysis
Statistical differences between numerical data (mean ± SD) were calculated by Student's t test (two-tailed). Categorical variables were analyzed by chisquare test or Fisher's exact test. Kaplan-Meier method was used to draw survival curves by "survival" package and "survminer" of R 3.6.3 or GraphPad Prism 7.0 (GraphPad Software, Inc.). ROC analysis was performed to obtain the cut-off value and AUC of H-score. Prognostic accuracy was calculated by Harrell's concordance index analysis (c-index) with  "survcomp" package of R 3.6.3. Nomogram analysis was conducted by "foreign" (version 0.8-78) and "rms" (version 6.0.1) packages for establishing the risk prediction model. All experiments were performed independently at least three times.

scRNA-seq landscape of RCC primary and metastatic sites
To investigate the cellular heterogeneity in RCC tumors, the scRNA-seq profiles were presented for primary tumors from six patients and matched LN metastasis from two of these patients (Fig. 1A, B and Tables 1 and 2). Subsequently to quality control, we obtained 46,552 cells from six patients for scRNA-seq landscape.
The whole chromosomal inferred copy-number variations (CNVs) across each cell were calculated based on the mean expression among chromosomal regions [17]. The CNVs presented the abnormal profiles among malignant cells, referenced by nonmalignant cells (Fig. 1C), which distinguished 30,514 malignant cells and 14,762 non-malignant cells. The CNV genes were shown as the network with gene-set based on GO: BP (Gene Ontology: biological process, Fig. 1D, https://dev.networkanalyst.ca/ NetworkAnalyst). The CNV scores of primary (Malignant-pri) and metastatic (Malignant-LN) malignant cells were significantly higher than those of non-malignant cells (p < 0.001) (Fig. 1E).
Based on the abnormal karyotypes [23], a global expression signature, including EPCAM, cytokeratins and SFN, was established (Table S1). The epithelial score of malignant cells was significantly higher than that of non-malignant cells (p < 0.001) (Fig. 1F). According to CNVs and epithelial marker analysis, most cells were part of clusters with a consistent malignant or non-malignant classification.

RCC TME characterization
The 14,762 non-malignant cells were clustered into seven main cell types ( Fig. 2A), involving T cells, endothelial cells, macrophages, B/Plasma cells, NK cells and fibroblasts identified by maker annotations (Table S2). The T cells and fibroblasts subgroups were further clustered at a higher resolution. As shown in Fig. 2B, the major T cell cluster was grouped into CD8+ T cells (expressed both cytotoxic and exhausted marker), CD4+ T cells (classical marker, TCF7 and CCR7), and regulatory T cells (Tregs, FOXP3). The 540 fibroblasts were re-clustered into two subgroups (Fig. 2C). The one expressed conventional myofibroblast markers ACTA2 and MYL9. Myofibroblasts are acknowledged as components of TME and can be detected in both primary and metastatic niche [28]. The other subset presented the characterization of cancer-associated fibroblasts (CAFs), expressing markers such as PDPN and PDGFRB [29]. Meanwhile, these subsets expressed markers with immediate early response genes (e.g., JUN, FOS), mesenchymal markers (e.g., VIM, THY1) and ECM (MMP11) (Fig.  S1A). This heterogeneity of fibroblasts among tumors was consistent with the perspective that CAF is involved in complex structural and paracrine interactions in TME. The cell-cell communication result also showed frequent interactions between CAF and other cells (Fig. S1B, C). Compared with non-malignant cells, 30,514 malignant cells were clustered based on the tumor heterogeneity in monocle UMAP (Fig. 2D). The top 10 expressed markers of each malignant cell cluster are shown in Fig. 2E.
Heterogeneity of the malignant cell expression patterns and identification of the metastasis programs Next, we investigated how the expression patterns varied in heterogenetic malignant cells among the tumor subsets by focusing on eight tumors that acquired the most of malignant cells. The NNMF algorithm was used to reveal consistent sets of genes tendentiously co-expressed by clusters of malignant cells. We identified four gene sets that varies among cells of Malignant 4 cluster cells (Figs. 3A and S2A). Using this method to each of the six malignant clusters, we identified a total of 60 gene signatures that varied consistently among individual cells in at least one malignant cluster (Table S3). Then, we applied hierarchical clustering to extract the 60 gene signatures into meta-signatures that represented different co-expression patterns in multiple malignant subsets (Fig. 3B). The high degree of consistency among signatures of different malignant cluster cells suggested that they embraced a common state of expression heterogeneity within tumors. Six expression patterns were tendentiously expressed by at least two subsets of malignant cells. As shown in Fig. 3B, two programs (clusters 1 and 2) were mainly composed of epithelial genes such as cytokeratin and EPCAM (Table S4). Although epithelial markers were expressed in all malignant cells, many of them were generally consistent in malignant cells (Table S4), and could reflect the level of epithelial differentiation and stemness. One additional program reflected G1/S and G2/M phases of the cell cycle and distinguished different cycle cells in each malignant subset. The last procedure enriched hypoxic-related genes (Fig.  3B).
We focused on the expression program related to metastasis in clusters 3 and 5 (Fig. 3B). The metastasis-I program contained ECM-related genes and had characteristics of EMT (Table S5). This process was evident in four subsets across all the six malignant subsets examined (Fig. S2B). Although EMT has been widely recognized as the potential target of drug resistance and tumor cell invasion and metastasis, their pattern and significance in human epithelial tumors are ambiguous. We then further examined the Metastasis-II program for EMT pattern. Besides ECM genes, this program included the partial EMT hallmarks such as conventional ITGA3, but lacked other markers (SNAIL, ZO1). The overall expression of epithelial markers remained remarkably unchanged, while the signatures were accompanied by decreased expression of some epithelial genes (Fig. S2C), suggesting that the Metastasis-II may be an unclassical EMT program.

SERPINE2 plays a key role in the metastasis program
We then examined the expression of genes involved in the Metastasis-II program across all the RCC samples (Table S6). These annotated genes were expressed in all RCC tissues, including primary and metastatic samples, and were the top genes in at least two samples (Fig. 3C). We then detected DEGs between malignant and non-malignant cells in the scRNA profile (Table S7) and found 37 commonly expressed genes, the top genes of which were in the NNMF program (Fig. 3D). Then, we examined all common genes in the TCGA survival data to filter out the oncogenes (Figs. 3E and S3A), knowing that oncogenes are relevant to poor survival outcomes [30] and highly expressed in tumor samples [31] compared with the normal tissues (Fig. S3B). After filtration, the highly-expressed SERPINE2 was found to be associated with the worse overall survival (OS) (Fig. 3E, p = 0.0069, HR = 1.55 (1.12-2.13) and p < 0.05). All the malignant scRNA subsets expressed SERPINE2, and the Malignant 1-3 had even higher expression (Fig. 3F). In addition, primary malignant cells expressed the highest SERPINE2 (Fig. 3G). Both GSEA and GSAV indicated that cell proliferation, cell movement and migration pathways were significantly enriched in SERPINE2-high malignant cells (Fig. S4A, B). The malignant cells frequently contacted with other cell clusters (Fig. S1B), and the growth factor-related ligands and receptors were actively expressed in malignant cells (Fig. S1C).
Next, we compared the SERPINE2 expression in RCC cell lines and the normal renal epithelial cell line HK-2. It was found that Caki-1 expressed the highest SERPINE2, compared with the other RCC cell lines (Fig. S4C). Knowing that Caki-1 originates from the human RCC metastatic tumor site, we performed Hi-C, ATAC-seq, combined with RNA-seq between Caki-1 and HK-2 to explore whether the chromatin spatial structures and epigenetic factors contributed to gene and function regulation in SERPINE2. The compartment status at the SERPINE2 locus was B (normal cell line) to A (RCC cell line), and this locus at ATAC-seq presented open status, suggesting that the SERPINE2 locus presented the active interaction frequency (Fig. 3H). The RNA-seq result also showed that SERPINE2 was highly expressed in Caki-1 compared with that in HK-2. Moreover, the region of SERPINE2 presented the TAD differences between HK-2 and Caki-1, suggesting that the TAD size decreased with the generation of new TAD boundaries in Caki-1 cells (Fig. 3I). These data show that SERPINE2 not only worked at the transcriptomic level but was correlated with the genomics alternations to RCC. Thus, SERPINE2 may play a potential role in RCC metastasis based on the whole-genomics and scRNA-seq.
SERPINE2 serves as a metastasis-associated oncogene in RCC and drug response SERPINE2 is a type of secreted protein and an inhibitor of plasminogen, urokinase and thrombin [32]. Previous studies have shown that SERPINE2 expression is associated with tumorigenesis and tumor cell invasion [33,34]. SERPINE2 is also commonly upregulated in lung [35], colorectal [36] and pancreatic [37] carcinoma. However, the molecular mechanisms by which SERPINE2 enhances tumor metastasis remains unclear. Firstly, we assessed somatic alterations in TCGA patients based on the mean expression of SERPINE2 (high group, n = 164; low group, n = 168). Genetic changes of previous studies on RCC are shown in Fig. 4A. The MTOR, MUC16, NOTCH2 and LPR1 presented differences in SERPINE2 high and low groups (Fig.  4A). Then, GSEA was performed between the two groups in TCGA. The EMT pathways were significantly enriched with p value <0.05 of false discovery rate (Fig. 4B). SERPINE2 was highly expressed in the tissues from patients with LN or other metastases (n = 89) as compared with that in the primary tumor tissues with no metastasis (n = 199) (Fig. 4C, p < 0.05). In addition, SERPINE2 of expression in patients with metastasis was also significantly higher than that in the normal tissues of the GSE105261 cohort (p < 0.05) (Fig. 4D). SERPINE2 expression in tumors was significantly higher than that in normal tissues in the GSE53757 cohort (p < 0.001) (Fig. 4E). The similar result was also observed in the GSE40435 and GSE22541 cohorts (p < 0.001, p < 0.01) (Fig. S4A). Next, we predicted the drug response for TCGA samples based on the largest publicly available pharmacogenomics database, the Genomics of Drug Sensitivity in Cancer (cgp2016), https://www.cancerrxgene.org/. The SERPINE2-high group had higher IC50 in response to sunitinib, which means a lower response rate compared with the SERPINE2-low group (p < 0.001) (Fig. 4F). Estimate algorithm could calculate the immune infiltration level of bulk-seq sample. We found that the SERPINE2-high group had higher immune infiltration than SERPINE2-low group (p < 0.001) (Fig. 4G). CD8+ T cell expression was negatively correlated with SERPINE2 expression in EIPC (p < 0.05) (Fig. S5B), or CIBERSORTx (p < 0.01) (Fig. S5C) calculation. Thus, SERPINE2 expression was likely to be associated with immunotherapy, too. Then, we detected the expression of MTOR expression in scRNA-seq. The SERPINE2-high group expressed more MTOR (p < 0.05) (Fig. 4H). However, the SERPINE2 level was not related to the survival in the Checkmate025 cohort treated with everolimus (p = 0.0629, HR = 0.68, 0.43-1.07) (Fig. S5D). The signature (PDCD1, HAVCR2, LAG3, TIGIT, TOX, ENTPD1, BATF and PRDM1) of terminally exhausted CD8+ T cells integrated in Fig. 4I suggested that the terminally exhausted CD8+ T cells actually affected immune infiltration. The higher SERPINE2 expression was associated with the worse survival in the Checkmate025 cohort treated with nivolumab (p = 0.0065, HR = 1.82, 1.06-3.11) (Fig. 4J). These data suggest that SERPINE2 may be a malignant gene in RCC and participate in drug response.

SERPINE2 promotes ccRCC invasion in vitro and vivo, accompanied by epithelial-mesenchymal transition (EMT)
To investigate whether SERPINE2 promoted the tumor malignant biological behavior and metastasis in RCC, we validated the function of SERPINE2 in RCC by qRT-PCR. The results showed that the expression of SERPINE2 was high in Caki-1 and low in 786-O cells (Fig. S4C). Then, SERPINE2 overexpression cells (786-O) and SERPINE2 knockdown cells (Caki-1) were constructed and observed by qPCR and Western blot (Figs. 5G and S6A-C). The effect of SERPINE2 expression on RCC cell proliferation and invasion was examined via colony formation and CCK-8 cell proliferation assay, Transwell cell migration and invasion assay. The results showed that RCC cell migration and invasion were significantly decreased in sh-SERPINE2 group (Fig. 5A, B, p < 0.01, p < 0.01; Fig. 5A, B, p < 0.01, p < 0.05), and significantly increased in OE-SERPINE2 group (Fig. 5C, D, p < 0.001; Fig. 5C, D, p < 0.01) in 48 h. The cell survival rate was significantly decreased in sh-SERPINE2 group (p < 0.01) (Fig. 5E), and significantly increased in OE-SERPINE2 group (p < 0.05) (Fig. 5F). Knowing that change in ECM expression and matrix metalloproteinase (MMP) participates in cancer cell metastasis and SERPINE2 enhances pancreatic tumor invasion [37] and lung metastasis of breast cancer via producing ECM and secreting MMP-9 [38], we next detected the expression of EMT markers (E-cadrin, N-cadrin, VIM and Snail) and MMP-9. qPCR and Western blot results showed that the EMT activation markers were highly-expressed in OE-SERPINE2 cells (p < 0.01) ( Fig. S6A-C, Supplementary Materials), and significantly decreased in sh-SERPINE2 cells (p < 0.01) (Fig. S6A-C). Moreover, 786-O-OE-SERPINE2 or 786-O-NC-SERPINE2 cells labeled with stable luciferase were injected into the caudal veins of nude mice. The photon flux was recorded during the 8 weeks. It was found that lung metastasis was more serious in mice of 786-O-OE-SERPINE2 group as compared with the mice in the control group (p < 0.01) (Figs. 5G and S6F). The HE results of lung tissues are presented in Fig. 5H. The Carbonic Anhydrase 9 (CA9) and SERPINE2 were highly expressed in the 786-O-OE-SERPINE2 induced lung metastasis sites (Fig. 5H).

SERPINE2 is associated with poor survival and predicts RCC metastasis in our clinical cohort
We validated the expression in tissue microarrays of two cohorts (TMA30, n = 29; TMA2020, n = 289) by IHC. H-score quantitative analysis (Fig. 6A) showed that the expression of SERPINE2 in tumor tissue group was higher than that in para-tumor tissue group (p = 0.001) (Fig. 6B). We classified the cohorts according to the clinical information and found that SERPINE2 expression was significantly higher in the tissues with metastasis (p = 0.038) (Fig.  6C) and a higher Fuhrman grade (p = 0.0091) (Fig. 6D) or tumor stage (p = 0.0076) (Fig. 6E) based on the IHC score. According to the optimal cut-off values from ROC analysis in SERPINE2 using 5-year OS of TMA30, the patients were classified as a highexpression group and a low-expression group. The cut-off value of H-score was 202.5 with AUC of 0.8342 (Fig. 6F). We wondered whether the expression levels of SERPINE2 reflected the   Table S10. D SERPINE2 expression among primary (n = 9), metastatic tumors (n = 26) and normal tissues (n = 9) in GSE105261. Data were obtained from https://www.aclbi.com/static/index.html#/geo. E SERPINE2 expression between tumors (n = 72) and normal tissues (n = 72) in GSE53757. Data were obtained from https://www.aclbi.com/ static/index.html#/geo. F IC50 prediction of sunitinib for SERPINE2 high and low groups in TCGA, based on the GDSC database. G The estimate score of tumor tissues from SERPINE2 high and low groups, reflecting the immune infiltration. H MTOR expression from cells in scRNA-seq analysis (50%: 50%). I UMAP plot shows the terminally exhausted CD8+ T cell signature expression. J The KM survival curve of SERPINE2-high and SERPINE2-low groups in the cohort of clinical trials Checkmate025 with Nivolumab. *p < 0.05; **p < 0.01; ***p < 0.001.
corresponding clinical outcomes. The result of Kaplan-Meier survival analysis showed that patients in SERPINE2 -high subgroup had poorer OS (Fig. 6G, p = 0.0003) and progress-free survival (PFS) (p = 0.0002) (Fig. 6H). Then, we performed the nomogram analysis for the clinical information and SERPINE2 (Figs. 6I and S7A, B) and found that SERPINE2 expression was an independent risk factor of OS of ccRCC patients.

DISCUSSION
The therapeutic challenges of advanced or metastatic RCC have urged researchers to explore the underlying mechanisms leading to RCC progression. The heterogeneity among solid malignancies drives one of the major challenges. scRNA-seq can help identify TME distinction [39], developmental patterns [40,41], drug response/resistance characterization [42], and immune infiltration programs involved with tumor behaviors and clinical treatment [43,44]. On the other hand, although some somatic analyses have been performed for RCC, tumorigenesis related DNA accessibility and gene localization remains unclear. In this study, we identified primary RCC tumors and matched LN metastases by using this well-integrated multiomics analysis. Our study revealed a metastasis program and SERPINE2 associated with this pattern across multiple tumors, involving significant structural variation of RCC cell line and obvious correlation between RCC and TME.
EMT is a classical and acknowledged driver of metastasis, which is regulated by different activators at different levels [45]. In this  study, we identified a Metastasis-II program in malignant cells, an unconventional pattern, compared with Metastasis-I program (with classical EMT modules). Although Metastasis-II program involves certain EMT-like enriched alternations of epithelial and mesenchymal genes, it lacks the expression of classical TFs of EMT such as SNAIL, SLUG and TWIST [46] in scRNA-seq analysis and the vitro experiment (Figs. 5G and S2F). Similarly, SERPINE2, as a key oncogene in this program validated by multi-omics, whose molecules expressions brought this situation in vitro. SERPINE2 activated the process by decreasing the expression of the epithelial marker E-cadherin and increasing the level of the interstitial markers N-cadherin and Vimentin, which further stimulated MMP9 expression to invade ECM. This is consistent with the finding in other types of malignancies [38,47].
Considering the lack of conventional regulation patterns, the Metastasis-II pattern reflects an additional state that generalizes some aspects of EMT. Actually, the expression characterization of EMT has aroused increasing attention because growing evidence has demonstrated the role EMT at different stages [48]. It is also speculated that the dynamic EMT stages embrace both invasiveness and tumor development [49]. Moreover, it is still ambiguous whether a complete EMT process undergoes or whether the several metastasis programs coexist in RCC metastasis. In any case, our identification of SERPINE2 from an unbiased analysis for malignant cells of RCC patients provides a novel sight into the programs of human cancers and metastasis for future research.
ECM remodeling usually requires proteases and their inhibitors to induce tumor metastasis. It was found in our study that SERPINE2 expression was upregulated in RCC. It is for the first time that we unveiled the role of SERPINE2 in RCC, demonstrating that increased expression of SERPINE2 significantly decreased OS and PFS. SERPINE2 overexpression promoted RCC cell migration and invasion, and enhanced lung metastasis in mice in vivo, while SERPINE2 did not affect cell proliferation in the 48-h CCK assay (Fig. S6D, E). This potentially suggests that SERPINE2 may be able to promote invasion rather cell growth. SERPINE2 has been reported to participate in metastasis of several human cancers through various mechanisms, such as by re-establishing tumor stroma and the polarization of TAM [50], or activation of glycogen synthesis kinase 3β [47] and P38 [51] pathways. Various cytokines such as fibroblast growth factor (FGF) [23] and transforming growth factor β (TGFβ) [52] are known to stimulate the secretion of SERPINE2 in normal cell lines. Here, we also found that FGF, VEGF, TGFβ and their receptors were activated in cell-cell communication (Fig. S1B), suggesting that SERPINE2 may be involved in promoting cytokines-related metastasis by cell-cell interaction. Bulk-seq analysis has been performed for several tumor types to classify subtypes but failed to describe the heterogeneity within tumors. It was found in our study that the mesenchymal subtypes could reflect TME. The potential of stromal components suggests that future subtype systems may eventually need to integrate malignant and non-malignant components together.
In conclusion, our results described a landscape of malignant, stromal and immune cells for RCC and matched LN. Our identification of the Metastasis-II program and SERPINE2 by multi-omics may facilitate linking the unclassical EMT data to the RCC biology in vivo and vitro. Although further research is needed, the relation of this metastasis program and SERPINE2 level to worse clinical features may help develop new diagnostic and treatment strategies for advanced RCC.

DATA AVAILABILITY
All primary data presented in this study are available from the corresponding author upon reasonable request and we would upload the scRNA-seq, Hi-C, ATAC-seq and RNA-seq raw data to the Genome Sequence Archive of Human in the BIG Data Center, Chinese Academy of Sciences under accession codes HRA000454 for that are publicly accessible at https://bigd.big.ac.cn/gsa-human. Processed data could be accessed at https://doi.org/10.6084/m9.figshare.21625964 after reasonable request.

CODE AVAILABILITY
Codes could be accessed along with the supporting data upon reasonable request from Wen-jC or X-gC.