NCAPH is a prognostic biomarker and associated with immune infiltrates in lung adenocarcinoma

Non-SMC condensin I complex subunit H (NCAPH) plays a regulatory role in various cancers. However, its role in prognosis and immune infiltrates in lung adenocarcinoma (LUAD) remains unclear. This study examined the expression of NCAPH in tumor tissues and its association with immune infiltrates and prognosis in LUAD patients. Patient characteristics were obtained from The Cancer Genome Atlas (TCGA). Integrated analysis of TCGA showed that NCAPH was overexpressed across cancers, including LUAD. NCAPH expression was verified by quantitative polymerase chain reaction and western blotting in 20 matched LUAD tissues. High NCAPH expression was significantly related to T, N, M, pathologic stage, primary therapy outcome and smoking status according to the Wilcoxon rank sum test. Cox and Kaplan–Meier analyses showed that the NCAPH-high group was associated with shorter overall survival (OS). The progression-free interval (PFI) and disease-specific survival (DSS) in the NCAPH-high group were also significantly shorter. Multivariate analysis showed that NCAPH was an independent predictive factor for poor prognosis. Gene set enrichment analysis demonstrated that the G2/M checkpoint, ncRNA metabolic process, memory B cells, KRAS signaling, E2F targets and MIER1 process were significantly associated with NCAPH expression. Single-sample Gene Set Enrichment Analysis indicated that NCAPH expression was associated with levels of Th2 and mast cells. The impact of NCAPH on malignant phenotypes was evaluated by MTT, transwell, cell cycle and apoptosis assays in vitro. The malignant phenotype of LUAD cells was inhibited when NCAPH was knocked down. In conclusion, this research indicates that NCAPH could be a potential factor for predicting prognosis and a new biomarker in LUAD.

www.nature.com/scientificreports/

NCAPH has notable roles in cancer progression and development. However, the functions and mechanisms of NCAPH in LUAD have not been fully explored. Moreover, our bioinformatic analysis suggested that NCAPH may play an oncogenic role in LUAD. Therefore, we investigated the impact of NCAPH on LUAD development.
To better explore the role of NCAPH in LUAD progression, we applied RNA-seq data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) datasets. Statistical and bioinformatics methods, such as differentially expressed gene (DEG) analysis, Kaplan-Meier (KM) survival analysis, Cox and logistic regression analysis, nomogram, Gene Ontology (GO) analysis, Gene Set Enrichment Analysis (GSEA) and single-sample Gene Set Enrichment Analysis (ssGSEA) were utilized. Moreover, NCAPH was knocked down in vitro to determine how it affected LUAD proliferation, invasion and migration.

Methods
Data source and preprocessing. LUAD patients' clinical information and gene expression data (including 535 tumor and 59 normal tissues) were obtained from TCGA (https://portal.gdc.cancer.gov/). Normal tissues and patients with an overall survival (OS) of less than 30 days were excluded. Level 3 HTSeq-FPKM data were transformed into transcripts per million (TPM), and the TPM data of 513 lung adenocarcinoma samples were used for the subsequent analyses. Twenty-two samples were excluded due to a lack of clinical variables.
Differential NCAPH expression in LUAD tissues in the TCGA database. Using disease state (normal or tumor) as a variable, scatter plots and boxplots were generated to compare the expression levels of NCAPH. Receiver operating characteristic (ROC) curves were generated to estimate the diagnostic value of NCAPH. NCAPH expression above or below the median value was defined as NCAPH-high or NCAPH-low, respectively.
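The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the study used R); the function names are hypothetical, and the handling of samples exactly at the median is an assumption, since the text only defines values above or below it. The AUC is computed in its rank-based form, as the probability that a randomly chosen tumor value exceeds a randomly chosen normal value.

```python
from statistics import median


def fpkm_to_tpm(fpkm):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm)
    if total <= 0:
        raise ValueError("FPKM values must have a positive sum")
    return [value / total * 1e6 for value in fpkm]


def split_by_median(expression):
    """Label samples 'high' if above the cohort median, otherwise 'low'.

    Assumption: a sample exactly at the median is labeled 'low'.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def roc_auc(tumor_values, normal_values):
    """Rank-based AUC: P(tumor > normal), counting ties as one half."""
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))
```

An AUC close to 1 means that expression separates tumor from normal tissue almost perfectly, whereas 0.5 corresponds to no discrimination.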
Identification of DEGs between the NCAPH-high and NCAPH-low LUAD groups. Analysis of the differential expression of genes between the NCAPH-high and NCAPH-low patients from TCGA LUAD datasets was conducted using DESeq2 (4.0 package). Genes with an adjusted P value < 0.05 and an absolute FC larger than 1.5 were considered to be statistically significant. All significant DEGs are presented in volcano plots and heatmaps, which were constructed using R software.
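The selection rule above can be sketched as follows. This is an illustration only: the study used DESeq2 in R, which by default adjusts P values with the Benjamini-Hochberg procedure reimplemented here in plain Python. The function names are hypothetical, and applying the 1.5 cutoff to the absolute log2 fold change is an assumption, because the text states the threshold once as a fold change and once as a log2 fold change.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fold_changes, p_values, lfc_cutoff=1.5, alpha=0.05):
    """Keep genes with adjusted P < alpha and |log2FC| > lfc_cutoff.

    Assumption: the cutoff is applied on the log2 scale.
    """
    adjusted = benjamini_hochberg(p_values)
    selected = []
    for gene, lfc, padj in zip(genes, log2_fold_changes, adjusted):
        if padj < alpha and abs(lfc) > lfc_cutoff:
            selected.append((gene, "up" if lfc > 0 else "down"))
    return selected
```

The returned list of (gene, direction) pairs corresponds to the upregulated and downregulated sets that a volcano plot would highlight.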

Functional enrichment and infiltration of immune-related cells. Enrichment of NCAPH-related DEGs by pathway and process was analyzed with Metascape (http://metascape.org). Terms with an enrichment factor > 1.5, a minimum count of three and P < 0.01 were regarded as statistically significant. Using GSEA, we investigated the differences in signaling pathways between the NCAPH-high and NCAPH-low groups to predict NCAPH-related phenotypes and signaling pathways. Significantly changed pathways were identified with 1000 permutations. Gene sets with a false discovery rate (FDR) < 0.25 and adjusted P < 0.01 were considered significantly enriched. The R package clusterProfiler (4.0) was used for analysis and graphical plotting 11. The relative tumor infiltration levels of 24 immune cell types were quantified by ssGSEA based on the expression levels of genes in published signature gene lists 12. The signatures included multiple sets of innate and adaptive immune-related cell types and comprised 509 genes in total. To evaluate the association between the infiltration levels of immune cells and NCAPH, Spearman correlation and Wilcoxon tests were used.
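As a rough illustration of the correlation step, the sketch below computes a Spearman coefficient between NCAPH expression and an immune cell enrichment score using only the Python standard library. It is not the authors' R code, the function names are hypothetical, and no P value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(expression, cell_scores):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(expression), average_ranks(cell_scores))
```

A positive coefficient indicates that samples with higher NCAPH expression tend to have higher enrichment scores for that cell type; a negative coefficient indicates the opposite.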

Risk prognosis model construction and evaluation. All statistical analyses were performed using R (version 3.6.2). The link between clinicopathological features and NCAPH was investigated using logistic regression and the Wilcoxon rank sum test. The clinicopathological variables linked to 10-year OS, disease-specific survival (DSS) and progression-free interval (PFI) in the TCGA database were analyzed using the Kaplan-Meier method and Cox regression. Univariate and multivariate Cox analyses were used to investigate the effect of NCAPH levels and other clinical variables on survival. The cutoff value for NCAPH expression was the median level. P < 0.05 was considered statistically significant. The differences in OS, DSS and PFI between the NCAPH-low and NCAPH-high groups were analyzed by the KM method with a log-rank test. Based on the results of multivariate Cox analysis, independent prognostic indicators were used to generate nomograms for predicting 1-year prognosis. We created nomograms with calibration plots and relevant clinical factors using the rms package (https://cran.r-project.org/web/packages/rms/index.html). The calibration curves were assessed graphically by plotting the nomogram-predicted probabilities against the observed outcomes, with the 45-degree line representing perfect prediction. A concordance index (C-index) was calculated using a bootstrap method with 1,000 resamples and was used to assess the discrimination and predictive accuracy of the nomogram.
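To make the C-index concrete, here is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration under simplifying assumptions: pairs with tied survival times are ignored, the bootstrap resampling mentioned above is not reproduced, and the function name is hypothetical rather than part of the rms package.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event strictly
    before subject j's follow-up time. The pair is concordant when the
    subject who failed earlier has the higher risk score; ties in risk
    score count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect discrimination.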
Experimental verification of the differential expression of NCAPH in LUAD tissues by quantitative polymerase chain reaction (qPCR) and western blotting. From January 2018 to January 2019, 20 matched LUAD tissues and neighboring noncancerous tissues were taken from patients who underwent surgery at the National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital. All cases were pathologically confirmed. The current study was approved by the Ethics Committee of the National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital and conducted according to the Declaration of Helsinki. All patients signed informed consent forms. Total RNA was extracted from tissues using TRIzol reagent (Invitrogen, CA, USA) according to the manufacturer's protocol. The extracted RNA was reverse-transcribed into cDNA using the Takara PrimeScript RT Reagent Kit (Takara, Nanning, China). RT-PCR was performed using a LightCycler 480 Real-time PCR System (Roche, Shanghai, China). 18S rRNA was used as an internal reference to normalize NCAPH expression. The relative expression of NCAPH mRNA was calculated with the 2^-ΔΔCt method. The following primers were used: sense: 5'-ATG TTG CTG ATG GAA GTG -3' and antisense: 5'-GTT CTG CTCA ATA GTT CTGT-3' for NCAPH; sense: 5'-AGG CGC GCA AAT TAC CCA ATCC-3' and antisense: 5'-GCC CTC CAA TTG TTC CTC GTT AAG -3' for 18S rRNA. Total protein was extracted from frozen tissues. Protein concentrations were measured with a BCA protein assay kit. Protein samples were separated on a 10% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel.
The separated proteins were transferred to an Immun-Blot polyvinylidene fluoride membrane (Bio-Rad) using a wet transfer system (Bio-Rad) and then incubated with primary antibody at 4 °C overnight, followed by incubation with horseradish peroxidase-linked anti-rabbit immunoglobulin G (Merck Millipore) at a dilution of 1:10,000 for 1 h at room temperature. The following antibodies were applied in the experiment: NCAPH (1:1000 dilution; Proteintech, China) and β-actin (1:2000 dilution; Proteintech, China). Relative NCAPH protein expression levels were normalized to β-actin.
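The 2^-ΔΔCt calculation used for the qPCR data can be written out as a short Python sketch. The function name and the Ct values in the usage note are hypothetical and serve only to show the arithmetic: the target Ct is first normalized to the reference gene (18S rRNA in this study) within each tissue, and the tumor value is then expressed relative to the matched noncancerous tissue.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Relative expression by the 2^-ΔΔCt method.

    ΔCt  = Ct(target) - Ct(reference), computed per tissue
    ΔΔCt = ΔCt(sample) - ΔCt(control)
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)
```

For example, with hypothetical Ct values of 24 (target) and 10 (reference) in tumor tissue and 26 and 10 in the matched normal tissue, ΔΔCt is -2 and the relative expression is 4, that is, a four-fold higher level in the tumor.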
Cell culture, cell migration assay, cell cycle distribution and apoptosis assay. H2122 and H3122 cell lines were purchased from the IMMOCELL company (Xiamen Immocell Biotechnology Co., Ltd.). The H2122 and H3122 cell lines were cultured in RPMI 1640 medium (GIBCO) supplemented with 10% fetal bovine serum (HyClone). Cells were incubated at 37 °C in a humidified atmosphere containing 5% CO2.
For the cell migration assay, cells were trypsinized, and 1 × 10^4 cells suspended in 200 μL of serum-free medium were seeded in the upper chamber (8-μm pore size; Millipore, Zurich, Switzerland). Subsequently, medium (600 μL) containing 20% FBS was added to the bottom compartment of the chamber. Then, the chamber was placed in an incubator at 37 °C. After incubation for 48 h, the cells were fixed with methanol and stained with 0.1% crystal violet (Sigma-Aldrich). Then, the nonmigrated cells were removed by scraping. Finally, migrated cells were counted under a microscope (Nikon Corporation, Tokyo, Japan). The Transwell invasion assay was similar to the migration assay, except that the upper chamber was coated with Matrigel matrix. These experiments were repeated three times. Relative migration or invasion (%) was calculated as the average number of migrated (invaded) cells in the transfection group divided by the average number of migrated (invaded) cells in the control group, multiplied by 100%.
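The relative migration (or invasion) formula above amounts to a ratio of group means. A minimal Python sketch follows; the function name and the cell counts in the usage note are hypothetical.

```python
from statistics import mean


def relative_migration_percent(transfected_counts, control_counts):
    """Mean migrated (invaded) cells in the transfection group as a
    percentage of the mean in the control group."""
    return mean(transfected_counts) / mean(control_counts) * 100
```

With hypothetical counts of 40, 50 and 60 cells in three transfected replicates and 90, 100 and 110 in the controls, the relative migration is 50%.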
Cell cycle distribution and apoptosis assays were conducted by flow cytometry. H2122 and H3122 cells were digested with trypsin, resuspended in phosphate-buffered saline (PBS) and then fixed with 70% ethanol. Cells were washed with PBS and treated with 100 μg/ml RNase for 30 min. Then, DNA was stained with propidium iodide (50 μg/ml) and analyzed on a FACSCalibur flow cytometer (BD Biosciences, San Jose, CA, USA). Apoptosis was assessed using an Annexin V-FITC/PI apoptosis kit (KeyGen Biotech, Nanjing, China). The samples were measured and analyzed using a flow cytometer (Beckman Coulter, Brea, CA, USA).

Statistical analysis.
Statistical analysis was performed with Student's two-tailed t test using SPSS (version 22). Values of P < 0.05 were considered statistically significant.
Ethics approval and consent to participate. The current study was approved by the Ethics Committee of the National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital and Shenzhen Hospital. Signed informed consent was obtained from all patients.

Identification of NCAPH-associated DEGs in LUAD. DEG analysis involved 267 NCAPH-high LUAD samples and 268 NCAPH-low samples (control group). A total of 1592 DEGs were identified, including 1167 upregulated genes and 425 downregulated genes (adjusted P value < 0.05, |log2 fold change| > 1.5) (Fig. 1H). The DEGs were identified from the HTSeq-Counts data with the DESeq2 package. The top 20 DEGs between the two groups are presented in Fig. 1I.

Functional enrichment analysis of NCAPH-related genes in LUAD.
To further study the functional enrichment information of NCAPH-related genes, Metascape was utilized for GO enrichment analysis. NCAPH-related genes play roles in various biological processes (BPs), cellular components (CCs) and molecular functions (MFs) [...]
Correlations between the expression of NCAPH and clinicopathologic characteristics. Data from 513 patients and NCAPH expression data were collected from TCGA to explore the relationship between NCAPH expression and clinicopathologic parameters. [...] (Fig. 5A). In addition, the PFI of patients with high NCAPH expression was significantly shorter than that of patients with low expression, with medians of 48.5 months versus 89.4 months (Fig. 5B). Furthermore, the DSS of patients with high NCAPH expression was significantly poorer than that of patients with low expression, with medians of 28.8 months versus 73.9 months (Fig. 5C). Finally, we performed a subgroup analysis of prognosis, which showed that the survival of patients with high NCAPH levels was poor in the T1&T2, T3&T4, N0&N1, M0 and M1 groups (Fig. 5D-H). To further evaluate the role of NCAPH in LUAD prognosis, multivariate regression was applied with T stage, N stage, M stage, pathologic stage, primary therapy outcome, sex and smoking status. In multivariate analysis, high NCAPH expression remained an independent poor prognostic factor (Table 3).
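The median survival times quoted above come from Kaplan-Meier curves. As an illustration of how such a median is read off, the following Python sketch implements the product-limit estimator with exact fractions; it is not the authors' R code, and the function names are hypothetical.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, survival probability).

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


def median_survival(times, events):
    """First time at which estimated survival drops to 0.5 or below.

    Returns None when the curve never reaches 0.5 (median not reached).
    """
    for time, survival in kaplan_meier(times, events):
        if survival <= Fraction(1, 2):
            return time
    return None
```

Comparing the medians of the NCAPH-high and NCAPH-low groups in this way gives the kind of contrast reported above; the log-rank test used to compare the curves is not shown.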

NCAPH-related prognostic nomogram.
To evaluate the prognostic value of NCAPH in LUAD, we established a nomogram and a risk classification system for predicting 1-year survival (Fig. 6A). Variables in the nomogram were selected according to their clinical relevance and the multivariate Cox analysis results. Each variable was assigned points on a scale of 1 to 100, and the points were summed to give a total score. By drawing a vertical line down from the total score axis to the outcome axis, the probable prognosis of each LUAD patient at 1 year was determined. For example, a LUAD patient with high NCAPH expression (56 points), T3&T4 (98 points), N2&N3 (100 points), and a primary therapy outcome (100 points) who is a smoker (30 points) had a total score of 384 points. The probability of 1-year survival was approximately 56% (Fig. 6A). The performance of the nomogram was also evaluated, and the results showed that its predictions were moderately accurate (Fig. 6B).
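The worked example above is a simple sum of per-variable points, which can be checked with a few lines of Python. The dictionary keys are descriptive labels chosen for illustration; the point values are those quoted in the text. Mapping the total score to a survival probability requires the nomogram axis in Fig. 6A and is not reproduced here.

```python
# Points quoted in the text for the example patient.
example_points = {
    "NCAPH expression: high": 56,
    "T stage: T3&T4": 98,
    "N stage: N2&N3": 100,
    "Primary therapy outcome": 100,
    "Smoker": 30,
}


def total_points(points):
    """Sum the nomogram points assigned to each variable."""
    return sum(points.values())


print(total_points(example_points))  # 384
```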

Knockdown of NCAPH suppresses the malignant phenotype of lung adenocarcinoma in vitro.
The H2122 and H3122 cell lines were chosen to investigate the role of NCAPH in LUAD. Three NCAPH small interfering RNAs (siRNAs) were transfected into cells, and NCAPH mRNA expression was measured to evaluate their knockdown efficiency. The siRNA with the greatest knockdown efficiency was selected for further experiments. The MTT assay data indicated that the siRNA targeting NCAPH significantly reduced cell growth rates. Transwell assays revealed that NCAPH-targeted siRNA transfection notably reduced migration and invasion in both cell lines. Flow cytometry analysis of the cell cycle distribution showed an increase in the G1/G0 cell population in NCAPH siRNA-transfected cells of both cell lines. In addition, the apoptosis of H2122 and H3122 cells was markedly increased in the NCAPH siRNA treatment group, as shown by Annexin V-FITC/PI double staining. These data are shown in Figs. 7 and 8.

Discussion
Condensin is a highly conserved multiprotein complex that regulates chromosomal assembly and separation during mitosis 13. Condensin I and II are two forms of condensin complexes found in many eukaryotic cells, and both share the same pair of structural maintenance of chromosomes (SMC) subunits, SMC2 and SMC4. The condensin I complex comprises the SMC2-SMC4 proteins and three non-SMC proteins, including subunits H (NCAPH), G (NCAPG) and D2 (NCAPD2) 14. Previous research showed that phosphorylation of NCAPH at Ser70 by Aurora B kinase was indispensable for the recruitment of condensin I to mitotic chromosomes 15. Intriguingly, bioinformatics analyses of potential molecular mechanisms have shown that NCAPH is a key gene involved in lung and prostate tumorigenesis 16,17. Although a previous study found that overexpression of NCAPH was associated with LUAD pathogenesis, the function and molecular mechanism remain unclear. Herein, we attempted to elucidate the oncogenic impact of NCAPH in LUAD development by assessing its expression levels and prognostic value. We found that NCAPH expression was increased in various tumors, including LUAD, in public databases. Furthermore, NCAPH may serve as a good diagnostic biomarker for LUAD, with an area under the ROC curve (AUC) of 0.967. Overall, NCAPH was differentially expressed between tumor and normal samples. Further studies are needed to fully establish the diagnostic value of NCAPH in LUAD.
To further study the functional enrichment information of NCAPH-related genes, Metascape was utilized for GO enrichment analysis. NCAPH-related genes were involved in many BPs, CCs and MFs, including distal axon, neuron projection terminus, axon terminus, integrator complex, pre-miRNA processing, RNA 3'-end processing and miRNA catabolic process.
We also revealed that the G2/M checkpoint, ncRNA metabolic process, memory B cells, KRAS signaling, E2F targets and MIER1 process were significantly associated with NCAPH expression. In a previous in vitro study, cell proliferation, cell cycle progression, colony formation, migration and invasion were inhibited when NCAPH was knocked down 18, which directly supports our results. The KRAS signaling pathway and E2F have been shown to play an important role in the progression and development of LUAD 19–21. These studies and our results indicate that NCAPH might contribute to LUAD initiation and development by modulating E2F, the cell cycle and the KRAS pathway. The associations of NCAPH expression with memory B cells, ncRNA metabolism and MIER1 process are reported here for the first time. The underlying molecular mechanisms need further research.
Previous clinical studies found that tumor-infiltrating lymphocytes (TILs) had a vital impact on several cancers 22–24. Strong infiltration of TILs was associated with a positive clinical outcome in several cancers, including lung cancer 25. Our study showed that NCAPH was positively associated with the level of adaptive immune cells (Th2 cells) and negatively correlated with the abundance of innate immune cells (mast cells). Th2 cells are defined by the expression of their signature cytokines IL-4, IL-5, and IL-13, which are important components in the defense against extracellular pathogens 26. Cytokines produced by Th1 cells, such as IFN-γ, TNF-α and IL-2, were found to be vital factors in the inhibition of tumor growth 27,28. In contrast, the cytokines IL-10, IL-4 and TGF-β from Th2 cells were shown to promote tumor cell dissemination and metastasis in various cancers 29. Therefore, maintaining the Th1/Th2 immune cell balance is considered to be critical. Enhancing the Th1 response and inhibiting the Th2 effect may help to prevent disseminated cancer cells, recurrence and metastasis 30. Consistent with these findings, our results revealed that NCAPH may be associated with the Th2 immune response in LUAD.
In several studies, mast cells were found to be a predictor of poor outcome 31–33. However, a study of 175 patients with NSCLC demonstrated that mast cell presence was a good prognostic factor 34. The prognostic role of mast cells is therefore still uncertain. The association of NCAPH expression with mast cells is reported here for the first time. Our research indicated that high NCAPH expression is associated with clinicopathological characteristics and poor prognosis in LUAD. NCAPH expression was significantly related to T stage, N stage, M stage, pathological stage and smoking status. In univariate analysis, the OS of LUAD patients with high NCAPH expression was significantly shorter. After adjusting for clinicopathological factors, our study found that NCAPH could act as an independent predictive factor for a poor prognosis of LUAD. Then, we constructed a clinical nomogram with NCAPH expression and other clinical factors. Based on the calibration plot, there was favorable consistency between the actual and predicted values for 1-year OS. Our model could be a new method to estimate prognosis in the future.
We also investigated the function of NCAPH in the proliferation, invasion, migration, cell cycle progression and apoptosis of LUAD cells in vitro. The malignant phenotype of LUAD cells was inhibited when NCAPH was knocked down.
This study is the first to report that high NCAPH expression is significantly associated with poor survival and immune infiltration in LUAD, and that NCAPH might promote tumorigenesis through abnormal inflammation and immune responses. NCAPH may be a potential factor for predicting prognosis and a new biomarker. The in vitro study demonstrates that NCAPH may function as an oncogene in LUAD.