Introduction

Renal cell carcinoma (RCC) is the most common type of genitourinary cancer. It has a mortality rate of 30–40% and is more common in men than1 in women. The incidence of renal cancer varies between countries and regions of the world, with developed countries having a higher incidence than developing countries2. The incidence of kidney cancer varies in different countries and regions of the world. In recent years, the incidence rate of kidney cancer has shown a rising trend year by year, and the age of onset of the disease is mostly concentrated in the age of 50–70 years old, and with the popularization of health checkups, there is a tendency for the age of onset of kidney cancer patients to advance3. At present, surgery is the main treatment measure for kidney cancer patients, kidney cancer patients are not sensitive to radiotherapy or chemotherapy, and there is a lack of effective treatment means for patients with advanced kidney cancer who cannot be resected4. Patients with advanced kidney cancer who cannot undergo resection lack effective treatment. Epidemiologic investigations have shown that approximately 3% of RCC patients have a family history with an autosomal dominant pattern, suggesting that genetic factors play an important role in RCC susceptibility1. Among the various genetic factors, the role of gene polymorphisms is gradually being recognized. Tong5, in a meta-analysis, reported that the SDF-1ɑ G801A gene polymorphism strongly increased cancer risk in Asians. Hirata et al.6 also revealed an association between the XRCC1 399Gln polymorphism and RCC risk.

Based on the above evidence, this study used a case‒control method to analyze the relationship between SDF-1ɑ(also known as the chemokine CXCL12) or XRCC1 and RCC and further analyzed the expression of the SDF-1ɑ and XRCC1 genes in RCC via bioinformatics methods, which can provide new clues and references for the active prevention and treatment of RCC.

Objects and methods

Bioinformatics methods

The Cancer Genome Atlas (TCGA) database7 was used to download SDF-1ɑ and XRCC1 gene expression data and corresponding clinical information from renal cancer patients; compare the expression levels of the target genes in tumor tissues and normal tissues; and perform Cox regression to analyze the relationship between SDF-1ɑ and overall survival. P < 0.05 indicated that the difference was statistically significant. Gene Ontology (GO) functional enrichment analysis8 and gene set enrichment analysis (GSEA)9 were performed on the SDF-1ɑ gene to determine the signaling pathway through which SDF-1ɑ may participate in and explore the possible mechanism of the SDF-1ɑ gene's involvement in the development and development of renal cancer.

XRCC1 and SDF-1ɑ(CXCL12) expression data were obtained from 542 cancer tissue samples and 72 paracancerous tissue samples using The Cancer Genome Atlas (TCGA) transcriptome RNA-seq data and normal tissue data were simultaneously log2 transformed to match the differential expression information between tumor and normal tissues. Differences in SDF-1ɑ expression were explored using the R package “limma”, and p < 0.05 was considered to indicate statistical significance.

After downloading and organizing the differential expression data for the SDF-1ɑ gene from the TCGA database, we explored the potential functional enrichment of SDF-1ɑ in RCC by using the R packages “enrichplot”, “ggplot2”, and “circlize” and selecting the significantly enriched functional categories according to p-val < 0.05 and FDR q-val < 0.5. To explore the potential functional enrichment of SDF-1ɑ in RCC, we selected the significantly enriched functional categories based on p-val < 0.05 and FDR q-val < 0.5 to obtain the proportion and significance level of SDF-1ɑ in the BP, CC, and MF functional categories.

Download the c2.cp.kegg.v7.4.symbols.gmt file from the GSEA website and explore the potential signaling pathways of SDF-1ɑ in RCC through the R package “enrichplot”. The top 5 signaling pathways that were significantly enriched were selected based on the normalized enrichment score (NES), NOM p value < 0.05 and FDR q value < 0.05 and were divided into two groups according to the median expression of SDF-1ɑ, namely, the high-expression group and the low-expression group.

Research objects and methods

Study subjects and research methods A total of 166 new RCC patients and 166 healthy individuals who attended or underwent physical examination at the Affiliated Hospital of Inner Mongolia Medical University and People’s Hospital from December 2017 to December 2021 were selected and paired according to the same sex, with an age difference of no more than 5 years. In addition, detailed information about the patients was obtained by reviewing the electronic cases and face-to-face interviews and completing the clinical data questionnaires. The study subjects signed an informed consent form. Improved multiple ligase detection reaction (iMLDR) was used to analyze the gene loci of the patients and controls10. Genotyping was performed on cases and controls.

Ethical approval and participatory consent

The study was fully reviewed and approved by the Medical Ethics Committee of Inner Mongolia Medical University (YKD202102003) and was conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants after the nature of the research principles, including confidentiality and freedom of choice, were explained.

Main enzymes and reagents

The whole-blood DNA rapid extraction kit was produced by Beijing Tiangen Technology Co., Ltd., tag DNA polymerase was produced by Qiagen, DNA marker was produced by New England Biolabs, deoxyribose hexoside triphosphate was produced by Gener-ay Biotech, and the primers were synthesized by Sangyo Bioengineering (Shanghai) Co. The primers were synthesized by Sangyo Bioengineering (Shanghai) Co. iMLDR Multiplex was used for typing by Tianhao Biotechnology (Shanghai) Co.

Research methodology

Peripheral blood specimen collection and preservation All the study subjects fasted for 6–8 h. After fasting, 2 ml of venous EDTA was collected, and an inert separator gel was used to promote coagulation; after centrifugation and collection of serum and blood cells, the samples were frozen at − 80 °C.

Extraction of human genomic DNA In total, 100 μl of whole blood was taken, and DNA was extracted in strict accordance with the instructions of the kit. The DNA was quantified via a UV spectrophotometer and stored at – 80 ℃ for spare use.

Genotype detection The blood genomic DNA of the study subjects was extracted, polymerase chain reaction primers were designed, and iMLDR was used to genotype the SDF-1ɑ and XRCC1 gene loci.

Statistical methods

The raw data were imported into EpiData 3.1 software using the principle of double entry, and a database was established after organization. SPSS 25.0 software was used to analyze the data. The χ2 test was used to analyze whether the genotype frequency distribution of individual loci in the population complied with the Hardy‒Weinberg (H–W) balance test, and the t test or χ2 test was used to compare the general conditions and lifestyle behavior of cases and controls according to the nature of the indexes. The multifactorial conditional logistic regression model was used to analyze the correlation between genetic polymorphisms and lifestyle habits and the risk of developing RCC. The correlation between gene polymorphisms and lifestyle behaviors and the risk of developing RCC was analyzed using a multifactorial conditional logistic regression model, and the interaction between genes and the environment was analyzed using the generalized multifactor dimensionality reduction (GMDR) method11. Generalized multifactor dimensional reduction (GMDR) was used to analyze the interaction between genes and the environment.

Results

Exploring the effect of SDF-1ɑ and XRCC1 expression on RCC based on raw letter analysis

Differences in the expression of SDF-1ɑ and XRCC1 in cancerous and normal tissues

The results showed that the expression of SDF-1ɑ (CXCL12) in normal tissues was significantly greater than that in cancerous tissues (Fig. 1A,B). The expression of XRCC1 in cancerous tissues was significantly greater than that in paracancerous tissues (P < 0.001) (Fig. 2A,B).

Figure 1
figure 1

Expression of SDF-1ɑ (CXCL12) in RCC. (A) Expression of SDF-1ɑ (CXCL12) in cancerous tissue and normal tissue. (B) Expression of SDF-1ɑ (CXCL12) in paired tissue.

Figure 2
figure 2

XRCC1 expression in RCC (A) XRCC1 expression in cancerous and normal tissues (B) XRCC1 expression in paired tissues.

Survival analysis

Survival analysis was performed using kidney cancer samples from the TCGA public database. Univariate Cox analysis revealed that SDF-1ɑ (CXCL12) expression, age, histologic grade, and clinical stage were associated with overall survival in renal cancer patients, and subsequent multivariate Cox analysis revealed that age, histologic grade, and clinical stage were independent risk factors for overall survival in renal cancer patients; moreover, high SDF-1ɑ (CXCL12) expression was a protective factor (Fig. 3A,B).

Figure 3
figure 3

Relationships between SDF-1ɑ (CXCL12) expression and overall survival (A) One-way Cox regression analysis (B) Multifactor Cox regression analysis.

GO enrichment analysis

GO enrichment analysis was performed on the SDF-1ɑ (CXCL12) gene, and the relevant biological processes were plotted as bubble plots according to the degree of enrichment (Fig. 4). GO enrichment analysis of BP was mainly focused on cellular cation homeostasis, regulation of membrane potential, cell recognition, cell‒cell recognition, and sperm-egg recognition. CC enrichment was mainly focused on collagen-containing extracellular matrix, anchored component of membrane, membrane raft, membrane microdomain, and Golgi lumen. MF enrichment was mainly related to channel activity and passive transmembrane transporter activity, ion channel activity, glycosaminoglycan binding, and extracellular matrix structural constituents.

Figure 4
figure 4

GO function set analysis of the SDF-1ɑ (CXCL12) gene.

GSEA enrichment analysis

GSEA of high and low SDF-1ɑ(CXCL12) expression data sets revealed that allograft rejection, ECM receptor interaction, Leishmania infection, Leishmania infection, and viral myocarditis in SDF-1ɑ(CXCL12) was significantly enriched in the high-expression group (Fig. 5).

Figure 5
figure 5

GSEA enrichment map of the SDF-1ɑ (CXCL12) gene.

Comparison of general information between the case and control groups

Smoking history, drinking history, education level, pesticide exposure history, food saltiness, fruit intake, meat and vegetable combinations, type of cooking oil, milk consumption, type of drinking water, hair dyeing, labor intensity, and urine holding were significantly different between the two groups (p < 0.05); BMI and ethnicity were not significantly different between the two groups (p > 0.05), as shown in Tables 1 and 2.

Table 1 Comparison of demographic data between the two groups (x ± s).
Table 2 Comparison of life history data between the two groups (x ± s).

Hardy‒Weinberg equilibrium test

Allele frequencies and genotype frequencies in the population did not change with each generation. The rs1801157 genotype of the SDF-1ɑ gene and the rs1799782 genotype of the XRCC1 gene obeyed the H-W equilibrium test (P > 0. 05) in this study, indicating that the samples could better replace the population and conformed to the law of genetic equilibrium of the population, as shown in Tables 3 and 4.

Table 3 Hardy‒Weinberg equilibrium test for the SDF-1ɑ gene polymorphism at the rs1801157 locus.
Table 4 Hardy‒Weinberg equilibrium test for the rs1799782 polymorphism locus of the XRCCI gene.

Relationships between polymorphic SDF-1ɑ and XRCC1 gene loci and clinicopathologic features of RCC

According to the WHO12 pathohistological types of RCC, patients with RCC were grouped according to different categories, including clear cell renal tumors, papillary renal tumors, oncocytic and chromophobe renal tumors, and molecularly defined renal carcinomas. After sorting the collected samples, we found that the samples of collected ductal tumors, other kidney tumors and molecularly defined kidney cancer accounted for a small proportion of the samples; therefore, we unified these three categories into other tumors, and the distributions of alleles and genotypes of each locus were compared with each other among the subgroups with or without differences. We classified the pathological sections of the study subjects into histological and TNM grades according to WHO standards and compared the differences between them. The results showed that the rs1799782 genotype (G:A) of the XRCC1 gene was significantly different between different pathohistological types of RCC (p < 0.05), whereas no statistically significant difference was found between the rs1801157 genotype of the SDF-1ɑ gene and clinicopathological features, as shown in Tables 5 and 6.

Table 5 Allele/genotype frequencies of the rs17997821 locus of the XRCC1 gene vs. Relationships between clinicopathologic features of RCC.
Table 6 Allele/genotype frequencies of the rs1801157 locus of the SDF-1ɑ gene vs. Relationships between clinicopathologic features of RCC.

Association of SDF-1ɑ and XRCC1 gene expression and lifestyle habits with kidney cancer

With the presence of RCC as the dependent variable, the rs17997821 locus of the XRCC1 gene, the rs1801157 locus of the SDF-1ɑ gene, smoking status, literacy level, holding urine, dyeing hair, exposure to pesticides, type of meat–vegetable combination, and labor intensity as the independent variables, the conditional logistic regression model was applied to investigate the associations between the polymorphisms of the SDF-1ɑ and XRCC1 genes and between lifestyle habits and RCC. Smoking, holding urine, and dyeing hair were risk factors for RCC, and regular intake of fruit was a protective factor against RCC, as shown in Table 7.

Table 7 Results of the RCC multifactor conditional logistic regression analysis model analysis.

Gene‒environment interactions and the risk of developing RCC

Using the GMDR method13, environmental factors were incorporated into the model for analysis, and the results showed that the rs1799782 locus of the XRCC1 gene in RCC interacted with smoking, alcohol consumption, pesticide exposure, hair dyeing, and urine holding, with a cross-consistency of 100%, an accuracy of 0.6854 in the training samples, and an accuracy of 0.6276 in the test samples, with P = 0.0010, suggesting that rs1799782 interacted with smoking, alcohol consumption, pesticide exposure, hair coloring, and urine holding had an interaction effect; i.e., the coexistence of these five factors could increase the risk of RCC by 2.94 times (OR = 2.94, 95% CI 1.23–7.02) (Table 8).

Table 8 GMDR analysis of RCC gene‒environment interaction models.

Discussion

According to data provided by the World Health Organization, there are more than 140,000 RCC-related deaths each year, and RCC is the 13th most common cause of cancer death worldwide14. Although early diagnosis of RCC is critical for treating patients and reducing mortality, optimal screening modalities and methods have not been established1; therefore, exploring the underlying molecular mechanisms of RCC and identifying biomarkers associated with early diagnosis, treatment, and prognosis are important for guiding clinically individualized treatment regimens.

Stromal cell-derived factor-1 (SDF-1ɑ), also known as chemokine CXCL12 (C-X-C chemokine ligand 12), is a class of cytokines with chemotactic activity. Moreover, targeting the CXCL12-mediated axis can control tumors and the tumor microenvironment by playing dual roles inanti-tumor15, Zhou16 demonstrated that the CXCL12/CXCR4 axis can be used as a molecular target for cancer therapy. Human X-ray cross-complementary repair gene 1 (XRCC1) is an important DNA damage repair gene and is one of the key genes in the base excision repair (BER) pathway. Polymorphisms in the XRCC1 gene have been reported to be associated with susceptibility to a variety of cancers17,18,19,20,21,22. Akhmadishina et al.22 found that the XRCC1 gene polymorphism c.839 G > A may promote early and advanced renal cancer. Therefore, this study analyzed and explored the relationship between genetic polymorphisms and RCC susceptibility from a genetic point of view to provide scientific biomarkers for the effective prevention of RCC and analyzed and studied the associations of SDF-1ɑ and XRCC1 gene polymorphisms with lifestyle habits in RCC to provide a scientific basis for the effective prevention of RCC development.

In this study, by analyzing the differences in the expression of SDF-1ɑ and XRCC1 between tumor tissues and normal tissues, we found that the expression of SDF-1ɑ was increased in normal tissues and that the expression of XRCC1 was increased in RCC tissues. Analysis of the survival data of RCC patients in the TCGA database showed that high SDF-1ɑ expression may play a role in the treatment of RCC and may have a positive antitumor immune effect. SDF-1ɑ has been shown to play a protective role against acute kidney injury (e.g., ischemia/reperfusion injury)23. A study by Armstrong et al.24 showed that SDF-1ɑ was the only vascular factor reduced by low-risk disease and that SDF-1ɑ was associated with improved survival, which is consistent with our study.

GO and GSEA of SDF-1ɑ revealed that the genes enriched in the GO enrichment analysis were involved mainly in channel activity, passive transmembrane transporter protein activity, ion channel activity, glycosaminoglycan binding, and structural components of the extracellular matrix, while the genes enriched in the GSEA were involved mainly in allografts, extracellular matrix receptor action, and the leishmaniasis pathway. Numerous studies have shown that extracellular matrix elasticity or stiffness affects fundamental cellular processes, including cell spreading, growth, proliferation, migration, differentiation and organoid formation25. Najafi et al. stated that cancer cell proliferation, migration and invasion, as well as angiogenesis, are consequences of the stiffness and degradation of the extracellular matrix and that establishing homeostasis in the extracellular matrix (ECM) to facilitate the penetration of chemotherapeutic agents and improve the efficacy of antitumor therapies would be a promising strategy26. In his review, Anderson showed that ion channels are important for the regulation of tissues and cancer stem cells and that the effects of ion channels on cancer cellular processes lead to cancers described as channelopathies27. However, there are no studies on the association between SDF-1ɑ and the RCC signaling pathway. Our results suggest that SDF-1ɑ may influence the progression of RCC by modulating multiple signaling pathways.

In the present study, smoking, urine holding and hair coloring were found to be risk factors for RCC, and regular intake of fruits was found to be a protective factor against RCC. Macleod LC found an independent correlation between smoking and RCC in the VITAL study28. Similarly, Lotan Y et al. in the PLCO trial, confirmed that smoking intensity was significantly associated with a high risk of RCC29, the present study found that the risk of RCC in smokers was 1.85 times higher than that in nonsmokers by Logistic analysis (OR = 1.852; 95% CI 1.105, 3.103), which is in line with the results of previous studies. The present study showed that there was no statistically significant correlation between alcohol consumption and susceptibility to RCC (all P > 0. 05), but moderate alcohol consumption has been shown to have a protective effect on the incidence of RCC30,31. However, these findings are not consistent with our findings, and we believe that this may be due to sampling errors or the insufficient number and type of samples analyzed in this study. These limitations resulted in a lack of statistically significant associations between alcohol consumption and susceptibility to RCC and may need to be further verified in a study with a larger sample size.

The genetic polymorphisms of the rs1799782 locus of the XRCC1 gene and the rs1801157 locus of the SDF-1ɑ gene showed that patients with each genotype had fewer lymph node metastases and distant metastases; the clinical stages were mostly stage I and II; and the pathohistological type was mostly clear-cell renal cell carcinoma. Although the rs1799782 locus of the XRCC1 gene was detected, the rs1801157 locus of the SDF-1ɑ gene and genotype were not correlated with tumor location or stage. Although the rs1799782 locus of the XRCC1 gene, the rs1801157 locus of the SDF-1ɑ gene allele and the genotype were not significantly correlated with tumor location or stage, the difference in rs1799782 of the XRCC1 gene among the different pathohistological types may be key factors in the onset and pathological development of renal cell carcinoma.

Multifactor dimensionality reduction (MDR) is a nonparametric, higher-order interaction analysis method developed in recent years without the need for a genetic model, while the extension method generalized multifactor dimensionality reduction (GMDR), based on the basic principles of MDR, makes it possible not only to analyze continuous variables but also to include covariates, thus controlling the interference caused by covariates and improving the accuracy of prediction32. The GMDR model enables not only the analysis of continuous variables but also the incorporation of covariates, thus controlling the interference caused by covariates and improving the accuracy of prediction13. In this study, the GMDR method was used to analyze the interaction between genes and the environment. The results showed that the rs1799782 locus of the XRCC1 gene in RCC interacted with smoking, alcohol consumption, pesticide exposure, hair dyeing, and urine holding.

In summary, although this study has several limitations, such as a small sample size and admission rate bias due to the two-way selectivity between patients and hospitals, which results in cases and controls not being a random sample of the whole population, and the lack of basic validation experiments on the SDF-1ɑ and XRCC1 genes, this study confirmed the downregulation of SDF-1ɑ expression and upregulation of XRCC1 expression in kidney cancer tissues by bioinformatic methods and integrated information from the TCGA database. However, this study confirmed that SDF-1ɑ expression was downregulated and that XRCC1 expression was upregulated in renal cancer tissues by bioinformatics methods and integrated information from the TCGA database. High SDF-1ɑ expression may be a protective factor for overall survival in renal cancer patients, providing a basis for clinical experimental studies. In future studies, we hope to further expand the sample size of clinical studies and further conduct basic studies in vivo and in vitro to further corroborate the possible molecular mechanism of the SDF-1ɑ gene in the development of renal cancer.