Spectroscopic evaluation of carcinogenesis in endometrial cancer

Carcinogenesis is a multifaceted process of cancer formation. The transformation of normal cells into cancerous ones may be difficult to determine at a very early stage. Therefore, methods enabling identification of the initial changes caused by cancer require novel approaches. Although physical spectroscopic methods such as FT-Raman and Fourier Transform InfraRed (FTIR) spectroscopy are used to detect chemical changes in cancer tissues, their potential has not been investigated with respect to carcinogenesis. The study aimed to evaluate the usefulness of FT-Raman and FTIR spectroscopy as diagnostic methods for endometrial cancer carcinogenesis. The results indicated that the development of endometrial cancer was accompanied by chemical changes in nucleic acids, amide I and lipids in Raman spectra. FTIR spectra showed that tissues with developing carcinogenesis were characterized by changes in carbohydrate and amide vibrations. Principal component analysis and hierarchical cluster analysis of Raman spectra demonstrated similarity between tissues with cancer cells and lesions considered precursors of cancer (complex atypical hyperplasia); both, however, differed from the control samples. The Pearson correlation test showed correlation between cancer and complex atypical hyperplasia tissues and between non-cancerous tissue samples. The results of the study indicate that Raman spectroscopy is more effective than FTIR in assessing the development of carcinogenesis in endometrial cancer.

www.nature.com/scientificreports/
the methods used for marking biomarkers are expensive. Moreover, many biomarkers are nonspecific 9-11 . Therefore, it is very important to find non-invasive and cost-effective tests that can be used as novel diagnostic methods. Consequently, it is of prime importance to conduct interdisciplinary studies, e.g. in physics, to develop methods that can be used in the diagnostics of cancer 4,12-14 . Raman and FTIR spectroscopy are among these methods. They allow chemical characterization of the studied samples. Moreover, Raman and FTIR spectroscopy are non-invasive, simple, accurate and rapid. These non-destructive testing methods do not cause any damage to the evaluated materials. Spectroscopy could be used alongside gold standard methods to support clinical decision-making and improve patient outcomes by detecting biochemical changes in tissues before the changes can be observed under a microscope. Nowadays, Raman and FTIR spectroscopy are used to detect chemical changes caused by lung 12,15 , bone 16,17 , thyroid 13,18 , breast 19 , prostate 20 and endometrial 4 cancers. However, recognizing the changes visible during the neoplastic process is also important in cancer diagnostics. Changes arising in the subsequent stages of cancer should be investigated to understand how neoplastic disease develops. For this purpose, Raman and FTIR spectroscopy can be used to obtain information about chemical changes. Importantly, our previous study demonstrated the potential of FT-Raman and FTIR spectroscopy for the identification of chemical changes in atypical complex hyperplasia and endometrioid adenocarcinoma 4 . To the best of our knowledge, this is the first study applying physical methods such as FT-Raman and FTIR spectroscopy with multidimensional analysis and a correlation test to evaluate carcinogenesis in endometrial cancer. 
The paper also attempts to determine which spectroscopic method is more effective in recognizing the different stages of carcinogenesis.

Results
Raman spectra of all analyzed groups of samples are presented in Fig. 1. In the control spectrum, vibrations from nucleic acids (812 cm−1, 1065 cm−1, 1293 cm−1), proline and hydroxyproline (890 cm−1), tryptophan (1359 cm−1), proteins (1553 cm−1, 1598 cm−1), amide I (1695 cm−1), proteins and lipids (1447 cm−1), lipids only (1776 cm−1, 2798 cm−1, 2872 cm−1) and water (3087 cm−1) were visible. The tissues from the other study groups differed in which Raman shifts were present. OH vibrations were absent in all study groups compared to the control group. Moreover, peaks corresponding to proline, hydroxyproline, tyrosine, PO2− stretching of nucleic acids and C=C, tryptophan (protein assignment) disappeared in the Raman spectra of tissues from the atrophic endometrium and complex atypical hyperplasia groups. Furthermore, the peaks at 1011 cm−1 and 1598 cm−1 were also absent from the Raman spectra of atrophic endometrium and endometrial polyp. The Raman shift originating from C-C stretching of proline and hydroxyproline disappeared in the spectrum of endometrial polyp.
Figure 1. Raman spectra of samples: control (black spectrum); atrophic endometrium (orange spectrum); complex atypical hyperplasia (red spectrum); endometrial polyp (green spectrum); endometrioid adenocarcinoma (blue spectrum).
Comparison of the study groups with the control group revealed both disappearing peaks and peak shifts. A significant shift of the peak corresponding to CH3 stretching from lipids was visible in the Raman spectrum of atrophic endometrium. Moreover, shifts of the peaks originating from C-C stretching of proline and hydroxyproline, amide I and CH3 stretching from lipids were noticed in the Raman spectrum of complex atypical hyperplasia. In the spectrum of endometrial polyp, a significant shift of the peaks corresponding to amide I and C=O stretching vibrations from lipids was observed. Compared to the control group, the Raman spectrum of endometrioid adenocarcinoma showed the most pronounced shifts and the largest number of shifted peaks. Significant shifts of the peaks originating from proline, hydroxyproline, tyrosine, PO2− stretching of nucleic acids, C-C stretching of proline and hydroxyproline, phosphodiester groups in nucleic acids, tryptophan, amide I and C=O stretching from lipids were visible in the Raman spectrum of endometrioid adenocarcinoma. All analyzed Raman peaks and their shifts are presented in Table 1.
FTIR spectra of all analyzed groups of samples are presented in Fig. 2, which demonstrates that all study group samples differed from the control. In the FTIR spectrum of atrophic endometrium (orange plot), a significant shift of the peaks corresponding to amide II, amide I and the CH2 and CH3 group vibrations of lipids was noticed. Moreover, a lack of amide III vibrations and the presence of the C-O group of carbohydrates were observed in comparison with the control group samples. The FTIR spectrum of complex atypical hyperplasia (red plot) was characterized by shifts of the peaks originating from the C-O stretching mode of serine, threonine, and tyrosine, C-O deformation vibrations of proteins, glycogen, carbohydrates, all three amide modes and N-H vibrations of cytosine. The absence of CH2 wagging of proline (amino acids and collagen) was also noticed. Comparison of the FTIR spectrum of endometrial polyp (green plot) with the FTIR spectrum of the control group samples (black plot) revealed the presence of the C-O group from carbohydrates and the absence of the C-O stretching mode from serine, threonine, and tyrosine of protein and of the N-H group of cytosine. Moreover, a significant shift of the peaks at 1168 cm−1, 1245 cm−1, 1560 cm−1 and 2842 cm−1 was observed. Finally, in the FTIR spectrum of endometrioid adenocarcinoma (blue plot), a shift of the peaks corresponding to the C-O stretching mode of C-OH groups from serine, threonine, tyrosine, amide III, amide II and amide I, and the absence of the N-H deformation of the cytosine group, were visible in comparison to the FTIR spectrum of the control group. The analyzed peaks present in the FTIR spectra are described in Table 2.
Differences in absorbance and Raman intensity were visible in Figs. 1 and 2. Therefore, average values of these parameters were calculated for all analyzed groups of tissue samples (Fig. 3). Figure 3 presents mean ± SEM intensity values of peaks from the Raman (Fig. 3a) and FTIR (Fig. 3b) spectra. Only statistically significant differences were considered. Figure 3a … samples, polyp tissue ones did not show any differences at 1011 cm−1 and 1598 cm−1. Figure 3b demonstrated that statistically significant differences between groups were observed for only two of the analyzed peaks: 1073 cm−1 and 1125 cm−1. These differences were visible when we compared control and atrophic endometrium, endometrial polyp and control and complex atypical hyperplasia, endometrial polyp. Moreover, statistically significant differences between atrophic endometrium and complex atypical hyperplasia were noticed for the peak at 1073 cm−1. For the peak at 1125 cm−1, statistically significant differences were observed between the endometrial polyp and complex atypical hyperplasia samples and the control ones. The variation of the FT-Raman and FTIR spectra among the types of samples, and their similarity, were analysed with principal component analysis (PCA) and hierarchical cluster analysis (HCA), Fig. 4. All analyzed Raman peaks showed statistically significant differences, while in the FTIR spectra only two peaks did. Therefore, PCA was performed only on the statistically significant data, which can be further used to differentiate various types of endometrial tissue samples.
Table 1. Raman shift with corresponding vibrations described in the Raman spectra from Fig. 1 20-24 . Bold means statistically significant shift.
PCA of the average spectra (Fig. 4a) showed that the Raman spectra of the atrophic endometrium, atypical complex hyperplasia, endometrial polyp and endometrioid adenocarcinoma groups differed from the control group. However, similarities between the Raman spectra of the endometrial polyp and endometrioid adenocarcinoma groups were visible. These observations agreed with the HCA (Fig. 4b), which showed similarity between these two types of samples. Moreover, HCA also showed that the Raman spectrum of the atrophic endometrium group was similar to that of complex atypical hyperplasia. PCA of the average FTIR spectra (Fig. 4c) demonstrated that the FTIR spectra of complex atypical hyperplasia and endometrioid adenocarcinoma were placed in the same quadrant of the coordinate system. This analysis also revealed similarity between the control and endometrial polyp groups. Furthermore, HCA of the average FTIR spectra (Fig. 4d) showed that control, atrophic endometrium and endometrioid adenocarcinoma form one group, which is similar to the FTIR spectrum of endometrial polyp tissues. Moreover, PCA of the Raman data of all samples (Fig. 4e) revealed that the largest number of points in the same quadrant of the coordinate system belonged to the endometrioid adenocarcinoma tissue group. However, the highest dispersion was found for the atrophic endometrium tissue group. A similar situation occurred in the PCA plot obtained for one selected FTIR region (Fig. 4f). In this case, too, significant separation between the analyzed groups of samples was not observed. Therefore, we marked vectors of least-squares lines representing different groups in Fig. 4e,f.
Figure 2. FTIR spectra of samples: control (black spectrum); atrophic endometrium (orange spectrum); complex atypical hyperplasia (red spectrum); endometrial polyp (green spectrum); endometrioid adenocarcinoma (blue spectrum).
Consequently, PCA of the Raman data showed that atrophic endometrium, complex atypical hyperplasia and endometrioid adenocarcinoma were separated from control and polyp endometrial tissues. PCA of the FTIR data showed separation between control and polyp, as well as cancer tissue groups. Therefore, Partial Least Squares (PLS) analysis with variable importance in projection (VIP) was performed, Fig. 5.
The PLS results, presented as predictor plots, showed that in the Raman spectra the regions that could be used to separate endometrial tissues at different stages of carcinogenesis were between 500 and 1500 cm−1 and around 2800-2900 cm−1, Fig. 5a. Moreover, the region between 1500 and 1700 cm−1, which is very important in describing changes in proteins, can give false positive or false negative results. The prediction correlation was characterized by a good predicted response for the training data, which was visible in the linear fit presented in Fig. 5b. FTIR data analysed with PLS showed a good linear fit (Fig. 5e) for data enabling separation between each analysed sample in the range between 500 and 1500 cm−1 and at the wavenumbers corresponding to CH2 and CH3 vibrations of lipids, Fig. 5d. VIP values were generated from the model as shown in Fig. 5c,f for Raman and FTIR data, respectively. The threshold level was established at 0.8. Note that the VIP values were not associated for all the measured Raman and FTIR spectra. For the Raman data, VIP values showed that separation of the control and atrophic endometrium samples was impossible, while VIP values obtained for FTIR spectra showed that polyp samples could not be separated from the others. Furthermore, Random Forest and a second machine learning method, the C5.0 classification model, were used to obtain information about the accuracy of Raman and FTIR spectroscopy in distinguishing the analyzed samples. These two analyses were done for the fingerprint FTIR and Raman region (800-1800 cm−1), and for a selected region (only peaks whose differences were statistically significant; data taken from Fig. 3). The results are presented in Table 3.
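For readers unfamiliar with VIP, the score is commonly defined as below. This is the widely used textbook definition, given here as an assumption: the text does not state which exact formula Origin 2019 applies. The symbols are introduced for illustration only: p is the number of spectral variables, A the number of PLS components, w_{ja} the weight of variable j in component a, and SS_a the sum of squares of the response explained by component a.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} SS_a \left( \dfrac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}
       {\sum_{a=1}^{A} SS_a} \,}
\qquad j = 1, \dots, p
```

Under this definition the mean of the squared VIP values over all variables equals 1, so a threshold such as the 0.8 used here keeps variables whose contribution is close to or above average.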
The results obtained using the Random Forest and C5.0 algorithms indicate that both Raman and FTIR approaches can effectively identify groups of cases. The classification accuracy ranges from 62.71 to 96.61%. Much better results were obtained using the Random Forest algorithm, which is an ensemble machine learning method. At the same time, the obtained results provide a basis for further extensive research on a larger number of learning instances.
The correlation between all measured samples was investigated using the Pearson correlation test, Table 4. The Pearson correlation test for the Raman spectra (Table 4) showed that it was possible to determine the correlation between control, atrophic endometrium and endometrial polyp tissues. Moreover, correlation between atrophic endometrium, endometrial polyp and endometrioid adenocarcinoma was also observed in Raman spectra. Furthermore, correlation in Raman spectra between polyp, control, atypical hyperplasia and endometrioid adenocarcinoma was noticed. Interestingly, lack of correlation between endometrioid adenocarcinoma and other analyzed tissues was visible in Raman spectra. The Pearson correlation test for the FTIR spectra (Table 4) showed that each analyzed sample correlated with all others, e.g. correlation between control, atrophic endometrium, atypical hyperplasia, endometrial polyp and endometrioid adenocarcinoma was observed.
Table 2. FTIR wavenumbers with corresponding vibrations described in the FTIR spectra presented in Fig. 2 25-30 . Bold means statistically significant shift.
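The pairwise correlation reported in the Results can be illustrated with a minimal sketch of the Pearson coefficient between spectra sampled on a common wavenumber grid. The study ran this test in Past 3.0; the function names and the intensity values below are hypothetical, and the significance test (p < 0.05) is omitted because it needs a t-distribution that the Python standard library does not provide.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(spectra):
    """Map every ordered pair of group names to its Pearson coefficient."""
    names = list(spectra)
    return {(a, b): pearson_r(spectra[a], spectra[b]) for a in names for b in names}


# Hypothetical mean intensities on a shared grid, not data from the study.
example_spectra = {
    "control": [0.10, 0.40, 0.90, 0.30, 0.20],
    "hyperplasia": [0.12, 0.35, 0.70, 0.45, 0.25],
    "adenocarcinoma": [0.15, 0.30, 0.65, 0.50, 0.28],
}
matrix = correlation_matrix(example_spectra)
```

Each entry of `matrix` lies between -1 and 1; values near 1 indicate spectra with a similar shape across the sampled wavenumbers.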

Discussion
Observation of lesions characteristic of cancer cells without significant architectural changes in tissues is very important for understanding the carcinogenesis of endometrial cancer. However, the diagnostic gold standard used currently does not allow for it. Therefore, a new diagnostic technique is required to make this possible in future. Consequently, this study showed chemical changes that occur during the carcinogenesis process in endometrial tissues using FT-Raman and FTIR spectroscopy. Recent studies show that all spectroscopic techniques can be used to observe chemical differences between healthy and non-healthy endometrial tissues 4,31-36 . Patel et al. showed that amide and proline vibrations can be used as spectroscopic markers in endometrial cancer 31 . Our Raman results also showed that quantitative and qualitative changes in endometrial cancer tissues are observed at 1695 cm−1 and 812 cm−1, which corresponded to amide I and proline vibrations, respectively, Fig. 1. Moreover, we also noticed structural changes in amide vibrations in FTIR spectra, Fig. 2. Similar results were obtained by Taylor et al. 37 . They used FTIR spectroscopy to identify chemical changes that occur at different stages of endometrial cancer. Furthermore, the changes in amide vibrations are more significant when the cancer stage is higher. As shown in Table 2, shifts were observed in amide vibrations. Interestingly, in comparison with control samples, these shifts were more significant for samples with a more developed carcinogenesis process. This is especially important when attempting to differentiate complex atypical hyperplasia from endometrial cancer, because differential diagnosis between these two types of endometrial tissue is still a problem 38 . Some research showed that p53 protein can be a factor which could be used in the diagnostics of these two endometrial changes 39 . Therefore, results obtained from the Raman and FTIR range corresponding to amide vibrations confirmed the molecular biology hypothesis. Moreover, the differences in the protein region of the spectra suggest that gene expression and, consequently, protein conformation could be used as a biomarker of the different stages of endometrial carcinogenesis. Paraskevaidi et al. used blood serum collected from women suffering from complex atypical hyperplasia and endometrial cancer stage I and II 36 . We observed in the FTIR spectra of the analysed samples that the chemical changes visible in tissue samples corresponded with those noticed in blood serum, especially in the amide IR range. However, Paraskevaidi et al. also showed that CH2 wagging vibrations from collagen played an important role in the carcinogenesis process detected from blood serum. In our study, a significant shift of collagen vibrations was visible in the Raman spectra of the samples characterized by the most advanced carcinogenesis process, Fig. 1, Table 1. Moreover, the intensity of peaks originating from functional groups building the collagen structure decreases as carcinogenesis progresses. This could be caused by collagen linearization and tissue stiffening, changes that increase tumor incidence and progression 40 . Collagen constitutes the scaffold of the tumor microenvironment; the extracellular matrix is remodeled by collagen degradation and re-deposition, which promotes tumor infiltration, angiogenesis, invasion and migration 41 . Therefore, we think that vibrations of collagen functional groups can also be used as a spectroscopic marker of carcinogenesis.
… atrophic endometrium (orange spectrum); complex atypical hyperplasia (red spectrum); endometrial polyp (green spectrum); endometrioid adenocarcinoma (blue spectrum). Data was analyzed using one-way ANOVA followed by Tukey's post hoc test. Statistical significance was adopted at *p < 0.05 versus Control; ^ p < 0.05 versus atrophic endometrium; & p < 0.05 versus complex atypical hyperplasia; # p < 0.05 versus endometrial polyp; + p < 0.05 versus endometrioid adenocarcinoma.
Moreover, PCA of the average spectra of the different types of tissues showed that only the Raman data allowed differentiation between cancer and complex atypical hyperplasia, Fig. 4a. These two kinds of endometrial changes differed enough to allow differentiation 38 . The histopathology image of complex atypical hyperplasia is very similar to that of cancer. Therefore, new methods are required that enable differentiation of these two kinds of endometrial changes. However, PCA of all analyzed samples measured with Raman (Fig. 4e) and FTIR (Fig. 4f) did not show separation between each stage of carcinogenesis. The similarity between control and polyp endometrial tissues was detected from the Raman data, while PCA of the FTIR spectra showed no separation between control, atrophic endometrium and complex atypical hyperplasia. Polyp and atrophic endometrium are classified as normal endometrial tissues without any signs of neoplastic changes 42 . The image of complex atypical hyperplasia is very similar to that of cancer 38 . Therefore, FTIR may not be suitable for separating very similar carcinogenesis stages. However, in this study we used samples collected from women of different ages, and it is known that age greatly influences the FTIR and Raman spectra obtained 43 .
As shown in Figs. 1, 2 and 3, endometrial tissues at different stages of carcinogenesis differ fundamentally from normal tissue in terms of structure, genetics, and cellular activity. Therefore, the obtained spectra were analyzed by multivariate analysis methods 44,45 . In this study we performed PCA and the Pearson correlation test, Fig. 4, Table 4, respectively. Both PCA and the Pearson correlation test demonstrated that Raman spectroscopy offers a greater ability to distinguish between the analysed types of tissues. The reason could be the difference in the physical principles of the two techniques 46 , e.g. FTIR spectroscopy is especially sensitive to OH stretching in water, while Raman spectroscopy is especially sensitive to C-C, C=C, and C≡C bonds 47 . Biological materials contain a high number of carbon functional groups, therefore Raman spectroscopy can give a more precise chemical composition of a sample. In particular, in Fig. 3 differences in the peak area were observed for almost all analyzed Raman peaks, and for only two FTIR peaks, which corresponded to functional groups building carbohydrates and protein amino acids. This suggests that the changes with the greatest impact on endometrial carcinogenesis, as observed with the two complementary methods, occur in carbohydrates and amino acids, and consequently in proteins. Indeed, lectins, which are carbohydrate-binding proteins, permit glycoproteins to be expressed on cancer cells 48 . Furthermore, other research showed that in cancer cells there are changes in the expression of genes that code for glycoproteins 49 . Importantly, changes in carbohydrate and amino acid metabolism and expression were visible not only in endometrial carcinogenesis, but also in other types of cancer, e.g. colon 50 and ovarian 51 .

Conclusions
In this study, Raman and FTIR spectroscopy were used to evaluate carcinogenesis in endometrial cancer. Moreover, the methods were compared in terms of their efficiency in detecting chemical changes during carcinogenesis of endometrial cancer. The Raman spectra showed that, as endometrial cancer develops, the chemical structure and composition of tissues differ from those observed in control samples. Moreover, these changes are statistically significant in all analyzed Raman ranges, while only two absorbance peaks in the FTIR spectra were statistically significant (1073 cm−1 and 1125 cm−1). Furthermore, PCA and HCA of Raman spectra showed that tissues with cancer cells or with changes which occurred before cancer (complex atypical hyperplasia) are similar to each other, but they are different from the control samples or from changes which can be observed in non-cancerous endometrial tissues. In the correlation test, correlation was found between non-cancerous tissues, as well as between cancer and complex atypical hyperplasia tissues. In contrast, the PCA and HCA results as well as the correlation obtained from FTIR data showed that samples which are not histologically similar correlated and were characterized by similar chemical changes. Overall, our results indicate that Raman spectroscopy is more effective than FTIR in assessing the development of carcinogenesis in endometrial cancer. Raman spectroscopy helps to obtain information about atrophic endometrium, complex atypical hyperplasia and endometrioid adenocarcinoma development, while FTIR spectroscopy facilitates differentiation between healthy, cancer and polyp endometrial tissues. Consequently, FTIR spectroscopy can be used to detect changes that are more visible in pathomorphologic images, but not to distinguish very similar carcinogenesis phases. 
Importantly, the number of tissue samples investigated in this study, collected from women of different ages, was not extensive. Therefore, further tests should be performed to confirm the obtained results, and the results need to be verified by other researchers. In addition, the application of Raman and FTIR spectroscopy as diagnostic tools in the carcinogenesis process should be studied in other organs. The medical characteristics of the patients from whom we obtained samples are presented in Table S1. It is particularly important that the samples were collected from 16 patients over several years, with each successive sample characterized by different endometrial changes. Furthermore, the relatively small number of tissue samples in the overall group and the distribution of their subgroups are noteworthy. This is connected with the character of the Clinic and does not reflect their frequency in the population.

Materials preparation.
All obtained materials were prepared as described by Depciuch et al. 52 . First, tissues were placed for 12 h in a liquid fixative. Secondly, ethanol within the tissue was gradually replaced with xylene. Paraffin infiltration of the tissue was performed at a temperature of 52 °C. After infiltration, sections were embedded in a paraffin block, prepared by pouring liquefied paraffin into a metal mold and inserting a thin piece of tissue with the appropriate spatial orientation. Each section was flattened on a hot water surface. For the spectroscopic measurements, the obtained samples were placed on CaF2 slides. Moreover, immunohistochemical diagnostics were performed for each obtained sample.

Methods
All methods were performed in accordance with the relevant guidelines and regulations.

FT-Raman measurement.
A Nicolet NXR 9650 FT-Raman Spectrometer was used to obtain the FT-Raman spectra. The spectrometer has an Nd:YAG laser (1064 nm) and a germanium detector. The measurement range was between 150 and 3700 cm−1. The laser power was 0.5 W. Each sample was measured using 64 scans with 8 cm−1 resolution. All obtained spectra were processed in OPUS software using baseline correction, smoothing (7 points) and vector normalization.
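The smoothing and normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the OPUS implementation: the text gives only the window size (7 points) and the name of the normalization, so a centered moving average stands in for the unspecified smoothing filter, vector normalization is taken in its common form (mean-centering followed by scaling to unit Euclidean norm), and baseline correction is left out. The function names and the intensity values are hypothetical.

```python
import math


def moving_average(values, window=7):
    """Smooth with a centered moving average; the window shrinks at the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def vector_normalize(values):
    """Mean-center the spectrum, then scale it to unit Euclidean norm."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("a flat spectrum cannot be vector-normalized")
    return [c / norm for c in centered]


# Hypothetical intensities, not data from the study.
spectrum = [0.10, 0.12, 0.35, 0.80, 0.40, 0.15, 0.11, 0.09, 0.30, 0.12]
processed = vector_normalize(moving_average(spectrum, window=7))
```

After this step every spectrum has zero mean and unit length, so differences in overall signal level between samples do not dominate later comparisons of peak positions and relative intensities.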
FTIR measurements.
A Vertex 70v spectrometer by Bruker was used to obtain FTIR spectra of all analyzed samples. The Attenuated Total Reflectance (ATR) technique was used with a diamond crystal. All samples were measured in the IR range between 400 and 4000 cm−1 using 32 scans with 2 cm−1 resolution. All measurements were made in triplicate. The obtained spectra were normalized, smoothed and baseline-corrected in the OPUS software.

Statistics: multivariate analysis, Pearson correlation test.
Principal component analysis (PCA) was performed to obtain information about the spectral variation among the types of samples. PCA reduces the dimensionality of the data, i.e. the number of variables, while retaining as much variance as possible. PCA was performed on selected spectral regions, which were determined after calculating the average Raman intensities and absorbances of the analyzed peaks. The statistical significance of the calculated Raman intensities and absorbances was analyzed by one-way ANOVA followed by Tukey's test (Statistica 10). The experimental results are reported as means ± SEM (the standard error of the mean). Furthermore, HCA was performed to determine the similarity between each group of samples. Moreover, to obtain information about the correlation between the Raman and FTIR spectra of the measured samples, the Pearson test was performed with p < 0.05 (95% confidence level). All analyses were performed using Past 3.0 software. In addition, considering the low number of samples in this study, Partial Least Squares (PLS) analysis was performed. PLS is suited to the multicollinearity problems associated with complex biological data in which the number of predictors is much larger than the number of samples, as in this study. Moreover, variable importance in projection (VIP) was calculated to identify the most important vibrational bands associated with the separation between each type of measured sample. The PLS analysis and VIP factor were calculated using Origin 2019 software. Random Forest and a second machine learning method, the C5.0 classification model, were used to obtain information about the accuracy of Raman and FTIR spectroscopy in distinguishing the analyzed samples. We used the suggested Random Forest algorithm 53 and additionally applied the traditional C5.0 single decision tree algorithm, well known in the literature 54 . The R environment and the Random Forest and C5.0 software packages were used to conduct the experiments.
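The descriptive and inferential statistics named above can be illustrated with a minimal sketch of the mean ± SEM and the one-way ANOVA F statistic for peak intensities grouped by tissue type. The study used Statistica 10 for these calculations; the function names and the numbers below are hypothetical, and Tukey's post hoc test and the p-value are omitted because they need distributions that the Python standard library does not provide.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    if len(values) < 2:
        raise ValueError("at least two values are needed for the SEM")
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.fmean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the individual values around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, group_means))
    if ss_within == 0.0:
        raise ValueError("F is undefined when there is no within-group variation")
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical peak intensities for three tissue groups, not data from the study.
intensities = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_between, df_within = one_way_anova_f(intensities)
```

A large F value relative to the F distribution with (`df_between`, `df_within`) degrees of freedom indicates that at least one group mean differs; a post hoc test such as Tukey's is then needed to identify which pairs of groups differ.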