Introduction

While the incidence of breast cancer has grown moderately in the US in recent years, it grew by 40% in the Gulf Cooperation Council countries over the last 12 years1. In clinical practice, the analysis of tissue samples relies on the examination of microscopic structures observed in stained tissue sections. The robustness of that practice is limited, as evidenced by inter- and intra-observer discrepancies. Staining specificity can be improved by immunostaining of a few key markers such as oestrogen and progesterone receptors, HER2, Ki-67 and a few others2. In practice, the information obtained is very limited and definitely not sufficient to deliver an accurate diagnosis, provide adequate therapy and result in a satisfactory prognosis at the individual level. Furthermore, tumors are heterogeneous3,4 and their behavior strongly depends on their microenvironment5,6. The lack of molecular information available at the cell level when observing tissue sections results in an incomplete overview of the patient's pathology. While analysis of genetic material at the cell level is not a viable option, some spectroscopic approaches accurately reflect the molecular content of the cells. Vibrational histopathology relies on FTIR or Raman imaging. It allows the discrimination of very closely related cell lines by providing, for each pixel of the tissue section image, a full vibrational spectrum which precisely reports the biochemical content of the cells7,8. Fourier transform infrared (FTIR) spectroscopy in particular has shown its ability to recognize unique cancer features in the field of breast cancer9,10,11,12. A recent comparison of a series of breast cancer cell lines grown in 2D and 3D cultures by transcriptomic analysis and by FTIR imaging indicated that FTIR and transcriptomics are equally sensitive in detecting differences between cell lines and differences within cell lines induced by growing in a 3D environment instead of the regular 2D culture condition13.
In turn, normal and tumor tissue in breast7,14,15,16,17, colon7,18,19,20, lung7,21, prostate7,22,23,24,25 and cervix7,26,27,28 can be distinguished using FTIR spectroscopy. The use of 2D correlation analysis within the FTIR dataset of breast cancer tissues indicates that a very significant number of FTIR contributions are cross-correlated, decreasing the number of independent potential markers in the spectra, which suggests that the addition of biomarkers from other sources could be beneficial29. While FTIR relies on the organic molecules present in the tissue sections, laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) is a complementary technique which provides elemental analysis mapped on a micrometer scale in thin sections of soft tissue for up to 10–15 different essential elements. LA-ICP-MS enables identification and discrimination of elemental differences with an accuracy in the parts-per-billion (ppb) range. The sample is volatilized in an ablation chamber by a powerful ultraviolet beam. The resulting aerosol is then driven to the inductively coupled plasma device, which fully decomposes the volatilized sample into simple elements that are ionized. The ionized elements are finally analysed by mass spectrometry. LA-ICP-MS provides a unique means to detect levels of trace elements including Fe, Cu and Zn that may be related to cancer development in breast tissue. Metal distribution in a tissue has been shown to be predictive of cancer behavior, potentially because elements such as Zn parallel the overexpression of extracellular matrix metalloproteinases (MMPs), in particular of MMP-11, which is involved in the degradation of the extracellular matrix and in tumor progression30.
Numerous previous FTIR studies and a more limited number of LA-ICP-MS studies indicate a very good potential to obtain information of diagnostic value. Combining FTIR imaging with LA-ICP-MS, two orthogonal methods bringing information on the organic molecule composition (FTIR) and on the abundance of simple inorganic elements (LA-ICP-MS), could therefore constitute a particularly powerful approach to decipher the subtle variations present in breast tissue. In a recent study, Anyz et al.31 developed a methodology using image registration to overlay H&E-stained tissue section images and LA-ICP-MS images reporting Zn and Cu concentrations in 10 melanoma sections. We demonstrate here the feasibility of this approach by processing and merging FTIR and LA-ICP-MS breast tissue image data. After image registration and pixel resizing, the two sets of data could be combined and analyzed simultaneously. It must be noted that quantification of the improvement for diagnostic purposes is beyond the scope of the present communication.

Methods

Tissue sections

Six Formalin-Fixed Paraffin-Embedded (FFPE) breast tissue samples were obtained from the histopathology laboratory at Al-Ahli Hospital, Doha, Qatar. Experimental protocols were approved by Qatar University and Al-Ahli Hospital ethical committees. All methods were performed in accordance with the relevant guidelines and regulations of Qatar University, Al-Ahli Hospital, and Université Libre de Bruxelles. All material was collected anonymously, and a consent form from Al-Ahli Hospital was signed by all patients. As described in Verdonck et al.10, for each FFPE breast tissue sample, 3 adjacent tissue sections were cut using a microtome. Paraffin was removed by incubation in 2 successive xylene baths for 20 minutes. Tissue rehydration was achieved through 3 successive ethanol baths with a decreasing gradient of ethanol (100%, 90%, 70%) for 15 minutes and 2 milliQ water baths for 10 minutes. For one 5 µm thick tissue section, standard H&E staining was performed. This section was used as a reference. A second adjacent 5 µm thick section, used for FTIR imaging, was deposited on a Kevley Technologies MIR low-e microscope glass slide. These glass slides are covered by thin metal layers: the surface consists of several layers of tin oxide and silver, and its reflectivity allows the recording of FTIR spectra in reflection mode, sometimes called transflection mode. The data were collected in transflection mode from sample regions of 350 × 350 µm2. One FTIR image (unit image or tile) resulted in 4,096 spectra. As described earlier, to cover larger areas an automatic tiling combined several FTIR tiles in order to obtain one large mosaic FTIR image29. A third 10 µm thick section was used for LA-ICP-MS imaging. As shown in Table 1, the FTIR data included a total of 31 × 10^6 infrared spectra, i.e. a mean number of spectra per image of 5 × 10^6. With larger pixels, the total number of LA-ICP-MS spectra was just above 150,000.
The slides were submitted to two pathologists, and a case was included in the study only when their diagnoses were concordant (each pathologist made his/her diagnosis without knowing the other pathologist's diagnosis).

Table 1 Characteristics of the images analyzed in this work. One FTIR tile corresponds to 4096 pixels or 4096 spectra.

An example of a tissue section is presented in Fig. 1 for a fibroadenoma. The section contains a piece of tissue showing loose fibroblastic stroma containing duct-like structures. These glandular or duct-like spaces are lined by single or multiple layers of regular cells with a well-defined, intact basement membrane.

Figure 1
Image of an H&E stained fibroadenoma section described in the text. Bottom: enlargement of the area contained in the rectangle drawn in the upper part of the figure (will be detailed in Fig. 2).

Imaging

FTIR spectroscopic images were obtained in transflection mode using an Agilent FTIR imaging microscope equipped with a mid-band MCT (mercury cadmium telluride) detector (12,000–600 cm−1). The images were obtained in the range of 4000–700 cm−1 with 128 scans per pixel, each pixel covering an area of 6.25 × 6.25 μm2, and a spectral resolution of 4 cm−1.

As described elsewhere32, laser ablation inductively coupled plasma mass spectra were acquired using a laser ablation system (New Wave 213, ESI) equipped with a frequency-quintupled neodymium-doped yttrium aluminium garnet (Nd:YAG) laser and a fast-washout ablation cell. The laser ablation device was coupled to a quadrupole ICP-MS system (iCAPQc, ThermoFisher Scientific, Bremen, Germany) using polytetrafluoroethylene tubing. Helium gas was used for ablation; before entering the inductively coupled plasma, argon was admixed as make-up gas. The applied laser fluence (approximately 5.5 J/cm2), in combination with the high stage scan speed (120 mm/s) and the resulting low number of laser shots per position, was not sufficient to generate an interfering sodium signal from the glass substrates. This was also true for other highly abundant elements in glass, such as potassium. Before measurements, a thin gold layer was deposited on every sample as a pseudo-internal standard. Samples were rasterized using a line-scan pattern that covered the complete tissue section. Laser output energy was adjusted to ablate all tissue material in one run of analysis.

Chemometric analyses

As described in a previous paper33, principal component analysis (PCA) is an unsupervised multivariate method allowing variable reduction by building linear combinations of wavenumbers varying together, called principal components (PCs)34. The first principal component explains most of the data variance. The second principal component, uncorrelated with the first one, accounts for most of the residual variance, and so on. Usually 2 to 6 PCs are sufficient to explain the major proportion of the original variance of the data set, reducing the description of each spectrum to 2 to 6 numbers representing the projections (scores) of each spectrum on the PCs.
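The variable reduction described above can be illustrated with a minimal NumPy sketch. It is not the Kinetics implementation: the function name `pca`, the synthetic data and the choice to mean-centre the spectra before decomposition are assumptions made for illustration only.

```python
import numpy as np

def pca(spectra, n_components=3):
    """PCA of a (n_spectra, n_variables) array via SVD of the mean-centred data."""
    centred = spectra - spectra.mean(axis=0)   # assumption: mean spectrum removed first
    # Rows of vt are the principal components, ordered by decreasing variance.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per PC
    scores = centred @ vt[:n_components].T     # projection of each spectrum on the PCs
    return scores, vt[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in for spectra: 200 "spectra" of 50 "wavenumbers",
# built from two latent sources plus weak noise.
latent = rng.normal(size=(200, 2)) * np.array([5.0, 2.0])
mixing = rng.normal(size=(2, 50))
spectra = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

scores, components, explained = pca(spectra, n_components=3)
print(explained)  # the first two PCs carry almost all of the variance
```

In this toy case each 50-point spectrum is reduced to a few scores, which is the sense in which 2 to 6 numbers can describe a spectrum.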

Hierarchical bottom-up clustering (HCA) is a method for grouping spectra based on their similarity. It starts with the computation of a distance matrix between all spectra. The Euclidean distance was used here. The most similar spectra (shortest distance) form a cluster and the distance matrix is updated for the remaining spectra/clusters. The process is then repeated; the most similar clusters are successively merged until there is only one cluster left. There is no need to define the final number of clusters. As described in Benard et al.35, K-means clustering is based on a non-hierarchical process and is particularly efficient for dealing with large data sets as it is less demanding of computational resources36. The number of clusters has to be defined before computation. The process minimizes the intra-cluster variance and maximizes the inter-cluster variance. The algorithm works iteratively to assign each data point to one of K groups based on the Euclidean distance. As the first step of K-means clustering is a random selection of centers, the final result may depend on this random selection. The process was repeated 10 times to improve its robustness. Two-dimensional (2D) correlation was calculated as described by Noda37 and used recently for the investigation of breast cancer tissue sections by FTIR imaging29.

Double clustering analysis is designed to provide an overview of the similarities both within spectra and between spectra. It has been intensively used for analyzing gene expression, as families of genes displaying an identical behavior upon a perturbation (pathology, exposure to a drug, etc.) form functional clusters and the phenotypes (the cells for instance) are also grouped according to their gene expression. Here, both FTIR absorbance and element abundance have been scrutinized in place of gene expression. They have been sorted with a K-means clustering while spectra were sorted with a full hierarchical clustering.
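The K-means procedure with repeated random starts can be sketched as follows. This is an illustrative NumPy version, not the code used in the study; in particular, keeping the run with the lowest within-cluster sum of squares is an assumption, since the text only states that the process was repeated 10 times. The function name `kmeans` and the synthetic data are hypothetical.

```python
import numpy as np

def kmeans(data, k, n_restarts=10, n_iter=100, seed=0):
    """Lloyd's K-means with several random starts; returns labels and inertia."""
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _ in range(n_restarts):
        # Random selection of k data points as initial centres.
        centres = data[rng.choice(len(data), size=k, replace=False)].copy()
        for _ in range(n_iter):
            dist = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
            labels = dist.argmin(axis=1)          # nearest centre (Euclidean)
            new_centres = centres.copy()
            for j in range(k):
                if np.any(labels == j):           # keep the old centre if a cluster is empty
                    new_centres[j] = data[labels == j].mean(axis=0)
            if np.allclose(new_centres, centres):
                break
            centres = new_centres
        dist = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        inertia = float(np.sum(dist.min(axis=1) ** 2))
        # Assumption: the restart with the lowest intra-cluster variance is kept.
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia

rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 4))
group_b = rng.normal(loc=10.0, scale=0.5, size=(50, 4))
data = np.vstack([group_a, group_b])
labels, inertia = kmeans(data, k=2)
```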

All computations have been carried out with Kinetics, a custom-made program running under Matlab (Mathworks, Inc.).

Processing of FTIR spectra

For FTIR images, processing was carried out in the following sequence: 1. water vapor contribution subtraction, 2. removal of CO2 contribution, 3. scaling, 4. baseline subtraction and 5. filtering for signal-to-noise ratio. The processing was reproduced for each spectrum of each image independently.

Subtraction of water vapor contribution

A reference water vapor spectrum was acquired as the mean of the difference between all the spectra of an image recorded in the absence of any sample before and after purging the sample cabinet with dry air. The area of the water vapor band between 1878 and 1860 cm−1 was used as a reference to determine the subtraction coefficient. Correction for water vapor contribution brought little visible change to the spectra as the sample cabinet was continuously purged with dry air during the experiments and as the spectra were quite intense. Nevertheless, it is critical to remove this contribution to take full advantage of the accuracy of the FTIR spectra8.
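A possible way to derive the subtraction coefficient from the 1878–1860 cm−1 band is sketched below. The source does not specify how the band area is computed; measuring it above a straight line joining the two band limits, and taking the ratio of the sample and reference areas as the coefficient, are assumptions of this sketch, as are the function names and the synthetic spectra.

```python
import numpy as np

def band_area(wn, spec, lo, hi):
    """Area of a band above the straight line joining its two end points."""
    mask = (wn >= lo) & (wn <= hi)
    x, y = wn[mask], spec[mask]
    line = np.interp(x, [x[0], x[-1]], [y[0], y[-1]])
    h = y - line
    return float(np.sum(0.5 * (h[1:] + h[:-1]) * np.diff(x)))  # trapezoid rule

def subtract_water_vapor(wn, spec, vapor_ref, lo=1860.0, hi=1878.0):
    """Scale the vapor reference so that its band area matches the sample, then subtract."""
    coeff = band_area(wn, spec, lo, hi) / band_area(wn, vapor_ref, lo, hi)
    return spec - coeff * vapor_ref, coeff

wn = np.arange(900.0, 4000.0, 1.0)  # ascending wavenumber axis (cm-1)
# Synthetic vapor reference: two narrow lines, one inside the reference band.
vapor_ref = (np.exp(-0.5 * ((wn - 1869.0) / 2.0) ** 2)
             + 0.8 * np.exp(-0.5 * ((wn - 1654.0) / 2.0) ** 2))
sample = 1e-4 * wn + 0.05           # hypothetical vapor-free spectrum
spec = sample + 0.3 * vapor_ref     # contaminated spectrum
corrected, coeff = subtract_water_vapor(wn, spec, vapor_ref)
```

The local straight-line baseline makes the coefficient insensitive to a slowly varying sample absorbance under the vapor band.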

Removal of CO2 contribution

As CO2 absorbs between 2450 and 2250 cm−1, a region where biological molecules do not absorb, this region of the spectrum is of little interest. Correction is however required in some instances for proper scaling of the spectra on the display. Here, a straight line was drawn between 2450 and 2250 cm−1 to replace the CO2 contribution.
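Replacing the CO2 region by a straight line can be sketched as follows; the function name `remove_co2` and the synthetic spectrum are illustrative assumptions.

```python
import numpy as np

def remove_co2(wn, spec, lo=2250.0, hi=2450.0):
    """Replace the 2450-2250 cm-1 region by a straight line between its end points."""
    out = spec.copy()
    idx = np.flatnonzero((wn >= lo) & (wn <= hi))
    first, last = idx[0], idx[-1]
    out[idx] = np.interp(wn[idx], [wn[first], wn[last]], [spec[first], spec[last]])
    return out

wn = np.arange(700.0, 4001.0, 2.0)                       # ascending axis (cm-1)
smooth = 0.001 * wn                                      # hypothetical CO2-free spectrum
co2 = 0.5 * np.exp(-0.5 * ((wn - 2350.0) / 15.0) ** 2)   # synthetic CO2 band
cleaned = remove_co2(wn, smooth + co2)
```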

Scaling

Scaling of the spectra is necessary to account for thickness variation in the same section and among different sections. It is well documented that microtome sections have thicknesses that vary in the range of several % or even several tens of %38. Here, the area under the amide I and amide II bands (i.e. between 1730 and 1490 cm−1) has been set to an arbitrary value identical for all the spectra.
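A minimal sketch of this scaling step is given below. The target value of 100 is arbitrary, and integrating the raw absorbance (without a local baseline) is an assumption, since scaling precedes baseline subtraction in the processing sequence.

```python
import numpy as np

def scale_to_amide_area(wn, spec, lo=1490.0, hi=1730.0, target=100.0):
    """Scale a spectrum so that its area between lo and hi equals target."""
    mask = (wn >= lo) & (wn <= hi)
    x, y = wn[mask], spec[mask]
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))   # trapezoid rule
    return spec * (target / area)

wn = np.arange(900.0, 1801.0, 2.0)
band = (np.exp(-0.5 * ((wn - 1655.0) / 25.0) ** 2)
        + 0.6 * np.exp(-0.5 * ((wn - 1545.0) / 25.0) ** 2))  # amide I and II stand-ins
thin = scale_to_amide_area(wn, band)
thick = scale_to_amide_area(wn, 2.3 * band)  # same tissue, hypothetical thicker section
```

After scaling, the two spectra coincide, so a pure thickness difference no longer appears as a spectral difference.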

Baseline subtraction

Baseline subtraction is required because shifts in baseline can be observed in spectra present in images. The origin of these shifts remains unclear but loss of light by reflection on top of the sample and variation in substrate reflectivity may contribute significantly to this phenomenon. The spectra were baseline-corrected. The baseline was built as a succession of segments interpolated linearly between spectral points at 3900, 3800, 3666, 3116, 3000, 2700, 1800, 1490, 1422, 1358, 1138, 1114, 980 and 900 cm−1 and subtracted from each spectrum. A baseline going through many points such as the one described above does not represent a “real” baseline but, applied in a consistent way, it improves the quality of spectral comparison by enhancing the significance of absorbance variations with respect to the points set to zero, as demonstrated elsewhere8. After such a correction, it is usually not necessary to apply a second derivative, as also demonstrated elsewhere8.
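The segmented baseline can be sketched with two calls to `np.interp`. The anchor list is taken from the text; the function name, the ascending wavenumber axis and the synthetic spectra are assumptions of this illustration.

```python
import numpy as np

# Anchor points from the text, sorted in ascending order for np.interp (cm-1).
ANCHORS = np.array([900, 980, 1114, 1138, 1358, 1422, 1490, 1800,
                    2700, 3000, 3116, 3666, 3800, 3900], dtype=float)

def subtract_baseline(wn, spec, anchors=ANCHORS):
    """Subtract a piecewise-linear baseline passing through the anchor points."""
    anchor_values = np.interp(anchors, wn, spec)     # absorbance at each anchor
    baseline = np.interp(wn, anchors, anchor_values) # linear segments between anchors
    return spec - baseline

wn = np.arange(900.0, 3901.0, 2.0)                   # ascending axis
offset = 0.2 + 1e-4 * wn                             # hypothetical sloping baseline shift
band = np.exp(-0.5 * ((wn - 1650.0) / 20.0) ** 2)    # amide I stand-in
corrected = subtract_baseline(wn, offset + band)
flat = subtract_baseline(wn, offset)
```

By construction the corrected spectrum is zero at every anchor, which is why absorbance variations are read relative to those points.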

Signal-to-noise ratio (SNR)

Flagging spectra with insufficient signal-to-noise ratio (SNR) is required to eliminate spectra of poor quality from further analyses. The SNR was checked on each spectrum as described earlier10. Unless otherwise mentioned, it was required to be higher than 150, with noise defined as the standard deviation in the 2000–1900 cm−1 region of the spectrum and signal defined as the maximum of the curve between 1730 and 1490 cm−1 after subtracting a baseline passing through these two points. It has been discussed before8 that requiring a high SNR is time consuming, as SNR increases only as the square root of the number of scans. According to simulations made by Bhargava39, an SNR beyond 150 provides little benefit for typical classification.
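The SNR definition given above translates directly into a few lines of NumPy. The function name `snr` and the synthetic spectra are illustrative; the threshold of 150 is the one stated in the text.

```python
import numpy as np

def snr(wn, spec):
    """Signal: maximum between 1730 and 1490 cm-1 above a line through both limits.
    Noise: standard deviation of the spectrum between 2000 and 1900 cm-1."""
    noise = np.std(spec[(wn >= 1900.0) & (wn <= 2000.0)])
    mask = (wn >= 1490.0) & (wn <= 1730.0)
    x, y = wn[mask], spec[mask]
    line = np.interp(x, [x[0], x[-1]], [y[0], y[-1]])
    return float(np.max(y - line) / noise)

rng = np.random.default_rng(0)
wn = np.arange(900.0, 2101.0, 2.0)
band = np.exp(-0.5 * ((wn - 1650.0) / 30.0) ** 2)       # amide I stand-in
clean = band + 0.001 * rng.normal(size=wn.size)         # low-noise spectrum
noisy = band + 0.05 * rng.normal(size=wn.size)          # high-noise spectrum
keep = [snr(wn, s) > 150 for s in (clean, noisy)]       # flag spectra to retain
```

Since noise averages down as the square root of the number of scans, doubling the SNR would require roughly four times as many scans.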

Once all the corrections have been applied (Fig. 2), one may be confident that the spectral features present in the spectra are only related to the sample.

Figure 2
Example of a processed FTIR image. Here the ratio A1230/A1655 is reported, evidencing the epithelial cells surrounding the ducts. The pixels where the SNR is below 150 have been turned to black. This image corresponds to the framed region in the section shown in Fig. 1.

Figure 2 reports a processed FTIR image of a region framed in Fig. 1. In this image, the absorbance at 1230 cm−1 representative of phosphate vibrations found in nucleic acids has been divided by the absorbance at 1655 cm−1 representative of proteins. Epithelial cells surrounding ducts, in red, are clearly distinguished from the rest of the tissue.

Processing LA-ICP-MS spectra

LA-ICP-MS images have been recorded for 13C, 31P, 34S, 52Cr, 55Mn, 56Fe, 58Ni, 63Cu and 64Zn.

Background subtraction

In a first step, areas without sample were selected to obtain a background relevant to the current tissue section.

Rectangles were drawn in areas of the images where no tissue contribution was present (Fig. 3A). All spectra present in these areas were collected and averaged. The mean spectrum representing the background was then subtracted from all spectra of the image. The distribution of the intensities in the image is now shifted, bringing the large contribution of regions of the image without tissue to zero (Fig. 3B and C).
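The rectangle-based background subtraction can be sketched as below, assuming the image is stored as a cube of shape (rows, columns, elements). The rectangle coordinates, background levels and function name are hypothetical.

```python
import numpy as np

def subtract_background(cube, rectangles):
    """cube: (rows, cols, n_elements); rectangles: list of (r0, r1, c0, c1) without tissue."""
    n_elements = cube.shape[2]
    pixels = np.concatenate([cube[r0:r1, c0:c1].reshape(-1, n_elements)
                             for r0, r1, c0, c1 in rectangles])
    background = pixels.mean(axis=0)          # mean background "spectrum"
    return cube - background, background

rng = np.random.default_rng(0)
levels = np.array([5.0, 20.0, 1.0])                    # hypothetical background per element
cube = levels + 0.1 * rng.normal(size=(60, 60, 3))
cube[20:40, 20:40] += 50.0                             # synthetic "tissue" in the centre
rectangles = [(0, 10, 0, 10), (50, 60, 50, 60)]        # two tissue-free corners
corrected, background = subtract_background(cube, rectangles)
```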

Figure 3
Illustration of the process followed for background subtraction. A. the rectangles represent the areas selected to be used as background in this 13C image. In this example, 1918 spectra were included in the rectangles and their mean was subtracted from all spectra. B. intensity distribution before subtraction of the background, C. intensity distribution after subtraction of the background, D. intensity distribution after subtraction of the mean and division by the standard deviation.

Scaling

In contrast to FTIR images, the scale of the observed intensities varies widely between the different elements. Each spectrum was therefore standardized: for each individual image, the mean was subtracted and every spectrum of the image was divided by the standard deviation. As a consequence, the areas without tissue usually have negative values and the areas where tissue is present have positive values, as indicated by the intensity distribution (Fig. 3D).
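This standardization can be sketched as follows. Computing the mean and standard deviation separately for each element over the whole image is an assumption of this sketch, consistent with the stated aim of bringing elements with very different intensity scales onto a common one.

```python
import numpy as np

def standardize_elements(cube):
    """Subtract the mean and divide by the standard deviation, element by element."""
    mean = cube.mean(axis=(0, 1))
    std = cube.std(axis=(0, 1))
    return (cube - mean) / std

# Synthetic image: tissue covers a minority of pixels, two elements on very different scales.
cube = np.zeros((50, 50, 2))
cube[10:30, 10:30, 0] = 10.0
cube[10:30, 10:30, 1] = 1000.0
z = standardize_elements(cube)
```

Because the empty area dominates the image, its standardized values fall below zero while tissue pixels are positive, which is the behaviour described for Fig. 3D.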

Results

Six breast tissue sections have been analyzed by FTIR imaging and LA-ICP-MS. These 6 tissue sections have been selected for their size, which is representative of the samples analyzed in the clinic. Size is an issue, especially for FTIR imaging, which collects spectra every 6.25 µm, resulting in 2.5 million full FTIR spectra per cm2. Most of our samples were close to or above 2 cm2. The details of the samples are presented in Table 1. The goal of this paper is to report in detail the combined analysis of FTIR and LA-ICP-MS images which, to the best of our knowledge, has not been attempted before. We show how images obtained by both approaches can be merged into a single data set and analyzed.

Comparison of FTIR and LA-ICP-MS images

The examples reported in Fig. 4 indicate that the shape and orientation of the tissue sections are similar for FTIR and LA-ICP-MS imaging but not identical. Image registration will therefore be required for comparing identical regions between the two imaging modes40.

Figure 4
FTIR image reporting the absorbance at 1652 cm−1 of 3 breast tissues (left column) and elemental analysis image reporting the abundance of 13C for the same 3 breast tissues (right column). Data have been processed as described below in the text. Regions with SNR < 150 have been turned to dark blue.

Analysis of FTIR images

Analysis of FTIR images in the context of breast tissue has been described in numerous papers9,10,11,12,35,41,42 and will not be detailed here. FTIR imaging has been shown to identify successfully the various cell types present in breast tissue section12,35,43, to reveal breast cancer effect on the extracellular matrix11 and on fibroblasts44,45, to distinguish the different types of lymphocytes (B cells, T cells CD4+ or CD8+)33,46,47 and to identify most breast cancer cell lines grown in vitro after FFPE processing48 or in spheroids13. It has also been shown to be able to classify anticancer drug effects according to the drug-induced spectral perturbations observed on cancer cell lines49. In the framework of this study, the FTIR images will only be used in conjunction with LA-ICP-MS images.

Analysis of LA-ICP-MS images

Resizing and stitching LA-ICP-MS images

The principal interest of imaging of tissue is to compare element abundance not only within a tissue section but also among various tissue sections. To allow such a comparison, the individual LA-ICP-MS images have been padded with zeros on the left and right as well as below and above the actual image to obtain a final image size of 180 × 180 pixels for all tissue sections. Only section #5 (see Table 1) had to be cut on the edges to fit into this common size. The resized images were then assembled into a unique matrix containing the 6 tissue section images (Fig. 5).
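Padding (or, where needed, cropping) each image to a common 180 × 180 canvas and assembling the six images can be sketched as follows. Centring the image on the canvas, cropping symmetrically and stitching side by side are assumptions of this illustration; the image sizes are hypothetical.

```python
import numpy as np

def fit_to_canvas(img, size=180):
    """Centre img on a size x size canvas of zeros, cropping its edges if it is larger."""
    rows, cols = img.shape[:2]
    r0 = max((rows - size) // 2, 0)
    c0 = max((cols - size) // 2, 0)
    cropped = img[r0:r0 + size, c0:c0 + size]
    rows, cols = cropped.shape[:2]
    out = np.zeros((size, size) + img.shape[2:], dtype=float)
    top, left = (size - rows) // 2, (size - cols) // 2
    out[top:top + rows, left:left + cols] = cropped
    return out

rng = np.random.default_rng(0)
shapes = [(120, 150), (180, 90), (200, 170), (100, 100), (160, 175), (90, 140)]
images = [rng.random(s) for s in shapes]                  # hypothetical section images
mosaic = np.hstack([fit_to_canvas(im) for im in images])  # single matrix of 6 sections
print(mosaic.shape, mosaic.size)                          # (180, 1080) 194400
```

Six canvases of 180 × 180 pixels give 194,400 spectra in total, most of which lie outside the tissue and must be filtered out before analysis.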

Figure 5
64Zn distribution in the 6 tissue sections described in Table 1. The areas in grey have values below 0 for both 13C and 64Zn.

Once the individual images have been merged into a larger single image matrix, comparison can be carried out. A normalisation by the standard deviation for each element was applied on the new larger image for proper comparison between tissue sections. Figure 5 reports 64Zn distribution. It must be stressed that the analysis of the spectra (we also use here the term “spectrum” for the abundance profile of the elements) now requires a filter separating spectra belonging to tissues from spectra belonging to regions outside the tissue sections. Here each spectrum with a value below 0 has been assigned to non-tissue response and appears in grey. It can be observed (not shown) that the same filtering is obtained when using 13C values. Figure 5 clearly indicates that it is a reasonable filter to apply.

It is interesting to note that distribution of some elements such as 64Zn reported in Fig. 5 is not homogeneous. The distribution maps for 13C, 31P, 34S, 52Cr, 55Mn, 56Fe, 58Ni, 63Cu and 64Zn can be found in Fig. S1.

Correlations between the abundance of the different elements can be addressed in two ways: correlation analysis and principal component analysis.

Correlation analysis

It is first important to select only spectra and element values which belong to tissue. For this purpose, only spectra with positive values for 13C and 64Zn (see Fig. 5) were retained. In total, 57,892 spectra out of 194,400 were selected. For correlation analysis, the correlation coefficient was computed between all elements. The result is reported in Fig. 6.

Figure 6
2D correlation analysis of the abundance of elements (13C, 31P, 34S, 52Cr, 55Mn, 56Fe, 58Ni, 63Cu and 64Zn) in the 6 breast tissue sections.

The diagonal indicates that, as expected, each element is correlated with itself. Off-diagonal cross peaks indicate the presence of two strong correlations 1) between 13C and 34S (label 1 in Fig. 6) and 2) between 56Fe and 58Ni (label 2).
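The tissue filter and the element-by-element correlation matrix can be sketched as follows. The data are synthetic (a 13C/34S correlation is built in on purpose), and the use of the Pearson coefficient via `np.corrcoef` is an assumption, since the text only speaks of "the correlation coefficient".

```python
import numpy as np

names = ["13C", "31P", "34S", "52Cr", "55Mn", "56Fe", "58Ni", "63Cu", "64Zn"]

def tissue_correlation(spectra, names):
    """Keep spectra with positive 13C and 64Zn values, then correlate all elements."""
    c, zn = names.index("13C"), names.index("64Zn")
    tissue = (spectra[:, c] > 0) & (spectra[:, zn] > 0)
    return np.corrcoef(spectra[tissue], rowvar=False), tissue

rng = np.random.default_rng(0)
spectra = rng.normal(size=(5000, 9))                        # standardized abundances
spectra[:, 2] = spectra[:, 0] + 0.1 * rng.normal(size=5000) # synthetic 13C/34S link
corr, tissue = tissue_correlation(spectra, names)
```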

Principal component analysis

Principal component analysis was performed on the spectra of the 6 breast tissue sections analysed above. Figure 7 reports score maps for the first 2 principal components as well as the shape of these 2 principal components. PCA was performed only on element distributions belonging to the tissue.

Figure 7
Top: shape of the first 2 principal components PC1 and PC2. Bottom: score maps for PC1 and PC2 of the 6 tissue sections. PCA was computed only on the spectra with 13C values above 0 as shown in Fig. 5.

It is interesting to analyse the shape of the first PCs. PC1 describes a correlation between 34S and 52Cr as well as between 56Fe, 58Ni and 64Zn while 31P and 55Mn abundance varied in the opposite direction. The enlargement in Fig. 7 demonstrates that PC1 identifies regions of the images where large concerted variations of these elements do occur. PC1 describes the largest part of the variance, i.e. 43% of the total variance, and is orthogonal to all other sources of variance described by the other PCs. PC2 describes a correlation between 34S and 52Cr varying in the opposite direction as compared to 56Fe, 58Ni and 64Zn. It represents 18% of the total variance. All other PCs account for 10% or less of the total variance. It is interesting to note that the details revealed by PCA were not apparent in the previous global correlation analysis which considers only the overall correlations.

Co-analysis of LA and FTIR data

As mentioned above, FTIR and LA-ICP-MS are orthogonal methods providing information on respectively organic molecules and inorganic elements. Their co-analysis could therefore reveal a relevant discrimination power higher than for each method considered alone. The problems related to co-analysis and the solution developed to solve them will be illustrated with one tissue section (section #3 in Table 1).

Image processing

In the first step, a matching sub-region of the LA-ICP-MS and FTIR images was extracted for both image types. Yet, overlaying the images required both a rotation of one image with respect to the other and a pixel resolution match. It was decided to modify the FTIR images, whose pixel resolution was much higher. Rotation was obtained by applying a rotation matrix ([cosθ −sinθ; sinθ cosθ]) on the pixel coordinates and interpolating the values accordingly. A rotation by 2° was applied. Resampling was obtained first by binning pixels to arrive at a pixel number along the X and Y axes slightly above that of the LA-ICP-MS image. In a second step, the 2D Fourier transform of the image was computed for the images representing spectral intensities, wavenumber by wavenumber. At each wavenumber, the Fourier transform of the image was truncated to keep the final number of points, and an inverse Fourier transform (FT−1) was taken to generate the absorbance image with the right pixel resolution. The process was repeated for each wavenumber, thereby recreating a series of spectra. As a result, the two images can now be superimposed and have the same number of pixels in the X and Y directions. In order to merge the two approaches, the next step was to fuse the data of the two images into a single matrix.
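The two resampling steps (pixel binning followed by truncation of the 2D Fourier transform) can be sketched for a single-wavenumber image as below. The rotation step is omitted, the function names and image sizes are hypothetical, and the rescaling by the ratio of pixel counts is an implementation detail assumed here so that absorbance values are preserved.

```python
import numpy as np

def bin_pixels(img, factor):
    """Average factor x factor blocks of pixels (edges that do not fit are trimmed)."""
    rows = (img.shape[0] // factor) * factor
    cols = (img.shape[1] // factor) * factor
    trimmed = img[:rows, :cols]
    return trimmed.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))

def fourier_resample(img, new_shape):
    """Downsample by keeping only the central (low-frequency) part of the 2D FT."""
    rows, cols = img.shape
    new_rows, new_cols = new_shape
    spectrum = np.fft.fftshift(np.fft.fft2(img))       # zero frequency at the centre
    r0 = rows // 2 - new_rows // 2
    c0 = cols // 2 - new_cols // 2
    cropped = spectrum[r0:r0 + new_rows, c0:c0 + new_cols]
    out = np.fft.ifft2(np.fft.ifftshift(cropped))
    return out.real * (new_rows * new_cols) / (rows * cols)  # keep intensity scale

fine = 3.0 * np.ones((256, 256))                 # hypothetical absorbance image
binned = bin_pixels(fine, 4)                     # 64 x 64, slightly above the target
resampled = fourier_resample(binned, (60, 60))   # exact target pixel count

wave = np.tile(np.cos(2 * np.pi * np.arange(64) / 64), (64, 1))
wave_small = fourier_resample(wave, (60, 60))    # one full period is preserved
```

Applying `fourier_resample` to the image at each wavenumber in turn would rebuild a spectrum for every pixel of the resized image.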

Concatenation of FTIR and LA image data

To obtain a single matrix of data, the two matrices (FTIR and LA-ICP-MS) were concatenated. Each spectrum now consists, for one part, of infrared absorbances and, for the other, of a measure of the abundance of the 9 elements. As the units are unrelated for FTIR and LA-ICP-MS, a normalisation by the standard deviation was applied to the new data set. First, a background specific to this section was removed by subtracting the mean of the spectra present in an area without tissue (Fig. 8); then, for each wavenumber and each element, the mean value was subtracted and the resulting value was divided by the standard deviation. The process is illustrated in Fig. 8, which presents the ratio between 64Zn abundance and protein quantity as measured by the absorbance at 1654 cm−1.
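The concatenation and normalisation can be sketched as follows, assuming both images have already been registered and flattened to one row per pixel in the same order. The function name, array sizes and the handling of zero-variance columns (left at zero) are assumptions of this illustration.

```python
import numpy as np

def merge_and_standardize(ftir, la, background_mask):
    """ftir: (n_pixels, n_wavenumbers); la: (n_pixels, n_elements); same pixel order."""
    merged = np.hstack([ftir, la]).astype(float)
    merged -= merged[background_mask].mean(axis=0)   # background from a tissue-free area
    merged -= merged.mean(axis=0)                    # centre each wavenumber/element
    std = merged.std(axis=0)
    std[std == 0] = 1.0        # assumption: columns without variance are left at zero
    return merged / std

rng = np.random.default_rng(0)
ftir = rng.random((500, 226))
ftir[:, 10] = 0.0                        # e.g. a point set to zero by the baseline
la = 1000.0 * rng.random((500, 9))       # very different scale from absorbance
background = np.zeros(500, dtype=bool)
background[:50] = True                   # hypothetical tissue-free pixels
merged = merge_and_standardize(ftir, la, background)
```

After this step every wavenumber and every element has unit standard deviation, so neither technique dominates purely because of its units.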

Figure 8
Left: represents the 64Zn/A1654 FTIR ratio. The two rectangles include a total of 1705 spectra whose average was subtracted from all spectra of the image. For all values at each wavenumber/element, the mean was subtracted and it was divided by the standard deviation. Right: distribution of the SNR through the FTIR image. The red curve reports the integrated counts.

It must be stressed that the averaging of the FTIR applied to create larger pixels resulted in a data set with an excellent signal-to-noise ratio (SNR) centred around 1800 (Fig. 8).

Correlation analysis

As LA-ICP-MS data contain only 9 points (9 elements) while FTIR data contain 226 points between 1800 and 900 cm−1 after interpolating the FTIR spectra to obtain one data point every 4 cm−1, each LA-ICP-MS data point has been quintupled. It makes correlation analysis more clearly readable and gives a significant weight to LA-ICP-MS data in PCA. Figure 9 reports the correlation map.
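The point counts quoted above, and the replication of each element value, can be checked with a short sketch; the random arrays stand in for real data and are purely illustrative.

```python
import numpy as np

wavenumbers = np.arange(1800, 899, -4)        # 1800, 1796, ..., 900 cm-1
print(wavenumbers.size)                       # 226

rng = np.random.default_rng(0)
ftir = rng.random((300, wavenumbers.size))    # hypothetical FTIR part
la = rng.random((300, 9))                     # hypothetical 9-element part

la_repeated = np.repeat(la, 5, axis=1)        # each element column repeated 5 times
combined = np.hstack([ftir, la_repeated])     # 226 + 45 = 271 points per spectrum
corr = np.corrcoef(combined, rowvar=False)    # correlation map of the merged spectra
```

The five copies of one element are perfectly correlated with each other, so each element appears as a small block rather than a single point in the correlation map.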

Figure 9
Correlation analysis of the concatenated FTIR/LA-ICP-MS spectra. The LA-ICP-MS data are represented by 9 points present below 900 cm−1 as indicated by the purple circle. Elements are in the same sequence as previously: 13C, 31P, 34S, 52Cr, 55Mn, 56Fe, 58Ni, 63Cu and 64Zn. The white line present on the figure corresponds to points of the spectra where there is no variance because a baseline has been drawn.

Observation of Fig. 9 indicates that there are significant correlations within the FTIR spectra, particularly well-marked after normalization by the standard deviation but little correlation between LA-ICP-MS and FTIR bands. It is very interesting that little significant correlation exists between FTIR and LA-ICP-MS data, demonstrating the very good complementarity between the two approaches.

PCA

Principal component analysis (Fig. 10) also indicates that within this particular image, there is little correlation between element distribution and FTIR bands. As the mean spectrum has not been subtracted before PCA here, the first PC (bottom, blue) represents the mean of the data. The next 4 PCs describe essentially uncorrelated abundance variations of various elements with no significant correlation with FTIR features. PCs 6, 7 and 8, on the other hand, describe correlated variations in LA-ICP-MS and FTIR spectral features but account for less than 5% of the total variance (Fig. 10B). The last PC shown displays variations in the FTIR spectrum not significantly correlated with element variations.

Figure 10
(A) PCs 1 to 10 (from bottom to top) obtained after PCA of the data presented in Fig. 8. The mean spectrum has not been subtracted prior to PCA. (B) fraction of the variance explained as a function of the number of PCs. The red line reports the cumulative fraction of the variance explained.

Double clustering analysis

Double clustering analysis is commonly used when analysing gene transcription data. First, the mean spectrum has been subtracted from all spectra (merged FTIR/LA-ICP-MS data sets, see Fig. 8) and each value was normalized by the standard deviation. In a second step, the so-processed merged FTIR / LA-ICP-MS spectra have first been sorted according to a hierarchical cluster analysis. The spectral features (wavenumbers and elements) have then been sorted according to a K-means cluster analysis. Figure 11 reports the intensity of the sorted values.
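The two-way sorting described above can be sketched on a small synthetic matrix. Several choices here are assumptions made to keep the example short and deterministic: average linkage for the hierarchical step, a naive pairwise search that only suits small data sets, and a farthest-point initialisation of K-means instead of a random one. None of them should be read as the settings used in the study.

```python
import numpy as np

def hierarchical_order(data):
    """Leaf order from naive bottom-up clustering (Euclidean, average linkage)."""
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > 1:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merged members stay adjacent
        del clusters[b]
    return np.array(clusters[0])

def kmeans_labels(data, k, n_iter=50):
    """K-means with a deterministic farthest-point start (simplification)."""
    centres = [data[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(data - c, axis=1) for c in centres], axis=0)
        centres.append(data[d.argmax()])
    centres = np.array(centres, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = data[labels == j].mean(axis=0)
    return labels

def double_clustering(matrix, k_features):
    """Rows (spectra) sorted hierarchically, columns (features) grouped by K-means."""
    z = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)
    row_order = hierarchical_order(z)
    feature_labels = kmeans_labels(z.T, k_features)
    col_order = np.argsort(feature_labels, kind="stable")
    return z[np.ix_(row_order, col_order)], row_order, feature_labels

rng = np.random.default_rng(0)
row_groups = np.arange(20) % 2            # two kinds of "spectra", interleaved
col_groups = np.arange(12) % 2            # two kinds of "features", interleaved
matrix = 5.0 * (row_groups[:, None] == col_groups[None, :]) + 0.1 * rng.normal(size=(20, 12))
sorted_matrix, row_order, feature_labels = double_clustering(matrix, k_features=2)
```

In the sorted matrix, similar spectra and co-varying features end up next to each other, which makes block patterns such as those highlighted in a heat map visible.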

Figure 11
Representation of the intensities of the 10,780 FTIR/LA-ICP-MS spectra of section #3 presented on Fig. 8 passing a SNR threshold of 500 after double clustering analysis. Spectra were processed by subtraction of the mean and normalization by the standard deviation prior to clustering. The 10,780 spectra were sorted according to a hierarchical cluster analysis shown on top of the figure. The wavenumbers/elements were sorted in 4 clusters by K-means clustering. The dotted line on the left side of the figure indicates the limits of the clusters. The mean spectrum after sorting the wavenumbers/elements by the K-means (“sorted spectrum”) is also presented on the left side of the figure. For the sake of the clarity, the “sorted spectrum” is shown prior to mean subtraction and normalization by standard deviation.

Wavenumbers/elements clustering was obtained by the K-means method after mean subtraction and normalization by standard deviation. The limits of the clusters and the mean spectrum obtained after sorting the wavenumbers/elements appear on the left hand side of the figure. For the sake of clarity, the sorted spectrum is shown (on the left hand side of the figure) prior to mean subtraction and normalization by standard deviation. K-means cluster #1 contains the FTIR spectral region 1070–1020 cm−1. In K-means cluster #2, the right hand side of Amide I (wavenumbers <1645 cm−1) and the full Amide II band can be recognized, as well as the 1430–1380 cm−1 region. K-means cluster #3 contains the left hand side of Amide I (wavenumbers >1645 cm−1) and the 1380–1180 cm−1 region. K-means cluster #4 contains the FTIR spectral region found between 1020 and 900 cm−1 as well as all elements. Figure 11 reveals some correlations that were not apparent when looking at the entire dataset. An example is indicated by the two spectrum clusters identified by the blue rectangles on Fig. 11. In these particular clusters of spectra, wavenumber/element cluster #4 groups high values for 13C, 34S and 52Cr and the 1020–900 cm−1 FTIR spectral region assigned to glycosylation and phosphate vibrations.

Discussion

For the analysis of tissue sections, some features of infrared imaging are particularly interesting. One advantage is that it is fully compatible with FFPE (formalin-fixed, paraffin-embedded) samples. Currently, FFPE remains the standard for clinical histopathology. Samples are stable and the large library of FFPE tissues allows retrospective studies. Yet, while the morphology of the tissues is well preserved upon formalin fixation and paraffin embedding, nucleic acids are usually partially degraded, making NGS (next-generation sequencing) and transcriptomic studies difficult. LA-ICP-MS can also be applied to tissue sections and provides new information on the tissue. So far it has been used essentially to help immunochemistry imaging50 or to locate platinum-based anticancer drugs in tissues51, but relatively few works deal with measuring biologically relevant elements in tissue sections52. We previously showed on breast tumors that FTIR spectroscopy has a high potential to identify tissue types10,35, but we also showed that many FTIR biomarkers are highly correlated29. We also considered both FTIR and LA-ICP-MS for the investigation of rat brain after ischemic stroke, but the data were collected and analysed separately53. While FTIR imaging has a demonstrated use for diagnostics and prognostics in breast cancer, LA-ICP-MS is a completely orthogonal method that could complement FTIR with another set of markers. A key result obtained in this paper is the correlation analysis (Fig. 9), which indicates that there is no significant correlation between FTIR data and elemental analysis. Quite significant correlations exist within the FTIR data set, as indicated in Fig. 8. Similarly, some correlation exists between the abundances of different elements (Fig. 6). Yet, almost no correlation is found between the two techniques (Fig. 9). This is confirmed by the PCA reported in Fig. 10, which displays little covariance between the two methods before PC#6 (PC#1 is the mean). 
The LA-ICP-MS method therefore brings new, non-redundant data which can only help potential diagnostics. Even though it was not the purpose of the present paper to develop a diagnostic tool, a useful contribution of elemental analysis to diagnostics is supported by the role trace elements have in some enzymes involved in disease progression, e.g. metalloproteinases54, as well as in many zinc finger motifs involved in reprogramming the breast cancer transcriptional network55,56 related to metastasis.
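The comparison of within-technique and between-technique correlations discussed above (Figs. 6, 8 and 9) can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis pipeline of the paper. It assumes that the FTIR and LA-ICP-MS images have already been co-registered so that each row of both matrices refers to the same pixel, and it uses the Pearson correlation coefficient. The function names `cross_block_correlation` and `mean_abs_offdiag` are hypothetical.

```python
import numpy as np


def cross_block_correlation(ftir, elements):
    """Pearson correlations within and between the two data blocks.

    ftir     : array (n_pixels, n_wavenumbers) of absorbances
    elements : array (n_pixels, n_elements) of LA-ICP-MS intensities
    Rows of both arrays are assumed to refer to the same pixels.
    """
    n_w = ftir.shape[1]
    # One correlation matrix for the merged set of variables (columns).
    r = np.corrcoef(np.hstack([ftir, elements]), rowvar=False)
    within_ftir = r[:n_w, :n_w]      # wavenumber vs wavenumber
    within_elements = r[n_w:, n_w:]  # element vs element
    between = r[:n_w, n_w:]          # wavenumber vs element
    return within_ftir, within_elements, between


def mean_abs_offdiag(r):
    """Mean absolute correlation of a square block, diagonal excluded."""
    mask = ~np.eye(r.shape[0], dtype=bool)
    return float(np.abs(r[mask]).mean())
```

Comparing `mean_abs_offdiag(within_ftir)` with `np.abs(between).mean()` gives a single-number summary of how redundant the FTIR variables are relative to their correlation with the elements. The summary statistic itself is a choice made for this example.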

For microscopy approaches, resolution is an issue. As reviewed elsewhere8 for FTIR imaging, resolution is diffraction-limited, which means intracellular details will generally not be resolved57,58. Furthermore, pixel content may also be affected by the point spread function of the Schwarzschild optics58,59. The optimal size of the pixels has been evaluated by Reddy et al.60. Roughly, the wavelength (5–10 µm for the spectral range considered in this study) places a limit on the expected spatial resolution. Though there are means to record infrared images at much higher resolution, they are not practically usable when several cm² have to be analyzed. Yet, numerous studies cited earlier in this paper have demonstrated the usefulness of FTIR imaging for the analysis of tissue sections. When looking at essential trace elements, a resolution of 50 μm is a reasonable compromise between resolution and sensitivity61,62. Though single cell analysis is out of reach, pathologies like cancer usually display sufficient cell density to allow a precise characterization of the cell type. The LA-ICP-MS technique can therefore give sufficient sensitivity and spatial resolution to link the elemental data with the molecular data obtained from FTIR imaging in cancer pathologies. Similarly, characterization of changes in the extracellular matrix, already shown to be feasible by FTIR imaging11,41,45, is well suited to LA-ICP-MS.

It must be stressed here that the goal of the paper was to describe how FTIR and LA-ICP-MS imaging data can be combined and analyzed simultaneously to provide a larger set of markers. We used a set of 6 breast cancer tissues with different pathologies (Table 1). The samples were selected for their within-image and between-image diversity of tissues. Within this sampling, we could conclude that the elemental markers do not significantly covary with the FTIR markers, underlining the complementarity between the two methods.

In conclusion, the results obtained in this paper show the feasibility of merging FTIR and LA-ICP-MS datasets, providing a hybrid set of markers based respectively on organic molecules and on trace elements. The correlation analyses and PCA presented in the paper show that little correlation could be found here between FTIR and LA-ICP-MS values. Given the limited size of the sampling tested, this is a good indication that the two do not co-vary and therefore each bring their own independent information. Interestingly, in a recent paper, Anyz et al.31 developed a similar concept to adequately compare LA-ICP-MS images and H&E-stained section images. Their goal was to better relate the abundance of Cu and Zn to histological features. The present paper adds the FTIR dimension, which contains a demonstrated series of biomarkers. The next step will be to repeat the analysis on a much larger selection of tissues with more specific pathologies.