Rapid brain structure and tumour margin detection on whole frozen tissue sections by fast multiphotometric mid-infrared scanning

Frozen section analysis is a frequently used method for examination of tissue samples, especially for tumour detection. In the majority of cases, the aim is to identify characteristic tissue morphologies or tumour margins. Depending on the type of tissue, a high number of misdiagnoses are associated with this process. In this work, a fast spectroscopic measurement device and workflow were developed that significantly improve the speed of whole frozen tissue section analyses and provide sufficient information to visualize tissue structures and tumour margins based on their lipid and protein molecular vibrations. This optical, non-destructive method is based on selected wavenumbers in the mid-infrared (MIR) range. We present a measuring system that substantially outperforms a commercially available Fourier Transform Infrared (FT-IR) imaging system, since it enables acquisition of reduced spectral information over a scan field of 1 cm 2 in 3 s, with a spatial resolution of 20 µm. This allows fast visualization of segmented structure areas with little computational effort. For the first time, this multiphotometric MIR system is applied to biomedical tissue sections. We reference our novel MIR scanner on cryopreserved murine sagittal and coronal brain sections, focusing especially on the hippocampus, and show its usability for rapid identification of primary hepatocellular carcinoma (HCC) in mouse liver.

for transmitted and confocal reflected light examinations 12,15,16 . Reflection-based microscopic systems require hematoxylin and eosin (H&E) staining or various fluorophores to achieve optimal molecule-dependent analysis results 17 . Furthermore, the related sample preparation is labour-intensive. Raman-based tissue scanners exhibit high detection sensitivity and are well suited for tissue section analysis 18 . However, short measurement times can only be reached for small scanning areas. Therefore, they are not competitive with established scanning methods that can analyse whole tissue sections for pathological assessment within an acceptable time [18][19][20] .
In clinical settings, rapid tissue analysis is also required during surgery. Especially in the case of tumour resection, such as skin cancer 21 , lymph nodes 22,23 or liver tumours [24][25][26] , pathological examination is performed during surgery to determine the progress of tumour resection. In general, the sample is analysed by H&E staining and has to be evaluated by a pathologist. This evaluation of the degree of tumour removal can be made intraoperatively 27 . The aim is to extirpate the preoperatively diagnosed findings while relying on a dependable intraoperative assessment of a frozen section 1 . Although this method is widely used to analyse frozen sections, the analysis is time-consuming and interrupts the running surgical process 22 . Furthermore, although this method is the currently approved methodology, the pathological evaluation of H&E-stained sections is subject to inaccuracy [28][29][30] . Independent studies have shown that tissue sections of the same type can be assessed differently by different pathologists, which can result in misdiagnosis. In certain cases, a tumour can be classified as malignant or benign or neither, leading to incorrect treatment of the patient. Intraoperative misdiagnosis of pathological tissue sections inevitably leads to an unsuitable course of the surgery [31][32][33] .
The aim is to find typical tissue structures or tumour margins by rapid chemical segmentation in order to select areas of interest for further, more precise investigations. This is valid for both laboratory and clinical applications 1,6 . In most cases, the tumour and its rough location are known from preoperative examinations. However, the precise boundaries to adjacent tissue are not entirely defined.
Therefore, a reliable measurement procedure that combines fast measurement over a large measuring range (> 1 cm 2 ) with a low data load is essential. In addition, the spatial resolution has to be as high as technically feasible. Balancing these requirements is the challenge addressed in this project. In this work we show that we have succeeded in developing a measuring method that meets these requirements. This work is not about classifying tumours into tumour types or conducting in-depth molecular investigations. We focus on the rapid segmentation of general tissue structures, which includes the rapid recognition of tumour margins in tissue sections. We validate and test our measurement system on brain structures and thus demonstrate that the measurement method is competitive with the already established FT-IR imaging method. Subsequently, we present a scanning method that is capable of detecting naturally grown liver carcinoma.

Results
The basis for our experimental procedures is a novel MIR scanner that will be integrated into a common workflow for frozen section analysis. To clarify the significance of the MIR scanner, its properties and functions are described in the following. An overview of the workflow then shows how the developed scanning system fits into a familiar workflow of tissue analysis. The measurement results focus specifically on the detection of the hippocampus in the coronal plane of mouse brain and the recognition of tumour margins in mouse liver.
Middle infrared scanner. The optical measuring system presented here consists of a flying spot scanner [34][35][36] . As previously published, the system is designed to scan a sample surface with a non-contact technique 37,38 . In that earlier work, the measuring system was based on two laser wavenumbers that are suitable for detection of long-chained C-H molecules, and the related experiments referred specifically to polymers and body fat of animals. For the application presented in this work, the laser unit was extended by two lasers to a total of four. Thus, the scanning process is performed with four wavenumbers in the mid-infrared range.
The wavenumbers (2790 cm −1 /~ 3.6 µm, 2926 cm −1 /~ 3.4 µm, 3350 cm −1 /~ 3 µm, 3700 cm −1 /~ 2.7 µm) of the four distributed feedback (DFB) lasers (nanoplus Nanosystems and Technologies GmbH, Germany) are selected so that one laser wavenumber is absorbed by lipids (target laser lipids), while a second laser emits a wavenumber that lipids absorb only weakly or hardly at all (reference laser lipids). A similar laser combination is applied to protein vibration bands. This results in two target lasers and two reference lasers. Selecting suitable wavenumbers requires examining the absorption bands of lipid and protein vibrations in the spectral range. The spectral range in the infrared (IR), especially the mid-infrared, is divided into the functional group region and the fingerprint region 39 . These two regions include the most important vibrations of molecules. The individual vibration types such as deformation vibration and valence vibration are located in the fingerprint region. In the functional group region, the valence vibrations of lipids are particularly important, since they are extremely strong in this range. Valence vibrations of lipids have a vibration peak at 2926 cm −1 40,41 . For proteins, the vibration band in the functional group region varies between 3200 cm −1 and 3500 cm −1 41 , depending on the type of protein. The selected laser wavenumbers for the MIR scanner setup (Fig. 1) are 2790 cm −1 (reference lipids, L1), 2926 cm −1 (C-H vibration band for target lipids, L2), 3350 cm −1 (N-H vibration band for target proteins/amide A, L3) and 3700 cm −1 (reference proteins, L4). In principle, the target lasers are sufficient for later clustering and segmentation of the scan results. The reference lasers mainly represent the scattering properties of the sample, whereas the target lasers contain its absorption properties. A combination of the reference and target lasers improves the subsequent clustering by providing more spectral information about the sample.
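The approximate wavelengths quoted for the four lasers follow from the standard relation λ [µm] = 10 000 / ν̃ [cm −1 ]. The short sketch below reproduces that conversion; the dictionary keys and the function name are illustrative choices and do not come from the scanner software.

```python
# Convert the four laser wavenumbers (cm^-1) into wavelengths (µm).
# Relation used: wavelength_um = 1e4 / wavenumber_cm

# Wavenumbers as listed in the text; the labels are illustrative.
LASER_WAVENUMBERS_CM = {
    "L1 reference lipids": 2790.0,
    "L2 target lipids (C-H)": 2926.0,
    "L3 target proteins (N-H, amide A)": 3350.0,
    "L4 reference proteins": 3700.0,
}


def wavenumber_to_wavelength_um(wavenumber_cm: float) -> float:
    """Return the wavelength in micrometres for a wavenumber in cm^-1."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    return 1.0e4 / wavenumber_cm


wavelengths_um = {
    name: wavenumber_to_wavelength_um(nu)
    for name, nu in LASER_WAVENUMBERS_CM.items()
}

for name, lam in wavelengths_um.items():
    print(f"{name}: {LASER_WAVENUMBERS_CM[name]:.0f} cm^-1 = {lam:.2f} µm")
```

Rounded to one decimal place, the results agree with the approximate values given in the text (3.6, 3.4, 3.0 and 2.7 µm).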
The lasers (L1 to L4) are used in sequence for the following measurements (Fig. 1a). A mirror mounted on a slide on a linear axis moves automatically from one laser module to the next and thus couples the selected laser into the optical focus unit. The focused laser beam is guided onto the sample by an agile mirror unit. A scanning field is achieved by orthogonal movement of the sample by means of a 2D translation stage. The measuring system acquires the confocal direct reflex via an infrared (IR) detector. The detection signal is coupled out of the optical system via a beam splitter (BS). This enables the setup to run at a sampling rate of ~ 2.7 MS/s (including oversampling) with a spatial resolution of 20 µm. The spatial resolution is determined by means of calibration targets with defined resolution patterns. It results from the mirror velocity of the agile mirror unit and the velocity of the translation stage. As a result of the mirror movement, the laser is deflected along a defined and calibrated pathway on the sample. On this resulting laser line, a certain number of measuring points is acquired. The temporal acquisition of the individual measuring points results in the spatial resolution. To ensure a constant resolution, the data acquisition is triggered on each new laser line generated by the align mirror unit. In combination with the constant translation stage velocity, a measurement field with the specific spatial resolution is then generated. Hence, the scans are performed by scanning defined lines on the sample while the translation stage is moving orthogonally. This results in an array of pixels represented by a measurement image. The measuring time for scanning a sample area of 1 × 1 cm 2 is 3 s. The following measurements are based on a spatial resolution of 20 µm. At constant spatial scanning resolution, the scanning time increases proportionally with the scanning range.
Compared with other proposed measurement systems, our measurement method is much faster and can be integrated into routine tissue testing. Moreover, this method is not limited to transparent samples for transmission light examination. The MIR scanner is primarily intended for the approximate detection of tumour margins on the basis of spectral data. Depending on the application and need, a high-resolution examination (spatial resolution less than 5 µm) can be performed using an additional measurement method, such as high-resolution infrared microscopy 16 . In this case, the additional measurement process can be based on the measurement data of the MIR scanner and its preselected ROI. The fundamental feasibility of the MIR scanner for tissue imaging is shown in Fig. 1b-e. These are scan results of brain from mouse type C57BL/6 in the sagittal plane. Each scan result includes 400k measurement points, acquired within ~ 4.8 s. The measurement images are additionally coloured for this illustration to indicate the different laser wavenumbers. The measurement images show significant differences for each laser used. Target (Fig. 1b,d) and reference (Fig. 1c,e) wavenumbers are distinguished from each other based on the measurement data, since the intensity of the pixels represents the absorbance. The scan results are normalized and corrected with a previously acquired background measurement. Since these measurement results merely demonstrate the MIR scanner, they are not discussed in detail here.
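The proportional relation between scan area and scan time can be checked against the numbers quoted in the text. The following sketch assumes a strictly linear scaling anchored at the stated reference (1 × 1 cm 2 at 20 µm, i.e. 250k points in 3 s); the function names and the rectangular-field simplification are assumptions made for illustration.

```python
# Estimate point count and scan time at constant 20 µm resolution,
# assuming scan time scales linearly with the number of measurement points.

REFERENCE_POINTS = 250_000   # 1 cm x 1 cm field at 20 µm (500 x 500 points)
REFERENCE_TIME_S = 3.0       # stated scan time for that field


def measurement_points(width_mm: float, height_mm: float,
                       resolution_um: float = 20.0) -> int:
    """Number of pixels in a rectangular scan field."""
    nx = round(width_mm * 1000.0 / resolution_um)
    ny = round(height_mm * 1000.0 / resolution_um)
    return nx * ny


def scan_time_s(points: int) -> float:
    """Scan time under the linear-scaling assumption."""
    return REFERENCE_TIME_S * points / REFERENCE_POINTS


print(measurement_points(10.0, 10.0))   # points in the 1 cm x 1 cm reference field
print(scan_time_s(400_000))             # sagittal brain scan (400k points)
print(scan_time_s(375_000))             # healthy liver scan (375k points)
```

Under this assumption, 400k points correspond to 4.8 s and 375k points to 4.5 s, which matches the acquisition times reported for the respective scans.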
Workflow. Two adjacent sections of the respective tissue are prepared for the investigations. One section is used for spectral measurement with FT-IR imaging followed by MIR scanning as shown in Fig. 2. The second section is used for H&E staining for better visibility of morphological structures of the respective tissue. Before the spectral measurement, the section is dried in a desiccator for 10 min. Drying is necessary to minimize the water content in the sample. Otherwise, the O-H vibration superposes the C-H and N-H valence vibrations, which makes a reliable measurement of the C-H and N-H bands in the functional group region impossible. The subsequent FT-IR imaging acquires a full spectrum (4000 cm −1 to 750 cm −1 ) and is used as a reference system. In comparison, the novel MIR scanner method uses only the previously described wavenumbers. A comparison is performed between the established FT-IR imaging method 6 and the novel MIR measuring method. Pre-processing steps such as denoising (locally adaptive total variation regularization 42 applied to the measurement images) and normalisation prepare the mid-infrared data of both scanning methods for clustering. For each scanning method, the processed measurement data are clustered by the k-means algorithm 43 . The resulting clustering and segmentation groups related data that can be associated with interrelated structures. The advantage of this method is that multiple wavenumbers are considered simultaneously and related data sets are displayed combined in a single image 6 . Thus, structures of different wavenumbers become visible. Subsequently, further complementary processes can be added as shown in a previous work 6 .
Based on the H&E image the referencing of clustered FT-IR and MIR images is feasible. In addition, H&E sections are used to evaluate which structures are better visible with the respective scanning method. At the end of our experiments we assume that the MIR scanning can also substitute the FT-IR process, which is indicated by the dashed line in Fig. 2.
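The normalisation and clustering steps of the workflow can be illustrated with a minimal sketch. It is not the authors' implementation: the denoising step is omitted, min-max scaling is assumed as the normalisation, plain Lloyd's k-means is used, and the six three-channel "pixels" (L1, L2, L3) are invented values for illustration only.

```python
import random


def normalise_channels(pixels):
    """Min-max scale each wavenumber channel to [0, 1] (assumed normalisation)."""
    n_channels = len(pixels[0])
    lo = [min(p[c] for p in pixels) for c in range(n_channels)]
    hi = [max(p[c] for p in pixels) for c in range(n_channels)]
    return [
        [(p[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
         for c in range(n_channels)]
        for p in pixels
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(pixels, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centres)."""
    rng = random.Random(seed)  # fixed seed gives a repeatable initialisation
    centres = [list(p) for p in rng.sample(pixels, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each pixel joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centres[j]))
            for p in pixels
        ]
        if new_labels == labels:  # assignments unchanged, so converged
            break
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Invented absorbance values per pixel: (L1 reference, L2 lipid, L3 protein).
raw_pixels = [
    [0.10, 0.80, 0.30], [0.12, 0.78, 0.32], [0.11, 0.82, 0.29],  # lipid-rich
    [0.20, 0.40, 0.60], [0.21, 0.42, 0.62], [0.19, 0.38, 0.58],  # protein-rich
]

labels, centres = kmeans(normalise_channels(raw_pixels), k=2)
print(labels)
```

In a real scan, each label would be mapped back to its pixel position to form the segmented image. The cluster numbering depends on the random initialisation, so only the grouping, not the label value, is meaningful.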
Figure 2. Schematic experimental workflow for precise comparison of two IR scanning systems. Two adjacent tissue sections are used. After drying, one section was used for spectral measurements by FT-IR imaging followed by MIR scanning. After data pre-processing, clustering by k-means was performed. The second tissue section was H&E-stained for structure referencing.
Clustering the hippocampal region. For segmentation and clustering of the hippocampus, the coronal section of a brain from mouse type C57BL/6 is used. This experiment does not require all wavenumbers integrated in the MIR scanner. Figure 3a shows the spectral properties of selected parts of the mouse brain. The data are acquired by FT-IR and refer to measurement points within grey (cortex) and white (corpus callosum) matter 44 as well as in the hippocampus (Fig. 3b), especially in the CA3 region 45 . In the spectra, the information at the wavenumbers of laser 1 and laser 4 does not differ except for an offset, so it is not necessary to use both lasers in this case. Therefore, MIR measurement results are presented for three wavenumbers. Laser 2 and laser 3 differ significantly in absorbance. This indicates that protein and lipid content change for each measured substance. Although laser 3 is not located exactly on the vibrational peak of proteins (N-H), the absorption difference is still sufficient to distinguish the individual substances. The spectral data from the MIR scanner and the FT-IR imager can be compared based on a similarity calculation. The similarity calculation indicates that the selected wavenumbers are comparable between the two measurement techniques. The similarity calculation (Fig. 4) is based on the selected wavenumbers 2790 cm −1 (laser 1), 2926 cm −1 (laser 2) and 3350 cm −1 (laser 3) and is applied to the measurement results from Fig. 5. The data of both measurement techniques are mapped to each other.
Subsequently, the individual data sets are compared via the structural similarity index 46 . The index is composed of the calculation of the mean squared error (MSE) 47 . The similarity between the data sets is ~ 75%. The labelled region of interest shows that the structural variation between the measurement images is minor (Fig. 4). In contrast, the intensities of the measuring signals differ significantly, which affects the similarity index.
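The effect of an intensity difference on the similarity index can be illustrated with a simplified calculation. The sketch below uses the standard structural similarity formula evaluated once over the whole image (a single global window) together with the MSE; whether the authors used a windowed variant or other constants is not stated, and the pixel values are invented for illustration.

```python
# Global (single-window) SSIM and MSE for two flattened images in [0, 1].
# Constants c1 and c2 follow the commonly used defaults (k1 = 0.01, k2 = 0.03).


def mse(a, b):
    """Mean squared error between two equally long pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def global_ssim(a, b, data_range=1.0):
    """SSIM computed from global means, variances and covariance."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Invented example: same structure, second image shifted in intensity.
image_a = [0.1, 0.4, 0.8, 0.3, 0.6, 0.9]
image_b = [x + 0.2 for x in image_a]

print(global_ssim(image_a, image_a))  # identical images
print(global_ssim(image_a, image_b))  # same structure, intensity offset
print(mse(image_a, image_b))
```

In this toy case the structure is identical, yet the index drops below 1 solely because the mean intensities differ, which is the kind of effect described for the MIR and FT-IR signals.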
Figure 5 (caption, beginning not recovered): ... in relation to the Allen Brain Atlas for mouse brain (2015 Allen Institute for Brain Science. Allen Brain Atlas API. Available from: brain-map.org/api/index.html) 54,55 . The hippocampal region is distinctly segmented as a coherent structure.
The images produced by the MIR scanner (Fig. 5a) differ for the different wavenumbers. Each scan result includes 250k measurement points, acquired within ~ 3 s. Grey matter (cortex and CA3 region) and white matter (corpus callosum) are visible for all wavenumbers. White matter has an absorption maximum at wavenumber 2926 cm −1 in the functional group region 48 . This is due to the increased CH 2 or CH 3 vibrations of white matter compared to grey matter 49 . The vibration of the N-H band, especially the vibration band of secondary amines near wavenumber 3350 cm −1 , is detected [50][51][52] . Significant differences between measurement results for wavenumber 2926 cm −1 and 3350 cm −1 are evident for some structures. Especially the hippocampus, due to its molecular composition, shows distinct differences in the vibrational properties of lipids and amines [51][52][53] . This is apparent in the measurement results too. For reference wavenumber 2790 cm −1 , the hippocampal area is detected at a low absorbance level, so this wavenumber represents an approximately ideal reference for this brain region. Despite the limited spectral information from the MIR scanner, clustering can be performed using a k-means algorithm. In this case, the focus is to achieve similar clustering results for MIR and FT-IR imaging. For both measurement methods, structures in the mouse brain are clustered (Fig. 5b,c). The direct comparison between the two measurement results shows that the essential structures such as the grey and white matter are clustered in a comparable way. For clustering in this case, the MIR scan data consist of only three wavenumbers per pixel.
In comparison, the spectral FT-IR dataset comprises over a hundred wavenumbers per pixel. Due to the larger amount of information in the FT-IR measurement results, the clustering produces smoother structures and more homogeneous areas. In addition, some areas of the brain are more distinctly visible. This is because a significant part of the spectral fingerprint region is included in the clustering of FT-IR images. However, distinct structures are also clustered in the measurement results of the MIR scanner, such as the white matter, the grey matter and the entire hippocampal region. Parts of the thalamus and hypothalamus are rudimentarily clustered too. The segmentation of the hippocampus is done at k = 4. For measurement results via FT-IR imaging, the clustering and segmentation is done for k = 4 as well. Note that this k-value counts only the clusters located on the specimen, each of which is represented by a certain colour in the resulting figures. Some clusters occurred on the background surface and are therefore rejected. In comparison to the Allen Brain Mouse Atlas 54,55 , the region of the hippocampus is particularly well clustered for MIR scans (Fig. 5d). The resulting segmentation shows the hippocampus as an approximately coherent structure. The associated H&E staining of the sample confirms the measurement results of the MIR scanner regarding the hippocampal region.

Examination of spontaneously occurring primary hepatocellular carcinoma. Based on successful segmentation of defined and specific structures in mouse brain, the next validation step for the MIR scanner is the investigation of tissue structures that are not distinctly defined. The examination of a spontaneous primary hepatocellular carcinoma (HCC) in mouse liver therefore offers suitable conditions, since these tumours usually show inhomogeneous tissue structures 56 . At the start of this investigation, it is necessary to ensure that both scanning systems detect healthy mouse liver equivalently (H&E staining, Fig. 6a) and that k-means clustering yields comparable results. The H&E-stained section from Fig. 6a shows the location on the sample of the measurement performed in Fig. 6b. In addition, a magnified section (Fig. 6b) shows the morphological structure of healthy liver. This can be compared with the morphological structure (surrounding structure) from Fig. 6c. This experiment is performed using all four wavenumbers of the MIR scanner. The approximately homogeneous structure of a healthy mouse liver is detected with comparable parameters for both measurement systems, the MIR scanner and the FT-IR imager. The MIR scanner dataset for each laser consists of 375k measurement points, acquired in ~ 4.5 s. In Fig. 6b, the clustered scan results are shown next to the adjacent section stained by H&E. In this case, the entire tissue section is spectrally acquired and clustered for both measurement methods. Subsequently, the clustered measurement data are mapped onto the H&E image. The image section is aligned and cropped identically for both data sets. The k-means algorithm has a high run-to-run variance, so the results of the MIR and FT-IR scans are not identical. For this reason, a pixel-by-pixel comparison is not valid. Nevertheless, the two methods show a very similar pattern for clustering with k = 2 within healthy liver.
For k = 3, similar segmentation results are obtained as well, but they differ from each other in size and location. This emphasizes the variance of the algorithm when different amounts of spectral data are used, and also demonstrates that for low spatial scan resolutions, and thus less structural information, segmentation of approximately homogeneous tissue is challenging.
The examination of HCC is performed on a sample from mouse type BALB/c-Abcb4 −/− 57,58 . By visual assessment of the diseased liver, a tumour nodule of about 3 mm could be observed at the left lobe. The sample is chosen to include a transition from the nodule to surrounding tissue. This way, the detection of tumour margins of an infiltrating tumour is topographically feasible and provides a direct comparison between tumour nodule and surrounding regions. The corresponding measurement results (250k measurement points per laser, ~ 3 s acquisition time per wavenumber; Fig. 6c) of the MIR scanner show a precise differentiation between the healthy and diseased tissue for clustering with k = 2. Higher k-values segment a transition area between healthy and diseased tissue. In comparison, measurement results for the FT-IR imager exhibit the same margins. However, the detection and clustering of tumour margins are slightly different for the two scanning methods. This effect might be due to the different degree of drying of blood within the liver. Blood vessels in the liver change their spectral properties in the N-H vibration band with increasing drying time 59 . The time between FT-IR and MIR measurement is about 2 h. Thus, the measurement result for this organ also changes slightly. In addition, the heterogeneous margin in both MIR images represents the infiltrating property of the tumour. Spectral analysis of the tumour nodule in relation to the surrounding area shows that these two regions are fundamentally different from one another (Fig. 6d). In this spectral range, it is predominantly the proportion of long-chain C-H molecules that indicates tissue changes within the liver. Based on clustering with k = 3 (Fig. 6c), it can be concluded that a substructure within the HCC is detected. A more detailed investigation of the substructure is part of a further study and is therefore not discussed here. It should merely be pointed out that, with the current settings, the MIR scanner can potentially detect such a structure in a manner comparable to the FT-IR imager.
Both scanning methods (FT-IR and MIR) produced equivalent measurement results, although they are based on different amounts of spectral information. Figure 6e summarises the main differences between FT-IR imager and MIR scanner. The comparison is based on scanning one square centimetre with a spatial resolution of 25 µm for the FT-IR imager and 20 µm for the MIR scanner. The previous measurement images and measurement data were also generated with these parameters. In this case, the MIR scanner scans faster than the FT-IR imager by a factor of 225. The data load is reduced by a factor of 136, and the calculation time for data pre-processing and clustering is reduced by a factor of 10 for the MIR scanner. The comparison is based on an established workflow 6 and is intended to show the optimization of the workflow by the MIR scanner.

Discussion
With the novel MIR scanner, it was demonstrated that the segmentation of selected tissue structures in mouse brain is possible using selected wavenumbers in the low mid-infrared range. Here, the focus is on detection of lipid structures and structures associated with amide A. This results in the detection of the characteristic structure of the hippocampus. White and grey matter were detected with the MIR scanner as well. The measurement results of the MIR scanner are directly comparable to the results of FT-IR imaging in combination with k-means clustering. The MIR scanner provides just three or four spectral values per pixel, depending on the use case. In contrast, the pixels for segmentation with the FT-IR imager consist of more than a hundred spectral values, including the fingerprint region. Thus, some tissue structures are clustered more precisely. Nevertheless, the MIR scanner is capable of providing comparable final results for specific applications based on reduced information content. The detection of a naturally grown tumour in mouse liver (hepatocellular carcinoma) is also successfully performed by referencing with the FT-IR imager. However, the MIR scanning method has the major disadvantage that it can only detect selected fixed wavenumbers, which are not sufficient for every application. Despite this, we are able to accelerate the scanning process by a factor of 225 compared to FT-IR imaging. The data load of the MIR scanner is reduced by a factor of > 136 in comparison with an established workflow 6 . Depending on the application, the laser modules have to be adapted or supplemented by FT-IR imaging. Even with this limitation, a first fast analysis of a thin tissue section is achievable. As we have shown, the MIR scanner is much faster than conventional reference methods 6,10,13,14 for the rapid analysis of frozen sections.
In this context, the presented MIR method does not replace H&E staining or even histological structural assessment, especially in the clinical field, by a pathologist. It can, though, be used as a supporting tool in the future.
Based on the measurement results presented here, the examination of further tissue sections (with tumour) of other organs such as spleen, kidney or heart is essential. In this context, a finer spatial scan resolution (a smaller resolvable feature size than the current 20 µm) is required to visualize smaller structures and to perform further analyses. With such a finer resolution, an investigation of neurons within mouse brains would be feasible. There is potential for the detection of various plaques in the hippocampus as part of the investigation of Alzheimer's disease. In addition, based on the knowledge of automated guided MALDI (Matrix-Assisted Laser Desorption/Ionization) examinations 6 , a link between the MIR scanner and MALDI MSI (Mass Spectrometry Imaging) becomes available. The rapid pre-diagnosis by the MIR scanner makes it possible to select predefined regions of interest within the tissue for MALDI MSI analysis, which reduces the acquisition time for high-resolution data acquired at small pixel sizes. Furthermore, it avoids unused data from measurement ranges that are not relevant. Thus, MALDI MSI can be used more efficiently.
Considering the wavenumbers used, the MIR scanner provides significant added value for clinical applications as well. After validating the measurement setup on human tissue samples, the MIR scanner can support clinical frozen section analysis. With sufficient knowledge of the sample composition of lipids and proteins, the pathologist can also use the MIR scanner as a pre-processing tool for further investigations. Thus, for larger samples, a segmentation of relevant tissue regions or tumour areas could be performed, which the pathologist confirms with histopathological knowledge. For this topic, further studies with the MIR scanner need to be initiated to prove its feasibility.
Furthermore, the reliability of the measurement data has to be improved. In particular, the reproducibility of the clustered measurement results via k-means needs to be ensured. The dynamic determination of the individual clusters during the calculation of k-means is problematic and considerably limits the reproducibility. However, even with this limitation, we were still able to deliver comparable and usable results for the frozen section analysis.
Taken together, this study presented a novel non-destructive measurement system for fast structure and tumour detection. We validated our measuring setup by referencing our results with a commercially available FT-IR imaging system. We thereby demonstrated a new rapid analytical option for intraoperative frozen tissue section analysis or rapid pre-analysis for MALDI MSI.

Methods
Optical characterisation and settings of the MIR scanner. Based on the measurement of absorbance differences depending on molecule vibrations, the MIR scanner provided measurement points representing the vibration intensity of the local molecules of the sample. Thus, each measuring point represented an intensity value determined by the local, concentration-dependent absorption behaviour of the sample. The absorption was referenced to the background of the sample, especially to the absorption characteristic of the microscope slide. Ideally, the slide was coated with gold or silver in order to be able to make a comparable statement of the absorption via FT-IR or DRIFTS (Diffuse Reflectance Infrared Fourier Transform Spectroscopy). In our experiments, silver-coated slides (Kevley Technologies, Ohio) were used because gold-coated slides did not provide a beneficial difference.
The entire optical setup of the MIR scanner was designed for the measurement of functional groups. Thus, the design of the optical components was also focused to this spectral range. The spectral range to be detected was between 2000 cm −1 and 4000 cm −1 . The IR detector was configured for this spectral range and had its detection maximum at this specific spectral region. The single element sensor had a sensor area of 1 × 1 mm 2  www.nature.com/scientificreports/ four-stage pre-amplified. A sapphire window with anti-reflective coating was located in front of the sensor chip. Thus, the incident collimated measuring signal was projected optimally onto the sensor chip. In addition, all lenses (CaF 2 and Black Diamond) of the MIR scanner were coated with an anti-reflective coating. The IR detector and the lens coating improved the optical performance of the MIR scanner in this spectral range, but limited it at the same time. To remove the spectral limitation caused by lenses, the coating of lenses in combination with the lens material had to be modified, depending on the application [60][61][62] . Optical components from the align mirror unit consisted of a unit called ELEFHANT Precession (Novanta Europe GmbH, Germany). The including mirrors were gold-coated to improve the reflective properties in the middle infrared spectral range. The focus unit consisted of a combination of several lenses (CaF 2 , anti-reflective coated for the spectral range between 4000 cm −1 and 2000 cm −1 ). The lens system was able to automatically adjust the focus point. Therefore, a focus shift due to the wavelength were adjusted. The lens system had a numerical aperture of 0.12 and a magnification of 1.25. The MIR scanner had four wavenumbers that were used for measurements. The individual wavenumbers were represented by Distributed Feedback (DFB) lasers (nanoplus Nanosystems and Technologies GmbH, Germany) that illuminated the sample depending on the measurement and necessity 37,38 . 
A lens system was used to focus the laser light onto the sample.
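For orientation, the spectral range stated above can be converted from wavenumbers to wavelengths. The following is a minimal sketch of that unit conversion only; the function name is hypothetical and not part of the scanner software.

```python
def wavenumber_to_wavelength_um(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a vacuum wavelength in micrometres."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 10^4 um, so wavelength [um] = 10^4 / wavenumber [cm^-1]
    return 1.0e4 / wavenumber_cm


# Limits of the detected spectral range given in the text
short_limit_um = wavenumber_to_wavelength_um(4000.0)
long_limit_um = wavenumber_to_wavelength_um(2000.0)
print(f"{short_limit_um:.1f} um to {long_limit_um:.1f} um")
```

The range of 4000 cm⁻¹ to 2000 cm⁻¹ therefore corresponds to wavelengths of 2.5 µm to 5 µm.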
Measurements with the MIR scanner were performed by placing the sample on the 2D translation stage. The translation stage moved automatically into the predefined scanning field of the MIR scanner. The scanning velocity of the align mirror was set to 5.3 m/s with a spatial resolution of 20 µm. For a laser line of 1 cm length, this resulted in 5000 measurement points (including oversampling by a factor of 10). The mirror velocity predominantly defined the sample rate of the data acquisition and the velocity of the translation stage. For 5000 measurement points per measurement line (in this case a scanning field of 1 × 1 cm²), this resulted in a sample rate of ~2.7 MS/s and a translation stage speed of ~3.3 mm/s. For each measuring line, an offset of 0.5 cm was included at the beginning and end of the line to ensure that the scanning speed was constant in the measuring range. The parameters referred to a corrected image field and a working distance of ~2 cm. Without calibration of the image field, image distortions occurred because the mirror deflection did not match the target pathway on the sample. When different wavenumbers were used, the lasers were switched on sequentially. A background correction was performed via a reference measurement for each laser. The reference measurement was performed on the substrate on which the specimen was located, in this case a Mirr-IR slide (Kevley Technologies, Ohio). The background measurement was then offset against the measurement signal of the respective laser. In addition, the measurement images were pre-processed according to the described workflow.
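The scan parameters above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the stage advances by one 20 µm step per scan line and that a 1 × 1 cm² field is acquired in 3 s, as stated in the abstract. The variable names are hypothetical.

```python
# Values stated in the text
mirror_speed_m_s = 5.3   # scanning velocity of the align mirror
pixel_size_m = 20e-6     # spatial resolution
line_length_m = 1e-2     # length of one measurement line
oversampling = 10        # oversampling factor

# Assumption: acquisition time of a 1 x 1 cm^2 field, taken from the abstract
field_time_s = 3.0

# Measurement points per line, including oversampling
pixels_per_line = round(line_length_m / pixel_size_m)
samples_per_line = pixels_per_line * oversampling

# Sample rate needed to record `oversampling` samples per 20 um at mirror speed
sample_rate_hz = mirror_speed_m_s / pixel_size_m * oversampling

# Assumption: one scan line per 20 um stage step across the 1 cm field
lines_per_field = round(line_length_m / pixel_size_m)
stage_speed_mm_s = line_length_m * 1e3 / field_time_s
line_period_ms = field_time_s / lines_per_field * 1e3

print(f"samples per line: {samples_per_line}")
print(f"sample rate: {sample_rate_hz / 1e6:.2f} MS/s")
print(f"stage speed: {stage_speed_mm_s:.2f} mm/s")
print(f"time budget per line: {line_period_ms:.1f} ms")
```

Under these assumptions the calculation gives 5000 samples per line, about 2.65 MS/s and about 3.3 mm/s, in line with the rounded values of ~2.7 MS/s and ~3.3 mm/s given in the text.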
FT-IR imaging. The FT-IR measurement results listed in this work were acquired with the Perkin Elmer Spotlight 400 FT-IR imager 6 (Perkin Elmer Inc., USA). With the FT-IR imager, the spectrum from 4000 to 750 cm⁻¹ was measured for each experiment. The spectral resolution was set to 8 cm⁻¹. The measurements were made in reflection mode, as this is closest to the scanning method of the MIR scanner. Each measurement point was accumulated twice to reduce background noise. The scanning speed was set to 2.2 cm/s. Before measurement, the detector of the FT-IR imager was cooled with liquid nitrogen.
Pre-processing and k-means. The measurement results were pre-processed to prepare the data from the MIR scanner and the FT-IR imager for the k-means algorithm. The images of the MIR scanner were aligned to each other so that the position of each pixel was the same in all images. A threshold adjustment was applied to amplify particularly weak signals. Subsequently, the measurement data were denoised. For segmentation and clustering of the individual tissue structures, k-means clustering was used. The first two k-values could not be used for the evaluation because they were related to the background and the associated segmentation between slide and tissue. Therefore, k-values related to the background were rejected.
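The clustering step described above can be sketched as follows. This is a minimal NumPy illustration of k-means on an aligned stack of single-wavenumber images, not the implementation used in this work (which relied on MATLAB). The function names, the deterministic quantile initialisation and the synthetic data are assumptions made for the example.

```python
import numpy as np


def kmeans(features: np.ndarray, k: int, n_iter: int = 100):
    """Plain k-means (Lloyd's algorithm) on an (n_pixels, n_channels) array."""
    n = len(features)
    # Assumption: deterministic start, centres taken at evenly spaced quantiles
    # of the mean pixel intensity instead of a random initialisation.
    order = np.argsort(features.mean(axis=1))
    centers = features[order[((np.arange(k) + 0.5) / k * n).astype(int)]].astype(float)
    for _ in range(n_iter):
        labels = _assign(features, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return _assign(features, centers), centers


def _assign(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Index of the nearest centre (Euclidean distance) for every pixel."""
    dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def segment(stack: np.ndarray, k: int) -> np.ndarray:
    """Cluster an aligned (height, width, n_wavenumbers) stack into k segments."""
    h, w, c = stack.shape
    labels, _ = kmeans(stack.reshape(-1, c), k)
    return labels.reshape(h, w)


# Synthetic example: bare slide plus two tissue types, four wavenumber channels
rng = np.random.default_rng(0)
stack = np.zeros((30, 30, 4))
stack[10:20] = 1.0   # hypothetical tissue structure A
stack[20:30] = 1.4   # hypothetical tissue structure B
stack += rng.normal(scale=0.01, size=stack.shape)

labels_k2 = segment(stack, 2)
labels_k3 = segment(stack, 3)
```

In this synthetic example, k = 2 only separates the slide from the tissue, whereas k = 3 also separates the two tissue structures. This is consistent with one reading of why the lowest k-values were not used for the evaluation.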
The spectral data sets of the FT-IR imager were denoised and subsequently normalized; smoothing and normalization of the spectra were used to clean the measurement data. Subsequently, the spectral ranges to be considered in the k-means clustering were defined. These ranges were set to the functional group and fingerprint regions.
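The FT-IR pre-processing chain described above (smoothing, normalization, selection of spectral ranges) can be sketched as follows. This is only an illustration under stated assumptions: the moving-average filter and the vector normalization stand in for the unspecified denoising and normalization methods, and the range limits in the example are hypothetical placeholders, not the values used in this work.

```python
import numpy as np


def preprocess_spectra(spectra, wavenumbers, ranges, window=5):
    """Smooth, normalize and crop spectra given as an (n_pixels, n_points) array."""
    # Assumption: moving-average smoothing (edge values are attenuated by
    # mode="same"); the filter actually used is not specified in the text.
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 1, spectra
    )
    # Assumption: vector (L2) normalization of every spectrum
    norms = np.linalg.norm(smoothed, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty spectra
    normalized = smoothed / norms
    # Keep only the spectral ranges passed on to the clustering
    mask = np.zeros(wavenumbers.shape, dtype=bool)
    for low, high in ranges:
        mask |= (wavenumbers >= low) & (wavenumbers <= high)
    return normalized[:, mask], wavenumbers[mask]


# Synthetic example on the measured range of 750 to 4000 cm^-1 in 8 cm^-1 steps
wavenumbers = np.arange(750.0, 4001.0, 8.0)
rng = np.random.default_rng(1)
spectra = rng.random((6, wavenumbers.size)) + 0.1

# Hypothetical range limits in cm^-1, chosen for illustration only
example_ranges = [(900.0, 1800.0), (2800.0, 3000.0)]
features, kept = preprocess_spectra(spectra, wavenumbers, example_ranges)
```

The resulting `features` array has one row per pixel and can be passed to a k-means routine in the same way as the MIR scanner images.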
Pre-processing of the MIR scanning data was performed using Python (https://www.python.org/). MATLAB (The MathWorks Inc., USA) was used for FT-IR data pre-processing, general k-means clustering and segmentation 6 .
Sample preparation and H&E staining. The tissue samples were cut into 10 µm sections using a CM1950 cryostat (Leica Biosystems Nussloch GmbH, Germany). Before and after cutting the tissue sample, the organ was stored at −80 °C in a freezer. The frozen tissue sections were thaw-mounted on a Mirr-IR slide (Kevley Technologies, Ohio). Samples used for H&E staining were thaw-mounted on a SuperFrost Plus slide (Thermo Fisher Scientific Inc., USA). Finally, the samples for MIR and FT-IR measurements were dried in a desiccator at low pressure for 10 min.
For H&E staining, hematoxylin was removed after ~2 min, followed by washing in tap water for ~3 min, one dip in distilled water and ~1 min in acidic alcohol. Then, the sample was dipped in distilled water three times and placed in blueing solution for ~2 min. The blueing solution was removed by three dips in distilled water. Eosin (0.5% aqueous) was removed after ~2 min by ~1 min in distilled water. The samples were then washed in 80% and 90% ethanol for ~2 min each and twice in 100% ethanol for ~1 min each. Finally, the samples were cleaned in xylene for ~2 min. After H&E staining, the samples were mounted with Eukitt (Sigma Aldrich GmbH, Germany) and protected with a cover glass.
Animal specimens. Animal studies conducted at the Universitätsklinikum Mannheim (UMM) were supervised by institutional animal protection officials in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. The animal experiments were approved by governmental authorities (Regierungspräsidium Karlsruhe, Germany, approval number: G172/15 for mouse strain BALB/c-abcb4 −/− and I-20/08 for mouse strain C57BL/6). The BALB/c-abcb4 −/− mouse strain 58 , used to investigate primary infiltrating liver tumour tissue, carries a homozygous deletion of the gene encoding the drug transporter ABCB4. This strain develops spontaneous hepatic fibrosis homologous to the phenotype of sclerosing cholangitis. As a result, hepatocellular carcinoma (HCC), a primary malignancy originating in the liver, often develops spontaneously by the 9th to 12th month of life. The liver was harvested from a 548-day-old male mouse of the described strain. A part of the left lobe of the liver was immediately snap-frozen in liquid nitrogen.