Introduction

Blood has critical functions in the body, such as delivery of oxygen to cells and transport of nutrients, immune substances, and waste products. Beyond clinical applications (e.g., blood collection, blood transfusion, and blood diagnosis), elucidating the biochemical properties of blood has long been a scientific enterprise. Like other biological samples, blood contains many kinds of biomolecules. To comprehensively evaluate the behaviors of and relationships among the constituents and integrate the information into interpretable formulas, methods for both experimental and data analysis need to be explored. This is a fundamental issue common to studies of complicated systems with varying compositions.

Spectroscopic techniques, such as visible absorption spectroscopy, near-infrared (NIR) or mid-infrared spectroscopy, Raman spectroscopy, and electron paramagnetic resonance spectroscopy, have contributed to analyses of blood composition1,2,3,4,5,6,7,8. Hemoglobin and its internal heme structure have been primary targets of spectroscopic investigations. Heme contains a porphyrin ring with an iron ion at its center. Hemoglobin has unique optical properties that change with coordination (i.e., the number and types of ligands) to the iron and the oxidation state of the iron ion9,10,11. Knowledge of spectral changes attributed to hemoglobin variants is crucial to understanding changes in biochemical behavior with blood composition. The versatility of spectroscopic techniques enables detection of various other molecules in blood simultaneously. Moreover, rapid and non-invasive spectral measurements, particularly by vibrational spectroscopy, show potential for simple analysis of blood in practical situations, such as medical diagnoses and criminal analyses.

Biochemical characterization of blood is of great significance in forensic science. Analysis of bloodstain evidence provides important clues in criminal investigations12,13. Identification of blood in evidence can be an indicator of criminality in a case. DNA profiling from bloodstains enables narrowing of potential suspects and victims. In this study, we focused on predicting the age of a bloodstain, that is, the time elapsed since its deposition, which could provide supportive evidence and enable estimation of the details of a crime. The autoxidation of hemoglobin through three states, oxyhemoglobin (oxyHb), methemoglobin (metHb), and hemichrome (HC), is a potential indicator of bloodstain aging14,15. oxyHb has a ferrous iron in heme, to which an oxygen molecule binds at the sixth coordination position, perpendicular to the heme plane. The heme of metHb is known to contain a high-spin ferric iron and a water molecule or a hydroxide ion as the sixth ligand. HC, the final oxidized state, contains bishistidine complexes of hemes with low-spin ferric irons. Spectral changes caused by heme autoxidation have been investigated and characteristic peaks have been identified1,11,16,17,18. Moreover, multivariate statistical modeling, such as partial least squares and support vector machines, has been applied to time series of Raman scattering, Fourier transform infrared absorption, and visible absorption spectra to distinguish new from old bloodstains and predict the time elapsed since deposition19,20,21,22. These studies demonstrate that spectral changes reflect transformation of the chemical constituents in bloodstains during aging, and that these changes can be mathematically correlated with the elapsed time.

This pioneering research has successfully shown the potential of spectroscopic approaches to capture and correlate the changes of chemical components in blood. However, because complicated mixed signals are observed from blood constituents, simultaneous characterization of individual blood components is quite difficult. In addition, because the prediction models are purely statistical, they cannot provide chemical interpretations of the process of bloodstain aging. Furthermore, a mathematical model built from a training dataset collected under specific experimental conditions cannot be reliably applied to conditions beyond the range of the training parameters, which makes it unsuitable for the diversity encountered at crime scenes, and data collection from bloodstains under all possible conditions is practically impossible. Therefore, a novel analytical approach is required to reveal the overall chemical dynamics of blood composition during aging. Moreover, development of a model that incorporates the underlying chemistry of bloodstain aging would enhance the applicability of bloodstain analysis in forensic investigations.

Here, we develop a spectral deconvolution model for NIR Raman spectra of bloodstains involving a kinetic description of the overall aging process. Kinetic modeling of chemical processes in bloodstain aging is feasible for prediction of the transformation of blood components under different conditions. NIR Raman spectroscopy using an excitation wavelength of 785 nm shows well-balanced spectral signals from both the weak resonances of heme variants and off-resonances of other compositions23,24. This feature is advantageous to explore comprehensive changes in blood composition during aging and heme autoxidation. In addition, the non-destructive manner of Raman spectral measurements is preferable in forensic investigations as it enables preservation of the evidence for subsequent DNA analyses25. Furthermore, blood identification analysis using NIR Raman spectroscopy26,27,28 can be conducted simultaneously with evaluation of bloodstain aging. In this study, we collect Raman spectra of bloodstains aged at three different temperatures (30, 24, and 16 °C) for up to 4 months. In addition, we develop a multivariate algorithm for optimal spectral deconvolution and incorporate chemical kinetic models. Deconvolution of the bloodstain spectra identifies significant spectral components and provides a comprehensive aging model for blood constituents. This study facilitates forensic analysis of bloodstain aging. Furthermore, our discussion integrates prior knowledge and provides a comprehensive overview of bloodstain aging.

Results

Raman spectral changes during bloodstain aging

To comprehensively investigate the chemical processes of bloodstain aging, we experimentally examined the effects of storage time and temperature. Fresh blood samples collected from three individual donors (A, B, and C) were deposited on glass slides and stored at 30, 24, or 16 °C for up to 3 or 4 months. For each time point (Supplementary Table 1), 10 or 15 spectra were recorded at different spots on the bloodstain samples using 785 nm excitation light. Measuring at multiple spots enabled averaging of the spectral quality and avoided photodegradation by long-term exposure to the excitation light. Figure 1 shows spectra of bloodstain samples stored at 24 °C for selected periods. The intensities of the fluorescence background profiles visibly increased over time. The total areas of the fluorescence profiles, evaluated by polynomial baseline fitting, increased at every temperature (Supplementary Fig. 1). However, the fluorescence background intensity cannot be used as a practical indicator of bloodstain aging because it can be affected not only by temperature and time, but also by other factors such as contaminants and ambient light. In actual cases, the bloodstain substrate may contribute to the fluorescence background21,29. Photobleaching also hinders reliable evaluation of fluorescence. Therefore, the Raman signals superimposed on the fluorescence profiles were targeted in subsequent analyses of bloodstain aging.

Fig. 1

Spectral series during bloodstain aging. The averaged spectra of bloodstain samples collected from one of the three donors are represented. The aging process at 24 °C was observed over 3 months. The wavenumber of each spectrum was calibrated and truncated to the range of 274–1709 cm−1. No offset was applied to any of the spectra

The Raman signals in the blood spectra were obtained by subtraction of the fluorescence baselines and averaged for each time point and each donor. Figure 2 shows temporal changes of the averaged Raman spectra at 30, 24, and 16 °C. The spectral profiles varied according to the storage time and temperature. Visually, the intensities of some bands, at around 1580, 1254, 970, and 525 cm−1, increased over time. By contrast, the intensities of other bands, such as those at 1225 cm−1 and 570 cm−1, decreased over time. These monotonic changes occurred more rapidly at higher temperatures. Meanwhile, at 24 and 16 °C, a band at 376 cm−1 increased at early time points and then decreased, whereas at 30 °C a slight but continual increase was observed for this band. The asynchronous behaviors of the spectral features indicate that the blood spectra arise from several spectral components or chemical constituents with different kinetic properties during bloodstain aging. The overall tendency in the Raman spectra was an increase in the total areas over time (Supplementary Fig. 2). This result suggests that the spectral profiles also contain certain dominant components whose concentrations increase with time. Therefore, to evaluate temporal changes in the relative abundance of the respective spectral components, the averaged Raman spectra, which were acquired under stable instrumental conditions, were subsequently analyzed without normalization pre-processing.

Fig. 2

Raman spectra for bloodstain aging. Bloodstain aging was monitored at 30, 24, and 16 °C for up to 3 or 4 months. The Raman spectra were preprocessed by subtraction of fluorescence baseline and averaged for each time point. The blue and magenta lines indicate the Raman bands for which increases and decreases were visually observed over time, respectively. The yellow lines highlight the Raman bands at 376 cm−1, which showed non-monotonic behavior over time. The spectral data of bloodstains collected from one of the three donors are represented

To assess the latent spectral features with different time and temperature dependencies, we performed principal component analysis (PCA) on the whole spectral dataset after baseline subtraction. PCA decomposes a multivariate dataset into linear combinations of orthogonal, or mathematically independent, components (i.e., principal components: PCs)30. The Raman spectral dataset of the bloodstains had three parameters: Raman shift (cm−1), time (day), and temperature (°C). Datasets were collected from three donors. All of the Raman spectral data were aligned in order of time, donor, and temperature, and then augmented into a large matrix (D in Supplementary Fig. 3). To precisely investigate the contribution of potentially significant spectral components without rank reduction31, the PCA for the augmented spectral dataset was conducted without mean-centering (Supplementary Fig. 4). The first PC showed a spectral profile similar to the total averaged spectra, and the corresponding scores increased with time at all temperatures. Several subsequent PCs also showed characteristic time- and temperature-dependent behaviors of the corresponding scores and significant spectral profiles. Therefore, the Raman spectra of bloodstains can be separated into several significant components that have specific kinetic behaviors in bloodstain aging.
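
As an illustration of PCA without mean-centering, the following minimal sketch computes it through a singular value decomposition of the raw data matrix. It is not the authors' implementation: the function name `uncentered_pca`, the synthetic two-component data, and all numerical values are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def uncentered_pca(D, n_components):
    """PCA without mean-centering, computed by SVD of the raw data matrix.

    D has one spectrum per row (n_spectra x n_channels).
    """
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]    # one score column per PC
    loadings = Vt[:n_components]                       # orthonormal spectral profiles
    explained = (s**2 / np.sum(s**2))[:n_components]   # fraction of total sum of squares
    return scores, loadings, explained

# Synthetic stand-in for the augmented matrix D (hypothetical, not the study's data)
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 30)                             # "time" axis
x = np.linspace(0, 1, 200)                             # "Raman shift" axis
C_true = np.column_stack([np.exp(-0.5 * t), 1 - np.exp(-0.2 * t)])
S_true = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                    np.exp(-((x - 0.7) / 0.08) ** 2)])
D = C_true @ S_true + 1e-3 * rng.standard_normal((30, 200))

scores, loadings, explained = uncentered_pca(D, n_components=3)
print(explained)  # the first two PCs should carry almost all of the signal
```

Because the mean is not removed, the first PC tends to resemble the average spectrum, which is consistent with the observation reported above.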

Blood Raman spectral resolution by self-modeling

Multivariate curve resolution-alternating least squares (MCR-ALS) is a technique for multivariate bilinear deconvolution. MCR-ALS facilitates chemical or physical interpretation of the decomposed components because it allows constraints with physicochemical meaning, such as non-negativity, unimodality, and closure, to be incorporated into the calculation for multivariate resolution32. MCR-ALS finds the optimum solution, a set of two decomposed matrices (C and S in Supplementary Fig. 3), through an iterative calculation that minimizes the residual errors. Here, the augmented matrix of the Raman spectra of bloodstains (D in Supplementary Fig. 3) was subjected to the MCR-ALS algorithm with a non-negativity constraint on all elements of the decomposed matrices. This analysis showed that the blood Raman spectra contained five significant spectral components with characteristic spectral profiles (Fig. 3) and time- and temperature-dependent behavior of the corresponding scores (Supplementary Fig. 5).
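
The alternating scheme can be sketched as follows, assuming the bilinear model D ≈ C S with S holding one spectral profile per row and using `scipy.optimize.nnls` for the non-negative least-squares steps. This is a simplified illustration rather than the software used in the study: the helper names (`nnls_rows`, `mcr_als`), the stopping rule, and the synthetic data are assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def nnls_rows(A, B):
    """Solve min ||X A - B|| subject to X >= 0, one row of B at a time.

    A: (k, n), B: (m, n)  ->  X: (m, k)
    """
    return np.vstack([nnls(A.T, b)[0] for b in B])

def mcr_als(D, S0, n_iter=100, tol=1e-10):
    """Bilinear model D ~ C @ S with non-negative C (scores) and S (profiles)."""
    S = S0.copy()
    prev = np.inf
    for _ in range(n_iter):
        C = nnls_rows(S, D)            # update scores with S fixed
        S = nnls_rows(C.T, D.T).T      # update profiles with C fixed
        resid = np.linalg.norm(D - C @ S)
        if prev - resid < tol:         # stop when the residual no longer improves
            break
        prev = resid
    return C, S, resid

# Synthetic two-component example (hypothetical data and initial guess)
rng = np.random.default_rng(1)
t = np.linspace(0, 10, 30)
x = np.linspace(0, 1, 200)
C_true = np.column_stack([np.exp(-0.5 * t), 1 - np.exp(-0.2 * t)])
S_true = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                    np.exp(-((x - 0.7) / 0.08) ** 2)])
D = C_true @ S_true
S0 = S_true + 0.01 * np.abs(rng.standard_normal(S_true.shape))  # perturbed start

C, S, resid = mcr_als(D, S0)
print(resid / np.linalg.norm(D))       # relative lack of fit
```

Each half-step is an exact constrained minimization, so the residual cannot increase from one iteration to the next. The solution is nevertheless not unique in general, which is why the choice of initial profiles and constraints matters.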

Fig. 3

Decomposed Raman spectral profiles of bloodstains. The spectral decomposition into five components was performed by multivariate curve resolution-alternating least squares. The spectral profiles are represented after normalization by the total area. The characteristic bands are labeled on each profile

The resolved spectral components were assignable to specific constituents in blood according to the distinctive peaks and scores’ behaviors. The first component profile showed good consistency with some characteristic bands for Fe2+–O2 binding in oxyHb at 1638 cm−1 (v(CαCm)asym), 1225 cm−1 (v13 or v42 of δ(CmH)), 570 cm−1 (v(Fe–O2)), and 419 cm−1 (δ(Fe–O–O))17,18,23,33. The corresponding scores of the first component showed maxima at the beginning of the measurements, and then a monotonic decrease toward zero over time at all temperatures. The fastest rate of decrease was observed at 30 °C and the slowest at 16 °C. According to these results, the first spectral component was dominated by oxyHb features in the initial state of heme autoxidation. The most noteworthy band in the second spectral component was at 376 cm−1. This is a known marker band for metHb formation (δ(CβCcCd))34,35. Bands observed at 1629 cm−1 (v(CαCm)asym), 1372 cm−1 (v4), and 1212 cm−1 (v5 + v18 or v13) are also typical of metHb11,18,34,35. In addition, the corresponding scores started from zero and increased over time. After reaching their maxima, the scores slightly decreased or stayed constant. The maximum values of the second scores were dependent on temperature, with the highest observed at 24 °C and the lowest at 30 °C. The scores’ behaviors suggest that the second component represents an intermediate state of a chemical reaction with more than two steps. Therefore, the second resolved component was assigned to metHb character in the second state of heme autoxidation. The third component of the blood spectra showed distinctive bands at 1578 cm−1 (v(CαCm)asym), 1254 cm−1 (v13 or v42), 970 cm−1 (γ(CαH=)), 745 cm−1 (v15(pyrbreathing)), and 665 cm−1 (v7(pyrdeform)asym)34. These peaks reportedly appear after long-term aging and photo/thermal denaturation of blood34,36. 
The corresponding scores of the third component increased from zero and then plateaued at 30 and 24 °C, or remained around zero at 16 °C. Accordingly, the third component was assigned as a contribution from HC, which is the third and final product of heme autoxidation. The HC state does not exist in blood in vivo or in fresh bloodstains and is formed under in vitro conditions16,37,38. HC formation has sometimes been discussed along with protein denaturation. However, the HC state also reportedly forms under non-denaturing conditions37,39. Therefore, we presumed that the third component represents the extracted feature of the HC heme formed by the autoxidation process. As described above, the first three spectral components showed salient features of the constituents of heme autoxidation (i.e., oxyHb, metHb, and HC). The relationship among the three corresponding scores also supported the assignments. The 30 °C condition showed the fastest decrease of the first scores (oxyHb) and, at the same time, the fastest increase of the third scores (HC). Consequently, the second scores (metHb) remained at a low level. At 24 °C, HC formation was relatively slow, so metHb formed from oxyHb accumulated, resulting in a high maximum value of the second scores. At 16 °C, we observed the lowest reaction rates for both metHb and HC formation. Accordingly, the second and third scores increased slowly.

The fourth component profile showed no distinctive features related to heme autoxidation, but did show some peaks for proteins (1654 cm−1 (Amide I), 1451 cm−1 (δ(CH2/CH3)), 1003 cm−1 (Phe)), and heme (1620 cm−1 (v(C=C)vinyl), 1605 cm−1 (v(C=C)vinyl), 1580 cm−1 (v37), 1562 cm−1 (v11), 1124 cm−1 (v5(Cβ−methyl)), 753 cm−1 (v15), and 676 cm−1 (v7))18,23,33,35. The corresponding scores of the fourth component started from non-zero values and then increased with time. The rates of increase were fastest at 30 °C and slowest at 16 °C. We tentatively associated the fourth component with an overall denaturing process in bloodstains, involving hemoglobin, other proteins, and some minor constituents. Protein features of hemoglobin were transferred mainly from the first three components, representing the native states, to the fourth component, representing the denatured state. Heme aggregation caused by hemoglobin denaturation, which was previously suggested, may also be reflected in the fourth component23,36. In addition, other proteins and constituents exist and can be degraded over time, although their contributions to the blood Raman spectra may be relatively small. We suppose that, because of the weak resonance effect of NIR Raman excitation on heme, the denaturation process of the surrounding proteins was identified separately from the heme autoxidation processes. The fifth component had a broad and featureless profile. However, the corresponding scores of the fifth component increased significantly with time at every temperature, and the increase was especially rapid at 30 and 24 °C. The fifth component was interpreted as a fluorescence contribution arising from residuals of the baseline subtraction caused by baseline variance and noise.

Kinetic description of bloodstain aging

The final aim in this study was to comprehensively describe the process of bloodstain aging, specifically the chemical kinetics. Using the MCR-ALS model discussed above, we decided to apply the calculated scores for the first (oxyHb) and fourth (overall denatured products) components to kinetic modeling. Behaviors of the scores of the second (metHb) and third (HC) components indicated a relationship based on a chemical reaction between them (Supplementary Fig. 5). However, we determined that the complicated mechanism involving the two components would need to be investigated further before modeling the kinetics. In addition, the fifth component as a fluorescence contribution is possibly affected by other factors, such as contaminants, external light, substrates, and photobleaching, besides temperature and time. Therefore, the scores of the fifth component were also excluded from the kinetic description. This spectral deconvolution processing is tolerant of unmodeled perturbations.

For the first spectral component, autoxidation from oxyHb to metHb has been described by a first-order reaction, as shown in Eq. (1)40,41,42:

$$\mathrm{oxyHb} \mathop{\longrightarrow}\limits^{k_{\mathrm{o}}} \mathrm{metHb} \to \mathrm{Hemichrome}$$
(1)

Then, the kinetic equation for oxyHb concentration is given by the following equation:

$$\left[\mathrm{oxyHb}\right]_t = C_1 \exp\left(-k_{\mathrm{o}} t\right) = \mathrm{Score}\,1(t)$$
(2)

Equation (2) was fitted to the scores for the first component (oxyHb) from all three donors’ data simultaneously, using a common coefficient C1 for each temperature condition (global fitting). Figure 4a shows the result of curve fitting for the first scores. The fitting errors and the estimated rate constants are summarized in Table 1 and Supplementary Table 2, respectively. Although the observed scores involved experimental dispersion, the fitted curves substantially represented the decreases over time. The decrease was clearly faster at higher temperatures than at lower temperatures. The Arrhenius equation can be used to describe the dependency of the reaction rate constant (k) on temperature (T):

$$k = A \exp\left(-\frac{E_{\mathrm{A}}}{RT}\right)$$
(3)

where A is a pre-exponential factor, EA is an activation energy, and R represents the gas constant. Taking the logarithm, Eq. (3) can be re-written as follows:

$$\log_{10} k = \log_{10} A - \left(\log_{10} e\right)\frac{E_{\mathrm{A}}}{R}\,\frac{1}{T}$$
(4)

Table 1 Error evaluations of kinetic modeling for scores corresponding to spectral components developed by spectral deconvolution techniques

To assess the relationship with temperature, the logarithm of each rate constant obtained by curve fitting was taken and fitted with Eq. (4). Figure 4b shows good linearity of the logarithms of the rate constants against the inverse of the temperature (1/T), with a coefficient of determination (R2) of 0.982 (Table 1). Moreover, the calculated activation energy, 152 ± 20 kJ mol−1, was comparable with previous reports (Supplementary Table 2)43,44. These results strongly support the conclusion that the first component resolved via MCR-ALS represents the oxyHb contribution in the blood Raman spectra, and that the corresponding scores can be modeled based on the kinetics of autoxidation into metHb.
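
The two fitting steps described above, the first-order fit of Eq. (2) at each temperature followed by the straight-line fit of Eq. (4), can be sketched as below. This is a hedged illustration on synthetic scores. The function names, the assumed activation energy of 150 kJ mol−1, the reference rate constant, and the noise level are all hypothetical and are not taken from the study. Pooling the three donors into one fit per temperature is one possible reading of the global fitting.

```python
import numpy as np
from scipy.optimize import curve_fit

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1

def first_order_decay(t, c1, k):
    """Eq. (2): Score1(t) = C1 * exp(-k_o * t)."""
    return c1 * np.exp(-k * t)

def fit_decay(t, scores):
    """Fit one (C1, k_o) pair to scores pooled over donors at one temperature."""
    popt, _ = curve_fit(first_order_decay, t, scores, p0=(scores.max(), 0.1))
    return popt

def arrhenius_fit(temps_celsius, rate_constants):
    """Straight-line fit of Eq. (4); returns E_A (J/mol) and pre-exponential A."""
    inv_T = 1.0 / (np.asarray(temps_celsius, dtype=float) + 273.15)
    slope, intercept = np.polyfit(inv_T, np.log10(rate_constants), 1)
    e_a = -slope * R_GAS / np.log10(np.e)   # slope = -(log10 e) * E_A / R
    return e_a, 10.0 ** intercept

# Synthetic scores (hypothetical E_A and rate constants, not the paper's values)
rng = np.random.default_rng(2)
EA_TRUE, K_REF, T_REF = 150e3, 0.1, 297.15   # J/mol, day^-1 at 24 degC, K
temps = [30.0, 24.0, 16.0]
t = np.tile(np.linspace(0, 60, 61), 3)        # three donors pooled, days
ks = []
for temp in temps:
    k_true = K_REF * np.exp(-EA_TRUE / R_GAS * (1 / (temp + 273.15) - 1 / T_REF))
    scores = first_order_decay(t, 1.0, k_true) + 0.01 * rng.standard_normal(t.size)
    ks.append(fit_decay(t, scores)[1])

e_a, pre_exp = arrhenius_fit(temps, ks)
print(f"recovered E_A ~ {e_a / 1e3:.0f} kJ/mol")
```

With only three temperatures, the Arrhenius line is determined by three points, so the uncertainty of the activation energy depends strongly on how well each individual rate constant is estimated.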

Fig. 4

Kinetic analysis for bloodstain compositions. The scores corresponding to the first (a, b) and the fourth (c, d) spectral components via multivariate curve resolution-alternating least squares were applied to the curve fitting for respective kinetic models. a, c Represent the observed scores (circles) for one of the three donors and the fitted kinetic equations (solid lines) for 30 °C (dark magenta and dark turquoise), 24 °C (magenta and turquoise), and 16 °C (light magenta and light turquoise). The rate constants yielded at each temperature are represented by diamonds in b, d. The dashed lines indicate the results of line fitting using the Arrhenius equation (Eq. (4))

Next, the scores of the fourth component were also explored using a chemical reaction model. The fourth scores showed monotonic and temperature-accelerated increases over time (Supplementary Fig. 5). We presumed that the fourth component exhibited mixed contributions from constituents related to the overall denaturing process in bloodstain aging. We therefore built the following first-order reaction scheme (Eq. (5)) and kinetic equation (Eq. (6)):

$$\mathrm{Native} \mathop{\longrightarrow}\limits^{k_{\mathrm{d}}} \mathrm{Denatured},$$
(5)

$$\left[\mathrm{Denatured}\right]_t = C_4 \exp\left(-k_{\mathrm{d}} t\right) + d = \mathrm{Score}\,4(t).$$
(6)

As the increases of the fourth scores started from non-zero values, we considered that small amounts of the constituents formed by denaturation are contained in the bloodstains at the beginning of deposition. The contributions of the native proteins were mainly represented in the first three components and then transferred into the fourth scores by denaturation. The developed Eq. (6) was fitted to the fourth scores from all three donors using common coefficient C4 and intercept of d for each temperature (global fitting). Figure 4c, d show the results of curve fitting and the correlation of the rate constants against temperature based on the Arrhenius equation (Eq. (4)). The fitted curves exhibited increases over time with different rates for the three temperatures. Moreover, the fitted rate constants agreed with the Arrhenius equation and the coefficient of determination (R2) was 0.999 (Table 1). The fitting errors and the estimated parameters are summarized in Table 1 and Supplementary Table 2, respectively. These results suggest that the denaturation process represented in the fourth component proceeds in parallel with the heme autoxidation, but is not completely synchronized. This tendency is consistent with previous reports that the spectral features of protein denaturation and heme show different time-courses in aging of blood samples24. Consequently, the proposed reaction scheme (Eq. (5)) and equation (Eq. (6)) can adequately describe the kinetics of the fourth spectral component as a denaturing process in bloodstains.
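
As a brief consistency check, reading Eq. (6) exactly as written (this note is an interpretation of the equation, not an additional result of the study), the limiting values and the slope of the fitted curve are:

$$\mathrm{Score}\,4(0) = C_4 + d, \qquad \lim_{t \to \infty} \mathrm{Score}\,4(t) = d, \qquad \frac{\mathrm{d}\,\mathrm{Score}\,4}{\mathrm{d}t} = -k_{\mathrm{d}}\, C_4 \exp\left(-k_{\mathrm{d}} t\right)$$

For a positive rate constant kd, the curve increases with time only if C4 is negative. Under this reading, d is the plateau value reached after long aging and C4 + d is the non-zero starting value at deposition.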

Kinetic optimization of blood Raman spectral resolution

The MCR-ALS model discussed above was obtained with only a self-modeling constraint, namely non-negativity. However, a drawback of such self-modeling methods is that the solutions (i.e., the spectral loading profiles and corresponding scores) are not necessarily unique but have rotational ambiguity within the range in which the constraints are satisfied32,45. Moreover, since the observed data deviated from ideal behavior because of noise and non-ideal chemical or spectroscopic responses, the non-negativity constraints possibly distorted the profiles of the spectral components and the scores, particularly by overestimating values around zero. These properties reduce the selectivity of the spectral profiles, increase fitting errors, and, more importantly, hinder accurate kinetic modeling of the scores’ behaviors. Indeed, the decreases of the first scores toward zero showed considerable deviations, particularly at higher temperatures. In addition, the fitting qualities of the kinetic equations (i.e., R2 for the scores) were not sufficient for precise prediction of bloodstain aging (Table 1).

To improve the model of spectral resolution for kinetic modeling of bloodstain aging, we developed a calculation algorithm that incorporates the following into the iterative process of the MCR-ALS algorithm: (i) curve fitting of the kinetic equations, and (ii) penalty functions for moderate constraint (designated as “kinetic P-ALS”). Hybrid techniques that introduce physicochemical constraints, such as concentration calibration46, chemical equilibrium47, area correlation48, and chemical kinetics49,50,51, into the iterative optimization process of self-modeling have been proposed to reduce the rotational ambiguity of the solutions. Furthermore, the penalty functions allow small deviations from the constrained values (i.e., fitted kinetic equations), as well as negative values52,53. These functions help to avoid excessive fitting to insignificant experimental deviations and to yield a more probable model with smaller errors. The algorithm of the kinetic P-ALS is outlined below (Fig. 5a).

Fig. 5

Kinetic optimization for spectral deconvolution of bloodstains. a Schematics of the kinetic P-ALS. The Raman spectral dataset depicted as the matrix D was decomposed into the score matrix C and the loading profile matrix S. The first (magenta) and fourth (turquoise) scores in the matrix C were subsequently fitted to the respective kinetic equations (fo and fd), which yielded the rate constants (ko and kd). The rate constants were updated by fitting to the Arrhenius equation. The corresponding new kinetic equations (fo,new and fd,new) were subsequently obtained. The penalty ALS was implemented by incorporating the fitted kinetic curves and the label-vectors (Ho = [1, 0, 0, 0, 0] and Hd = [0, 0, 0, 1, 0]) with the weighting factor λ into the least-squares calculation. Then, the updated matrix C was used to estimate the matrix S for the next iterative calculation. b–e The kinetic fitting to the scores of the first (b, c) and fourth (d, e) spectral components decomposed by the kinetic P-ALS. b, d Represent the fitted curves (solid lines) of the kinetic equations to the scores (circles) at 30 °C (dark magenta and dark turquoise), 24 °C (magenta and turquoise), and 16 °C (light magenta and light turquoise), respectively. The circles only show the scores obtained from one of the three donors, corresponding to Fig. 4. c, e Show the fitting of the Arrhenius equation (dashed lines) to the yielded rate constants (diamonds)

The spectral profiles (matrix S) obtained above by (normal) MCR-ALS were used as the initial input. The score matrix C was first calculated as C = D S+ without a non-negativity constraint, where S+ is the pseudo-inverse of the matrix S. Then, the scores of the first and fourth components in the matrix C were fitted using Eqs. (2) and (6), respectively. By applying the obtained rate constants (ko and kd) to the Arrhenius equation (Eq. (4)), the rate constants were updated and the corresponding kinetic equations (fnew) were obtained. Subsequently, the new kinetic equations and the label-vectors (H) for the first and fourth components, each multiplied by the weighting factor λ, were appended to the matrices D and S, respectively. Then, the moderately constrained matrix C (denoted Cnew) was produced by solving the resulting matrix equation. Finally, the matrix S was re-calculated by a non-negative linear least-squares algorithm using the matrices D and Cnew. The updated matrix S was used in the next iterative cycle.
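
The penalized score update at the core of this cycle can be sketched as an augmented least-squares problem. The sketch assumes the convention D ≈ C S with one profile per row of S, and it covers only the step that produces Cnew. The kinetic refitting and the non-negative update of S are omitted. The function name `penalized_scores`, the two-component synthetic data, and the λ values are assumptions for illustration and do not reproduce the authors' code or their setting of λ = 0.01.

```python
import numpy as np

def penalized_scores(D, S, F, H, lam):
    """Soft-constrained score update: solve C @ [S, lam*H.T] ~ [D, lam*F].

    D: (m, n) spectra, S: (k, n) profiles,
    F: (m, p) fitted kinetic curves, H: (p, k) label vectors (one row per curve).
    """
    D_aug = np.hstack([D, lam * F])
    S_aug = np.hstack([S, lam * H.T])
    # lstsq solves S_aug.T @ C.T ~ D_aug.T for all rows of C at once
    C_new = np.linalg.lstsq(S_aug.T, D_aug.T, rcond=None)[0].T
    return C_new

# Synthetic two-component example (hypothetical data)
rng = np.random.default_rng(3)
t = np.linspace(0, 10, 30)
x = np.linspace(0, 1, 200)
C_true = np.column_stack([np.exp(-0.5 * t), 1 - np.exp(-0.2 * t)])
S_true = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                    np.exp(-((x - 0.7) / 0.08) ** 2)])
D = C_true @ S_true + 1e-2 * rng.standard_normal((30, 200))

F = np.exp(-0.5 * t)[:, None]          # fitted kinetic curve for component 1
H = np.array([[1.0, 0.0]])             # label vector selecting component 1

devs = {}
for lam in (0.0, 1.0, 1e3):
    C_new = penalized_scores(D, S_true, F, H, lam)
    devs[lam] = np.abs(C_new[:, 0] - F[:, 0]).max()
    print(lam, devs[lam])              # deviation from the kinetic curve
```

With λ = 0 the update reduces to the unconstrained solution C = D S+, while a very large λ forces the selected scores onto the kinetic curve. Intermediate values give the soft constraint described above, and the appropriate magnitude of λ depends on the intensity scale of the data.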

Considering the noise level of the scores, the penalty function weighting factor, λ, was set at 0.01 for soft constraints52,53,54. The iterative calculation continued until convergence was reached (see the Methods section). Supplementary Figs. 6 and 7 show the final decomposed spectral profiles and the corresponding scores from the kinetic P-ALS algorithm. Although the spectral profiles from the kinetic P-ALS were largely similar to those from MCR-ALS, some slight changes were observed. First, the bands at 378 and 420 cm−1 disappeared in the fourth profile but increased in the first profile. In addition, the fifth profile became flatter with fewer features. These differences indicated that some spectral features were redistributed among the five components, while the critical features of each component were maintained. The corresponding scores for each component also showed patterns similar to those obtained by MCR-ALS (Supplementary Fig. 7). However, the fitting accuracies of the first and fourth scores with the kinetic equations (Eqs. (2) and (6), respectively) were drastically improved compared with those in MCR-ALS (Fig. 5b–e and Table 1). In addition, sufficient correlations of the rate constants with the Arrhenius equation were maintained. The rate constants for the respective temperatures are summarized in Supplementary Table 2. The rate constant for oxyHb autoxidation (ko) increased in the kinetic P-ALS, particularly at 30 °C. We attribute these increases to the absence of non-negativity constraints on the score matrix C, which prevented overestimation, especially of small values around zero. At the same time, by eliminating underestimation of the fourth scores, the rate constants for denaturation (kd) decreased. Meanwhile, the sigma value (i.e., standard deviation of the residual errors after matrix decomposition) obtained in the kinetic P-ALS model was comparable to that of the MCR-ALS model (Table 1). This result indicated that the kinetic equations employed in the model’s calculation process adequately explained the data’s characteristics without accumulating unmodeled residuals.

To validate the reliability of the calculation scheme developed in this study (i.e., kinetic P-ALS), we performed donor-independent external validations. From the spectral data collected from three individual donors, we selected the data from two donors (calibration donors) and used these to build the MCR-ALS model and then the kinetic P-ALS model. The spectral profiles provided by the kinetic P-ALS model (matrix S) were subsequently applied to calculate the corresponding scores (matrix C) from the dataset of the third donor (test donor). Finally, the prediction accuracies for the calculated first and fourth scores were evaluated in comparison with the modeled kinetic equations. This sequence was repeated three times, changing the donor used as the test donor each time. No significant differences were observed in the regression errors for the first and fourth scores between the calibration and test data (Supplementary Fig. 8 and Supplementary Table 3). Moreover, despite the small size of the dataset used for calibration (i.e., only two donors), the predicted rate constants (ko and kd) showed good consistency among the three external validations and with those obtained using data from all three donors (Supplementary Table 4). Therefore, these results demonstrate that the kinetic P-ALS scheme works without over-fitting and that the kinetic models are robust against individual differences among donors.

Estimation of bloodstain aging

The scores of the first and fourth components each exhibit characteristic kinetic behavior. However, the absolute values of the two modeled scores are intrinsically tied to the specific instrumental conditions, which affected the absolute intensity of the Raman spectra used for model building. The ratio Score1/Score4 therefore provides a universal index of bloodstain aging that is applicable under various forensic analytical conditions, because it removes the dependency on experimental conditions, such as the efficiencies of excitation and signal detection, and on the blood concentration of a stain sample. Figure 6 shows temporal curves of the ratio indices at 30, 24, and 16 °C with their standard deviations. The fitting errors for the ratio curves were also evaluated (Table 1 and Supplementary Fig. 9). In accordance with the kinetics of both the first and fourth scores, the ratio indices at 30 °C rapidly decreased to zero, whereas those at 16 °C decreased gradually. The prediction accuracies of the ratio and of the bloodstain age (time elapsed since deposition) depended on both the ratio itself and the temperature (see the Methods section). Because the ratio index decreases rapidly at the beginning of aging at 30 °C, the range of the predicted age is narrow (e.g., several days). However, once a bloodstain has aged for >8.6 days, it cannot be distinguished from older samples. At a lower temperature, the ratio index decreases more slowly; the bloodstain age can then be predicted over more than several months, but the predicted range widens (e.g., several tens of days). These trends in prediction accuracy are chemically reasonable and demonstrate that the developed model successfully avoided over-fitting to a specific experimental condition. Increasing the number of time points, temperature points, and donors used for model training would further improve the prediction accuracy.
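The dependence of the predicted age range on the slope of the ratio curve can be illustrated with a small sketch. The exponential curves below are hypothetical placeholders, not the fitted models of this study, and the function name `age_interval` and all numerical values are assumptions chosen only to show the behavior.

```python
import numpy as np

def age_interval(observed_ratio, ratio_sd, t_days, model_ratio):
    """Earliest and latest times whose modeled ratio lies within
    +/- ratio_sd of the observed ratio; None if no time matches."""
    mask = np.abs(model_ratio - observed_ratio) <= ratio_sd
    if not mask.any():
        return None
    return float(t_days[mask].min()), float(t_days[mask].max())

t_days = np.linspace(0.0, 100.0, 1001)

# Hypothetical monotonically decreasing ratio curves (NOT the paper's model).
fast_curve = 2.0 * np.exp(-0.5 * t_days)   # stands in for a warm condition
slow_curve = 2.0 * np.exp(-0.02 * t_days)  # stands in for a cool condition

fast = age_interval(1.0, 0.1, t_days, fast_curve)
slow = age_interval(1.0, 0.1, t_days, slow_curve)
late = age_interval(0.0, 0.1, t_days, fast_curve)
```

With these placeholder curves, the fast decay gives a narrow interval and the slow decay a wide one, while a ratio near zero (`late`) matches every time up to the end of the grid, so older stains cannot be separated.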

Fig. 6

Index for bloodstain aging. The index was defined as the ratio of the scores of the first and fourth spectral components. The observed ratios are plotted for 30 °C (magenta), 24 °C (purple), and 16 °C (blue). The circles, crosses, and triangles indicate data from the three individual donors (A, B, and C). The modeled ratios are shown as dashed lines with the standard deviation (shaded area) for each temperature.

Discussion

A salient advantage of this kinetic-based modeling approach is the ability to evaluate bloodstain aging at different temperatures. Moreover, even if the temperature is not constant, the ages of bloodstains can potentially be predicted by integrating the index equation over time, provided that the temporal changes in temperature are recorded. In addition, a relative comparison of the degree of bloodstain aging is beneficial in criminal investigations to support the relevance of bloodstain evidence and to identify details about the crime. This analysis is feasible even without accurate temperature information or when external stimuli, such as exposure to chemicals and light, may affect the aging rate. More detailed investigation under various temperature conditions would be useful to further validate the developed approach and expand its potential.
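The idea of integrating over a recorded temperature history can be sketched numerically. The example assumes simple first-order kinetics with an Arrhenius rate constant, which is a simplification and not a reproduction of Eqs. (2) and (6). The pre-exponential factor `A_HYP`, the activation energy `EA_HYP`, and the temperature logs are hypothetical placeholder values.

```python
import numpy as np

R_GAS = 8.314  # gas constant, J mol^-1 K^-1

def arrhenius_k(temp_c, A, Ea):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T)), T given in deg C."""
    temp_k = np.asarray(temp_c, dtype=float) + 273.15
    return A * np.exp(-Ea / (R_GAS * temp_k))

def integrated_rate(t_days, temp_c, A, Ea):
    """Trapezoidal estimate of the integral of k(T(t)) dt over a temperature log."""
    t = np.asarray(t_days, dtype=float)
    k = arrhenius_k(temp_c, A, Ea)
    return float(np.sum(0.5 * (k[1:] + k[:-1]) * np.diff(t)))

# Placeholder Arrhenius parameters (per-day units); NOT values from the study.
A_HYP, EA_HYP = 1.0e12, 8.0e4

t = np.linspace(0.0, 10.0, 241)  # 10 days, logged hourly
constant = integrated_rate(t, np.full_like(t, 24.0), A_HYP, EA_HYP)
cycling = integrated_rate(t, 24.0 + 6.0 * np.sin(2 * np.pi * t), A_HYP, EA_HYP)

# Under the first-order assumption, the remaining fraction is exp(-integral).
remaining_fraction = float(np.exp(-cycling))
```

Because the rate constant rises steeply with temperature, a daily cycle around 24 °C accumulates more reaction progress than a constant 24 °C, which is why a recorded temperature history matters more than its mean.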

The comprehensive investigation of Raman spectral changes during bloodstain aging performed in this study also suggests potential directions for further studies. The chemical origins and mechanisms of the fluorescence increases have not been clearly determined. Analysis of the fluorescence profiles may be useful for estimating the environment in which the bloodstain evidence was stored and possibly the time elapsed since deposition. As for the second heme autoxidation step from metHb to HC, the behavior of the scores suggests the possibility of an equilibrium relationship between the two forms. Previous studies have also suggested an equilibrium reaction between these species in solution9,38,55. The previously reported humidity dependency of the rate of this transition is another interesting aspect56. The spectral deconvolution technique described here should be helpful for exploring the mechanism of this chemical pathway. More specific assignment of the origin of the fourth component would further support the reliability of the developed model for bloodstain aging. In addition, it is possible that further chemical processes occur as bloodstains age over several months and years. Studies on the detailed mechanisms of bloodstain aging will offer deeper scientific insights and enable more precise modeling. Furthermore, bloodstains on absorbent substrates such as fabric and paper are considered more practical targets in forensic analysis. We have previously investigated a spectral processing method to remove interfering signals from the substrate in infrared spectroscopy57. This spectral processing method should also be applicable to the technique developed here based on Raman spectroscopy, which should be explored in the future.

In conclusion, we developed a comprehensive model for bloodstain aging via Raman spectroscopy using spectral deconvolution with kinetic descriptions of the heme autoxidation and denaturing processes. Discovery of significant spectral components and kinetic models was achieved by use of NIR (785 nm) excitation for well-balanced detection of heme variants and other components in a large spectral dataset collected from multiple donors at several temperatures and many time points over several months. The developed model can predict the time elapsed since deposition for bloodstains under different experimental conditions or compare the relative degree of bloodstain aging using the proposed index, which reflects the abundance ratio of spectral components in the Raman spectra.

To date, spectral decomposition techniques with physicochemical constraints, such as kinetics, have been applied to reactions involving relatively simple chemical compounds49,50,51. The novel application to a biological sample in the present study highlights the feasibility of extracting information of interest from extremely complex data and provides complementary perspectives about unmodeled constituents. This study expands the range of possible targets for spectral analysis and increases understanding by providing comprehensive descriptions of the constituents.

Methods

Blood sample preparation

Blood samples were collected from three healthy Japanese volunteers (donors A, B, and C) by pricking the ears with a lancet. Aliquots (2 µL) of whole blood were immediately deposited onto glass slides covered with aluminum foil. A total of 60–70 small blood spots were prepared for each donor at one blood collection, and the blood collection was repeated three times. The time of each blood collection was recorded. The deposited blood samples were dried and stored in incubators with controlled temperatures set at 30 and 16 °C, or in a paper box at room temperature (24 °C). A blood sample set from each donor was exposed to each temperature condition. The bloodstains were stored for up to 99 days at 30 °C, 98 days at 24 °C, and 121 days at 16 °C. The three temperature conditions were determined so that substantial spectral changes of bloodstain could be observed within 3–4 months in this study. All procedures involving human participants were carried out in accordance with the guidelines of the National Research Institute of Police Science and approved by the Institutional Ethics Committee of the National Research Institute of Police Science (Kashiwa, Japan). Written informed consent for sample collection, analysis, and publication was obtained from all participants.

Acquisition of Raman spectra of bloodstains

Raman spectra of the bloodstains were acquired using a Raman spectroscopy system developed in our laboratory, which included a spectrometer (Acton SP2500, Princeton Instruments, NJ) and a charge-coupled device camera (SPEC-10-100BR, Princeton Instruments). Excitation light (785 nm, approximately 2 mW at the sample) from a diode laser was focused by a ×50 magnification objective lens (MDPlan50, NA0.75, Olympus, Japan). The system was operated using WinSpec/32 ver. 2.6.0.0 software (Princeton Instruments). For each time point, a total of 10 (up to 1 week of storage) or 15 (from 1 week of storage to the last time point) spectra were recorded at different points on each dried bloodstain sample. Each spectrum was obtained using a 20 s accumulation within the spectral range of 90–2355 cm−1. The spectral resolution was approximately 6–7.5 cm−1. Collection of the series of spectra at each time point was started at a certain time after blood sample deposition (Supplementary Table 1). The total number of time points at which Raman spectra were measured was 42–50 for each donor and each temperature condition. A new (i.e., not previously observed) bloodstain sample was used for acquisition of Raman spectra at every time point. The performance of the Raman system, such as the spectral intensity, spectral resolution, and laser focusing, was confirmed and adjusted if necessary on a daily basis using the Raman spectrum of polystyrene as a reference.

Data analysis

All of the recorded spectra were loaded into IGOR Pro software (WaveMetrics Inc., OR) for spectral pretreatment. The raw spectra of the bloodstains were calibrated using a polystyrene standard and truncated to the range of 274–1709 cm−1, which includes the significant signals. The fluorescence baseline of each spectrum was estimated by sixth-degree polynomial fitting and then subtracted. The averaged Raman spectrum of the bloodstain was obtained for each time point and each donor. Multivariate statistical analysis of the averaged pretreated spectra was performed using R software with the RStudio environment. PCA was executed using the prcomp function in the R package “stats”. The MCR-ALS algorithm was implemented using the als function in the R package “ALS” with a non-negative constraint on all elements of the decomposed matrices. The initial spectral profiles for MCR-ALS were defined as the dominant positive or negative regions in the corresponding PCs. Convergence of the MCR-ALS calculation was assessed by evaluating the standard deviation of the residuals between the observed data matrix and the calculated data matrix58 (designated as “sigma”):

$$\mathrm{sigma} = \sqrt{\frac{\sum \left( d_{ij} - \widehat{d_{ij}} \right)^2}{N}},$$
(7)

where dij indicates an element of the observed data matrix in the i-th row and j-th column, \(\widehat {d_{ij}}\) is an element of the reconstructed data matrix from MCR-ALS, and N is the total number of elements in the data matrix (imax × jmax). Convergence was declared when the relative difference in sigma between two consecutive iterative cycles was <1 × 10−5:

$$\mathrm{Threshold}\,\mathrm{for}\,\mathrm{convergence} = \frac{\mathrm{sigma}_{\mathrm{old}} - \mathrm{sigma}_{\mathrm{new}}}{\mathrm{sigma}_{\mathrm{new}}} < 10^{-5}.$$
(8)
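The convergence monitoring of Eqs. (7) and (8) can be sketched in a few lines. The study used the als function of the R package “ALS”; the Python loop below is only a simplified stand-in in which non-negativity is imposed by clipping the least-squares solutions, not by a true non-negative least-squares solver. The function names, the synthetic data, and the `max_iter` safeguard are assumptions made for illustration.

```python
import numpy as np

def sigma(D, D_hat):
    """Eq. (7): standard deviation of residuals over all N matrix elements."""
    return float(np.sqrt(np.sum((D - D_hat) ** 2) / D.size))

def converged(sigma_old, sigma_new, tol=1e-5):
    """Eq. (8): relative change in sigma between two consecutive cycles."""
    if sigma_new == 0.0:  # perfect fit; avoid division by zero
        return True
    return (sigma_old - sigma_new) / sigma_new < tol

def simple_nonneg_als(D, S0, max_iter=500):
    """Alternate least-squares updates of C and S for D ~ C @ S.T,
    clipping negatives to zero (a simplification of constrained ALS)."""
    S = S0.copy()
    sigma_old = np.inf
    for n_iter in range(1, max_iter + 1):
        C = np.clip(D @ np.linalg.pinv(S.T), 0.0, None)
        S = np.clip(np.linalg.pinv(C) @ D, 0.0, None).T
        sigma_new = sigma(D, C @ S.T)
        if converged(sigma_old, sigma_new):
            break
        sigma_old = sigma_new
    return C, S, sigma_new, n_iter

# Synthetic two-component data with small noise (hypothetical, for checking only).
rng = np.random.default_rng(1)
x = np.arange(50)
S_true = np.column_stack([np.exp(-0.5 * ((x - c) / 5.0) ** 2) for c in (15, 35)])
C_true = rng.uniform(0.2, 1.0, size=(30, 2))
D = C_true @ S_true.T + 0.01 * rng.standard_normal((30, 50))
S0 = np.abs(S_true + 0.05 * rng.standard_normal(S_true.shape))

C, S, sig, n_iter = simple_nonneg_als(D, S0)
```

Tracking the relative change in sigma, not its absolute value, makes the stopping rule independent of the overall intensity scale of the spectra.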

Non-linear least-squares fitting of kinetic equations to the decomposed spectral scores was implemented by the Levenberg–Marquardt non-linear least-squares algorithm59 using the nlsLM function in the R package “minpack.lm”. The regression quality was evaluated using the normalized root-mean squared error (NRMSE), relative error (RE), and coefficient of determination (R2) with the following equations:

$$\mathrm{NRMSE} = \frac{1}{\bar{c}}\sqrt{\frac{\sum \left( c_i - \widehat{c_i} \right)^2}{n}},$$
(9)
$$\mathrm{RE} = \sqrt{\frac{\sum \left( c_i - \widehat{c_i} \right)^2}{\sum \widehat{c_i}^2}},$$
(10)
$$R^2 = 1 - \frac{\sum \left( c_i - \widehat{c_i} \right)^2}{\sum \left( c_i - \bar{c} \right)^2},$$
(11)

where ci and \(\widehat {c_i}\) indicate the observed and predicted i-th element, respectively; c̄ is an average of all elements; and n is the total number of the observations. Non-negative least-squares calculation in the kinetic P-ALS scheme was performed using the nnls function in the R package “nnls”. The convergence of iterative calculation in the kinetic P-ALS algorithm was also monitored using the criterion of Eq. (8). The degree of bloodstain aging was finally represented using the ratio of the first score (s1) against the fourth score (s4). The standard deviations of the ratios (σr) were determined as follows:

$$\sigma_{\mathrm{r}} = \frac{s_1}{s_4}\sqrt{\left( \frac{\sigma_{\mathrm{s}_1}}{s_1} \right)^2 + \left( \frac{\sigma_{\mathrm{s}_4}}{s_4} \right)^2 - 2\frac{\sigma_{\mathrm{s}_1,\mathrm{s}_4}}{s_1 s_4}},$$
(12)

where σs1 and σs4 represent the standard deviations of the first (s1) and fourth (s4) scores, respectively; and σs1,s4 indicates the covariance between the first (s1) and fourth (s4) scores as follows:

$$\sigma_{\mathrm{s}_1,\mathrm{s}_4} = \frac{\sum \left( s_{1,i} - \widehat{s_{1,i}} \right)\left( s_{4,i} - \widehat{s_{4,i}} \right)}{n}.$$
(13)
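The regression metrics of Eqs. (9)–(11) and the error propagation of Eqs. (12) and (13) can be written compactly as below. This is a Python sketch, not the R code used in the study. The function names and the example numbers are assumptions, and the clamp to zero inside the square root is an added numerical safeguard that does not appear in Eq. (12).

```python
import numpy as np

def nrmse(c, c_hat):
    """Eq. (9): root-mean-squared error normalized by the mean of c."""
    c, c_hat = np.asarray(c, float), np.asarray(c_hat, float)
    return float(np.sqrt(np.mean((c - c_hat) ** 2)) / np.mean(c))

def relative_error(c, c_hat):
    """Eq. (10): residual norm relative to the norm of the predictions."""
    c, c_hat = np.asarray(c, float), np.asarray(c_hat, float)
    return float(np.sqrt(np.sum((c - c_hat) ** 2) / np.sum(c_hat ** 2)))

def r_squared(c, c_hat):
    """Eq. (11): coefficient of determination."""
    c, c_hat = np.asarray(c, float), np.asarray(c_hat, float)
    return float(1.0 - np.sum((c - c_hat) ** 2) / np.sum((c - np.mean(c)) ** 2))

def residual_covariance(s1, s1_hat, s4, s4_hat):
    """Eq. (13): covariance of the residuals of the first and fourth scores."""
    r1 = np.asarray(s1, float) - np.asarray(s1_hat, float)
    r4 = np.asarray(s4, float) - np.asarray(s4_hat, float)
    return float(np.sum(r1 * r4) / r1.size)

def ratio_sd(s1, s4, sd1, sd4, cov14):
    """Eq. (12): standard deviation of the ratio s1 / s4."""
    term = (sd1 / s1) ** 2 + (sd4 / s4) ** 2 - 2.0 * cov14 / (s1 * s4)
    return float((s1 / s4) * np.sqrt(max(term, 0.0)))  # clamp guards rounding

# Hypothetical numbers, for illustration only.
obs, pred = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
fit_quality = (nrmse(obs, pred), relative_error(obs, pred), r_squared(obs, pred))
sd_uncorrelated = ratio_sd(2.0, 4.0, 0.2, 0.4, 0.0)
sd_correlated = ratio_sd(2.0, 4.0, 0.2, 0.4, 0.2 * 0.4)
```

The covariance term matters: when the residuals of the two scores vary together, their errors partly cancel in the ratio, so `sd_correlated` is much smaller than `sd_uncorrelated`.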