Introduction

Optical properties can be used as indicators of pathological and physiological conditions of biological tissue1,2. Accurate quantitation of these properties from experimental measurements depend on analytical models that need to account for the structural complexity of the tissue system. Therefore, it is important to consider the optical heterogeneity of biological tissues, which are usually approximated as optically homogeneous to facilitate data analysis3,4. Experimentally, light propagation measurements are made by illuminating the tissue surface with either a continuous, frequency modulated, or pulsed light source and collecting measurements of the scattered light after it has propagated through the tissue medium5,6. Measured optical signals are translated into absorption and scattering properties of the medium by utilizing an appropriate forward model of light transport that best represents the measured data7,8.

Light propagation in random media such as biological tissues is theoretically modeled using the Radiative Transfer Equation (RTE)9,10,11. Due to the highly scattering nature of these media, the RTE can be reduced to the diffusion equation, which gives analytical solutions in homogeneous, semi-infinite, or infinite slab geometries12,13. The RTE can also be solved by the Monte Carlo method which remains the gold-standard approach to calculate light transport in media with complex geometries14 but is computationally expensive15. Although parallel implementations have significantly decreased the computational time of forward Monte Carlo simulations down to several seconds16,17,18,19, they still broadly remain non-viable as inverse solvers to obtain optical properties from experimental measurements in real-time (< 1 s) which require thousands of forward simulations at different modelling parameters16,18.

Theoretical approaches that account for structural complexity in tissues provide improved reconstruction of optical properties using diffuse optical measurements when studying brain hemodynamics20. Although Monte Carlo methods can simulate light propagation in realistic head geometries derived from magnetic resonance imaging (MRI) data16, modeling the head as layered homogeneous slabs, each with their own set of optical properties, provided similar accuracy in reconstruction of optical properties3. Further, diffuse optical measurements are applicable to various parts of the body that exhibit a layered structure (e.g. skin over top muscle, scalp and skull surrounding brain tissue). Therefore, an accurate, versatile and efficient analytical approach to model spatially and/or temporally resolved diffuse reflectance in layered media would enhance optical property reconstructions from diffuse optical measurements obtained in such layered media in vivo3.

Several methods to solve the diffusion equation for layered media have been reported in literature by using integral transforms21,22,23,24, method of images25, eigenfunctions26,27, or finite differences28. These methods do not give closed-form expressions directly in the spatial or time-domains for the photon fluence. Instead, the fluence in real-space is computed using numerical transforms24 or root-finding techniques26 which tend to increase numerical errors and computational costs29. For example, the integral transform approach24 solves the diffusion equation in the spatial-frequency domain which then must be inverse space-transformed (e.g. 2-D inverse Fourier) for real-space calculations. An additional inverse time Fourier transform is required for computation in the time-domain24. Both of these transforms make calculations of the steady-state and time-domain solutions difficult to compute for a wide range of optical and geometrical inputs21,24. Other approaches have been developed to compute geometries with large layer thicknesses and high scattering coefficients and/or spatial frequencies but rely on approximations21,30,31,32. Given these challenges, the fastest reported computational times for time-domain fluence in multi-layered media range between 0.5–5 s, depending on the number of layers and numerical accuracy required23,26,30,33. Such computational performance would preclude direct use of such layered analytical solutions for real-time analysis as optimization of multiple parameters in layered media would take several minutes34.

In this report, we present an accurate and efficient procedure for computing the photon fluence in a layered cylinder using solutions to the diffusion equation23. Our code is open-source and well documented for ease of use. We overcome the computational difficulties noted above by modifying the solutions23 in the spatial-frequency domain for numerical stability, which allows for computation of arbitrarily sized inputs without approximations. Lastly, we use an inverse Laplace transform for better convergence in the time-domain which improved the numerical accuracy while decreasing the computational cost by several orders of magnitude35,36. Below we describe: (a) implementation of the numerical solutions in the steady-state and time-domain for diffuse optical reflectance and transmittance measurements in N-layered media, (b) verification of the numerical accuracy and stability of the approach in calculating photon fluence for several source-detector configurations and tissue models, and (c) validation by direct comparisons to Monte Carlo simulations of fluence in multi-layered tissue models.

Methods

Theory

In highly scattering turbid media such as biological tissue, the steady-state diffusion equation37

$$\begin{aligned} D \nabla ^2 \Phi ({\vec {r}}) - \mu _a \Phi ({\vec {r}}) = -S({\vec {r}}) \end{aligned}$$
(1)

can be used to approximate light propagation in random turbid media where \(\Phi \), \(D = 1/(3\mu _s')\), \(\mu _a\), and \(\mu _s'\) denote the fluence rate, the diffusion coefficient, the absorption coefficient, and the reduced scattering coefficient, respectively37. The fluence can be used to calculate the diffusely reflected or transmitted intensity which are the quantities usually experimentally measured13. We assume an incident beam can be approximated by an isotropic point source located at a distance of \(z_0 = 1/\mu _s'\) from the incident surface. Using an integral transform, Eq. (1) can be solved in the spatial-frequency domain under extrapolated boundary conditions considering stacked layers of distinct absorption and scattering properties of arbitrary thickness as shown in Fig. 123,37. For the special case of a point source incident onto the center top of the cylinder, the fluence in the \(k\)th layer can be expressed in real space after applying an inverse finite Hankel transform23,37 by

$$\begin{aligned} \Phi _k(\rho )=\frac{1}{\pi a'^2} \sum _{n=1}^{\infty } G_k(s_n, z)J_0(s_n \rho )J_{1}^{-2}(a' s_n) \end{aligned}$$
(2)

where \(J_m\) is the Bessel function of first kind and order m and \(s_n\) is determined from the roots of \(J_m\) such that \(J_m(a' s_n)=0, \quad n=1,2,...,\) where n is the \(n\)th root of \(J_m\). z is the detector depth within the medium in cylindrical coordinates such that \(\Phi _1(z=0)\) is used for reflection calculations and \(\Phi _N(z=L)\) where L is the sum of all the layer thicknesses is used for transmission. The extrapolated boundary is determined with \(a' = a + z_b\) where a is the radius of the cylinder and \(z_b = 2AD\) where A is proportional to the fraction of photons that are internally reflected at the boundary12. \(G_k\) represents the Green’s function in the \(k\)th layer which must be solved separately for each layer. Derivations of Eq. (2) are given in Appendix A of the Supplementary material along with the explicit expressions of \(G_k\) in the topmost and bottommost layers. Applying inverse transforms directly to \(G_k\) as previously given23,37 can lead to numerical overflow for large layer thicknesses, scattering coefficients, and frequencies. To overcome these issues, we give expressions for \(G_k\) in terms of exponentially decaying terms that will not cause numerical overflow for large arguments and do not require any approximations.

Figure 1
figure 1

Schematic of a cylindrical turbid medium with an arbitrary number of layers and thickness where the source term is located on the center of the cylinder top (\(\rho _0\)).

Time-domain solutions are often computed using the inverse Fourier transform by using the substitution \(\mu _a \rightarrow \mu _a + i \omega /c\) and computing the real and imaginary parts of the fluence in the frequency domain at many frequencies (400–4000)23,35. As these Fourier integrals are slow to converge and can rapidly oscillate, the number of nodes needed for Gaussian integration methods are highly dependent on input parameters which makes the Fourier integral difficult to accurately and efficiently compute35. Instead, an inverse Laplace transform with \(i\omega \rightarrow {\bar{s}}\) can be used and numerically integrated along a complex contour35,36. Utilizing a contour that begins and ends in the left hand plane Re \(z \rightarrow -\infty \) forces a rapid decay of the integrand making for easier and faster numerical integration by using trapezoidal rules36. The corresponding solution for the time-domain fluence is then

$$\begin{aligned} \Phi _k(\rho , t) = \frac{1}{\pi a'^2} \sum _{n=1}^{\infty } \frac{1}{2\pi i} \left[ \int _{B} e^{{\bar{s}}t}G_k(s_n, z, {\bar{s}}) \,d{\bar{s}}\right] \times J_0(s_n \rho ) \; J_{1}^{-2}(a' s_n) \end{aligned}$$
(3)

where B denotes the Bromwich path with \({\bar{s}}\) being a complex number along the contour. A single contour is considered for rapid evaluation of the fluence at many time points for \(t \in (t_1, t_2)\)35,36. The full derivation of the time-domain solution shown in Eq. (3) along with details of the numerical Laplace transform are provided in Appendix B of the supplementary material.

Numerical algorithm and verification

To calculate the steady-state fluence in real space, a finite inverse Hankel transform (Eq. (2)) must be numerically computed. While calculation of solutions in the time-domain requires Eq. (2) to be evaluated at N complex valued absorption terms during the numerical inversion of the Laplace transform in Eq. (3). The numerical accuracy and efficiency of the procedure depend on the convergence and difficulty of computing the two sums. Since both the computation of the steady-state and time-domain fluence depends on Eq. (2), the accuracy and computational speed depend primarily on how many terms n of the infinite sum are retained in Eq. (2). To allow for computation over an arbitrary number of terms we have expanded the hyperbolic functions given previously23 in terms of exponential functions which also reduces the computational time by simplifying the expressions. Additionally, we have precomputed the roots of \(J_0\) and developed a custom procedure for calculation of \(J_0(x)\) which reduces the computational time substantially.

As exact, closed-form solutions for the photon fluence in layered media are not available, we validate our solutions in layered media to closed-form homogeneous solutions for semi-infinite media13. In these validations, each layer in our tissue-model was set to have the same optical properties as the homogeneous medium along with large lateral boundaries. This allows us to precisely quantify numerical errors and determine convergence of our solutions in terms of the number of terms retained in the sum in Eq. (2). We first compare Eq. (2) evaluated with 2 and 8 layers of similar optical properties to the semi-infinite solution13 and to Monte Carlo simulations in a semi-infinite medium. Next, we compare the accuracy of Eq. (2) as a function of the number of terms n considered in the summation for different input parameters \(\mu _{s1}'\), \(\mu _{s2}'\), \(\mu _{a1}\), \(\mu _{a2}\), z, and a. The accuracy is compared to the computation when using \(n=50{,}000\) terms in quadruple precision.

Solutions in the time-domain require computing both the infinite sum in Eq. (2) as well as numerically performing the inverse Laplace transform in Eq. (3). Strategies to invert the Laplace transform could allow for significantly faster convergence compared to the Fourier transform35,38, however the obtention of the inverse Laplace transform is not always an easy or even possible task to perform38. Therefore, the accuracy of the numerical approach to invert the Laplace transform must be rigorously tested. We focus on two attributes in performing the numerical integration in Eq. (3) that affect both the convergence and numerical accuracy: (a) the number of Laplace space evaluations N used to evaluate the Laplace integral in Eq. (3) and (b) the contour width determined from \(\Lambda = t_2/t_1\) where \(t \in (t_1, t_2)\). The effect of both of these parameters on the convergence and numerical accuracy are again examined by comparison to closed-form homogeneous analytical solutions. We show the reconstruction of the time-domain signal for high scattering and high absorbing media at short and long distances and times where the numerical reconstruction has been difficult to perform32,35. We have included in Appendix A extended discussion on how to efficiently compute Eq. (2), which also directly affect the computation of Eq. (3), and the advantages compared to other routines21. Additionally, we give approximations for reflectance simulations (\(z=0\)) that are accurate for double precision arithmetic (see Supplementary Fig. S3) which can decrease the computational time by 2–3 orders of magnitude.

All the numerical routines and figures presented here were developed using the Julia programming language (v1.7.0)39. Numerical simulations were performed on a MacBook Pro with an Apple M1 chip (MacOS version 11.1) and 16 GB of memory. Simulations in the steady-state utilized a single core while the inverse Laplace transform in the time-domain used multi-threaded parallelism. Here, the Laplace space evaluations were evenly distributed across the 4 cores and 8 threads of the M1 chip. All benchmarks are done in double precision arithmetic using v0.8.0 of LightPropagation.jl.

Validation with Monte Carlo

To validate the derived analytical solutions, the fluence is compared with results obtained from Monte Carlo simulations. The Monte Carlo method simulates the propagation of photons through the scattering medium using appropriate probability functions and random number generation15,16. In the limit of an infinitely large number of photons used in the simulations, the Monte Carlo method is an exact solution of the RTE15. We utilized an independent open-source Monte Carlo code provided by the Virtual Photonics Technology Initiative40 to validate the layered diffusion theory model. The optical properties used were taken from literature using three biologically relevant tissue models with an isotropic emitting source at a depth of \(z_0 = 1/\mu _{s1}'\) and a Henyey-Greenstein phase function. The anisotropic factor was assumed to be g = 0.8 for all layers. The Monte Carlo simulations used \(5\times 10^7\) photons for each simulation which visually reduced the effect of stochastic noise for all bin widths in the spatial and time domain. For all comparisons the refractive index of the medium is assumed to be \(n_r = 1.4\) where the external medium is assumed to be air \(n_r = 1.0\). The fluence as a function of t and/or \(\rho \) and z was recorded in discrete bin widths of \(\Delta t = 0.02\) ns, \(\Delta \rho = 0.99\) mm, and \(\Delta z = 0.27\) mm.

Results

Numerical accuracy of the layered diffusion equation

In Fig. 2, the fluence on the top boundary (\(z=0\)) in a semi-infinite medium with optical coefficients \(\mu _a=0.1\) \(\mathrm {cm}^{-1}\), \(\mu _s'=10\) \(\mathrm {cm}^{-1}\), \(g=0.8\), \(n_r=1.4\) is simulated using Monte Carlo methods in the steady-state and time-domain. We compare the results to solutions of the diffusion equation in a semi-infinite medium13 and to Eqs. (2) and (3) when solved for a 2 and 8 layered medium with the same optical coefficients in each layer. Here, we used a cylinder radius of \(a=20\) cm and a total cylinder length L of 10 cm (i.e., the thickness of each layer in the 2 and 8 layered model was 5 and 1.25 cm, respectively) to approximate a semi-infinite medium. As previously shown13, diffusion theory exhibits excellent agreement with relative errors (\(|1 - \Phi _{DT}/\Phi _{MC}|)\) \(<0.05\) compared to Monte Carlo simulations given enough scattering events. Equation (2) also shows excellent agreement to the closed-form semi-infinite solution13 in both the steady-state and time-domain giving similar relative errors to the Monte Carlo results.

Figure 2
figure 2

Equations (2) and (3) computed for both 2 and 8 layers agree with Monte Carlo simulations within relative errors of 0.05 which matches the errors achieved with the semi-infinite (SI) solution13. We show the (a) steady-state and (b) time-resolved fluence calculated with Monte Carlo simulations and diffusion theory for a semi-infinite medium with optical coefficients \(\mu _a=0.1\) \(\mathrm {cm}^{-1}\), \(\mu _s'=10\) \(\mathrm {cm}^{-1}\), \(g=0.8\), \(n_r=1.4\) and \(z=0\) cm. We considered the same optical properties in each layer and laterally infinite geometries in (2) and (3) to approximate semi-infinite media. The relative error between the diffusion theory results and Monte Carlo are shown below.

In contrast to the semi-infinite solution13, the numerical accuracy of Eq. (2) is affected by the termination of an infinite sum after n terms. For example, given a large amount of terms (\(n \approx 1500\)), the layered simulations shown in Fig. 2 can approximate the closed-form semi-infinite solution close to the limits of the numerical precision (detailed below). In practice, the sum should be terminated once a desired precision is reached. For example, to achieve similar relative tolerances to the Monte Carlo results in Fig. 2a, the steady-state fluence used \(n=500\) for \(\rho <2\) cm, \(n=1000\) for \(\rho <7\) cm, and \(n=1500\) for \(\rho <10\) cm whereas in Fig. 2b we use just \(n=50\) for both the 2 and 8 layer simulations in the time-domain for \(\rho =1.5\) cm. In general, to simulate lower fluence values a larger number of roots in Eq. (2) will be required to achieve similar relative errors. Consequently, the number of terms n required in Eq. (2) will be dependent to varying degrees on the input optical properties and cylinder dimensions considered.

Figure 3
figure 3

The rate of convergence of the infinite sum in Eq. (2) depends mostly on input parameters \(\mu _{s1}'\), z (detector depth), and a (cylindrical radius) while showing little dependence on \(\mu _{a1}\), \(\mu _{a2}\), and \(\mu _{s2}'\). We show the absolute error between Eq. (2) when calculated in quadruple precision using \(n=50{,}000\) and when calculated in double precision as a function of the number of terms n used in (2) for different values of (a) \(\mu _{s1}'\), (b) \(\mu _{a1}\), (c) \(\mu _{s2}'\), (d) \(\mu _{a2}\), (e) z, and (f) a. We fix the other properties to \(\mu _s' =10\) \(\mathrm {cm}^{-1}\), \(\mu _a=0.1\) \(\mathrm {cm}^{-1}\), \(l=(1.0, \; 20.0)\) cm, \(a = 10\) cm, and \(\rho = 1\) cm.

In Fig. 3, we investigated the convergence properties of Eq. (2) as a function of the number of terms n used in the summation. We considered an example 2-layer medium with baseline optical properties of \(\mu _{s1}'=\mu _{s2}'=10 \; \mathrm {cm}^{-1}\), \(\mu _{a1}=\mu _{a2}=0.1 \; \mathrm {cm}^{-1}\), \(\rho =1.0\), \(l=(0.1, 20)\) cm, \(z=0\) cm, and \(a=8\) cm. The fluence was calculated as a function of summation terms \(n \in (50,\, 3500)\) for varying ranges of 6 input parameters (\(\mu _{s1}',\, \mu _{s2}',\, \mu _{a1},\, \mu _{a2},\, z,\, a\)) while keeping all other variables constant. The absolute difference between this calculation which was done in double precision and a calculation done in quadruple precision with \(n=50{,}000\) is shown in Fig. 3. The convergence of Eq. (2) is highly dependent on the scattering coefficient in the first layer \(\mu _{s1}'\) as seen in Fig. 3a. Increasing \(\mu _{s1}'\) severely diminishes the convergence of Eq. (2) when \(z=0\) (see Supplementary material in Appendix A for extended discussion). On the other hand, for the range of values shown here, \(\mu _{a1},\, \mu _{a2}\), and \(\mu _{s2}'\) had a negligible effect on the convergence. There is a close relationship between \(\mu _{s1}'\) and z as shown in Fig. 3a,e and their effect on the convergence of Eq. (2). When \(z\approx z_0\) with \(z_0=1/\mu _{s1}'\), Eq. (2) requires a high number of terms to converge. This is also the primary reason why increasing the scattering coefficient also requires significantly more terms when \(z=0\) as \(z_0 \approx z\). Additionally, increasing a results in slower convergence due to smaller values of \(s_n\) during the sum. The routine can be made accurate down to absolute errors of the machine precision used in the calculation. For example, Fig. 3 was calculated using double precision arithmetic with machine precision \(\epsilon \approx 10^{-16}\). The loss of precision in calculating \(J_0(s_n \rho )\) in Eq. (2) is the primary limitation of the routine. For typical \(\mu _{s1}'\) found in biological tissue (\(\mu _{s1}'<50 \text{ cm}^{-1}\)), \(n<1000\) is usually sufficient. For example, only 50 terms were used in Fig. 2b resulting in similar relative errors compared to Monte Carlo simulations when using 5000 terms. However, the numerical procedure should check for convergence during the summation of Eq. (2) so that n can be dynamically determined during the routine.

The accuracy of the time-domain solution given in Eq. (3) is affected by both the termination of the sum in Eq. (2) as previously discussed and the numerical inversion of the Laplace integral in Eq. (3). We focus on two main attributes for the convergence of the inverse Laplace transform: (a) the hyperbola contour size (proportional to \(\Lambda =t_2/t_1\)) and (b) the number of Laplace space evaluations N used to evaluate the Laplace integral in Eq. (3) by comparing the time-resolved fluence simulated with Eq. (3) to the semi-infinite solution13. The fluence is simulated at \(\rho = 1\) cm on the top boundary (\(z = 0\)) using a 4-layer model with layer thicknesses of \(l_k = (0.5, 1.5, 3.0, 5.0)\) cm, \(\mu _s' = 10\) \(\mathrm {cm}^{-1}\) and \(\mu _a = 0.1\) \(\mathrm {cm}^{-1}\) with a radius of 15 cm to approximate a semi-infinite geometry. Typically, the time-domain signal is required at many values in some range \(t \in (t_1, t_2)\) where it becomes significantly more efficient to use a single contour for all time points35.

Figure 4
figure 4

(a) The (top) absolute and (bottom) relative errors for the time-domain reconstruction at a single time value \(t=1\) ns in \((t_1, \Lambda t_1)\) between the time-domain solution in Eq. (3) and the semi-infinite solution as a function of the number of Laplace space evaluations N. Larger contour sizes (\(\propto \Lambda \)) require higher values of N to reach similar accuracies. (b) Reconstruction of the time-domain signal at 600 time points in \(t\in (0.03, 6.0)\) corresponding to \(\Lambda =200\) considering four different values of N. The semi-infinite solution is shown as black circles with the resulting absolute error between the semi-infinite and layered solution shown in the bottom plot. The absolute error is dependent on N, which is similar for all time values considered in \(t \in (t_1, t_2)\).

In Fig. 4a, we show the absolute (top) and relative (bottom) errors between Eq. (3) and the semi-infinite solution13 at a single instant of time \(t = 1.0\) ns as a function of N, for four different values of \(\Lambda \). Variable values of \(\Lambda \) are achieved by using different \(t_1\) values of 1.0, 0.1, 0.01, and 0.001 ns such that \(\Lambda t_1 = 5\) ns is fixed and \(t=1\) ns is within the bounds of \((t_1, t_2)\). The absolute errors were similar for any t value within \(t\in (t_1, t_2)\) while the relative error was dependent on the value of t (i.e. larger relative errors are observed at long times when the fluence is lowest). Less than 20 Laplace evaluations were needed to give absolute errors \(<10^{-8}\) even for large values of \(\Lambda \). We can also see that the sum exponentially converges allowing it to be accurately computed with the midpoint rule41. The main limitation is that the function must be evaluated at very small and large values along the contour which leads to floating point errors limiting the procedure to absolute errors approaching the machine precision. Additionally, the evaluation points along the contour depend on N which can not be reused for different values of N. Therefore, for the best computational performance N must be determined before the computation.

Figure 5
figure 5

Equation (3) can be computed to absolute errors up to the machine precision compared to homogeneous closed form models at high scattering over a wide range of times and distances away from the source. (a) Time-resolved fluence from a 4-layered highly scattering media with optical properties \(\mu _s'\) = 80.0 \(\mathrm {cm}^{-1}\) and \(\mu _a\) = 0.1 \(\mathrm {cm}^{-1}\) at \(\rho \) = 0.2 and \(\rho \) = 3.5 cm. Computation was performed using double precision arithmetic. (b) Time-resolved fluence at the top boundary of a 4-layered media with optical properties \(\mu _s'\) = 10.0 \(\mathrm {cm}^{-1}\) and \(\mu _a\) = 0.6 \(\mathrm {cm}^{-1}\) at \(\rho \) = 3 and \(\rho \) = 6 cm. Computation was performed using octuple precision arithmetic. The semi-infinite solution is shown as markers with the absolute error between the two solutions shown below.

In Fig. 4b, we considered a single contour \(\Lambda =200\) to reconstruct 600 time points in \(t\in (0.03, 6.0)\) and we show these for four different values of N. A larger N improved the overall accuracy and was relatively independent of the time point \(t\in (0.03, 6.0)\) for a given N. For larger contours \(\Lambda =t_2/t_1\) a higher number of N are needed to reconstruct the time-domain signal over the whole time window \(t\in (t_1, t_2)\) for a given absolute error. For example, Fig. 4b shows the same \(\Lambda \) but reconstructs the time-domain signal for different values of N. However, smaller values of N are not able to reconstruct accurately over the entire time window due to the lower fluence values \(\Phi (\rho , t)\) at later times. A given N reconstructs the time-domain signal over the entire window at a relatively fixed absolute error. Therefore, larger relative errors will be observed at later times when the fluence is lowest.

We note that calculations in Fig. 4b are only shown up to the point where the time-domain signal is not accurately reconstructed. Therefore, it is recommended to use a \(t_1\) as late as possible and choose N based on the dynamic range of time-domain signal required. For example, 12 Laplace evaluations were typically required to reconstruct the time-domain signal with a dynamic range of 3 orders of magnitude where 24 evaluations can provide roughly 6 orders of magnitude which represent typical dynamic ranges of time-domain systems8,42. Increasing the number of evaluations does not decrease absolute errors once the errors reach the machine precision. Coincidentally, we have found that the numerical inversion of the Laplace transform is also limited by absolute errors approaching the machine precision, similar to the numerical computation in the spatial domain.

The previous examples have focused on modest values of \(\mu _{s}',\;\mu _{a},\;\rho \), and layer thicknesses. In Fig. 5, we reconstruct the time-domain signal for high scattering media and large layer thicknesses over a wide range of times which has previously been previously difficult due to numerical overflow29,30,31. In Fig. 5a, the time-domain signal on the top boundary (\(z=0\)) for a high scattering medium \(\mu _s' = 80\) \(\mathrm {cm}^{-1}\) and \(\mu _a = 0.1\) \(\mathrm {cm}^{-1}\) at \(\rho \) = 0.2 and \(\rho \) = 3.5 cm is shown. We considered a 4-layered medium with the same optical properties in each layer with layer thicknesses \(l_k = (0.5, 1.5, 3.5, 30.0)\) cm and a cylinder radius of 15 cm for comparison to a semi-infinite model13. We used \(n=1000\) roots in Eq. (2) with \(N=24\) Laplace evaluations at \(\rho \) = 3.5 cm and \(N=72\) evaluations at \(\rho \) = 0.2 cm. Although the fluence at \(\rho \) = 0.2 cm is significantly larger, we considered \(t \in [0.004, 6.0]\) resulting in a \(\Lambda =1500\) whereas at \(\rho \) = 3.5 we considered \(t \in [0.8, 6.0]\) giving \(\Lambda =7.5\). This again highlights that the number of Laplace space evaluations is highly dependent on \(\Lambda \). Even considering a very large layer thickness \(l_4 = 30\) cm and large reduced scattering coefficient \(\mu _s' = 80 \text{ cm}^{-1}\), the time-resolved fluence can be easily simulated in double precision arithmetic, very close to the source, and at both early and late times.

In Fig. 5b we show calculation of the time-domain fluence at the surface (\(z=0\)) for very low fluence values from a high absorption medium (\(\mu _s' = 10.0\) \(\mathrm {cm}^{-1}\) and \(\mu _a = 0.6\) cm\(^{-1}\)) at two detector locations of \(\rho \) = 3.0 cm and 6.0 cm and for \(t \in [0.1, 6.0]\). These calculations were performed in octuple precision using \(N=168\) Laplace evaluations and only \(n=600\) roots in Eq. (2) which reconstructed the time-domain signal over 50 orders of magnitude in dynamic range with high numerical accuracy. The absolute error is relatively constant for \(t \in (t_1, t_2)\) which leads to a higher relative error at lower fluence values. An increase in amount of Laplace evaluations is needed for very low fluence values and can also be observed by extrapolating the asymptote of convergence in Fig. 4 to very low absolute errors. Lastly, the roots of \(J_0\) must be calculated in higher precision to achieve the shown absolute errors.

Figure 6
figure 6

Comparison of the (left column) steady-state and (right column) time-domain fluence using diffusion theory (lines) simulated using Eqs. (2) and (3) and the Monte Carlo method (symbols) for the tissue geometries representing a (top row) 2-layer, (middle row) 3-layer muscle, and (bottom row) 5-layer brain tissue models. The relative error between the Monte Carlo results and diffusion model are shown in the plots below. The diffusion approximation displayed relative errors less than 0.1 over a large domain of arguments suggesting it could be used in a variety of diverse tissue geometries.

Comparison to Monte Carlo simulations

Next, we compared the solutions obtained from Eqs. (2) and (3) to Monte Carlo simulations for both the steady-state and time-domain. We consider three different tissue geometries that are of high clinical interest and have been extensively used to model light propagation in different organ systems previously: a 2-layer model24, a 3-layer model representing a skin/fat/muscle layer32, and a 5-layer brain model43 representing a scalp, skull, cerebrospinal fluid (CSF), and a gray and white matter layer.

The optical properties considered in the 2-layer model are \(\mu _{a1} = 0.2\) \(\mathrm {cm}^{-1}\), \(\mu _{a2} = 0.1\) \(\mathrm {cm}^{-1}\), \(\mu _{s1}' = 13\) \(\mathrm {cm}^{-1}\), and \(\mu _{s2}' = 12\) \(\mathrm {cm}^{-1}\) with layer thicknesses of \(l_1 = 6\) mm and \(l_2 = 90\) mm. We report the optical properties of the 3-layer muscle and 5-layer brain model in Table 1. For all tissue models, we consider the index of refraction for each layer to be \(n_r=1.4\) with the external index of refraction being air (\(n_r=1\)). The anisotropy \(g=0.8\) was consistent for all layers in the Monte Carlo simulations. In all cases we compare the fluence on the top boundary (\(z=0\)) as a function of \(\rho \) for the steady-state calculations and as a function of t in the time-domain for \(\rho =0.45, 1.45,\) and 3.05 cm.

Table 1 Optical properties and layer thicknesses for the 3 layer skin/fat/muscle32 and 5 layer brain tissue used in the Monte Carlo simulations43.

In Fig. 6, we compare the steady-state and time-domain fluence when simulated using the Monte Carlo method and diffusion theory. The left column shows the steady-state fluence \(\Phi (\rho )\) for \(\rho \in (0.15, 10)\) cm simulated using Eq. (2). Excellent agreement (relative errors \(<0.1\)) is observed for all three tissue models, however the agreement is not uniform. The 2-layer model showed the best agreement for all values of \(\rho \) where the results asymptotically agreed with the Monte Carlo method. The 3-layer muscle model showed good agreement for \(\rho <5\) cm, but did not asymptotically agree. These results are consistent with recent reports45 that showed a breakdown in diffusion theory when the mean free path approaches the thickness of the top layer. Here, a top layer thickness of 1.2 mm was used. Although the significance of these errors were not studied on the reconstruction of optical properties, the relative errors between Monte Carlo solutions are less than 0.1 for \(\rho <6\) cm. Diffuse optical measurements are not usually collected at such large distances due to low signal to noise. A similar effect is observed in the brain model Fig. 6e where agreement (relative error \(<0.1\)) is observed for \(\rho <6\) cm, however longer distances show higher errors. These errors can be mostly attributed to the limitations of diffusion theory to accurately model the low scattering CSF layer44. Our analytical solutions utilized \(\mu _s' = 3.5\) \(\mathrm {cm}^{-1}\) to most accurately model the low scattering CSF layer with diffusion theory as previously suggested44, though the choice of \(\mu _s'\) significantly affects the resulting fluence calculated with diffusion theory for \(\rho >6\) cm. If \(\mu _s'\) is less than 4 \(\mathrm {cm}^{-1}\), a severe overestimation of the fluence is seen. Practically, for \(\rho >6\) cm it may become unrealistic to consider the CSF and other brain layers as parallel planes. We note that all models had similar disagreements for \(\rho < 0.5\) cm which is a known limitation of diffusion theory13.

In the right column of Fig. 6, we show the time-domain fluence for \(\rho = 0.45, \; 1.45, \; 3.05\) cm simulated using Eq. (3) and compare to Monte Carlo results. The 2-layer model is well approximated by diffusion theory in the time-domain for each value of \(\rho \) given enough scattering events illustrated by the uniform agreement across a wide range of time values. As in the spatial domain, the time domain results for the 3-layered model do not asymptotically converge to Monte Carlo simulations due to the small top layer thickness. The agreement is not uniform for each value of \(\rho \) as shorter distances are better approximated until much later arrival times. This is in contrast to the 5-layer model where the errors are relatively flat at all times and distances. This could be attributed to only presenting results in the time-domain for \(\rho <3.05\) cm, whereas the effect of the low scattering CSF layer is more significant for \(\rho >6\) cm. We note that the precise choice of \(\mu _s'\) in the CSF layer for the distances and times shown do not significantly affect the time-domain simulations compared to the steady-state results. Additionally, decreasing the discretization of \(t, \; \rho \) and z and simulating for more photons in the Monte Carlo method will reduce the noise, however, smaller discretization will not improve agreement between diffusion and Monte Carlo where they do not asymptotically agree.

Computational time

In Table 2, we show the amount of time in microseconds to compute the steady-state fluence in the top layer, \(\Phi _1(\rho , z = 0)\), for a given number of terms n considered in the sum in Eq. (2) for 2, 4, 8 and 16 layers. Different values of optical properties do not significantly affect the computation time (when n is fixed), which is instead dependent on the number of roots n used in the sum. Though, increasing n does not linearly increase the computational time as shown in Table 2 because it is faster to compute \(J_0(s_n \rho )\) at large arguments with asymptotic expansions. This affects the total run time because the calculation of \(J_0(s_n \rho )\) accounts for nearly 40 \(\%\) of the run time where the computation of \(G_1(s_n, z)\) takes most of the remaining time. For realistic applications where less than 1000 roots are needed, the fluence can be calculated in less than \(\approx \) 100 μs for up to 8 layers.

Table 2 Number of microseconds to compute steady-state fluence where n is number of roots in Eq. (2).
Table 3 Number of microseconds to compute the time-domain fluence where N is the number of Laplace space evaluations.

The time-domain routine must compute the steady-state routine for N (\(\approx 12{-}24\)) complex absorption values. This procedure lends itself well to parallelism as each computation is independent and can be used at each time point needed. Therefore, performance is limited by the runtime listed in Table 2, however using a complex absorption term increases the runtime by 2.5×. In Table 3, we show the runtime in microseconds for the time-domain fluence as a function of the Laplace space evaluations N for 2 and 4 layered media considering \(n=600\) roots in Eq. (2) and 1024 time points. We note that the time values do not have to be linearly spaced as when using the Fast Fourier Transform.

If a dynamic range of 3 orders of magnitude is needed (\(N \approx 12\)), the fluence can be simulated in less than 300 μs. The performance in the time-domain is highly dependent on the CPU used and its multi-core performance. When using an Intel CPU 8700k with 6 cores and 12 threads the runtimes can be decreased by 30% compared to using a MacBook Pro M1 as shown in Tables 2 and 3. The number of Laplace evaluations used should be in multiples of the available number of threads. We note that the load times for multi-threaded applications represent a significant portion of the total runtime. For a low number of Laplace evaluations (< 16) these computational times can be reached within 2× using a single core. The advantage of these procedures is that rapid simulation can be performed on a personal laptop while allowing for time-domain runtimes to be significantly reduced with higher end CPUs.

Discussion and conclusion

Limitations of homogeneous tissue-models to describe light transport in layered biological media have been discussed previously4,46. Although analytical models that incorporate heterogeneous optical properties are becoming more frequent30,47,48, their use, particularly in inverse calculations, is limited by their numerical accuracy and efficiency29. Therefore, homogeneous models are typically used given their simplicity and efficiency in solving inverse problems that require 100–1000 evaluations of the forward model to reach convergence. Neural networks can also be used to quickly (\(\approx \) 50 ms) estimate tissue optical properties in inverse problems49. However, they require long training times and are roughly 1000× slower than the presented analytical model for forward calculations49,50. Several solutions for photon diffusion in layered media have been reported, but present technical difficulties for numerical computation. We have investigated a previously developed model23,37 that has received wide interest30,47,48,51. However, the model relies on numerical inverse transforms for obtaining the photon fluence for both steady-state and time-domain simulations which limits the numerical accuracy and speed. For example, the computation of the steady-state fluence requires the inversion of a Bessel-type 1-D inverse transform (Eq. 2) over the \(k\)th root of the zeroth order Bessel function \(J_0\). The discrete version has several advantages compared to using Gaussian integration21,24 as the roots can be precomputed to improve the overall speed of the routine and can be implemented with strict convergence criteria for accurate computation over a wide range of input arguments using a variable number of roots. This is important because the convergence of Eq. (2) is highly dependent on the model inputs (Fig. 3), where different optical properties and tissue geometries require a different number of roots to be used. However, these expressions require numerical integration over hyperbolic functions that can numerically overflow for large input arguments or at large roots of \(J_0\).

In this work, we provide numerically stable expressions for the Green’s functions in terms of exponentially decaying functions, which facilitates accurate computation for large input arguments (e.g., scattering, layer thickness, spatial frequency) over any root of \(J_0\) without approximations or loss of generality that are usually required to numerically compute Eq. (2)30,31. As shown in Fig. 3, the accuracy and speed of computed solutions is determined by the number n of roots used in the sum in Eq. (2) which is most dependent on \(\mu _{s1}'\), a, and z. In practice, the values used for the cylindrical radius of the tissue-model a (Fig. 3f) should be kept as small as possible to increase convergence but should be large enough to accurately represent lateral boundary conditions. Although the total number of n largely dictates the speed and accuracy of the routine, the algorithm is limited to simulating Eq. (2) with absolute errors up to the machine precision in the calculation. This is largely due to the finite precision used in the calculation of \(J_0(s_n \rho )\) in Eq. (2) which is limited to absolute tolerances approaching the machine precision. For higher precision calculations, it is important to calculate the roots of \(J_0\) in the desired precision to simulate fluence values down to the machine precision. For experimental measurements with background noise, fluence values below the epsilon value of double precision (\(\epsilon \approx 10^{-16}\)) are rarely needed. Additionally, computing \(J_0\) accounts for the majority of the routine’s runtime especially when the fluence is required at multiple spatial locations, as is the case in many tomography52 or functional imaging53 applications. New numerical routines for the computation of \(J_0\) were developed that decrease computational time by at least 3×54 compared to using standard routines55. An advantage of the routine presented is that computing the fluence at 10 arbitrarily specified spatial locations takes only 3× longer than the times reported in Table 2. Although it can be difficult to directly compare computational times of different routines, as they depend highly on the computational resources and effort put into them, we were able to simulate the steady-state fluence 500–1000× faster than previously reported21,51. We note that these times are achieved on a personal laptop using a single core.

Computation of the time-domain signal requires an additional inverse time transform which is usually performed with the Fourier Transform30. Here, we have used the inverse Laplace transform35,36 for faster and more accurate reconstructions of the time-domain signal. We have found that 12–24 terms in the Laplace integral are needed in Eq. (3) to reconstruct the time-domain signal with dynamic ranges of 3–6 orders of magnitude, which is the range of current experimental systems8,42. Due to the decreased number of evaluations needed in the inverse time transform, the computational times for time-domain simulations are 1000–10,000× faster than what is usually reported depending on the number of layers considered and accuracy required23,26,33. Most of the performance gain can be attributed to utilizing the faster converging Laplace transform instead of the Fourier transform35 while other improvements come from other numerical optimizations for the steady-state calculation and threaded parallelism as further discussed in Appendix A in the Supplementary material. The Laplace transform can also evaluate the time-domain fluence up to absolute errors approaching the machine precision as shown in Fig. 4. The number of terms needed in the Laplace transform for adequate convergence will depend highly on the contour size \(\Lambda =t_2/t_1\) which is recommended to be kept as small as possible for faster reconstructions.

A primary limitation of the layered solutions presented here is that a large amount (500–5000×) of terms are required in the computation of Eq. (2) when \(z=0\) which is required for reflectance calculations. As seen in Fig. 3, increasing the top layer scattering coefficient will significantly increase the number of terms required in Eq. (2), while the convergence is mostly independent of deeper layer optical properties. This can be explained by the slow convergence of the particular solution of the Green’s function when \(z \approx 1/\mu _{s1}'\). When z is farther away from the source depth \(z_0\) as seen in Fig. 3e, only a few terms are needed. However, if we approximate that \(z \approx z_0\), it becomes possible to sum the particular solution of the Green’s function exactly which improves convergence significantly. We present detailed derivations of this approximation in Appendix A in the Supplementary material and show that such an approximation when \(\mu _{s1}' > 2\) \(\mathrm {cm}^{-1}\) and \(z=0\) can simulate the fluence with relative errors down to \(10^{-14}\) (Supplementary Fig. S3), which is as accurate as the exact forms in double precision arithmetic due to floating point errors. Therefore, it is highly recommended to use such an approximation in double precision arithmetic which can decrease computational times by 2–3 orders of magnitude depending on the input \(\mu _{s1}'\), allowing for computation of the steady-state fluence in less than a microsecond (Supplementary Fig. S4). This approximate form also allows for very accurate simulation for large scattering coefficients where it is difficult for the exact expressions to converge due to the slow exponential decay of the sum.

Finally, in addition to testing the numerical accuracy and efficiency, we have tested the physical approximation of diffusion theory compared to the Monte Carlo method by using three previously reported layered tissue models that approximate several different organ systems24,32,43. We note that solutions to the RTE in layered media have been presented which are more accurate than the diffusion approximation but, like Monte Carlo methods, come at increased computational cost45. Additionally, the situations presented in Fig. 6 represent the simplest forms of heterogeneous media consisting of layered slabs which may be a rather crude approximation of complex biological media. Though, the use of such a simple approximation has been shown to provide similar accuracy to more realistic tissue geometries in a brain model using atlas based meshes3. The primary disadvantages of diffusion theory are the inability to correctly predict photon fluence for short time scales and source-detector separations and the requirement that \(\mu _s'>> \mu _a\)10,11. A recent report also indicated that the diffusion approximation could increase inaccuracies far away from the source in layered models where layer thicknesses are small compared to the mean scattering length45. We found that for a top layer thickness of 1.2 mm, these predictions were in agreement with the results reported in Fig. 6, but we also find that such errors were only significant at distances of \(\rho >5\) cm for steady-state calculations. These errors were not apparent in the time-domain for \(\rho = 0.45, 1.45, 3.05\) cm when \(t < 4\) ns (Fig. 6). We also find that our solutions from diffusion theory agree well with Monte Carlo simulations for a 5-layer model of the brain even when considering a thin CSF layer of low scattering (Fig. 6).

In conclusion, we have developed and verified an open-source, easy-to-use numerical algorithm to accurately and efficiently compute solutions of the diffusion equation in layered media. The absolute errors of the routine can be made arbitrarily accurate and can simulate both the steady-state and time-domain fluence 3 to 4 orders of magnitude faster than previously reported. Therefore, the routine could be used in inverse procedures to recover optical properties of measured data in real-time (1–10 Hz). It can also be employed for rapid generation of the intensity profile in layered media at multiple spatial locations and varying optical properties, as required in tomography and functional imaging applications. These solutions are also easily amendable to solve the correlation diffusion equation in layered media. An additional advantage of the routine is that the computational time marginally increases with the addition of a new layer, as a 4-layered medium can be computed within 10% of the time to compute 2-layers. This could allow for more accurate simulations in highly layered media such as the brain at little cost to total run times. Additionally, we showed good agreement between diffusion theory and Monte Carlo simulations in three separate tissue geometries of clinical interest.