Efficient computation of the steady-state and time-domain solutions of the photon diffusion equation in layered turbid media

Accurate and efficient forward models of photon migration in heterogeneous geometries are important for many applications of light in medicine because many biological tissues exhibit a layered structure of independent optical properties and thickness. However, closed form analytical solutions are not readily available for layered tissue-models, and often are modeled using computationally expensive numerical techniques or theoretical approximations that limit accuracy and real-time analysis. Here, we develop an open-source accurate, efficient, and stable numerical routine to solve the diffusion equation in the steady-state and time-domain for a layered cylinder tissue model with an arbitrary number of layers and specified thickness and optical coefficients. We show that the steady-state (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$< 0.1$$\end{document}<0.1 ms) and time-domain (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$< 0.5$$\end{document}<0.5 ms) fluence (for an 8-layer medium) can be calculated with absolute numerical errors approaching machine precision. The numerical implementation increased computation speed by 3 to 4 orders of magnitude compared to previously reported theoretical solutions in layered media. We verify our solutions asymptotically to homogeneous tissue geometries using closed form analytical solutions to assess convergence and numerical accuracy. Approximate solutions to compute the reflected intensity are presented which can decrease the computation time by an additional 2–3 orders of magnitude. We also compare our solutions for 2, 3, and 5 layered media to gold-standard Monte Carlo simulations in layered tissue models of high interest in biomedical optics (e.g. skin/fat/muscle and brain). The presented routine could enable more robust real-time data analysis tools in heterogeneous tissues that are important in many clinical applications such as functional brain imaging and diffuse optical spectroscopy.

We use the integral transform approach 1 to solve the diffusion equation for a N-layered cylindrical model as shown in Fig. S1. A collimated source-beam is approximated by an isotropic point source located at a distance of z 0 = 1/µ ′ s1 from the location of incidence of the beam and boundary with µ ′ s1 representing the reduced scattering coefficient in the first layer. The steady-state diffusion equation can be given by where Φ, D = 1/(3µ ′ s ), and µ a denote the fluence rate, the diffusion coefficient, and the absorption coefficient, respectively 1 . The source function S(⃗ r) can be expressed as a Dirac delta function in cylindrical coordinates. Eq. A.1 can then be rewritten in cylindrical coordinates as with the abbreviation Φ = Φ(ρ, φ , z). The derivative with respect to φ in Eq. A.2 can be eliminated using a cosine transform and then reduced to an ordinary differential equation by using the finite Hankel transform of m th order 2 . These two transforms can be expressed together by 1 Supplementary Figure 1. Schematic of the N−layered turbid medium with a source located onto the center of the cylinder top.
using an extrapolated boundary condition at ρ = a ′ for the upper limit of the integral transform with a ′ = a + z b1 where a is the radius of the cylinder. The extrapolation length can be calculated with z bk = 2AD k where A is proportional to the fraction of photons that are internally reflected at the boundary 3 . The subscript k signifies the k th layer of the cylinder with distinct absorption µ ak and scattering µ ′ sk in each layer. Applying Eq. A.3 to Eq. A.2 yields an ordinary differential equation for Φ = Φ(s n , φ , m, z) after applying a finite Hankel transform relation 2 . J m is the Bessel function of first kind and order m and each s n is determined from the roots of J m such that J m (a ′ s n ) = 0, n = 1, 2, ...,.
A Green's function approach is used solve Eq. A.4 for the fluence in a specific layer assuming that the isotropic source (from an incident pencil beam) is located within the first layer (0 ≤ z 0 < l 1 ). We then seek separate Green's functions in each layer where the solution for the first layer G 1 (s n , z) is composed of a homogenous and particular part while the solutions for the remaining k layers have only a homogeneous solution. Therefore, the Green's function G 1 (s n , z) in the top layer becomes when z is within the first layer 0 ≤ z < l 1 . For the N th layer, G N (s n , z) becomes when z is located in the bottom layer ∑ N−1 k=1 l k ≤ z < ∑ N k=1 l k . These equations can be used to reduce Eq. A.4 to We use the boundary conditions previously described 1 but provide solutions for G 1 (s n , z) and G N (s n , z) purely in terms of exponentially decaying functions. This form of expression provides stability in numerical calculations and is in contrast to previously derived expressions 1 which contain hyperbolic trignometric functions that can easily produce overflow-errors. We also note that the solutions we derive are exact and do not require any approximations for calculation, as is required by previous reports [4][5][6] . Expanding hyperbolic functions as exponentials also allows for simplifying several other terms that serve to reduce the computational time.
The Green's function G 1 in the first layer (0 ≤ z < l 1 ) is given by In general, the quantities β 3 and γ 3 are obtained by downward recurrence relations with start values with the downward recurrence given by We note that we seek just the terms β 3 and γ 3 which must be determined recursively if the total number of layers N is larger than 3. In that case, Eq. A.9 is used to generate starting values in the recurrence relation, then Eq. A.10 is used recursively until β 3 and γ 3 are obtained. If N = 2, β 3 = 1 − e −2α 2 (l 2 +z b2 ) and γ 3 = 1 + e −2α 2 (l 2 +z b2 ) . For N = 3, only Eq. (A.9) is needed to calculate β 3 and γ 3 .
The solution to Eq. A.11 for G N is The quantity ξ 2 is computed with start values ξ N = α N (l N + z b2 ) and with downward recurrence relations ξ k−1 = ξ k + α k−1 l k−1 such that for N = 2, ξ 2 = α 2 (l 2 + z b2 ) and for N = 3, ξ 2 = α 2 l 2 + α 3 (l 3 + z b2 ) and so on. We note that our routine has unrolled these relations completely for N = 4 to improve performance.
For solutions in real-space, Φ(ρ, z) we apply the inverse relation of Eq. A.3 to Eq. A.7 such that the fluence in real space can be written as 1 for the special case of a point source that is incident at the center top of the cylinder.

Computing the real-space fluence
We note a few points that help to efficiently compute Eq. A.12. First, we determine s n such that J 0 (a ′ s n ) = 0, n = 1, 2, ...,.
Calculation of these roots at runtime significantly impacts performance. Instead n roots of J 0 (x) are precomputed and the relation s n = j 0,n /a ′ where j 0,n is the n th root of J 0 (x) is calculated at runtime. It is then also possible to remove the runtime computation of J −2 1 (a ′ s n ) because a ′ s n = j 0,n such that J −2 1 ( j 0,n ) is constant and therefore can be precomputed. For results

3/8
shown in this manuscript, we computed j 0,n for n = 1...1, 000, 0000 in octuple precision which are loaded along with J −2 1 ( j 0,n ) to be used in every simulation. We have observed that simply computing the Bessel functions account for over 50% of code execution and removing a function call to J 1 (x) substantially reduced the total computation time. For the calculation of J 0 (x), we developed an optimized routine 7 that increased computation speed by 3x relative to the common Amos' Fortran routines 8 . This optimization is crucial for fast calculation of the fluence at multiple spatial locations. Loss of numerical precision in calculation of J 0 (s n ρ) limits the accuracy of solutions typically to errors on the order of the machine precision. Though, we note that our routine can be run in arbitrary precision at the cost of increased computational time. Lastly, it may also be advantageous if computing the fluence at fixed spatial locations to precompute J 0 (ρ j 0,n /a ′ ) for several values of ρ by approximating that a ′ ≈ a if the cylinder radius is large. This would eliminate any calls within the loop to special function libraries which would significantly improve performance while allowing for calculation at varying optical properties and layer thicknesses as required in inverse problems.  Figure 2. We show the convergence of three separate terms in Eq. A.8 by showing the value of the n th term in the sequence when summing over j 0,n . Each term converges at a much different rate with the overall convergence being highly dependent on µ ′ s1 when z = 0.
A limitation of computing Φ 1 (ρ, z) using G 1 (s n , z) in Eq. A.8 is the slow convergence when z ≈ z 0 . This is of particular importance for reflectance measurements when we must compute Φ 1 (ρ, z) when z = 0. Unfortunately, when z = 0 and µ ′ s1 is large, the routine is limited by slow convergence requiring thousands of terms in Eq. A.12. The convergence of Eq. A.12 for Φ 1 (ρ, z) can be analyzed by separating Eq. A.8 into three terms where G 1 (s n , z) = G 1 (s n , z) + G 1 (s n , z) + G 1 (s n , z). In Fig. S2, we show the value of the three terms a n as we iterate over the n th root of j 0,n . Here, we show a specific example when µ ′ s1 = 20 cm −1 , µ ′ s2 = 15 cm −1 , µ a1 = 0.1 cm −1 , µ a2 = 0.2

4/8
cm −1 , a = 10 cm, l 1 = 0.5 cm, and l 2 = 5 cm. We can see that G 1 (s n , z) is rapidly decaying and just a few terms are needed for accurate computation. It should also be noted that this term is the only term affected by the optical properties of deeper layers k ≥ 2 and therefore the dominate convergence of Eq. A.8 is only affected by the optical properties of the first layer. However, because the three terms decay at such a different rate, for the highest numerical accuracy it is recommended to sum these terms separately and combine them once the infinite sum is terminated. As we can see, the convergence is dominated by the slow decay of G (1) 1 (s n , z) when z = 0. This convergence becomes slower as µ ′ s1 becomes larger. However, when z ≈ z 0 we can sum the particular solution G (1) 1 (s n , z) over n exactly in closed form yielding the infinite space Green's function caused by an isotropic point source. This allows us to write the fluence Φ 1 (z ≈ z 0 ) as where r = ρ 2 + (z − z 0 ) 2 and κ 1 = µ a1 /D 1 . The first term in Eq. A.16 is the infinite space Green's function. The homogenous part Φ We note that this solution is only approximate if we want to consider the fluence on the boundary z = 0, but to calculate the fluence inside the first layer when z = z 0 it represents the exact solution. This significantly reduces the number of terms needed in Eq. A.12, however G 1 (s n , z) now limits the overall convergence rate which also decays at a slow rate when z = 0 and for increasing µ ′ s1 . However, a similar approximation can be made which allows for the exact summation of G 1 (s n , z) in closed form which represents the steady-state solution for a semi-infinite medium. Therefore, it is appropriate to rewrite the fluence as where Φ SI (ρ, z) is the steady-state fluence in a semi-infinite medium (Equation 3 given in Kienle and Patterson 9 ). The optical properties of Φ SI (ρ, z) are that of the first layer (µ ′ s1 , µ a1 ). We test this approximation in Fig. S3 as a function of µ ′ s1 and ρ by summing G 1 (s n , z) for n = 50, 000 in octuple precision using Eq. A.12 and comparing the absolute and relative errors to the closed form semi-infinite Green's function. Each of these forms only contain terms that represent the top layer optical properties (µ ′ s1 , µ a1 ). In Fig. S4, we compare the approximation to the exact solution for two tissue models: (a) µ ′ s1 = 10 cm −1 , µ ′ s2 = 13 cm −1 , µ a1 = 0.1 cm −1 , and µ a2 = 0.2 cm −1 and (b) µ ′ s1 = 60 cm −1 , µ ′ s2 = 40 cm −1 , µ a1 = 0.01 cm −1 , and µ a2 = 0.08 cm −1 . We use the same layer thicknesses, l 1 = 1 and l 2 = 5 cm with a cylinder radius of 20 cm.
1 (s n , z) when summed over 50,000 terms using Eq. A.12. This approximation also gives absolute errors below the machine precision in double precision calculations when µ ′ s1 > 2 cm −1 .
Theoretically, this solution becomes more accurate when z = 0 as µ ′ s1 → ∞. We find this approximation to be accurate for relative errors greater than 10 −14 for µ ′ s1 > 2 cm −1 which is within the relative errors provided by the exact forms when double precision arithmetic is used. Even for very low µ ′ s1 = 0.1 cm −1 , this approximation can give at least 3 digits of accuracy. The 1 (s n , z) exactly using Eq. A.12 (lines) for two different tissue geometries. The absolute error is shown in the below plot displaying that these solutions give absolute errors below the machine precision in double precision calculations. The solutions were computed in octuple precision for accurate comparison.
advantage of this approach is that just 150 roots were used in the approximate form when µ ′ s = (10, 13) cm −1 where 3,500 roots were needed in the exact form. When µ ′ s = (60, 40) cm −1 the approximation only needed 200 roots at all values of ρ compared to 20,000 roots required in the exact form. However, if only a couple digits of accuracy are needed the approximate form can give reasonable convergence in less than 50 roots even for very large scattering coefficients. This results in computational times being 2-3 orders of magnitude faster allowing for computation in less than one microsecond for a wide range of optical properties. Therefore, it is highly recommended to use such an approximation for µ ′ s1 > 2 cm −1 .

Appendix B: Layered solutions in Time-Domain
Given a sinusoidally modulated source at frequency f , the real and imaginary parts of the fluence Φ(ρ, ω) can be calculated using the same formula for Φ(ρ) and adding a complex absorption term 10 where c is the speed of light in the medium and i = √ −1. The real and imaginary parts are used to calculate the phase angle and modulation.
For solutions in the time-domain, the real and imaginary parts of the fluence in the frequency domain must be calculated at many frequencies (400-4,000x) 10, 11 and inverse Fourier transformed into the time-domain 2 . The Fourier integral is slowly converging and the number of frequency evaluations needed is highly dependent on ρ, µ ′ s and t 11 . Alternately, an inverse Laplace transform can be applied to Eq. A.12 11,12 by making the substitution iω →s in Eq. A.18 and numerically integrating the Bromwich complex contour integral In Eq. A.19, B denotes the Bromwich path wheres is a complex number along the contour. We note we use a bars to avoid confusion with s n . The corresponding solution for the time-domain fluence is then The Bromwich line can be deformed into a Hankel contour that begins and ends in the left half-plane, such that Re z → −∞ 12 . On such a contour, the exponential in Eq. A.19 ensures that the integrand decays rapidly and renders the integral wellsuited for approximation using a simple trapezoidal rule. We utilized a hyperbola countour 12 parameterized by s(θ ) = µ + iµ sinh(θ + imϕ) where s ′ (θ ) = iµ cosh(θ + iϕ) to evaluate the time-domain solutions. For non-complex time-domain signals, application of the midpoint rule gives: wheres k =s(θ k ) for θ k = (k + 1/2)h and h being the uniform node spacing with N being the number of nodes along the hyperbola in the upper half-plane Re(s) > 0.
The parameters µ and ϕ as well as node spacing h are obtained for computing f (t) across many time values in t ∈ (t 1 ,t 2 ) by considering a single (fixed) integration path for all time values. The time-independent parameters are expressed by: where Λ = t 2 /t 1 . The uniform node spacing becomes h = A(ϕ)/N with the previously optimized 11 parameter ϕ = 1.09.