X-ray Fokker–Planck equation for paraxial imaging

The Fokker–Planck equation can be used in a partially-coherent imaging context to model the evolution of the intensity of a paraxial x-ray wave field with propagation. This forms a natural generalisation of the transport-of-intensity equation. The x-ray Fokker–Planck equation can simultaneously account for both propagation-based phase contrast, and the diffusive effects of sample-induced small-angle x-ray scattering, when forming an x-ray image of a thin sample. Two derivations are given for the Fokker–Planck equation associated with x-ray imaging, together with a Kramers–Moyal generalisation thereof. Both equations are underpinned by the concept of unresolved speckle due to unresolved sample micro-structure. These equations may be applied to the forward problem of modelling image formation in the presence of both coherent and diffusive energy transport. They may also be used to formulate associated inverse problems of retrieving the phase shifts due to a sample placed in an x-ray beam, together with the diffusive properties of the sample. The domain of applicability for the Fokker–Planck and Kramers–Moyal equations for paraxial imaging is at least as broad as that of the transport-of-intensity equation which they generalise, hence the technique is also expected to be useful for paraxial imaging using visible light, electrons and neutrons.

• Local absorption by the object (attenuation contrast, known since the 1895 discovery of x rays by Röntgen 30 ); • Local lensing by the object which concentrates or rarefies energy density due to local focusing or defocusing (propagation-based Laplacian-type phase contrast 31 ); • Local prism-like effects of the object which transversely shift optical energy (propagation-based differential phase contrast 31 ); • Local blurring with a position-dependent point spread function, associated with the SAXS fan emerging from each point at the exit of the object.
The first three effects are associated with the "TIE part" of the Fokker-Planck equation, with the fourth effect associated with the "diffusion equation" part. Of particular interest is the use of an x-ray form of the Fokker-Planck equation for modelling forward problems in paraxial x-ray imaging in a simple manner, e.g. by expressions valid in the near field (small propagation distance, Fresnel number 32  N 1 F ) that are of first-order accuracy with respect to the sample-to-detector propagation distance. A Kramers-Moyal 13 extension will also be of use when a more detailed treatment of the SAXS, emanating from each point over the exit surface of the object, is required. We anticipate that the resulting formalism will be of use in forward problems related to using x rays to image thin objects, as well as inverse problems such as phase/amplitude/diffusion-tensor recovery in both two and three dimensions (the latter constitutes the field of tensor tomography [33][34][35][36][37][38]. We close this introduction by outlining the remainder of the paper. The x-ray Fokker-Planck model is derived in two complementary ways. The first derivation considers the Fokker-Planck equation to be a fusing of (i) the www.nature.com/scientificreports www.nature.com/scientificreports/ transport-of-intensity equation for coherent energy transport in paraxial monochromatic x-ray beams, with (ii) the diffusion equation for diffusive paraxial energy transport associated with small-angle x-ray scattering within a sample. The second derivation is a microscopic treatment based on first principles, which relies on the crucial assumption that the phase at the exit surface of the object can be written as a sum of (i) a component that varies slowly with respect to the detector pixel size, and (ii) a component that fluctuates many times over one detector pixel and is therefore unresolved. We then give a generalised form of the Fokker-Planck equation, known as the Kramers-Moyal equation, as a means for treating the in-sample SAXS in a more precise manner. Next we sketch some indicative means by which the Fokker-Planck and Kramers-Moyal equations may be used in the forward problem of modelling near-field intensity contrast in coherent x-ray imaging, together with the associated inverse problem of phase retrieval and SAXS determination. We also consider some broader implications of the Fokker-Planck formalism.

fokker-planck model for x-ray imaging
Here we give two complementary derivations of the x-ray Fokker-Planck equation. The first is phenomenological, based on local conservation of energy in the presence of both coherent and diffusive paraxial x-ray energy transport. The second is a microscopic first-principles analysis. The former derivation is more general, while the latter is more physically illuminating.
X-ray Fokker-Planck equation: Transport-of-intensity equation with a diffusive term. X-ray beams "flow" in the sense that they stream optical energy as they propagate. As a descriptor of such energy flows, the Fokker-Planck equation may be viewed as a diffusive generalisation of coherent paraxial photon energy transport. Adopting this perspective, below we consider the transport-of-intensity equation 5 for coherent paraxial x-ray energy transport and then separately consider the diffusion equation for diffusive paraxial x-ray energy transport. We then show that these two equations may be merged, in a manner that conserves energy both locally and globally, to give the Fokker-Planck equation for paraxial x-ray imaging.
Transport-of-intensity equation for coherent paraxial energy transport. The transport-of-intensity equation 5 describes the intensity evolution of a paraxial complex coherent scalar wave field as it propagates. This continuity equation, arising from the parabolic equation of paraxial scalar wave optics, expresses local energy conservation. It states that the negative divergence -which may loosely be spoken of as the "convergence" -of the transverse Poynting vector (energy flow vector) is proportional to the longitudinal rate of change of intensity 5 : Here, I is the intensity of the monochromatic scalar wave field, φ is its phase, k is the wave-number and ∇ ⊥ is the gradient in (x, y) planes perpendicular to the optical axis z. Equation (1) quantifies the fact that energy density (and hence intensity) increases with increasing propagation distance for locally-converging waves, and decreases with increasing propagation distance for locally-diverging waves. As previously mentioned, the TIE is a continuity equation expressing local energy conservation under coherent x-ray energy transport. The conserved (Noether 39 ) current is the transverse Poynting vector, which up to a multiplicative constant is 4,8,10 The corresponding constant Noether charge N is the total integrated intensity over any plane perpendicular to the optical axis: For propagation in vacuo through a distance Δ ≥ 0 that is sufficiently small for the Fresnel number to be much greater than unity, a longitudinal finite-difference approximation to the TIE can be used to model the resulting propagation-based phase contrast 4 : The underpinning physics is sketched in an imaging context in Fig. 1(a), whereby local specular refraction of x-ray radiation, by a thin object, leads to transverse re-distribution of optical energy upon propagation through a distance Δ that is not too large. Here and henceforth, we use the term "specular refraction" in direct analogy to "specular reflection", with the former term referring to the refraction associated with a mirror-like refracting surface (i.e. smooth, lacking in the effects of roughness or spatial randomness; cf. Latin speculum). Rays passing through A have a positive divergence, hence the intensity at B is reduced compared to that which would have been registered if the rays passing through A had all been parallel to the optical axis z. Similarly, rays passing through C have a negative divergence (positive "convergence"), hence the intensity at D is increased compared to that which would have been obtained had the rays exiting C been parallel to the optical axis. This phase contrast effect is increased with increased propagation distance Δ, reduced x-ray energy (decreased wave-number k) or faster spatial variations in wave-field phase (larger φ |∇ = | ⊥ x y z ( , , 0) ). We can also have effects such as a transverse deflection, e.g. with the local maximum of intensity at D being slightly shifted from the point x = x 0 . Concentration or rarefaction of energy density (and hence intensity) may be thought of as due to the lensing term proportional to φ ∇ ⊥ I x y z x y z ( , , ) ( , , ) 2 in Eq. (4), while the transverse shifts in intensity maxima or minima may be viewed as due to the prism term φ ∇ ⋅∇ ⊥ ⊥ I x y z x y z ( , , ) ( , , ) 31 .
Diffusion equation for diffusive paraxial energy transport. If there were only diffusive energy transport on account of unresolved micro-structure in the sample (i.e. SAXS), but no specular refraction, we could instead introduce a diffusion coefficient D(x, y, z) to describe the effects at distance z = Δ downstream of an illuminated thin object in the plane z = 0. The fact that this coefficient depends on x and y reflects the fact that the SAXS fan will in general vary with position over the nominal exit surface z = 0 of the thin sample. Bearing all of this in mind, the intensity evolution of the beam may then be described by the diffusion equation: z 0 2 with the very important proviso that the above expression is always to be understood in its finite-difference form Note that a formal dependence on Δ has been added to the diffusion coefficient D, since SAXS-induced diffusive blur has a characteristic transverse width that scales with Δ rather than Δ 1/2 .
The effect shown in one transverse dimension in Fig. 1(b) can be modelled using the above finite-difference form of the diffusion equation. Here, an x-ray beamlet of width Δx illuminates a slab containing SAXS-inducing unresolved micro-structure. This width Δx is assumed to be large compared to the characteristic length scale of the projected micro-structure, but small compared to the characteristic transverse width associated with the macroscopic projected phase and attenuation coefficients of the object. The illuminated region E of the slab, located at transverse coordinate x = x 0 , results in a smooth SAXS fan of opening angle θ(x 0 ), which smears the propagated intensity over the plane z = Δ. The characteristic width of this blur, which may be considered as a broadening of the local rocking curve due to the sample 40,41 , is 0 0 0 Thus, local diffuse scattering by a thin object leads to a transverse smearing of intensity upon propagation. This blur increases with increased propagation distance Δ. At the level at which diffusion is here being considered, the opening angle θ(x 0 ) of the SAXS fan is taken to completely characterise its diffusive effects at each location x 0 ; a more sophisticated treatment which replaces the single number θ(x 0 ) (or its associated diffusion coefficient) with a hierarchy of tensors, will be given later.
Like the TIE, the diffusion equation may be viewed as a continuity equation expressing local energy conservation, albeit under diffusive x-ray energy transport. Here, the conserved current is given by Fick's first law as 13 (2) Fokker-Planck equation for combined coherent and diffusive paraxial energy transport. If both specular refraction and local-SAXS effects are present simultaneously, we can add the TIE and diffusion equations together, in an energy-preserving manner. This gives the continuity equation: Here, F(x, y) is the fraction of the optical energy converted to SAXS 43 when illuminating a thin sample in the plane z = 0 at the point (x, y). If  F x y ( , ) 1 (as is often the case for both soft 11 and hard 44 x-rays) one can replace 1 − F(x, y) in the above expression by 1. The resulting convection-diffusion equation (forward Kolmogorov equation 13 ) is the Fokker-Planck equation in two transverse dimensions. It models both effects in Fig. 1, i.e. it models the effects of both local sample-induced specular refraction and local sample-induced SAXS.
If the expressions for the coherent current ⊥ J (1) and diffusive current ⊥ J (2) are written explicitly, and the approximations made that (i)  F x y ( , ) 1 and (ii) F(x, y) is slowly varying with x and y, we obtain the 2 + 1-dimensional Fokker-Planck equation: The corresponding finite-difference forms, which are to be preferred for reasons that have already been outlined, are respectively given by the following pair of equations:  We close this section by noting that, if D may be considered to be sufficiently slowly varying with respect to transverse coordinates that D commutes with the transverse Laplacian, we may instead work with the simpler finite-difference expressions: z 0 2 In the above two finite-difference Fokker-Planck equations, FD plays the role of an effective diffusion coefficient eff that also accounts for the fraction of the incident radiation converted to SAXS.

X-ray Fokker-Planck equation: Derived from first principles.
Here we derive the x-ray Fokker-Planck equation from first principles. We consider two length scales of sample-induced wave-field variation. The larger scale corresponds to the length scales in the sample that are sufficiently large for their absorptive and refractive effects to lead to detector-resolvable intensity variations. The smaller length scale corresponds to diffuse scattering from unresolved micro-structure resulting in SAXS. This derivation clarifies the physical mechanisms underpinning both coherent and diffusive x-ray energy transport, in a paraxial x-ray imaging context. We work in one transverse dimension for simplicity, ignore polarisation and assume monochromatic x rays. Let ψ(x, z = 0) be the spatial part of the complex scalar amplitude at the exit surface of a sample. Subsequent Fresnel diffraction gives 4 where P Δ (x) is the Fresnel propagator corresponding to free-space diffraction through a distance Δ ≥ 0, and paraxial radiation is assumed throughout. The squared modulus of the above convolution integral gives the propagated intensity as the following linear combination of contributions from every pair of points x′, x″ in the unpropagated wave field 45 : The sample is assumed to be thin, normal to the incident plane-wave illumination, and with nominal exit surface z = 0. Many models exist for incorporating unresolved micro-structure into this sample 46,47 . Follow Voronovich 47 , Nesterets 45 and Yashiro et al. 48,49 is a phase component that varies slowly with respect to the detector pixel size, and (ii) φ f (x) is a fast-varying spatially random phase due to the sample micro-structure, which fluctuates many times over one detector pixel and is thus unresolved (see also Gureyev et al. 31 ). Further assume a slowly-varying intensity I s (x), together with x-rays that are of sufficiently high energy for the projection approximation to hold for a given sample. Hence (2019) 9:17537 | https://doi.org/10.1038/s41598-019-52284-5 www.nature.com/scientificreports www.nature.com/scientificreports/ Average over an ensemble of realisations of the unresolved fast phase-maps φ f (x) 45,[50][51][52] . Denoting this average by an over-line, Assuming the fast phase maps φ f to be spatially statistically stationary over transverse displacements that are small compared to the characteristic length scale over which φ s fluctuates appreciably, the over-lined expression in Eq. (22) depends only on coordinate differences x′ − x″. Assuming φ f (x − x′) and φ f (x − x″) to be Gaussian variables, with zero mean, this correlation function is 45,48,49,[51][52][53] f f 2 f is the normalised fast-phase auto-correlation function 54 as a function of coordinate separation Δx and transverse coordinate x. The quantity σ φ x ( ) f is called the "phase depth". If this is not significantly greater than unity, the right side of Eq. (23) does not decay to zero as | ′ − ″| → ∞ x x . Rather, this correlation function asymptotes to : see Fig. 2(a). The physical reason for this decay to a non-zero constant is the phase depth not being sufficiently high for φ i exp( ) f at two different locations to cross-correlate to zero, irrespective of how widely separated these locations may be (cf. Prade et al. 55 ).
To pro ceed fur ther, follow G o o dman 51 in s eparating out the asymptote term f rom (23), and re-normalising the part of the correlation that does decay to zero. Thus: is a re-normalised correlation function www.nature.com/scientificreports www.nature.com/scientificreports/ 2 2 2 f f f that does decay to zero for coordinate separations , and τ is the transverse correlation length (see Fig. 2 = Δ x z ( , ) represents small-angle x-ray scattering. Thus a physical consequence of the lack of decay to zero in Eq. (23), is that the diffraction pattern in Eq. (27) contains two components 45,51,52,56,57 : • A "specular diffraction" component that undergoes Fresnel diffraction as if there were no micro-structure present, but with intensity damped by the decoherence factor 45 shown in Fig. 2 This splitting of x-ray energy flows -whereby an incident coherent x-ray energy flow upstream of the object bifurcates into superposed coherent and diffusive energy flows downstream of the object -corresponds directly to the factors of F and 1 − F introduced "by hand" in the previous sub-section (see Eq. (9)), if we take Note that a similar bifurcation of energy flows occurs, for similar reasons, in statistical dynamical x-ray diffraction theory 58,59 and the frozen-phonon model of electron diffraction theory. Note also that a non-zero specular-diffraction component (which will typically be the dominant component in the present context) indicates the SAXS-induced speckle to be "partially developed" 60 ; this may be contrasted with the case of "fully developed" SAXS speckle, which is not relevant to our context, in which the specular component would be fully extinguished.
Assume I s (x) to be sufficiently slowly varying that the square-root term in Eq. (28) may be replaced by I s (x). For the complex exponent in Eq. (28), make the more sensitive approximation: where φ′ x ( ) s denotes the derivative of φ s (x) with respect to x; the above equation makes implicit use of the fact that φ f varies much more quickly than φ s . Then write P Δ (x′) in Eq. (28) as a Fourier integral using the notation and convention specified by∫ where k x is the Fourier coordinate corresponding to x. Also write Δ ⁎ P and C in Eq. (28) as Fourier integrals, with C being Fourier transformed with respect to its first argument. With all of these steps, together with the fact that | | = Δ P 1 for the Fresnel propagator 4 , one obtains (cf. similar equations in related contexts 45,52,61,62 , especially those due to Nesterets 45 ): x x 2 s s f Physically, the above equation states that the SAXS contribution, at a given transverse location x, integrates over a fan of rays making angles φ′ ]/ x s with respect to the optical axis. This fan of rays is due to scattering from unresolved micro-structure associated with the ensemble of fast phase maps {φ f (x)}. The whole fan of SAXS rays is rotated by the angle φ′ x k ( )/ s , due to the specular refraction. All SAXS-scattered x rays, due to illumination of the point at x = x 0 , are integrated over a single pixel. See Fig. 2(d). The angular spread of the SAXS fan will typically be on the order of a few degrees (50 milli-radians) for hard x rays 63 , while the fan rotation angle φ′ x k ( )/ s will typically be on the order of micro-radians 64 . The function C , being the Fourier transform of a correlation function 43 , is proportional to the power spectrum of the ensemble of fast-phase fields φ i x exp { ( )} f (Wiener-Khinchin theorem 65 ); this power spectrum (local differential scattering cross section 54,55,66 ; cf. Eq. (44) below) gives the weights for the diffracted "rays" in the SAXS fan. The fraction channelled into the specular Fresnel diffraction (cf. Eq. (29)).
If the object-to-detector distance Δ is large enough that the SAXS-induced fan of rays in Fig. 2(d) spreads over more than one pixel, we have the scenario shown in Fig. 2(e). Equation (32) generalises to the convolution-type integral: The function d is simply a transverse re-scaling of C , from a function of Fourier-space coordinates k x (i.e. angular coordinates k x /k, in the paraxial approximation) to a function of the transverse coordinates x over the detector plane. Stated differently, d(x; x 0 ) is the intensity distribution of SAXS, as a function of position x on the detector, due to x rays incident at position x 0 . If the object-to-detector distance Δ is sufficiently small, two additional approximations can be made.
(i) Small Δ implies d(x; x 0 ) to be sufficiently narrow for its Fourier transform d k x ( ; ) x 0 with respect to x to be sufficiently broad that it can be represented by a second-order Taylor expansion in k x . The term in the Taylor series for d k x ( ; ) x 0 that is linear in k x will vanish since C(Δx; x 0 ) is approximately even in Δx (cf. Eq. (47)). Given all of the above, the convolution operator π is an x 0 -dependent blurring width (this blurring being due to SAXS), and Q is a positive constant. Invoke conservation of energy, which implies that the total scattered intensity, due to an incident x-ray pencil beam of intensity I s (x) and with small thickness Δx, will have the same integrated intensity after scattering from the object; see Fig. 2(e). This shows that Q = 1, so that π 67 ). Hence we may approximate Eq. (33) as: s 0 0 0 0 . Strictly speaking, this linear integral transform is not a convolution integral since the blurring kernel d depends on x 0 . We use the convolution notation for clarity, however, to denote such smearing with a position dependent point spread function that results from the position-dependent SAXS. See also the next section, for an analysis that considers the same linear integral transform using a more conventional and explicit notation. (ii) A second consequence of small Δ is that the TIE approximation 5 may be applied to the first term on the right of Eq. (27):  (36) and (37) are substituted into Eq. (27), we obtain Assume only a small fraction of the incident x rays to be converted to SAXS, so that σ φ  x ( ) 1 2 f . Thus: 2 The corresponding position-dependent effective diffusion coefficient D eff (x; Δ) obeys Eq. (16), so that www.nature.com/scientificreports www.nature.com/scientificreports/ Laplacian. Note also that a Kramers-Moyal generalisation of the Fokker-Planck equation would have resulted if all terms had been included in the expansion for π ⊗ d x x 2 ( ; ) 0 (see next section). We again emphasise that Eq. (39) is to be considered with Δ being small but finite and fixed. Thus, the simplicity of using diffusion concepts to describe SAXS, must be balanced against the fact that free-space propagation blur-width L scales as Δ rather than Δ ; formally, this is accounted for by having a Δ-dependent diffusion coefficient for the x-ray Fokker-Planck equation, which is only ever used in its finite-difference form for small nonzero Δ.
We close this section with Fig. 3. This indicates the relation between various key quantities: the divergence angle θ of the SAXS fan, the characteristic transverse length scale l of the spatially-rapid wave-front fluctuations induced by unresolved micro-structure at the sample exit surface, the phase depth σ φ f of the associated spatially-rapid phase fluctuations, the diffusion coefficient D and the effective diffusion coefficient D eff in the Fokker-Planck equation, and the smear width L associated with the SAXS fan. We make the following remarks on each equation in Fig. 3. All physical dependencies, for the quantities in Fig. 3, are intuitively reasonable: (a) the blur width L is directly proportional to the angular spread θ of the SAXS fan; the blur width L is directly proportional to the sample-to-detector distance Δ; (b) the angular width θ of the SAXS fan is inversely proportional to the transverse length scale l of the projected unresolved micro-structure, etc.

Kramers-Moyal extension of x-ray fokker-planck equation
The preceding analysis gives only a crude treatment of the angular distribution of the local SAXS fan, which is described merely via its angular spread θ(x, y) at each point over the exit surface z = 0 of a thin sample. Here, we show that a less-crude treatment leads to the Kramers-Moyal generalisation of the Fokker-Planck equation, in a form suitable for x-ray near-field imaging of thin objects.
Recall Eq. (29), which links the local phase depth associated with the projected micro-structure of a thin object, to the associated fraction F(x, y) of the x-ray beam which is channelled into SAXS. The two-transverse-dimensional forms of Eqs (27), (33) and (37), upon writing the linear integral transform in Eq. (33) explicitly and setting π ≡ d K 2 , then become: s Again, the first term on the right side of Eq. (42) is associated with the coherently-scattered component of the x-ray field downstream of the thin object in the plane z = 0, while the second term B is associated with the diffusely scattered component. In the zero-SAXS limit where F(x, y) → 0, Eqs (42) and (43) reduce to Eq. (4).
Recall the usual definition of a differential scattering cross section σ θ θ Ω d d [ / ]( , ) x y , where dΩ is an infinitesimal element of solid angle and θ x , θ y describe angular deflections (scattering angles) in the x, y transverse directions, respectively 69 . Hence conclude that π = K d 2 in Eq. (43) is proportional to the local differential scattering cross section 66 associated with unresolved micro-structure in the sample, namely a differential scattering cross section σ θ θ Ω ′ ′ d d x y [ / ]( , ; , ) x y that depends upon an illuminating beamlet's centre (x′, y′), as well as upon the scattering angles θ x , θ y . Thus we have the proportionality www.nature.com/scientificreports www.nature.com/scientificreports/ x y where the paraxial approximation has again been assumed in taking θ θ ≈ tan x y x y , , . Equations (42) and (43) then have the simple interpretation that the intensity downstream of a sample, which is contained in the plane z = 0, superposes a micro-structure-independent near-field Fresnel diffraction pattern that is damped with a decoherence factor 45 1 − F, with a diffusely-scattered signal B arising from the superposition of each SAXS fan emerging from each point over the exit surface of the sample. Such a family of SAXS fans is measured e.g. in Schaff et al. 36 ; this amounts to a measurement of ′ ′ = Δ K x y x y z ( , , , , ) . In the form given by Eqs (42) and (43), the object-to-detector distance Δ is assumed to be sufficiently small that the detector is in the near-field of the slowly-varying part s s of the complex wave field at the sample exit surface z = 0. In these same equations, Δ may be sufficiently large that the SAXS fan Δ ′ ′ = K x y x y z ( , , , , ) -emerging from the point (x′, y′, 0) over the exit surface z = 0 of the thin object, and considered as a function of coordinates (x, y) over the detector in the plane z = Δ-can have a width that is several pixels or more (cf. Fig. 2(e)). If these SAXS fans are highly structured, e.g. if they exhibit SAXS satellite peaks from highly directional unresolved structures such as collagen fibres 70 , then Eqs (42) and (43) can be used without any need for further approximation, to model image contrast in a regime where the detector plane is in the near-field of the resolved structure in the sample, but in the far-field of the unresolved micro-structure.
However if Δ is indeed small enough that the SAXS fan spreads over no more than a few pixels, and if the SAXS fans are not too highly structured, an intermediate simplification is possible, with a domain of validity between that of the less-general Fokker-Planck equation and the more general expression in Eqs (42) and (43). This intermediate model arises from a more precise consideration of the angular scattering distribution in the local SAXS differential cross sections, compared to that given in the preceding section. With this end in mind, assume that the angular widths θ(x′, y′) of all differential scattering cross sections contained in ′ ′ = Δ K x y x y z ( , , , , ) obey θ ′ ′  x y ( , ) 1, with all such narrow SAXS fans being strongly peaked in the forward direction. If this is the case, we are justified in Taylor  In obtaining the above expression, we have made use of the following three points: • We have assumed that each SAXS fan ′ ′ = Δ K x y x y z ( , , , , ) -emanating from the point (x′, y′) over the exit surface of the sample to produce an intensity distribution that is a function of detector coordinates (x, y) in the plane z = Δ-obeys: This normalisation condition is appropriate since the multiplicative factor of F is a scattering fraction which accounts for the fact that only a certain percentage of the incident beam is converted to SAXS upon passage through the sample.
consistent with the fact that any transverse shift in the centroid of the SAXS fan may be interpreted as a local gradient in φ x y ( , ) s that has already been accounted for in Eq. (42). This eliminates the terms Δ D x y ( , ; ) 01 (1) and Δ D x y ( , ; ) 10 (1) that would otherwise have appeared in Eq. (45). There is an intrinsic ambiguity here: it is a question of semantics whether we consider φ x y ( , ) s to possess a linear phase gradient and assume the SAXS fan to have a centre-of-mass that is undeflected, or whether we instead consider there to be no local gradient in φ x y ( , ) s but view the centre-of-mass of the SAXS fan to be transversely shifted on account of correlations that are present in the sample's unresolved micro-structure. In writing Eq. (47) we take the former of these two equivalent points of view.
• We have defined the following hierarchy of diffusion tensors (SAXS-fan moments, cf. Gureyev et al. 67 ), which arise from terms of progressively higher order in the Taylor expansion of I s (x′, y′, z = 0) about (x, y): Δ D x y ( , ; ) 20 (2) , which may be mapped to the semi-major axis of the SAXS ellipse, the semi-minor axis and a rotation angle 42 . The higher-order diffusion tensors in Eqs (49) and (50) have an analogous interpretation.
Finally, if we insert Eq. (45) into Eq. (42), we obtain the following finite-difference form of a Kramers-Moyal equation 13 : The F(x, y) = 0 case of Eq. (51) is the commonly-employed first-order finite-difference form of the transport-of-intensity equation 5 (see Eq. (4)). The final lines of this equation incorporate SAXS-induced blurring, in a manner which takes the anisotropy of the local SAXS fans into account. Pawula 71 has shown that such Kramers-Moyal expansions may be either truncated at most to second or lower orders in spatial derivatives, or not truncated at all. In our context, this amounts to working with one of five possibilities, with increasing order of generality: • Keeping only the first two terms on the right-hand side of Eq. (51) (TIE approximation), with F = 0, which ignores SAXS altogether; • Setting D (3) and all higher-order diffusion tensors to zero, and assuming rotationally-invariant SAXS fans via

02
(2) 20 (2) 11 (2) leaving us with the Fokker-Planck form given in Eq. (14) if we can also assume that  F 1; • Setting D (3) and all higher-order diffusion tensors to zero, thereby reducing Eq. (51) to an anisotropic Fokker-Planck equation incorporating the diffusion tensor Δ D x y ( , ; ) mn (2) (assumption of an elliptic SAXS fan); • Keeping all orders in Eq. (51), which contains all SAXS-fan moments. Note that this hierarchy of SAXS-fan moments may be measured, e.g. by raster scanning a focused beam through a thin sample to obtain a family of SAXS patterns 36,72 from which each component of each diffusion tensor at each location (x, y) may be obtained using Eqs (48)-(50); • Eschewing the Kramers-Moyal and Fokker-Planck approximations and instead working with Eqs (42) and (43) directly.

Discussion
The primary goal of this paper is to give several means by which the Fokker-Planck equation (and its extension to the Kramers-Moyal equation) may be used to model forward and inverse problems of near-field x-ray imaging, in the presence of both phase-amplitude shifts and non-negligible small-angle x-ray scattering from unresolved micro-structure in a thin sample. A detailed study of any particular application or applications is beyond the scope of this paper. One such application, grating-based x-ray phase contrast imaging [73][74][75][76] , is studied in detail in a companion paper 77 that gives a number of analytical expressions based on the finite-difference expression in Eq. (11). More generally, Eqs (11) and (10) may be used to calculate, in one and two transverse dimensions respectively, the near-field paraxial image corresponding to a thin sample with known position-dependent exit-surface intensity, phase and diffusion coefficient. As we have already seen, a sufficiently slowly varying fraction F(x, y), this being the fraction of the incident radiation that is converted to SAXS, may be brought into the second square brackets on the right hand side of Eqs (10) and (11), and thereby incorporated into an effective diffusion coefficient. See Eqs (16), (29) and (41), in addition to Eq. (g) in Fig. 3. These expressions can be used for calculating propagation-based phase contrast x-ray images [78][79][80] and speckle-tracking phase contrast images [81][82][83] , as well as the previously mentioned grating-based phase-contrast images. Note, also, that if one is in the intermediate-field rather than the near-field www.nature.com/scientificreports www.nature.com/scientificreports/ regime for φ I i exp ( ) s s , expressions such as (i) Eqs (27) and (36); or (ii) Eqs (27) and (43); or (iii) Eqs (27) and (45) may instead be used. All of these expressions are amenable to analytical implementation if the functional forms of intensity, phase and diffusion coefficients are known, or to computational implementation if the intensity, phase and diffusion are numerically modelled e.g. using Geant4 84 .
The above indications, of the forward problem of near-field and intermediate-field x-ray imaging for a thin sample in the presence of both coherent and diffusive scattering, set up Fokker-Planck and Kramers-Moyal equations that can be used to consider the associated inverse problem of reconstructing intensity, phase and diffusion from measured intensity data. It is again beyond the scope of our paper to explore this point further, beyond the indicative suggestions given in the three examples below.
Example #1. As a first example of inverse problems that may be tackled based on the formalism of this paper, observe that the diffusion-tensor moments may be measured e.g. using the previously-mentioned technique of raster scanning a focused quasi-monochromatic x-ray probe over a lattice of points at the entrance surface of a thin object 36,72 , and the scattering fractions F(x, y) obtained using the same data. Since both = Δ I x y z ( , , ) and = I x y z ( , , 0) s can be measured using quasi-monochromatic x rays, and since both the object-to-detector distance Δ and the radiation wave number k are known, the only remaining unknown in Eq. (51) would be the phase φ s . This phase could then be obtained using existing techniques for numerically solving the transport-of-intensity equation (e.g. a full multi-grid method 85 or a method using four Fourier transforms 86 ).
Example #2. A second application of the formalism of the present paper, is to the field of x-ray speckle tracking 81,82 . The technique of x-ray speckle tracking illuminates an object with a known resolved reference speckle field, and then measures one or more distorted forms of that speckle field which arise when a sample is placed in the illuminating speckle beam. The diffuse-scatter signal, considered in the present paper, arises naturally in x-ray speckle tracking -via unresolved sample-induced speckles that should be carefully distinguished from the titular illuminating speckles -and gives useful information that may be extracted using a variety of techniques. These techniques are typically based on the reduction in illuminating-speckle visibility due to SAXS-induced diffusion: see e.g. the review by Zdora 83 , and references therein. The geometric-flow approach to speckle tracking, which has recently been developed for x-rays 87 but has also been very recently applied to visible-light imaging 88 , takes as a starting-point the following equation: R S R This relates the intensity I R (x, y) of a reference speckle image obtained in the absence of a sample, to the distorted/coded/encrypted form of that speckle image, denoted I S (x, y), obtained in the presence of a pure-phase-object sample; D ⊥ (x, y) is a transverse displacement-vector field, which distorts the reference speckle image I R (x, y) into the speckle image I S (x, y) obtained in the presence of the sample. The above equation, modelled on the continuity equation that has played such a dominant role in the present paper, conserves photon energy both globally and locally. It may be used to reconstruct the speckle displacement D ⊥ (x, y) induced by the phase object, and hence a map of the phase shift φ(x, y) due to the object, in a simple and efficient manner that implicitly rather than explicitly tracks speckles 87 . Note that the displacement field is related to the phase shift of the object, via where Δ is the sample-to-detector distance and k is the wave-number of the x-rays. A Fokker-Planck form of Eq. (53), suitable for a phase object described by both its phase shift φ(x, y) and effective diffusion coefficient D eff (x, y; Δ), is: This incorporates both coherent flow and diffuse flow, into the geometric-flow method for x-ray speckle tracking. It would be interesting to investigate an augmented form of the geometric-flow x-ray speckle-tracking method, which takes Eq. (55) rather than Eq. (53) as a starting point, and seeks to reconstruct both φ(x, y) and D eff (x, y; Δ) rather than just φ(x, y). Of possible relevance in such a context -namely, of separating out the signal due to coherent versus diffusive energy flow -is the fact that D eff (x, y; Δ) will transform as a scalar upon rotation of the sample through 180 degrees (about a rotation axis that is perpendicular to the optical axis, e.g. in a computed-tomography setting), whereas D ⊥ (x, y) will transform as a vector under the same rotation.
Example #3. As a third and somewhat more detailed example of inverse problems that may be tackled using the formalism outlined in the present paper, consider a static non-magnetic non-crystalline sample that is made of a single material. This sample may and in general will have varying spatial density, with complex refractive index 4 δ β = − + n x y z x y z i x y z ( , , ) 1 ( , , ) ( , , ) (56) which is such that the ratio δ β x y z x y z ( , , )/ ( , , ) is independent of position at all points within the sample 7,89 . The associated unresolved micro-structure is left arbitrary, and the exit surface of the sample is assumed to correspond to the plane z = 0. Further assume normally-incident unit-intensity illumination by z-directed quasi-monochromatic x-ray plane waves. This single-material approximation, together with the projection (2019) 9:17537 | https://doi.org/10.1038/s41598-019-52284-5 www.nature.com/scientificreports www.nature.com/scientificreports/ approximation 4 , implies that the x rays at the exit surface of the sample have an intensity = I x y z ( , , 0) and phase φ = x y z ( , , 0) given by Here μ = 2kβ is the linear attenuation coefficient of the material from which the sample is composed, and T(x, y) is the projected thickness of the sample along the z direction. The fact that 7 2 enables the 2 + 1-dimensional finite-difference Fokker-Planck equation in Eq. (14) to be transformed to: eff 2 Here, we have used Eq. (16) to incorporate the effective diffusion coefficient D eff (x, y; Δ). Equation (59) is a linear elliptic partial differential equation for the unknown exponentiated projected thickness exp[−μT(x, y)] of the single-material sample with projected thickness T(x, y), and is an algebraic equation for the unknown effective diffusion coefficient (SAXS term, or so-called "dark-field" signal 11,44,76 ) D eff (x, y; Δ). Note that the D eff (x, y; Δ) → 0 limit of Eq. (59) is the basis of a commonly-used single-image method for x-ray phase retrieval 7 . When we cannot assume D eff (x, y; Δ) to vanish, we may solve Eq. (59) for both D eff (x, y; Δ) and T(x, y), as follows. Suppose propagation-based phase contrast images are obtained at two different propagation distances, Δ 1 and Δ 2 . From Eq. (f) in Fig. 3, we can write down the scaling relation: This scaling results from the already-mentioned fact that the slab of vacuum, in between the exit surface z = 0 of the object and the entrance surface z = Δ of the detector, is not a diffusive medium. Thus the width of the SAXS fans scales linearly with Δ, rather than being proportional to Δ 1/2 (as would be the case for true diffusion). Setting D ∝ Δ ensures that the final term on the right side of Eq. (6) is proportional to Δ 2 , ensuring the width of the SAXS fan scales as (Δ 1/2 ) 2 = Δ, as required. As is the case throughout the paper, the artifice of a Δ-dependent diffusion coefficient is a price to be paid for the simplicity of being able to work simultaneously with both diffusion concepts and free-space coherent propagation concepts, in the context of the Fokker-Planck and Kramers-Moyal equations for x-ray imaging. To proceed we write the z = Δ 1 and the z = Δ 2 cases of Eq. (59), and then use Eq. (60) to eliminate D eff (x, y; Δ 1,2 ). Hence: Equation (61) is mathematically identical in form to Eq. (7) of Paganin et al. 7 , and may therefore be solved using the same Fourier-transform method. This gives: x y e Here,  denotes Fourier transformation with respect to x and y, in any convention that transforms ∂/∂x into ik x and ∂/∂y into ik y ; (k x , k y ) are Fourier coordinates corresponding to (x, y), and −1  is the inverse of  . Equation (62) shows how a linear combination of two images, taken at two different distances Δ 1 and Δ 2 , may be combined to yield the projected thickness T(x, y) of a thin single-material sample, in a manner that is independent of SAXS scattering by the sample (cf. Pagot et al. 40 ). Once T(x, y) has been so obtained, Eq. (59) can then be solved algebraically for D eff (x, y;Δ). Such a phase/dark-field retrieval approach that incorporates dark-field effects might benefit samples from a range of fields including materials development (e.g. micro-crack detection in high-strength materials 90 ) and biomedical imaging (e.g. diagnostic imaging of the lungs 91 ).
In the third example above, and indeed throughout the paper, the effective diffusion coefficient (and, more generally, the diffusion tensors as defined in Eqs (48)-(50)) will have at least three independent contributions: • The first contributor to the hierarchy of diffusion tensors is the sample-induced scatter due to unresolved micro-structure, that has been a main theme of the present paper. • The second contributor to the diffusion tensors is an edge-scattering signal 11,44 that is now known to be due to unresolved sharp edges 92 , and which may be viewed as a Young-Maggi-Rubinowicz type boundary (2019) 9:17537 | https://doi.org/10.1038/s41598-019-52284-5 www.nature.com/scientificreports www.nature.com/scientificreports/ wave 65,93-97 or a Keller-type 98 diffracted ray. This is examined in further detail in the previously mentioned companion paper 77 .
• The third contribution to the diffusion tensors is associated with incoherent aberrations (including the modulation transfer function of the detector, and penumbral blurring effects due to finite source size) of the imaging system that is used to measure images of the sample. As an example: take Eq. (59), set D eff (x, y; Δ) to a constant D eff (Δ) that is independent of x, y so as to model penumbral blur due to finite source size in a linear shift-invariant imaging system, and then note that the resulting equation has a solution identical to that recently published by Beltran et al. 99 .
All three effects will in general be present simultaneously, so that e.g. the SAXS fans associated with sample-induced blurring will need to be convolved with the intensity point-spread function of a given optical imaging system, and similarly with the signal associated with unresolved sharp edges. These interesting complications, which may be summarised in the statement that "partial coherence can come from the source, or the object itself ", are beyond the scope of the present paper.
There is some relation between the differential form of the diffusion operators employed in this paper, and the idea of Laplacian-based unsharp-mask image sharpening 100,101 . From an unsharp-mask image processing perspective, the linear differential operator 102 locally sharpens an image I(x, y) to which it is applied, with a characteristic position-dependent partial-deconvolution transverse length scale equal to w(x, y); cf. Eq. (59). The approximate inverse of this operator, which locally blurs an image over the same transverse length scale w(x, y), is (cf. Eqs (6) and (35)) 102 : This blurring is rotationally symmetric. If this rotational symmetry is not present, we can use a Kramers-Moyal type linear differential operator to effect local blurring with a position-dependent point-spread function, namely The notation above is analogous to the diffusive terms in Eq. (51), and we have set ∂/∂x ≡ ∂ x , ∂/∂y ≡∂ y . This gives another perspective on the use of derivative operators to effect blurring associated with a position-dependent point spread function. Similar ideas are used in image sharpening via coherent defocus 67 and image blurring operations of the convolution type 103 . The two derivations of the Fokker-Planck equation for paraxial imaging, given in the present paper, may be augmented by other approaches. An approach based on Wigner functions 104 would be an interesting avenue for future research, since Wigner functions naturally capture the idea that there can be a distribution of energy flow vectors at each point in space 105 . See e.g. Eq. (19) of Nugent and Paganin 106 , which was obtained via a Wigner-function formulation of paraxial optics, and which may be readily manipulated into a Fokker-Planck form given by Eq. (14). From a different perspective again, the intensity transport equations associated with arbitrary aberrated linear shift-invariant optical imaging systems 107 may be rendered into Fokker-Planck and Kramers-Moyal forms via the slight generalisation of replacing shift-invariant local differential operators with their corresponding shift-variant forms, along very similar lines to those indicated in the preceding paragraph.
Unresolved speckle lies at the heart of many phenomena observed in imaging using partially coherent optical fields 4,108 . Here, "speckle" is used in the general sense of optical fields whose spatial and/or temporal intensity fluctuations have a random component; this differs from the more usual usage which equates "speckle" with "fully developed speckle". Unresolved speckles have two distinct origins: (i) the partial coherence of the illuminating beam and its associated spatio-temporal speckles, and (ii) unresolved speckle associated with sample micro-structure. In the present paper, the spatial width w of the point-spread function (PSF) of the position-sensitive detector-a lower limit on which is given by the pixel size-defines a natural transverse length scale. Speckles much smaller in size than w will be spatially averaged when registering an image. If ergodicity of the associated stochastic process can be assumed, over the area of each pixel, then by definition the spatially-averaged intensity at each pixel location can be replaced with a corresponding ensemble average over many realisations of the ensemble of fields illuminating each pixel. Physically, and as has been used in the present paper, this corresponds to a statistical ensemble of objects, each of which has the same coarse-grained projected complex refractive index distribution, but differing spatially-random micro-structure. This construct permits us to approximate the resulting measured intensity map, with a probability distribution obeying the 2 + 1-dimensional Kramers-Moyal equation, including the special case of the 2 + 1-dimensional Fokker-Planck equation. Refraction and prism terms are associated with the resolved coarse-grained structure of the ensemble of objects and the ensemble of illuminations, with the diffusion term being associated with the unresolved speckles. What is deemed SAXS is a detector-resolution-dependent construct associated with unresolved sample-induced speckles. Thus, the component of the scattered radiation attributed to SAXS is dependent on w, and thereby detector dependent. This last statement ties in with the broader idea that there is a close link between partial coherence and the spatio-temporal resolution of one's detector-see e.g. Sec. 3.6.3 of Paganin 4 . It is also closely related to what component of the x-ray energy flow is classed as "coherent energy transport", and what is classed as "diffusive energy transport", in the Fokker-Planck and Kramers-Moyal constructions: the finer the spatio-temporal (2019) 9:17537 | https://doi.org/10.1038/s41598-019-52284-5 www.nature.com/scientificreports www.nature.com/scientificreports/ resolution of the intensity detection, the smaller the fraction of the x-ray energy flow that will be classed as diffusive. Both the TIE-type and SAXS-type scattering (resolved and unresolved structures) arise from the same fundamental physics, as is clear in the first-principles derivation of the x-ray Fokker-Planck equation. Again, what length scales ultimately fall into either category is determined by the resolution of the detector.
We close this discussion by emphasising the broader applicability, beyond the margins delineated in the present paper, of the Fokker-Planck and Kramers-Moyal equations to (i) partially coherent paraxial x-ray imaging in particular, and (ii) partially-coherent paraxial imaging more generally. Context may be given to this claim, by the huge domain of applicability of Fokker-Planck and Kramers-Moyal concepts. Of equal relevance is the likelihood that the domain of validity of the paraxial-imaging Fokker-Planck and Kramers-Moyal equations is at least as broad at that of the transport-of-intensity equation which they generalise; and, since the transport-of-intensity equation has been applied to visible light 109 , electrons 110 , neutrons 111 and x-rays 7 , there is every expectation that the formalism of the present paper may be utilised for paraxial-imaging scenarios using all of the just-listed radiation and matter wave fields. Extensions of the present paper, to imaging applications beyond those listed previously, include a Fokker-Planck approach to neutron phase contrast imaging 111,112 , x-ray magnetic circular dichroism imaging 113 , x-ray edge-illumination imaging 114 , x-ray Zernike phase contrast imaging 115 , neutron grating imaging 116 , electron out-of-focus contrast imaging (Fresnel contrast imaging) 117 and visible-light out-of-focus contrast imaging 109 . As with the present study, a core motivation for such an approach is its potential simultaneously to model, under the aegis of a single powerful formalism, both deterministic and stochastic aspects of paraxial thin-object imaging using partially coherent radiation and matter wave fields.

conclusion
Fokker-Planck and Kramers-Moyal equations were obtained, to model near-field and intermediate-field paraxial quasi-monochromatic x-ray imaging of thin samples exhibiting both phase-amplitude and small-angle-scattering contrast. Two complementary derivations were given. Possible applications, to both forward and inverse imaging problems, were discussed.