Orientationally-averaged diffusion-attenuated magnetic resonance signal for locally-anisotropic diffusion

Diffusion-attenuated MR signal for heterogeneous media has been represented as a sum of signals from anisotropic Gaussian sub-domains to the extent that this approximation is permissible. Any effect of macroscopic (global or ensemble) anisotropy in the signal can be removed by averaging the signal values obtained by differently oriented experimental schemes. The resulting average signal is identical to what one would get if the micro-domains are isotropically (e.g., randomly) distributed with respect to orientation, which is the case for “powdered” specimens. We provide exact expressions for the orientationally-averaged signal obtained via general gradient waveforms when the microdomains are characterized by a general diffusion tensor possibly featuring three distinct eigenvalues. This extends earlier results which covered only axisymmetric diffusion as well as measurement tensors. Our results are expected to be useful in not only multidimensional diffusion MR but also solid-state NMR spectroscopy due to the mathematical similarities in the two fields.

In MR examinations of porous media as well as biological tissues, one is often confronted with a medium comprising an isotropic ensemble of individually anisotropic domains. The effect of diffusion within such media on the MR signal has thus been considered since the 70s [1][2][3] . The problem we tackle here is for the situation where diffusion within each microdomain can be taken to be free, thus can be characterized by a microscopic diffusion tensor D. This assumption has been widely employed in the recent development of multidimensional diffusion MR (see ref. 4 for a recent review and the references therein), which employs general gradient waveforms for diffusion sensitization. The level of diffusion-sensitivity is fully captured by a measurement tensor 5 B, yielding the signal attenuation DB tr( ) for the microdomain. When some residual anisotropy is present upon the inherent signal averaging over the sample (or voxel in image acquisitions), a series of diffusion signals can be acquired with rotated versions of the same gradient waveforms. Upon averaging such signals, one obtains the orientationally-averaged signal, which is devoid of any macroscopic (ensemble) anisotropy, i.e., the anisotropy of the orientation distribution function of the microdomains.
As demonstrated in ref. 4 , there is a close resemblance between the mathematics involved in this problem with that in multidimensional solid-state NMR spectroscopy 6 . In the latter, the local structure is described by the chemical shift tensor, and a measurement tensor can be introduced, which is determined by the orientation of the main magnetic field and manipulations of the sample orientation within it [7][8][9][10][11] .
Our interest in this article is the averaged signal obtained by repeating a given measurement protocol in all orientations, which is relevant for both diffusion and solid-state MR applications. Our results permit a more general analysis and modeling when the orientationally-averaged signal has been measured, as the common condition on axisymmetry is relaxed. The problem of determining the set of optimal diffusion gradient orientations for computing the orientationally-averaged signal is addressed elsewhere 12 . Here, we extend the existing literature 3,13,14 by providing explicit expressions for the orientational averages to accommodate measurement and/ or structure tensors that are not axisymmetric.
With R denoting an arbitrary rotation matrix (   = = RR R R I), the complete set of such measurements is spanned by the expression RBR  , hence yielding the orientationally-averaged signal as

DRBR R
tr ( ) where the average is over the three dimensional special orthogonal group SO(3), i.e., the space of all rotations. Since the matrix trace operation is invariant under cyclic permutation of the product in its argument, this is equivalent to the expression R DRB R tr( )  which demonstrates the utility of such an averaging over rotated protocols: The information is the same as that which would be obtained by a single measurement, had the specimen consisted of microdomains with the same diffusivity (D) distributed uniformly in orientation. For solids, such a specimen is obtained by grinding the material into a powder, eliminating any nonuniformity in the orientational dispersion (macroscopic, global, or ensemble anisotropy) the bulk specimen might have had. Hence we use the terms powder-averaged and orientationally-averaged interchangeably, given the latter effectively achieves the same result. An alternative interpretation of the average in (2b) can be realized when one considers the Laplace transform of a function which takes matrix arguments 15 , in our case a tensor distribution p(D), p D DB ( ) Sym tr( ) 3 where the integration is performed over Sym 3 , the space of all symmetric 3 × 3 matrices, and for those matrices ∈ B Sym 3 where the integral converges. With  3 denoting the set of all positive semi-definite matrices in Sym 3 , the applications of interest in this work will be when all the matrices ∈ D 3  . This means that 3. Interestingly, when p D ( ) is such that it describes an ensemble of tensors consisting of copies of a given tensor D , rotated to lie in all possible orientations, the above expression is the same average as (2b). In this scenario In other words, the orientationally-averaged signal that we are evaluating in this work is nothing but the Laplace transform of the distribution p D ( ) describing a uniform ensemble of rotated copies of a given tensor. Parallels can be drawn with previous works [16][17][18] that have employed parametric Wishart distributions (or its generalizations) for representing the detected MR signal; in these studies, the resulting expression for the Laplace transform is borrowed from the mathematics literature 19 . However, the Laplace transform of the distribution we are tackling in this work (Eq. (2b)) for a general D is not available to our knowledge.
Note that throughout the article, the phrase "general tensor" refers to a tensor whose ellipsoidal representation is not axisymmetric. Thus, a general tensor T is still to be understood as being real-valued and having the index symmetry = T T ij ji . Given that the matrices D and B are real and symmetric, and that the averages (2) are insensitive to individual rotations of these matrices, we are free to consider their diagonal forms, possibly in different bases It turns out that the average signal (2) can take a particularly simple form if repeated eigenvalues occur in at least one of the matrices D and B. With an ellipsoidal representation of these matrices in mind, we refer to these cases as axisymmetric or isotropic, depending on whether two or all eigenvalues coincide, respectively. The rank of the matrices appear to have no further significance for the simplicity of the calculation, except for rank-1 being a special case of axisymmetry.
In the following sections, we consider various combinations of symmetries of D and B in evaluating the average (2). Keeping in mind that the roles of D and B can be interchanged without changing the average signal, the relevant cases are

Results
Here, we give the resulting expressions for the average signal (2) in the cases listed above, with D and B given as in Eq. (4). All derivations are provided in Supplementary Section I, where S is first derived for the case (4) of a general D and axisymmetric B, from which more specialized cases (1)(2)(3) are deduced, while the most general case (5) is taken up last. D general, B isotropic. This is the case where all eigenvalues (a, b, c) of D are possibly different, while B is proportional to identity with = = d e f. One finds, ) tr( ) and hence the average in Eq.

D and B axisymmetric, B rank 1.
Here, two eigenvalues of D coincide, which we choose as = b c, while B is rank-1 with = = e f 0. This is a widely-utilized case in diffusion MR 1,3,[20][21][22][23][24] . The result follows as, It is assumed that ≠ d 0 and ≠ a c, as this implies that one of the matrices is isotropic, but this case can of course also be included here by continuity arguments.
With similar remarks as in the previous case, any of the two forms in (7) can be chosen, and if moreover = a c or = d f , one gets the first (isotropic) case above. The first form has real arguments when both D and B are either prolate or oblate, while the second form may be preferable when one matrix is prolate and the other oblate. Here, the symbol i F j ⋅ ( ) stands for a hypergeometric function, in particular with 1 F 1 ⋅ ( ) being the confluent hypergeometric function. Equations (8a-c) were derived by a rather direct approach; see Supplementary Section I.A. The fourth expression (8d), which is perhaps the most interesting for S when one matrix is axisymmetric, is derived from the general expression, i.e, the case accounted for next. As discussed below, it can lead to very efficient numerical evaluation.
The expressions above feature hypergeometric functions. However, it is possible to express them using more familiar functions. For example, by using a scheme 25 that involves term by term comparison with the confluent hypergeometric function's asymptotic behaviour 26 , we were able to write Eq. (8b) in the following form:

D and B general.
In the general case where neither matrix (necessarily) has any degenerate eigenvalue, the result can be expressed as a b e f cd m k mk mk and there are three alternatives for Y mk as follows: The coefficients q mk contain I m ⋅ ( ), i.e., the modified Bessel function of the first kind for various orders m, and the regularized hypergeometric function ⋅  F ( ) 1 2 . Also note that the floor function ⋅ ⌊ ⌋ indicates the largest integer smaller than or equal to its argument.
It is straightforward to check that S as given by Eq. (10) is invariant under the rescaling λ → D D, → λ B B 1 for nonzero λ, verifying an obvious invariance it should have due to its definition (2). Implied by that definition, S must also be unaffected by permutations of a b c { , , } and of d e f { , , }, as well as swapping This is not at all evident in the expansions above; in fact the ordering of eigenvalues can have drastic effects on the numerical behaviour of the series, even though the eventual sum does not change. Given D and B, and their eigenvalues, the remaining task is thus to assign them (in some order) to a, b, c and d, e, f in such a way that the series can be approximated well with only a few terms; see the next section.

Numerical Behaviour
In this section we give some comments and guidelines on how to evaluate S via the series in Eqs (8) and (10). The numerical behaviour of the series given in Eqs (8) and (10) varies drastically depending on how the eigenvalues of D and B are ordered (i.e., which eigenvalue is named a, b, and so on). In the formulas in Eq. (10), there is also the option of interchanging D and B. The discussion here revolves around an example employing Eq. (8), but the guidelines apply to the general case of Eq. (10) as well, albeit more involved.
When B is axisymmetric ( = e f ), f and d are fixed in Eq. (8). On the other hand, in these formulas, which all give the same result, one is free to permute a b c { , , }. Although this does not affect the answer, it affects the number of terms needed to get a good approximation. Hence, given the three eigenvalues of D, one wants to assign them to a, b, c in a clever way, as well as choose the most efficient formula.
Looking at the expansions (8), while there appears no obvious "best" choice for assigning a given set of eigenvalues, it is still wise to try and (i) keep the series expansion parameter (e.g., www.nature.com/scientificreports www.nature.com/scientificreports/ in Eq. (8a), and so on for the rest. Therefore, even though a deep analysis of the properties of hypergeometric functions is beyond the scope of this work, it is useful to note the following as a guideline for their sign and magnitude. With S 0 019175, and all the terms are normalized by S so that their sum is 1.
The first alternative (8a) has the property that its individual terms are invariant under the change ↔ a b, which can be shown to follow from relation (11). Therefore, out of the six possible assignments between the elements of the sets a b c { , , } and . . {0 1, 0 2, 3} only three are distinct. These are what is depicted in Fig. 1. It is seen clearly that for these parameters this alternative does not afford a series expansion that can be truncated after the first few terms for a usable result. The terms making the most significant contribution to the sum are not even at the beginning, but in this example rather around term number 15. Furthermore, the two latter choices exhibit terms of alternating sign and magnitudes of about 10 5 times the sum itself. Figure 2 depicts all three distinct choices for the alternative expression (8d). We see a very desirable feature here. When the largest value is assigned to c, the terms are seen to converge quickly (note that the expression is invariant under the exchange ↔ a b  15 675 . We also see that only even powers − − = . a b d f ( )( ) 0 55 enter, resulting in all terms of the series being positive; no cancellation occurs. This choice leads to a sum that converges so quickly that S is approximated within 2% by only the first term, while the first two terms attain an error less than 0.01%. In addition, it should also be noted that for moderate values of n,  are indeed equal. The terms in the series (8d) are also unaffected by the change ↔ a b, but this is immediate from the expression.

Applications
The results in this paper are the exact expressions given in (8)(9)(10), which extends the earlier formulas (5-7), together with the asymptotic behaviour given by Theorem 1. These series can give exact answers in more general models where axisymmetry is not insisted upon. However, to further motivate applicability of our findings, we discuss and suggest several possible situations where the new results provided in this work are used. www.nature.com/scientificreports www.nature.com/scientificreports/ We have presented explicit formulas for the orientationally-averaged signal S in Eq. (2) for general (symmetric) matrices D and B. These formulas complement the well-known cases 1, 2, and 3, i.e., the formulas in Eqs (5-7), where D and B have various symmetries. However, even when one matrix, say B, is rank-1, the signal expression for a general D (Eq. 8) is believed to be new. Eq. (10) raises the question of the usefulness of case 5, in which both measurement and diffusion tensors are general. We argue that this solution is indeed useful, for example, when a powdered specimen whose local structure is characterized by a general diffusion tensor is examined. In MRI, the imaging gradients lead to a rank-3 measurement tensor even when a standard Stejskal-Tanner sequence is employed. Moreover, since for a general D, even using a rank-1 measurement tensor ( ) for sufficiently many (at least 3) d determines D. Below, we also discuss power-laws and asymptotic behaviour of the signal in a more general setting.
From a practical standpoint, many "independent" measurements are necessary in order to mitigate noise-related effects. With an axisymmetric measurement tensor B, the space of measurements is two dimensional (two degrees of freedom in B), while in the general case the space of measurements is three dimensional. Loosely speaking, this means that in the latter case, there are more "independent" measurement tensors available close to the origin of this space (or any other point common to both spaces, for that matter). Since measurements close to the origin (i.e., B tensors with small eigenvalues) produce higher signal value, this is favorable from a signal-to-noise ratio perspective. On the theoretical side, however, there are situations where a general B tensor actually is crucial in determining the diffusivity properties of the specimen; see subsection below on the estimation of D from the series expansion of S.

Remarks on the white-matter signal at large diffusion-weighting.
Recently, the orientationallyaveraged signal in white-matter regions of the brain was observed to follow the power-law ∝ − S d 1/2 at large d when = = e f 0, e.g., in traditional Stejskal-Tanner measurements 27,28 . Because such decay is predicted for vanishing transverse diffusivity, these results have been interpreted to justify the "stick" model of axons 21,29,30 ( > = = a b c 0) while also suggesting that the signal from the extra-axonal space disappears at large diffusion-weighting. In a recent article 24 , we showed that such a decay is indeed expected for one-dimensional curvilinear diffusion observed via narrow pulses. On the other hand, in acquisitions involving wide pulses, the curvature of the fibers has to be limited in order for ∝ − S d 1/2 behaviour to emerge. We also noted that the ∝ − S d 1/2 dependence is valid for an intermediate range of diffusion weightings as the true asymptotics of the signal decay is governed by a steeper decay 24 .
Here, based on these findings, we consider a rank-1 diffusion tensor for representing intra-axonal diffusion, which is capable of reproducing the intermediate   www.nature.com/scientificreports www.nature.com/scientificreports/ the stick model could be performed by acquiring data using planar encoding, in which case the expected decay of the orientationally averaged signal would be characterized by a decay ∝ − d 1 regardless of whether or not the B matrix is axisymmetric.

Observation of the (non)axisymmetry of local diffusion tensors.
Here, we focus our attention to The answer to this question is yes; in fact, it is not so hard to see that if S d ( ) are equal for all ⩾ d 0, then D 1 and D 2 must be equal-see for instance the remark near Eq. (17) in the next section. (Also recall that we identify the matrices according to their eigenvalues; two matrices are "equal" for the purposes of this article if one can be rotated into the other.) As an illustration, starting with a rank-1 measurement tensor d B Fig. 3 shows the signal S as a function of d for six different diffusion tensors … D D D , , 1 2 6 . In this example, all tensors D i , = … i 1, 2, 6 have the same trace yielding the same behaviour near the origin, and their common smallest eigenvalue leads to the same large b behaviour. By varying the remaining two eigenvalues, (whose values are found in Fig. 3) while keeping their sum constant, different curves S d ( ) D i are produced. The six curves shown in Fig. 3 are samples from a family (indexed by the diffusion tensor D) of curves characterized by their initial and asymptotic behaviour. It is interesting to note that no axisymmetric diffusion tensor D other than D 1 and D 6 can produce a curve in this family. In this context, the 'same asymptotic behaviour' refers to the same exponential decay as → ∞ d , and this is given by the smallest eigenvalue of the diffusion tensor. The initial behaviour, i.e., ′ S (0) is given by the trace of the diffusivity, and it is immediate to see that an axisymmetric tensor where the smallest eigenvalue is 0.2 μm 2 /ms and where the trace is 2.6 μm 2 /ms has to have the eigenvalues of either D 1 or D 6 .
In the remaining part of this section, we consider the intermediate d regime alluded to in the previous section.

For a general rank-3 diffusion tensor
, the falloff is exponential, which can be inferred from the expres- since both the left-and the right-hand-sides of the above expression decay exponentially fast; see (6). Consequently, a reliable inference cannot be made in the large d regime in typical acquisitions due to limited SNR. The only exception is when = c 0, i.e., when D is of rank 2. In this case, the arguments from the preceding section can be employed by interchanging the matrices D and B, i.e., This is indeed the case, as the following theorem shows. , it then holds that This limiting behaviour is illustrated in Fig. 4 where, starting with a measurement tensor =          d B can be calculated. These limits are shown with dots in Fig. 4. The discrepancy between the curves and the corresponding dots are due to the finite values of d.
Hence, for large d, , ,0 , which evaluates to D tr( ) only for = a b. On the other hand, the small d behaviour of the average signal is always . Thus, D tr( ) can, in principle, be determined from the initial decay of the average signal. Any mismatch between this estimate and the estimate of κ a b ( , ) from the power-law decay of the signal can thus be attributed to transverse anisotropy of D. The above inference is a clear example to how the derived expressions involving general tensors can be utilized to gain new insight into the characterization of the local structure.
Estimation of D from the power series expansion of S. One of the main results of this paper consists of the series in Eq. (10), which make it possible to calculate the powder average S for general matrices D and B without numerical integration or sampling in SO(3). As the corresponding formula (8) for the case when one of the matrices, B, is axisymmetric is simpler, a relevant question is whether the use of a general B is of any help in determining D from a set of measurements. This question immediately splits into two; namely if there is an advantage with a general B in principle, and also in practice. For instance, for any fixed  B, and regarding S as a function of λ =  B B, the asymptotics of S for large λ may be of primary importance in principle, but in practice the actual values of S in that regime may be so small that noise and measurement errors make them impossible to use for determining D.
To address the question above, we start by considering a general B and then the special case when B is axisymmetric. Since any axisymmetric matrix B is the sum of an isotropic matrix and a rank-one matrix, and since the effect on S of an isotropic matrix is trivial, c.f. Eq. (5), it is sufficient to consider rank-one matrices B.
So, with a general D, we consider a family of B-matrices given by and λ is a scalar strength parameter. For fixed D and  B, the orientationally-averaged signal is a function of λ, and we seek the dependence for small λ. Consider therefore the The coefficients c c c , , 1 2 3 can  1   2  2  2  2  2  2  2  2  2   3  3  3  2  3   2  3  2  3   3  3  2  3 For a generic matrix  B (with the isotropic choice being the singular exception), knowledge of these three coefficients suffice to determine the matrix D: For ease of illustration, consider the special case wherein  B is rank-1 (i.e., = = e f 0) yielding the simplified coefficients We see that this is a non-degenerate system that gives D D D tr( ), tr( ), tr( ) 2 3 in terms of c c c , , 1 2 3 , and since (the eigenvalues of) D is determined from these three traces, c 1 , c 2 and c 3 determine D.
Returning to a general  B, a convenient point of view is to consider  B as decomposed into two parts (c.f., . Then the average signal (2) factors into two, each factor calculable from  1   2  2  2  2  2  2  2   3  3  3  3  2  3   2  2  3  2  3 while an isotropic =  f B I on the other hand yields (renaming the coefficients) The resulting signal expansion is the product 3 3 4  . Note how the isotropic part is not all that helpful: It attenuates the signal without being sensitive to anything other than D tr( ). When using a general  B as described above, Eq. (18) tells us that D cannot be determined from c 1 and c 2 alone, even if we are using the extra degree of freedom in B by varying δ and ε. This is so since c 1 and c 2 contain information only on two invariants ( D tr( ) and D tr( ) 2 ), which is not sufficient to determine D, which has three eigenvalues. That being said, in practice, varying δ and ε may be a way of getting "independent" measurements to reduce the influence of noise, i.e., increase SNR and therefore provide more reliable estimates of the coefficients c k .
Let us also mention that in the situation described by Eq. (18) where the combined freedom of δ and ε offers no extra information, one can still vary these to test the assumption that the specimen is indeed described by a single diffusivity matrix D. Namely, for different values of δ and ε, one should always get the same estimated matrix D.
Standard model of white-matter with a general diffusion tensor for the extracellular compartment. The extra degree of freedom provided by the parameters δ and ε may be crucial in certain relevant situations. In principle, such situations can be addressed by involving higher order coefficients … c c , , 4 5 but reliably estimating them would get increasingly difficult, and using c c c , , 1 2 3 for various δ and ε values is more robust. We give the following example, which could be relevant for simplified models of white-matter microstructure.
Suppose that our specimen is a mixture of two different substances, in unknown proportions, where one of the substances is (from a diffusivity perspective) a "stick". Thus, in proportions p and − p 1 , with ⩽ ⩽ p 0 1 , we have  1  2  2  2  2  2   2  2  2  2  3  3  3  2  3  3   2  2  3  2  3  3  3 By varying δ and/or ε, these equations determine p q D , , "almost uniquely", in the sense that there are (rare) situations where there are two sets of acceptable solutions to Eq. (20). However, if one also adds the measurements with an isotropic measurement tensor, this possible ambiguity goes away. (See Supplementary Section IV for details.) Thus, it is possible to obtain all 5 unknowns of a multi-compartment white-matter model from the first three terms of the power-series representation of the orientationally-averaged signal.

Discussion and Conclusions
Our elaborations on the numerical behaviour of the main results (8) and (10) were restricted to the case where one matrix is axisymmetric, corresponding to Eq. (8). In the more general case corresponding to Eq. (10), the same guidelines apply regarding how the ordering of the eigenvalues affects the numerical behaviour, with some additional caveats. First of all, since no eigenvalues necessarily coincide, the number of possible distinguishable orderings increases, by a factor of 6 in particular. Also, not only one has to choose between the three alternatives (10c-e) by the same guidelines that apply to the alternatives (8a-d), but also the number of coefficients q mk has to be specified.
A more brute-force approach to calculate the orientational average (2) is to set up the necessary integrations in the space of rotations and perform them by some discretization scheme. However, such an approach would require the integrand  − e DRBR tr( ) to be relatively well-behaved in the integration domain in order to be accurate, which is not always the case. For instance, when the matrices D and B are very similar to each other, and with one eigenvalue dominating the others, the integrand is (virtually) zero for most rotations R, with most of the contribution stemming from a small subset. In such cases, a discretization of the average is not reliable.
In this work, we represented local diffusion by employing a diffusion tensor along with the generalization 32 of the Stejskal-Tanner formula 33 for the signal contribution of each microdomain. The underlying assumption is that diffusion in each and every microdomain is unrestricted. This assumption 16 has been the building block of not only the microstructure models mentioned above, but also the techniques 5,34-36 developed within the multidimensional diffusion MRI framework. Introduction of the confinement tensor concept 37,38 provides a viable direction that could achieve the same by accounting for the possible restricted character of the microdomains. This is the subject of future work.
Another limitation of the present work is the assumption that there is no variation in the size of the microdomains making up the complex environment-the same assumption employed in many of the microstructure models. Previous studies 39,40 suggest the complexity of accounting for such variations in different contexts. We intend to address this issue in the future.
In conclusion, we studied the orientationally-averaged magnetic resonance signal by extending the existing expressions to cases involving general tensors with no axisymmetry. This was accomplished by evaluating a challenging average (Eq. (2)), or equivalently the integral in (3) for the special class of p(D) distributions considered in this article. Although the results are given as sums of infinitely many terms, we showed that with certain arrangements of the parameters, obtaining very accurate estimates is possible by retaining a few terms in the series. These developments led to a number of interesting inferences on the properties of the signal decay curve as well as estimation of relevant parameters from the signal.
The findings presented in this work could be useful in many contexts in which the the expression (2) (or (3)) emerges. For example, though we employed the nomenclature of diffusion MR in this paper, our findings are applicable to solid-state NMR spectroscopy as well due to the mathematical similarities of the two fields.