Power-law relationship in the long-tailed sections of proton dose distributions

The halo portion of a proton therapy dose creates a long tail in proton dose distributions, but so far study of this phenomenon has been limited. We used statistical methods and mathematical models to confirm that the long-tailed portion of proton dose distributions exhibits a power-law relationship. By analyzing 299 measured dose profiles, we found that all proton lateral dose distributions had a significant power-law scaling correlation with a high correlation coefficient in the tail. We set up a dual-mechanism model, containing both direct and indirect impact mechanisms. In the direct impact mechanism, the proximal dose deposition is mainly due to the direct impact of a proton on a particle. In the indirect mechanism, the impact of a proton on a given particle is considered in terms of the proton’s impact on a neighboring particle that then impacts the given particle. We found that the indirect impact mechanism led to a tail in the distribution exhibiting a power-law relationship because the probability of the indirect impacts was proportional to the distance; i.e., the longer the distance, the larger the indirect impact probability. Upon analyzing the experimental data, we observed that the power-law exponent increased with proton energy.

falls outside of the Gaussian pattern of the Fermi-Eyges theory 13 . The halo creates a long tail in proton lateral dose distributions. Because most dose distributions have been described on the basis of the Molière theory 16,17 , which leads to Gaussian distribution, the long-tailed portion of the distribution created by the halo has been fitted with empirical models such as two Gaussians with Cauchy-Lorentz function 9 or three Gaussians 18 . The empirical models have shown partial success in fitting the data, but the mechanism leading to the long-tailed dose distribution remains unclear.
Many long-tailed distributions in a variety of natural and man-made phenomena follow a "power-law" distribution. A power-law distribution in empirical data is a striking feature that has attracted considerable attention 19 . The power-law can relate microcosmic theory to observed macroscopic phenomena. Although power-law distributions have been reported in areas ranging from ecology and molecular biology to finance and the Internet [20][21][22][23] , to the best of our knowledge, the power law has not been used to describe the long-tailed proton lateral dose distributions. In the current study, we sought to determine whether the long-tailed proton lateral dose distribution follows the power-law, and if so, whether this phenomenon can be explained by similar mechanisms leading to long-tailed distributions in other research areas. Determining whether the long-tailed proton dose distribution follows the power law will help elucidate the mechanisms of halo formation and thus improve dose calculation. Figure 1 plots the distance in log scale and lateral dose for 221.8-MeV protons at 5-cm depth in water. The dots show all of the measurement points. We defined the distance from the point with maximum R 2 when fitting to the power law to the origin as DMO. The straight line in Fig. 1 starting from DMO indicates a strong power-law relationship with R 2 = 0.994. Figure 2 shows the results of the power-law correlation between lateral distance and lateral dose for 72.5 MeV, 102.4 MeV, 143.2 MeV, and 221.8 MeV at multiple depths, including the measurement points starting from the maximum R 2 point. The graph shows that all profiles have significant power-law scaling correlations with a high correlation coefficient.

Results
We calculated the correlation parameters for all profiles (Table 1). All profiles had significant power-law correlations with P < 0.01. The exponent parameter (slope) was −4.41 ± 0.58 with a 95% confidence interval of −4.64 to −4.19. The square of the correlation coefficient was 0.98 ± 0.021, indicating a strong linear correlation between log-transformed dose and distance. The P value and the square of the correlation coefficient indicated that the tail of the profile was strongly consistent with the power-law relationship. These results suggest that the lateral dose decreased 2 4.41 times each time the lateral distance doubled, staring from the profile data point with the maximum correlation coefficient.
As can be seen from Table 1, different energy ranges had different parameters. Figure 3 shows the power-law exponent plotted against 94 energies and Fig. 4 shows the DMO plotted against the energies. The power-law exponent increased with energy and DMO decreased with increasing energy.
The model was built as described in the Methods. Equation 12 shows our model's probable prediction for the impact of a proton, including the direct and indirect mechanisms. Equation 15 indicates that once a particle's number of impacts (i) is large enough, further impacts of the particle accelerate growth because there are so many links that can impact particle A, showing the characteristics of power law. The power-law exponent is −(1 + 1/λ). The tipping point where R indirect overtakes R direct occurs at ( ) We fit the parameters of λ on the basis of the model and exponent obtained from the statistics. Table 2 shows the fitting parameter values of λ for multiple energy ranges. The data shown in Table 2 indicate that λ increases with energy. Owing to the physical properties of reaction between the proton and matter, the greater the proton energy, the greater the number of indirect impacts and the larger the value of λ, which is consistent with the trend of value obtained from the statistical data and model. In addition, the higher the energy, the larger the power-law exponent (Fig. 3).
As the energy becomes larger, the λ value becomes larger and the tipping point becomes smaller according to equation 8. The tipping point position obtained from the model was consistent with the trend of DMO obtained from the statistical data (Fig. 4). For the energies between 72.5 and 79.9 MeV, the DMO was about 2.5 cm, whereas for energies between 210 and 221.8 MeV, the DMO was about 0.96 cm. For lower energies (70.9-72.5 MeV), the core and halo distribution overlapped at most distances of interest, and the long-tailed section of the distribution (halo) was not considered important. However, the long tail effect could reach 10 cm even though the DMO occurs at 0.96 cm for high energies (210-221.8 MeV), and it is important to correctly model long-tail effects.

Methods and Materials
A power-law relationship usually means that some physical properties or probability distributions satisfy the following formula: The constant α is called the exponent or scaling parameter (slope of the log-log regression line) of the power law.  In the current study, we measured the proton lateral dose distribution (the core and halo) using an ionization chamber in a water tank. In the clinical application of proton therapy, the lateral dose distribution is defined as the "profile. " We used the lateral distance (r) in the profile as the independent variable (x) and the lateral dose (d) as the dependent variable (y). Because profile data on the positive and negative sides of the x-axis are symmetric, the positive axis is used for testing. The following sections describe the data collection and statistical methods used to establish our model.

Profile measurement. All profiles were measured at The University of Texas MD Anderson Cancer Center
Proton Therapy Center using the spot scanning delivery system, which has been used to treat patients since May 2008 24 . The proton therapy system uses a synchrotron and the Hitachi ProBeat delivery System (Hitachi, Ltd., Tokyo, Japan) with three gantries and two fixed beamlines. The spot scanning beam delivery system offers 94 selectable energies ranging from 72.5 to 221.8 MeV, corresponding to penetrations ranging from 4.0 to 30.6 cm in depth.
We used an ionization chamber to measure the profiles, as described by Sawakuchi et al. 25 and Anand et al. 26 . Profiles were measured using a small cylindrical ionization chamber (type 31014, PTW-Freiburg, Freiburg, Germany) in an automated water tank (MP3 Phantom Tank, type L981010, PTW-Freiburg) for all scanning proton energies. The ionization chamber has a sensitive volume with a 0.1-cm radius and 0.5-cm height. The measuring range included extended lateral distances from the central axis at multiple depths, covering low-dose halo regions, which are a factor of 10 4 lower than the central axis dose.   In the current study, all profile measurements were determined in terms of relative dose. A total of 299 profiles were measured. Two other methods (optically stimulate luminescent detector and Gafchromic EBT film) of measurement were used to verify the reliability of the ionization chamber measurements.

Regression line fitting.
To simplify the process of estimating the relationship, we used log transformation.
The x and y variables are log-transformed, so equation (1) can be expressed as: where Y = log(y), X = log(x), and C = log(c). We can determine whether variables fit to equation (1) by checking whether the log-transformed variables are linearly related. There are two different types of reported power laws: power-law probability distributions 23 and bivariate power law 27 . For the power-law probability distributions, the regression line fitting method should not be used because regression between x and p(x) (probability density) does not produce a well-defined probability model of x 28 . For the bivariate power law, the least-squares linear regression method is suitable to fit the power-law distribution. Ordinary least squares (OLS), major axis, and standardized major axis 27 are the most common methods used to define a line of best fit for a bivariate relationship. The fitting method used for the line differs depending on whether the measurement error is considered. OLS frequently requires the assumption that X is measured without error, whereas the major axis and standardized major axis methods assume that X contains measurement error 29 . The major axis method is also symmetric, meaning that a single line defines the bivariate relationship regardless of which variable is X and which is Y, whereas the OLS method is asymmetric. Because the dose profile data are noise-free in the x-axis and given that the relationship between distance and dose is asymmetric, OLS was used.
In the OLS regression technique, the line is fitted to minimize the sum of squares of residuals from the line.
where Y is the fitted value of Y. This allows us to test for an association between X and Y. The key statistical parameter is the p value. The strength of the relationship between X and Y also needs to be determined. A suitable statistic to answer this question is the square of the correlation coefficient, R 2 . We used an R package (SMATR) and in-house script to fit the profile data. Details about the use of these statistics and about the underlying formulas and their validation can be found in Warton et al. 27 .
In our previous observation, we test all measurement points of the profiles. We found that the power-law relationship applied only to values in the long tail of the profiles. In statistical analysis, we focused on the data in the long-tailed section of the profiles (representing the halo). We applied an algorithm to select the profile data. First, we tested each x as the first value of the dataset to find the maximum R 2 value and recorded the distance from the point with the maximum R 2 to the origin. Then, the dataset was made up from the x value with the maximum R 2 value. Thus, we found the dataset that best matched the power-law relationship in the profiles.

Model building.
We developed a model (focused on the power-law relationship observed in the long-tailed portion of the profiles) aiming to address two points of interest to us. First, we examined proton dose deposition to identify the mechanism behind the power-law phenomenon that occurred in the profiles. Second, we wanted to explain why the power-law parameters for profiles non-universal constants were, varying from one energy to another.
We established a two-mechanism model, comprising the direct mechanism of impact and indirect mechanism of impact. This two-mechanism model is similar to the scientific citation model described in Peterson et al. 30 . Figure 5 shows a schematic drawing of the two mechanisms model for Proton depositing dose in the materials. Considering a particle A in the material, if the particle is in the range Θ as shown in the circle, the main interaction of the particle A to the proton is the direct impact. This mechanism is shown at the left panel of the Fig. 5. However, if the particle is outside of the range Θ, the indirect interaction mechanism will be dominant.
As shown in the right panel of the Fig. 5, the proton first impact a particle in the range, the particle B than impact particle A located at position r. As shown by the wiggle line of Fig. 5, the indirect impact from B to A is through a chain reaction. This chain reaction is important for us to understand the power law behavior. If we compare the probability of receiving indirect impacts for the particle in position at r + 1 and r, the particle at position r + 1 (longer distance) will have larger probability of receiving indirect impact than the particle at position r (shorter distance). Once the particle at position r receiving an indirect impact, it will trigger an indirect impact to particle at position r + 1. This means that particle at longer distance will receive more indirect impacts simply because there are more indirect reaction channels. This mechanism is essentially "Richer (longer distance) get richer (more indirect impacts)".
Assuming that the proton enters the medium (water) from the zero point, the dose is generated by the impact of the proton on particles. At a given time point, a certain number of particles is already at each position on the x-axis. The number of particles is m, which is large. Let n be the average number of impacts per particle with other particles. Hence the total number of impacts is mn. Let i be the number of impacts that a particle has received. For example, a particle that has received no impacts from other particles has i = 0. Some particles have received many impacts, until the particle energy is converted into dose.
At each time step, an incident proton enters. Now we focus on an particle A and discuss the effect of the incident proton on A. The probability that this new proton will randomly impact A is is called the direct impact mechanism. Particle A can also be subject to an indirect impact mechanism, in which the incident proton impacts other particles, which then impact A. Given the physical properties of the reaction between the proton and the particle, any particle having a direct impact on A must be within a certain range Θ around A. If a particle A is outside the range Θ, the particles B in Θ are first impacted, then B impact A. Of course, a particle can pass through multiple impacts before impacting A. This process is called the indirect impact mechanism.
Let particle B ∈ Θ; in the indirect impact mechanism, the incident proton impacts B (either directly or indirectly) and B then impacts A. If the probability that the new proton impacts B is 1 /m and the probability that B impacts A is (1 /m) × (1/n), then the probability that the incident proton impacts A with the indirect impact mechanism is Given that this proton has impacted A, let λ be the indirect impact probability; the direct impact probability is then 1 − λ. Assuming that A has now received i impacts, then the number of impacts on particle A from an incident proton is I(i). I(i) can be computed as follows through either the direct or indirect impact mechanism: Next, we calculate the impact distribution p(i), the fraction of the particles that receive i incoming impacts. The number of particles having i impacts is m × p(i) and the number of particles with i − 1 impacts is m × p(i − 1). The number of particles with i impacts increases every time a particle with i − 1 impacts receives another impact and decreases every time a particle with i impacts receives another impact.
When a proton inters, we get where we define There is no inflow from a lesser bin for p(0). We define 1 as the inflow coming from a incident proton at each time step. So, Next, we explain the relationship between i and distance (r), p(i), and dose [d(r)]. Considering that the dose at position r is determined by the number of impacts which particle at position r receives and energy transfer per impacts. In this work, we approximate the same energy will be deposited at position r whenever there is an indirect impact at position r. This is an approximation. We only approximate same energy transfer for the indirect impacts. By using this approximation, we can derive the probability distribution of indirect impacts following the similar method used by Peterson et al. Therefore, i and r are proportional and the dose [d(r)] is proportional to p(i).
Thus, d can be expressed as the following, with a coefficient (a) according to equation 15:

Discussion
The power law has special mathematical properties and can be the result of interesting endogenous processes such as network effects, multiple reactions effects, or feedback loops. A power-law distribution is often considered evidence of the underlying processes constructing the dynamics for these systems 31,32 . Many studies across disciplines have examined the power law, but so far none have used the power law to explain the halo phenomenon in proton dose deposition. In the current study, we used statistical methods to test the measurement data and establish a mechanistic model to describe the power-law relationship between proton lateral dose distribution and the lateral distance, especially at the long tail of the profile. The high correlation coefficient uncovered in our analysis indicates that the relationship between the lateral distance and dose in the halo area follows the power law.
The derivation of our model has some similarities with that of Peterson et al., who examined power-law relationships in scientific citations 30 . One key difference is that we applied power law to a natural physical phenomenon. Our model shows that most particles impact at short distances and are converted into dose, but some particles always have more impact and move a longer distance. After the particles move to a certain tipping point, the indirect impact mechanism rapidly increases, causing the particles to move a longer distance to deposit energy, forming the halo.
Our observation that the long-tailed section of a proton dose distribution explained through a simple indirect reaction mechanism, rather than through the first principle, is a novel finding. As we know, the halo consists of charged secondary particles, many but not all of them protons, from elastic interactions with H, elastic and inelastic interactions with O, and non-elastic interactions with O 15 . Our findings suggest that the net effects of these complex physical reactions lead to a very simple power-law distribution. The power-law interpretation of physics data can also bring some insight of the social phenomena such as scientific paper citation. One interesting result in this work is that high-energy proton will have a longer tail compared to low-energy proton. The "high-energy proton" in the current study is analogous to Peterson et al. 's 30 "highly cited group" of scientific citations, which also exhibited a longer tail than the group with a lower H-index.
In all proton dose calculation models, the functions describing the off-axis lateral distribution are Gaussian. Some studies have used one Gaussian and some studies have used two or more to handle large-angle scatters 9 . For example, the pencil beam distribution was modeled using three Gaussian functions in our treatment planning system (Eclipse 8.9; Varian Medical Systems, Alta Palo, CA) 18 . During the clinical commissioning of the scanning beam, we previously observed that the measured dose of the treatment planning system at multiple depths in the square field cannot be accurately predicted as a function of field size. The difference was found at the medium depth of the high-energy beam 33  of various correction factors, the halo phenomenon was not described very well. The power-law relationship may provide a novel and accurate alternative model of proton dose distribution. Figure 6 show that low-dose halos in single-spot profiles of 221.8 MeV could be adequately modeled with the addition of the power-law function to a double-Gaussian function. This figure shows the lateral profiles of individual scanning spots modeled at various depths using the double-Gaussian functions and power-law function (solid lines) and double-Gaussian functions only (dashed lines). The measured lateral profiles at different depths are indicated by the brown circle (depths: 2 cm), blue square (depths: 10 cm) and red triangle (depths: 20 cm) markers, respectively. The lateral profiles modeled with the two Gaussian components and power-law component showed better agreement with the measured results compared with the model using the two Gaussian components only. However, this model with power-law function needs further study.