High-precision speckle-tracking X-ray imaging with adaptive subset size choices

Speckle-tracking imaging has the advantages of simple setup and high-sensitivity to slowly varying phase gradients. Subset size choice is regarded as a trade-off problem for speckle-tracking X-ray imaging where one needs to balance the spatial resolution and accuracy, where the subset was defined as the region of interest of windowing choice for digital image correlation algorithm. An adaptive subset size choice method based on a Fourier transform for effectively detecting sample phase information without foreknowledge of the sample structure is presented in this study. The speckle-tracking phase-contrast and the form of dark-field imaging based on this method have the advantages of (i) high resolution and time saving compared to large subset choice and (ii) partially improvement the influence from experimental noises, background fluctuations, and false signals compared to small subset choice at the same time. This method has proven to be particularly robust in the experimental condition of poor signal-to-noise ratio. The proposed method may be expanded to all speckle-based imaging methods and other imaging techniques based on the subset or window matching.

www.nature.com/scientificreports/ One idea for solving this trade-off problem of subset size choice is to design an adaptive subset to suit different positions in a speckle pattern. This idea was first proposed in signal or image processing. An adaptive Fourier transform was used to optimize the bias-to-variance trade-off for time-varying signals with a minimal mean squared error 17 . In three-dimensional surface shape measurement, the windowed Fourier transform and wavelet transform methods were used to extract phase information from a single-fringe pattern 18 . An iterative adaptive cross-correlation was used to improve the accuracy of the DIC for wavefront sensing with high subpixel accuracy 19,20 . This technique focused on the improvement of the shift estimate accuracy by fitting the peaks of the cross-correlation or the phase slopes with the variation of the subset size, but it did not build a relationship between the subset size and the sample and it had a large associated computation time.
In this study, we present a novel method based on Fourier analysis for speckle-tracking X-ray imaging. The use of an adaptive subset size choice (ASC) proves to be effective in improving the resolution, decreasing background fluctuations and false signals, and reducing the required computation time.

Methods and simulations
In the X-ray speckle-tracking imaging mode, the superposition of the interference of the weak scattered beams from a diffuser and stronger transmitted beams produced a speckle intensity distribution on the detection plane. The sandpaper can be described as consisting of many grains. Based on the intensity extractions f o and f s obtained from two speckle patterns, with and without a sample, and using the two-dimensional DIC method 10 , the shifts of the maximum cross-correlation coefficient (MCCC) ε = argmax(f 0 ⊗ f s ) in both directions can be obtained. These shifts are proportional to the respective phase gradients. The symbol ⊗ marks all correlation operations.
The f o and f s have different expressions for different cross-correlation calculations. The MCCC was shown to have a relationship to the dark-field signal with different expression forms 7,21 , which is reduced by small-angle scattering from the sample structure. The cross-correlation criterion has many representations, of which the zero-normalized cross-correlation (ZNCC) or the zero-normalized sum of the squared differences (ZNSSD) are considered the most robust and reliable approaches as a two-dimensional digital image correlation method 14 , and they are widely used at present. For this criterion, f o and f s can be expressed as follows: where I 0 and I s are the intensity patterns without and with a sample, the overline and the symbol Δ indicate the mean value and the sum of squares of the deviation from the mean of the intensity patterns, respectively, and w is the subset window function. Here, the subset window function is the rectangle function w(x,z) = rect(x/M, z/N), which is unity for |x| ≤ 1/2 and |z| ≤ 1/2 but zero otherwise, where M × N defines the subset size. In previous studies, equal values of M and N were chosen for the side lengths of the subsets.
In order to study the criterion for the subset choice, we designed a simulator to producing speckle patterns. The radii of the grain sizes and the positions of the grains satisfied two normal distributions. In this simulation, the mean radius of the grain size of the sandpaper was 0.25 μm and the standard deviations of the radii and the positions in two dimensions were 30% and 50% of the mean grain size of the sandpaper, respectively, based on the estimation of real sandpaper. The horizontal phase gradient of the sample was designed to be triangular, as is shown in Fig. 1a. The propagation process was simulated by a Fresnel approximation of the diffraction integral 22 at an energy of 17 keV. The speckle patterns with 220 × 220 pixels with and without a sample were recorded 1.6 m downstream of the sample with a pixel size of 0.1 μm.
In our simulation, the side length values of the subset, M and N pixels, were kept equal. Figure 1 demonstrates the phase gradients along both directions which were calculated with different subset size choices. When the subset size is smaller than five pixels, that is, the average size of a grain, all results have serious oscillations. Our previous study 23 presents a more detailed explanation of the relationship between grain size of sandpaper and subsets. As the subset size increases, the oscillation of the horizontal phase gradient becomes smaller and smaller, approaching the real phase gradient. The root-mean-square (RMS) error reduces and approaches a constant of 11% when the subset size is larger than 23 pixels. The vertical phase gradient includes high-frequency step signals. When the subset size is around 7-9 pixels, the edges of the sample can be distinguished. With an increase of the subset size, the edge signals broaden and shift into the sample zone. In this case, the resolution decreases and it is hard to distinguish the correct sample edges. Compared to the horizontal phase gradient with its relatively low spatial frequency, the effective subset size choice for the high-frequency vertical phase gradient is only in a very narrow range.
By considering the process of the DIC algorithm, we clearly understand the reason why the phenomenon in the simulation occurs. For any window centre (x′, z′), in order to accurately search out the maximum value c max in a cross-correlation map c(x, z, x′, z′) = f 0 ⊗ f s , the ideal case of the function c has a sharp main peak (Dirac function in theory), which is similar to the Fig. 2d. The locations and the number of satellite peaks in the cross-correlation map depend on the distribution of the sandpaper grains since the grains exist as markers in the speckle tracking imaging technique. If there are two or more peaks with similar intensities in the cross-correlation map due to the problem of parameter selection and the sample itself, the algorithm may fail and incorrect shift information may be found, as is shown in Fig. 2a,e. The result is often characterized as a false or discontinuous signal in the imaging. This phenomenon is very common in high-frequency signal detection while using an unsuitable subset size, because high-frequency regions and low-frequency regions relate to very different phase shifts in a subset region and they display significant competition for searching out a unique maximum ε using the DIC algorithm. An excessive subset dilutes the weight of the high-frequency components in the algorithm and produces an over-wide weak single peak (Fig. 2g) or even finds an incorrect main peak (Fig. 2h). In Fig. 2h, this incorrect main peak is at (0, 0) which means that algorithm judges that this place is out of the sample. This is the reason www.nature.com/scientificreports/  www.nature.com/scientificreports/ that the edge shifts into the sample using an excessive subset as is shown in Fig. 1b. A subset that is too small causes itself to not have enough features as a marker so that it is difficult to distinguish and it is easily influenced by optical blurs and noises, as is shown in Fig. 2a, e. The demonstration of searching process for MCCC in the cross-correlation map with different subset choices inside and at the edge of the sample is shown in the Fig. 2.
In summary, for the interior of sample, large subset choice produces single sharp peak so that searches out the right shift, but at the edge of sample, too small (Fig. 2e) and too large (Fig. 2h) subset choices both cannot search out the right peak position (Fig. 2f). It is clear that the subset choice is related to the local spatial frequency of sample. Thus, we need to build a relationship between the cross-correlation coefficient and the spatial frequency of the sample. Based on the Fourier transform properties for correlation theorems, a two-dimensional Fourier transform of the cross-correlation coefficient of two patterns can be expressed as follows: where * is the conjugate operator, and C and F are the Fourier transforms of c and f. If we assume a shift of the MCCC ε = (x 0 ,z 0 ), according to the Fourier shift property: Thus, the shift of the MCCC ε can be directly related to the phase angle: For a set of unique shifts (x 0 ,z 0 ), the phase angle is a plane function. However, this ideal situation only happens in the translation of the same pattern. When introducing a sample, the effects on the speckle pattern vary dramatically at different frequencies. Normally the DIC algorithm only judges the macroscopic level, namely, the lowest-frequency shifts in the subset region. Based on the abovementioned discussion, a successful algorithm corresponds to a set of significantly optimal shifts. It is reflected in the phase angle is that there is remarkable monotonicity and even linearity in the low-frequency region. Figure 3 demonstrates the phase angles along the vertical direction at the different positions relative to the sample. The slopes of fitting curves the lowest-frequency region from − 0.4 to 0.4 μm −1 , correspond to the shift of the MCCC. If there are two peaks or several peaks having similar intensities in the correlation map, the lowest spatial frequency region of the phase angle shows a jagged and disorderly shape. The shifts x 0 and z 0 are directly proportional to the slopes in the corresponding directions.
For a region without a sample or without a phase change, since the slope corresponding to the shift is small enough and random, the values of the phase angle in the entire medium and the low-frequency region are very small, as is shown the curve of "no sample" in Fig. 3. Assuming that the sampling of the spatial frequency in any direction is L, criterion 1 φ 1 < 2πα/L can be used to judge this kind of region. In this formula, α is a phase gradient sensitivity factor, typically 0.5-3, depending on the signal-to-noise ratio (SNR) and accuracy of the sub-pixel algorithm 14 like 0.01-0.1 pixel. High algorithm accuracy and high SNR ensure a small α. In general, a relatively small value can avoid the loss of sample signal. When φ 1 is smaller than the valueα, the side length of subset can be set to the given largest value which includes at least 4-5 grains to pursuit high accuracy or the shifts in this area can even be set to zero directly and the DIC algorithm can be skipped to save calculation time. As the shift increases, the phase angle becomes steeper. In a given low-frequency region, by judging whether the phase angle has stable monotony and satisfies linearity along one direction, we can judge whether the selected subset size is www.nature.com/scientificreports/ reasonable. A reasonable subset ensures that the central signal within the subset can dominate the optimization of the cross-correlation coefficient, whether it is a slowly varying signal or a high-frequency signal. Normally, this reasonable subset size has a range of values dependent on the spatial frequency. Based on the simulation, for a high-frequency signal, the reasonable subset is smaller, whereas the relatively large subsets have to be selected to reduce the RMS error to the theoretical phase gradient for a slowly varying signal. A given value of the fitted slope or the local extreme φ 2 < 2πβ/L (criterion 2) in the low-frequency region can be used to distinguish between a high-frequency signal and a slowly varying signal, where β is a threshold factor corresponding to the shift of 0.5-1 pixel, empirically 5-15, depending on the SNR. Lower SNR corresponds to a greater β value. Although the factors α and β need to be tuned, the setting of these two factors is not very sensitive, but slightly affects the accuracy of the results. As can be seen in Fig. 3, the threshold can distinguish between two different slopes of − 0.32 and − 0.20, corresponding to the edge and interior of the sample. For the high-frequency signal, the initial subset is the given smallest subset, and then the subset size increases until the phase angles satisfies the monotony and linearity conditions. Based on sampling theorem, the smallest size should have been larger than five pixels. However, the optimal subset size for slowly varying signals is derived from a given largest subset when the criterion 2 is triggered. Phase unwrapping is used if the wrapped phase angle is found in the low-frequency region. In order to satisfy the continuity of the two spatial frequency bands, the median of the subset size range is set as the finishing point of the optimizations, where the median means the average of the given smallest and largest subset sizes. Since the finishing points of the criterions of these two spatial frequency bands are same, the deviation of the value β has little effect on the reconstruction result based on a large number of simulations. Figure 4 presents the basic procedure for searching out an optimal subset size. The entire ASC procedure can automatically implement before the DIC algorithm based on the given threshold conditions by estimating the SNR in an experiment. Figure 5 shows the comparison of the criteria of the fixed subset sizes and the variable subset sizes. Figure 5j presents the simulated speckle pattern with a sample. Figure 5k,l show the subset sizes based on ASC in both directions, which clearly outline the phase changes of the simulated sample. The subset size is obviously related to the spatial frequency of the phase gradients. The phase-contrast images based on our subset choice shown in Fig. 5g,h improved the accuracy and smoothness of the slowly varying signals compared to the results with the smallest subsets (Fig. 5a, b). They also completely overcame the situation of the shift of the high-frequency edge signals while using a large subset, as shown in the vertical direction in Fig. 5d,e. Figure 6 indicate the phase gradients calculated using ASC method and their theoretical curves along both directions. It is clear that the ASC method makes that the slow varying and high-frequency signals get good reconstructions at the same time. Table 1 compares the RMS errors for horizontal slowly varying signals and the edge deviations for vertical highfrequency signals based on different subset size choices. Compared to the SSSIG criterion, our criterion based on a Fourier transform has better sensitivity for distinguishing between a sample and sandpaper grain, and it saves significant computation time. The ASC based entirely on actual sample information causes the subset choice and the algorithm to no longer have periodicity, which also brings other corresponding advantages and disadvantages. www.nature.com/scientificreports/ One advantage of this is that the false fringes from the algorithm disappear in Fig. 5g,h. The disadvantage of this is that the continuity of the dark-field image, as is shown in Fig. 5i, becomes partly weakened because the variable subset size choice produces additional calculation fluctuations for the high-sensitivity correlation coefficients. We also simulated the influences of the Gaussian noises with the signal-to-noise ratio (SNR) from 5 to 100. As is shown in Fig. 7a, the algorithm using small subset suffers serious disturbance from noises and reconstruction RMS error keeps large even if the SNR reaches 40. As a contrast, the algorithm using the large subset almost is immune to the noises as long as the SNR is greater than 10. The high level of robustness with respect to noise for the algorithm using ASC is in between and can be accepted when the SNR is greater than 25, and decreases to the level of large subset when the SNR is greater than 40. For high-frequency edge signal, the influences of noises are similar using different subset choices. As is shown in Fig. 7b, as long as the SNR is greater than 20, the edge deviation is in agreement with the results without noises in Table 1. As we all know, the SNR is an important factor that affects the imaging quality. The robustness with respect to noise using ASC is a significant advantage over using a fixed small subset in the condition of relatively poor SNR.

experiments
We also measured some actual samples to prove our techniques. The experiments using polymethyl methacrylate (PMMA) spheres with a diameter of 1.05 mm and a JIMA tungsten resolution target (RT) (line grating structures in both orthogonal directions) with the thickness of 1.0 μm were performed at the BL13W imaging beamline and the BL09B measurement beamline, respectively. The sample, SiC sandpaper, and detector were arranged in an unfocused X-ray beam path. The detection system was a microscope objective lens system (Optique Peter) with a given magnification coupled to a CMOS camera (Hamamatsu). The main parameter details can be found in Table 2. Based on the SNRs in the experiments, the sensitive factor α and the threshold factor β were 0.5 and 5 for the PMMA and 2.5 and 8 for the resolution target.   www.nature.com/scientificreports/ The measurement of RT performing at the bending magnet beamline is different. The weak flux with a long time exposure introduced more background noises with the poor SNR of 21. Based on the simulations in Fig. 7, this level of SNR may seriously affect the result of small subsets. Figure 8 show speckle-tracking images of the resolution target. The experiment verifies the simulated results. Compared to PMMA spheres, the resolution target has a much finer line grating structure of ~ 15 μm. Using a small subset size, such as 7 × 7 or 9 × 9, the line structures cannot be clearly distinguished from background fluctuations, as are shown in Fig. 9a-c. As can be seen in Fig. 9k, at a range of pixels of 400-800, the fluctuations are random but the sharp phase gradient peaks obviously confuse the real line grating structures of the resolution target at the range of pixels of 100-400. Some false peaks at pixels 160 and 220 also can be found. These fluctuations and false peaks can be eliminated by using large subset size. We also tried to use a denoising method reported in reference 24 to correct the detector noise and beam fluctuations. Although the result (hollow square curve) is slightly improved, the fluctuations cannot be eliminated significantly. However, a large subset with the side length of 16.25 μm spans two adjacent line structures, which causes the phase gradient signal to broaden (Fig. 9d-f) and reduces the resolution and contrast. The advantage of the ASC method is better reflected in the imaging of this sample. The side length of subset clearly sketches the line grating structures (Fig. 9j). As are shown in Fig. 9g-i, a clear line grating structure  www.nature.com/scientificreports/ www.nature.com/scientificreports/ with 12 periods can be shown and it significantly improves the CNR. In this measurement with the condition of poor SNR, a fixed subset choice performs very poorly even if using a denoising technique. An adaptive subset choice can reconstruct the sample information without additional denoising technique.