Fourier ptychographic reconstruction using Poisson maximum likelihood and truncated Wirtinger gradient

Fourier ptychographic microscopy (FPM) is a novel computational coherent imaging technique for high space-bandwidth product imaging. Mathematically, Fourier ptychographic (FP) reconstruction can be implemented as a phase retrieval optimization process, in which we only obtain low resolution intensity images corresponding to the sub-bands of the sample’s high resolution (HR) spatial spectrum, and aim to retrieve the complex HR spectrum. In real setups, the measurements always suffer from various degenerations such as Gaussian noise, Poisson noise, speckle noise and pupil location error, which would largely degrade the reconstruction. To efficiently address these degenerations, we propose a novel FP reconstruction method under a gradient descent optimization framework in this paper. The technique utilizes Poisson maximum likelihood for better signal modeling, and truncated Wirtinger gradient for effective error removal. Results on both simulated data and real data captured using our laser-illuminated FPM setup show that the proposed method outperforms other state-of-the-art algorithms. Also, we have released our source code for non-commercial use.


Introduction
Fourier ptychographic microscopy (FPM) is a novel computational coherent imaging technique for high space-bandwidth product (SBP) imaging. 1,2 his technique sequentially illuminates the sample with different incident angles, and correspondingly captures a set of low-resolution (LR) images of the sample.Assuming that the incident light is a plane wave and the imaging system is a low-pass filter, the LR images captured under different incident angles correspond to different spatial spectrum bands of the sample, as shown in Fig. 1.By stitching these spectrum bands together in Fourier space, a large field-of-view (FOV) and high resolution (HR) image of the sample can be obtained.As a reference, the synthetic NA of the FPM setup in Ref. 1 is ∼0.5, and the FOV can reach ∼120 mm 2 , which greatly improves the throughput of the existing microscope.FPM has been widely applied in 3D imaging, 3,4 fluorescence imaging, 5,6 mobile microscope, 7,8 and high-speed in vitro imaging. 9athematically, Fourier ptychographic (FP) reconstruction can be implemented as a typical phase retrieval optimization process, which needs to recover a complex function given the intensity measurements of its linear transforms.Specifically, we only obtain the LR intensity images corresponding to the sub-bands of the sample's HR spatial spectrum, and the reconstruction is to retrieve the complex HR spatial spectrum.Conventional FPM 1,2 utilizes the alternating projection (AP) algorithm, 10,11 which adds constraints alternatively in spatial space (captured intensity images) and Fourier space (pupil function), to stitch the LR sub-spectra together.AP is easy to implement and fast to converge, but is sensitive to measurement noise and system errors.To tackle measurement noise, Bian et al. 12 proposed a novel method termed Wirtinger flow optimization for Fourier Ptychography (WFP), which uses gradient descent scheme and Wirtinger calculus 13 to minimize the intensity errors between estimated LR images and measurements.WFP is robust to Gaussian noise, and can produce better reconstruction results in low-exposure imaging scenarios and thus largely decrease image acquisition time.However, it needs careful initialization since the optimization is non-convex.Based on the semidefinite programming (SDP) convex optimization for phase retrieval, 14,15 Horstmeyer et al. model FP reconstruction as a convex optimization problem. 16The method guarantees a global optimum, but converges slow which makes it impractical in real applications.Recently, Yeh et al. 17 tested different objective functions (intensity based, amplitude based and Poisson maximum likelihood) under the gradient-descent optimization scheme for FP reconstruction.The results show that the amplitude-based and Poisson maximum-likelihood objective functions produce better results than the intensity-based objective function.To address the LED misalignment, the authors also add a simulated annealing algorithm into each iteration to search for optimal pupil locations.Although the above methods offer various options for FP reconstruction, they have their own limitations.AP and WFP are limited to adress Gaussian noise, and cannot handle speckle noise well (as shown in following experiments) which is common when the light source is highly spatially and temporally coherent (such as laser). 18The Poisson Wirtinger Fourier ptychographic reconstruction (PWFP) technique mentioned in Ref. 17 performs better than other methods, but it still obtains aberrant reconstruction results with measurements corrupted with Gaussian noise, and needs much more running time for the incorporated simulated annealing algorithm to deal with LED misalignment.
In this paper, we propose a novel FP reconstruction method termed as truncated Poisson Wirtinger Fourier ptychographic reconstruction (TPWFP), to efficiently handle the above mentioned measurement noise and pupil location error that are common in FPM setups.The technique incorporates Poisson maximum likelihood objective function and truncated Wirtinger gradient 19 together into a gradient-descent optimization framework.The advantages of TPWFP lie in three aspects: • The utilized Poisson maximum-likelihood objective function is more appropriate to describe the Poisson characteristic of the photon detection by an optical sensor in real imaging systems, and thus can obtain better results in real applications.
• Truncated gradient is used to prevent outliers from degrading the reconstruction, which provides better descent directions and enhanced robustness to various sources of error such as Gaussian noise and pupil location error.
• There is no matrix lifting and global searching for optimization, resulting in faster convergence and less computational requirement.
To demonstrate the effectiveness of TPWFP, we test it against the aforementioned algorithms on both simulated data and real data captured using a laser FPM setup. 18Both simulations and real experiments show that TPWFP outperforms other state-of-the-art algorithms in imaging scenarios involving Poisson noise, Gaussian noise, speckle noise and pupil location error.

Methods
As stated before, TPWFP incorporates Poisson maximum likelihood objective function and truncated Wirtinger gradient together into a gradient descent optimization framework for FP reconstruction.Next, we begin to introduce this technique in detail.

Image formation of FPM
FPM is a coherent imaging system.Under the assumption that the light incident on a sample is a plane wave, we can describe the light field transmitted from the sample as φ (x, y)e ( jx 2π λ sin θ x , jy 2π λ sin θ y ) , where φ is the sample's complex spatial map, (x, y) are the 2D spatial coordinates, j is the imaginary unit, λ is the wavelength of illumination, and θ x and θ y are the incident angles as shown in Fig. 1.Then the light field is Fourier transformed to the pupil plane when it travels through the objective lens, and subsequently low-pass filtered by the aperture.This process can be represented as P(k x , k y )F (φ (x, y)e ( jx 2π λ sin θ x , jy 2π λ sin θ y ) ), where P(k x , k y ) is the pupil function for low-pass filtering, (k x , k y ) are the 2D spatial frequency coordinates in the pupil plane, and F is the Fourier transform operator.Afterwards, the light field is Fourier transformed again when it passes through the tube lens to the imaging sensor.Since real imaging sensors can only capture light's intensity, the image formation of FPM follows: (1) where I is the captured image, F −1 is the inverse Fourier transform operator, and Φ is the spatial spectrum of the sample.
Visual explanation of the image formation process is diagrammed in Fig. 1 Because F −1 is linear and ) is a linear operation of low-pass filtering to the HR spatial spectrum with the pupil function, we rewrite the above image formation of FPM as where b ∈ R m is the ideal captured images (all the captured I under different angular illumination in vector form), A ∈ C m×n is the corresponding linear transform matrix incorporating Fourier transforms and the low-pass filter, and z ∈ C n is the sample's HR spectrum (Φ in vector form).This is a standard phase retrieval problem, 26 where b and A are known, and z is what we aim to recover.

Poisson maximum-ikelihood objective function
Here we assume that the detected photons at each detector unit follow Poisson distribution in real setups, which is consistent with the independent nature of random individual photon arrivals at the imaging sensor. 21Note that although the Poisson distribution approaches a Gaussian distribution for large photon counts according to the central limit theorem, most of the captured images in FPM are dark-field images captured under oblique illuminations and thus are more consistent with Poisson distribution.In a nutshell, the signal's probability model can be represented as where b i = |a i z| 2 is the i th latent signal in b, and a i is the i th row of A. Thus, for c i , its probability mass function is where e is the Euler's number, and c i ! is the factorial of c i .
Based on the maximum-likelihood estimation theory, assuming that the measurement are independent from each other, the reconstruction turns into maximizing the global probability of all the measurements ( Taking a logarithm of the objective function yields Since c 1,••• ,m are the experimental measurements and are thus constant in the optimization process, we can omit the last item in L (namely ∑ m i=1 log(c i !)) for optimization.Then by replacing b i with |a i z| 2 , we obtain the objective function of TPWFP as

Truncated Wirtinger gradient
As stated before, we use the gradient-descent optimization scheme.Based on the Wirtinger calculus, 13 we obtain the gradient of L(z) as where a H i is the transposed-conjugate matrix of a i .To prevent optimization degeneration from measurement noise, we add a thresholding operation to ∇L(z) before using it to update z in each iteration.Similar to Ref., 19 the thresholding constraint is defined as Here a h is a predetermined parameter specified by users, |c i − |a i z| 2 | is the difference between the i th measurement and its reconstruction, is the mean of all the differences, and |a i z| 2 ||z|| is the relative value of the latent signal.Intuitively, the thresholding indicates that if one measurement is far from the reconstruction, it is labeled as an outlier and omitted in subsequent optimization.Note that the thresholding is signal dependent, which is beneficial for accurate detection of outliers.
Thus, for each index i, its corresponding measurement can be used in Eqs.(8) only when it meets the thresholding criterion in Eq. ( 9).In the following, we use ξ to denote the index set that meets the thresholding constraint, and represent the corresponding truncated Wirtinger gradient as Note that the index set ξ is iteration-variant, meaning that in each iteration we update ξ according to the measurements c and the updated z.This adaptively provides us with better descent direction.
In the gradient descent scheme, we update z in the kth iteration as where µ is the gradient descent step that is predetermined manually.Here we utilize a setting similar to Ref. 12 as where we set k 0 = 330 and µ max = 0.1.This type of gradient-descent step is widely used, since it allows the gradient descent step to grow gradually, and offers an adaptive way for better convergence. 13,19 sed on the above derivations, we summarize the proposed TPWFP algorithm in Alg. 1.For the initialization z (0) , similar to Ref., 12 we set z (0) as the spatial spectrum of the up-sampled version of the LR image captured under the normal incident light.According to Ref., 19 the computation complexity of such an optimization algorithm is O(mn log 1 ε ), where m is the number of measurements, n is the number of signal entries, and ε is the relative reconstruction error defined in Eq. ( 13).This is much lower than WFP's computation complexity which is O(mn 2 log 1 ε ).Note that the source code of TPWFP is available at http://www.sites.google.com/site/lihengbianfor non-commercial use.

Results
In this section, we test the proposed TPWFP and other state-of-the-art algorithms including AP, WFP and PWFP on both simulated and real captured data, to show pros and cons of each method.

Quantitative metric
To quantitatively evaluate the reconstruction quality, we utilize the relative error (RE) 19 metric defined as This metric describes the difference between two complex functions z and ẑ.We use it here to compare reconstructed HR spatial spectrum with groundtruth in the simulation experiments.

Parameters
In the simulation experiments, we simulate the FPM setup with its hardware parameters as follows: the numerical aperture (NA) of the objective lens is 0.08, and the corresponding pupil function is an ideal binary function (all ones inside the NA circle and all zeros outside); the height from the LED plane to the sample plane is 84.8mm; the distance between adjacent LEDs is 4mm, and 15 × 15 LEDs are used to provide a synthetic NA of ∼0.5; the wavelength of incident light is 625nm; and the pixel size of captured images is 0.2um.Besides, we use the 'Lena' and the 'Aerial' image (512 × 512 pixels) from the USC-SIPI image database 20 as the latent HR amplitude and phase map, respectively.The captured LR images' pixel numbers are set to be one tenth of the HR image along both dimensions, and are synthesized based on the image formation in Eq. 2. We repeat 20 times for each of the following simulation experiments and average their evaluations to produce the final results.

5/10
As for the algorithms' parameters, an important parameter of TPWFP is the thresholding a h in Eq. 9. To choose appropriate a h , we test different a h on the simulated data corrupted with white Gaussian noise, and study its influence on the final reconstruction quality.The results are shown in Fig. 2, from which we can see that both too small or too big of a h result in worse reconstruction.This is determined by the nature of the utilized truncated gradient.When a h is too small, more informative measurements are incorrectly labeled as outliers and thus contribute nothing to final reconstruction, resulting in blurred reconstructed images as shown in Fig. 2. When a h is too big, measurement noise and system errors are not effectively removed from reconstructed images.To sum up, we choose an appropriate assignment for a h as a h = 25 in the following experiments for TPWFP.
Another parameter for all the algorithms is the iteration number.For AP, we set its iteration number to be 100 which is enough for its convergence. 1For WFP, its iteration number is set to be 1000. 12From Fig. 2 we can see that 200 iterations are enough for TPWFP to converge.Since PWFP is a particular form of TPWFP when a h = ∞ (no thresholding to the gradient), we set the same iteration number for PWFP.

Simulation experiments
First, we test the four algorithms (AP, WFP, PWFP and TPWFP) on simulated captured images corrupted with Poisson noise, Gaussian noise and speckle noise, respectively, which are the most common noise in real imaging setups.The first two kinds are caused by photoelectric effect and dark current, 21 while speckle noise are caused by the light's spatial-temporal coherence of the illumination source (such as laser).
The results are shown in Fig. 3.Note that the standard deviation (std) is the ratio between actual std and the maximum of the ideal measurements b.Besides, we use the model c = b(1 + n) for speckle noise simulation, where n is uniformly distributed random zero mean noise.From the results, we can see that under small Gaussian noise, WFP outperforms the other three methods.This is because WFP assumes global uniform noise for both low and high spatial frequencies, which is consistent with the corrupted noise model.Instead, PWFP and TPWFP assume that noise would be smaller for lower intensities (especially for LR images correspond to high spatial frequencies).Thus, they cannot remove enough noise for these spatial frequencies and produce worse results.However, when noise grows to around std 2 × 10 −3 , TPWFP get better results than WFP.This is because when noise is large, WFP cannot extract useful information from the noisy data, while TPWFP directly omits these measurements to avoid their negative influence on final reconstruction.For Poisson noise and speckle noise, while both PWFP and TPWFP obtains better results than the other methods, TPWFP is little advantageous than PWFP.This is because for these kinds of signal dependent noise, it is hard for the truncated gradient to correctly distinguish noise from latent signals.
Then we apply the four algorithms on the simulated data corrupted with pupil location error, which is common in real setups due to LED misalignment or unexpected system errors.We simulate pupil location error by adding Gaussian noise to the incident wave vectors of each LED.The reconstruction results are shown in Fig. 4. From the results we can see that TPWFP outperforms state-of-the-arts a lot.This benefits from the nature of the utilized truncated gradient.In the thresholding operation (Eq.( 9)), if one measurement (spatial space) is far from reconstruction due to pupil location error, we omit this measurement which represents misaligned information.Thus, we prevent the pupil location error from degenerating final reconstruction.We display the running time of each method under current algorithm settings in Table 1.All the four methods are implemented using Matlab on an Intel Xeon 2.4 GHz CPU computer, with 8G RAM and 64 bit Windows 7 system.From the table we can see that both PWFP and TPWFP save more running time than WFP due to faster convergence, 19 but they are still more time consuming than AP.Note that TPWFP consumes more time than PWFP, which is caused by the additional thresholding and truncation operation to the gradient.

Real experiment
To further validate the robustness of TPWFP to the above measurement noise and system errors, we run the four algorithms on two real captured datasets including USAF target and red blood cell sample using a laser FPM setup. 18The red blood cell sample is prepared on a microscope slide stained with Hema 3 stain set (Wright-Giemsa).The setup consists of a 4 f microscope system with a 4× 0.1 NA objective lens (Olympus), a 200 mm focal-length tube lens (Thorlabs), and a 16-bit sCMOS camera (PCO.edge5.5).The system is fitted with a circular array of 95 mirror elements providing illumination NA of 0.325, resulting in the total synthetic NA of 0.425.A 1W laser of 457 nm wavelength is used for the illumination source, which is pinhole-filtered, collimated and guided to a pair of Galvo mirrors (Thorlabs GVS212) to be directed to individual mirror

Discussion
In this paper, we propose a novel reconstruction method for FPM termed as TPWFP, which utilizes Poisson maximum likelihood objective function and truncated Wirtinger gradient for optimization under a gradient descent framework.Results on both simulated data and real data captured using our laser FPM setup show that the proposed method outperforms other TPWFP can be widely extended.First, the pupil function updating procedure of the EPRY-FPM algorithm 22 can be incorporated into TPWFP to obtain corrected pupil function and better reconstruction.Besides, since the linear transform matrix A can be composed of any kinds of linear operations (Fourier transform and low-pass filtering in FPM), TPWFP can be applied in various linear optical imaging systems for phase retrieval, such as conventional ptychography, 23 multiplexed FP 24,25 and fluorescence FP. 5 Also, since TPWFP is much more robust to pupil location error than other methods, it may find wide applications in other imaging schemes where precise calibrations are unavailable.
In spite of advantageous performance and wide applications, the limitations of TPWFP lie in two aspects.First, it is still time consuming compared to conventional AP, though it is much faster than WFP.Second, it is non-convex.Although choosing

Figure 1 .
Figure 1.The FPM system and its image formation.

Figure 2 .
Figure 2. Reconstruction quality (relative error between reconstructed HR spatial spectrum and groundtruth) under different settings of a h in TPWFP.The experiment is conducted on the simulated dataset corrupted with white Gaussian noise (std = 2 × 10 −3 ).

Figure 3 .
Figure 3.Comparison of the reconstruction results by the three state-of-the-arts and the proposed TPWFP under Poisson noise, Gaussian noise and speckle noise.

Figure 4 .Figure 5 .
Figure 4. Reconstruction results by the three state-of-the-arts and the proposed TPWFP under pupil location error.

Table 1 .
Comparison of running time between state-of-the-arts and the proposed TPWFP.