Abstract
Phase recovery (PR) refers to calculating the phase of a light field from its intensity measurements. As exemplified by applications ranging from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and for correcting the aberrations of an imaging system. In recent years, deep learning (DL), often implemented through deep neural networks, has provided unprecedented support for computational imaging, leading to more efficient solutions for various PR problems. In this review, we first briefly introduce conventional methods for PR. Then, we review how DL supports PR at three stages, namely pre-processing, in-processing, and post-processing. We also review how DL is used in phase image processing. Finally, we summarize the work on DL for PR and provide an outlook on how to better use DL to improve the reliability and efficiency of PR. Furthermore, we present a live-updating resource (https://github.com/kqwang/phaserecovery) for readers to learn more about PR.
Introduction
Light, as an electromagnetic wave, has two essential attributes: amplitude and phase^{1}. Optical detectors, usually relying on photon-to-electron conversion (such as charge-coupled device sensors and the human eye), measure the intensity, which is proportional to the square of the amplitude of the light field and in turn relates to the transmittance or reflectance distribution of the sample (Fig. 1a, b). However, they cannot capture the phase of the light field because of their limited sampling frequency^{2}.
In many application scenarios, however, it is the phase rather than the amplitude of the light field that carries the primary information about the sample^{3,4,5,6}. For quantitative structural determination of transparent and weakly scattering samples^{3} (Fig. 1c), the phase delay is proportional to the sample’s thickness or refractive index (RI) distribution, which is critically important for bioimaging because most living cells are transparent. For quantitative characterization of an aberrated wavefront^{5} (Fig. 1d, e), the phase aberration caused by atmospheric turbulence with an inhomogeneous RI distribution in the light path is the main quantity of interest for adaptive aberration correction. Also, for quantitative measurement of a surface profile^{6} (Fig. 1f), the phase delay is proportional to the surface height of the sample, which is very useful in material inspection.
Since the phase delay across the wavefront is necessary for the above applications, but optical detection devices can only perceive and record the amplitude of the light field, how can we recover the desired phase? Fortunately, as the light field propagates, the phase delay also causes changes in the amplitude distribution; therefore, we can record the amplitude of the propagated light field and then calculate the corresponding phase. This operation goes by different names depending on the application domain: it is called quantitative phase imaging (QPI) in biomedicine^{3}; phase retrieval in coherent diffraction imaging (CDI)^{4}, the most common term in X-ray optics and non-optical analogs such as electron and other particle imaging; and wavefront sensing in adaptive optics (AO)^{5} for astronomy and optical communications. Here, we collectively refer to calculating the phase of a light field from its intensity measurements as phase recovery (PR).
As is common in inverse problems, calculating the phase directly from an intensity measurement after propagation is usually ill-posed^{7}. Suppose the complex field at the sensor plane is known. We can then directly calculate the complex field at the sample plane using numerical propagation^{8} (Fig. 2a). In reality, however, the sensor records only the intensity and loses the phase; moreover, the measurement is necessarily sampled by pixels of finite area. Because of these complications, the complex field at the sample plane generally cannot be calculated in a straightforward manner (Fig. 2b).
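To make the notion of numerical propagation concrete, the following minimal NumPy sketch (our own illustration, not code from the works cited here) propagates a complex field by the angular spectrum method; the function name and the sampling parameters are our own choices:

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dz, dx):
    """Propagate a 2-D complex field by a distance dz using the angular
    spectrum method. dx is the pixel pitch; wavelength, dz, and dx must
    share the same units."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=dx)  # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    # Transfer function for propagating waves; evanescent components are cut
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * dz) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

Propagating forward by dz and then backward by −dz returns the original field (within the propagating band), which is exactly the backward numerical propagation invoked in Fig. 2a.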
We can transform phase recovery into a well-posed/deterministic problem by introducing extra information, such as holography or interferometry at the expense of introducing a reference wave^{8,9}, Shack–Hartmann wavefront sensing, which introduces a microlens array at the conjugate plane^{10,11}, or the transport of intensity equation, which requires multiple through-focus amplitudes^{12,13}. Alternatively, we can solve this ill-posed phase recovery problem iteratively by optimization, i.e., so-called phase retrieval, such as the Gerchberg–Saxton–Fienup algorithms^{14,15,16}, the multi-height algorithm^{17,18,19}, real-space ptychography^{20,21,22}, and Fourier ptychography^{23,24}. Next, we introduce these classical phase recovery methods in more detail.
Holography/interferometry
Interfering the unknown wavefront with a known reference wave converts the phase difference between the object and reference waves into the intensity of the resulting hologram/interferogram, owing to the alternating constructive and destructive interference of the two waves across their fronts. This enables direct calculation of the phase from the hologram^{8}.
In in-line holography, where the object beam and the reference beam travel along the same optical axis, the four-step phase-shifting algorithm is commonly used for phase recovery (Fig. 3)^{25}. First, the complex field of the object wave at the sensor plane is calculated from the four phase-shifting holograms. Next, the complex field at the sample plane is obtained through numerical propagation. Then, by applying the arctangent function to the final complex field, a phase map in the range (−π, π] is obtained, i.e., the so-called wrapped phase. The final sample phase is obtained after phase unwrapping. Other multiple-step phase-shifting algorithms can also be used for phase recovery^{26}. Spatial light interference microscopy (SLIM), a well-known QPI method, combines the phase-shifting algorithm with phase-contrast microscopy for phase recovery of transparent samples^{27}.
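As a sketch of the first step, the wrapped phase can be read directly off the four holograms. The snippet below (our own illustration) assumes the holograms follow I_k = A + B·cos(φ + δ_k) with reference shifts δ_k = 0, π/2, π, 3π/2; sign conventions vary between setups, so the order of the differences may need to be swapped in practice:

```python
import numpy as np

def four_step_phase(I0, I1, I2, I3):
    """Wrapped phase in (-pi, pi] from four phase-shifting holograms
    recorded with reference shifts 0, pi/2, pi, and 3*pi/2.
    The background A and modulation B cancel in the two differences:
      I0 - I2 = 2B cos(phi),  I3 - I1 = 2B sin(phi)."""
    return np.arctan2(I3 - I1, I0 - I2)
```

Equivalently, (I0 − I2) + i(I3 − I1) gives the complex object wave (up to a real scale factor), which can then be numerically propagated to the sample plane.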
In off-axis holography, where the reference beam is slightly tilted from the optical axis, the phase is modulated onto a carrier frequency and can be recovered through spatial spectral filtering with only one holographic measurement (Fig. 4)^{28}. By appropriately designing the carrier frequency, the baseband that contains the reference beam can be well separated from the object beam. After transforming the measured hologram into the spatial frequency domain through a Fourier transform (FT), one can select the +1st or −1st order and move it to the baseband. By applying an inverse FT, the object beam is recovered. One has to be careful, however, not to exceed the Nyquist limit of the camera as the angle between the reference and object beams increases. Moreover, as only a small part of the spatial spectrum is used for phase recovery, off-axis holography typically wastes much of the spatial bandwidth product of the system. To enhance the utilization of the spatial bandwidth product, the Kramers–Kronig relation and other iterative algorithms have recently been applied to off-axis holography^{29,30,31}.
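The spectral-filtering pipeline (FT, crop one order, re-center, inverse FT) can be sketched in a few lines of NumPy. This is our own illustration; the carrier position, the square window, and its radius are design parameters of the particular setup:

```python
import numpy as np

def off_axis_reconstruct(hologram, carrier, radius):
    """Recover the complex object wave from one off-axis hologram.
    carrier: (row, col) pixel offset of the selected order in the centered FFT;
    radius: half-width of the square window cropped around that order."""
    F = np.fft.fftshift(np.fft.fft2(hologram))
    cy, cx = np.array(hologram.shape) // 2
    py, px = cy + carrier[0], cx + carrier[1]
    # Crop the selected order and move it to the baseband (demodulation)
    win = F[py - radius:py + radius, px - radius:px + radius]
    G = np.zeros_like(F)
    G[cy - radius:cy + radius, cx - radius:cx + radius] = win
    return np.fft.ifft2(np.fft.ifftshift(G))
```

The window radius directly expresses the spatial-bandwidth trade-off discussed above: only the spectrum inside the window survives the reconstruction.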
Both the in-line and off-axis holography discussed above are lensless, where the sensor and sample planes are not mutually conjugate. Therefore, backward numerical propagation from the former to the latter is necessary. Numerical propagation can be omitted if additional imaging components are added to conjugate the sensor and sample planes, as in digital holographic microscopy^{32}.
Shack–Hartmann wavefront sensing
If we can obtain the horizontal and vertical phase gradients of a wavefront by some means, then the phase can be recovered by integrating the gradients in these orthogonal directions. The Shack–Hartmann wavefront sensor^{10,11} is a classic way to do so from the perspective of geometric optics. It usually consists of a microlens array and an image sensor located at its focal plane (Fig. 5). The phase gradient of the wavefront at the surface of each microlens is calculated linearly from the displacement of the focal spot on the focal plane, in both the horizontal and vertical (x-axis and y-axis) directions. The phase can then be computed by integrating the gradient at each point, with a resolution that depends on the density of the microlens array. In addition, quantitative differential interference contrast microscopy^{33}, quantitative differential phase contrast microscopy^{34}, and quadriwave lateral shearing interferometry^{35} also recover the phase from its gradients, and may achieve higher resolution than the Shack–Hartmann wavefront sensor.
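The integration step can be implemented, for example, as an FFT-based least-squares solve. The sketch below is our own illustration, assuming periodic boundaries and slope maps already interpolated onto a regular grid; it recovers the phase up to an unknown constant (piston):

```python
import numpy as np

def integrate_gradients(gx, gy, dx=1.0):
    """Least-squares integration of a gradient field (gx, gy) via the FFT,
    the same math used to rebuild a wavefront from Shack-Hartmann slopes.
    Returns the phase with zero mean (the piston term is unrecoverable)."""
    ny, nx = gx.shape
    kx = 2j * np.pi * np.fft.fftfreq(nx, d=dx)[None, :]
    ky = 2j * np.pi * np.fft.fftfreq(ny, d=dx)[:, None]
    denom = kx ** 2 + ky ** 2     # (i k)^2 terms; negative real values
    denom[0, 0] = 1.0             # avoid division by zero at DC
    phi_hat = (kx * np.fft.fft2(gx) + ky * np.fft.fft2(gy)) / denom
    phi_hat[0, 0] = 0.0           # set the undetermined mean to zero
    return np.fft.ifft2(phi_hat).real
```

For band-limited periodic data this inversion is exact; real Shack–Hartmann pipelines add boundary handling and noise weighting on top of this core step.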
Transport of intensity equation
For a light field, the wavefront determines the axial variation of the intensity along the direction of propagation. Specifically, there is a quantitative relationship between the gradient and curvature of the phase and the axial derivative of the intensity, the so-called transport of intensity equation (TIE)^{12}. This relationship has an elegant analogy to fluid mechanics, approximating the light intensity as the density of a compressible fluid and the phase gradient as the lateral pressure field^{36}. The TIE can be derived from three different perspectives: the Helmholtz equation in the paraxial approximation, and Fresnel diffraction and the Poynting theorem in the paraxial and weak-defocusing approximations^{13}. The gradient and curvature of the phase together determine the wavefront shape, whose normal vector is parallel to the wavevector at each point of the wavefront, and consequently to the direction of energy propagation. In turn, variations in the lateral energy flux result in axial variations of the intensity. The convergence of light by a convex lens is an intuitive example (Fig. 6): the wavefront in front of the convex lens is a plane, whose wavevector is parallel to the direction of propagation. As such, the intensity distribution on different planes is constant; that is, the axial variation of the intensity is zero. The convex lens then reshapes the wavefront so that all wavevectors point toward the focal point; therefore, as the light propagates, the intensity distribution becomes denser and denser, meaning that the intensity varies in the axial direction (equivalently, its axial derivative is not zero).
Since there is a quantitative relationship between the gradient and curvature of the phase and the axial derivative of the intensity, we can exploit it for phase recovery (Fig. 7). By shifting the sensor axially, intensity maps at different defocus distances are recorded; these approximate the axial derivative by finite differences, from which the phase is calculated through the TIE. Owing to the imaging optics, the sensor and sample planes are conjugate. The TIE can also be used in lensless systems to recover the phase at the defocus plane, which then requires an additional numerical propagation^{13}.
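In the special case of uniform in-focus intensity I₀, the TIE reduces to a Poisson equation, ∇²φ = −(k/I₀)·∂I/∂z, which can be inverted with an FFT. The sketch below is our own illustration of this reduced case (periodic boundaries assumed, arbitrary consistent units); general TIE solvers handle non-uniform intensity and regularize the low frequencies:

```python
import numpy as np

def tie_phase_uniform(dIdz, k, I0, dx=1.0):
    """Solve the TIE for the phase assuming uniform in-focus intensity I0:
    laplacian(phi) = -(k / I0) * dI/dz, inverted with an FFT Poisson solver.
    dIdz is the measured axial intensity derivative; k is the wavenumber."""
    ny, nx = dIdz.shape
    fx = np.fft.fftfreq(nx, d=dx)[None, :]
    fy = np.fft.fftfreq(ny, d=dx)[:, None]
    lap = -(2 * np.pi) ** 2 * (fx ** 2 + fy ** 2)  # Fourier symbol of the Laplacian
    lap[0, 0] = 1.0                                 # avoid 0/0 at DC
    phi_hat = np.fft.fft2(-(k / I0) * dIdz) / lap
    phi_hat[0, 0] = 0.0                             # mean phase is undetermined
    return np.fft.ifft2(phi_hat).real
```

The division by the Laplacian symbol makes the noted low-frequency weakness explicit: near zero frequency the denominator vanishes, so slowly varying phase is poorly conditioned.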
It is worth noting that the TIE is applicable to both fully and partially coherent light sources, and the resulting phase is continuous and does not require phase unwrapping; however, it is only valid under the paraxial and weak-defocusing approximations^{13}.
Phase retrieval
If one prefers not to introduce extra information, then calculating the phase directly from a propagated intensity measurement is an ill-posed problem. We can overcome this difficulty by incorporating prior knowledge, also known as regularization. In the Gerchberg–Saxton (GS) algorithm^{14}, the intensities recorded at the sample plane and the far-field sensor plane are used as constraints. A complex field is projected forward and backward between these two planes using the Fourier transform and iteratively constrained by the intensities; the resulting complex field gradually approaches a solution (Fig. 8a). Fienup replaced the intensity constraint at the sample plane with an aperture (support region) constraint, so that the sensor only needs to record one intensity map, resulting in the error reduction (ER) algorithm and the hybrid input-output (HIO) algorithm (Fig. 8b)^{15,16}. In addition to the aperture constraint, one can introduce other physical constraints such as histogram^{37}, atomicity^{38}, and absorption^{39} to reduce the ill-posedness of phase retrieval. Furthermore, many types of sparsity priors, such as those in the spatial domain^{40}, gradient domain^{41,42}, and wavelet domain^{43}, are effective regularizers for phase retrieval.
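The GS iteration can be sketched as follows, assuming the measured amplitudes (square roots of the recorded intensities) at the two planes are given; the random initialization and fixed iteration count are our own choices for illustration:

```python
import numpy as np

def gerchberg_saxton(amp_sample, amp_far, n_iter=200, seed=0):
    """Gerchberg-Saxton: find a phase consistent with the measured amplitudes
    at the sample plane (amp_sample) and the far-field plane (amp_far)."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, amp_sample.shape)  # random start
    field = amp_sample * np.exp(1j * phase)
    for _ in range(n_iter):
        F = np.fft.fft2(field)
        F = amp_far * np.exp(1j * np.angle(F))      # impose far-field amplitude
        field = np.fft.ifft2(F)
        field = amp_sample * np.exp(1j * np.angle(field))  # impose sample amplitude
    return np.angle(field)
```

Each pass keeps the current phase estimate but replaces the amplitude with the measurement, which is why the residual between the iterate and the data is non-increasing.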
Naturally, if more intensity maps are recorded by the sensor, there is more prior knowledge for regularization, further reducing the ill-posedness of the problem. By moving the sensor axially, intensity maps at different defocus distances are recorded as intensity constraints, and the complex field is then computed iteratively as in the GS algorithm (Fig. 9a); this is so-called multi-height phase retrieval^{17,18,19}. In this axial multi-intensity alternating projection method, the distance between the sample plane and the sensor plane is usually kept as small as possible, so that numerical propagation is used for projection instead of the Fourier transform. Meanwhile, with a fixed sensor position, multiple intensity maps can also be recorded by radially moving the aperture near the sample, and the complex field is then recovered iteratively as in the ER and HIO algorithms (Fig. 9b); this is so-called real-space ptychography^{20,21,22}. In this radial multi-intensity alternating projection method, adjoining aperture constraints overlap one another and expand the field of view in real space. Furthermore, angular multi-intensity alternating projection is also possible. By switching the aperture constraint from the spatial domain to the frequency domain with a lens system, multiple intensity maps with different frequency content are recorded by changing the angle of the incident light (Fig. 9c); this is so-called Fourier ptychography^{23,24}. Owing to the change of illumination angle, high-frequency information that originally exceeds the numerical aperture is recorded, expanding the Fourier bandwidth in reciprocal space. Recently, synthetic aperture ptychography^{44} was proposed to simultaneously expand the bandwidth in real space and reciprocal space, in which an extended plane wave illuminates a stationary object and a coded image sensor is translated within the far field to record data.
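As a minimal illustration of axial multi-intensity alternating projection, the sketch below (our own, with a simplified plane ordering and update schedule) cycles through a few measurement planes, replacing the propagated amplitude with the measured one at each plane:

```python
import numpy as np

def propagate(field, dz, wavelength, dx):
    """Angular-spectrum propagation of a complex field over a distance dz."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)[None, :]
    fy = np.fft.fftfreq(ny, d=dx)[:, None]
    arg = 1.0 - (wavelength * fx) ** 2 - (wavelength * fy) ** 2
    H = np.exp(2j * np.pi / wavelength * dz * np.sqrt(np.maximum(arg, 0.0)))
    return np.fft.ifft2(np.fft.fft2(field) * H * (arg > 0))

def multi_height(intensities, dzs, wavelength, dx, n_iter=20):
    """Multi-height phase retrieval sketch: intensities[k] was measured at
    distance dzs[k] from the sample. Cycle through the planes, keeping the
    propagated phase but imposing the measured amplitude at each plane."""
    field = np.sqrt(intensities[0]).astype(complex)  # start at the first plane
    for _ in range(n_iter):
        for k in range(len(dzs)):
            nxt = (k + 1) % len(dzs)
            field = propagate(field, dzs[nxt] - dzs[k], wavelength, dx)
            field = np.sqrt(intensities[nxt]) * np.exp(1j * np.angle(field))
    return propagate(field, -dzs[0], wavelength, dx)  # back to the sample plane
```

Practical implementations add support constraints, relaxation, and careful registration of the planes, but the projection structure is the one shown here.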
In addition to alternating projection, two of the most representative non-convex optimization methods are the Wirtinger flow^{45} and truncated amplitude flow^{46} algorithms. The phase retrieval problem can also be transformed into a convex one through semidefinite programming, as in the PhaseLift algorithm^{47}.
Recovery of the low-frequency phase component
As mentioned at the beginning, because the phase information of the light field is converted into amplitude variations during propagation, one can recover the phase from the recorded amplitude distribution. However, the low-frequency phase component causes smaller amplitude variations, which are difficult to detect. A more quantitative analysis can be performed through the phase transfer function^{13}, which characterizes the transfer response of phase content at different spatial frequencies for an imaging system. For holography and Shack–Hartmann wavefront sensing, owing to the interference phenomenon or the microlens array, the low-frequency phase component is converted into a fringe pattern or a focal-spot translation, which can be easily detected. For other lensless methods that recover the phase from propagated intensity maps, such as lensless TIE, the Gerchberg–Saxton–Fienup algorithms, the multi-height algorithm, and real-space ptychography with an unknown probe beam, the phase transfer function of the low-frequency component is close to zero. That is to say, a slowly varying phase gradient cannot induce sufficient intensity contrast to be detected and thus cannot be recovered by subsequent algorithms. Coded ptychography^{48} is an effective solution, in which a coded layer (such as a disorder-engineered surface^{49} or a fixed blood-cell layer^{50,51}) effectively converts phase information at different spatial frequencies into detectable distortions in the diffraction patterns. Similarly, a coded layer can also be used in the multi-height algorithm to recover slowly varying phase profiles^{52}. As for the lens-based case, such as lens-based TIE^{53,54}, Fourier ptychography^{55}, and quantitative differential phase contrast microscopy^{56}, the phase transfer function of the imaging system can be modulated by changing the illumination angle, thereby collecting more low-frequency phase information.
Deep learning (DL) for phase recovery
In recent years, as an important step towards true artificial intelligence (AI), deep learning^{57} has achieved unprecedented performance in many tasks of computer vision with the support of graphics processing units (GPUs) and large datasets. Similarly, since it was first used to solve the inverse problem in imaging in 2016^{58}, deep learning has demonstrated promising potential in the field of computational imaging^{59}. In the meantime, there is a rapidly growing interest in using deep learning for phase recovery (Fig. 10).
For the vast majority of “DL for PR” work, deep learning is implemented by training artificial neural networks (ANNs)^{60} on input-label paired datasets and then using them for inference, known as supervised learning (Fig. 11). Given its natural advantages in image processing, the convolutional neural network (CNN)^{61} is the most widely used ANN for phase recovery. Specifically, for a neural network to learn the mapping from physical quantity A to B, a large number of paired examples must be collected to form a training dataset that implicitly contains this mapping relationship (Fig. 11a). Then, the gradient of the loss function is propagated backward through the neural network, and the network parameters are updated iteratively, thus internalizing the mapping relationship (Fig. 11b). After training, the neural network is used to infer B_{x} from an unseen A_{x} (Fig. 11c). In this way, deep learning has been used in all stages of phase recovery and phase processing.
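The train-then-infer procedure of Fig. 11 can be illustrated with a deliberately tiny toy model; here the “network” is a single linear layer y = Wx trained by gradient descent on a mean-squared-error loss (all names and sizes are illustrative, not from the cited works):

```python
import numpy as np

rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 4))   # the hidden mapping A -> B to be learned
X = rng.normal(size=(256, 4))      # inputs (stand-in for intensity data)
Y = X @ W_true.T                   # labels (stand-in for ground-truth phase)

W = np.zeros((4, 4))               # untrained/initialized parameters
lr = 0.05
for _ in range(500):               # iterative parameter updates (Fig. 11b)
    pred = X @ W.T
    grad = (pred - Y).T @ X / len(X)   # gradient of 0.5*MSE w.r.t. W
    W -= lr * grad

x_new = rng.normal(size=4)         # an unseen example A_x (Fig. 11c)
y_inferred = x_new @ W.T           # inference of B_x
```

Real phase recovery networks replace the linear layer with millions of convolutional parameters and use automatic differentiation, but the loop of forward pass, loss gradient, and parameter update is the same.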
In fact, the rapid progress of deep-learning-based phase recovery has been documented in several excellent review papers. For example, Barbastathis et al.^{59} and Rivenson et al.^{62} reviewed how supervised deep learning powers phase retrieval and holographic reconstruction. Zeng et al.^{63} and Situ et al.^{64} mainly focused on the use of deep learning in digital holography and its applications. Zhou et al.^{65} and Wang et al.^{66} reviewed and compared different usage strategies of AI in phase unwrapping. Dong et al.^{67} introduced a unifying framework for various algorithms and applications from the perspective of phase retrieval and presented its advances in machine learning. Park et al.^{68} discussed AI-based QPI analysis methodologies in the context of the life sciences. In contrast, depending on where the neural network is used, we review various methods from the following four perspectives:

In the section “DL-pre-processing for phase recovery”, the neural network performs preprocessing on the intensity measurement before phase recovery, such as pixel super-resolution (Fig. 12a), noise reduction, hologram generation, and autofocusing.

In the section “DL-in-processing for phase recovery”, the neural network directly performs phase recovery (Fig. 12b) or participates in the phase recovery process together with a physical model or physics-based algorithm, in supervised or unsupervised learning modes.

In the section “DL-post-processing for phase recovery”, the neural network performs post-processing after phase recovery, such as noise reduction (Fig. 12c), resolution enhancement, aberration correction, and phase unwrapping.

In the section “Deep learning for phase processing”, the neural network uses the recovered phase for specific applications, such as segmentation (Fig. 12d), classification, and imaging modality transformation.
Finally, we summarize how to effectively use deep learning in phase recovery and look forward to potential development directions (see the section “Conclusion and outlook”). To help readers learn more about phase recovery, we present a live-updating resource (https://github.com/kqwang/phaserecovery).
DL-pre-processing for phase recovery
A summary of “DL-pre-processing for phase recovery” is presented in Table 1 and described below, covering the “Pixel super-resolution”, “Noise reduction”, “Hologram generation”, and “Autofocusing” sections.
Pixel super-resolution
A high-resolution image generally reveals more detailed information about the object of interest. Therefore, it is desirable to recover a high-resolution image from one or multiple low-resolution measurements of the same field of view, a process known as pixel super-resolution. Accordingly, from multiple subpixel-shifted low-resolution holograms, a high-resolution hologram can be recovered by pixel super-resolution algorithms^{69}. Luo et al.^{70} proposed to use the U-Net for this purpose. Compared with iterative pixel super-resolution algorithms, this deep learning method has an advantage in inference time while achieving the same level of resolution improvement, and it maintains high performance even with a reduced number of input low-resolution holograms.
After the super-resolution CNN (SRCNN) was proposed for single-image super-resolution in the field of image processing^{71}, this type of deep learning method was also applied to other optical super-resolution problems, such as bright-field microscopy^{72} and fluorescence microscopy^{73}. Similarly, this approach of inferring high-resolution images from their low-resolution versions via deep neural networks can be used for hologram pixel super-resolution before phase recovery with conventional methods (Fig. 13).
Byeon et al.^{74} first applied the SRCNN to hologram pixel super-resolution, naming it HG-SRCNN. Compared with a conventional focused-image-trained SRCNN and bicubic interpolation, this method, trained with defocused in-line holograms, infers higher-quality high-resolution holograms. Xin et al.^{75} used an improved fast SRCNN (FSRCNN) for pixel super-resolution of white-light holograms, significantly improving the identification and accuracy of three-dimensional (3D) measurement results. In addition to the improved accuracy, the inference speed of the FSRCNN is nearly ten times faster than that of the SRCNN.
Ren et al.^{76} proposed to use a CNN incorporating a residual network (ResNet) and a subpixel network (SubPixelNet) for pixel super-resolution of a single off-axis hologram. They found that, compared to the l_{1}-norm and the structural similarity index (SSIM)^{77}, the neural network trained with the l_{2}-norm as the loss function performed best. Moreover, this deep learning method reconstructs high-resolution off-axis holograms with better quality than conventional image super-resolution methods such as bicubic, bilinear, and nearest-neighbor interpolation.
Noise reduction
Most phase recovery methods, especially holography, use a coherent light source; therefore, coherent noise is unavoidable. In addition, noise can be introduced by environmental disturbances and by the recording process of the image sensor. Therefore, reducing the noise of the hologram before phase recovery is essential. Filter-based methods, such as the windowed Fourier transform (WFT)^{78}, have been widely used for hologram noise reduction, but most of these methods face a trade-off between good filtering performance and time cost.
In 2017, Zhang et al.^{79} opened the door to image denoising with a deep CNN, called DnCNN. Subsequently, the DnCNN was introduced to the field of fringe analysis for fringe pattern denoising (Fig. 14).
Yan et al.^{80} first applied the DnCNN to fringe pattern denoising, achieving higher precision around image boundaries and requiring less inference time than the WFT. Similar conclusions can be found in the work of Lin et al.^{81}. Then, inspired by FFDNet^{82}, Hao et al.^{83} downsampled the input fringe pattern into four sub-images before using the DnCNN for denoising, leading to faster inference. Furthermore, Zhou et al.^{84,85} moved this batch-denoising DnCNN into the frequency domain. Specifically, they first computed the Fourier transform of the downsampled sub-images, then used the DnCNN to reduce noise in the frequency domain, and finally applied upsampling and an inverse Fourier transform to obtain the denoised fringe pattern. According to their comparisons, this method outperforms those of Yan et al. and Hao et al. at different noise levels. Reyes-Figueroa et al.^{86} further showed that the U-Net and its improved version (V-Net) outperform the DnCNN for fringe pattern denoising, because their proposed V-Net has more channels on the outer layers than on the inner layers, retaining more detail. Given the U-Net’s outstanding mapping capability, Gurrola-Ramos et al.^{87} also improved it for fringe pattern denoising, leveraging dense blocks to reuse feature layers, local residual learning to address the vanishing gradient problem, and global residual learning to estimate the noise of the image rather than the denoised image directly. Compared with the other neural networks mentioned above, it has lower model complexity while maintaining the highest accuracy.
Hologram generation
As mentioned in the Introduction, multiple intensity maps are needed to recover the phase in many cases, such as phase-shifting holography and axial multi-intensity alternating projection. Given its excellent mapping capability, a neural network can be used to generate other relevant holograms from known ones, thus enabling phase recovery methods that require multiple holograms (Fig. 15). In this approach, the input and output usually belong to the same imaging modality with high feature similarity, so the mapping is easier for the neural network to learn. Moreover, the dataset is collected only by experimental recording or simulation, without the need to compute the ground-truth phase in advance by conventional methods.
Zhang et al.^{88,89} first proposed the idea of generating holograms from holograms before phase recovery with a conventional method (Fig. 15a). From a single hologram, three other holograms with π/2, π, and 3π/2 phase shifts were simultaneously generated by the Y-Net^{90}, and phase recovery was then implemented by the four-step phase-shifting method. The motivation for inferring holograms rather than the phase via a network is that, for different types of samples, the spatial differences between their holograms are significantly smaller than those between their phases. Accordingly, phase recovery based on hologram generation has better generalization ability than recovering the phase from holograms directly with a neural network, especially when the spatial characteristics of the phase differ considerably between the training and testing datasets^{89}. Since the phase shift between the generated holograms is equal, Yan et al.^{91} proposed to generate noise-free phase-shifting holograms using a simple end-to-end generative adversarial network (GAN) in a sequentially concatenated manner. Subsequently, for a better balance between spatial details and high-level semantic information, Zhao et al.^{92} applied the multi-stage progressive image restoration network (MPRNet)^{93} to phase-shifting hologram generation. Huang et al.^{94} and Wu et al.^{95} then extended this approach from the four-step to the three-step and two-step phase-shifting methods, respectively.
Luo et al.^{96} proposed to generate holograms with different defocus distances from one hologram via a neural network, and then achieve phase recovery by alternating projection (Fig. 15b). Similar to the work of Zhang et al.^{89}, they showed that using neural networks with less difference between the source and target domains can enhance generalization. As for multi-wavelength holography, Li et al.^{97,98} harnessed a neural network to generate a hologram at another wavelength from one or two holograms at known wavelengths, thereby realizing two-wavelength and three-wavelength holography. At the same time, Xu et al.^{99} realized one-shot two-wavelength and three-wavelength holography by generating the corresponding single-wavelength holograms from a two-wavelength or three-wavelength hologram with information crosstalk.
Autofocusing
In lensless holography, the phase of the sample plane can only be recovered if the distance between the sensor plane and the sample plane is known. Defocus distance estimation thus becomes a fundamental problem in holography, which is also known as autofocusing.
Deep learning methods for autofocusing essentially use a neural network to estimate the defocus distance from the hologram (Fig. 16), which can be treated as either a classification problem^{100,101,102,103} or a regression problem^{104,105,106,107,108,109,110}.
From the classification perspective, Pitkäaho et al.^{100} first proposed to estimate the defocus distance from the hologram with a CNN. In their scheme, the zero-order and twin-image terms need to be removed before the trained neural network classifies the holograms into different discrete defocus distances. Meanwhile, Ren et al.^{101} advocated directly using raw holograms collected at different defocus distances as the input of the neural network, and further revealed the advantages of neural networks over other machine learning algorithms for autofocusing. Soon afterward, Son et al.^{102} also verified the feasibility of classification-based autofocusing through numerical simulations. Subsequently, Couturier et al.^{103} improved the accuracy of defocus distance estimation by using a deeper CNN to categorize the defocus distance into a greater number of classes.
Nevertheless, no matter how many classes there are, the defocus distance estimated by these classification-based methods is discrete, which is still not precise enough in practice. Thus, Ren et al.^{104} further developed an approach that treats defocus distance estimation as a regression problem, where the output of the neural network is continuous. They verified the superiority of this deep-learning-based regression method with amplitude samples and phase samples, respectively, and tested its adaptability under different exposure times and incident angles. Later, Pitkäaho et al.^{105} also extended their previous classification-based work^{100} to this regression-based approach. While these methods estimate the defocus distance of the entire hologram, Jaferzadeh et al.^{106} and Moon et al.^{107} proposed to take a region of interest from the whole hologram as the input for defocus distance estimation. To remove the constraint of requiring known defocus distances as training labels, Tang et al.^{111} proposed to iteratively infer the defocus distance with an untrained network from a defocused hologram and its in-focus phase. Later, Cuenat et al.^{108} demonstrated the superiority of the Vision Transformer^{112} over typical CNNs in defocus distance estimation. Because spatial spectrum information is also helpful for defocus distance estimation^{113}, Lee et al.^{109} and Shimobaba et al.^{110} proposed to use the spatial spectrum or power spectrum of holograms as the network input to estimate the defocus distance.
DL-in-processing for phase recovery
In “DL-in-processing for phase recovery”, the neural network either directly performs the inference from the measured intensity image to the phase (see the “Network-only strategy” section) or works together with a physical model or physics-based algorithm to achieve the inference (see the “Network-with-physics strategy” section).
Networkonly strategy
The networkonly strategy uses a neural network to perform phase recovery, where the network input is the measured intensity image and the output is the phase. A summary of various methods is presented in Table 2 and described below, where we classify them into datasetdriven (DD) and physicsdriven (PD) approaches.
Datasetdriven approach
As a supervised learning mode, datasetdriven (DD) deep learning phase recovery methods presuppose a large number of paired inputlabel datasets. Usually, it is necessary to experimentally collect a significant number of intensity images (such as diffraction images or holograms) as input, and use conventional methods to calculate the corresponding phase as ground truth (Fig. 17a). The key point is that this paired dataset implicitly contains the mapping relationship from intensity to phase. Then, an untrained/initialized neural network is iteratively trained with the paired dataset as an implicit prior, where the gradient of the loss function propagates into the neural network to update the parameters (Fig. 17b). After training, the network is used as an endtoend mapping to infer the phase from intensity (Fig. 17c). Therefore, the DD approach guides/drives the training of the neural network with this implicit mapping, which is internalized into the neural network as the parameters are iteratively updated.
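The DD training stage described above can be sketched in a few lines of PyTorch; the tiny convolutional network and random tensors below are placeholders for a real U-Net-scale model and an experimentally collected intensity/phase dataset:

```python
import torch
import torch.nn as nn

# Stand-in for a U-Net-style model mapping intensity -> phase.
net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

intensity = torch.rand(8, 1, 32, 32)  # input: measured intensity images
phase_gt = torch.rand(8, 1, 32, 32)   # label: phase from a conventional method

for _ in range(5):  # a few illustrative gradient steps
    loss = nn.functional.mse_loss(net(intensity), phase_gt)
    opt.zero_grad()
    loss.backward()  # gradient propagates into the network parameters
    opt.step()
```

After enough such steps on a large paired dataset, the network serves as the end-to-end intensity-to-phase mapping of Fig. 17c.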
Sinha et al.^{114} were among the first to demonstrate this endtoend deep learning strategy for phase recovery, in which the phase of objects is inferred from corresponding diffraction images via a trained deep neural network. For dataset collection, they used a phaseonly spatial light modulator (SLM) to load different public image datasets to generate the phase as ground truth, and placed an image sensor at a certain distance to record the diffraction image as input. The advantage is that both the diffraction image and the phase are known and easily collected in large quantities. Through comparative tests, they verified the adaptability of the deep neural network to unseen types of datasets and different defocus distances. Although this scheme cannot be used in practical applications due to the use of the phaseonly spatial light modulator, their pioneering work opened the door to deeplearninginference phase recovery. Building on this work, Li et al.^{115} introduced the negative Pearson correlation coefficient (NPCC)^{116} as a loss function to train the neural network, and enhanced the spatial resolution by a factor of two by flattening the power spectral density of the training dataset. Deng et al.^{117} found that the higher the Shannon entropy of the training dataset, the stronger the generalization ability of the trained neural network. Goy et al.^{118} extended the work to phase recovery under weaklight illumination.
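The NPCC loss mentioned above has a compact differentiable form; the following sketch is an illustrative implementation (the eps term for numerical stability is our addition) that returns approximately -1 when prediction and ground truth are perfectly linearly correlated:

```python
import torch

def npcc_loss(pred, target, eps=1e-8):
    """Negative Pearson correlation coefficient between two images.
    Ranges from -1 (perfect positive correlation) to +1 (anti-correlation)."""
    p = pred - pred.mean()
    t = target - target.mean()
    return -(p * t).sum() / (torch.sqrt((p ** 2).sum() * (t ** 2).sum()) + eps)

x = torch.rand(64, 64)
loss = npcc_loss(x, 2 * x + 1)  # an affine copy is fully correlated, so loss ~ -1
```

Because NPCC is invariant to affine intensity rescaling, it rewards structural agreement rather than exact pixel values, which is why it has been found useful for training phase-recovery networks.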
Meanwhile, Wang et al.^{119} extended the diffraction device of Sinha et al.^{114} to an inline holographic device by adding a coaxial reference beam, and used the inline hologram instead of the diffraction image as the input to a neural network for phase recovery. Nguyen et al.^{120} applied this endtoend strategy for Fourier ptychography, inferring the highresolution phase from a series of lowresolution intensity images via a UNet, and Cheng et al.^{121} further used a single lowresolution intensity image under optimized illumination as the neural network input. Cherukara et al.^{122} extended this endtoend deep learning strategy to CDI, in which they trained two neural networks with simulation datasets to infer the amplitude or phase of objects from farfield diffraction intensity maps, respectively. Ren et al.^{123} demonstrated the time and accuracy superiority of this endtoend deep learning strategy over conventional numerical algorithms in the case of offaxis holography. Yin et al.^{124} introduced the cycleGAN to extend this endtoend deep learning strategy to the application scenario of unpaired datasets. Lee et al.^{125} replaced the forward generator of the cycleGAN by numerical propagation, improving the phase recovery robustness of neural networks in highly perturbative configurations. Hu et al.^{126} applied this endtoend deep learning strategy to the ShackHartmann wavefront sensor, inferring the phase directly from a spot intensity image after the microlens array. Wang et al.^{127} extended this endtoend deep learning strategy to TIE, using a trained neural network to infer the phase of the cell object from a defocus intensity image illuminated by partially coherent light. Further, Zhou et al.^{128} used neural networks to infer highresolution phase from a lowresolution defocus intensity image. 
Pirone et al.^{129} applied this hologramtophase deep learning strategy to improve the reconstruction speed of 3D optical diffraction tomography (ODT) from tens of minutes to a few seconds. Chang et al.^{130} expanded the illumination source from photons to electrons, recovering the phase images from electron diffraction patterns of twisted hexagonal boron nitride, monolayer graphene, and Au nanoparticles. Tayal et al.^{131} demonstrated the use of data augmentation and a symmetric invariant loss function to break the symmetry in the endtoend deep learning phase recovery.
In addition to expanding the application scenarios of this endtoend deep learning strategy, some researchers focused on the performance and advantages of different neural networks in phase recovery. Xue et al.^{132} applied a Bayesian neural network (BNN) to Fourier ptychography for inferring model uncertainty while doing phase recovery. Li et al.^{133} applied a GAN to phase recovery, inferring the phase from two symmetricillumination intensity images. Wang et al.^{90,134} proposed a onetomulti CNN, YNet^{90}, from which the amplitude and phase of an object can be inferred from the input intensity simultaneously. Zeng et al.^{135} introduced the capsule network to overcome information loss in the pooling operation and internal data representation of CNNs. Compared with conventional CNNs, their proposed capsulebased CNN (RedCap) saves 75% of network parameters while ensuring higher holographic reconstruction accuracy. Wu et al.^{136} applied the YNet^{90} to CDI for simultaneous inference of phase and amplitude. Huang et al.^{137} introduced a recurrent convolution module into UNet, trained using a GAN, for holographic reconstruction with autofocusing. Uelwer et al.^{138} used a cascaded neural network for endtoend phase recovery. Castaneda et al.^{139} and Jaferzadeh et al.^{140} introduced GANs into offaxis holographic reconstruction. Luo et al.^{141} added dilated convolutions into a CNN, termed mixedcontext network (MCN)^{141}, for phase recovery. By comparing in a onesamplelearning scheme, they found that MCN is more accurate and compact than the conventional UNet. Ding et al.^{142} added the Swin Transformer^{143} into UNet and trained it with lowresolution intensity as input and highresolution phase as ground truth using cycleGAN. The trained neural network can perform phase recovery while enhancing resolution and has higher accuracy than the conventional UNet.
In CDI, Ye et al.^{144} used a multilayer perceptron for feature extraction before a CNN, considering that the data in farfield (Fourier) intensity images are globally correlated. Chen et al.^{145,146} combined a spatial Fourier transform module with ResNet, termed Fourier imager network (FIN), to achieve holographic reconstruction with superior generalization to new types of samples and faster inference speed (9fold faster than their previous recurrent neural network, 27fold faster than conventional iterative algorithms). Shu et al.^{147} applied neural architecture search (NAS) to automatically optimize the network architecture for phase recovery. Compared with the conventional UNet, the peak signaltonoise ratio (PSNR) of their NASbased network increased from 34.7 dB to 36.1 dB, and the inference speed increased by 27fold.
As a similar deep learning phase recovery strategy in adaptive optics, researchers demonstrated that neural networks could be used to infer the phase of the turbulenceinduced aberration wavefront or its Zernike coefficient from the distortion intensity of target objects^{148}. In these applications, only the wavefront subsequently used for aberration correction is of interest, not the RI distribution of turbulence that produces this aberration wavefront.
Physicsdriven approach
Different from the datasetdriven approach that uses inputlabel paired dataset as an implicit prior for neural network training, physical models, such as numerical propagation, can be used as an explicit prior to guide/drive the inference or training of neural networks, termed physicsdriven (PD) approach. It only requires measurements of samples as an inputonly dataset and is therefore an unsupervised learning mode. On the one hand, this explicit prior can be used to iteratively optimize an untrained neural network to infer the corresponding phase and amplitude from the measured intensity image as input, referred to as the untrained PD (uPD) scheme (Fig. 18a). On the other hand, this explicit prior can be used to train an untrained neural network with a large number of intensity images as input, which then can infer the corresponding phase from unseen intensity images, an approach called the trained PD (tPD) scheme (Fig. 18b).
In order to more intuitively understand the difference and connection between the DD and PD approaches, let us compare the loss functions in Fig. 17 and Fig. 18:

\({L}_{DD}(\omega )=\mathop{\sum }\limits_{i=1}^{n}{\Vert {f}_{\omega }({I}_{i})-{\theta }_{i}\Vert }_{2}^{2}\) (1)

\({L}_{uPD}(\omega )={\Vert H({f}_{\omega }({I}_{x}))-{I}_{x}\Vert }_{2}^{2}\) (2)

\({L}_{tPD}(\omega )=\mathop{\sum }\limits_{i=1}^{n}{\Vert H({f}_{\omega }({I}_{i}))-{I}_{i}\Vert }_{2}^{2}\) (3)
where \({\Vert \cdot \Vert }_{2}^{2}\) denotes the square of the l_{2}norm (or other distance functions), \({f}_{\omega }(\cdot )\) is a neural network with trainable parameters \(\omega\), \(H(\cdot )\) is a physical model (such as numerical propagation, Fourier transform, or Fourier ptychography measurement model), \({I}_{i}\) is the measured intensity image in the training dataset, \({\theta }_{i}\) is the phase in the training dataset, \({I}_{x}\) is the measured intensity image of a test sample, and \(n\) is the number of samples in the training dataset. In Eq. (1) for the DD approach, the priors used for network training are the measured intensity image and the corresponding groundtruth phase. Meanwhile, in Eqs. (2) and (3) for the PD approaches, the priors used for network inference or training are the measured intensity image and the physical model, instead of the phase. It should be noted that the uPD scheme does not require numerous intensity images as a prerequisite but needs numerous iterations for each inference, whereas the tPD scheme completes each inference in a single pass through the trained neural network but requires a large number of intensity images for pretraining.
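The practical difference between Eq. (1) and Eqs. (2)/(3) can be seen numerically; in this sketch the forward model H (far-field intensity of a phase-only object) and the random "network output" are illustrative assumptions:

```python
import numpy as np

def H(phase):
    """Toy forward model: far-field intensity of a phase-only object."""
    return np.abs(np.fft.fft2(np.exp(1j * phase))) ** 2

theta = np.random.rand(32, 32)   # ground-truth phase
I_meas = H(theta)                # measured intensity
pred = np.random.rand(32, 32)    # stand-in for a network output f_w(I)

# Eq. (1): supervised DD loss -- needs the phase label theta.
loss_dd = np.mean((pred - theta) ** 2)
# Eqs. (2)/(3): PD loss -- needs only the measurement I_meas and the model H.
loss_pd = np.mean((H(pred) - I_meas) ** 2)
```

Note that the PD loss is exactly zero when the prediction equals the true phase, so minimizing it can drive the network toward the correct phase without ever seeing a phase label.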
This PD approach was first implemented in the work on Fourier ptychography by Boominathan et al.^{149}. For the case of high overlap, they proposed both a scheme that directly uses an untrained neural network for inference (uPD) and a scheme that trains first and then infers (tPD), and demonstrated the former by simulation.
For the uPD scheme, Wang et al.^{150} used a UNetbased scheme to iteratively infer the phase of a phaseonly object from a measured diffraction image whose defocus distance is known. Their method demonstrates higher accuracy than conventional algorithms (such as GS and TIE) and the DD scheme, at the expense of a longer inference time (about 10 minutes for an input with 256 × 256 pixels). Zhang et al.^{151} extended this work to the case where the defocus distance is unknown by including it as another unknown parameter together with the phase to the loss function. Yang et al.^{152,153} found that after expanding the tested sample from phaseonly to complexamplitude, obvious artifacts and noise appeared in the recovered results. Therefore, they proposed to add an aperture constraint into the loss function to reduce the illposedness of the problem. Regarding the timeliness, they pointed out that it would cost as much as 600 hours to infer 3,600 diffraction images with this uPD scheme. Meanwhile, Bai et al.^{154} extended this from a singlewavelength case to a dualwavelength case. Galande et al.^{155} found that this way of neural network optimization with a singlemeasurement intensity input lacks information diversity and can easily lead to overfitting of the noise, which can be mitigated by introducing an explicit denoiser. It is worth pointing out that this way of using the objectrelated intensity image as the neural network input makes it possible to internalize the mapping relationship between intensity and phase into the neural network through pretraining. In addition, some researchers proposed to make adjustments to the uPD scheme, using the initial phase and amplitude recovered by backward numerical propagation as the neural network input^{156,157,158}, which reduces the burden on the neural network to obtain higher inference accuracy.
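A minimal uPD loop might look as follows, assuming a toy Fourier-intensity forward model and a tiny stand-in network; real implementations use numerical propagation, U-Net-scale models, and thousands of iterations:

```python
import torch
import torch.nn as nn

def H(phase):
    """Toy far-field intensity forward model (an assumption for illustration)."""
    return torch.abs(torch.fft.fft2(torch.exp(1j * phase))) ** 2

I_meas = H(torch.rand(1, 1, 16, 16))  # a single simulated measurement
I_in = I_meas / I_meas.max()          # normalized network input

net = nn.Sequential(  # stand-in for a U-Net
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(20):  # real uses need thousands of iterations
    pred_phase = net(I_in)
    # Physics-driven loss of Eq. (2): no ground-truth phase involved.
    loss = torch.mean((H(pred_phase) - I_meas) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the network input is the object-related intensity itself, the same loop can also be run over many measurements to pretrain the network, which is precisely the tPD scheme discussed next.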
Although the phase can be inferred from the measured intensity image through an untrained neural network without any ground truth, the uPD scheme inevitably requires a large number of iterations, which excludes its use in many dynamic applications. Therefore, to adapt the PD scheme to dynamic inference, Yang et al.^{152,153} adjusted their previously proposed uPD scheme to the tPD scheme by pretraining the neural network using a small part of the measured diffraction images, and then using the pretrained neural network to infer the remaining ones. Yao et al.^{159} trained a 3D version of the YNet^{90} with simulated diffraction images as input, and then used the pretrained neural network for direct inference or iterative refinement, which is 100 and 10 times faster than conventional iterative algorithms, respectively. Li et al.^{160} proposed a twotoone neural network to reconstruct the complex field from two axially displaced diffraction images. They used 500 simulated diffraction images to pretrain the neural network, and then inferred an unseen diffraction image by refining the pretrained neural network for 100 iterations. Bouchama et al.^{161} further extended the tPD scheme to Fourier ptychography of low overlap cases by simulated datasets. Different from the above ways of generating training datasets from natural images or real experiments, Huang et al.^{162} proposed to generate holograms as training datasets from randomly synthesized artificial images with no connection or resemblance to realworld samples. They further trained a neural network with the generated holograms and the tPD scheme, which showed superior external generalization to holograms of real tissues with arbitrarily defocus distances. It is worth mentioning that the PD strategy can also be used in computergenerated holography, generating the corresponding hologram from the target phase or amplitude via a physicsdriven neural network^{163,164}.
Networkwithphysics strategy
Different from the networkonly strategy, in the networkwithphysics strategy, either the physical model and neural network are connected in series for phase recovery (physicsconnectnetwork, PcN), or the neural network is integrated into a physicsbased algorithm for phase recovery (networkinphysics, NiP), or the physical model or physicsbased algorithm is integrated into a neural network for phase recovery (physicsinnetwork, PiN). A summary of the networkwithphysics strategy is presented in Table 3 and is described below.
Physicsconnectnetwork (PcN)
In this scheme, the role of the neural network is to extract and separate the pure phase from the initial estimate that may suffer from spatial artifacts or low resolution, which allows the neural network to perform a simpler task than the networkonly strategy; typically, the initial phase is calculated using a physical model (Fig. 19). This scheme requires paired inputlabel datasets to teach the neural network and therefore belongs to supervised learning.
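The physical model that produces the initial estimate in this scheme is typically numerical propagation; below is a minimal angular-spectrum back-propagation sketch with illustrative wavelength, pixel size, and distance. The resulting complex field, still carrying twin-image and self-interference artifacts, is what a PcN network would receive:

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, z):
    """Free-space propagation by distance z via the angular-spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength ** 2 - FX ** 2 - FY ** 2
    # Clamp to zero to avoid a negative square root (evanescent decay not modeled).
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    return np.fft.ifft2(np.fft.fft2(field) * np.exp(1j * kz * z))

hologram = np.random.rand(128, 128)  # placeholder for a measured in-line hologram
# Back-propagate (negative z) to the object plane to get the initial estimate.
initial_field = angular_spectrum(hologram, 532e-9, 2e-6, -1e-3)
# The real and imaginary parts of initial_field would be fed to the trained
# network, whose job is to strip the remaining twin-image artifacts.
```

Since the transfer function has unit modulus for propagating frequencies, this operation preserves the total energy of the field.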
Rivenson et al.^{165} first applied this PcN scheme to holographic reconstruction in 2018. They used numerical propagation to calculate the initial complex field (including real and imaginary parts) from a single intensityonly hologram, which contained twinimage and selfinterferencerelated spatial artifacts, and then used a datadriven trained neural network to extract the pure complex field from the initial estimate. Compared with the axial multiintensity alternating projection algorithm^{17,18,19}, their PcN scheme reduces the number of required holograms by 2–3 times while reducing the computation time by more than a factor of three. Wu et al.^{166} then extended the depth of field (DOF) based on this work by training a neural network with pairs of randomly defocused complex fields and the corresponding infocus complex field. Meanwhile, Huang et al.^{137} proposed the use of a recurrent CNN^{167} for the PcN scheme and the networkonly strategy. They compared the performance of neural networks using either a hologram or an initial complex field as input under the same conditions and discovered that the networkonly strategy is more robust for sparse samples, while the PcN scheme demonstrates better inference capabilities on dense samples. Goy et al.^{118} applied the PcN scheme to phase recovery under weaklight illumination, which is more illposed than conventional phase recovery. They showed that the inference performance of the PcN scheme is stronger than that of the networkonly strategy under weaklight illumination, especially for dense samples in the extreme photon level case (1 photon). Further, Deng et al.^{168} introduced a perceptual loss based on VGG features into the loss function for neural network training, which inferred finer details than the NPCC loss function.
They also improved the spatial resolution and noise robustness by learning the lowfrequency and highfrequency bands, respectively, through two neural networks and synthesizing these two bands into fullband reconstructions with a third neural network^{169}. By introducing random phase modulation, Kang et al.^{170} further improved the phase recovery ability of the PcN scheme under weaklight illumination. Zhang et al.^{171} extended the PcN scheme to Fourier ptychography, inferring highresolution phase and amplitude using the initial phase and amplitude synthesized from the intensity images as input to a neural network. Moon et al.^{172} extended the PcN scheme to offaxis holography, using numerical propagation to obtain the initial phase from the Gabor hologram as the input to the neural network.
Networkinphysics (NiP)
In this scheme, trained or untrained neural networks are used in physicsbased iterative algorithms as denoisers, structural priors, or generative priors. Regarding phase recovery as one of the most general optimization problems, this approach can be expressed as

\(\hat{\theta }=\mathop{\arg \min }\limits_{\theta }{\Vert H(\theta )-{I}_{x}\Vert }_{2}^{2}+R(\theta )\) (4)
where \(H(\cdot )\) is the physical model, \(\theta\) is the phase, \({I}_{x}\) is the measured intensity image of a test sample, and \(R(\theta )\) is a regularized constraint. According to the RegularizationbyDenoising (RED)^{173} framework, a pretrained neural network for denoising can be used as the regularized constraint:

\(R(\theta )=\frac{\lambda }{2}{\theta }^{T}(\theta -D(\theta ))\) (5)
where \(D(\theta )\) is a pretrained neural network for denoising, and \(\lambda\) is a weight factor to control the strength of regularization. Metzler et al.^{174} used the above algorithm for phase recovery and called it PrDeep. They used a DnCNN trained on 300,000 pairs of data as a denoiser and FASTA^{175} as a solver. In comparison with other conventional iterative methods, PrDeep demonstrates excellent robustness to noise. Wu et al.^{176} proposed an online extension of PrDeep, which processes the data online by using only a random subset of measurements at a time. Bai et al.^{177} extended PrDeep to incorporate a contrasttransferfunctionbased forward operator in \(H(\cdot )\) for phase recovery. Wang et al.^{178} improved PrDeep by changing the solver from FASTA to ADMM, which further improved the noise robustness. Chang et al.^{179} used a generalizedalternatingprojection solver to further expand the performance of PrDeep and made it suitable for the recovery of complex fields. Işıl et al.^{180} embedded a trained neural network denoiser into HIO, removing artifacts from the results after each iteration. On this basis, Kumar et al.^{181} added a totalvariation prior together with the denoiser for regularization.
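A single RED-regularized update can be sketched as follows; a simple separable blur stands in for a pretrained CNN denoiser such as DnCNN, and the data-fidelity gradient is left as a placeholder:

```python
import numpy as np

def denoise(theta):
    """Separable [1,2,1]/4 blur as a stand-in for a trained CNN denoiser."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, theta)
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, out)

def red_step(theta, grad_data, lam, lr):
    """One gradient step: under the RED framework, the gradient of the
    regularizer reduces to lam * (theta - D(theta))."""
    grad = grad_data + lam * (theta - denoise(theta))
    return theta - lr * grad

theta = np.random.rand(32, 32)       # current phase estimate
grad_data = np.zeros_like(theta)     # placeholder for the data-fidelity gradient
theta_new = red_step(theta, grad_data, lam=0.5, lr=0.1)
```

The attractive property exploited here is that RED never needs the denoiser's own gradient: the residual between the estimate and its denoised version directly supplies the regularization direction.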
In addition, according to the deep image prior (DIP)^{182,183}, even an untrained neural network itself can be used as a structural prior for regularization (Fig. 20):

\(\hat{\omega }=\mathop{\arg \min }\limits_{\omega }{\Vert H({g}_{\omega }({z}_{f}))-{I}_{x}\Vert }_{2}^{2},\quad \hat{\theta }={g}_{\hat{\omega }}({z}_{f})\) (6)
where \({g}_{\omega }(\cdot )\) is an untrained neural network with trainable parameters \(\omega\) that usually takes a generative decoder architecture, \({I}_{x}\) is the measured intensity image of a test sample, and \({z}_{f}\) is a fixed vector, which means that the input of the neural network is independent of the sample, and therefore the neural network cannot be pretrained like the PD approach.
This DIPbased approach was first introduced to phase recovery by Jagatap et al.^{184}. They solved Eq. (6) using the gradient descent and projected gradient descent algorithms by optimizing over trainable parameters \(\omega\), both of which outperform the sparse truncated amplitude flow (SPARTA) algorithm. In followup work, they provided rigorous theoretical guarantees for the convergence of their algorithm^{185}. Zhou et al.^{186} applied this DIPbased algorithm to ODT, alleviating the effects of the missing cone problem. Shamshad et al.^{187} extended this DIPbased algorithm to subsampled Fourier ptychography, achieving better reconstructions at low subsampling ratios and high noise perturbations. In order to make the algorithm adaptive to different aberrations, Bostan et al.^{188} added a fully connected neural network with Zernike polynomials as the fixed input, and used it as the second structural prior. In the holographic setting with a reference beam, Lawrence et al.^{189} demonstrated the powerful information reconstruction ability of the DIPbased algorithm in extreme cases such as low photon counts, beamstopobscured frequencies, and small oversampling. Niknam et al.^{190} used the DIPbased algorithm to recover complex fields from an inline hologram. They further improved the twinimage artifact suppression capability through additional regularization, such as a bounded activation function, weight decay, and parameter perturbation. Ma et al.^{191} embedded an untrained generative network into the ADMM algorithm to solve phase recovery at low subsampling ratios, and achieved better results than the gradient descent and projected gradient descent algorithms of Jagatap et al.^{184}. Chen et al.^{192} extended the DIPbased algorithm to Fourier ptychography, in which four parallel untrained neural networks were used to generate the phase, amplitude, pupil aberration, and illumination fluctuation factor correction, respectively.
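A minimal DIP-style optimization for Eq. (6) might look like this; the fixed random input z_f, the toy Fourier-intensity forward model, and the small convolutional generator are illustrative assumptions:

```python
import torch
import torch.nn as nn

def H(phase):
    """Toy far-field intensity forward model (an assumption for illustration)."""
    return torch.abs(torch.fft.fft2(torch.exp(1j * phase))) ** 2

I_x = H(torch.rand(1, 1, 16, 16))  # measured intensity (simulated)
z_f = torch.randn(1, 4, 16, 16)    # FIXED input z_f, independent of the sample

g = nn.Sequential(  # small generative network g_w
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(g.parameters(), lr=1e-2)
for _ in range(10):  # real uses need far more iterations
    # Only the network weights are optimized; z_f never changes.
    loss = torch.mean((H(g(z_f)) - I_x) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
theta_hat = g(z_f)  # the recovered phase estimate
```

The contrast with the uPD scheme is the input: because z_f carries no sample information, this network cannot be pretrained and must be re-optimized for every measurement.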
Similarly, a pretrained generative neural network can also be used as a generative prior, assuming that the target phase is in the range of the output of this trained neural network (Fig. 21):

\(\hat{z}=\mathop{\arg \min }\limits_{z}{\Vert H(G(z))-{I}_{x}\Vert }_{2}^{2},\quad \hat{\theta }=G(\hat{z})\) (7)
where \(G(\cdot )\) is a pretrained fixed neural network that usually takes a generative decoder architecture, \({I}_{x}\) is the measured intensity image of a test sample, and \(z\) is a latent vector to be searched. Due to the use of the generative neural network, the multidimensional phase that originally needed to be iteratively searched is converted into a lowdimensional vector, and the solution space is also limited within the range of the trained generative neural network.
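In code, the contrast with DIP is that the generator weights stay frozen and only the low-dimensional latent vector is optimized; the generator below has random weights rather than pretrained ones, purely for illustration:

```python
import torch
import torch.nn as nn

def H(phase):
    """Toy far-field intensity forward model (an assumption for illustration)."""
    return torch.abs(torch.fft.fft2(torch.exp(1j * phase))) ** 2

I_x = H(torch.rand(16, 16))  # measured intensity (simulated)

G = nn.Sequential(nn.Linear(8, 256), nn.Tanh())  # stand-in generator
for p in G.parameters():
    p.requires_grad_(False)  # the generator stays fixed during the search

z = torch.zeros(8, requires_grad=True)  # only the latent vector is searched
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(10):  # real uses need far more iterations
    theta = G(z).reshape(16, 16)
    loss = torch.mean((H(theta) - I_x) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The search space collapses from a full-resolution phase map to an 8-dimensional vector here, which is exactly why the solution is confined to the generator's output range.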
Hand et al.^{193} used the generative prior for phase recovery with rigorous theoretical guarantees for a random Gaussian measurement matrix, showing better performance than SPARTA at low subsampling ratios. Later on, Shamshad et al.^{194} experimentally verified the robustness of the generativepriorbased algorithm to low subsampling ratios and strong noise in the coded diffraction setup. Then, Shamshad et al.^{195} extended this generativepriorbased algorithm to subsampled Fourier ptychography. Hyder et al.^{196} improved this by combining the gradient descent and projected gradient descent methods with AltMinbased nonconvex optimization methods. A general drawback is that the trained generative neural network limits the solution space to a specific range related to the training dataset, so that the iterative algorithm cannot search beyond this range. Therefore, Shamshad et al.^{197} set both the input and the previously fixed parameters of the trained generative neural network to be trainable. As another solution, Uelwer et al.^{198} extended the range of the trained generative neural network by intermediate layer optimization.
Physicsinnetwork (PiN)
According to the algorithm unrolling/unfolding technique proposed by Gregor and LeCun^{199}, physicsbased iterative algorithms can be unrolled as an interpretable neural network architecture (Fig. 22). Although this scheme integrates physics prior knowledge into neural networks, it still requires inputlabel paired datasets for neural network training and thus falls under the category of supervised learning. Wang et al.^{200} unrolled an algorithm called decentralized generalized expectation consistent signal recovery (deGECSR) into a neural network with trainable parameters, which exhibits stronger robustness using fewer iterations than the original deGECSR. Naimipour et al.^{201,202} used the algorithm unrolling technique in reshaped Wirtinger flow and SPARTA. Zhang et al.^{203} unrolled the iterative process of the alternative projection algorithm into complex UNets. Shi et al.^{204} used a deep shrinkage network and dual frames to unroll the proximal gradient algorithm in coded diffraction imaging. Wu et al.^{205} integrated the Fresnel forward operator and TIE inverse model into a neural network, which can be efficiently trained with a small number of datasets and is suitable for transfer learning. Yang et al.^{206} unrolled the classic HIO algorithm into a neural network that combines information in both the spatial domain and the frequency domain. Since PiNbased networks are embedded with physical knowledge, good performance can usually be achieved with a small training dataset. It is worth mentioning that, as another type of PiN scheme, physicsinformed neural networks mainly solve partial differential equations by embedding initial conditions, boundary conditions, and equation constraints into the loss function of neural networks^{207}.
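A generic unrolled architecture can be sketched as follows; the crude residual update and the toy forward model are illustrative, not taken from any of the cited unrolling papers:

```python
import torch
import torch.nn as nn

class UnrolledNet(nn.Module):
    """K iterations of a physics-based update unrolled into K 'layers', each
    with a trainable step size and a small trainable refinement convolution."""
    def __init__(self, H, K=3):
        super().__init__()
        self.H = H
        self.steps = nn.Parameter(torch.full((K,), 0.01))  # learned step sizes
        self.refine = nn.ModuleList(
            nn.Conv2d(1, 1, 3, padding=1) for _ in range(K)
        )

    def forward(self, I):
        theta = torch.zeros_like(I)
        for step, refine in zip(self.steps, self.refine):
            theta = theta - step * (self.H(theta) - I)  # crude physics update
            theta = refine(theta)                       # learned correction
        return theta

H = lambda t: torch.abs(torch.fft.fft2(torch.exp(1j * t))) ** 2  # toy model
net = UnrolledNet(H, K=3)
out = net(torch.rand(1, 1, 16, 16))  # same shape as the input intensity
```

Because the physics enters every layer, such networks typically need far fewer trainable parameters and smaller paired datasets than a generic U-Net, which mirrors the observation in the text.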
Summary of “DLinprocessing for phase recovery”
At the end of this section, we provide a summary of “DLinprocessing for phase recovery” in Table 4, where the “supervised learning mode” requires paired datasets, the “weakly supervised learning mode” requires unpaired datasets, and the “unsupervised learning mode” requires inputonly, phaseonly, or no datasets.
DLpostprocessing for phase recovery
A summary of “DLpostprocessing for phase recovery” is presented in Table 5 and is described below, including the “Noise reduction”, “Resolution enhancement”, “Aberration correction”, and “Phase unwrapping” sections.
Noise reduction
In addition to being part of the preprocessing in “Noise reduction” under the section “DLpreprocessing for phase recovery”, noise reduction can also be performed after phase recovery (Fig. 23). Jeon et al.^{208} applied the UNet to perform speckle noise reduction on digital holographic images in an endtoend manner. Their deep learning method takes only 0.92 s for a reconstructed hologram of 2048 × 2048 pixels, while other conventional methods take tens of seconds because they require multiple holograms. Choi et al.^{209} introduced the cycleGAN to train neural networks for noise reduction with unpaired datasets. They demonstrated the advantages of this unpaireddatadriven method with tomograms of different cell samples in optical diffraction tomography: the nondatadriven methods either remove coherent noise by blurring the entire images or perform no effective denoising, whereas their method can simultaneously remove the noise and preserve the features of the sample.
Zhang et al.^{210} first proposed to suppress noise directly on the wrapped phase via a neural network. However, this direct way may lead to many wrong jumps in the wrapped phase, which results in larger errors in the unwrapped phase. Thus, Yan et al.^{211,212} proposed to perform noise reduction on the sine and cosine (numerator and denominator) images of the phase via a neural network, and then calculated the wrapped phase from the denoised sine and cosine images by the arctangent function. Almost simultaneously, Montresor et al.^{213} introduced the DnCNN into speckle noise reduction for phase data via their sine and cosine images. As it is difficult to simultaneously collect phase data with and without speckle noise in an experiment, they used a simulator based on a doublediffraction system to numerically generate the dataset. Furthermore, their method yields a standard deviation comparable to the windowed Fourier transform (WFT) and a better peaktovalley value, while requiring less time. Building on this work, Tahon et al.^{214} designed a dataset (HOLODEEP) for speckle noise reduction in soft conditions and used a shallower network for faster inference. To go further, they released a more comprehensive dataset for conditions of severe speckle noise^{215}. Fang et al.^{216} applied a GAN to speckle noise reduction for phase data. Murdaca et al.^{217} applied this deeplearningbased phase noise reduction to interferometric synthetic aperture radar (InSAR)^{218}. The difference is that, in addition to the sine and cosine images of the phase, the neural network also reduces noise in the amplitude images at the same time. Tang et al.^{219} proposed to iteratively reduce the coherent noise in phase with an untrained UNet. In the above works, various loss functions were employed alongside the conventional l_{2}norm and l_{1}norm to enhance performance.
These additional functions include the edge function^{208}, which sharpens the edges of the denoised image, as well as gradient and variance functions^{219} that further suppress noise while preventing excessive smoothing.
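The sine/cosine route described above can be sketched as follows; a simple periodic neighborhood average stands in for the trained denoiser:

```python
import numpy as np

def blur(img, reps=2):
    """Periodic 5-point neighborhood average as a stand-in for a CNN denoiser."""
    out = img.copy()
    for _ in range(reps):
        out = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1) + out) / 5.0
    return out

# A noisy wrapped phase map (simulated placeholder).
noisy_wrapped = np.angle(np.exp(1j * np.random.uniform(-3, 3, (64, 64))))

sin_d = blur(np.sin(noisy_wrapped))  # denoise the numerator image
cos_d = blur(np.cos(noisy_wrapped))  # denoise the denominator image
# Recombine with arctan2 so the result is still a properly wrapped phase.
denoised_wrapped = np.arctan2(sin_d, cos_d)
```

Working on the continuous sine and cosine images avoids the 2π jumps of the wrapped phase itself, which is why the recombined result does not suffer the wrong-jump artifacts of direct wrapped-phase denoising.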
Resolution enhancement
Similar to the section “Pixel superresolution”, resolution enhancement can also be performed after phase recovery as postprocessing (Fig. 24). Liu et al.^{220} first used a neural network to infer the corresponding highresolution phase from the lowresolution phase. They trained two GANs with both a pixel superresolution system and a diffractionlimited superresolution system, which was demonstrated on biological thin tissue slices with an analysis of the spatial frequency spectrum. Moreover, they pointed out that this idea can be extended to other resolutionlimited imaging systems, for example, using a neural network to build a bridge from offaxis holography to inline holography. Later, Jiao et al.^{221} proposed to infer the highresolution noisefree phase from an offaxissystemacquired lowresolution version with a trained UNet. To collect the paired dataset, they developed a combined system with diffraction phase microscopy (DPM)^{222} and spatial light interference microscopy (SLIM)^{27} to generate both holograms from the same field of view. After training, the UNet retains the advantages of both the high acquisition speed of DPM and the high transverse resolution of SLIM.
Subsequently, Butola et al.^{223} extended this idea to partially spatially coherent offaxis holography, where the phase recovered with lownumericalaperture objectives was used as input, and the phase recovered with highnumericalaperture objectives was used as ground truth. Since lownumericalaperture objectives have a larger field of view, they aimed to obtain a higher resolution over a larger field of view, i.e., a higher spatial bandwidth product. Meng et al.^{224} used structuredillumination digital holographic microscopy (SIDHM)^{225} to collect the highresolution phase as ground truth. To supplement more highfrequency information with two cascaded neural networks, they used the lowresolution phase along with the highresolution amplitude inferred from the first neural network as inputs to the second neural network. Subsequently, Li et al.^{226} extended this resolutionenhanced postprocessing method to quantitative differential phase contrast microscopy for highresolution phase recovery from the smallest number of experimental measurements. To avoid running out of memory with largesize inputs, they split the fullsize input into subpatches. Moreover, they found that the UNet trained on the paired dataset has a smaller error than the paired GAN and the unpaired GAN; with GANs, the inferred phase contains more spurious information that is absent in the ground truth. Gupta et al.^{227} took advantage of the high spatial bandwidth product of this method to achieve a classification throughput rate of 78,000 cells per second with an accuracy of 76.2%. All these works use UNet as the basic structure, where most neural networks input and output phase maps of the same size and thus have the same number of downsampling and upsampling steps, whereas for the application where the input size is smaller than the output^{227}, the neural network has more upsampling steps.
For ODT, due to the limited projection angle imposed by the numerical aperture of the objective lens, certain spatial frequency components cannot be measured, which is called the missing cone problem. To address this problem via a neural network, Lim et al.^{228} and Ryu et al.^{229} built a 3D RI tomogram dataset for 3D UNet training, in which the raw RI tomograms with poor axial resolution were used as input, and the resolution-enhanced RI tomograms from an iterative total variation algorithm were used as ground truth. The trained 3D UNet can infer the high-resolution version directly from the raw RI tomograms. They demonstrated the feasibility and generalizability of the approach using bacterial cells and a human leukemic cell line. Their deep-learning-based resolution enhancement outperforms conventional iterative methods by more than an order of magnitude in regularization performance.
Aberration correction
For holography, especially in the off-axis case, the lens and the unstable environment of the sample introduce phase aberrations superimposed on the phase of the sample. To recover the pure phase of the sample, these unwanted phase aberrations must be eliminated physically or numerically. Physical approaches compensate for the phase aberrations by recovering the background phase, without the sample, from another hologram, which requires additional setups and adjustments^{230,231}.
As for numerical approaches, compensation of the phase aberrations can be achieved directly by Zernike polynomial fitting (ZPF)^{232} or principal-component analysis (PCA)^{233}. Yet, in these numerical methods, the aberration is estimated from the whole phase map, even though the object area should not be treated as aberration. Thus, before applying Zernike polynomial fitting, a neural network can be used to separate the object area from the background area, so that the object area does not bias the fit and the compensation improves (Fig. 25). This segmentation-based idea, namely CNN + ZPF, was first proposed by Nguyen et al.^{234} in 2017. They manually made binary masks as ground truth for each phase map to distinguish the background and sample areas. After comparisons on different real samples, they found that the compensated result of CNN + ZPF has a flatter background than that of PCA. However, the aberration in the initial phase makes segmentation harder, since it further weakens the already faint phase distribution of the boundary features, especially for large tilted phase aberrations. To address this problem, Ma et al.^{235} proposed to perform segmentation on the hologram instead of the phase as the neural network input. Lin et al.^{236} applied CNN + ZPF to real-time phase compensation with a phase-only SLM.
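The masked-fitting idea behind CNN + ZPF can be sketched as follows; for brevity, a low-order polynomial surface stands in for the Zernike basis, and the binary mask plays the role of the network's segmentation output:

```python
import numpy as np

def fit_background_aberration(phase, background_mask, order=2):
    """Fit a low-order polynomial surface (a simplified stand-in for
    Zernike polynomial fitting) to background pixels only, so the
    sample area does not bias the estimated aberration."""
    H, W = phase.shape
    yy, xx = np.mgrid[0:H, 0:W]
    x = xx / W - 0.5
    y = yy / H - 0.5
    # Polynomial basis up to the given total order: 1, x, y, xy, x^2, y^2, ...
    terms = [x**i * y**j for i in range(order + 1)
             for j in range(order + 1 - i)]
    A = np.stack([t.ravel() for t in terms], axis=1)
    m = background_mask.ravel()
    # Least-squares fit restricted to the segmented background area.
    coef, *_ = np.linalg.lstsq(A[m], phase.ravel()[m], rcond=None)
    aberration = (A @ coef).reshape(H, W)
    return phase - aberration, aberration
```

Subtracting the fitted surface from the full phase map yields the compensated sample phase with a flattened background.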
In addition to CNN + ZPF, Xiao et al.^{237} directly inferred the Zernike coefficients of the aberration from the initial phase via a neural network, which requires less computation. They trained a neural network specifically for bone cells and used this efficient method for long-term morphological observation of living cells. Zhang et al.^{238} used a trained neural network to infer the in-focus phase from the defocused, aberrated intensity and phase. Tang et al.^{239} introduced a sparsity constraint into the loss function and iteratively inferred the corresponding phase aberrations from the initial phase or a fixed vector with an untrained neural network and a Zernike model.
Phase unwrapping
In interferometric and optimization-based phase recovery methods, the recovered light field is in complex exponential form. Hence, the calculated phase is limited to the range (−π, π] on account of the arctangent function. Therefore, the information of the sample cannot be obtained unless the absolute phase is first estimated from the wrapped phase, a process called phase unwrapping. In addition to phase recovery, the phase unwrapping problem also arises in magnetic resonance imaging^{240}, fringe projection profilometry^{241}, and InSAR. Most conventional methods are based on the phase continuity assumption, and cases such as noise, breakpoints, and aliasing all violate the Itoh condition and degrade the performance of conventional methods^{242}. The advent of deep learning has made it possible to perform phase unwrapping in these cases. According to the different uses of the neural network, deep-learning-based phase unwrapping methods can be divided into the following three categories (Fig. 26)^{66}. The deep-learning-performed regression method (dRG) estimates the absolute phase directly from the wrapped phase with a neural network (Fig. 26a)^{243,244,245,246,247,248,249,250,251,252,253,254,255,256}. The deep-learning-performed wrap count method (dWC) first estimates the wrap count from the wrapped phase with a neural network, and then calculates the absolute phase from the wrapped phase and the estimated wrap count (Fig. 26b)^{210,257,258,259,260,261,262,263,264,265,266,267}. The deep-learning-assisted method (dAS) first estimates the wrap count gradient or discontinuity from the wrapped phase with a neural network; it then either reconstructs the wrap count from the wrap count gradient and calculates the absolute phase as in dWC^{268,269}, or directly applies optimization-based or branch-cut algorithms to obtain the absolute phase from the wrap count gradient or the discontinuity map (Fig. 26c)^{270,271,272,273,274}.
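The wrapping operation and the classic Itoh-condition-based unwrapping can be sketched in one dimension as follows (a minimal numpy illustration):

```python
import numpy as np

def wrap(phi):
    """Wrap an absolute phase into (-pi, pi], as the arctangent does."""
    return np.angle(np.exp(1j * phi))

def itoh_unwrap_1d(psi):
    """Classic 1D unwrapping under the Itoh condition: neighboring
    absolute-phase differences are assumed to lie within (-pi, pi],
    so each wrapped difference equals the true difference, and the
    absolute phase is recovered by integrating those differences."""
    d = wrap(np.diff(psi))
    return psi[0] + np.concatenate(([0.0], np.cumsum(d)))
```

When noise or aliasing pushes a neighboring difference outside (−π, π], this integration fails, which is exactly the regime the deep-learning-based methods target.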
Deeplearningperformed regression method (dRG)
Dardikman et al.^{243} presented the dRG method, which utilizes a residual-block-based CNN with a dataset of simulated steep cells. They also validated the dRG method, postprocessed by congruence, on actual cells and compared it with the performance of the dWC method^{244}. Then, Wang et al.^{245} introduced the UNet and a phase simulation generation method into the dRG method; they evaluated the trained network on real samples, examined the network's generalization ability through middle-layer visualization, and demonstrated the superiority of the dRG method over conventional methods in noisy and aliased cases. In the same year, He et al.^{246} and Ryu et al.^{247} evaluated the ability of the 3D-ResNet and a recurrent neural network (ReNet) to perform phase unwrapping using magnetic resonance imaging data. Dardikman et al.^{248} released their real sample dataset as open source. They demonstrated that congruence can enhance the accuracy and robustness of the dRG method, particularly when dealing with a limited number of wrap counts. Qin et al.^{249} utilized a ResUNet with a larger capacity to achieve higher accuracy and introduced two new evaluation indices. Perera et al.^{250} and Park et al.^{251} introduced the long short-term memory (LSTM) network and the GAN into phase unwrapping. Zhou et al.^{252,275} enhanced the robustness and efficiency of the dRG method by adding preprocessing and postprocessing steps around a UNet with an EfficientNet^{275} backbone. Xu et al.^{253} improved the accuracy and robustness of the UNet by adding more middle layers and skip connections and using a composite loss function. Zhou et al.^{254} used the GAN for InSAR phase unwrapping and avoided blur in the unwrapped phase by combining the l_{1} loss and the adversarial loss. Xie et al.^{255} trained four networks for different noise levels, which made each network more focused on a specific noise level.
Zhao et al.^{256} added a weighted map as a prior to the neural network to make it more focused on the areas near the jump edges, similar to an additional attention mechanism. Different from the above methods, Vithin et al.^{276,277} proposed to use the YNet^{90} to infer the phase gradients from a wrapped phase and then calculate the absolute phase.
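The congruence postprocessing used in several of the dRG works above can be sketched as follows; this is a minimal illustration of the idea, not the authors' exact implementation:

```python
import numpy as np

def congruence(net_output, wrapped):
    """Snap a regressed absolute phase to the nearest value that is
    congruent with the wrapped phase modulo 2*pi, so the output keeps
    the measurement's exact fractional phase while the network only
    needs to get the wrap count right."""
    k = np.round((net_output - wrapped) / (2 * np.pi))
    return wrapped + 2 * np.pi * k
```

As long as the regression error stays below π, congruence removes it entirely, which is why it improves accuracy most when the number of wrap counts is limited.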
Deeplearningperformed wrap count method (dWC)
Liang et al.^{257} and Spoorthi et al.^{258} first proposed this idea in 2018. Spoorthi et al.^{258} proposed a phase dataset generation method based on adding and subtracting Gaussian functions with randomly varying mean and variance values, and used clustering-based smoothness to alleviate the classification imbalance of the SegNet. Further, the prediction accuracy of their method was improved by introducing priors on the absolute phase values and gradients into the loss function, which they called PhaseNet 2.0^{259}. Zhang and Liang et al.^{210,260} sequentially used three networks to perform phase unwrapping through wrapped phase denoising, wrap count prediction, and postprocessing. In addition, they proposed to generate a phase dataset by weighted sums of Zernike polynomials of different orders. Immediately after, Zhang and Yan et al.^{261} verified the performance of the DeepLabV3+ network, but the resulting wrap count still contained a small number of wrong pixels, which would propagate errors through the whole phase map in the conventional phase unwrapping process. They thus proposed to use refinement to correct the wrong pixels. To further improve the unwrapped phase, Zhu et al.^{262} proposed to use a median filter as a second postprocessing step to correct wrong pixels in the wrap count predictions. Wu et al.^{263} enhanced the simulated phase dataset by adding noise from real data. They also combined the full-resolution residual network (FRRNet) with the UNet to further optimize performance in Doppler optical coherence tomography. In comparisons on real data, their proposed network achieves higher accuracy than the PhaseNet and DeepLabV3+. In applying dWC to a point diffraction interferometer, Zhao et al.^{264} proposed an image-analysis-based postprocessing method to alleviate the classification imbalance of the task and adopted the iterative-closest-point stitching method to realize dynamic resolution.
Vengala et al.^{90,265,266} used the YNet^{90} to reconstruct the wrap count and the pure wrapped phase at the same time. Zhang et al.^{267} added atrous spatial pyramid pooling (ASPP), positional self-attention (PSA), and an edge-enhanced block (EEB) to the UNet to obtain higher accuracy and stronger robustness than the networks used in the above methods. Huang et al.^{278} applied the HRNet to the dWC methods; their method still needs the median filter for postprocessing, although its performance is better than that of the PhaseNet and DeepLabV3+. Wang et al.^{279} proposed another EEB based on Laplacian and Prewitt edge enhancement operators, which further enhances classification accuracy and avoids the use of postprocessing.
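The dWC reconstruction and the median filter postprocessing mentioned above can be sketched as follows (a minimal numpy illustration; in practice the wrap count map comes from the trained classification network):

```python
import numpy as np

def absolute_from_wrap_count(wrapped, k):
    """dWC reconstruction: absolute phase = wrapped phase + 2*pi*k."""
    return wrapped + 2 * np.pi * k

def median_correct(k, size=3):
    """Simple 3x3 median filter, used as a postprocessing step to fix
    isolated misclassified pixels in the predicted wrap count map."""
    pad = size // 2
    kp = np.pad(k, pad, mode='edge')
    H, W = k.shape
    windows = np.stack([kp[i:i + H, j:j + W]
                        for i in range(size) for j in range(size)])
    return np.median(windows, axis=0).astype(k.dtype)
```

Because a single wrong wrap count shifts the local phase by a multiple of 2π, correcting such pixels before reconstruction avoids conspicuous step artifacts.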
Deeplearningassisted method (dAS)
Conventional methods estimate the wrap count gradient under the phase continuity assumption and are hence disturbed by unfavorable factors such as noise. To avoid this, Zhou et al.^{270} proposed to estimate the wrap count gradient via a neural network instead of conventional methods. Since the noisy wrapped phase and the corresponding correct wrap count gradient are used as the training dataset, the trained neural network is able to estimate the correct wrap count gradient from the noisy wrapped phase without being limited by the phase continuity assumption. The correct result can then be obtained by minimizing the difference between the unwrapped phase gradients and the network-output wrap count gradient. Further, Wang et al.^{271} proposed to input a quality map, as a prior, together with the wrapped phase into the neural network to improve the accuracy of the estimated wrap count gradient. Almost simultaneously, Sica et al.^{268} directly reconstructed the wrap count from the network-output wrap count gradient and then calculated the absolute phase, as in dWC. On this basis, Li et al.^{269} improved the estimation efficiency of the neural network by using a single fused gradient instead of separate vertical and horizontal gradients. In addition to estimating the wrap count gradient via a neural network, Wu et al.^{272,273} chose to estimate the horizontal and vertical discontinuities with a neural network and recover the absolute phase with optimization-based algorithms. Instead of using the wrapped phase as the network input, Zhou et al.^{274} embedded the neural network into the branch-cut algorithm to predict the branch-cut map from the residue image, which reduced the computational cost of the branch-cut algorithm.
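For the dAS variants that reconstruct the wrap count from its gradient^{268,269}, the reconstruction step can be sketched in one dimension as follows; the wrap count gradient is assumed to be given (in the actual methods it is the output of a neural network):

```python
import numpy as np

def unwrap_from_count_gradient(wrapped, dk):
    """1D dAS-style reconstruction: integrate a (network-estimated)
    wrap count gradient into a wrap count map by cumulative summation,
    then add 2*pi times the count to the wrapped phase."""
    k = np.concatenate(([0], np.cumsum(dk)))
    return wrapped + 2 * np.pi * k
```

In two dimensions the integration is instead posed as a least-squares problem over both gradient directions, which is where the optimization-based solvers mentioned above come in.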
Deep learning for phase processing
A summary of “Deep learning for phase processing” is presented in Table 6 and is described below, including the “Segmentation”, “Classification”, and “Imaging modal transformation” sections.
Segmentation
Image segmentation, which aims to divide all pixels into different regions of interest, is widely used in biomedical analysis and diagnosis. For unlabeled cells or tissues, the contrast of the brightfield intensity is low and thus poorly suited for image segmentation. Therefore, segmentation according to the phase distribution of cells or tissues becomes a potentially more effective alternative. Given the great success of CNNs in semantic segmentation^{280}, they can readily be transplanted to phase segmentation, that is, performing segmentation with the phase as the input of the neural network (Fig. 27).
To the best of our knowledge, as early as 2013, Yi et al.^{281} first proposed to perform segmentation of red blood cells from the phase distribution, albeit using a non-learning image-processing-based algorithm. To improve the segmentation accuracy in cases of heavily overlapping and touching cells, they later introduced the fully convolutional network (FCN)^{280} into phase segmentation^{282}. Earlier in the same year, Nguyen et al.^{283} used the random forest algorithm to segment prostate cancer tissue from the phase distribution. Ahmadzadeh et al.^{284} used FCN-based phase segmentation for nucleus extraction in cardiomyocyte characterization. Subsequently, the UNet was used for phase segmentation in multiple biomedical applications, such as segmentation of sperm cell ultrastructure for assisted reproductive technologies^{285}, SARS-CoV-2 detection^{286}, cell live-dead assays^{287}, and cell cycle-stage detection^{288}. In addition, other types of neural networks have been used for phase segmentation, including the mask R-CNN for cancer screening^{289} and DeepLabV3+ for cytometric analysis^{290}.
Beyond the phase, the RI from ODT can be used to segment a sample in three dimensions. Lee et al.^{291} obtained the 3D shape and position of organelles by 2D segmentation of RI tomograms at different depths, which were then used for the analysis of the morphological and biochemical parameters of breast cancer cell nuclei. As a more direct and efficient way, Choi et al.^{292} used a 3D UNet to segment subcellular compartments directly from a single 3D RI tomogram.
Classification
Similar to but distinct from segmentation, the classification task is only responsible for giving the overall category of the input sample image, regardless of the specific pixels in the image. For classification, the phase provides more information related to the RI and three-dimensional topography of the sample, making it ideal for transparent samples such as cells, tissues, and microplastics^{293,294}. Conventional machine learning algorithms first manually extract tens of features from the phase and then perform classification with different models. The support vector machine^{295}, one of the most popular conventional machine learning strategies, is the most used strategy in phase classification^{296,297,298,299,300,301,302,303}. In addition, some researchers have used other conventional machine learning strategies, such as k-nearest neighbors^{304,305}, fully connected neural networks^{306,307}, random forests^{308,309}, and random subspace^{310}. More generally, some researchers have compared the accuracy of different conventional machine learning strategies in the same application context^{306,311,312,313}.
Different from conventional machine learning strategies that require manual feature extraction, deep learning usually takes the phase, or a representation derived from it, directly as input, and the deep CNNs automatically perform feature extraction (Fig. 28). This automatic feature extraction strategy tends to achieve higher accuracy but usually requires a larger number of paired input-label samples as support. The use of the phase as input to deep CNNs for classification was first reported in the work of Jo et al.^{293}. They revealed that, for cells like anthrax spores, the accuracy of a neural network using the phase as input is higher than that of a neural network using the binary morphology image obtained by conventional microscopy as input. Subsequently, this deep-learning-based phase classification method has been used in multiple applications, including assessment of T cell activation state^{314}, cancer screening^{315}, classification of sperm cells under different stress conditions^{316}, prediction of living cell mitosis^{317}, and classification of different white blood cells^{318}. Accuracy in these applications is generally higher than 95% for binary classification, but comparable accuracy has not been achieved in multi-type classification.
On the one hand, combining the automatically extracted features of the neural network and the manually extracted features for classification can effectively improve the accuracy, which is because the manually extracted features add the prior of human experts to the classifier^{319,320,321}. For instance, after adding the manual morphological features, the accuracy and area under the curve of healthy and sickle red blood cells classification are improved from 95.08% and 0.9665 to 98.36% and 1.0000, respectively^{320}. On the other hand, the classification accuracy can also be enhanced by using higher dimensional data of the phase or other data together with the phase as the input of the neural network, such as 3D RI tomogram from the phase^{322,323}, more phase in temporal dimension^{324,325,326}, more phase in wavelength dimension^{327,328}, and amplitude together with the phase^{329,330,331,332,333,334}.
3D RI tomogram from the phase (Fig. 29a)
Ryu et al.^{322} used the 3D RI tomogram as the input of a neural network to classify different types of cells, achieving an accuracy of 99.6% in the binary classification of lymphoid and myeloid cells, and of 96.7% even in five-type classification of white blood cells. For the multi-type classification, they also used the amplitude or phase of the same sample as input to train and test the same neural network, but only achieved accuracies of 80.1% and 76.6%, respectively. Afterward, Kim et al.^{323} from the same group applied this technology to microbial identification, reaching 82.5% accuracy in identifying 19 bacterial species from an individual bacterial cell or cluster.
More phase in temporal dimension (Fig. 29b)
Wang et al.^{324} used the amplitude and phase from time-lapse holograms as inputs to a pseudo-3D CNN to classify the type of growing bacteria, shortening the detection time by >12 h compared with Environmental Protection Agency-approved methods. Likewise, Liu et al.^{325} used the phase from time-lapse holograms as neural network inputs to infer the plaque-forming unit probability for each pixel, achieving a >90% plaque-forming unit detection rate in <20 h. By contrast, Batuch et al.^{326} proposed to use the phase at a specific moment and the corresponding spatiotemporal fluctuation map as the inputs of a neural network to improve the accuracy of cancer cell classification.
More phase in wavelength dimension (Fig. 29c)
Singla et al.^{327} used the amplitude and phase at the red-green-blue color wavelengths as inputs of a neural network, thereby achieving a classification accuracy of 97.7% for healthy and malaria-infected red blood cells, and a classification accuracy of 91.2% even for different stages of malaria infection. Similarly, with the additional information from red-green-blue color holograms, Isil et al.^{328} achieved high-accuracy four-type classification of algae, with accuracies of 94.5%, 96.7%, and 97.6% for D. tertiolecta, Nitzschia, and Thalassiosira algae, respectively.
Amplitude together with the phase (Fig. 29d)
Lam et al.^{330,331} used the amplitude and phase as the inputs of a neural network to classify occluded and/or deformable objects, achieving accuracy over 95%. With the same strategy, they performed ten-type classification of biological tissues with an accuracy of 99.6%^{332}. Further, Terbe et al.^{333} proposed a type of volumetric network input, supplementing additional amplitude and phase maps at different defocus distances. They built a more challenging dataset with seven classes comprising algae of different counts, small particles, and debris. The network with volumetric input outperforms the network with single amplitude and phase inputs in all cases by ~4% accuracy. Similarly, Wu et al.^{334} used the real and imaginary parts of the complex field as network input for six-type classification of bioaerosols, achieving an accuracy of over 94%.
In pursuit of extreme speed for real-time classification, some researchers also choose to directly use the raw hologram recorded by the sensor as the input of the neural network to perform classification tasks^{335,336,337,338,339}. Since the amplitude and phase information are encoded within a hologram, a hologram-trained neural network can achieve satisfactory accuracy given sufficient feature extraction capability, which has been proven in practice in applications including molecular diagnostics^{335}, microplastic pollution assessment^{336,337,338}, and neuroblastoma cell classification^{339}.
Imaging modal transformation
Let us start this subsection with image style transfer^{340,341}, which aims to transfer a given image to another specified style while retaining the content of the image as much as possible. For a type of biological sample or its tissue slice, different parts have different RI properties, different absorption properties, and different chemical or fluorescent staining properties. These four properties point to phase recovery/imaging, brightfield imaging, and chemical- or fluorescent-staining imaging, respectively, which makes it possible to achieve image style transfer from phase recovery to other imaging modalities (Fig. 30).
From phase recovery to brightfield imaging
The brightfield images of some colored biological samples have sufficient contrast due to their strong absorption of visible light, so for such samples, brightfield imaging can be used as the target imaging modality, in which a neural network is used to transfer the complex-valued image of the sample into its virtual brightfield image. In 2019, Wu et al.^{342} presented the first implementation of this idea, called brightfield holography, in which a neural network was trained to transfer the backpropagated complex-valued images from a single hologram to their corresponding speckle- and artifact-free brightfield images (Fig. 31a). This type of “brightfield holography” is able to infer a whole 3D volumetric image of a colored sample, such as pollen, from its single-snapshot hologram. Further, Terbe et al.^{343} implemented “brightfield holography” with a cycleGAN in the case of unpaired datasets.
From phase recovery to chemicalstaining imaging
For most transparent/colorless biological samples, chemical staining enables them to be clearly observed or imaged under brightfield microscopy. This allows the above “brightfield holography” to be used for transparent biological samples as well, in an approach called virtual staining. It directly infers the corresponding digitally stained image from the phase recovered by label-free methods, which avoids the complicated, time-consuming, and contaminating staining process. Rivenson et al.^{344} applied this virtual staining technique to the inspection of histologically stained tissue slices and named it PhaseStain, in which a well-trained neural network is used to directly transfer the phase of tissue slices to their brightfield images of virtual staining (Fig. 31b). Using label-free slices of human skin, kidney, and liver tissue, they conducted an experimental demonstration of the efficacy of “PhaseStain” by imaging them with a holographic microscope. The resulting images were compared to those obtained through brightfield microscopy of the same tissue slices stained with H&E, Jones’ stain, and Masson’s trichrome stain, respectively. The reported “PhaseStain” greatly saves the time and costs associated with the staining process. Similarly, Wang et al.^{345} applied “PhaseStain” to Fourier ptychographic microscopy and adapted it to an unpaired dataset with a cycleGAN. Further, by introducing phase attention guidance, Jiang et al.^{49} addressed the ambiguity problem of intensity-only or phase-only networks for virtual staining. Liu et al.^{346} used six images of amplitude and phase at three wavelengths as network input to infer the corresponding virtual staining version. In addition to tissue slices, Nygate et al.^{347} demonstrated the advantages and potential of this deep learning virtual staining approach on single biological cells such as sperm (Fig. 31c).
To improve the effectiveness of virtual staining, they used the phase gradients as an additional hand-engineered feature along with the phase as the input of the neural network. To assess the effectiveness of virtual staining, they used virtual staining images, the phase, phase gradients, and stain-free brightfield images as input data for five-type classification of sperm, and found that the recall values and F1 scores of the virtual staining images were two or even four times higher than those of the other data types. This type of single-cell staining approach provides ideal conditions for real-time analysis, such as rapid stain-free imaging flow cytometry.
From phase recovery to fluorescentstaining imaging
Apart from imaging colored or chemically stained biological samples with brightfield microscopy, fluorescence microscopy can provide molecular-specific information by imaging fluorescence-labeled biological samples. As a labeled imaging method, fluorescence microscopy has insurmountable disadvantages, including phototoxicity and photobleaching. Guo et al.^{348} proposed the concept of “transferring the physical-specific information to the molecular-specific information via a trained neural network” (Fig. 32a). Specifically, they used the phase and polarization of cell samples as multi-channel inputs to infer the corresponding fluorescence image, and further demonstrated its performance by imaging the architecture of brain tissue and predicting myelination in slices of a developing human brain. Almost simultaneously, Kandel et al.^{349} used a neural network to infer fluorescence-related subcellular specificity from a single phase image, which they called phase imaging with computational specificity (Fig. 32b). With these label-free methods, they monitored the growth of both nuclei and cytoplasm in live cells and the arborization process in neural cultures over many days without loss of viability^{350}. Guo et al.^{351} further inferred fluorescence images from the phase at different depths and performed 3D prediction of mitochondria. The above methods are performed on wide-field fluorescence microscopes, which cannot provide high-resolution 3D fluorescence data for neural networks as ground truth. Hence, Chen et al.^{352} presented an artificial confocal microscopy system consisting of a commercial confocal microscope augmented by a laser scanning gradient light interference microscopy system. It can provide the phase of the samples in the same field of view as the fluorescence channel to obtain paired datasets.
With the support of deep learning, their proposed artificial confocal microscopy combines the benefits of nondestructive phase imaging with the depth sectioning and chemical specificity of confocal fluorescence microscopy.
The aforementioned imaging modal transformation methods use the phase as the input of neural networks; however, the phase, in addition to being related to the RI, also depends on the thickness of the biological sample or its tissue slice. Therefore, a neural network trained on a dataset from one biological type is difficult to generalize to a different one. Unlike the phase, the RI is an absolute and unbiased quantity of biological samples, so a neural network trained with the RI as input is naturally applicable to new species. Jo et al.^{353} thus built a bridge from ODT to fluorescence imaging via deep learning (Fig. 32c). They trained a neural network with the 3D RI tomogram as input and the corresponding fluorescence image as ground truth. With the trained neural network, they performed various applications in the endogenous subcellular structure and dynamics profiling of intact living cells at unprecedented scales.
Conclusion and outlook
The introduction of deep learning provides a data-driven approach to various stages of phase recovery. Based on where neural networks are used, we have provided a comprehensive review of how they work in phase recovery. Deep learning can provide preprocessing for phase recovery before it is performed, can be directly used to perform phase recovery, can postprocess the initial phase obtained after phase recovery, or can use the recovered phase as input to implement specific applications. Although deep learning provides unprecedented efficiency and convenience for phase recovery, there are some general points to keep in mind when using this learning-based tool.
Datasets
For the supervised learning mode, a paired dataset provides rich and high-quality prior knowledge to guide neural network training. As one of the most common ways, some researchers choose to collect the intensity image of a real sample through the experimental setup as the input and calculate the corresponding phase through conventional model-based methods as the ground truth (label). Numerical simulation can be a convenient and efficient way to generate datasets in some cases, such as phase unwrapping^{66}, hologram resolution enhancement^{74}, and diffractive imaging^{130}. The paired dataset thus implicitly contains the input-to-label mapping relationship over a large number of specific samples, which determines the upper limit of the ability of the trained neural network. For instance, if the dataset is collected under fixed settings, the trained neural network can only target fixed device parameters (such as defocus distance, off-axis angle, and wavelength) or a certain class of samples, and cannot adapt to other situations not implied in the dataset. Of course, one can ameliorate this by using different settings and different types of samples when collecting datasets, thereby including various cases in the paired training samples, such as adapting to a certain range of defocus distances^{114,166}, different aberrations^{119,129}, different off-axis angles^{123}, and more types of samples^{127}. One can use Shannon entropy to quantitatively represent the richness of the information contained in the dataset, which directly affects the generalization ability of the trained neural network^{117}. In addition, the spatial frequency content of the training samples in the dataset also limits the ability of the trained neural network to resolve fine spatial features, which can be improved to some extent by preprocessing the power spectral density of the training samples^{115}.
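As an illustration of simulation-based dataset generation, a Gaussian-based phase simulation in the spirit of Spoorthi et al.^{258} (see the phase unwrapping section) can be sketched as follows; all parameter ranges here are illustrative assumptions, not values from the cited work:

```python
import numpy as np

def random_phase(shape=(128, 128), n_gaussians=8, rng=None):
    """Generate a random absolute phase by adding and subtracting 2D
    Gaussians with randomly chosen means, widths, and signed amplitudes
    (parameter ranges are illustrative)."""
    rng = np.random.default_rng(rng)
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W]
    phi = np.zeros(shape)
    for _ in range(n_gaussians):
        cy, cx = rng.uniform(0, H), rng.uniform(0, W)
        sy, sx = rng.uniform(H / 10, H / 3), rng.uniform(W / 10, W / 3)
        amp = rng.uniform(-15, 15)  # signed: add or subtract a Gaussian
        phi += amp * np.exp(-((yy - cy) ** 2 / (2 * sy ** 2)
                              + (xx - cx) ** 2 / (2 * sx ** 2)))
    return phi

# One training pair: wrapped phase as input, absolute phase as label.
phi = random_phase(rng=0)
psi = np.angle(np.exp(1j * phi))
```

Repeating this with different random seeds yields an arbitrarily large paired dataset whose implicit mapping the network learns.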
For the weakly supervised learning mode, cycleGAN-based methods train neural networks with an unpaired dataset to learn the mapping relationship between the input domain and the target domain, including phase recovery^{124,125,142}, noise reduction^{209}, resolution enhancement^{227}, and imaging modal transformation^{343,345}. As for the unsupervised learning mode, under the guidance of forward physical models and input-only datasets, neural networks learn the inverse process^{152,153,159,160,161,162}.
Networks and loss functions
Guided by the dataset, the neural network is trained to learn the mapping relationship from the input domain to the target domain by minimizing the difference between its actual output and the ground truth (the loss function). Therefore, the fitting ability of the neural network itself and the perception ability of the loss function determine whether the mapping relationship implicit in the dataset can be well internalized into the neural network. Conventional encoder-decoder-based neural networks have sufficient receptive fields and strong fitting capabilities, but downsampling operations such as max-pooling lose some high-frequency information. Dilated convolutions can enlarge the receptive field while retaining more high-frequency information^{141}. Convolution in the Fourier (frequency) domain guarantees a global receptive field, since each pixel in the frequency domain contains contributions from all pixels in the spatial domain^{145,146}. To make the neural network focus on different spatial frequency bands, one can also use two neural networks to learn the high- and low-frequency bands, respectively, and then use a third neural network to merge them into a full spatial frequency version^{169}. Neural architecture search is another promising technology, which automatically searches a large structure space for an optimal network architecture^{147}. In addition to the aforementioned CNNs, owing to their excellent global feature perception, the Vision Transformer^{112} and Swin Transformer^{143} have achieved better inference performance than classic CNNs in autofocusing^{108} and phase recovery^{142}. Note, however, that Transformers lack the inherent translational equivariance and invariance of CNNs, and thus require corresponding data augmentation.
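The global receptive field of Fourier-domain convolution can be demonstrated with a minimal numpy sketch (the random complex filter below is a stand-in for learned weights, not code from refs. 145,146): pointwise multiplication in the frequency domain is a circular convolution, so perturbing a single input pixel changes every output pixel.

```python
import numpy as np

def fourier_conv(x, kernel_freq):
    """'Convolve' by pointwise multiplication in the Fourier domain.
    Each frequency coefficient mixes contributions from every spatial
    pixel, so a single such layer already has a global receptive field."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * kernel_freq))

rng = np.random.default_rng(1)
x = rng.random((32, 32))
h = rng.random((32, 32)) + 1j * rng.random((32, 32))  # stand-in for learned weights
y0 = fourier_conv(x, h)

x2 = x.copy()
x2[0, 0] += 1.0                       # perturb one single input pixel
y1 = fourier_conv(x2, h)
changed = int(np.sum(np.abs(y1 - y0) > 1e-12))
print(changed, "of", x.size, "output pixels changed")
```

By contrast, a small spatial convolution kernel would change only a local neighborhood, which is why such spectral layers are attractive when long-range context matters.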
The recently proposed local conditional neural fields framework is expected to achieve highly generalizable multiscale phase recovery, in which the generalization ability comes from measurement-specific information in the latent space while the multiscale ability comes from the local representation^{354}. The most commonly used loss functions, the l_{2}-norm and l_{1}-norm, are more responsive to low-frequency information and less sensitive to high-frequency information; that is, the low-frequency content of the neural network's output contributes more to the l_{2}-norm and l_{1}-norm loss functions than the high-frequency content. Therefore, some researchers have sought more efficient loss functions, such as the negative Pearson correlation coefficient (NPCC)^{115}, GAN loss^{132,139,140}, and a feature-perceptual loss computed on VGG layers^{168}. So far, which neural network and loss function are the best choice for phase recovery remains inconclusive.
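A minimal numpy sketch of the NPCC loss (our own illustration of the standard definition of the negative Pearson correlation coefficient, not code from ref. 115): the loss reaches its minimum of -1 for any positively affinely related prediction, which makes it insensitive to global scale and offset.

```python
import numpy as np

def npcc(pred, truth):
    """Negative Pearson correlation coefficient: -1 for a perfect
    (positively affinely related) match, +1 for perfect anti-correlation."""
    p = pred - pred.mean()
    t = truth - truth.mean()
    return -np.sum(p * t) / np.sqrt(np.sum(p**2) * np.sum(t**2))

rng = np.random.default_rng(0)
truth = rng.random((64, 64))
loss_scaled = npcc(2.0 * truth + 3.0, truth)  # scale/offset invariant: -1
loss_flipped = npcc(-truth, truth)            # anti-correlated: +1
print(loss_scaled, loss_flipped)
```

That scale invariance is a double-edged sword: NPCC rewards structural agreement, but an absolute phase scale must then be restored separately if quantitative values are needed.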
Network-only or physics-connect-network (PcN)
The network-only strategy aims to infer the final phase from the raw measured intensity image in an end-to-end fashion: a one-shot approach in which the neural network does everything in a single pass. The neural network must not only perform regularization to remove twin-image and self-interference-related spatial artifacts but also undertake the task of free-space light propagation. As a result, the inference results of the network-only strategy are unsatisfactory in some severely ill-posed cases, including weak-light illumination^{118} and dense samples^{137}. Since free-space light propagation is a well-characterized physical model that can be reproduced and enforced numerically, placing numerical propagation in front of the network relieves its burden and allows it to focus on learning the regularization. Indeed, PcN infers better results than the network-only strategy in the above ill-posed cases^{118,137}. In another similar scheme, the neural network only performs hologram generation before a phase-shifting algorithm, thus achieving better generalization than the network-only strategy^{89}. In addition, applying speckle-correlation processing before the neural network makes the trained network suitable for unknown scattering media and target objects^{355}.
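Free-space numerical propagation of this kind is commonly implemented with the angular spectrum method. A minimal numpy sketch (wavelength, pixel size, and distance are illustrative values, not from the cited works): the field's spectrum is multiplied by the propagation transfer function, and propagating forward then backward recovers the original field.

```python
import numpy as np

def angular_spectrum(field, wavelength, pixel_size, z):
    """Propagate a complex field over distance z via the angular
    spectrum method; evanescent components are suppressed."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2      # (kz / 2*pi)^2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)             # transfer function
    return np.fft.ifft2(np.fft.fft2(field) * H)

# A phase-only object propagated forward and then backward by the same
# distance should return to the original field (no evanescent loss here).
rng = np.random.default_rng(0)
u0 = np.exp(1j * rng.random((128, 128)))
u1 = angular_spectrum(u0, wavelength=0.5e-6, pixel_size=2e-6, z=1e-3)
u2 = angular_spectrum(u1, wavelength=0.5e-6, pixel_size=2e-6, z=-1e-3)
print(np.max(np.abs(u2 - u0)))  # round-trip error, at numerical precision
```

In a PcN pipeline, `u1`'s intensity would be the measurement and such a fixed, non-trainable propagation step would back-propagate it numerically before the network sees it, leaving only the regularization to be learned.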
Interpretability
In phase recovery, learning-based deep learning techniques usually attempt to learn a specific mapping relationship automatically by optimizing (training) neural network parameters with a real-world paired dataset. Deep neural networks usually adopt a multilayer architecture and contain a large number of trainable parameters (often millions or more), and are thus capable of learning complicated mapping relationships from datasets. Unlike physics-based algorithms, such general-purpose network architectures often lack interpretability: it is difficult to discover what the neural network has learned internally, or what role a particular parameter plays, by examining the trained parameters. This leaves practitioners helpless when neural network inference fails in practical applications: they can neither analyze why the network failed for that sample nor make targeted improvements to avoid the failure in subsequent use. The algorithm unrolling/unfolding technique proposed by Gregor and LeCun offers hope for the interpretability of neural networks^{199}: each iteration of a physics-based iterative algorithm is represented as one layer of the neural network, so one inference through such a network is equivalent to performing a fixed number of iterations of the physics-based iterative algorithm. Usually, physics-based parameters and regularization coefficients are transferred into the unrolled network as trainable parameters. In this way, the trained unrolled network can be interpreted as a physics-based iterative algorithm with a fixed number of iterations. Moreover, the unrolled network naturally inherits prior structures and domain knowledge from the physics-based iterative algorithm, so its parameters can be trained efficiently with a small dataset.
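To make the unrolling idea concrete, here is a minimal numpy sketch of an unrolled proximal-gradient (ISTA-style) solver for a generic sparse linear inverse problem (the problem, sizes, and values are illustrative, not from ref. 199). Each loop iteration corresponds to one "layer"; the per-layer step sizes and soft-threshold levels are exactly the quantities that would become trainable parameters in an unrolled network.

```python
import numpy as np

def unrolled_ista(y, A, steps, thresholds):
    """Fixed-depth unrolled proximal gradient descent. Each iteration is
    one interpretable 'layer': a physics (data-consistency) step followed
    by a learned regularization (soft-threshold) step."""
    x = np.zeros(A.shape[1])
    for t, lam in zip(steps, thresholds):
        x = x - t * A.T @ (A @ x - y)                        # physics step
        x = np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)    # prox step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20)) / np.sqrt(40)   # forward model
x_true = np.zeros(20)
x_true[[2, 7, 11]] = [1.0, -0.5, 2.0]             # sparse ground truth
y = A @ x_true                                    # measurement

K = 100                                            # network depth
x_hat = unrolled_ista(y, A, steps=[0.5] * K, thresholds=[0.01] * K)
print(np.linalg.norm(x_hat - x_true))              # small recovery error
```

Here the steps and thresholds are fixed by hand; training would tune one `(t, lam)` pair per layer against a dataset, yet the trained result still reads as K iterations of a known algorithm.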
Uncertainty
When a trained neural network is actually used to infer the phase of a test sample, the ground truth is usually unknown, making it impossible to determine the reliability of the inferred result. To address this, Bayesian CNNs perform phase inference while producing uncertainty maps that describe the confidence of each pixel of the inferred result^{132,356,357,358}. This uncertainty comes from both the model itself and the data, called epistemic uncertainty and aleatoric uncertainty, respectively. The network-output uncertainty maps have been experimentally verified to be highly consistent with the real error maps, which makes it possible to assess the reliability of inferred results in practical applications without any ground truth^{132,358}. In addition to Bayesian neural networks, there are three other uncertainty estimation techniques: single deterministic methods, ensemble methods, and test-time augmentation methods^{359}.
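The sampling idea behind such uncertainty maps (shared by Monte-Carlo/Bayesian and ensemble methods) can be sketched in a few lines of numpy with a hypothetical stochastic model; `noisy_model` is a stand-in, not any network from the cited works. Repeated stochastic inferences yield a mean prediction and a per-pixel standard deviation that serves as the uncertainty map.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_model(x):
    """Stand-in for one stochastic network pass (e.g. one Monte-Carlo
    dropout sample): the true mapping 2x plus noise that grows with x."""
    return 2.0 * x + rng.normal(scale=0.1 * x)

x = np.linspace(0.1, 1.0, 100)            # toy 1-D "input image"
samples = np.stack([noisy_model(x) for _ in range(200)])
prediction = samples.mean(axis=0)          # point estimate
uncertainty = samples.std(axis=0)          # per-pixel confidence map
print(uncertainty[0], uncertainty[-1])     # uncertainty tracks the noise level
```

Regions where the stochastic passes disagree get large standard deviations, flagging pixels whose inferred values should not be trusted even though no ground truth is available.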
From electronic neural networks to optical neural networks
So far, the artificial neural networks discussed in this review mostly run on hardware with electrons as the physical carrier, such as graphics processing units, which are approaching their physical limits. Replacing electrons with photons is a potential route to high-speed, parallel, and low-power artificial intelligence computing, especially via optical neural networks^{360,361}. Among them, spatial-structure-based optical neural networks, represented by the diffractive deep neural network^{362}, are particularly suitable for image processing and computational imaging^{363,364,365}. Several examples have already demonstrated the potential of optical neural networks for phase recovery^{366,367,368}.
Inherent limitations of the hardware imaging system
In addition to considering how neural networks can better recover the phase from measured intensity maps, the capability of the hardware imaging system itself to detect and capture information is also essential, because even the most advanced deep learning techniques cannot recover information that the hardware imaging system has not recorded. In lensless systems, incorporating additional light-field modulation devices, such as coded layers, can raise otherwise imperceptible low- and high-frequency information to detectable levels^{49,50,51,52}. A potential research direction is to use deep learning to design coded-layer distributions that optimally account for information across all frequencies. For lens-based systems, the illumination strategy dictates the frequency content entering the effective numerical aperture. Hardware parameters, such as illumination patterns, can be integrated as trainable parameters within a PiN-based phase-recovery neural network, allowing joint optimization through training datasets^{369,370}.
Learning-based deep neural networks have enormous potential and efficiency, while conventional physics-based methods are more reliable. We thus encourage incorporating physical models into deep neural networks, especially those that model the real world well, rather than letting the deep neural network perform all tasks as a black box. A possible way is to thoroughly consider the network structure, the loss function, and priors from both the dataset and the physical model during the training stage to obtain an effective pre-trained neural network; in actual use, the pre-trained network can be employed for one-time inference in situations with demanding real-time requirements, or, alternatively, the physical model can be used to iteratively fine-tune the pre-trained network for higher accuracy.
References
Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory Of Propagation, Interference And Diffraction Of Light. 6th edn (Pergamon Press, 1980).
Shechtman, Y. et al. Phase retrieval with application to optical imaging: a contemporary overview. IEEE Signal Process. Mag. 32, 87–109 (2015).
Park, Y., Depeursinge, C. & Popescu, G. Quantitative phase imaging in biomedicine. Nat. Photonics 12, 578–589 (2018).
Miao, J. W. et al. Extending the methodology of Xray crystallography to allow imaging of micrometresized noncrystalline specimens. Nature 400, 342–344 (1999).
Tyson, R. K. & Frazier, B. W. Principles of Adaptive Optics. 5th edn (CRC Press, 2022).
Colomb, T. & Kühn, J. Digital holographic microscopy. in Optical Measurement of Surface Topography (ed. Leach, R.) 209–235 (Springer, 2011).
Klibanov, M. V., Sacks, P. E. & Tikhonravov, A. V. The phase retrieval problem. Inverse Probl. 11, 1–28 (1995).
Goodman, J. W. Introduction to Fourier Optics. 4th edn (W.H. Freeman, 2017).
Gabor, D. A new microscopic principle. Nature 161, 777–778 (1948).
Hartmann, J. Bermerkungen über den bau und die justierung von spektrographen. Z. Instrumentenkd 20, 47–58 (1900).
Shack, R. V. & Platt, B. C. Production and use of a lenticular Hartmann screen. J. Opt. Soc. Am. 61, 656–661 (1971).
Teague, M. R. Deterministic phase retrieval: a Green’s function solution. J. Opt. Soc. Am. 73, 1434–1441 (1983).
Zuo, C. et al. Transport of intensity equation: a tutorial. Opt. Lasers Eng. 135, 106187 (2020).
Gerchberg, R. W. & Saxton, W. O. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 35, 237–246 (1972).
Fienup, J. R. Phase retrieval algorithms: a comparison. Appl. Opt. 21, 2758–2769 (1982).
Fienup, J. R. Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3, 27–29 (1978).
Allen, L. J. & Oxley, M. P. Phase retrieval from series of images obtained by defocus variation. Opt. Commun. 199, 65–75 (2001).
Pedrini, G., Osten, W. & Zhang, Y. Wavefront reconstruction from a sequence of interferograms recorded at different planes. Opt. Lett. 30, 833–835 (2005).
Greenbaum, A. & Ozcan, A. Maskless imaging of dense samples using pixel superresolution based multiheight lensfree onchip microscopy. Opt. Express 20, 3129–3143 (2012).
Hoppe, W. & Strube, G. Beugung in inhomogenen Primärstrahlenwellenfeld. II. Lichtoptische Analogieversuche zur Phasenmessung von Gitterinterferenzen. Acta Crystallogr. Sect. A 25, 502–507 (1969).
Faulkner, H. M. L. & Rodenburg, J. M. Movable aperture lensless transmission microscopy: a novel phase retrieval algorithm. Phys. Rev. Lett. 93, 023903 (2004).
Rodenburg, J. M. & Faulkner, H. M. L. A phase retrieval algorithm for shifting illumination. Appl. Phys. Lett. 85, 4795–4797 (2004).
Zheng, G. A., Horstmeyer, R. & Yang, C. H. Widefield, highresolution Fourier ptychographic microscopy. Nat. Photonics 7, 739–745 (2013).
Zheng, G. A. et al. Concept, implementations and applications of Fourier ptychography. Nat. Rev. Phys. 3, 207–223 (2021).
Yamaguchi, I. & Zhang, T. Phaseshifting digital holography. Opt. Lett. 22, 1268–1270 (1997).
Huang, P. S. & Zhang, S. Fast threestep phaseshifting algorithm. Appl. Opt. 45, 5086–5091 (2006).
Wang, Z. et al. Spatial light interference microscopy (SLIM). Opt. Express 19, 1016–1026 (2011).
Leith, E. N. & Upatnieks, J. Reconstructed wavefronts and communication theory. J. Opt. Soc. Am. 52, 1123–1130 (1962).
Baek, Y. et al. Kramers–Kronig holographic imaging for highspacebandwidth product. Optica 6, 45–51 (2019).
Huang, Z. Z. & Cao, L. C. High bandwidth‐utilization digital holographic multiplexing: an approach using Kramers–Kronig relations. Adv. Photonics Res. 3, 2100273 (2022).
Luo, G. et al. Complex wave and phase retrieval from a single offaxis interferogram. J. Opt. Soc. Am. A 40, 85–95 (2023).
Kim, M. K. Principles and techniques of digital holographic microscopy. SPIE Rev. 1, 018005 (2010).
Arnison, M. R. et al. Linear phase imaging using differential interference contrast microscopy. J. Microsc. 214, 7–12 (2004).
Tian, L. & Waller, L. Quantitative differential phase contrast imaging in an LED array microscope. Opt. Express 23, 11394–11403 (2015).
Bon, P. et al. Quadriwave lateral shearing interferometry for quantitative phase microscopy of living cells. Opt. Express 17, 13080–13094 (2009).
Barbastathis, G. Quantitative phase retrieval. in Proceedings of 2010 International Symposium on Optomechatronic Technologies (IEEE, 2010).
Zhang, K. Y. J. & Main, P. Histogram matching as a new density modification technique for phase refinement and extension of protein molecules. Acta Crystallogr. Sect. A: Found. Crystallogr. 46, 41–46 (1990).
Elser, V. Solution of the crystallographic phase problem by iterated projections. Acta Crystallogr. Sect. A Found. Crystallogr. 59, 201–209 (2003).
Latychevskaia, T. & Fink, H.W. Solution to the twin image problem in holography. Phys. Rev. Lett. 98, 233901 (2007).
Moravec, M. L., Romberg, J. K. & Baraniuk, R. G. Compressive phase retrieval. in Proceedings of SPIE 6701, Wavelets XII. 670120 (SPIE, 2007).
Kostenko, A. et al. Phase retrieval in inline xray phase contrast imaging based on total variation minimization. Opt. Express 21, 710–723 (2013).
Gao, Y. H. & Cao, L. C. Iterative projection meets sparsity regularization: towards practical singleshot quantitative phase imaging with inline holography. Light Adv. Manuf. 4, 37–53 (2023).
Rivenson, Y. et al. Sparsitybased multiheight phase recovery in holographic microscopy. Sci. Rep. 6, 37862 (2016).
Song, P. M. et al. Synthetic aperture ptychography: coded sensor translation for joint spatialFourier bandwidth expansion. Photonics Res. 10, 1624–1632 (2022).
Candès, E. J., Li, X. D. & Soltanolkotabi, M. Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61, 1985–2007 (2015).
Wang, G., Giannakis, G. B. & Eldar, Y. C. Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans. Inf. Theory 64, 773–794 (2018).
Candès, E. J., Strohmer, T. & Voroninski, V. PhaseLift: exact and stable signal recovery from magnitude measurements via convex programming. Commun. Pure Appl. Math. 66, 1241–1274 (2013).
Wang, T. B. et al. Optical ptychography for biomedical imaging: recent progress and future directions. Biomed. Opt. Express 14, 489–532 (2023).
Jiang, S. W. et al. Resolutionenhanced parallel coded ptychography for highthroughput optical imaging. ACS Photonics 8, 3261–3271 (2021).
Jiang, S. W. et al. Bloodcoated sensor for highthroughput ptychographic cytometry on a Bluray disc. ACS Sens. 7, 1058–1067 (2022).
Jiang, S. W. et al. Spatial and Fourierdomain ptychography for highthroughput bioimaging. Nat. Protoc. 18, 2051–2083 (2023).
Guo, C. F. et al. Quantitative multiheight phase retrieval via a coded image sensor. Biomed. Opt. Express 12, 7173–7184 (2021).
Zuo, C. et al. Highresolution transportofintensity quantitative phase microscopy with annular illumination. Sci. Rep. 7, 7654 (2017).
Li, J. et al. Optimal illumination pattern for transportofintensity quantitative phase microscopy. Opt. Express 26, 27599 (2018).
Sun, J. S. et al. Highspeed Fourier ptychographic microscopy based on programmable annular illuminations. Sci. Rep. 8, 7669 (2018).
Fan, Y. et al. Optimal illumination scheme for isotropic quantitative differential phase contrast microscopy. Photonics Res. 7, 890–904 (2019).
Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).
Jin, K. H. et al. Deep convolutional neural network for inverse problems in imaging. IEEE Trans. Image Process. 26, 4509–4522 (2017).
Barbastathis, G., Ozcan, A. & Situ, G. H. On the use of deep learning for computational imaging. Optica 6, 921–943 (2019).
Leijnen, S. & van Veen, F. The neural network zoo. Proceedings 47, 9 (2020).
LeCun, Y. et al. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 541–551 (1989).
Rivenson, Y., Wu, Y. C. & Ozcan, A. Deep learning in holography and coherent imaging. Light Sci. Appl. 8, 85 (2019).
Zeng, T. J., Zhu, Y. M. & Lam, E. Y. Deep learning for digital holography: a review. Opt. Express 29, 40572–40593 (2021).
Situ, G. H. Deep holography. Light Adv. Manuf. 3, 278–300 (2022).
Zhou, L. F. et al. Artificial intelligence in interferometric synthetic aperture radar phase unwrapping: a review. IEEE Geosci. Remote Sens. Mag. 9, 10–28 (2021).
Wang, K. Q. et al. Deep learning spatial phase unwrapping: a comparative review. Adv. Photonics Nexus 1, 014001 (2022).
Dong, J. et al. Phase retrieval: from computational imaging to machine learning: a tutorial. IEEE Signal Process. Mag. 40, 45–57 (2023).
Park, J. et al. Artificial intelligenceenabled quantitative phase imaging methods for life sciences. Nat. Methods 20, 1645–1660 (2023).
Bishara, W. et al. Lensfree onchip microscopy over a wide fieldofview using pixel superresolution. Opt. Express 18, 11181–1191 (2010).
Luo, Z. X. et al. Pixel superresolution for lensfree holographic microscopy using deep learning neural networks. Opt. Express 27, 13581–13595 (2019).
Dong, C. et al. Image superresolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38, 295–307 (2016).
Rivenson, Y. et al. Deep learning microscopy. Optica 4, 1437–1443 (2017).
Wang, H. D. et al. Deep learning enables crossmodality superresolution in fluorescence microscopy. Nat. Methods 16, 103–110 (2019).
Byeon, H., Go, T. & Lee, S. J. Deep learningbased digital inline holographic microscopy for high resolution with extended field of view. Opt. Laser Technol. 113, 77–86 (2019).
Xin, L. et al. Threedimensional reconstruction of superresolved whitelight interferograms based on deep learning. Opt. Lasers Eng. 145, 106663 (2021).
Ren, Z. B., So, H. K. H. & Lam, E. Y. Fringe Pattern Improvement and SuperResolution Using Deep Learning in Digital Holography. IEEE Trans. Ind. Inform. 15, 6179–6186 (2019).
Wang, Z., Simoncelli, E. P. & Bovik, A. C. Multiscale structural similarity for image quality assessment. in Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers. 1398–1402 (IEEE, 2003).
Kemao, Q. Windowed Fourier transform for fringe pattern analysis. Appl. Opt. 43, 2695–2702 (2004).
Zhang, K. et al. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26, 3142–3155 (2017).
Yan, K. T. et al. Fringe pattern denoising based on deep learning. Opt. Commun. 437, 148–152 (2019).
Lin, B. W. et al. Optical fringe patterns filtering based on multistage convolution neural network. Opt. Lasers Eng. 126, 105853 (2020).
Zhang, K., Zuo, W. M. & Zhang, L. FFDNet: toward a fast and flexible solution for CNNbased image denoising. IEEE Trans. Image Process. 27, 4608–4622 (2018).
Hao, F. G. et al. Batch denoising of ESPI fringe patterns based on convolutional neural network. Appl. Opt. 58, 3338–3346 (2019).
Zhou, W. J. et al. Speckle noise reduction in digital holograms based on Spectral Convolutional Neural Networks (SCNN). in Proceedings of SPIE 11188, Holography, Diffractive Optics, and Applications IX (SPIE, 2019).
Zhou, W. J. et al. A deep learning approach for digital hologram speckle noise reduction. in Proceedings of the Imaging and Applied Optics Congress (Optica Publishing Group, 2020).
ReyesFigueroa, A., Flores, V. H. & Rivera, M. Deep neural network for fringe pattern filtering and normalization. Appl. Opt. 60, 2022–2036 (2021).
GurrolaRamos, J., Dalmau, O. & Alarcón, T. UNet based neural network for fringe pattern denoising. Opt. Lasers Eng. 149, 106829 (2022).
Zhang, Q. N. et al. Deep phase shifter for quantitative phase imaging. Preprint at https://doi.org/10.48550/arXiv.2003.03027 (2020).
Zhang, Q. N. et al. Phaseshifting interferometry from single frame inline interferogram using deep learning phaseshifting technology. Opt. Commun. 498, 127226 (2021).
Wang, K. Q. et al. YNet: a onetotwo deep learning framework for digital holographic reconstruction. Opt. Lett. 44, 4765–4768 (2019).
Yan, K. T. et al. Virtual temporal phaseshifting phase extraction using generative adversarial networks. Appl. Opt. 61, 2525–2535 (2022).
Zhao, Y., Hu, K. & Liu, F. W. Oneshot phase retrieval method for interferometry using a multistage phaseshifting network. IEEE Photonics Technol. Lett. 35, 577–580 (2023).
Zamir, S. W. et al. MultiStage Progressive Image Restoration. in Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition 14821–14831 (IEEE, 2021).
Huang, T. et al. Singleshot Fresnel incoherent correlation holography via deep learning based phaseshifting technology. Opt. Express 31, 12349–12356 (2023).
Wu, B. et al. RSAGAN: Rapid selfattention generative adversarial nets for singleshot phaseshifting interferometry. Opt. Lasers Eng. 168, 107672 (2023).
Luo, H. et al. DiffractionNet: a robust singleshot holography for multidistance lensless imaging. Opt. Express 30, 41724–41740 (2022).
Li, J. S. et al. Quantitative phase imaging in dualwavelength interferometry using a single wavelength illumination and deep learning. Opt. Express 28, 28140–28153 (2020).
Li, J. S. et al. Hybridnet: a twotoone deep learning framework for threewavelength phaseshifting interferometry. Opt. Express 29, 34656–34670 (2021).
Xu, X. Q. et al. Dualwavelength interferogram decoupling method for threeframe generalized dualwavelength phaseshifting interferometry based on deep learning. J. Opt. Soc. Am. A 38, 321–327 (2021).
Pitkäaho, T., Manninen, A. & Naughton, T. J. Performance of autofocus capability of deep convolutional neural networks in digital holographic microscopy. in Proceedings of the Digital Holography and ThreeDimensional Imaging (Optica Publishing Group, 2017).
Ren, Z. B., Xu, Z. M. & Lam, E. Y. Autofocusing in digital holography using deep learning. in Proceedings of SPIE 10499, ThreeDimensional and Multidimensional Microscopy: Image Acquisition and Processing XXV (SPIE, 2018).
Son, K. C. et al. Autofocusing algorithm for a digital holographic imaging system using convolutional neural networks. Jpn. J. Appl. Phys. 57, 09SB02 (2018).
Couturier, R. et al. Using deep learning for object distance prediction in digital holography. in Proceedings of 2021 International Conference on Computer, Control and Robotics 231–235 (IEEE, 2021).
Ren, Z. B., Xu, Z. M. & Lam, E. Y. Learningbased nonparametric autofocusing for digital holography. Optica 5, 337–344 (2018).
Pitkäaho, T., Manninen, A. & Naughton, T. J. Focus prediction in digital holographic microscopy using deep convolutional neural networks. Appl. Opt. 58, A202–A208 (2019).
Jaferzadeh, K. et al. Nosearch focus prediction at the single cell level in digital holographic imaging with deep convolutional neural network. Biomed. Opt. Express 10, 4276–4289 (2019).
Moon, I. & Jaferzadeh, K. Automated digital holographic image reconstruction with deep convolutional neural networks. in Proceedings of SPIE 11402, ThreeDimensional Imaging, Visualization, and Display 2020 (SPIE, 2020).
Cuenat, S. et al. Fast autofocusing using tiny transformer networks for digital holographic microscopy. Opt. Express 30, 24730–24746 (2022).
Lee, J. Autofocusing using deep learning in offaxis digital holography. in Proceedings of the Imaging and Applied Optics 2018 (Optica Publishing Group, 2018).
Shimobaba, T., Kakue, T. & Ito, T. Convolutional neural networkbased regression for depth prediction in digital holography. in Proceedings of the 27th International Symposium on Industrial Electronics 1323–1326 (IEEE, 2018).
Tang, J. et al. Singleshot diffraction autofocusing: distance prediction via an untrained physicsenhanced network. IEEE Photonics J. 14, 5207106 (2022).
Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. in Proceedings of the 9th International Conference on Learning Representations (OpenReview.net, 2021).
Oh, S. et al. Fast focus estimation using frequency analysis in digital holography. Opt. Express 22, 28926–28933 (2014).
Sinha, A. et al. Lensless computational imaging through deep learning. Optica 4, 1117–1125 (2017).
Li, S. & Barbastathis, G. Spectral premodulation of training examples enhances the spatial resolution of the phase extraction neural network (PhENN). Opt. Express 26, 29340–29352 (2018).
Neto, A. M. et al. Image processing using Pearson's correlation coefficient: applications on autonomous robotics. in Proceedings of 2013 13th International Conference on Autonomous Robot Systems (IEEE, 2013).
Deng, M. et al. On the interplay between physical and content priors in deep learning for computational imaging. Opt. Express 28, 24152–24170 (2020).
Goy, A. et al. Low photon count phase retrieval using deep learning. Phys. Rev. Lett. 121, 243902 (2018).
Wang, H., Lyu, M. & Situ, G. H. eHoloNet: a learningbased endtoend approach for inline digital holographic reconstruction. Opt. Express 26, 22603–22614 (2018).
Nguyen, T. et al. Deep learning approach for Fourier ptychography microscopy. Opt. Express 26, 26470–26484 (2018).
Cheng, Y. F. et al. Illumination pattern design with deep learning for singleshot Fourier ptychographic microscopy. Opt. Express 27, 644–656 (2019).
Cherukara, M. J., Nashed, Y. S. G. & Harder, R. J. Realtime coherent diffraction inversion using deep generative networks. Sci. Rep. 8, 16520 (2018).
Ren, Z. B., Xu, Z. M. & Lam, E. Y. Endtoend deep learning framework for digital holographic reconstruction. Adv. Photonics 1, 016004 (2019).
Yin, D. et al. Digital holographic reconstruction based on deep learning framework with unpaired data. IEEE Photonics J. 12, 3900312 (2020).
Lee, C. et al. Deep learning based on parameterized physical forward model for adaptive holographic imaging with unpaired data. Nat. Mach. Intell. 5, 35–45 (2023).
Hu, L. J. et al. Deep learning assisted Shack–Hartmann wavefront sensor for direct wavefront detection. Opt. Lett. 45, 3741–3744 (2020).
Wang, K. Q. et al. Transport of intensity equation from a single intensity image via deep learning. Opt. Lasers Eng. 134, 106233 (2020).
Zhou, J. et al. Deep learning‐enabled pixel‐super‐resolved quantitative phase microscopy from single‐shot aliased intensity measurement. Laser Photon. Rev. 2300488 (2023).
Pirone, D. et al. Speeding up reconstruction of 3D tomograms in holographic flow cytometry via deep learning. Lab Chip 22, 793–804 (2022).
Chang, D. J. et al. Deeplearning electron diffractive imaging. Phys. Rev. Lett. 130, 016101 (2023).
Tayal, K. et al. Inverse problems, deep learning, and symmetry breaking. Preprint at https://doi.org/10.48550/arXiv.2003.09077 (2020).
Xue, Y. et al. Reliable deeplearningbased phase imaging with uncertainty quantification. Optica 6, 618–629 (2019).
Li, X. et al. Quantitative phase imaging via a cGAN network with dual intensity images captured under centrosymmetric illumination. Opt. Lett. 44, 2879–2882 (2019).
Wang, K. Q. et al. Y4Net: a deep learning solution to oneshot dualwavelength digital holographic reconstruction. Opt. Lett. 45, 4220–4223 (2020).
Zeng, T. J., So, H. K. H. & Lam, E. Y. RedCap: residual encoderdecoder capsule network for holographic image reconstruction. Opt. Express 28, 4876–4887 (2020).
Wu, L. L. et al. Complex imaging of phase domains by deep neural networks. IUCrJ 8, 12–21 (2021).
Huang, L. Z. et al. Holographic image reconstruction with phase recovery and autofocusing using recurrent neural networks. ACS Photonics 8, 1763–1774 (2021).
Uelwer, T., Hoffmann, T. & Harmeling, S. Noniterative phase retrieval with cascaded neural networks. in Proceedings of the 30th International Conference on Artificial Neural Networks (Springer, 2021).
Castaneda, R., Trujillo, C. & Doblas, A. Videorate quantitative phase imaging using a digital holographic microscope and a generative adversarial network. Sensors 21, 8021 (2021).
Jaferzadeh, K. & Fevens, T. HoloPhaseNet: fully automated deeplearningbased hologram reconstruction using a conditional generative adversarial model. Biomed. Opt. Express 13, 4032–4046 (2022).
Luo, W. et al. Learning endtoend phase retrieval using only one interferogram with mixedcontext network. in Proceedings of SPIE 11970, Quantitative Phase Imaging VIII (SPIE, 2022).
Ding, H. et al. ContransGAN: convolutional neural network coupling global swintransformer network for highresolution quantitative phase imaging with unpaired data. Cells 11, 2394 (2022).
Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. in Proceedings of 2021 IEEE/CVF International Conference on Computer Vision (IEEE, 2021).
Ye, Q. L., Wang, L. W. & Lun, D. P. K. SiSPRNet: endtoend learning for singleshot phase retrieval. Opt. Express 30, 31937–31958 (2022).
Chen, H. L. et al. Fourier Imager Network (FIN): a deep neural network for hologram reconstruction with superior external generalization. Light Sci. Appl. 11, 254 (2022).
Chen, H. L. et al. eFIN: enhanced Fourier imager network for generalizable autofocusing and pixel superresolution in holographic imaging. IEEE J. Sel. Top. Quantum Electron. 29, 6800810 (2023).
Shu, X. et al. NASPRNet: neural architecture search generated phase retrieval net for offaxis quantitative phase imaging. Preprint at https://doi.org/10.48550/arXiv.2210.14231 (2022).
Wang, K. Q. et al. Deep learning wavefront sensing and aberration correction in atmospheric turbulence. PhotoniX 2, 8 (2021).
Boominathan, L. et al. Phase retrieval for Fourier Ptychography under varying amount of measurements. in Proceedings of the British Machine Vision Conference 2018 (BMVA Press, 2018).
Wang, F. et al. Phase imaging with an untrained neural network. Light Sci. Appl. 9, 77 (2020).
Zhang, X. Y., Wang, F. & Situ, G. H. BlindNet: an untrained learning approach toward computational imaging with model uncertainty. J. Phys. D: Appl. Phys. 55, 034001 (2022).
Yang, D. Y. et al. Coherent modulation imaging using a physics-driven neural network. Opt. Express 30, 35647–35662 (2022).
Yang, D. Y. et al. Dynamic coherent diffractive imaging with a physics-driven untrained learning method. Opt. Express 29, 31426–31442 (2021).
Bai, C. et al. Dual-wavelength in-line digital holography with untrained deep neural networks. Photonics Res. 9, 2501 (2021).
Galande, A. S. et al. Untrained deep network powered with explicit denoiser for phase recovery in in-line holography. Appl. Phys. Lett. 122, 133701 (2023).
Li, H. Y. et al. Deep DIH: single-shot digital in-line holography reconstruction by deep learning. IEEE Access 8, 202648–202659 (2020).
Zhang, J. L. et al. The integration of neural network and physical reconstruction model for Fourier ptychographic microscopy. Opt. Commun. 504, 127470 (2022).
Chen, X. W. et al. DH-GAN: a physics-driven untrained generative adversarial network for holographic imaging. Opt. Express 31, 10114–10135 (2023).
Yao, Y. D. et al. AutoPhaseNN: unsupervised physics-aware deep learning of 3D nanoscale Bragg coherent diffraction imaging. npj Comput. Mater. 8, 124 (2022).
Li, R. J. et al. Physics-enhanced neural network for phase retrieval from two diffraction patterns. Opt. Express 30, 32680–32692 (2022).
Bouchama, L. et al. A physics-inspired deep learning framework for an efficient Fourier ptychographic microscopy reconstruction under low overlap conditions. Sensors 23, 6829 (2023).
Huang, L. Z. et al. Self-supervised learning of hologram reconstruction using physics consistency. Nat. Mach. Intell. 5, 895–907 (2023).
Wu, J. C. et al. High-speed computer-generated holography using an autoencoder-based deep neural network. Opt. Lett. 46, 2908–2911 (2021).
Liu, K. X. et al. 4K-DMDNet: diffraction model-driven network for 4K computer-generated holography. Opto-Electron. Adv. 6, 220135 (2023).
Rivenson, Y. et al. Phase recovery and holographic image reconstruction using deep learning in neural networks. Light Sci. Appl. 7, 17141 (2018).
Wu, Y. C. et al. Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery. Optica 5, 704–710 (2018).
Wang, W. et al. Shape inpainting using 3D generative adversarial network and recurrent convolutional networks. in Proceedings of 2017 IEEE International Conference on Computer Vision 2317–2325 (IEEE, 2017).
Deng, M. et al. Probing shallower: perceptual loss trained Phase Extraction Neural Network (PLT-PhENN) for artifact-free reconstruction at low photon budget. Opt. Express 28, 2511–2535 (2020).
Deng, M. et al. Learning to synthesize: robust phase retrieval at low photon counts. Light Sci. Appl. 9, 36 (2020).
Kang, I., Zhang, F. C. & Barbastathis, G. Phase extraction neural network (PhENN) with coherent modulation imaging (CMI) for phase retrieval at low photon counts. Opt. Express 28, 21578–21600 (2020).
Zhang, J. Z. et al. Fourier ptychographic microscopy reconstruction with multiscale deep residual network. Opt. Express 27, 8612–8625 (2019).
Moon, I. et al. Noise-free quantitative phase imaging in Gabor holography with conditional generative adversarial network. Opt. Express 28, 26284–26301 (2020).
Romano, Y., Elad, M. & Milanfar, P. The little engine that could: regularization by denoising (RED). SIAM J. Imaging Sci. 10, 1804–1844 (2017).
Metzler, C. A. et al. prDeep: robust phase retrieval with a flexible deep network. in Proceedings of the 35th International Conference on Machine Learning 3498–3507 (PMLR, 2018).
Goldstein, T., Studer, C. & Baraniuk, R. A field guide to forward-backward splitting with a FASTA implementation. Preprint at https://doi.org/10.48550/arXiv.1411.3406 (2014).
Wu, Z. H. et al. Online regularization by denoising with applications to phase retrieval. in Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop 3887–3895 (IEEE, 2019).
Bai, C. et al. Robust contrast-transfer-function phase retrieval via flexible deep learning networks. Opt. Lett. 44, 5141–5144 (2019).
Wang, Y. T., Sun, X. H. & Fleischer, J. W. When deep denoising meets iterative phase retrieval. in Proceedings of the 37th International Conference on Machine Learning 10007–10017 (PMLR, 2020).
Chang, X. Y., Bian, L. H. & Zhang, J. Large-scale phase retrieval. eLight 1, 4 (2021).
Işıl, Ç., Oktem, F. S. & Koç, A. Deep iterative reconstruction for phase retrieval. Appl. Opt. 58, 5422–5431 (2019).
Kumar, S. Phase retrieval with physics informed zero-shot network. Opt. Lett. 46, 5942–5945 (2021).
Ulyanov, D., Vedaldi, A. & Lempitsky, V. Deep image prior. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 9446–9454 (IEEE, 2018).
Heckel, R. & Hand, P. Deep decoder: concise image representations from untrained non-convolutional networks. Preprint at https://doi.org/10.48550/arXiv.1810.03982 (2018).
Jagatap, G. & Hegde, C. Phase retrieval using untrained neural network priors. Workshop on solving inverse problems with deep networks. in Proceedings of the 33rd Conference on Neural Information Processing Systems (OpenReview.net, 2019).
Jagatap, G. & Hegde, C. Algorithmic guarantees for inverse imaging with untrained network priors. in Proceedings of the 33rd Conference on Neural Information Processing Systems (Curran Associates Inc., 2019).
Zhou, K. C. & Horstmeyer, R. Diffraction tomography with a deep image prior. Opt. Express 28, 12872–12896 (2020).
Shamshad, F., Hanif, A. & Ahmed, A. Subsampled Fourier ptychography using pretrained invertible and untrained network priors. Preprint at https://doi.org/10.48550/arXiv.2005.07026 (2020).
Bostan, E. et al. Deep phase decoder: self-calibrating phase microscopy with an untrained deep neural network. Optica 7, 559–562 (2020).
Lawrence, H. et al. Phase retrieval with holography and untrained priors: tackling the challenges of low-photon nanoscale imaging. in Proceedings of the Mathematical and Scientific Machine Learning 516–567 (PMLR, 2021).
Niknam, F., Qazvini, H. & Latifi, H. Holographic optical field recovery using a regularized untrained deep decoder network. Sci. Rep. 11, 10903 (2021).
Ma, L. Y. et al. ADMM based Fourier phase retrieval with untrained generative prior. Preprint at https://doi.org/10.48550/arXiv.2210.12646 (2022).
Chen, Q., Huang, D. L. & Chen, R. Fourier ptychographic microscopy with untrained deep neural network priors. Opt. Express 30, 39597–39612 (2022).
Hand, P., Leong, O. & Voroninski, V. Phase retrieval under a generative prior. in Proceedings of the 32nd International Conference on Neural Information Processing Systems 9154–9164 (Curran Associates Inc., 2018).
Shamshad, F. & Ahmed, A. Robust compressive phase retrieval via deep generative priors. Preprint at https://doi.org/10.48550/arXiv.1808.05854 (2018).
Shamshad, F., Abbas, F. & Ahmed, A. Deep Ptych: subsampled Fourier ptychography using generative priors. in Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing 7720–7724 (IEEE, 2019).
Hyder, R. et al. Alternating phase projected gradient descent with generative priors for solving compressive phase retrieval. in Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing 7705–7709 (IEEE, 2019).
Shamshad, F. & Ahmed, A. Compressed sensing-based robust phase retrieval via deep generative priors. IEEE Sens. J. 21, 2286–2298 (2021).
Uelwer, T., Konietzny, S. & Harmeling, S. Optimizing intermediate representations of generative models for phase retrieval. Preprint at https://doi.org/10.48550/arXiv.2205.15617 (2022).
Gregor, K. & LeCun, Y. Learning fast approximations of sparse coding. in Proceedings of the 27th International Conference on International Conference on Machine Learning 399–406 (Omnipress, 2010).
Wang, C. J. et al. Phase retrieval with learning unfolded expectation consistent signal recovery algorithm. IEEE Signal Process. Lett. 27, 780–784 (2020).
Naimipour, N., Khobahi, S. & Soltanalian, M. UPR: a model-driven architecture for deep phase retrieval. in Proceedings of the 54th Asilomar Conference on Signals, Systems, and Computers 205–209 (IEEE, 2020).
Naimipour, N., Khobahi, S. & Soltanalian, M. Unfolded algorithms for deep phase retrieval. Preprint at https://doi.org/10.48550/arXiv.2012.11102 (2020).
Zhang, F. L. et al. Physics-based iterative projection complex neural network for phase retrieval in lensless microscopy imaging. in Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition 10518–10526 (IEEE, 2021).
Shi, B. S. & Lian, Q. S. DualPRNet: deep shrinkage dual frame network for deep unrolled phase retrieval. IEEE Signal Process. Lett. 29, 1177–1181 (2022).
Wu, X. F. et al. Physics-informed neural network for phase imaging based on transport of intensity equation. Opt. Express 30, 43398–43416 (2022).
Yang, Y. C. et al. HIONet: deep priors based deep unfolded network for phase retrieval. Digit. Signal Process. 132, 103797 (2023).
Karniadakis, G. E. et al. Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440 (2021).
Jeon, W. et al. Speckle noise reduction for digital holographic images using multiscale convolutional neural networks. Opt. Lett. 43, 4240–4243 (2018).
Choi, G. et al. Cycle-consistent deep learning approach to coherent noise reduction in optical diffraction tomography. Opt. Express 27, 4927–4943 (2019).
Zhang, J. C. et al. Phase unwrapping in optical metrology via denoised and convolutional segmentation networks. Opt. Express 27, 14903–14912 (2019).
Yan, K. T. et al. Wrapped phase denoising using convolutional neural networks. Opt. Lasers Eng. 128, 105999 (2020).
Yan, K. T. et al. Deep learning-based wrapped phase denoising method for application in digital holographic speckle pattern interferometry. Appl. Sci. 10, 4044 (2020).
Montresor, S. et al. Computational denoising based on deep learning for phase data in digital holographic interferometry. APL Photonics 5, 030802 (2020).
Tahon, M., Montresor, S. & Picart, P. Towards reduced CNNs for denoising phase images corrupted with speckle noise. Photonics 8, 255 (2021).
Tahon, M., Montrésor, S. & Picart, P. Deep learning network for speckle denoising in severe conditions. J. Imaging 8, 165 (2022).
Fang, Q. et al. Speckle denoising based on deep learning via a conditional generative adversarial network in digital holographic interferometry. Opt. Express 30, 20666–20683 (2022).
Murdaca, G., Rucci, A. & Prati, C. Deep learning for InSAR phase filtering: an optimized framework for phase unwrapping. Remote Sens. 14, 4956 (2022).
Yu, H. W. et al. Phase unwrapping in InSAR: a review. IEEE Geosci. Remote Sens. Mag. 7, 40–58 (2019).
Tang, J. et al. Coherent noise suppression of single-shot digital holographic phase via an untrained self-supervised network. Front. Photonics 3, 907847 (2022).
Liu, T. et al. Deep learning-based super-resolution in coherent imaging systems. Sci. Rep. 9, 3926 (2019).
Jiao, Y. H. et al. Computational interference microscopy enabled by deep learning. APL Photonics 6, 046103 (2021).
Popescu, G. et al. Diffraction phase microscopy for quantifying cell structure and dynamics. Opt. Lett. 31, 775–777 (2006).
Butola, A. et al. High space-bandwidth in quantitative phase imaging using partially spatially coherent digital holographic microscopy and a deep neural network. Opt. Express 28, 36229–36244 (2020).
Meng, Z. et al. DL-SI-DHM: a deep network generating the high-resolution phase and amplitude images from wide-field images. Opt. Express 29, 19247–19261 (2021).
Gao, P., Pedrini, G. & Osten, W. Structured illumination for resolution enhancement and autofocusing in digital holographic microscopy. Opt. Lett. 38, 1328–1330 (2013).
Li, A. C. et al. Patch-based U-Net model for isotropic quantitative differential phase contrast imaging. IEEE Trans. Med. Imaging 40, 3229–3237 (2021).
Gupta, R. K. et al. High throughput hemogram of T cells using digital holographic microscopy and deep learning. Opt. Contin. 2, 670–682 (2023).
Lim, J., Ayoub, A. B. & Psaltis, D. Three-dimensional tomography of red blood cells using deep learning. Adv. Photonics 2, 026001 (2020).
Ryu, D. et al. DeepRegularizer: rapid resolution enhancement of tomographic imaging using deep learning. IEEE Trans. Med. Imaging 40, 1508–1518 (2021).
Ferraro, P. et al. Compensation of the inherent wave front curvature in digital holographic coherent microscopy for quantitative phase-contrast imaging. Appl. Opt. 42, 1938–1946 (2003).
Colomb, T. et al. Total aberrations compensation in digital holographic microscopy with a reference conjugated hologram. Opt. Express 14, 4300–4306 (2006).
Miccio, L. et al. Direct full compensation of the aberrations in quantitative phase microscopy of thin objects by a single digital hologram. Appl. Phys. Lett. 90, 041104 (2007).
Zuo, C. et al. Phase aberration compensation in digital holographic microscopy based on principal component analysis. Opt. Lett. 38, 1724–1726 (2013).
Nguyen, T. et al. Automatic phase aberration compensation for digital holographic microscopy based on deep learning background detection. Opt. Express 25, 15043–15057 (2017).
Ma, S. J. et al. Phase-aberration compensation via deep learning in digital holographic microscopy. Meas. Sci. Technol. 32, 105203 (2021).
Lin, L. C. et al. Deep learning-assisted wavefront correction with sparse data for holographic tomography. Opt. Lasers Eng. 154, 107010 (2022).
Xiao, W. et al. Sensing morphogenesis of bone cells under microfluidic shear stress by holographic microscopy and automatic aberration compensation with deep learning. Lab Chip 21, 1385–1394 (2021).
Zhang, G. et al. Fast phase retrieval in off-axis digital holographic microscopy through deep learning. Opt. Express 26, 19388–19405 (2018).
Tang, J. et al. Phase aberration compensation via a self-supervised sparse constraint network in digital holographic microscopy. Opt. Lasers Eng. 168, 107671 (2023).
Jenkinson, M. Fast, automated, N-dimensional phase-unwrapping algorithm. Magn. Reson. Med. 49, 193–197 (2003).
Su, X. Y. & Chen, W. J. Fourier transform profilometry: a review. Opt. Lasers Eng. 35, 263–284 (2001).
Ghiglia, D. C. & Pritt, M. D. Two-Dimensional Phase Unwrapping: Theory, Algorithms, and Software (Wiley, 1998).
Dardikman, G. & Shaked, N. T. Phase unwrapping using residual neural networks. in Proceedings of the Imaging and Applied Optics 2018 (Optica Publishing Group, 2018).
Dardikman, G., Turko, N. A. & Shaked, N. T. Deep learning approaches for unwrapping phase images with steep spatial gradients: a simulation. in Proceedings of 2018 IEEE International Conference on the Science of Electrical Engineering in Israel 1–4 (IEEE, 2018).
Wang, K. Q. et al. One-step robust deep learning phase unwrapping. Opt. Express 27, 15100–15115 (2019).
He, J. J. et al. Deep spatiotemporal phase unwrapping of phase-contrast MRI data. in Proceedings of the 27th ISMRM Annual Meeting & Exhibition (ISMRM, 2019).
Ryu, K. et al. Development of a deep learning method for phase unwrapping MR images. in Proceedings of the 27th ISMRM Annual Meeting & Exhibition (ISMRM, 2019).
Dardikman, G. et al. PhUn-Net: ready-to-use neural network for unwrapping quantitative phase images of biological cells. Biomed. Opt. Express 11, 1107–1121 (2020).
Qin, Y. et al. Direct and accurate phase unwrapping with deep neural network. Appl. Opt. 59, 7258–7267 (2020).
Perera, M. V. & De Silva, A. A joint convolutional and spatial quad-directional LSTM network for phase unwrapping. in Proceedings of 2021 IEEE International Conference on Acoustics, Speech and Signal Processing 4055–4059 (IEEE, 2021).
Park, S., Kim, Y. & Moon, I. Automated phase unwrapping in digital holography with deep learning. Biomed. Opt. Express 12, 7064–7081 (2021).
Zhou, H. Y. et al. The PHU-Net: a robust phase unwrapping method for MRI based on deep learning. Magn. Reson. Med. 86, 3321–3333 (2021).
Xu, M. et al. PUMNet for phase unwrapping with speckle reduction and structure protection in ESPI. Opt. Lasers Eng. 151, 106824 (2022).
Zhou, L. F. et al. PU-GAN: a one-step 2D InSAR phase unwrapping based on conditional generative adversarial network. IEEE Trans. Geosci. Remote Sens. 60, 5221510 (2022).
Xie, X. M. et al. Deep learning phase-unwrapping method based on adaptive noise evaluation. Appl. Opt. 61, 6861–6870 (2022).
Zhao, J. X. et al. VDE-Net: a two-stage deep learning method for phase unwrapping. Opt. Express 30, 39794–39815 (2022).
Liang, R. G. et al. Phase unwrapping using segmentation. U.S. Provisional Patent Application No. 62/768,624 (2018).
Spoorthi, G. E., Gorthi, S. & Gorthi, R. K. S. S. PhaseNet: a deep convolutional neural network for two-dimensional phase unwrapping. IEEE Signal Process. Lett. 26, 54–58 (2019).
Spoorthi, G. E., Sai Subrahmanyam Gorthi, R. K. & Gorthi, S. PhaseNet 2.0: phase unwrapping of noisy data based on deep learning approach. IEEE Trans. Image Process. 29, 4862–4872 (2020).
Liang, R. G. et al. Phase unwrapping using segmentation. PCT patent. WO2020102814A1 (2020).
Zhang, T. et al. Rapid and robust two-dimensional phase unwrapping via deep learning. Opt. Express 27, 23173–23185 (2019).
Zhu, S. T. et al. Phase unwrapping in ICF target interferometric measurement via deep learning. Appl. Opt. 60, 10–19 (2021).
Wu, C. C. et al. Phase unwrapping based on a residual en-decoder network for phase images in Fourier domain Doppler optical coherence tomography. Biomed. Opt. Express 11, 1760–1771 (2020).
Zhao, Z. et al. Phase unwrapping method for point diffraction interferometer based on residual auto encoder neural network. Opt. Lasers Eng. 138, 106405 (2021).
Vengala, K. S., Paluru, N. & Subrahmanyam Gorthi, R. K. S. 3D deformation measurement in digital holographic interferometry using a multi-task deep learning architecture. J. Opt. Soc. Am. A 39, 167–176 (2022).
Krishna, S., Ravi, V. & Gorthi, R. K. A multi-task learning for 2D phase unwrapping in fringe projection. IEEE Signal Process. Lett. 29, 797–801 (2022).
Zhang, J. K. & Li, Q. G. EESANet: edge-enhanced self-attention network for two-dimensional phase unwrapping. Opt. Express 30, 10470–10490 (2022).
Sica, F. et al. A CNN-based coherence-driven approach for InSAR phase unwrapping. IEEE Geosci. Remote Sens. Lett. 19, 4003705 (2022).
Li, L. T. et al. InSAR phase unwrapping by deep learning based on gradient information fusion. IEEE Geosci. Remote Sens. Lett. 19, 4502305 (2022).
Zhou, L. F., Yu, H. & Lan, Y. Deep convolutional neural network-based robust phase gradient estimation for two-dimensional phase unwrapping using SAR interferograms. IEEE Trans. Geosci. Remote Sens. 58, 4653–4665 (2020).
Wang, H. et al. A novel quality-guided two-dimensional InSAR phase unwrapping method via GAUNet. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 14, 7840–7856 (2021).
Wu, Z. P. et al. A new phase unwrapping method combining minimum cost flow with deep learning. in Proceedings of 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS 3177–3180 (IEEE, 2021).
Wu, Z. P. et al. Deep-learning-based phase discontinuity prediction for 2D phase unwrapping of SAR interferograms. IEEE Trans. Geosci. Remote Sens. 60, 5216516 (2022).
Zhou, L. F. et al. Deep learning-based branch-cut method for InSAR two-dimensional phase unwrapping. IEEE Trans. Geosci. Remote Sens. 60, 5209615 (2022).
Tan, M. & Le, Q. EfficientNet: rethinking model scaling for convolutional neural networks. in Proceedings of the 36th International Conference on Machine Learning 6105–6114 (PMLR, 2019).
Vithin, A. V. S., Vishnoi, A. & Gannavarpu, R. Phase derivative estimation in digital holographic interferometry using a deep learning approach. Appl. Opt. 61, 3061–3069 (2022).
Satya Vithin, A. V., Ramaiah, J. & Gannavarpu, R. Deep learning based single shot multiple phase derivative retrieval method in multi-wave digital holographic interferometry. Opt. Lasers Eng. 162, 107442 (2023).
Huang, W. et al. Two-dimensional phase unwrapping by a high-resolution deep learning network. Measurement 200, 111566 (2022).
Wang, Y. X., Zhou, C. L. & Qi, X. Y. PEENet for phase unwrapping in fringe projection profilometry. in Proceedings of SPIE 12478, Thirteenth International Conference on Information Optics and Photonics (SPIE, 2022).
Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. in Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2015).
Yi, F. L. et al. Automated segmentation of multiple red blood cells with digital holographic microscopy. J. Biomed. Opt. 18, 026006 (2013).
Yi, F. L., Moon, I. & Javidi, B. Automated red blood cells extraction from holographic images using fully convolutional neural networks. Biomed. Opt. Express 8, 4466–4479 (2017).
Nguyen, T. H. et al. Automatic Gleason grading of prostate cancer using quantitative phase imaging and machine learning. J. Biomed. Opt. 22, 036015 (2017).
Ahmadzadeh, E. et al. Automated single cardiomyocyte characterization by nucleus extraction from dynamic holographic images using a fully convolutional neural network. Biomed. Opt. Express 11, 1501–1516 (2020).
Kandel, M. E. et al. Reproductive outcomes predicted by phase imaging with computational specificity of spermatozoon ultrastructure. Proc. Natl Acad. Sci. USA 117, 18302–18309 (2020).
Goswami, N. et al. Label-free SARS-CoV-2 detection and classification using phase imaging with computational specificity. Light Sci. Appl. 10, 176 (2021).
Hu, C. F. et al. Live-dead assay on unlabeled cells using phase imaging with computational specificity. Nat. Commun. 13, 713 (2022).
He, Y. R. et al. Cell cycle stage classification using phase imaging with computational specificity. ACS Photonics 9, 1264–1273 (2022).
Zhang, J. K. et al. Automatic colorectal cancer screening using deep learning in spatial light interference microscopy data. Cells 11, 716 (2022).
Jiang, S. W. et al. High-throughput digital pathology via a handheld, multiplexed, and AI-powered ptychographic whole slide scanner. Lab Chip 22, 2657–2670 (2022).
Lee, J. et al. Deep-learning-based label-free segmentation of cell nuclei in time-lapse refractive index tomograms. IEEE Access 7, 83449–83460 (2019).
Choi, J. et al. Label-free three-dimensional analyses of live cells with deep-learning-based segmentation exploiting refractive index distributions. Preprint at https://doi.org/10.1101/2021.05.23.445351 (2021).
Jo, Y. et al. Holographic deep learning for rapid optical screening of anthrax spores. Sci. Adv. 3, e1700606 (2017).
Valentino, M. et al. Digital holographic approaches to the detection and characterization of microplastics in water environments. Appl. Opt. 62, D104–D118 (2023).
Chang, C. C. & Lin, C. J. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27 (2011).
Roitshtain, D. et al. Quantitative phase microscopy spatial signatures of cancer cells. Cytom. Part A 91, 482–493 (2017).
Mirsky, S. K. et al. Automated analysis of individual sperm cells using stain-free interferometric phase microscopy and machine learning. Cytom. Part A 91, 893–900 (2017).
Ozaki, Y. et al. Label-free classification of cells based on supervised machine learning of subcellular structures. PLoS ONE 14, e0211347 (2019).
Bianco, V. et al. Microplastic identification via holographic imaging and machine learning. Adv. Intell. Syst. 2, 1900153 (2020).
Belashov, A. V. et al. In vitro monitoring of photoinduced necrosis in HeLa cells using digital holographic microscopy and machine learning. J. Opt. Soc. Am. A 37, 346–352 (2020).
Lam, V. K. et al. Quantitative scoring of epithelial and mesenchymal qualities of cancer cells using machine learning and quantitative phase imaging. J. Biomed. Opt. 25, 026002 (2020).
Nissim, N. et al. Real-time stain-free classification of cancer cells and blood cells using interferometric phase microscopy and machine learning. Cytom. Part A 99, 511–523 (2021).
Bianco, V. et al. Identification of microplastics based on the fractal properties of their holographic fingerprint. ACS Photonics 8, 2148–2157 (2021).
Yoon, J. et al. Identification of non-activated lymphocytes using three-dimensional refractive index tomography and machine learning. Sci. Rep. 7, 6654 (2017).
Park, S. et al. Label-free tomographic imaging of lipid droplets in foam cells for machine-learning-assisted therapeutic evaluation of targeted nanodrugs. ACS Nano 14, 1856–1865 (2020).
Chen, C. L. et al. Deep learning in label-free cell classification. Sci. Rep. 6, 21471 (2016).
Kim, G. et al. Learning-based screening of hematologic disorders using quantitative phase imaging of individual red blood cells. Biosens. Bioelectron. 123, 69–76 (2019).
Javidi, B. et al. Sickle cell disease diagnosis based on spatiotemporal cell dynamics analysis using 3D printed shearing digital holographic microscopy. Opt. Express 26, 13614–13627 (2018).
Paidi, S. K. et al. Raman and quantitative phase imaging allow morpho-molecular recognition of malignancy and stages of B-cell acute lymphoblastic leukemia. Biosens. Bioelectron. 190, 113403 (2021).
Pirone, D. et al. Identification of drugresistant cancer cells in flow cytometry combining 3D holographic tomography with machine learning. Sens. Actuators B: Chem. 375, 132963 (2023).
Li, Y. Q. et al. Accurate label-free 3-part leukocyte recognition with single cell lens-free imaging flow cytometry. Comput. Biol. Med. 96, 147–156 (2018).
Memmolo, P. et al. Differential diagnosis of hereditary anemias from a fraction of blood drop by digital holography and hierarchical machine learning. Biosens. Bioelectron. 201, 113945 (2022).
Valentino, M. et al. Intelligent polarization-sensitive holographic flow-cytometer: towards specificity in classifying natural and microplastic fibers. Sci. Total Environ. 815, 152708 (2022).
Karandikar, S. H. et al. Reagent-free and rapid assessment of T cell activation state using diffraction phase microscopy and deep learning. Anal. Chem. 91, 3405–3411 (2019).
Zhang, J. K., He, Y. R. & Sobh, N. Label-free colorectal cancer screening using deep learning and spatial light interference microscopy (SLIM). APL Photonics 5, 040805 (2020).
Butola, A. et al. High spatially sensitive quantitative phase imaging assisted with deep neural network for classification of human spermatozoa under stressed condition. Sci. Rep. 10, 13118 (2020).