Real-time complex light field generation through a multi-core fiber with deep learning

The generation of tailored complex light fields with multi-core fiber (MCF) lensless microendoscopes is widely used in biomedicine. However, the computer-generated holograms (CGHs) used for such applications are typically generated by iterative algorithms, which demand high computational effort and thus limit advanced applications such as fiber-optic cell manipulation. The random and discrete distribution of the fiber cores in an MCF induces strong spatial aliasing in the CGHs; hence, an approach that can rapidly generate tailored CGHs for MCFs is in high demand. We demonstrate a novel deep neural network, CoreNet, that provides accurate tailored CGH generation for MCFs at a near video rate. CoreNet is trained by unsupervised learning and reduces the computation time by two orders of magnitude compared to previously reported CGH algorithms for MCFs, while generating light fields with high fidelity. The tailored CGHs generated in real time are loaded on the fly to the phase-only spatial light modulator (SLM) for near video-rate complex light field generation through the MCF microendoscope. This paves the way for real-time cell rotation and further applications that require real-time, high-fidelity light delivery in biomedicine.

The Gerchberg-Saxton (GS) algorithm 26 is commonly used to calculate the CGH for structured light field generation. However, the random and discrete distribution of the fiber cores leads to a strong spatial sampling of the CGH when it is coupled into the MCF, and the number of fiber cores is much smaller than the number of pixels of the phase-only SLM, which induces strong aliasing. This results in unrecognizable reconstructed light fields 27 , showing that the GS algorithm cannot be applied directly to hologram generation for randomly distributed phased arrays. Therefore, we previously proposed a tailored phase retrieval algorithm, named Core-GS 27 , for holographic control of complex light fields through the MCF, enabling complex wavefront shaping through the MCF with high fidelity. It has been implemented in lab-on-a-chip optical manipulation of biological cells 28 and can facilitate applications in holographic optogenetic stimulation 29 , micro-materials processing in hard-to-reach areas 30 , structured light generation for MCF amplifiers 31 , and high-dimensional optical and quantum communication 32 . Due to the long computation time, the CGHs have to be generated in advance and then loaded to the SLM for dynamic light field generation. This limits applications such as adaptive tomographic optical manipulation 10,33 and adaptive selective holographic photoactivation in optogenetics 34 , which need rapid generation and refreshing of CGHs. Therefore, there is a great demand in biomedicine for real-time generation of tailored CGHs for MCFs. Recently, deep neural networks have been used to reduce the computation time of CGHs [35][36][37][38][39] . However, all of these existing networks are designed for regularly distributed pixels and induce significant distortion when applied to discrete or randomly distributed phased arrays 23,27 . Therefore, an approach that can generate tailored CGHs for MCFs or random phased arrays in real time with high fidelity is not yet available.
In this paper, we propose a novel phase encoder deep neural network (CoreNet), which can generate CGHs tailored for MCFs in real-time. The discrete or random distribution map of the phased array can be loaded to the neural network as an input. Hence, CoreNet can generate tailored phase modulation holograms rapidly, enabling precise light-field control for a randomly distributed phased array. Specifically, for holographic display through a lensless microendoscope with 10,000 single-mode fiber cores, the experimentally measured fiber core distribution map is embedded into the neural network to generate tailored CGHs. Unlike supervised learning, which needs to label the holograms calculated by classical phase retrieval algorithms as the ground truth, CoreNet uses an autoencoder network architecture to encode the target intensity into the tailored phase modulation holograms with unsupervised training. The diffraction model is incorporated into the network for numerically propagating the light field between the phase modulation plane and the target intensity plane. This allows the network to search for the optimal phase modulation maps of the target image and learn the mapping from the target images to the phase modulation holograms. This end-to-end architecture of CoreNet gets rid of complicated iterative operations, generating the phased array rapidly. Near video-rate generation of the tailored CGHs for MCF-based complex wavefront shaping can thus be realized by the trained network with high fidelity, opening new perspectives for applications based on MCFs.

Results
When a light field is transmitted through an MCF with a very large number of fiber cores, the intrinsic optical path differences (OPDs) between the fiber cores induce strong phase distortion in the out-coupled light field. To compensate for this, digital optical phase conjugation (DOPC) 40,41 is employed. The phase differences between the fiber cores are measured by off-axis digital holography 28 , and the conjugated phase differences are mapped onto the MCF facet by the phase-only SLM to pre-compensate the OPDs. The MCF can then act as a phased array to generate arbitrary light field distributions. Typically, to control the wavefront through the MCF, a tailored CGH generated by the Core-GS algorithm 27 is loaded to the SLM in addition to the conjugated phase. Although the Core-GS algorithm achieves good quality compared to the raw GS algorithm, its long iteration time still limits the applications.
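The pre-compensation step can be illustrated with a toy model. The sketch below is not the authors' calibration code: it assumes unit-amplitude cores, replaces the holographically measured phases with random numbers, and uses the hypothetical helper names `conjugate_phase` and `on_axis_intensity`. It only shows why displaying the conjugated phase restores a coherent phased array.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cores = 10_000  # core count quoted in the text

# Hypothetical stand-in for the phase distortion measured by off-axis holography.
measured_phase = rng.uniform(-np.pi, np.pi, n_cores)


def conjugate_phase(phase):
    """Phase to display on the SLM: the negated distortion, wrapped to one 2*pi period."""
    return np.mod(-phase, 2 * np.pi)


def on_axis_intensity(slm_phase, fiber_phase):
    """Normalized on-axis far-field intensity of unit-amplitude cores (toy model)."""
    field = np.exp(1j * (slm_phase + fiber_phase))
    return np.abs(field.mean()) ** 2


uncorrected = on_axis_intensity(np.zeros(n_cores), measured_phase)
corrected = on_axis_intensity(conjugate_phase(measured_phase), measured_phase)
print(f"uncorrected: {uncorrected:.2e}, corrected: {corrected:.6f}")
```

With the conjugated phase applied, all cores emit in phase, so any additional hologram added on top of this layer acts on an effectively flat phased array.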
We designed a phase encoder neural network, called CoreNet, to generate tailored holograms for high-speed complex wavefront shaping through the MCF. The U-Net architecture has proven effective in image processing tasks. Here, we modified the U-Net into a network with two inputs to collect more information on the target image (Fig. 1). Instead of feeding the raw image directly to the neural network, the target image is first back-propagated to the plane of the fiber facet at the distal application side to obtain the complex amplitude, because learning the feature mapping within the same plane is easier than across different planes. The real and imaginary parts of the complex amplitude are then extracted as the inputs of CoreNet. The downsampling blocks are duplicated for each input. The two downsampling paths are joined at a bottleneck layer by a concatenation operator. The bottleneck layer is then up-sampled to the resolution of the phase-only SLM by a series of upsampling blocks.
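The input preparation described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it uses a plain angular spectrum propagator, assumes a flat (zero) phase on the target, and the pixel pitch, distance, and function names (`angular_spectrum`, `network_inputs`) are illustrative assumptions. Only the 532 nm wavelength is taken from the setup description.

```python
import numpy as np


def angular_spectrum(field, wavelength, pitch, z):
    """Propagate a sampled complex field by a distance z (negative z back-propagates)."""
    rows, cols = field.shape
    fx = np.fft.fftfreq(cols, d=pitch)
    fy = np.fft.fftfreq(rows, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    propagating = arg > 0
    kz = 2 * np.pi * np.sqrt(np.where(propagating, arg, 0.0))
    H = np.where(propagating, np.exp(1j * kz * z), 0.0)  # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * H)


def network_inputs(target_intensity, wavelength, pitch, distance):
    """Back-propagate the target to the facet plane and split it into two channels."""
    amplitude = np.sqrt(target_intensity)  # assumption: flat target phase
    facet_field = angular_spectrum(amplitude, wavelength, pitch, -distance)
    return np.stack([facet_field.real, facet_field.imag])


# Illustrative target: a bright square on a dark background.
target = np.zeros((64, 64))
target[24:40, 24:40] = 1.0

inputs = network_inputs(target, wavelength=532e-9, pitch=8e-6, distance=1e-3)
print(inputs.shape)
```

The two channels returned here correspond to the two downsampling paths; propagating their complex sum forward by the same distance recovers the target amplitude in this idealized model.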
Each downsampling and upsampling block consists of two residual blocks as shown in Fig. 2. Each residual block is composed of two sets of batch normalization (BN), nonlinearity (ReLU), and a convolutional layer stacked one above the other. The strides of the first convolution and transposed convolution in the residual block are (2,2) to realize the function of downsampling and upsampling.
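At the end of the U-Net, the values at the core positions are extracted and convolved with circular masks, which represent the shape of the fiber cores, to form the tailored phase modulation map for the phased array. The fiber core map is then used as the amplitude and combined with the phased array pattern to form a complex field. Finally, the complex field is propagated to the target plane to form the target intensity distribution. Here we adopted the band-limited angular spectrum method 42 to simulate the light propagation. Compared with the classical angular spectrum method, it has less numerical error for far-field propagation. The band-limited angular spectrum transfer function is
$$ H(f_x, f_y; z) = \begin{cases} \exp\left[ i 2\pi z \sqrt{\lambda^{-2} - f_x^2 - f_y^2} \right], & |f_x| < f_{x,\mathrm{limit}} \text{ and } |f_y| < f_{y,\mathrm{limit}}, \\ 0, & \text{otherwise}, \end{cases} \qquad f_{x,\mathrm{limit}} = \frac{1}{\lambda \sqrt{(2 \Delta u \, z)^2 + 1}}, \quad f_{y,\mathrm{limit}} = \frac{1}{\lambda \sqrt{(2 \Delta v \, z)^2 + 1}}, $$
where $\lambda$ is the wavelength, $z$ is the propagation distance, and $\Delta u$ and $\Delta v$ denote the sample intervals in the frequency domain.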
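The core mapping and the propagation step can be sketched together in NumPy. This is an illustrative sketch under several assumptions, not the network layer used in the paper: the core centers, core radius, pixel pitch, and distance are made up, the circular-mask convolution is replaced by stamping one phase value onto each non-overlapping disk, no zero-padding is applied, and the frequency limit follows the commonly used band-limited angular spectrum form. The names `core_phase_map` and `band_limited_asm` are hypothetical.

```python
import numpy as np


def core_phase_map(phase, centers, radius):
    """Keep one phase value per core and paint it over that core's circular mask."""
    rows, cols = phase.shape
    yy, xx = np.mgrid[:rows, :cols]
    amplitude = np.zeros((rows, cols))
    core_phase = np.zeros((rows, cols))
    for cy, cx in centers:
        disk = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius**2
        amplitude[disk] = 1.0               # fiber core map used as amplitude
        core_phase[disk] = phase[cy, cx]    # value sampled at the core center
    return amplitude, core_phase


def band_limited_asm(field, wavelength, pitch, z):
    """Angular spectrum propagation with a band-limited transfer function."""
    rows, cols = field.shape
    FX, FY = np.meshgrid(np.fft.fftfreq(cols, d=pitch), np.fft.fftfreq(rows, d=pitch))
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    H = np.exp(1j * 2 * np.pi * z * np.sqrt(np.maximum(arg, 0.0)))
    du, dv = 1.0 / (cols * pitch), 1.0 / (rows * pitch)  # frequency sample intervals
    fx_limit = 1.0 / (wavelength * np.sqrt((2 * du * z) ** 2 + 1))
    fy_limit = 1.0 / (wavelength * np.sqrt((2 * dv * z) ** 2 + 1))
    band = (np.abs(FX) < fx_limit) & (np.abs(FY) < fy_limit) & (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H * band)


rng = np.random.default_rng(1)
network_phase = rng.uniform(0, 2 * np.pi, (64, 64))      # stand-in for the U-Net output
centers = [(16, 16), (16, 40), (40, 24), (48, 50)]        # hypothetical core centers (row, col)

amplitude, phase = core_phase_map(network_phase, centers, radius=3)
facet_field = amplitude * np.exp(1j * phase)
target_field = band_limited_asm(facet_field, wavelength=532e-9, pitch=8e-6, z=0.05)
intensity = np.abs(target_field) ** 2
```

Because every operation here is differentiable, the same two steps can be written in an automatic differentiation framework so that a loss on `intensity` reaches the network weights.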
Figure 1. Structure of the phase encoder deep neural network (CoreNet). The encoding part of CoreNet is a modified U-Net. The downsampling path of the U-Net is split into two paths, and the inputs of the two paths are the real and imaginary parts of the field at the distal facet of the fiber bundle, which is obtained by back-propagation of the target field to the plane of the distal fiber facet. The core phase mapping and diffractive propagation are embedded in CoreNet to reconstruct the target field. Since the labeled data is the same as the input target field, CoreNet can achieve unsupervised learning.
As an unsupervised learning method, CoreNet does not require the phased array corresponding to each input image to be calculated in advance. By utilizing automatic differentiation, the loss can be propagated back to the encoder part, and the learnable parameters of the U-Net can be updated during the training process. The training results are shown in Fig. 3. We use the EMNIST Balanced dataset, which includes 131,600 characters, for training 43 . Several custom letters and binary patterns make up the test set. Sixteen images from the MNIST validation dataset and two images from the custom test dataset are randomly selected to show the network performance. The results reconstructed by the Core-GS algorithm are also presented for comparison. CoreNet achieves better reconstruction quality, and its computation time is only 0.2 s, which is much faster than the Core-GS algorithm, which took 11 s on the same platform (see "Materials and methods"). The 2-D correlation coefficient is employed to characterize the fidelity of the reconstructed images. For a normalized reconstructed image $X$ and a reference image $Y$, the correlation coefficient is expressed as
$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}, $$
where $\bar{X}$ and $\bar{Y}$ are the mean values of the reconstructed image and the reference image, and $n$ is the total number of pixels.
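The fidelity metric can be computed in a few lines. The sketch below implements the standard 2-D (Pearson) correlation coefficient over all pixels; the function name `correlation_coefficient` and the synthetic test images are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def correlation_coefficient(x, y):
    """2-D correlation coefficient between two images of equal shape."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum()))


# Illustrative reference (bright square) and a noisy "reconstruction" of it.
rng = np.random.default_rng(2)
reference = np.zeros((64, 64))
reference[20:44, 20:44] = 1.0
reconstruction = reference + 0.2 * rng.standard_normal(reference.shape)

fidelity = correlation_coefficient(reconstruction, reference)
print(f"fidelity: {fidelity:.3f}")
```

The coefficient is insensitive to a global rescaling or offset of the reconstructed intensity, so it measures how well the pattern is reproduced rather than its absolute brightness.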
Employing CoreNet to generate holograms in real time facilitates rapid complex holographic display through the lensless microendoscope with 10,000 fiber cores. The working principle of the lensless microendoscope is shown in Fig. 4a. The holograms generated by CoreNet are loaded to the phase-only SLM in real time and projected onto the proximal fiber facet to generate tailored light fields at the distal far field. The experimental setup is shown in Fig. 4b, and a detailed description of the setup and the calibration process can be found in "Materials and methods". Previously, CGHs needed to be calculated and loaded to the SLM in advance to generate a dynamic light field. Owing to the fast computational speed of CoreNet, it is possible to generate the CGHs in real time for tailored dynamic light field generation through the MCF. As shown in Fig. 4c, a running man animation is reconstructed at the distal far field of the MCF to demonstrate the rapid hologram generation capability of CoreNet (see Visualization 1). The corresponding modulation holograms are generated in real time by CoreNet and fed to the SLM on the fly, enabling near-video-rate holographic display of tailored light fields through the miniature lensless microendoscope.

Discussion
Comparisons of the light field generation in simulation employing CoreNet and Core-GS are shown in Fig. 3a. In contrast to Core-GS 27 , CoreNet recovers the target image without visible blemishes. It can be seen in Fig. 3b that the gradient of the phase value in the hologram generated by CoreNet is much smaller than that of Core-GS. This smooth transition of the phase leads to homogeneous backgrounds in the reconstructed images, increasing the signal-to-noise ratio.
We use four different splits from EMNIST to test the performance of CoreNet, as shown in Fig. 5. The EMNIST MNIST and EMNIST Digits datasets provide balanced handwritten digit datasets directly compatible with the original MNIST dataset. The EMNIST Letters dataset merges a balanced set of uppercase and lowercase letters into a single 26-class task. The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. CoreNet generates accurate CGHs with an average fidelity above 0.85 for all types of datasets. The performance of CoreNet can be further improved by using other non-pixel-wise losses, such as the SSIM loss, which could improve the structural similarity, or a perceptual loss, which encourages natural and perceptually pleasing results.
Comparisons of the light field generation through a 10,000-core lensless microendoscope in experiments employing CoreNet and alternative techniques are shown in Fig. 6a. Holograms generated by the standard GS algorithm 26 lead to a strongly distorted light field, even though the phase distortion in the MCF is calibrated in situ. This is mainly due to the random distribution and the limited number of fiber cores, which induce significant spatial aliasing when the modulated light is transmitted through the MCF. The previously reported Core-GS algorithm solved this problem, providing high-quality complex light field generation through the MCF. However, the iterative process of Core-GS requires high computational effort. Our novel CoreNet sped up the process by a factor of 82, enabling real-time generation of holograms for rapid holographic display through the microendoscope. The fidelity of the experimental light field generation through the MCF is characterized by the correlation coefficient between the experimentally captured image and the original target image. The correlation coefficient for the letters "MST" in Fig. 6a is 0.80 for Core-GS and 0.84 for CoreNet, and for the smiling face in Fig. 6b it is 0.67 for Core-GS and 0.69 for CoreNet. Although strong phase distortion is induced by the relatively long length (50 cm) of the lensless microendoscope, the near-perfect in-situ calibration maintains the high fidelity of the light field generation. Compared to the Core-GS algorithm, CoreNet offers light field generation with higher fidelity and significantly less computation time.
As an unsupervised learning approach, CoreNet provides high-quality phase retrieval for a randomly distributed phased array without labeling. Compared to Core-GS, which is an iterative algorithm, CoreNet significantly reduces the computation time to less than 0.14 s for the generation of one phase hologram, enabling real-time light field generation (see Table 1). The computational speed can be further increased on a better hardware platform. Despite the much shorter calculation time, the light field generated by CoreNet has the highest fidelity among the three approaches (Fig. 6c).
One potential application of CoreNet is in optogenetics. Holographic controlled light enables selective stimulation of target neurons individually with program-controlled shapes of the light field 9,44 , enabling precise control of the neuronal networks. The MCF endoscope with a micro-objective (external diameter of 2.6 mm) has been employed for selective photoactivation in a mouse brain with minor invasiveness 34 . However, the low SNR of the generated light field can lead to photoactivation of unwanted neurons, degrading the quality of the selective photoactivation. Holographic stimulation using CoreNet through the MCF can avoid this by generating tailored light fields with high fidelity, enabling in-vivo single-neuron activation. As shown in Fig. 7a, the invasiveness can be minimized to a few hundred microns using the lensless microendoscope for in-vivo selective holographic stimulation. Furthermore, to compensate for the vibration in a behaving mouse, the tailored light field also needs to be controlled adaptively, and CoreNet is the first reported approach that can adaptively control the tailored light field by generating the CGHs in real-time. Closed-loop control of the photoactivation using the lensless microendoscope is thus possible with CoreNet. Hence, the strong capability of rapidly generating high-quality tailored CGHs of CoreNet turns the lensless microendoscope into a powerful optogenetic probe for adaptive and selective photoactivation for in-vivo applications.
Fiber-based optical traps are now an important tool for investigating biological cell mechanics [45][46][47][48] . The small size and the high flexibility of optical fibers make them easy to integrate into miniature lab-on-a-chip devices (Fig. 7b), facilitating high-throughput measurements when combined with microfluidic techniques. We previously reported the first MCF-based dual-beam trap, offering a very high degree of freedom for optical manipulation of biological cells 28 . However, the temporal resolution of optical manipulation is limited by the generation speed of the modulation holograms. Employing CoreNet can boost the generation speed of the CGHs for dynamic light fields. Besides, the refractive index distribution in biological cells is inhomogeneous, leading to instability in optical traps. The adaptive tomographic optical trap 10,33 solved this problem by utilizing tailored trapping beams that fit the refractive index distribution and the shape of the cells, but it requires high-speed tailored hologram generation. With CoreNet, it is now possible to implement the adaptive tomographic trap in MCF-based optical traps for real-time closed-loop control. This can significantly increase the stability of the optical trapping and manipulation. Besides biomedical applications, the MCF is also one of the candidates for the next generation of fiber-optic communication cables. Employing MCFs broadens the bandwidth significantly with high-dimensional communication channels. CoreNet offers the possibility to generate tailored light fields through the MCF in real time, paving the way for MCF-based high-dimensional fiber-optic communication 4 . Furthermore, CoreNet can have much wider applications beyond optical engineering. It can be employed for phase retrieval of any kind of discrete or randomly distributed phased array, such as phased array radar and ultrasonic phased arrays, opening new perspectives in astronomy, radar technologies, communication technologies, and ultrasonic technologies.

Materials and methods
Network training. We use the negative Pearson correlation coefficient (NPCC) as the loss function, which is defined in Eq. (5). The NPCC measures the linear correlation between two images instead of calculating pixel-wise error, which relaxes the constraint on the output image so that the network can converge faster to the optimal solution.
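The relaxed constraint can be made concrete with a small comparison. The sketch below is an assumption-laden illustration in NumPy, not the training code: it defines the NPCC as the negated Pearson correlation coefficient and compares it with a pixel-wise mean squared error on a synthetic output that differs from the target only by a brightness scale and offset. The function names `npcc` and `mse` are hypothetical.

```python
import numpy as np


def npcc(output, target):
    """Negative Pearson correlation coefficient; -1 means perfectly correlated."""
    do = output - output.mean()
    dt = target - target.mean()
    return float(-(do * dt).sum() / np.sqrt((do**2).sum() * (dt**2).sum()))


def mse(output, target):
    """Pixel-wise mean squared error, shown only for comparison."""
    return float(((output - target) ** 2).mean())


# Illustrative target and an output with the right pattern but different brightness.
target = np.zeros((32, 32))
target[8:24, 8:24] = 1.0
output = 0.5 * target + 0.2

print(f"NPCC: {npcc(output, target):.3f}, MSE: {mse(output, target):.3f}")
```

In this example the NPCC is already at its minimum although the pixel values differ, whereas a pixel-wise loss would still penalize the output; this is the sense in which the NPCC constrains the pattern but not the absolute intensity.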
Network training and testing were performed on a workstation with an AMD Ryzen 9 3950X CPU, 128 GB of RAM, and an NVIDIA RTX A6000 GPU. The network is trained for 5 epochs using the Adam optimizer. The training images are preprocessed to a size of 512 × 512 pixels and then padded with zeros to 1920 × 1080 pixels.
Experimental setup. The experimental setup is shown in Fig. 4b. The diameter of the laser beam emitted from a diode-pumped solid-state laser (Verdi 532 nm, Coherent Inc.) is expanded by a factor of 10 (L1, L2) to fully illuminate the phase-only SLM (PLUTO LCOS SLM, Holoeye Photonics). The CGH displayed on the SLM combines a blazed grating, a phase conjugation layer, and a phased array modulation layer. To get rid of the direct reflection from the surface of the SLM, the phase modulation hologram is diffracted to higher orders by the blazed grating, and only the first diffraction order can pass through the iris diaphragm in the spatial filter system (L3, ID, L4). The filtered phase modulation hologram is projected onto the proximal facet of a 50 cm long MCF (10,000 single-mode cores; FIGH-350S, Fujikura) through a microscope objective (MO1; 20X Plan Achromat Objective, 0.4 NA, Olympus).
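The layered hologram described above can be sketched as a sum of phase maps wrapped to one 2π period. This is a minimal sketch under stated assumptions: the grating period, the random stand-in layers, the 8-bit gray-level quantization, and the function names are all illustrative, and the 1920 × 1080 frame is the padded image size quoted in the text, assumed here to match the SLM.

```python
import numpy as np

SLM_SHAPE = (1080, 1920)  # rows, cols; assumed to match the padded image size in the text


def blazed_grating(shape, period_px):
    """Sawtooth phase ramp along the horizontal axis with the given period in pixels."""
    cols = np.arange(shape[1])
    ramp = 2 * np.pi * (cols % period_px) / period_px
    return np.tile(ramp, (shape[0], 1))


def compose_slm_pattern(grating, conjugation, modulation):
    """Add the three phase layers and wrap the sum to one 2*pi period."""
    return np.mod(grating + conjugation + modulation, 2 * np.pi)


def to_gray_levels(phase, levels=256):
    """Quantize a wrapped phase map to integer gray levels (8-bit addressing assumed)."""
    return (np.floor(phase / (2 * np.pi) * levels).astype(int) % levels).astype(np.uint8)


rng = np.random.default_rng(3)
conjugation_layer = rng.uniform(0, 2 * np.pi, SLM_SHAPE)  # stand-in for the DOPC layer
modulation_layer = rng.uniform(0, 2 * np.pi, SLM_SHAPE)   # stand-in for the tailored CGH

pattern = compose_slm_pattern(blazed_grating(SLM_SHAPE, period_px=8),
                              conjugation_layer, modulation_layer)
frame = to_gray_levels(pattern)
```

Since only the modulation layer changes from frame to frame, the grating and conjugation layers can be computed once and reused while new holograms arrive.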

Calibration of the intrinsic phase distortion in MCFs.
Before transforming the MCF-based microendoscope into a phased array, the phase distortion due to the OPDs between the fiber cores needs to be compensated by employing DOPC 40,49 . In our work, we implement the previously proposed two-stage calibration method 25,50 . The intrinsic OPDs between the fiber cores are measured and compensated in transmission geometry, and the bending-induced and temporal phase distortion is further calibrated in situ by the back-reflected guide star. To be more specific, a blazed grating is displayed on the SLM to generate a plane wave illumination at the proximal facet of the MCF, and the distorted light field at the distal facet is imaged on the distal camera (CAM2; uEye camera, IDS). A reference beam (Fig. 4b) is split from the laser source and coupled into a single-mode fiber. The reference beam from the fiber collimator (L8; collimation package, Thorlabs) is slightly tilted for the digital off-axis holographic geometry 51 . The phase differences of the fiber cores are reconstructed from the digital hologram captured on the distal camera 28 . The measured phase is then conjugated and affine transformed into the coordinate system of the SLM to pre-compensate the intrinsic phase distortion.
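The phase reconstruction from an off-axis hologram can be illustrated with the usual Fourier sideband filtering. The sketch below is a generic textbook procedure on synthetic data, not the authors' calibration code: the grid size, carrier frequency, filter width, smooth test phase, and the function name `recover_phase` are all assumptions chosen for illustration.

```python
import numpy as np

N = 128        # grid size in pixels (illustrative)
CARRIER = 32   # reference tilt in cycles across the aperture (illustrative)

y, x = np.mgrid[:N, :N]

# Hypothetical smooth phase profile standing in for the phase across the fiber facet.
true_phase = np.exp(-((x - N / 2) ** 2 + (y - N / 2) ** 2) / (2 * 12.0**2))

object_beam = np.exp(1j * true_phase)
reference_beam = np.exp(1j * 2 * np.pi * CARRIER * x / N)  # slightly tilted plane wave
hologram = np.abs(object_beam + reference_beam) ** 2       # intensity seen by the camera


def recover_phase(hologram, carrier, half_width):
    """Isolate the object*conj(reference) sideband, re-center it, and return its phase."""
    rows, cols = hologram.shape
    spectrum = np.fft.fft2(hologram)
    kx = np.rint(np.fft.fftfreq(cols) * cols)
    ky = np.rint(np.fft.fftfreq(rows) * rows)
    KX, KY = np.meshgrid(kx, ky)
    # The wanted cross term is centered at kx = -carrier in the spectrum.
    window = (np.abs(KX + carrier) <= half_width) & (np.abs(KY) <= half_width)
    sideband = np.where(window, spectrum, 0.0)
    centered = np.roll(sideband, carrier, axis=1)  # shift the sideband to zero frequency
    return np.angle(np.fft.ifft2(centered))


recovered = recover_phase(hologram, CARRIER, half_width=12)
max_error = float(np.abs(recovered - true_phase).max())
```

In a real measurement the recovered phase would additionally be sampled at the core positions, conjugated, and mapped into SLM coordinates, as described in the text.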
To calibrate the temporal and bending induced phase distortion in situ, a partial reflector can be mounted on the distal tip, and the further temporal and bending induced phase distortion is measured from the guide star hologram captured on the proximal camera (CAM1; uEye camera, IDS) 50 . The guide star is generated at the distal side by illuminating a single fiber core. The reflected light illuminates the distal facet and the distorted light field on the proximal facet is imaged on the proximal camera interfering with the second reference beam (reference beam 2 in Fig. 4b). Therefore, the phase distortion can be measured from the digital off-axis hologram captured on the proximal camera without distal access. The conjugated phase distortion is then added to the phase conjugation layer of the SLM to further in-situ compensate for the temporal and bending induced phase distortion.

Conclusion
We demonstrated a novel phase retrieval method based on a deep neural network (CoreNet), decreasing the computational time of the CGHs for MCFs by a factor of 82. This enables real-time dynamic control of the complex light field through the MCF lensless microendoscope. The phase distortion in the MCF is compensated in situ by DOPC, which transforms the MCF to a holographic controlled phased array. Employing CoreNet to generate the tailored holograms for complex wavefront shaping through the MCF provides high fidelity holographic reconstruction at the distal side of the MCF. Near video-rate holographic display of dynamic light field is realized through an MCF with on-the-fly generated CGHs. Our work paves the path for high-speed MCF-based applications such as microendoscopic imaging, in vivo adaptive optical manipulation, optogenetic stimulation, micro-materials processing, and optical communication.

Data availability
The training dataset EMNIST is publicly available 43 .
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.