Introduction

The ability to shape light into arbitrary potentials has created many new opportunities in cold atom experiments. Applications include atomtronics1,2, tailored potentials for quantum simulation experiments with optical lattices3,4,5,6 and quantum information platforms using Rydberg arrays7,8,9. These applications require smooth light potentials that minimize inhomogeneities and the resulting dephasing effects, and for experiments involving larger atom numbers or where laser power is limited, a high efficiency is desirable. Arbitrary light potentials are commonly generated using a digital micromirror device (DMD) which is an amplitude-modulating spatial light modulator (SLM) or using a phase-modulating liquid crystal on silicon (LCOS) SLM. Tailored light potentials for cold atom experiments were realised using a DMD in direct imaging10,11, where the efficiency of the light potential is directly proportional to the number of mirrors in the ‘on’ position and is limited by the diffraction efficiency of the device (typically 30–88%)10,12,13. Alternatively, the DMD can be used in a holographic setup with efficiencies of 1–2%14. As opposed to direct imaging, any aberrations in the optical system can be corrected in situ which enables to generate diffraction-limited light potentials14. Using a phase-modulating LCOS SLM in a holographic setup, calculated efficiencies of 18–64% were achieved15,16,17,18, largely independent of the size of the light potential. After multiplying these calculated efficiencies by the diffraction efficiency of the LCOS SLM (20–90%, depending on the diffraction angle19) they are still an order of magnitude higher compared to the DMD efficiencies. As holographically generated light potentials are very sensitive to aberrations in the optical system, it is challenging to produce light potentials of low error. Potentials with a root-mean-squared (RMS) error of \(<5\%\) have been used to investigate Bose-Einstein condensates in ring traps20,21, while in recent experiments with Rydberg arrays, light potentials with an RMS error of \(0.7\%\) were used7. Despite the complexity associated with a holographic setup, the prospect of achieving higher efficiencies and lower error has driven the development of sophisticated hologram calculation techniques.

The task of finding the SLM phase to achieve a desired light potential is known as phase retrieval problem. Various algorithms such as the mixed-region amplitude-freedom (MRAF) algorithm17, the offset-MRAF algorithm15 and a conjugate gradient (CG) approach16 were developed to solve this purely computational problem and produce simulated light potentials of \(<1\%\) RMS error. However, creating light potentials with this degree of accuracy is difficult experimentally as imperfections in the optical setup cause a mismatch between the simulated and the measured light potentials. These effects include a distorted wavefront at the SLM, the curved surface of the SLM itself, crosstalk between neighbouring SLM pixels, aberrations caused by the Fourier lens and other alignment imperfections. To compensate for these errors, camera feedback algorithms were used to create more accurate light potentials7,17,21,22,23. Using stochastic gradient descent, the phase retrieval problem was solved by directly taking the camera image into account when calculating the cost function and its gradient24. Further, it was shown that the Fourier transform used to propagate the light field from the SLM to the camera can be replaced by a more sophisticated method to simulate the propagation of light and can result in more accurate experimental light potentials15.

In this work, we create light potentials by combining several computational and experimental techniques to achieve an RMS error of \(<2\%\) for various patterns while maintaining measured efficiencies between 15 and 40%. We solve the phase retrieval problem by using CG minimisation16,18 and investigate the difference between two methods to simulate the propagation of the light; the angular spectrum method (ASM)25 and the commonly used fast Fourier transform (FFT). We further improve the quality of our light potentials by modelling crosstalk of neighbouring SLM pixels. In previous work, spot patterns have been generated using a pixel crosstalk model19, however, to the best of our knowledge, this effect has not been taken into account to generate smooth arbitrary light potentials. The combination of all of these techniques allows us to produce potentials of \(<1.5\%\) RMS error and efficiencies of more than \(40\%\), opening the way to new applications that require this degree of accuracy and efficiency.

Our experimental setup consists of the SLM (Hamamatsu X13138-07, pixel pitch \(12.5 \upmu \hbox {m}\), \(1272 \times 1024\) pixels), an achromatic doublet lens and a camera in the Fourier plane. A full description of the setup is shown in the SI. The electric field in the SLM plane, \(E_{\text {SLM}}\!\left( x, y\right) \), is related to the electric field in the image plane, \(E_{\text {IMG}}\!\left( x, y\right) \), via the Fourier transform (details see Fig. 1 and “Methods”). To generate the desired light potential with an intensity pattern \(I_{\text {IMG}}\!\left( x, y\right) =\left| E_{\text {IMG}}\!\left( x, y\right) \right| ^2\), the phase pattern displayed by the SLM, \(\varphi \!\left( x, y\right) \), must be found, given the constant field at the SLM \(A_{\text {SLM}}\!\left( x, y\right) \exp {\left[ i\varphi _{\text {C}}\!\left( x, y\right) \right] }\). The Gerchberg-Saxton (GS) algorithm26 is an iterative Fourier transform algorithm (IFTA) and can find \(\varphi \!\left( x, y\right) \) to produce spot patterns of \(98\%\) uniformity27. However, for arbitrary and smooth light potentials, required e.g., for quantum simulation experiments with ultracold atoms, the GS algorithm does not converge well. Modified versions of the original GS algorithm such as the mixed-region amplitude-freedom (MRAF) algorithm17 and the offset-MRAF (OMRAF) algorithm15, have produced smooth simulated light potentials approaching 1% root-mean-square (RMS) error, and predicted efficiencies around 24%15, depending on the target pattern (see Eqs. 1 and 2). More recently, gradient-based optimisation algorithms such as the CG method were used to generate simulated light potentials \(<0.1\%\) RMS error and efficiencies \(>60\%\)16, outperforming the above-mentioned IFTAs15,17. Note that these are RMS errors and efficiencies of simulated light potentials which differ from the experimentally obtained values (see Table 1).

Figure 1
figure 1

Generation of light potentials using CG minimisation and camera feedback. (a) Holographic setup with the displayed SLM phase, \(\varphi _{ij}\), in the SLM plane, forming a light potential, \(I_{kl}\), in the Fourier plane of the lens. (b) Flow diagram visualising the process of generating a light potential. The pixel crosstalk on the SLM is modelled just before the SLM field, \(E_{\text {SLM}}\), is propagated to the image plane. (c) Convergence of the first 6 iterations of the feedback process. Due to imperfections, the experimental RMS error, \(\varepsilon _{\text {M}}\), (solid line) converges at a higher level than the predicted RMS error, \(\varepsilon _{\text {P}}\) (dashed line). After each feedback iteration, n, the experimental RMS error, \(\varepsilon _{\text {M}}\), decreases due to the adjusted target light potential.

Results

Characterisation of light potentials

To characterise the quality of our light potentials, we define the predicted and measured RMS error, \(\varepsilon _{\text {P}}\) and \(\varepsilon _{\text {M}}\), respectively,

$$\begin{aligned} \varepsilon _{\text {P}} = \sqrt{\frac{1}{N_M}\sum _{k,l\in M}\frac{\left( {\hat{T}}_{kl}-{\hat{I}}_{kl}\right) ^2}{{\hat{T}}_{kl}^2}} \quad {\text {and}} \quad \varepsilon _{\text {M}} = \sqrt{\frac{1}{N_{U}}\sum _{u,v\in M_{U}}\frac{\left( {\hat{T}}_{uv}-{\hat{I}}_{uv}\right) ^2}{{\hat{T}}_{uv}^2}}. \end{aligned}$$
(1)

The predicted RMS error, \(\varepsilon _{\text {P}}\), measures the difference between the simulated light potential, \({\hat{I}}_{kl}\), and the target potential, \({\hat{T}}_{kl}\), where k and l are the indices in the computational image plane. The error is evaluated in a measure region, M, which is defined as the region in the image plane where the target intensity is larger than 50% of the maximum target intensity22. The number of pixels in M is indicated by \(N_M\). \({\hat{T}}_{kl}=T_{kl} / \sum _{k,l \in M} T_{kl}\) and \({\hat{I}}_{kl}=I_{kl} / \sum _{k,l \in M} I_{kl}\) are the normalised target light potential and the normalised simulated light potential, respectively. The measured RMS error, \(\varepsilon _{\text {M}}\), characterises the light potential captured by the camera. The camera image, \(I_{uv}\), with row and column indices, u and v, is mapped to the computational image plane using an affine transform, U. We define \(\varepsilon _{\text {M}}\) in the transformed measure region, \(M_{U}\), containing \(N_{U}\) pixels, using the normalised, transformed target light potential, \({\hat{T}}_{uv}\).

The predicted efficiency, \(\eta _{\text {P}}\), of the light potential is given by the ratio of the power in the signal region, S, (indicated by the red rectangle in Fig. 1a), to the total power in the image plane17. We define the experimental efficiency of the light potential, \(\eta _{\text {M}}\), as the ratio of optical power, \(P_S\), in the transformed signal region, \(S_U\), to the measured power of the beam before the expansion telescope, \(P_{\text {in}}\), (see “Methods”)

$$\begin{aligned} \eta _{\text {P}} = \frac{\sum _{k,l\in S}I_{kl}}{\sum _{k,l}I_{kl}} \quad {\text {and}} \quad \eta _{\text {M}} = \frac{P_S}{P_{\text {in}}}. \end{aligned}$$
(2)

Conjugate gradient minimisation and camera feedback

We use CG minimisation16 due to its rapid convergence and due to its flexibility to define a cost function which can be chosen to meet the requirements for a specific application, e.g., the optimisation of intensity, phase and efficiency in a specific region of interest. The minimisation improves the simulated light potential, \(I_{kl}\), iteratively by modifying the SLM phase, \(\varphi _{ij}\), based on a cost function C and its gradient \(\partial C / \partial \varphi _{ij}\) (blue loop in Fig. 1b). We use the mean-squared error between the normalised simulated intensity pattern in the image plane, \({\tilde{I}}_{kl}=I_{kl} / \sum _{k,l \in S} I_{kl}\), and the normalised target intensity pattern, \({\tilde{T}}_{kl}=T_{kl} / \sum _{k,l \in S} T_{kl}\), in the signal region, S, as cost function for the optimisation16,

$$\begin{aligned} C\left( \varphi \right) = s \sum _{k,l\in S} \left( {\tilde{T}}_{kl}-{\tilde{I}}_{kl} \right) ^2. \end{aligned}$$
(3)

The sum is evaluated over k and l in the signal region, S, where s is the steepness of the cost function to aid convergence (see “Methods” for further details).

To generate light potentials of low RMS error experimentally, it is essential to measure the beam profile, \(A_{\text {SLM}}\!\left( x, y\right) \), and constant phase, \(\varphi _{\text {C}}\!\left( x, y\right) \), at the SLM plane. We use an interferometric method14 which displays a sequence of patterns on subsections on the SLM (see SI). Finding a suitable initial SLM phase guess is essential for the convergence of the CG minimisation. We choose an initial phase guess for a given light potential and remove optical vortices from a light potential if necessary (see “Methods”). To reduce the error in the experimental light potential further, we employ a camera feedback algorithm22 (red loop in Fig. 1b, details see “Methods”). The entire protocol is shown schematically in Fig. 1b. After \(m_{\text {max}}\) CG iterations, a camera image is taken to update the target image and restart the CG loop. The feedback algorithm typically converges within \(n=15\) iterations (see Fig. 1c).

Pixel crosstalk modelling

By modelling a single SLM pixel with a single computational pixel, we assume that the phase across the SLM pixel is uniform. However, due to the nature of the liquid crystal material inside the SLM, neighbouring pixels affect each other at their boundary region. This effect is known as pixel crosstalk or fringing field effect28,29,30,31,32,33. We study the effect of pixel crosstalk on our light potentials by up-scaling the SLM phase such that one SLM pixel is represented by \(3 \times 3\) computational pixels and convolving it with a kernel, K,33

$$\begin{aligned} K\left( x, y\right) = {\mathcal {F}}^{-1}\left\{ \exp \left[ -\left( \frac{\left| \kappa _x\right| ^q +\left| \kappa _y\right| ^q}{\sigma ^q}\right) \right] \right\} , \end{aligned}$$
(4)

of order q and width \(\sigma \). As an example, we calculated the SLM phase for a spot array target potential using the CG minimisation, and observed fringes in the camera image (Fig. 2b) which do not appear in the simulated light potential (Fig. 2a). After up-scaling and convolving the same SLM phase pattern, we propagate the field from the SLM plane to the image plane using the Fourier transform. Modelling the pixel crosstalk has no influence on the spatial resolution of the light potential in the image plane. The resulting simulated light potential (Fig. 2c) features fringes similar to those in the camera image, however, with reduced contrast.

Figure 2
figure 2

Simulated and experimental images illustrating the effect of pixel crosstalk. (a) Simulated light potential for a spot array target light potential. (b) Camera image of the experimental light potential showing fringes and an intensity gradient, with less intense spots in the top left of the image. (c) Simulated light potential after up-scaling and convolving the SLM phase pattern with kernel K. The fringes and the intensity gradient seen in the camera image (b) are reproduced in the simulation, however, with reduced contrast.

In the CG minimisation, we account for pixel crosstalk by upscaling the displayed phase, \(\varphi \left( x, y\right) \), and restricting its values to a range between 0 and \(2\pi \) to ensure that the cost \(C\left( \varphi \right) \) remains a continuous, differentiable function. We convolve the up-scaled phase with the kernel, K, before propagating the light field to the image plane. The parameters \(\sigma =1.24 \,\text {px}^{-1}\) and \(q=1.80\) were found by a 2D scan to minimise \(\varepsilon _{\text {M}}\) after 150 CG iterations without camera feedback for a disc-shaped target potential. Using the camera feedback algorithm with the pixel crosstalk model further reduces the RMS error. The final RMS error and the effect of modelling the pixel crosstalk depend on the size of a specific target light potential (see Fig. 3). Upscaling the SLM pixels by a factor of 3 is computationally expensive, however, we accelerate our calculations using a GPU. This reduces the runtime of our algortihm to \(\sim \) 10 minutes which is 6 times longer than without pixel crosstalk modelling (for 15 feedback iterations with 100 CG iterations each).

Figure 3
figure 3

Effect of pattern size and pixel crosstalk on the RMS error. (a)–(c) Disc-shaped potentials (diameters \(D=0.6\,\text {mm}\), \(D=1.5\,\text {mm}\) and \(D=2.8\,\text {mm}\)), generated using camera feedback without the pixel crosstalk model, and normalised by the average intensity in the flat part of the disc. (d) RMS error of disc-shaped light potentials of different diameters with and without the pixel crosstalk model. (e) Horizontal profiles of the light potentials, averaged over 10 rows within the white rectangles in (a)–(c).

To study how the pixel crosstalk model affects our light potentials, we produced disc-shaped light potentials of different diameters, D, between \(0.64 \,\hbox {mm}\) and \(3.3 \,\hbox {mm}\), with and without accounting for pixel crosstalk (see Fig. 3). The target light potential was convolved with a Gaussian kernel of 2 pixels width to ensure that the edge of the disc is not sharper than the diffraction limit. For the initial phase guess, the quadratic phase curvature was adjusted proportional to the disc diameter (see “Methods”). This ensures that the predicted efficiency of the differently sized light potentials remain similar (\(\eta _{\text {P}}\) = 74–87%). Without accounting for pixel crosstalk, we achieved best light potentials (\(\varepsilon _{\text {M}}=1.1\%\)) for small discs of \(D=0.85 \,\hbox {mm}\), and less smooth potentials (\(\varepsilon _{\text {M}}=3.6\%\)) for larger discs of \(D=3.2 \,\hbox {mm}\) with measured efficiencies \(\eta _{\text {M}}\) = 33–40%. We found that \(\varepsilon _{\text {M}}\) is inversely proportional to the measured intensity, \(I^{\prime}\), in the flat part of the disk (Fig. 3d). To obtain \(I^{\prime}\), we measure the average intensity in the flat part of the disc using the camera (see “Methods”). Smaller discs are of higher intensity since the same amount of optical power is focussed onto a smaller area.

We found that the pixel crosstalk causes a ghost image19,34 which can interfere with the light potential and cause fringes. By accounting for pixel crosstalk in our model, any interference with the light potential caused by the ghost image is attenuated which lowered the final experimental RMS error by a factor of \(\sim 0.4\) (\(D=2.8 \hbox {mm}\)). We found that accounting for pixel crosstalk has little effect on smaller light potentials, where the overlap between the ghost image and the light potential is smaller (see Fig. 3d). When taking the pixel crosstalk model into account, the RMS error, \(\varepsilon _{\text {M}}\), remains smaller as the ghost image caused by pixel crosstalk is attenuated (Fig. 3d), and the measured efficiency decreases from \(\eta _{\text {M}}=41\%\) (\(D=0.85 \,\hbox {mm}\)) to \(\eta _{\text {M}}=20\%\) (\(D=3.2 \,\hbox {mm}\)). We found that the predicted efficiency, \(\eta _\text {P}\), is proportional to the measured efficiency, \(\eta _{\text {M}}\). The efficiency predicted by the pixel crosstalk model is lower and closer to the measured efficiency as multiple diffraction orders are simulated. We did not see an improvement in \(\varepsilon _{\text {M}}\) when increasing the resolution of an SLM pixel even further to \(5\times 5\) or \(7\times 7\) computational pixels.

Angular spectrum method

The Fourier transform used to compute the propagation of light from the SLM plane to the image plane requires the far-field and the paraxial approximations (including a parabolic lens) as well as the assumption that lens and camera are perfectly in focus. In practice, we use a doublet lens and and there is an experimental position uncertainty of the lens and the camera along the optical axis. Inspired by the improvement in RMS error reported in a recent study15, we implement Helmholtz propagation using the angular spectrum method (ASM) to model the diffraction of light without assuming a far field or small angles25 (see SI for details on the ASM). In our method, this replaces the Fourier transform, \({\mathcal {F}}\), in the CG minimisation (shown in blue in Fig. 1b) with the ASM.

We investigate the effect of using the ASM together with the pixel crosstalk model in our feedback process (Fig. 4). The ASM is more accurate than the FFT before any camera feedback is used (\(n=0\) in Fig. 4), however, both methods converge to similar values after 15 iterations (see inset in Fig. 4). When including the pixel crosstalk model, the initial error before camera feedback (\(n=0\)) is higher, but the algorithm converges to lower \(\varepsilon _{\text {M}}\) after 15 iterations for both the ASM and the FFT method. In all methods, the experimental error, \(\varepsilon _{\text {M}}\), slowly rises towards the end of the CG minimisation (seen most clearly in Fig. 4 between \(n=1\) and \(n=2\)). This is due to a mismatch between the simulation and the experiment, leading to a discrepancy between \(\varepsilon _{\text {P}}\) and \(\varepsilon _{\text {M}}\) (see Fig. 1c). The lowest value of \(\varepsilon _{\text {M}}\) is found after less than 100 CG iterations (see hollow circles in the inset of Fig. 4). We did not find a significant improvement by using the ASM instead of the FFT after camera feedback. In optical setups involving a high-NA microscope objective where the paraxial approximation does not hold, we expect the ASM to perform better than in our test setup.

Figure 4
figure 4

Convergence of the feedback procedure using the FFT and the ASM, with and without pixel crosstalk modelling. The main figure shows \(\varepsilon _{\text {M}}\) as it converges for \(n=15\) camera iterations with \(m=100\) CG iterations in between. The values \(\varepsilon _{\text {M}}\) used in the camera feedback process are shown as filled circles. To investigate the behaviour of \(\varepsilon _{\text {M}}\) during the CG minimisation, we saved intermediate phase patterns and analysed the resulting light potentials (lines in main figure). For \(n>1\), the experimental error, \(\varepsilon _{\text {M}}\), is smallest for \(m < 100\). The inset shows the convergence during the final 8 camera feedback iterations. The lowest experimental error was found between \(n=11\) and \(n=14\) (hollow circles in the inset).

Comparing light potentials

To characterise our method, we produced various light potentials for cold atom experiments. We created a ring with a Gaussian profile relevant for atomtronic experiments20 (Fig. 5a), a Gaussian potential with an offset as used to cancel the harmonic confinement in optical lattices5 (Fig. 5b) and a Gaussian spot array with a non-zero background for tweezer arrays7 (Fig. 5c). We also generated a potential resembling an ‘atomtronic’ OR gate as used by previous studies15,17,34 (Fig. 5d). For the Gaussian potential and the spot array, we achieved the best experimental results by using an initial phase guess according to Eq. (8) (see “Methods”). For the ring-shaped potential (Fig. 5a) and the OR gate (Fig. 5d), an initial phase guess resulting in vortex-free potentials could not be found in the same way. For these patterns, remaining optical vortices were removed35 (see “Methods”).

Figure 5
figure 5

Camera images and their normalised profiles (along the white dashed lines) after 15 feedback iterations using the FFT with the crosstalk model. (a) Ring with a Gaussian profile on a non-zero background. (b) Gaussian potential with offset. (c) Gaussian spot array on a non-zero background. (d) An ‘atomtronic’ logical OR gate17.

We find that the vortex removal process introduces high-frequency components in the displayed SLM phase \(\varphi \!\left( x, y\right) \) which increases the effect of pixel crosstalk and deteriorates the experimental light potential. Using our technique, vortex-free simulated light potentials can be generated even from an entirely random initial guess. However, starting with such a random initial phase guess results in less accurate experimental light potentials. For all patterns, we used 15 feedback iterations with 100 CG iterations each, accounting for pixel crosstalk during the optimisation. The experimental RMS error of all four patterns varies between \(\varepsilon _{\text {M}}\) = 1.4–1.6%, with measured efficiencies between \(\eta _{\text {M}}\) = 15–31% (see Table 1). The remaining imperfections are most visible in the profiles of light potentials with flat regions. The peak signal-to-noise ratios (PSNR)36 measured in the transformed signal region, \(S_U\), of the light potentials in Fig. 5a–d are \(45.7 \,\hbox {dB}\), \(40.7 \,\hbox {dB}\), \(43.8 \,\hbox {dB}\), and \(39.9 \,\hbox {dB}\), respectively.

Discussion

Compared to previous studies (Table 1), we can generate experimental light potentials of low RMS error and higher efficiencies. Using the CG method, a small line-shaped potential (105 \(\upmu \hbox {m}\) length) of \(0.7\%\) RMS error has been generated7 by optimising the intensity and phase in the image plane as well as the efficiency (\(\varepsilon _{\text {P}}=38\%\)). Using such a phase constraint, it becomes increasingly difficult to generate light potentials which are accurate and efficient for larger patterns. A larger line-shaped potential (\(\sim 400\,\upmu \hbox {m}\) length) has been generated in a different study18,23 by constraining the phase, however, with a much lower efficiency of \(8.3\%\). If the phase of the target light potential is constrained, more accurate light potentials are typically less efficient and vice versa7,18. By removing the phase constraint, accurate and efficient light potentials have been generated computationally using the CG method16, however, the unrestrained phase makes it difficult to realise these experimentally23. In this work, we minimised experimental errors by characterising our optical system and by using camera feedback. This allows us to generate accurate and efficient light potentials experimentally, without constraining the phase in the image plane. Previous studies have characterised their optical system and used camera feedback without constraining the phase15,22, however, using an IFTA (MRAF or OMRAF) instead of the CG algorithm, resulting in less accurate and less efficient experimental light potentials than presented here. In our work, accounting for pixel crosstalk further reduced the RMS error, especially for large light potentials, while lowering the efficiency by \(\sim 20\%\) (see bottom of Table 1).

We did not see an improvement in the RMS error when using the ASM instead of the FFT, however, other experimental uncertainties such as a displacement of the Fourier lens in the xy-plane or a tilt of the Fourier lens could be modelled with the ASM to improve the accuracy of the light potentials before any camera feedback. Cold atom experiments require microscopic potentials to be projected using a high-NA objective, which will be the subject of further work. The FFT might not be sufficient to model this high-NA objective due to the large diffraction angles and the ASM could lead to more accurate potentials in this scenario, even without restricting the phase.

Table 1 Simulated and experimental errors and efficiencies of previous studies compared to this work.

Methods

Phase retrieval problem

The electric field in the SLM plane at \(z=0\), \(E\!\left( x, y, 0\right) \equiv E_{\text {SLM}}\!\left( x, y\right) \), is calculated using the amplitude of the incident laser beam, \(A_{\text {SLM}}\!\left( x, y\right) \), and the phase at the SLM (see Fig. 1)

$$\begin{aligned} E_{\text {SLM}}\!\left( x, y\right) = A_{\text {SLM}}\!\left( x, y\right) \exp {\bigl \{ i\left[ \varphi _{\text {C}}\!\left( x, y\right) + \varphi \!\left( x, y\right) \right] \bigr \}}. \end{aligned}$$
(5)

The phase at the SLM is the sum of the pattern displayed by the SLM, \(\varphi \!\left( x, y\right) \), and a constant phase, \(\varphi _{\text {C}}\!\left( x, y\right) \), which varies spatially across the SLM but does not change with the displayed phase pattern. This constant phase is caused by distortions of the incoming wavefront and imperfections of the SLM surface. In the image plane at \(z=2f\), the electric field, \(E\!\left( x, y, 2f\right) \equiv E_{\text {IMG}}\!\left( x, y\right) \), is characterised by the amplitude, \(A_{\text {IMG}}\!\left( x, y\right) \), and the phase, \(\phi \!\left( x, y\right) \), of the light potential

$$\begin{aligned} E_{\text {IMG}}\!\left( x, y\right) = A_{\text {IMG}}\!\left( x, y\right) \exp {\left[ i\phi \!\left( x, y\right) \right] }. \end{aligned}$$
(6)

Under the paraxial approximation and the far-field approximation, the electric field in the image plane is related to the electric field in the SLM plane via the Fourier transform25, \({\mathcal {F}}\),

$$\begin{aligned} E_{\text {IMG}}\!\left( \kappa _x, \kappa _y\right) = \frac{1}{i\lambda f}\iint ^{\infty }_{-\infty }E_{\text {SLM}}\!\left( x^{\prime}, y^{\prime}\right) \exp {\left[ -2\pi i\left( \kappa _xx^{\prime} + \kappa _yy^{\prime}\right) \right] }\,dx^{\prime}\,dy^{\prime} \equiv {\mathcal {F}}\left\{ E_{\text {SLM}}\left( x, y\right) \right\} , \end{aligned}$$
(7)

with spatial frequencies in the image plane, \(\kappa _x=x/\lambda f\) and \(\kappa _y=y/\lambda f\).

Implementation of conjugate gradient minimisation and camera feedback

We use a nonlinear CG solver37, implemented on a GPU using PyTorch which has automatic differentiation capabilities. This allows us to compute the gradient of the cost function, \(\partial C / \partial \varphi \), without the need for an analytic expression. Using \(s=10^{12}\) (see Eq. 3), the minimisation typically reaches \(\varepsilon _{\text {P}}=1\%\) within 100 iterations, depending on the shape of the desired potential and provided that an initial phase guess which does not lead to optical vortices was used. As the SLM phase pattern is optimised by simulating the diffraction of light, the target intensity pattern, \({\tilde{T}}_{kl}\), is convolved with the point spread function of our optical system to remove sub-diffraction limited features which hinder convergence. If desired, a term could be added to the cost function to optimise for a higher power inside the signal region7. Currently, we do not require control over the phase, \(\phi _{kl}\), in the image plane, however, it is possible to simultaneously control the intensity and the phase in the image plane at the expense of efficiency7,23.

The camera feedback algorithm22 further improves the CG hologram calculation. Initially, an SLM phase pattern, \(\varphi _{ij}^{\left( 0\right) }\), is calculated for a given target light potential, \({\hat{T}}_{kl}^{\left( 0\right) }\), by running the CG minimisation for \(m_{\text {max}}\) iterations. We then display this pattern on the SLM and take a camera image, \(I_{uv}\), of the light potential. We map the initial target light potential from the coordinate system of the computational image plane, \({\hat{T}}^{(0)}_{kl}\), to the coordinate system of the camera image, \({\hat{T}}^{(0)}_{uv}\), using an affine transformation. To calculate the affine transformation, we generate a checkerboard-shaped light potential using the CG algorithm and detect the corner points of the checkerboard in the camera image24,38. Then, the camera image, \(I_{uv}\), and the transformed initial target light potential, \(T^{(0)}_{uv}\), are normalised22 and subtracted from each other. This difference \(D_{uv}={\hat{T}}^{(0)}_{uv} - {\hat{I}}_{uv}\) is then transformed back to the coordinates of the computational image plane and added to the previous target light potential \({\hat{T}}^{(n-1)}_{kl}\), resulting in a new target light potential \({\hat{T}}^{(n)}_{kl}={\hat{T}}^{(n-1)}_{kl} + D_{kl}\) for the next feedback iteration. We then re-run the CG minimisation using the updated target light potential and the previous optimised phase pattern, \(\varphi _{ij}^{\left( n-1\right) }\), as an initial guess. Before the new target potential is calculated, the difference \(D_{uv}\) is blurred with a Gaussian kernel to ensure that there are no features in the new target that are smaller than the diffraction limit (e.g. camera noise) as the CG minimisation cannot produce light potentials containing sub-diffraction-limited features.

Initial phase guess

We use a combination of a linear phase and a quadratic phase as an initial phase guess, \(\varphi _{\text {G}}\), which is common practice in IFTAs and gradient-based phase retrieval algorithms16,17,

$$\begin{aligned} \varphi _{\text {G}}\left( x, y\right) = m_{x}x + m_{y}y + 4R\left[ \gamma x^2 + \left( 1-\gamma \right) y^2\right] , \end{aligned}$$
(8)

The linear terms \(m_{x}x\) and \(m_{y}y\) diffract the light away from the optical axis and are typically determined by the shape of the target light potential, \(T_{kl}\). The quadratic term with curvature, R, and aspect ratio, \(\gamma \), are used to control the size of the illuminated area. Smaller values of R produce more efficient light potentials as more light is focused into the signal region S. The initial phase guess must be chosen such that optical vortices cannot form in the signal region S of the image plane16,17. An optical vortex is a phase winding around a singularity at which the phase is not defined35. The field amplitude at this point is zero, causing ‘holes’ in the light potential (Fig. 6a). The CG minimisation cannot remove these vortices because a global phase shift is required to annihilate them39. By varying R, an initial guess that prevents the formation of optical vortices can be found for ‘simple’ target potentials. We choose a uniform disc on a dark background as a target potential and detect the number of vortices in the resulting light potential for each value of R (Fig. 6e). The vortices in the light potential cause a higher predicted RMS error, \(\varepsilon _{\text {P}}\) (black circles in Fig. 6e and blue circles in Fig. 6f). Certain values of R do not result in optical vortices, and the lowest RMS error was found for \(R=3.6 \,\hbox {mrad}/px^2\).

Figure 6
figure 6

Detection and removal of optical vortices in the disc-shaped light potential. (a) Intensity of the light potential, showing the central \(100 \times 100\) pixels; (b) phase, \(\phi \), of the same potential, (c) phase, \(\phi _{\text {v}}\) of the vortices only, (d) phase, \(\phi - \phi _{\text {v}}\), of the corrected field with vortices removed. (e) Number of vortices detected in the light potential after 100 CG iterations and 10 feedback iterations, using different values for the quadratic curvature, R, in the initial phase guess. (f) Predicted RMS error, \(\varepsilon _{\text {P}}\), before vortex removal (blue circles) and after (orange triangles).

This procedure works well for simple patterns such as a disc-shaped flattop, however, for more intricate light potentials, it becomes difficult to find a suitable initial guess by scanning the value of R. Further, we found that using the measured intensity profile of the incident laser beam, \(\left| A_{\text {SLM}}\left( x, y\right) \right| ^2\), instead of a perfect Gaussian can introduce vortices even for simple patterns. To improve our scheme, we detect optical vortices in the light potential and remove them35,39. Initially, the usual CG minimisation is performed until stagnation is reached. We then detect the position of the vortices by identifying the zero crossings of the real and imaginary part of the electric field in the image plane, \(E_{\text {IMG}}\!\left( x, y\right) \). To find the charge of the vortices, a line integral around the \(3 \times 3\) neighbours of these points is evaluated. The sign of the line integral indicates if the vortex is of positive or negative charge. The phase around these vortices, \(\phi _{\text {V}}\!\left( x, y\right) \), is calculated using the relation39

$$\begin{aligned} \phi _{\text {V}}\left( x, y\right) = \sum _{n=1}^{N}q_{n}{{\,\textrm{Arg}\,}}\left[ \left( x-x_{n}\right) +i\left( y-y_{n}\right) \right] , \end{aligned}$$
(9)

where N is the total number of vortices, \(q_{n}\) the charge of the vortex and \(x_{n}\) and \(y_{n}\) its position. The phase, \(\phi _{\text {V}}\!\left( x, y\right) \), (Fig. 6c) is then subtracted from the phase of the light potential, \(\phi \!\left( x, y\right) \), (Fig. 6b) which annihilates the vortices (Fig. 6d). The electric field consisting of the corrected phase, \(\phi \!\left( x, y\right) - \phi _{\text {V}}\!\left( x, y\right) \), and the amplitude of the light potential, \(A_{\text {IMG}}\!\left( x, y\right) \), is propagated back to the SLM plane using the inverse Fourier transform. The phase of the resulting electric field is used as a new initial phase guess, \(\varphi _{\text {G}}\!\left( x, y\right) \),

$$\begin{aligned} \varphi _{\text {G}}\!\left( x, y\right) = {{\,\textrm{Arg}\,}}\biggl [{\mathcal {F}}^{-1}\Bigl \{ A_{\text {IMG}}\!\left( x, y\right) \exp \bigl [ i\left( \phi \!\left( x, y\right) - \phi _{\text {V}}\!\left( x, y\right) \right) \bigr ]\Bigr \}\biggr ]. \end{aligned}$$
(10)

By re-running the CG minimisation using \(\varphi _{\text {G}}\!\left( x, y\right) \), a vortex-free light potential can be produced, provided that all vortices in the light potential were detected. In case there are remaining vortices in the light potential, this process can be repeated until all vortices are detected and annihilated.

Efficiency measurement

To obtain the power in the signal region, \(P_S\), we measure the optical power that corresponds to a certain pixel value and exposure time of the camera image. We display a circular mask on the SLM containing a linear phase gradient and place an iris in the image plane to block the zeroth-order light. Only the power of the first-order spot caused by the SLM phase pattern is measured using a power meter. We then take a camera image of this spot with a certain exposure time and relate the pixel sum of the camera image to the measured power. Using this calibration, the optical power, \(P_S\), is calculated from the pixel sum of the camera image inside the transformed signal region, \(\sum _{u,v\in S_U}I_{uv}\), and the exposure time. The predicted efficiency, \(\eta _{\text {P}}\), is always higher than the measured efficiency, \(\eta _{\text {M}}\), as it does not take the diffraction efficiency of the SLM into account. When displaying a flat phase on the SLM, the measured power of the zeroth-order spot is \(69\%\) of the incident power, \(P_{\text {in}}\).