On-demand Photonic Ising Machine with Simplified Hamiltonian Calculation by Phase encoding and Intensity Detection

The photonic Ising machine is a new paradigm of optical computing that takes advantage of the unique properties of light wave propagation, parallel processing, and low-loss transmission. Thus, the process of solving combinatorial optimization problems can be accelerated through photonic/optoelectronic devices, but implementing photonic Ising machines that can solve arbitrary large-scale Ising problems with fast speed remains challenging. In this work, we have proposed and demonstrated the Phase Encoding and Intensity Detection Ising Annealer (PEIDIA) capable of solving arbitrary Ising problems on demand. The PEIDIA employs the heuristic algorithm and requires only one step of optical linear transformation with simplified Hamiltonian calculation by encoding the Ising spins on the phase term of the optical field and performing intensity detection during the solving process. As a proof of principle, several 20 and 30-spin Ising problems have been solved with high ground state probability (>0.97/0.85 for the 20/30-spin Ising model).


Introduction
Optimization problems 1 are ubiquitous in nature and human society, such as ferromagnets 2,3 , phase transition 4 , artificial intelligence 5 , finance 6 , biology 7,8 , agriculture 9 , etc.. Usually, combinatorial optimization problems (COPs) are nondeterministic polynomial hard (NP-hard) problems 1 , in which the required resources to find the optimal solutions grow exponentially with the problem scales on conventional von-Neumann machines.To tackle such obstacles, Ising machines as specialized hardware-sovlers are introduced to accelerate the solving process 10 , since any problem in the complexity class NP can be mapped to an Ising problem within polynomial complexity [11][12][13] .The NP-hard Ising problems can be described as finding the energy ground state corresponding to a specific Ising spin vector , which is equivalent to searching the global minima of Hamiltonian in absence of the external field 14

𝐻(𝛔
where  is the adjacent interaction matrix,  = [ 1 ,  2 , … ,   ] T is the Ising spin vector including Ising spins   ∈ {1, −1}, and the superscript T denotes the transpose.To implement Ising machines, various classical and quantum physical systems have been employed.For the quantum annealers, various systems, such as D-wave systems 15 , trapped atoms 16,17 , magnetic tunnel junctions 18 and radio-frequency superconducting quantum interference devices 19 , have been employed to simulate the Ising model.However, the large-scale implementation and fault tolerance of the quantum system are still technically challenging, and the annealing process should be sufficiently slow to maximize the ground state probability 10 .In classical domain, the Ising machines mainly based on electronic devices can be divided into two categories according to the operation mechanisms.In the first category, the Ising spins are represented by the phase of coupled electronic oscillators and the system is driven to the ground state 20 .In the other category, electronic parallel computation schemes, such as the memristor crossbar 21 or the complementary metal-oxide-semiconductor chip 22,23 , are utilized to accelerate the Hamiltonian calculations.Recently, advanced photonics exhibit the feasibilities to encode, transmit and process information on various spatial degrees of freedom, e.g.phase, amplitude/intensity, frequency/wavelength, time slot and spatial profile/distribution. Thus, photonic Ising machines have emerged due to the nature of parallelism and high propagation speed of light.Likewise, there are also two types of photonic Ising machines that have been experimentally reported.The first type of photonic Ising machine is based on the nonlinear optical or optoelectronic parametric oscillators [24][25][26][27][28][29] , in which Ising spins are encoded on the phase terms of the time-multiplexed pulses and the ground state search relies on the spontaneous convergence of the parametric oscillators.For these Ising machines, the coupling among Ising spins has to be implemented via measurement feedback and performing vector-matrix multiplication in an electronic hardware, or aligning the oscillator pulses in the same time slot with the optical delay line.Moreover, whether Ising machines based on nonlinear parametric oscillators can solve random Ising models remains contested 30 .The second type of photonic Ising machine employs heuristic algorithms to solve Ising problems, in which the main computation task is to calculate the Ising Hamiltonians of different Ising spin states iteratively.Such task can be regarded as multiplying a static matrix by an ever-changing vector, which is very suitable for the optical vector-matrix transformation (OVMM) system.Actually, the OVMM has been applied on some specific computation tasks, such as the deep diffractive neural network 31,32 and the nanophotonic processor 33 .As shown in these works, the vector-matrix multiplications could be dramatically accelerated by photonic systems due to the parallel propagation of light.In the spatial photonic Ising machine 34 , numerous Ising spins are encoded on the phase terms of the light field through phase-modulation units of the spatial light modulator (SLM).However, the spins interact in the intensity distribution of the light field with a Fourier lens, hence only some specific Ising models can be solved.The on-chip array of tunable Mach-Zehnder interferometers (MZIs) network 35,36 , which is known as the Reck scheme 37 , is also employed to solve arbitrary Ising models.However, due to the complexity of the Reck scheme, only 4-spin Ising models are experimentally demonstrated 36 .Therefore, it is still highly desired to implement photonic Ising machines that can solve arbitrary large-scale Ising problems with fast speed.
In this work, a photonic Ising machine is proposed and demonstrated with a fully reconfigurable OVMM system and the employed heuristic algorithm is modified from simulated annealing 38,39 .Meanwhile, the calculation of the Hamiltonian is simplified in the optical domain where the Ising spins are encoded on the phase term of the light field and only intensity detection is required.Our proposal is named as the "Phase Encoding and Intensity Detection Ising Annealer" (PEIDIA), which will be briefly explained as follows.At the beginning, with proper treatment of the adjacent matrix , the calculation of Ising Hamiltonians can be modified from the quadratic form (Eq. ( 1)) and only one OVMM is required.Then with a simple summation of the intensities of the light field, the Hamiltonian can be readily calculated.At last, the heuristic algorithm is employed to search the ground state with the obtained Hamiltonian.Thus our proposed PEIDIA is quite helpful to simplify and speed up the calculations in the optical domain.Furthermore, the PEIDIA can serve as a kind of "on-demand" solver for arbitrary Ising models while a programmable OVMM setup is employed to perform arbitrary linear transformation of the input Ising spin vector.
In our experimental implementation, the employed OVMM scheme is based on the discrete coherent spatial (DCS) mode and SLMs, which is improved from our previous work [40][41][42][43] .To verify the feasibility of our proposal, we have experimentally solved an antiferromagnetic Möbius-Ladder model as well as two fully connected, randomly generated spin-glass models.For the 20-spin Möbius-Ladder model, the ground state probability reaches 0.99 (100 runs) within 400 iterations, while that of the fully connected and random model is around 0.97 (100 runs) within 600 iterations.Furthermore, the ground state probability of the 30-spin fully connected and random model is around 0.85 (100 runs) within 1200 iterations.It should be mentioned that, this proposed architecture does not rely on specific optical linear transformation schemes and heuristic algorithms.The main advantage of the PEIDIA is the simplified Hamiltonian calculation with only one OVMM so that the parallelism and fast-speed of the optical calculation may be fully exploited.Thus, the proposed PEIDIA would pave the way to achieve large-scale photonic Ising machines that can solve arbitrary Ising models on demand.

Results
Architecture and operation principles of the PEIDIA.In order to accelerate the solving process of the Ising problem, we have proposed an on-demand photonic Ising machine and Figure 1a shows the architecture design.There are three main stages in the operation of the PEIDIA: the electronic pretreatment, the optical matrix multiplication and the electronic feedback.In the first stage of the electronic pretreatment, the parameters for the setup of the optical system are calculated and configured according to the adjacent matrix  of the given Ising model.In the second stage, the optical matrix multiplication system accelerates the calculation of the Hamiltonian of a certain Ising spin vector.The optical intensities obtained from the optical matrix multiplication are detected and converted to electronic signals.In the third stage, the spin vector for the next iteration will be generated and fed to the optical matrix multiplication according to the adopted heuristic algorithm.The second and the third stage will be conducted iteratively until the algorithm terminates.The details are described as follows.
The main purpose of the electronic pretreatment is to simplify the calculation in the optical domain.According to Eq. ( 1), the Ising Hamiltonian has a quadratic form and two steps of vector-matrix multiplications are required.Actually, only one vector-matrix multiplication is needed in the optical domain with proper pretreatment.First, every real interaction matrix  of the given Ising model can be decomposed to a symmetric matrix  + = ( +  T )/2 and an anti-symmetric matrix  − = ( −  T )/2.Since  − has no contribution to the Ising Hamiltonian in Eq. (1), we only discuss the symmetric component  + and use  to denote  + for simplicity in this article.As  is a real symmetric matrix, the Hamiltonian has the form as follows with eigendecomposition 44 : where  =  T  , while  is the normalized orthogonal eigenvector matrix and  = diag( 1 ,  2 , … ,   ) is the diagonal eigenvalue matrix of .
In our proposal, the vector-matrix multiplication of  is performed by the OVMM.During the ground state search, the transformation matrix  is unchanged while the sampled spin state  updates iteratively.Thus, each Ising spin is considered as encoded on the phase term of the optical field   =  0   =  0 exp(i( 0 +   )), while   corresponds to the element of the spin vector with the value of   ∈ {0, }.By neglecting the constant phase term of exp(i 0 ), the complex amplitude of the output optical field can be written as where  0 is the constant amplitude term.By defining the output intensity vector  by   =   *   , the Hamiltonian becomes (see detailed deduction in Supplementary Note 1) ) .
Eq. ( 4) shows that the calculation of Hamiltonian turns into the simple summation of the optical intensities in a subtle way.Thus in our proposal, the optical computation would perform the task of encoding spin vectors on optical field, vector-matrix multiplications of  and intensity detections as shown in Figure 1a.It should be mentioned that although the Ising spin vector  is encoded on the phase term of optical field, only the measurement of the output intensity vector  is required to obtain the Hamiltonian.
In Eq. ( 4), the first term in the bracket is the summation of the intensities corresponding to the negative eigenvalues, while the second term is that corresponds to the positive eigenvalues, which is due to the difference between     and   *   when   < 0. Since there are subtracting operations, the Hamiltonian is finally calculated with Eq. ( 4) in the electronic domain after the intensity detection.In succession, the heuristic algorithm determines the spin vector for next iteration.Here, a modified simulated annealing algorithm is employed to search the ground state.In the iteration , a spin state  (𝑛) is accepted and its Hamiltonian  (𝑛) is calculated.Then in the next iteration ( + 1),  spins of the spin vector are randomly flipped, which means the spin vector  (𝑛) is updated to  (+1) on the optical domain via the electronic feedback.The variable  is a random integer obtained from a Cauchy random variable (0, ), where  is a scaling coefficient and  is the annealing temperature (see Supplementary Note 2).With such state-generation method, the spin state can experience long jumps occasionally and escape from local minima more easily.Then the difference between the current Hamiltonian  (+1) and the previous one of  (𝑛) is calculated as: ∆ =  (+1) −  (𝑛) .
(5) If ∆ ≤ 0, the generated state  (+1) is accepted.If ∆ > 0,  (+1) is accepted with the probability of exp(−∆/) due to the Metropolis criterion 38,39 .In a single run of the algorithm,  is slowly decreased from the initial temperature of  0 to zero with the increase of the iteration number.According to the annealing schedule, it can be seen that in the early stage of the annealing, the PEIDIA can perform the global search, while at the end of the annealing, it is more likely to perform the local search.Finally, a "frozen" state will be obtained, which may be the optimal ground state with high probability.Actually, other heuristic algorithms can also be adopted, such as genetic algorithm 45 , in which multiple-spin-flips are also desired.Optical vector-matrix multiplication.Generally, the transformation matrix  in Eq. ( 3) is complex and non-unitary, so that the OVMM employed in our architecture should be capable of achieving such non-unitary transformation.In our previous work [40][41][42][43] , a matrix transformation scheme has been demonstrated with discrete coherent spatial (DCS) mode and SLMs.Such scheme can perform arbitrary complex vector-matrix multiplications for both unitary and non-unitary matrices.Based on it, the architecture of the optical computation in the PEIDIA is schematically depicted in Figure 1b.The spin vector is encoded on the input DCS mode, which consists of a group of Gaussian beams.More specifically, the elements of the spin vector are defined as the complex amplitudes at the centers of the Gaussian beams respectively.During the annealing process, each Ising spin   (𝑛) is encoded on the input vector through the appending spin-encoding phase pattern corresponding to the phase delay of 0/, as depicted by the blue/red circular regions in Figure 1b, respectively.Then the input vector passes through the meticulously designed beam-splitting and recombining phase patterns which are determined by the transformation matrix  (see details in Supplementary Note 3).Once generated for a given Ising problem, such two patterns are fixed during the following calculation and annealing process.The output amplitude vector  () consists of the complex amplitudes at the centers of the beams in the output plane, where the output intensity vector  () is detected.In succession, the Hamiltonian  (𝑛) is calculated in the electronic domain and the next sampling state  (+1) is generated and updated to the spin-encoding phase pattern.The spin flip can be simply achieved by adding a constant phase delay  to the corresponding circular region of the spin-encoding phase pattern, as depicted in the last pattern of Figure 1b.
It should be mentioned that with our scheme, the mapping relations between the Ising model and the experimental parameters (the phase patterns corresponding to the matrix ) are simple and explicit.The beam-splitting and recombining scheme can directly conduct the non-unitary matrix transformation without utilizing cascaded structures 40,41 .1) is the photograph of the e.perimental demonstration and the detailed description is provided in Supplementary Note 3. Inset (2) shows the intensity distribution in the transverse plane surrounded by the black dashed bo..Here SLM0 splits the beam into N=20 beams, which is equal to the number of the Ising spins.The color bar denotes the incident power on each pi.el.Inset (3) shows a circular beam-recombining phase pattern on SLM2 which is not superposed with the blazed grating, and the color bar shows the phase distribution.SLM1 encodes the Ising spins to the beams, and implements the optical vector-matri.multiplication together with SLM2.
Experiment demonstration.The experimental setup of the PEIDIA with 20 spins is illustrated in Figure 2, and the photo of the experimental setup is shown in inset (1) of Figure 2. A Gaussian beam at 1550 nm (ORION 1550 nm Laser Module) with the fiber collimator is injected to a half-wave plate and a linear polarizer, which align the polarization according to the requirement of three phase-only reflective SLMs (Holoeye PLUTO-2.1-TELCO-013).Each SLM has 1920×1080 pixels with the pixel pitch of 8μm, serving as a reconfigurable wavefront modulator.SLM0 is employed to split the single incident beam with the radius of 1.63mm into 20 Gaussian beams without overlap as the input DCS mode, and the position distribution in the transverse plane is shown in inset (2) of Figure 2. The beams are arranged in a triangular lattice in order to encode more spins on a single SLM, and the radius of each beam is ~610μm.The selection of the beam radius would be discussed in the section of "Scalability".Both phase patterns for spin-encoding and beam-splitting are applied on SLM1, while the beam-recombining phase pattern is applied on SLM2.SLM1 and SLM2 perform the OVMM of  together.The splitting ratio of each region on SLM1, which is defined as the ratio of the complex amplitudes of the beams after the splitting, is consistent with the corresponding column of .The recombining ratio of each region on SLM2 is an all-one vector, which sums up the incident optical complex amplitudes.For example, one of the twenty (=20) circular beam-recombining phase pattern is shown in inset (3) of Figure 2. It should be mentioned that the phase pattern on SLM2 only depends on the dimensionality and the position distribution of the DCS mode, e.g.20 beams in a triangular lattice in this work, and keeps constant during the solving process.Moreover, the radius of each beam-splitting and recombining region is set to 1175μm that is almost twice as long as the beam radius to conveniently align each Gaussian beam with =20.Actually, the radius of each circular region could be reduced to about 1.5 times of the Gaussian beam radius according to our estimation while there is no significant overlap between adjacent beam spots.Thus for N=30, each beam-splitting/recombining region on SLM is set to 950μm (~1.55×610μm).The detailed discussion about the beam radius is provided in the section of "Scalability".After SLM2, there is a pinhole to filter out the unwanted diffraction components.Before the camera, a lens aligns the beams along the direction of the optical axis.Finally, the DCS mode is detected by the Hamamatsu InGaAs Camera C12741-03.The methods for calibrating the optical system and generating the phase patterns are provided in Supplementary Note 3. Additionally, a CPU is employed to perform the required process in the electronic domain, including the pretreatment of the adjacent matrix, generating the phase patterns on SLMs, flipping the spins, calculating the Hamiltonians and executing other steps required in the adopted heuristic algorithm.

Ground state search for different Ising models.
To verify our proposed PEIDIA, several models with different complexities and numbers of spins have been experimentally solved.The first model is an antiferromagnetic Möbius-Ladder model with =20 (denoted as model 1), in which the nonzero entries are   = −1 as illustrated in the inset of Figure 3a.First, a single run of the PEIDIA is conducted, where a final accepted state is obtained after 600 iterations, and the measured experimental Hamiltonian evolution is illustrated in Figure 3a.Figures 3b and 3c show the detected images and the beam intensities of the output fields of the randomly generated initial state and the final accepted state, respectively, and the positions of both states are marked in Figure 3a.In Figures 3b and 3c, the intensity of each beam is represented by the average power of central 9 pixels inferred from the grayscale.In the eigenvalue matrix  of the 20-dimensional Möbius-Ladder model, the first 11 eigenvalues are negative while the last 9 eigenvalues are positive.Thus, the beams with number 1-11 are marked as the "negative" beams corresponding to the negative eigenvalues, while the rest are denoted as the "positive" beams in Figures 3b  and 3c.In Figure 3b, the intensity mainly concentrates on the "negative" beams, indicating that the initial state is an excited state with a high Hamiltonian ( =11.5).Actually, Eq. ( 4) indicates that the optical intensities are expected to be more concentrated on the "positive" beams to achieve lower Hamiltonians.Figure 3c shows that the intensity finally concentrates on the "positive" beams 19 and 20, while there are almost no signals on the "negative" beams, which corresponds to a low value of the Hamiltonian (=-26.0).The results in Figure 3 indicate that the accepted spin state evolves from an initial state with a high Hamiltonian to a final state with a low Hamiltonian, thus the PEIDIA indeed minimizes the Hamiltonian.In the experiment, the PEIDIA has been run for 100 times, and the corresponding Hamiltonian evolutions are depicted in Figure 4a.In each run, the initial state of the spins is randomly generated.Most of the curves converge to the low Hamiltonians within 400 iterations, and the finally obtained Hamiltonians are very close to the ground state Hamiltonian =-26 which is denoted as the black dashed line in Figure 4a.Such distribution may be mainly due to the systematic error and the detection noise.Actually, the target of the PEIDIA is to obtain the spin vector of the ground state, rather than the actual value of the Hamiltonian.Thus, the accepted spin vectors in each iteration corresponding to all curves in Figure 4a are extracted to calculate the theoretical Hamiltonians with Eq. ( 1), and then the ground state probability is obtained by counting the proportion of the ground state Hamiltonian for each iteration within all 100 runs.The ground state probability versus the iteration number is plotted as the red curve in Figure 4b.It can be seen that as the initial states are randomly generated, the ground state probability is almost 0 in the range of the iteration number less than 50.Then the probability would experience a rapid growth from the iteration number 50 to 300, and gradually converge in the end.The final ground state probability is around 0.99 after 400 iterations, indicating that almost all of the 100 runs can successfully obtain the ground states.For comparison, the algorithm simulation, which does not include the simulation of the entire optical system, has also been carried out for 10000 times on a computer with the same parameters as the experimental settings, and the ground state probability versus the iteration number is plotted as the black curve in Figure 4b.It can be seen that the experimental curve matches very well with the simulation.
As shown in the Figure 4a, the experimental Hamiltonians are distributed around the ground state Hamiltonian in the final stage of searching, indicating that the systematic error and the detection noise cannot be neglected due to the limited performance of the experimental devices.Such error and noise would cause some deviations of the actual transformation matrices, the input vectors and the detected signals from the theoretical expectations.To quantify the influence of these two factors, the parameter of fidelity  is introduced with In Eq. ( 6),  is the intensity vector measured by the camera and  theo = () * ⊙ () ( ⊙ denotes the element-wise multiplication) is the theoretical output intensity vector, which is calculated by the target transformation matrix  and the sampled spin vector .Thus,  could evaluate the accuracy of the output intensity vectors, since both the accuracy of the OVMM and the detection noise are involved.According to Eq. ( 6),  is normalized within [0, 1] and  = 1 represents that the two vectors are perfectly parallel to each other.The fidelities of all 6×10 4 experimental samples are calculated, and the probability distribution of  is illustrated in Figure 4c, and the average value is 0.99978±0.00039,indicating that our transformation scheme is quite accurate.
To present the "on-demand" ability of our proposal, we have considered two other randomly generated and fully connected spin-glass models with spin numbers of 20 and 30 (denoted as model 2 and 3 respectively), in which nonzero entries are uniformly distributed in   ∈ {−1, 1}.The results of model 2 are shown in Figures 4d-4f, while those of model 3 are shown in Figures 4g-4i.The spin couplings of model 2 and 3 are shown in the insets of Figures 4d and 4g respectively.The measured accepted experimental Hamiltonians for 100 runs are presented in Figures 4d and 4g, where the theoretical ground state Hamiltonians are shown as the black dashed lines (=-62 for model 2 and =-117 for model 3).Here, the number of iterations for each run is increased to 1200 and 2000 respectively since model 2 and 3 have higher graph densities than model 1.It can be seen that most of the curves converge within 600 iterations for model 2 and 1200 iterations for model 3.The ground state probabilities versus the iteration number are also calculated and plotted in Figures 4e and 4h with the final ground state probabilities of 0.97 and 0.85 for model 2 and 3, respectively.Both results indicate that our PEIDIA is capable of solving such complex and fully connected models.Furthermore, the fidelity distributions of the output vectors are also calculated and shown in Figures 4f and 4i respectively.The average fidelities of the sampled output intensity vectors for model 2 and 3 are 0.99976±0.00044and 0.99942±0.00131respectively, which are very close to that in model 1.Meanwhile, the fidelities of the corresponding transformation matrices are 0.9989, 0.9994 and 0.9969 for model 1-3, respectively (see Supplementary Note 4).However, for the 30-spin demonstration, the experimental ground state probability becomes a bit lower than that in the simulation.Such result is mainly due to the decreased signal-to-noise ratio of the experimental Hamiltonian with increasing dimensionality  (see detailed discussion of the searching accuracy in Supplementary Note 6).

Discussion
According to our previous work 40,41 , each pattern on SLM is the superposition of a series of phase gratings, hence abundant pixels have to be employed to perform such complex pattern with enough accuracy.In this section, we would estimate the upper limit of the number of spins that can be arranged.Firstly, we refer to the Gaussian beam radius function given by where  0 is the radius of the beam waist between SLM1 and SLM2,  is the wavelength and  is the axial distance relative to the beam waist.To arrange as many spins as possible for a given  and , it is necessary to ensure that the beam radii on both SLM1 and SLM2 are the same, which can be achieved by locating the beam waist at the midpoint between SLM1 and SLM2.For our experiment, =1550nm and  SLM =0.377m (half of the distance between SLM1 and SLM2).By taking the derivative of the Gaussian beam radius function and setting / 0 = 0, the radius of beam waist should be  0 =431μm, which corresponds to a minimum beam radius  SLM =610μm on SLM1 and SLM2.Thus, in our experimental setup, the beam radii on SLM1 and SLM2 are fixed and aligned to ~610μm.Additionally, the radius of each circular pattern on SLM1 and SLM2 should be set to larger than 1.5 SLM so that the beam overlap is about ~1% (estimated by the overlap integral of two Gaussian modes).Thus, for the 20-spin and 30-spin models, the radius of each beam-splitting/recombining pattern on SLM1 and SLM2 is set to 1175μm and 950μm, respectively.It also should be noted that, due to the paraxial approximation and the Nyquist-Shannon sampling theorem 46 , increasing the number of spins would not require larger beam-splitting/recombining regions.Therefore, the minimum radius of the circular pattern is ~950μm in our current experimental setup, and the maximum number of spins is ~36.Another factor that would limit the scalability of the PEIDIA is the noise level of the experimental Hamiltonian (see detailed analysis in Supplementary Note 6).In the experiment of model 1-3, all the noise levels are less than the corresponding minimum Hamiltonian variations, thus high ground state probabilities are achieved.According to our analysis, the signal-to-noise ratio of the experimental Hamiltonian is approximately proportional to 1/√.Such result indicates that the searching accuracy of the PEIDIA would deteriorate in high-dimensional conditions, which would lead to the failure of searching the ground state.
In the future work, the number of spins could be increased by reducing the beam radius, adjusting the distances between SLMs properly (the distances should have lower boundaries as paraxial approximation is applied), or utilizing shorter operation wavelength according to Eq. (7).For instance, if the distance between SLM1 and SLM2 is set to 0.4m and =800μm, the minimum beam radii on SLM1 and 2 are ~320μm and the minimum radius of the circular region is ~500μm so that that ~132 beams could be processed on SLM1 and SLM2.Furthermore, if 4K SLMs (3840×2160 pixels) are adopted, the spin number can be increased to ~520 with the same arrangement as our present setup.Besides, the detector with lower noise and higher dynamic range would be helpful to achieve higher searching accuracy of the PEIDIA in high-dimensional conditions.
Although this work is also based on SLMs, the difference between our proposal and Pierangeli et al.'s work 34 relies on the employed OVMM.In the number of the reconfigurable parameters is 2, which is contributed by the amplitude modulation and the target intensity pattern.It should be noticed that an Ising interaction matrix without external field has the independent entries of ( − 1)/2.Therefore, the OVMM based on a Fourier lens cannot handle arbitrary Ising models.Compared with Pierangeli et al.'s work 34 , in which each spin is encoded by a single SLM pixel, our employed OVMM utilized more pixels to form a spin for arbitrary matrix transformations, hence it can solve arbitrary Ising models -that is to say, our demonstration trades the number of implementable spins for the on-demand characteristic.
As mentioned above, our PEIDIA only requires one non-unitary OVMM with proper pretreatment.Besides, the Ising spins are encoded on the phase term of the optical field and only intensity measurement is needed to calculate the Hamiltonian.However, in the on-chip proposal of tunable MZI network, two cascaded Reck schemes are utilized to perform arbitrary OVMM since only unitary matrix transformations can be performed by the Reck scheme.Each Reck scheme requires ( − 1)/2 MZIs 37 .For example, a 20-dimensional Reck scheme totally needs 190 MZIs, which consist of 380 beam splitters and 380 phase shifters.Such cascaded structure would impede its high-dimensional implementations.Nevertheless, the primary advantage of the proposal with tunable MZI network is the achievement of the Ising machine on a photonic chip.It should be mentioned that our architecture maybe implemented with integrated photonic devices.As shown in Figure 2, the experimental demonstration mainly consists of lens, SLMs and camera.The functions of SLMs for OVMM and the lens could be realized with tunable metasurfaces 47,48 The spin vector could be updated via high-speed phase modulators 49 instead of refreshing SLMs.Besides, the high-speed photodetectors 50 could be employed to measure the output intensities.
The time cost of our demonstration of the PEIDIA consists of the pretreatment cost in the electronic domain and the iteration cost during the annealing process.In the pretreatment stage, the time complexity of the eigen-decomposition is ( 3 ) 44 and the generation of the phase patterns on the SLMs is ( 2 ).In fact, the pattern on SLM0 is a beam-splitting pattern and that on SLM2 is a beam-recombining pattern, which could be pre-generated before the annealing process.Different beamsplitting patterns on SLM1 would correspond to different Ising problems, and the generation of each pattern takes about 11min for the 20-spin experiment and 17min for the 30-spin experiment.Such pre-generation could be done while solving the previous problems.During the annealing process, the beam-splitting and recombining patterns on SLM0-2 are unchanged, and only a pre-generated constant phase delay mask encoding the sampled spin state is appended to SLM1 in each iteration, as shown in Figure 1.Therefore, the time cost is primarily determined by the optoelectronic iterations.The time cost per iteration in optical domain  o depends on the propagation time of light  p , the updating time of SLM1  u and the detection time of the camera  d , leading to a total  o =  p +  u +  d ≈0.32s.The rest operations in the electronic domain are the same as those in the algorithm simulation and the time cost  e is negligible.Thus, the total time cost per iteration is  iter =  o +  e ≈0.32s.In each iteration, the OVMM can perform  = 2 2 + 2 floating-point operations (FLOPs) 33 , including 2 2 multiplications of  and 2 multiplications in the intensity detection process.For instance, for model 3 (=30), the operation speed of the OVMM is  = / iter =5.81kFLOP/s.The total energy consumption of the optical system is =16W, including the power of laser and camera (see detailed analysis in Supplementary Note 7).Thus, the energy consumption is   = / =2.75mJ/FLOP.In the future, the PEIDIA based on integrated devices would achieve more spins, smaller size and less optical time cost, which could lead to higher computation speed and lower energy consumption.

Conclusions
In summary, our proposed PEIDIA provides an architecture that can map arbitrary Ising problems to a photonic system.The PEIDIA provides a fully reconfigurable and high-fidelity optical computation, which can accelerate the vector-matrix multiplication with the parallel propagation of light.The PEIDIA only requires one step of OVMM and intensity detection, which makes the architecture more compact and stable.As a proof of principle, two 20-dimensional and one 30-dimensional Ising problems have been successfully solved with high ground state probability of 0.99, 0.97 and 0.85 respectively.In the spatial-photonic implementation of the PEIDIA, the number of spins can be increased by optimizing the experimental conditions, such as employing shorter wavelength, or higher-resolution SLMs.Meanwhile, the utilization of the detector with lower noise level is significant to ensure the high searching accuracy of the PEIDIA in high-dimensional conditions.In our current experimental demonstration of the PEIDIA, the performances are severely limited by the conversion time between optical and electronic signals.We are still undergoing the corresponding work about the integrated scheme of PEIDIA, and we believe that our architecture could be further improved to achieve large-scale on-demand photonic Ising machines.

Methods
Generating the phase-only patterns on SLMs.The phase-only patterns on SLM0-2 to conduct the beam-splitting and recombining operations are iteratively optimized based on a gradient-descent method, rather than simply taking the argument pattern of the superposed weighted blazed gratings in our previous work [40][41][42][43] .The loss function depending on complex-valued parameters is defined as the difference between the target field and the field modulated by the pattern to be optimized.The three patterns are generated according to the actual experimental setup.More details of the method are provided in Supplementary Note 3.

Calibration of SLMs.
Due to various unavoidable systematic errors such as misalignment in the OVMM system, the phase term and the beam-splitting and recombining ratios of all the patterns need to be calibrated.The calibration method is improved from our previous work 42 based on homodyne detection.Such calibration significantly enhanced the fidelity of OVMM and more details are provided in Supplementary Note 3.
Annealing parameters.For model 1, the experimental parameters are: initial temperature ( 0 ) exp =3000, steps per temperature stage  step =30, stages of temperature  temp =20 and annealing factor  =0.9.For 100 average values of  1 in Supplementary Figure 6e, which are the Hamiltonian normalization coefficients (see Supplementary Note 5), 100 corresponding initial temperature ( 0 ) simu are obtained with ( 0 ) simu = ( 0 ) exp / 1 .Other parameters are the same as those in experiment.For each ( 0 ) simu , 100 simulations are conducted, hence totally 10000 simulations are conducted to obtained the simulation curve of ground state probability in Figure 4b.For model 2/3, the experimental parameter is ( 0 ) exp =2150/1200,  step =40/50,  temp =30/40 and  =0.90/0.92,respectively.The process of obtaining the simulation curve of ground state probability in Figures 4e and 4h of the ground state probability is the same as that in model 1.The full operation process is shown in Supplementary Note 2.

Supplementary Note 1: Detailed deduction in the pretreatment stage and the measurement of the experimental Hamiltonian
At first, we would like to introduce the principle of solving an Ising problem with our proposed Phase Encoding and Intensity Detection Ising Annealer (PEIDIA).in detail.For a given -spin Ising model, we first decompose the adjacent interaction matrix  to the symmetric matrix  + = ( +  T )/2 (the superscript T denotes the transpose) and the anti-symmetric matrix  − = ( −  T )/2.As  − does not contribute to the Hamiltonian, we use  to denote the symmetric component  + for simplicity.Thus, the real symmetric matrix  can be decomposed with eigen-decomposition 1 , where the  = diag( 1 ,  2 , … ,   ) is the diagonal eigenvalue matrix with  eigenvalues  1 ,  2 , … ,   , and  is the corresponding normalized orthogonal eigenvector matrix.By introducing  = √: With such decomposition, the Ising Hamiltonian  can be written as where  = [ 1 ,  2 , … ,   ] T (  ∈ {−1,1}) denotes the Ising spin vector.In Eq. ( S3), the Hamiltonian is the product of the transformed spin vector  and its transpose.It should be mentioned that the eigenvalues  1 ,  2 , … ,   may be negative.For convenience, suppose that  has  negative eigenvalues that satisfies  1 ≤  2 ≤ ⋯ ≤   < 0 ≤  +1 ≤ ⋯ ≤   .Thus, the square root of the diagonal eigenvalue matrix  can be expressed as √ = diag(i√− where the first  rows are imaginary, while the rest ( − ) rows are real.Similarly, since the spin vector  is also real, the first  elements of the transformed spin vector  =  are imaginary while the rest elements are real.Thus it can be written as In the PEIDIA, the optical vector-matrix multiplication (OVMM) is set to  and the input vector is set to .Since only the intensity of the light field can be detected, the output intensity vector is where ⊙ denotes element-wise multiplication.Thus Eq. ( S3) can be further rewritten as Then the relation between the measured intensities and the corresponding Hamiltonian is pixels outside the circular regions are filled with alternating 0 and  phase modulation according to the checkerboard method.Besides, as the employed SLMs are reflective, an additional global blazed grating with the period of 4 pixels is applied to each SLM pattern in order to concentrate the optical intensity in the first diffraction order.The beam from the collimator is centered at coordinate [0, 0] in the transverse  plane.Then the beam is split and projected to  different circular regions on SLM1, as shown in Supplementary Figures 3b and 4. The ideal modulation function of SLM0 is the superposition of a series of blazed gratings 5,6 : where  is the position vector in the  plane,  is the wavenumber,   is the position vector of the center of the th region of SLM1,   is the element of the input vector.
The ideal modulation function of th circular region on SLM1 comprises two parts: a spin-encoding pattern and a beamsplitting pattern.The spin-encoding pattern just applies a phase delay of 0 or  to the beam according to its corresponding spin   = ±1 and the beam-splitting pattern is a superposition of a lens mask and  blazed gratings 5,6 : where   is the element of the beam-splitting matrix,   is the position vector of the center of the th region of SLM2,   is the phase compensation term due to different optical path, and  lens is the phase modulation function of a lens.The lens with the focus  1 adjusts the location of the beam waist to the middle of SLM1 and SLM2.
Each region on SLM2 recombines the  beams deflected by each region on SLM1 and a pinhole is set to filter out undesired light.The ideal modulation function of th region on SLM2 is 5,6 where   is the element of the beam-recombining matrix.The lens with the focus  2 adjusts the location of the beam waist to the pinhole.
There is an evident caveat, however, that the ideal modulation functions above are not pure-phase, thus unattainable with the SLMs employed in our experiments.To address this issue, our previous work maps the value to a point on unit circle with the same argument 5,6 : for it is the closest pure-phase solution to the original function in the sense of least squares.It is worth pointing out that this practice is empirical, and the errors for both the light field and the matrix elements would increase drastically with the dimensionality.
In this work, an improved iterative algorithm is employed to obtain the optimal phase-only patterns, which produce the light field distribution on the target plane approximating to an ideal distribution.The fundamental theory here is gradient descent.The main issue is how to optimize a real criterion function depending on complex-valued parameters.According to ref. 7 , the optimizing theorems are handy to us.
(i) Preparations The first step of optimization is to discretize the problem so that the computer processing can be performed.Consider a continuous model of light field propagation: light field given two parallel planes with aligned transverse coordinates, the propagation of the light field (, , 0) on the first plane to the field ( ′ ,  ′ ,  0 ) on the second plane can be described as 8 where  = √( −  ′ ) 2 + ( −  ′ ) 2 +  2 , ⋆ denotes the convolution, and Fourier transform (FT) is utilized to compute the convolution.
Then the fields are sampled, and the Fourier transform and convolution are replaced with their discrete counterparts.For the sake of simplicity, the light fields are uniformly sampled in a rectangular area, aligned with both axes, and centered on the origin.
Let   0 ∈ ℂ 3  ×3  denote the 2D-DFT of the respond , and ,  ′ ∈ ℂ   ×  denote the sampling of light fields at  = 0 and  =  0 , respectively.Noting the non-overlapping condition and the alignment of discrete convolution, the response has to be sampled from an area of double length in each dimension, then applied with 3  × 3  2D-DFT and finally clips the result to match the shape of  ′ .To formulate it: where ⊙ denotes element-wise multiplication, Δ and Δ denote the sampling interval along each axis, and clip(•) clips extra zeros from the result matrix.Now consider a surface at  = 0, such as an SLM, and assume that each pixel independently produces a wavefront (rectangular, for example) proportional to their complex modulation coefficient.Let  ∈ ℂ 3  ×3  denotes the 2D-DFT of the wavefront and  ∈ ℂ   ×  denotes the coefficient matrix of the hologram.Now Note that the composited operations are defined as   0 (•), which is the discrete form of light field propagation function we desire.
(ii) Generate target light field After discretization, a target light field is needed for each SLM to be optimized.A convenient approach is to treat the light field produced by ideal patterns as target.As well the initial values of optimization are such set with ideal patterns.To formulate it: Optimization steps Given a target light field  T at  =  0 , an incident light field  I at  = 0, and a hyperparameter , optimize phase pattern  in the sense (Huber Loss): Following the procedures described in ref. 7 , we finally derive the gradient of ℒ with respect to , which is: where Finally, the gradient-descent-based algorithm is conducted to optimize the target phase-only hologram.Here is a sketch of the algorithm: Input: target light field:  T , incident light field:  I , hyperparameter: , initial phase pattern: , iteration limit iter.Calibration method of the OVMM.Before the experiments, the whole system should be calibrated to achieve high fidelity of the implemented matrix transformation to reduce the systematic error.In the calibration stage, both the phase and amplitude terms are considered.The phase calibration method is improved from that used in our previous work 10 .
Phase calibration.First, as mentioned above, the phase patterns of SLM1 and SLM2 are generated and set according to the all-one matrix.To measure the phase difference among the elements of the beam-splitting matrix, four output intensity matrices  1 ,  2 ,  3 and  4 have to be measured.In this section, each column of the input matrix would serve as the input vector one by one and the output matrix consists of the corresponding output vectors.
At first, the input matrix  in of SLM0 is defined as an  × ( − 1) matrix with each column including two non-zero elements 1: The process of phase calibration can be described as follows: Initialization: input matrix:  in , matrix on SLM1: all-one matrix, matrix on SLM2: all-one matrix.

End for
Here, the operation "deactivate" refers to replacing a specific region with a "mirror" pattern where the phase modulation is uniform.Such pattern can reflect the incident beam away from the next SLM.Then how to use these matrices to conduct the phase calibration is introduced.
According to the process above, the interference intensities between the first and one of the rest of the beams have been measured as  1 .Since none of the SLM patterns are calibrated, the actual input matrix can only be written as The actual transformation matrix is denoted as   ′ =   exp(i  ) so that the output amplitude matrix  1 =  ′  in ′ is It should be mentioned that Eq. (S27) can only calculate the phase differences among the elements in the same row.But this does not matter since the phase difference among different columns will not affect the output optical intensities.Finally, the phase-calibration matrix Δ is obtained, and the input beam-splitting matrix of SLM1 is modified to   exp(iΔ  ).It should be mentioned that such Δ is independent to the transformation matrix and only determined by the optical setup so that it could be directly applied on other transformations.
Amplitude calibration.The process of the calibration of SLM1 can be described as follows.

End for
In the 20/30-dimensional transformation,  amp,1 = 3~4 is enough to obtain high fidelity.It should be mentioned that due to the phase-only operation, the SLM1 could only split the incident beams.Thus, the input vector of SLM0 has to be modified to compensate the difference between the beam-splitting matrix of SLM1 and the target matrix.Here, the actual contribution of each split beam from SLM0 to the output mode has to be measured.This can be achieved by activating each region of the calibrated SLM1 pattern successively (other regions are "deactivated" as mentioned above).The process is described as follows.
Initialization: target transformation matrix: , input vector of SLM0

End for
In the 20/30-dimensional condition,  amp,2 = 3~4 is enough to obtain high fidelity.
In the end of this section, the 20-dimensional Discrete Fourier Transform (DFT) matrix is set as the target transformation to evaluate the calibration accuracy: Since the elements of both the DFT matrix and its inverse matrix have the same norms and only differ in arguments, it is very sensitive to the phase term.After the amplitude calibration of SLM1, the output intensity matrix corresponding to the input identity matrix is shown in Supplementary Figure 5a.It can be seen that each beam is equally split by SLM1, but the intensities in different columns are different.Then after the amplitude calibration of SLM0, the matrix  is shown as Supplementary Figure 5b, where the matrix elements are much closer to each other than that in Supplementary Figure 5a.
In order to evaluate the accuracy of the calibration, the input matrix is set as the inverse matrix of the DFT matrix, which is also the conjugate transpose.The output matrix should be the 20-dimensional identity matrix.Obviously, the input vector can be simply achieved by adding phase masks to each SLM1 region according to the arguments of the elements of the target input vector.The measured images are shown in Supplementary Figure 5d, where only one bright spot at different locations is observed in each image with similar intensities and high contrast.The output intensity matrix is shown as     1.The parameters necessary for the noise evaluation.The dark current marked by star is referred from Hamamatsu InGaAs camera G14671-0808W as it is not provided in the datasheet of C12741-03.
In the datasheet, the number of excited electrons per pixel per frame is employed to evaluate the readout noise.Thus, we use the same unit to analysis other noises instead of the common unit of power.Using the parameters in Supplementary Table 1, the noises can be evaluated and listed in Supplementary Table 2  element-wise product of the output amplitude vector in the complex field.Take model 3 as an example, for =30, the operation speed of the OVMM is  = / iter =5.81kFLOP/s.
In the experimental demonstration, the power of the input laser is about 2.59μW, 2.07μW and 6.01μW for model 1-3 respectively.The energy consumption mainly depends on the sensitivity of the camera in our current demonstration.With the employed camera, the laser power of  l =6μW is able to achieve high enough signal-to-noise ratio in the experiment of model 3, and the maximum power of the camera is  c =16W.Thus, the energy consumption is   = ( l +  c )/ =2.75mJ/FLOP.

Figure 1 .
Figure 1.The architecture diagram of the Phase Encoding and Intensity Detection Ising Annealer (PEIDIA).a) The full operation process of the PEIDIA consisting of three main stages.  the Ising spins encoded on the phase term of the light field. the transformation matri..   the output comple.amplitude.  the output intensity. the interaction matri. of the Ising model.b) Detailed demonstration of the PEIDIA with arbitrary non-unitary matri.transformation of the discrete coherent spatial mode for a 9-spin Ising problem.The input field consists of 9 Gaussian beams with equal intensities.Different colors in wavefront modulation patterns and optical fields indicate different phases according to the color bar, while the black region denotes 0 phase delay for readability.In the insets of the optical fields, the brightness denotes the optical amplitude.The orange arrows denote the propagation directions of the Gaussian beams.In the shown spin-update process, one red circular region turns blue, representing a spin flip.

Figure 2 .
Figure 2. The experimental setup.BPF bandpass filter.HWP half-wave plate.As the employed spatial light modulators (SLMs) are reflective, additional blazed gratings are applied to all SLMs to e.tinct the zero-order diffractions.Inset (1) is the photograph of the e.perimental demonstration and the detailed description is provided in Supplementary Note 3. Inset (2) shows the intensity distribution in the transverse plane surrounded by the black dashed bo..Here SLM0 splits the beam into N=20 beams, which is equal to the number of the Ising spins.The color bar denotes the incident power on each pi.el.Inset (3) shows a circular beam-recombining phase pattern on SLM2 which is not superposed with the blazed grating, and the color bar shows the phase distribution.SLM1 encodes the Ising spins to the beams, and implements the optical vector-matri.multiplication together with SLM2.

Figure 3 .
Figure 3. Results of the ground state search for the 20-dimensional Möbius-Ladder model.a) An e.perimentally measured Hamiltonian evolution curve in a single run.The inset shows the Ising model to be solved.Two arrows denote the position of the initial state in b and the final accepted state in c, respectively.b) The beam power of the detected image corresponding to a randomly generated initial state with a high Hamiltonian.The left inset denotes the positions of 20 beams.The color bar denotes the incident power on each pi.el.c) The beam power of the detected image corresponding to a final accepted state with a low Hamiltonian.The left inset denotes the positions of 20 beams.The color bar denotes the incident power on each pi.el.

Figure 4 .
Figure 4. Results of the three demonstrated models.a)-c) The 100 e.perimental Hamiltonian evolution curves for model 1-3 respectively.The normalization method is described in Supplementary Note 5.The inset shows the spin interactions of the models where the red and blue lines denote   = 1 and −1 respectively.The black dashed lines indicate the theoretical ground state Hamiltonians.d)-f) The ground state probabilities versus the iteration number over 100 runs for model 1-3 respectively.g)-i) The fidelity distributions of the sampled output intensity vectors for model 1-3 respectively.
argmin  ℒ(;  T ,  I ,  0 , ) where   (•) is previously defined, { • • selects from two values and ∑• sums them up.Other operations here are all elementwise.Note that (•) contracts the loss function and () =  || gives a phase-only hologram, which is what we eventually need.

For i = 1
to N − 1 do Update SLM0 according to the column i of  in Measure the column i output intensity matrix  1 Deactivate the region i + 1 of SLM1 Measure the column i output intensity matrix  3 Reset the SLM1 pattern Deactivate the region 1 of SLM1 Measure the column i output intensity matrix  4 Reset the SLM1 pattern Add the phase delay of π/2 to the region i + 1 of SLM1 Measure the column i output intensity matrix  2

Supplementary Figure 9 .
It is necessary to investigate the noise level of the ground state, which is our main concern.From the experimentally measured ground state signals of model 1-3 shown in Supplementary Figure9, it can be found that the signals most concentrates on the last beams respectively.Experimentally measured ground state signals of model 1-3.(a)-(c) correspond to model 1-3 respectively.The left insets denote the beam positions while the right insets show the ground state images, respectively.The color bar denotes the signal intensity of each pixel in the images.Thus, we can take a rough approximation of the output signal corresponding to the Ising ground state: the detected signals of beam 1 to -1 are 0 while the signal of beam  is the maximum value  well .The corresponding experimental ground state Hamiltonian is  0 =-3×10 5 electrons according to Eq. (S31).Thus, the photon shot noises of beam 1 to -1  p 1 ,  p 2 , … ,  p −1 are 0 while that of beam  is  p  = √ well =774electrons.Therefore, the total noises of beam 1 to -1 are   1 =   2 = ⋯ =   −1 = √ r 2 +  d 2 ≈1003electrons, and that of beam  is    = √ r 2 +  d 2 +  p  2 ≈1267electrons.It can be found that the major noise of the whole system is the readout noise of the detector.Then we obtain the uncertainty of the experimental ground state Hamiltonian as 1 , i√− 2 , … i√−  , √ +1 , … , √  ).As  = {  } is a real matrix, the transformation matrix  has the following form rep = 1 to  amp,1 do Generate and update the new phase pattern on SLM1 pattern according to  SLM1

Table 2 .
. The expressions and the corresponding values of different types of noise.=1.6×10 -1 C is the elementary charge.