Classical simulation of boson sampling with sparse output

Boson sampling can simulate physical problems for which classical simulations are inefficient. However, not all problems simulated by boson sampling are classically intractable. We show explicit classical methods of finding boson sampling distributions when they are known to be highly sparse. In the methods, we first determine a few distributions from restricted number of detectors and then recover the full one using compressive sensing techniques. In general, the latter step could be of high complexity. However, we show that this problem can be reduced to solving an Ising model which under certain conditions can be done in polynomial time. Various extensions are discussed including a version involving quantum annealing. Hence, our results impact the understanding of the class of classically calculable problems. We indicate that boson samplers may be advantageous in dealing with problems which are not highly sparse. Finally, we suggest a hybrid method for problems of intermediate sparsity.

Background and motivation.-Simulatingcomplicated quantum chemistry systems on classical or quantum simulators is an interesting problem with the industrial impact.It is simply cheaper to test many, e.g., molecular configurations in the simulators than synthetizing the molecules and testing their crucial properties experimentally.However, some problems are believed to be of complexity for which classical computers are inefficient.For instance, Huh et al. [1,2] showed that the statistics of Franck-Condon (FC) factors [3,4] for vibronic transitions in large molecules [5,6] is equivalent to the statistics of samples in a version of boson sampling [7][8][9][10][11][12][13][14][15].This observation has two serious consequences.First, finding FC factors may be computationally hard and the complexity may scale exponentially with the number of atoms in a molecule.Second, FC factors can be simulated by boson sampling based quantum simulators.It is widely accepted that boson sampling from interferometers described by the average-case Haar-or Gaussian-random unitary transformations is classically computationally inefficient [7].However, it is not clear if the problems related to quantum chemistry belong to this class, as the related matrices are not typically Haar-or Gaussianrandom [16].In particular, if these matrices were of low rank there exists an efficient algorithm for computing their permanents [17] and so finding the boson sampling statistics.In this paper we discuss the case when we a priori know that the statistics of outputs from a boson sampling setup is sparse.This knowledge can be based on some symmetries or other physical properties of the problem.Then boson sampling can be at least approximately classically simulated.Let us summarize our argumentation.We observe that although full coincidence statistics in ideal boson sampling experiments cannot be efficiently simulated classically the marginal distributions can be.It is implied by the result of Gurvits cited together with the proof in the Aronson and Arkhipov paper [7] as follows: Theorem (Gurvitss k-Photon Marginal Algorithm)There exists a deterministic classical algorithm that, given a unitary matrix U ∈ C M ×M , indices i 1 , ..., i k ∈ [M ], and occupation numbers j 1 , ..., j k ∈ {0, ..., N }, computes the joint probability Here, S = (s 1 , ..., s M ) ∼ D U means that the occupation numbers S are sampled according to the probability distribution over possible outputs of U .As commented by the authors this result in general does not change the complexity of the full boson sampling problem.To have the complete joint statistics we would need marginal distributions with k ≥ N .This implies the complexity scaling exponentially with the number of photons.However, if we invest the knowledge about the sparsity the complexity may be indeed reduced.We discuss the scheme that allows us to recover the joint sparse distribution from the marginal ones.We will use the ideas from compressive sensing, i.e., a sparse signal recovery from undersampled measurements [18][19][20][21][22][23][24].The incoherent measurement in our compressive sampling schemes consists of marginal distributions.The observation about computability of marginal distributions and the existence of compressive sensing algorithms do not guarantee yet that we have an efficient classical algorithm for boson sampling with sparse outputs.We need to show additionally that there exists an efficient algorithm which recovers the sparse joint distribution from compressively sensed data.In this paper we show that such algorithm exists.The arguments are inspired by works from the field of quantum compressive sensing [25][26][27][28][29][30].In particular Cramer et al. [26] proposed an algorithm to reconstruct a low rank quantum state from its reduced density matrices known from quantum tomography of subsystems.They used a modified singular value thresholding algorithm (MSVT).The essential part of it is finding the ground states of matrices that can be thought as Hamiltonians.This operation itself is inefficient, but if the Hamiltonian is constructed from local nearest-neighbor interactions the best so-called matrix product state approximations of the ground states can be found efficiently.Then MSVT converges to the right solution in time polynomial in the size of the system.The application of the matrix product states representation allows as well for significant reduction of the required memory.The procedure we provide is a simplified version of this idea.We show that the complexity bottleneck of a compressive sensing reconstruction from marginal distributions is in localizing the largest element from a long list (counterpart of finding the ground state).This procedure typically would require the number of steps and bits of memory which is O(d), where d is the length of the list.However, we show that in our case the list may be written as the local nearest-neighbor interaction Hamiltionian of a classical spin chain.Localization of the maximum energy configuration is efficient in this problem both in terms of the number of operations and memory [31,32].
Marginal coincidence distributions.-Letus start considering the following scenario.
For some unitary transformation U which reflects features of a physical system we want to simulate the transition probabilities x = | 1, 1, ..., 1, 0...0|U |n 1 , n 2 , ..., n M | 2 from a given (N − 1)-photon occupation numbers state in Mmodes |1, 1, ..., 1, 0...0 to all occupation numbers states |n 1 , n 2 , ..., n M (the left hand side of fig.1).The searched vector x is of length N M .This corresponds to 0, 1, ..., N − 1 possible photons in each mode.We claim that it is possible to efficiently find x calculating the statistics of marginal distributions from smaller number of detectors measuring only chosen states and ignoring the remaining modes (the right hand part of fig. 1) if x is sparse, i.e., it contains at most s non-zero entries, where s N M .To analyze the problem in detail let us introduce the following notation.Vector x can be decomposed in the basis of measured sets of occupation numbers as follows Here, the numbers α n1,...,n N are non-negative, sum up to one and only s of them are non-zero.We will use the following convention where the lower line indicates the positions of 1 in each mode and there is appropriate number of 0s in place of dots.In this explanation we will use a notation for twodetector coincidences, however the formalism can be extended to coincidences of different numbers of detectors if necessary.The measurement of modes i and j leads to the marginal probability distribution where y ni,nj is the sum of all entries α n1,...,n N with fixed n i and n j .We get this distribution observing frequencies of outcomes from given modes independently of what happens in the remaining modes.In the chosen convention the rows of the so-called measurement matrix A are binary patterns as follows where means the entry-wise multiplication and where 1 = (1, 1, 1, ...).Here 1 and (• • 1 • ••) are N dimensional vectors.The entry-wise multiplication preserves the tensor product structure and can be executed as the entrywise multiplication in each part of the tensor product.
The results of k measurement outputs form a k-dim vector where A is a k × N M binary matrix.If x is sparse in the basis incoherent with the measurement then it can be determined based on the number of measurements k N M as the most sparse solution of y = Ax .This problem can be seen as the constraint l1 norm minimization which is a convex optimization problem.It is solvable by many known algorithms [23].In the next part we analyze the complexity of particular algorithms adapted to minimize the computational costs and memory requirements.Complexity analysis.-Assumethat we use only 2 detectors measuring neighboring modes.For boson sampling the marginal distributions can be calculated in polynomial time in the number of modes.Gurvitss Algorithm from [7] does it in time N O(2) N 2 M .The number of measurements is k = N 2 M , the dimensionality of the sparse vector is d = N M .We allow for sub-linear scaling of the sparsity s, i.e., the number of non-zero elements in x.Therefore, we keep O(s) and O(d) well separated.Moreover, we assume that problems of complexity O(s) = O(k) are efficiently tractable.From the measurements, we want to recover the most sparse joint distribution knowing the measurement matrix.So, our goal is to solve the underdetermined problem y = Ax, where y is a k-dimensional measurement vector of marginal distributions, x is a d-dimensional sparse vector which is searched.In our case, rows of A are well structured orthogonal patterns.This implies that it is easy to multiply A by any sparse vector.For instance, assume that we know that an entry κ of a vector is non-zero.We decompose κ as a N-inary number which gives us immediately its tensor product representation as in (2) consistent with the structure of A. For example, assuming that N = 2, position 14 is represented as 01101 (corresponding to binary 13 as 0 occupies the first position) which corresponds to (10) ⊗ (01) ⊗ (01) ⊗ (10) ⊗ (01).(7) This vector has just one non-zero element in the 14th entry.It is immediate to show what is its overlap with a particular row of A which, for instance, can be of the form We need to check only relevant modes.
To solve the underdetermined problem we discuss in detail two first-order greedy algorithms [33].We consider the matching pursuit method which is simple but less accurate and slower, and the gradient pursuit which is more accurate and faster but slightly more computationally demanding.These two algorithms are enough to discuss the bottleneck for the memory and computational costs, and to show how to overcome these problems.
The matching pursuit protocol finds the s sparse solution.It is summarized as follows: I. (Initialization) At step 0: the residual r 0 = y and the approximate solution is x 0 = 0. II.(Support detection) In step i recognize the column of A denoted by A t which is the most similar to the current residue r i−1 solving t = argmax t |(A T r i−1 ) t |.III.(Updating) Update the solution only in index t i.e., x i t = x i−1 t + (A T r i−1 ) t and update the residue using t-th column of A as follows r i = r i−1 − (A T r i−1 ) t A t .IV. Continue iterating until a stopping criterion is matched.
The support detection step dominates the complexity.Without exploiting the structure of A we require O(kd) bits of memory to store the matrix and the same for the operational costs for multiplication A T r i−1 .Moreover, without smart tricks, typically, we would need at least O(d) steps to find the maximum value from the list A T r i−1 .The same amount of memory is needed to store the list.
Considering specific features of the problem we can overcome the bottleneck.Let us notice that the first and the third part of the algorithm can strongly benefit from the sparsity of vectors involved.Moreover, for any t, vector A t can be found operationally (multiplication of A and a sparse vector).So, A t does not need to be stored beforehand and the memory and computational costs of these parts are O(k) -size of r i .The entire procedure can be iterated until a given sparsity s of the solution is achieved.Finally, let us consider the operational costs of the inefficient support detection part.As for finding the index of the largest element of a list, we notice that it is equivalent to finding the leading eigenvector of the diagonal matrix with the list on the diagonal.We know that there are computationally efficient methods for finding the eigenvectors under some conditions.Indeed, we may apply the matrix product states (MPS) formalism which is a powerful tool, for instance, for finding the ground state of a Hamiltonian of a one-dimensional quantum spin chain with the nearest neighbors interactions (Ising model).This Hamiltonian is of the form (9) where H i,i+1 acts only on modes i and i+ 1 in an ordered set of modes with the open boundary condition and we explicitly write identities on the other modes.Let us notice that in our problem A T r in the support detection part can be written exactly as a diagonal local Hamiltonian.To simplify the explanation let us consider an example with the readouts from a single detector only.A row of measurement matrix A corresponding to n m photons in mode m is given as γ nm in (5).Observing all n m from only mode m we have where A [m] is a submatrix of A corresponding to different photon numbers recorded in mode m.Here, r [m] is part of the residual vector that corresponds to rows of Measurements of other modes have also form of the local Hamiltonians if understood as diagonal matrices.So, the same holds for A T r.For coincidence measurements of more than 1 modes given that the measurements are taken for the neighboring modes we have which written in the diagonal form is a typical nearest neighbor Hamiltonian for a one dimensional spin chain.
It is clear from this notation that the matrix vector multiplication requires negligible operational costs and O(k) bits of memory to store d long vector in its compressed representation as a sum of local matrices.Using these observations, the support finding from the matching pursuit algorithm can be determined much faster than in O(d) time.If we consider just two neighboring modes coincidences our problem can be described in terms of the classical spin chain formalism, we can use an explicit strategy from, e.g., [32] to find the optimal configuration of the classical spin chain which is equivalent to finding the position of the maximal value in our A T r list.Indeed, h m,m+1 (i m , i m+1 ) from the algorithm [32] is equivalent to r im,im+1 .This procedure reduces the computational costs of the support detection to 2M N 2 (factor 2 is from the necessity to repeat the procedure for −A T r as we are looking for the largest element in the absolute value).
The matching pursuit (MP) method although computationally simple is not the fastest one as the same support index may need to be used many times.Also it does not guarantee the accurate solution, as the solution is expressed by means of relatively small number of columns of A. Some modifications of the method lead more efficiently to more accurate results.Among them there is the orthogonal matching pursuit algorithm (OMP) in which in each step the support is updated and kept in the memory.The solution is approximated by the vector of coefficients in the best 2-norm approximation of the measurement vector y in terms of the selected columns of A. The gradient pursuit (GP) updates the solution by correcting it in the direction of the largest gradient of the 2-norm mentioned above by a given step.As discussed in [33] the only additional cost over what we have in the matching pursuit algorithm is the cost of calculating the step size.This can be however done efficiently in our case due to an easy way of finding the product of a sparse vector and matrix A. For MP and GP the convergence rates discussed in [33] depend on contracting properties of A.
Accuracy of the reconstruction.-Forsuccessful compressive sensing reconstruction the so-called coherence between the rows of a measurement matrix and the sparsity basis must by low.Roughly speaking it means that a particular measurement is sensitive on changes in a large part of the measured vector.For this reason random unstructured matrices are of particular interest as they are incoherent with almost any basis.However, random matrices are problematic for large scale problems.They are expensive in terms of storage and operational costs as the inefficient matrix vector multiplication is usually needed [34].Therefore, structured matrices which can be defined operationally and do not need to be stored are also desired.In our case the measurement matrix is structured and of low coherence with respect to the measured space basis.However, the usefulness of a particular matrix depends as well on the reconstruction algorithm.Therefore, we tested the performance of the chosen GP algorithm with measurement matrix A numerically.We have measured 1000 randomly chosen distributions of sparsity s = 4, 5, 6 for the problem with M = 6 output modes and N = 4 different events measured in each mode.Matrix A was associated with all 2-neighboring modes coincidence measurements.So, A is a 80 × 4096 matrix.We have tested how often X = x T test x solved /|x test | 2 is larger than a given threshold, where x test and x solved are the randomly chosen measured signal used in the simulation and the reconstructed signal respectively.We iterated the algorithm not more than 50 times.For s = 4 we observe that in about 80% cases X > 0.9 and in 74% cases X > 0.99.For s = 5, we have X > 0.9 and X > 0.99 in 64% and 56% of situations respectively.Finally, for s = 6 we observe X > 0.9 and X > 0.99 in 47% and 37% of cases respectively.As there are many possible more complicated and more efficient algorithms which can still share the feature of the reduced complexity we observe in this paper these results can be improved.However, testing them further is out of the scope of this paper.
Conclusions.-In this paper we show that if we a priori know that an output from a boson sampling experiment is sparse enough we can calculate the first-order approximation efficiently on a classical computer.The crucial steps are to calculate the marginal distributions using the Gurvitss k-Photon Marginal Algorithm and then to use an algorithm for compressive sensing sparse signal recovery.Many of these algorithms quickly converge assuming that the support localization is efficient.We show that due to the specific form of the marginal measurements the problem can be reduced to finding the optimal configuration in a 1 dimensional classical spin chain.The latter is known to be efficient.
In this paper we have investigated the situation with only nearest neighbor modes measurements.This restricts the tolerable sparsity for the method.When the sparsity increases a larger amount of data is needed.Implementing three or more nearest neighbors modes measurements is one of the solutions.We could think as well about relaxing the nearest modes requirement and investigating the situation with non-local coincidence distributions.Then more advance techniques from the theory of spin glass could be also applied [35,36].Finally, our function (11) to be minimized when restricted to nearest binary modes is equivalent to the Hamiltonian H = i>j H i,j , where H i,j = h i,j 1 i,j +h i,j σ z i σ z j +h i,j σ z i ⊗1 j +h i,j 1 i ⊗σ z j .
Here σ z i is the Pauli z matrix in mode i.The coefficients can be chosen to correspond to r ni,nj from (11).The ground state of Hamiltonian H can be found efficiently by simulated or quantum annealing under some conditions about spectrum of H and its correspondence to the connections in the annealer [37][38][39].In our approach we have some freedom in choosing pairs of modes to guarantee that these conditions are satisfied.So, the simulated or quantum annealing could be used to extend our approach.However, detailed study on this matter is a subject of separate study.
An important impact of this paper is that we implicitly suggest a new method for finding Franck-Condon factors under the condition that their distribution is sparse.This problem is in general related to calculating permanens of a unitary transformation [1].In this paper we show that it is enough to calculate permanents of submatrices due to the Gurvits's algorithm and apply the compressive sensing reconstruction.
Our technique can be applied to marginal distributions measured in realistic experiments or calculated assuming losses [40,41].The joint distribution after losses may be interpreted as a lossless distribution multiplied by a stochastic matrix L describing the process of losses.L is often in the form of a tensor product.The measurement matrix from our technique is now the product of A as previously and L. The new measurement matrix inherits features allowing for keeping the complexity tractable.The impact of losses may be a direction of further studies.

FIG. 1 .
FIG. 1. Red or blue bars -the statistics of coincidences of different numbers of photons at the output of an interferometer.Left hand side: The distributions measured directly by M photon-number-resolving detectors.All combinations of possible numbers of photons in M modes form an N M dimensional vector.Right hand side: Marginal distributions of the occupation numbers in two chosen modes.