Introduction

Universal quantum computation (QC) can be realized by combining arbitrary single-qubit rotations with two-qubit entangling gates, such as the controlled-NOT (CNOT) gate. Considerable effort has been devoted to building useful theoretical models for practical universal photonic quantum computers that harness non-classical interference and employ only linear optical elements1,2,3,4. However, this approach yields only a non-deterministic version of the desired entangling gate5,6,7,8, i.e., the gate can be implemented with arbitrarily high fidelity but only with a probabilistic outcome. Even though a deterministic CNOT could be implemented, in principle, by exploiting strongly nonlinear optical elements at the level of single photons9,10,11,12, there are still considerable difficulties in realizing photon–photon interactions large enough to reach the so-called photon blockade regime13,14,15,16,17. Although promising proof-of-concept demonstrations have been shown in solid-state cavity QED18,19,20,21,22,23, and interesting quantum photonic devices might be realized based on these outcomes24,25,26, the actual possibility of exploiting single-photon nonlinearities for developing universal quantum computation has been debated in the past27,28,29. Alternative proposals to implement deterministic two-qubit quantum gates have recently been put forward in various nonlinear photonic platforms30,31,32,33,34, but no significant experimental proof-of-principle demonstration has been shown so far. Hence, the question naturally arises whether deterministic QC in photonic platforms can still be considered a viable route, e.g., by exploiting basic interferometric elements, as in linear optics quantum computation, combined with Kerr-type nonlinearities.

Here we go beyond the previous proposal to implement two-qubit gates in quantum nonlinear photonic interferometers using single-rail encoding34. In fact, unavoidable photon bunching has been shown to severely hinder the application of single-rail encoding for optical quantum computing35. We therefore introduce a QC paradigm based on generalized quantum photonic interferometers, which are shown to allow for an efficient implementation of single- and two-qubit gates based on dual-rail qubit encoding. The platform requirements can be reduced to a standard planar technology in which propagating single photons interact with a given degree of self-Kerr nonlinearity when simultaneously present in the same waveguide channel and freely propagate otherwise. By non-trivially combining a few elementary layers, namely free propagation and nearest-neighbor hopping regions, we show by numerical optimization that deterministic two-qubit gate fidelities arbitrarily close to 100% can be achieved, in theory36. It is worth acknowledging that conceptual analogies exist between our proposed architecture and what has been defined in the literature as a quantum optical neural network37,38. In contrast to most previous works, however, we focus on a model in which photons interact over the whole circuit (a distributed two-photon nonlinearity) instead of considering localized single-site nonlinearities. In fact, we show that the optimization algorithms achieve the most successful implementation of two-qubit gates for values of the photon–photon nonlinearity that are small compared to the hopping between different waveguides, as explicitly demonstrated for the paradigmatic cases of the CNOT and Mølmer–Sørensen (M–S)39,40 entangling gates.
We also note that we go beyond the existing literature on quantum logic gates optimized for one-dimensional quantum walks30, since we consider actual implementations in imperfect photonic integrated circuits and assess the role of different dissipation mechanisms. In particular, we include an analysis of the effects of population decay, decoherence, and fabrication tolerance on the expected gate fidelity, for which we report additional extensive results in Supplementary Note 5. In perspective, the two-qubit operations optimized here can be combined with arbitrary single-qubit rotations on the Bloch sphere (in particular, x- and z-rotations) to implement a universal set of quantum gates and, thus, a full QC architecture. Finally, we discuss the practical feasibility of the proposed scheme in state-of-the-art technological platforms. Our results may ultimately open the route to the realization of deterministic quantum photonic computing.

Results and discussion

A general formalism to analyze quantum photonic interferometers in the presence of photon–photon nonlinearities has been previously introduced34. There, the formalism was meant to theoretically describe integrated platforms in which exciton–polaritons are the propagating elementary excitations, owing to their superior Kerr-type nonlinearities as compared to standard optoelectronic materials. However, the formalism can be generally transferred to any material platform possessing an intrinsic third-order nonlinearity, possibly enhanced by transverse dielectric confinement and pulse shaping, which is described here by a Kerr-type nonlinear Hamiltonian in second quantization. Hence, in the present work, we prefer to keep the theoretical discussion as general as possible and speak about interacting photons; we will specifically refer to the potentially targeted platforms in the “Discussion” section.

The Hamiltonian model

The general case of n one-dimensional channels (with n corresponding input/output ports) in which quantized photon states propagate at fixed wave vector k, interfere through space-dependent evanescent coupling, and nonlinearly interact when simultaneously present within the same channel can be described by the following Hamiltonian (ℏ = 1, see Supplementary Note 1 for a detailed derivation of this model):

$$\mathcal{H}=\sum_{i=1}^{n}\left(\omega_{i}a_{i}^{\dagger}a_{i}+U_{i}a_{i}^{\dagger 2}a_{i}^{2}\right)+\frac{1}{2}\sum_{\begin{array}{c}i,j=1\\ j\ne i\end{array}}^{n}J_{ij}(x)\left(a_{i}^{\dagger}a_{j}+a_{j}^{\dagger}a_{i}\right),$$
(1)

with \(\mathcal{H}\equiv \mathcal{H}[\{\omega_{i}\};\,\{U_{i}\};\,\{J_{ij}(x)\}]\). In Eq. (1), the parameters \(\omega_{i}\equiv \omega_{i}(k)\) and \(U_{i}\equiv U_{i}(k)\) denote the energy–momentum dispersion and the Kerr-type nonlinearity in the ith channel, respectively. Notice from Eq. (1) that only self-Kerr nonlinearity is assumed in our model, while cross-Kerr contributions are fully neglected, and that such nonlinearity is active throughout the circuit whenever two photons are simultaneously present within a single propagating channel (i.e., a distributed nonlinearity), as opposed to models in which such Kerr terms occur in localized regions defined, e.g., by single-mode resonators. The space-dependent parameters \(\{J_{ij}(x)\}\) denote the hopping terms between nearest-neighboring channels, where x identifies the propagation direction in each one-dimensional (1D) waveguide. Photons in the ith channel are annihilated (created) by the bosonic operators \(a_{i}\equiv a_{i,k}\) (\(a_{i}^{\dagger}\equiv a_{i,k}^{\dagger}\)).

In general, the n propagation channels might differ in terms of energy–momentum dispersion as well as photon–photon nonlinearities. For our purposes, we will assume the channels to be identical hereafter. This allows us to consider a unique dispersion relation, ω = ω(k), and the same nonlinearity in every propagation channel, \(U_{i}=U\). In addition, we implicitly assume that single-photon states are injected into the circuit as narrow-band, spatially localized wave packets for which phase distortion is essentially negligible and the Kerr nonlinearity does not depend on the wave vector. A more quantitative justification of such assumptions is given later in the “Discussion” section. Generalizations of this scheme are possible and are left for future work.
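As an illustration of how Eq. (1) can be turned into a matrix for identical channels, the following minimal sketch builds the Hamiltonian in a fixed-photon-number Fock basis. The function names (`fock_basis`, `hamiltonian`) and the numerical values are hypothetical choices made for this example and are not taken from the original work; the hopping array `J` is assumed to be real and symmetric.

```python
import itertools
import numpy as np

def fock_basis(n_channels, n_photons):
    """All occupation tuples (n_1, ..., n_n) with a fixed total photon number."""
    return [occ for occ in itertools.product(range(n_photons + 1), repeat=n_channels)
            if sum(occ) == n_photons]

def hamiltonian(n_channels, n_photons, omega, U, J):
    """Matrix of Eq. (1) for identical channels; J is a symmetric n x n hopping array."""
    basis = fock_basis(n_channels, n_photons)
    index = {occ: k for k, occ in enumerate(basis)}
    H = np.zeros((len(basis), len(basis)), dtype=complex)
    for col, occ in enumerate(basis):
        # Diagonal part: omega * n_i + U * n_i * (n_i - 1), since a†² a² = n (n - 1).
        H[col, col] = sum(omega * m + U * m * (m - 1) for m in occ)
        # Hopping part: for symmetric J the double sum with the 1/2 prefactor
        # reduces to sum_{i != j} J_ij a_i† a_j.
        for i in range(n_channels):
            for j in range(n_channels):
                if i == j or J[i][j] == 0 or occ[j] == 0:
                    continue
                new = list(occ)
                new[j] -= 1
                new[i] += 1
                amplitude = np.sqrt(occ[j] * (occ[i] + 1))
                H[index[tuple(new)], col] += J[i][j] * amplitude
    return H

# Two coupled channels (illustrative values, dimensionless units).
J_two = [[0.0, 1.0], [1.0, 0.0]]
H_one_photon = hamiltonian(2, 1, omega=0.3, U=0.5, J=J_two)   # 2 x 2 matrix
H_two_photons = hamiltonian(2, 2, omega=0.0, U=0.5, J=J_two)  # 3 x 3 matrix
```

In the single-photon sector the nonlinearity drops out, whereas in the two-photon sector the doubly occupied states are shifted by 2U; a four-channel circuit with two photons would use `hamiltonian(4, 2, ...)`, which has dimension 10.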

n-channel circuits and qubits encoding

When many wave vector components are involved, characterizing the action of a generic quantum circuit on a given initial many-photon state, ψI, requires evolving this initial configuration in both time and space to obtain the final state, ψF, which contains the QC result. In the present case, where only monochromatic single-photon states are considered, the description simplifies. As discussed in Supplementary Note 1 and following ref. 34, the action of the generic circuit is encoded in a global unitary operator \(\mathcal{U}_{\rm tot}\) such that

$$\psi_{\rm F}=\mathcal{U}_{M}\,\mathcal{U}_{M-1}\cdots \mathcal{U}_{1}\,\psi_{\rm I}\equiv \mathcal{U}_{\rm tot}\,\psi_{\rm I}\,,$$
(2)

with \(\mathcal{U}_{m}\equiv \exp (-i\mathcal{H}[\omega ;\,U;\,\{J_{ij}^{(m)}\}]\,t_{m})\) denoting the unitary propagator in the mth spatial region of the circuit, i.e., for \(x\in [x_{m},x_{m+1}]\), where the inter-channel hoppings can be treated as piecewise constant functions, i.e., \(J_{ij}(x)=J_{ij}^{(m)}\). In particular, the parameter \(t_{m}=(x_{m+1}-x_{m})/v_{\rm g}\) corresponds to the time spent by photons traveling at group velocity \(v_{\rm g}\) within the mth sub-region of the circuit.
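The ordered product in Eq. (2) can be sketched numerically as follows. This is only an illustrative example under simple assumptions: the helper names (`propagator`, `total_unitary`, `region_time`) are hypothetical, and the two-region circuit acting on a single photon in two channels uses made-up dimensionless values.

```python
import numpy as np

def propagator(H, t):
    """exp(-i H t) for a Hermitian matrix H, computed from its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T

def total_unitary(hamiltonians, times):
    """Ordered product U_M ... U_1 of Eq. (2); the first region in the list acts first."""
    U_tot = np.eye(hamiltonians[0].shape[0], dtype=complex)
    for H, t in zip(hamiltonians, times):
        U_tot = propagator(H, t) @ U_tot
    return U_tot

def region_time(x_start, x_end, v_g):
    """Time spent in a region of length x_end - x_start at group velocity v_g."""
    return (x_end - x_start) / v_g

# Single photon in two channels: a hopping region followed by free propagation.
omega, J = 0.0, 1.0
H_hop = np.array([[omega, J], [J, omega]])
H_free = np.array([[omega, 0.0], [0.0, omega]])
U_tot = total_unitary([H_hop, H_free], [np.pi / 4, 1.0])
```

With these values the hopping region acts as a balanced splitter, so a photon injected in one channel leaves in either channel with equal probability.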

As a final comment, let us stress that we assume dual-rail encoding to implement single photonic qubits, in contrast to ref. 34. In this way, a 2N-channel circuit fed with N single-photon states can be exploited to represent an N-qubit quantum state. As sketched in Fig. 1a, the two logical qubit states in a 2-channel device are defined by the two single-photon Fock states propagating in two adjacent channels, \(\left\vert 1,0\right\rangle\) and \(\left\vert 0,1\right\rangle\), respectively. The former describes a single propagating photon in the upper channel, while the latter describes a single photon propagating in the lower channel. Throughout the paper, we will employ the notation \({\left\vert 0\right\rangle }_{2}=\left\vert 1,0\right\rangle\) and \({\left\vert 1\right\rangle }_{2}=\left\vert 0,1\right\rangle\), in which the subscript denotes the number of channels. Whenever there is no possible confusion, we may drop the subscript. Many-qubit states are straightforwardly obtained by taking the tensor products of the states above in a 2N-channel device.
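The dual-rail map between logical bit strings and channel occupations described above can be written down in a few lines. The sketch below is illustrative only; the function names are hypothetical and the channel ordering (upper channel first within each pair) is an assumption consistent with the notation just introduced.

```python
def dual_rail_occupations(bits):
    """Map a logical bit string, e.g. '01', to photon numbers in 2N channels."""
    occupations = []
    for b in bits:
        if b not in ("0", "1"):
            raise ValueError(f"invalid logical bit: {b!r}")
        # |0> -> one photon in the upper channel, |1> -> one photon in the lower one.
        occupations.extend((1, 0) if b == "0" else (0, 1))
    return tuple(occupations)

def logical_bits(occupations):
    """Inverse map; returns None for configurations outside the logic space."""
    if len(occupations) % 2:
        return None
    bits = []
    for k in range(0, len(occupations), 2):
        pair = tuple(occupations[k:k + 2])
        if pair == (1, 0):
            bits.append("0")
        elif pair == (0, 1):
            bits.append("1")
        else:
            return None  # e.g. two photons in the same qubit pair
    return "".join(bits)

# Two-qubit computational basis in a four-channel device.
computational_basis = [dual_rail_occupations(b) for b in ("00", "01", "10", "11")]
```

Configurations such as `(2, 0, 0, 0)` or `(1, 1, 0, 0)` conserve the photon number but do not correspond to any logical state, which is why `logical_bits` returns `None` for them.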

Fig. 1: Photonic QC elements.

a Qubit logical state obtained with a dual-rail encoding of a pair of photonic channels, in which single-photon Fock states can be injected into each waveguide and measured at the output through single-photon detectors. b The hopping process of a single photon between two channels can be viewed as an RX gate (see paragraph Single-qubit gates). c Single block of a two-qubit information processing unit: Multiple repetitions of such an elementary block can be concatenated to obtain the desired two-qubit gate. Hopping regions (HR) (green boxes) depend on two parameters (Jm, tm), while free propagation (FP) (blue regions) depends only on a single parameter (tm) for a given value of the photon–photon nonlinearity, U. The FP layers within the \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {inter}}}\), \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {down}}}\), and \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {up}}}\) regions are assumed to have fixed propagation time set by the corresponding HR terms, which is implicitly represented by a simple line in the sketch.

Single-qubit gates

Elementary rotations of single qubits can be tailored by suitably combining, e.g., rotations around the x- and z-axes. As shown in Supplementary Note 2, for n = 2 and in the single photon subspace, by means of the dual rail encoding, one can directly map the dynamics of two coupled channels into the action of the RX(θ) rotation gate as

$$RX(\theta )={{\rm {e}}}^{-i\frac{\theta }{2}{\sigma }_{X}}=\cos \left(\frac{\theta }{2}\right){\mathbb{1}}-i\sin \left(\frac{\theta }{2}\right){\sigma }_{X},$$
(3)

multiplied by a global phase factor \({{\rm {e}}}^{-i\omega t}\), where σX represents the Pauli X matrix. In particular, as reported in Fig. 1b, the rotation angle, θ = 2Jt, depends on the system geometry as well as on the group velocity of the propagating photons. As will be addressed in the “Discussion” section, it is reasonable to assume that the simple component depicted in Fig. 1b implements an RX(θ) gate with a tunable rotation angle.
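The mapping between two coupled channels and the RX(θ) gate can be checked numerically with a short sketch. It assumes the single-photon restriction of Eq. (1) for two identical channels, i.e., a 2 × 2 matrix ω𝟙 + JσX, and the propagator convention exp(−iHt) of Eq. (2), for which the global phase evaluates to exp(−iωt). Function names and parameter values are hypothetical.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)

def rx(theta):
    """RX(theta) as defined in Eq. (3)."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * SIGMA_X

def coupled_channel_propagator(omega, J, t):
    """exp(-i H t) with H = omega * 1 + J * sigma_x (one photon, two channels)."""
    H = omega * np.eye(2) + J * SIGMA_X
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T

# Illustrative dimensionless values.
omega, J, t = 0.7, 1.0, 0.4
U = coupled_channel_propagator(omega, J, t)
expected = np.exp(-1j * omega * t) * rx(2 * J * t)
matches_rx = np.allclose(U, expected)
```

For instance, choosing Jt = π/2 gives θ = π, i.e., a full transfer of the photon from one channel to the other (an X gate up to a phase).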

The universal gate set for single-qubit operations can be straightforwardly implemented by adding the RZ gate obtained, e.g., from the addition of a simple phase shifter in one of the two coupled channels3,41.

Two-qubit gates

It is nowadays widely accepted that deterministic two-qubit gates cannot be devised by means of linear optical components alone1,2. Hence, it is reasonable to ask whether the use of nonlinear components in photonic networks, such as those considered in the present work, could eventually help achieve this goal. Here we report a theoretical analysis that attempts to answer this question positively. In fact, we are going to show that a suitable combination of nonlinear and linear components represents a key ingredient for the development of a universal QC paradigm based on photonic platforms. Nevertheless, to the best of our knowledge, there is no intuitive way to determine a priori the optimal arrangement of these elements to deterministically perform a given quantum gate. Therefore, we tackle this problem by exploiting an inverse design strategy based on a multi-parameter optimization algorithm. In particular, we applied this approach to a class of 4-channel circuits formed by concatenating a finite number of blocks, where each block is defined by combining multiple 2-channel hopping regions with free propagation regions of fixed length, as sketched in Fig. 1c (see the “Methods” section and Supplementary Note 2 for further details on the definition of the single unitaries).

Our numerical results show that this approach can successfully provide an approximate but highly faithful representation of at least two different entangling gates, namely the controlled-NOT (CNOT) and the Mølmer–Sørensen (M–S). Their matrix representation on the computational basis, \(S=\{\left\vert 00\right\rangle ,\left\vert 01\right\rangle ,\left\vert 10\right\rangle ,\left\vert 11\right\rangle \}\), is explicitly reported in the “Methods” section for completeness in Eqs. (4) and (5), respectively. In particular, we show that in both cases, it is possible to improve the precision of the representation, which is explicitly quantified by means of the average gate fidelity \(\bar{F}\), Eq. (13), by simply increasing the number of elementary blocks included in the variational ansatz. This is in agreement with general results in the context of quantum optimal control landscape theory. Indeed, it is guaranteed that searching for a solution in high enough dimensional spaces should yield optimal configurations36 (see Supplementary Note 4, Section “Methods” for details). The optimization procedure is implemented as follows: we first parametrize a set of possible hopping parameters between the different channels, given in a matrix representation of the unitary evolution operator, i.e., the generalized time propagator in Eq. (2) (with fixed propagation time). Then we analyse the performances when increasing the number of sequentially concatenated blocks of the type represented in Fig. 1c by letting the algorithmic optimizer minimize the cost function. Details on the choice of the cost function are reported in the “Methods” section, see in particular Eq. (12).
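To make the figure of merit more concrete, the sketch below evaluates a commonly used closed-form expression for the average gate fidelity between a target unitary and a (possibly non-unitary) operator M obtained by projecting the full propagator onto the logic space. This is an assumption made for illustration: Eq. (13) is not reproduced in this section, and the definition used there may differ in detail. The CNOT matrix is written in the basis S given above.

```python
import numpy as np

# Standard CNOT matrix in the basis {|00>, |01>, |10>, |11>}.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def average_gate_fidelity(M, target):
    """F = [Tr(M M†) + |Tr(target† M)|^2] / [d (d + 1)] for a d x d target unitary."""
    d = target.shape[0]
    overlap = np.trace(target.conj().T @ M)
    norm = np.trace(M @ M.conj().T).real
    return (norm + abs(overlap) ** 2) / (d * (d + 1))

f_exact = average_gate_fidelity(CNOT, CNOT)                       # ideal gate
f_phase = average_gate_fidelity(np.exp(0.3j) * CNOT, CNOT)        # global phase only
f_identity = average_gate_fidelity(np.eye(4, dtype=complex), CNOT)
```

The expression is insensitive to a global phase and drops below one whenever part of the amplitude leaves the logic space, which is the behavior an optimizer would need when the cost function is built from such a fidelity.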

Optimization of the CNOT gate

In order to quantify the performances of the optimization scheme, we consider the values of the cost function and the average gate fidelity reached by the multi-parameter optimizer once at convergence, see Eqs. (12) and (13) in the “Methods” section. Numerical results for such quantities in the case of the CNOT gate are shown in Fig. 2a, b, respectively. There, we report results for an increasing number of elementary blocks and different values of the photon nonlinearity, ranging from zero or very weak to ultra-strong as compared to \({J}_{\max }\).

Fig. 2: CNOT optimization.

Comparison of a the best cost function (see Eq. (12)) and b average gate fidelity (see Eq. (13)) for different values of the Kerr nonlinearity U (in hopping parameter Jmax units) after the optimization of the block structure. The free propagation and interaction times were both fixed to t = 1 in dimensionless units.

In order to keep the analysis on a general level, we treat the tunneling rate in each hopping region (the HR units in Fig. 1c) as an independent optimization variable such that \(0\le J\le {J}_{\max }\), with \({J}_{\max }=1\) in dimensionless units. The propagation time in each sector is also fixed at t = 1 as a dimensionless parameter, but it may be included among the optimization parameters as well, in principle. A discussion of realistic parameter values, actual dimensions, and physical implementations is given in the following section. As can be noticed at first glance from Fig. 2a, the cost function cannot be minimized for negligible values of the nonlinearity, always displaying values that are significantly larger than zero independently of the number of blocks. This behavior is compatible with the well-established knowledge that two-qubit entangling gates cannot be realized only by means of linear (or approximately linear) components1,2. On the other hand, slower numerical convergence towards the highest fidelities is observed in the \(U\gg {J}_{\max }\) case, which may be attributed to the onset of the photon blockade between neighboring channels34. However, weak nonlinearities relative to the hopping (in particular, ranging between \(U/{J}_{\max }=0.05\) and 0.5, in normalized units) give rise to reasonably fast convergence of the optimizer towards minimal cost function values. These numerical results suggest that the entanglement between qubit states in such platforms emerges as the result of the competition between hopping (i.e., the J parameters) and the magnitude of the nonlinear shifts affecting two-photon states propagating within the same channel (i.e., U). Indeed, even if the latter are never fed into the device as input states, and they do not belong to the computational basis, their excitation does occur during the time evolution within the circuit and ultimately affects the computation outcome.
Thus, the final output of the four-channel system is the nontrivial result of the extra phase-shifts accumulated in the presence of such nonlinearities.
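The origin of these extra phase shifts can be made explicit with a short worked example, offered here as an illustration that follows from Eq. (1) under the simplifying assumption of a free-propagation region (all \(J_{ij}=0\)) of duration t with identical channels. Since \(a_{i}^{\dagger 2}a_{i}^{2}=n_{i}(n_{i}-1)\), with \(n_{i}=a_{i}^{\dagger}a_{i}\), a state with two photons in the same channel and a state with the two photons in different channels evolve as

$$e^{-i\mathcal{H}t}\left\vert 2,0\right\rangle =e^{-i(2\omega +2U)t}\left\vert 2,0\right\rangle ,\qquad e^{-i\mathcal{H}t}\left\vert 1,1\right\rangle =e^{-2i\omega t}\left\vert 1,1\right\rangle .$$

The doubly occupied component therefore acquires the relative phase \(e^{-2iUt}\) with respect to the singly occupied one. For U = 0 this relative phase is absent and the two-photon evolution factorizes into independent single-photon evolutions, consistent with the observation that negligible nonlinearities do not allow the cost function to be minimized.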

In fact, this is further evidenced by plotting the corresponding average gate fidelity as a function of the number of blocks in Fig. 2b. For nonlinearities in the range \(0.05\, < \,U/{J}_{\max } < 0.5\) this figure of merit reaches values close to 100% for a number of elementary blocks on the order of 10. In addition, it emerges that the average gate fidelity does not depend monotonically on the photon nonlinearity for a given number of blocks and that the condition \(U > {J}_{\max }\) does not help in reaching a high fidelity for a reasonably small number of blocks. From these numerical results, the optimal nonlinearity appears to be \(U=0.5{J}_{\max }\), which allows an almost ideal CNOT gate to be obtained with a reasonably compact circuit: 7 elementary blocks are already sufficient to reach an average gate fidelity very close to 100%.

As an illustration of the performance to be expected, the real and imaginary parts of the optimized quantum gate matrix are explicitly shown for the best CNOT gate obtained with our numerical optimization procedure (see upper panels in Fig. 3 for a close-up). The real part appears to match the CNOT matrix, while the imaginary part is almost negligible, thus faithfully reproducing Eq. (4). In Fig. 3 we also show the full transfer matrix when taking into account all the possible configurations within the two-photon subspace, which also extends outside the logic space of interest (e.g., when two photons simultaneously propagate within the same waveguide channel at the input or output port, respectively). This strongly supports the conclusion that such a gate is deterministic, since logic states are mapped outside the logic space only with negligibly small amplitude. It should be noted that inside the entangling gate, the quantum state can (and must) populate states lying outside the logic space in order to have the two photons nonlinearly interacting with each other; on the other hand, the output must always be restricted to the logic space for the operation to be defined as deterministic. The optimal parameters found for this structure are fully reported in a dedicated table in Supplementary Note 8, so that interested readers can straightforwardly reproduce our results.
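The notion of a deterministic gate used here can be quantified by the probability that a logic input leaves the logic space. The following sketch shows one possible way to compute it from a full transfer matrix; the function names and the three-level toy unitary are hypothetical and serve only to illustrate the bookkeeping, not to reproduce the optimized circuit.

```python
import numpy as np

def logic_block(U_full, logic_idx):
    """Sub-matrix of the full transfer matrix restricted to the logic states."""
    return U_full[np.ix_(logic_idx, logic_idx)]

def leakage(U_full, logic_idx):
    """For each logic input, probability of ending up outside the logic space."""
    M = logic_block(U_full, logic_idx)
    return 1.0 - np.sum(np.abs(M) ** 2, axis=0)

# Toy example: states 0 and 1 are "logic", state 2 is outside the logic space.
alpha = 0.1
U_toy = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(alpha), -np.sin(alpha)],
                  [0.0, np.sin(alpha), np.cos(alpha)]])
leak = leakage(U_toy, [0, 1])
```

In this toy case the first logic state never leaks, while the second leaks with probability sin²(α); for a deterministic gate all entries of `leak` should be close to zero.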

Fig. 3: Best CNOT matrix.

a Real and b imaginary parts (rounded to the third decimal digit) of the approximate CNOT matrix on the basis of all the possible two-photon input/output states obtained by using a 12-block structure with \(U=0.5{J}_{\max }\). The theoretical average fidelity for this gate is \(\bar{F}({\theta }_{{\rm {opt}}}) \, \approx \, 99.96 \%\). Vertical and horizontal black lines separate the logic space from the space spanned by states that are not in the computational basis. The projection of logic states out of the computational space is negligible, as clearly seen from the plot. The zoom highlights the gate matrix in the two-qubit logic space.

Optimization of the M–S gate

While the optimization of the CNOT gate by concatenating a limited number of elementary operations in a nonlinear quantum photonic circuit is a relevant result per se, here we also show that a similarly efficient algorithmic optimization is possible for other entangling operations on the two-qubit space. Namely, we focus on the ideal quantum gate defined in Eq. (5), the so-called Mølmer–Sørensen gate. The related numerical results are summarized in Fig. 4, where the behavior of the optimized average gate fidelity is shown as a function of the number of blocks for different values of the nonlinearity (Fig. 4a). The best approximate matrix is also shown in Fig. 4b. Similarly to what was observed for the CNOT gate, there appears to be a general trend: convergence towards the ideal operation is reached quite fast for values of the nonlinear parameter in the range \(0.05{J}_{\max } < U < 0.5{J}_{\max }\). This is not the case for negligible or too large nonlinearity values, although the photon blockade (i.e., for \(U/{J}_{\max }=10\)) seems to have less of a detrimental effect here. Fig. 4b also shows the optimal M–S gate obtained for \(U/{J}_{\max }=0.5\), restricted to the computational basis subspace, which evidently implements an almost ideal deterministic RXX operation. A plot of such an optimized M–S gate in the whole two-photon excitation space is explicitly reported in Supplementary Note 7, in analogy to Fig. 3 for the CNOT. Also in this case, a table reporting the optimal parameters for this structure is given in Supplementary Note 8 for completeness.

Fig. 4: Mølmer–Sørensen gate optimization.

a Comparison of the best average gate fidelity values after the optimization of the block structure, for an increasing number of blocks and different values of the nonlinearity U. Free propagation and interaction times were fixed to 1 ps. b Optimal M–S matrix (rounded to the third decimal digit) for \(U/{J}_{\max }=0.5\) and 20 blocks, for which we calculate \(\bar{F}({\theta }_{{\rm {opt}}}) \, \approx \, 99.96 \%\).

Realization and material platforms

It is now relevant to discuss the scalability and the possible physical implementations of the scheme. In this context, in the following we also discuss the possible sources of noise, loss, decoherence, and fabrication tolerance, which we address here at the level of qualitative discussion while reporting all the relevant quantitative results in Supplementary Notes 5 and 6.

Several quantum photonic platforms have been put forward in the past few years, mostly based on conventional nonlinear materials in passive semiconductors, such as Si, SiN, glass, etc.41. While the scheme we have presented in this work is general and can also be applied outside the photonic realm, in principle, here we focus on discussing its realization in nonlinear photonic circuits. First, we notice that any realistic source of single-photon wave packets should be considered over a finite bandwidth. This implies that a proper theoretical treatment should consider a full quantum mechanical description of the multimode fields and their nonlinear interaction to best capture the effects of cross- and self-phase modulation on the propagating wavepackets27,28,29. The theoretical approaches developed, e.g., in refs. 32,42 might be applied, which would require a significant extension going beyond the present work. In particular, numerical limitations in the algorithmic optimization procedure might arise, which is then left for future developments.

Concerning these sources, single-photon Fock states can be produced at high repetition rates, high purity levels, and low bandwidth from single quantum emitters43, as well as detected at the output of the device through highly efficient single-photon detectors44. In addition, the injection of single-photon Fock states generated from quantum emitters into photonic integrated circuits does not constitute a technological bottleneck nowadays45. In fact, most single-photon sources based on pulsed single emitters, such as single QDs46 or molecules47, are quite narrow-band, and they can be coupled to an approximately linear waveguide mode dispersion. The latter can, in turn, be fully engineered by inverse design techniques48, with either real or imaginary parts of their eigenmodes possessing the desired properties, such as linearity, reduced bandwidth (e.g., flat band modes), and low losses (e.g., bound states in the continuum)48. In addition, we also assume that the single-photon wave packets are spatially localized, typically on the optical wavelength scale, to enhance the two-photon nonlinear interaction. This may be practically achieved as follows: a single-photon source emits a narrow-band wave packet (i.e., about 0.1 meV in a cavity, with a temporal spread on the order of a few tens of ps), which in turn excites a linear guided mode with low group velocity, thus corresponding to a spread in reciprocal space on the order of 0.5 μm−1; ultimately, we get a single-photon state with a temporal duration of a few tens of ps propagating as a wave packet with a spatial extension of a few microns. As an additional prerequisite, the Kerr nonlinearity should be assumed constant over this bandwidth (and hence wave vector span).

Then, we notice that in view of devising a full QC platform, the optimized circuits implementing two-qubit operations must be complemented with isolated two-channel circuits in which single-qubit rotations are performed. In this respect, the hopping rate J can be tuned by suitably choosing the spatial separation between two adjacent channels. On the other hand, the parameter vg can be tuned and externally controlled by properly shifting the working point along the photon dispersion. This task can be practically achieved, for instance, by local temperature tuning, which results in local refractive index changes. This would provide, for instance, the flexibility to perform RX rotations with externally controlled rotation angles. In principle, the application of the optimization procedure described in this work might lead to the targeted design of a quantum photonic integrated circuit fulfilling the requirements to achieve deterministic quantum gate operations for each specific platform under consideration. As an alternative long-term view for scalability, it might also be worth envisioning the optimized two-qubit gate as a stand-alone element of an external computational architecture, dedicated solely to performing the operation for which it is designed, e.g., the CNOT or the RXX gate, any time it is needed in the quantum computing algorithm.

In terms of possible material platforms, besides the well-established silicon-on-insulator technology, where remarkable advances in quantum photonic experiments have been recently shown49, a viable example might be the SiN platform, for which the high nonlinearity, low-loss, and mature fabrication of complex circuits50 might turn out to be an optimal combination to realize a proof of principle demonstration, at least. More recently, exciton-polaritons in semiconductor nanostructures have been proposed as an interesting platform to realize quantum photonic applications33,34,51, relying on the first experimental evidence for the quantum nature of the propagating polariton field excited from a single-photon Fock state52,53 (a key requirement of our theoretical scheme). First, let us notice that by assuming, e.g., typical excitation energies in the ω ~ 1 eV range, hopping parameters can be assumed such that \(\hslash {J}_{\max } \sim 1\) meV, as an order of magnitude value derived from recent experiments54. Then, we notice that the polariton–polariton scattering rate is of the Kerr-type, as it is well established14,20,21. This allows us to realize the model Hamiltonian in Eq. (1) with values of nonlinearity that are compatible with the ones required for the optimal operations identified in the previous section. In fact, the single-photon nonlinearity depends on the actual field confinement14,15. For 1D propagating polariton wave-packets with the spatial extension of the order of their wavelength (i.e., 1–2 μm, corresponding to a wave vector spread of, say, 0.5–1 μm−1 over a bandwidth of few hundred μeV, after the proper design of the polariton waveguide mode dispersion) this may range from 10 μeV (realistic, see, e.g., ref. 20) up to 50 or even 100 μeV depending on the material platform (optimistic, although recent works have reported enhanced nonlinearities of dipolar polaritons55,56,57). 
These values would then correspond to \(U/{J}_{\max }=0.01-0.1\), i.e., in the range where we have shown that two-qubit gates could actually be realized with large fidelity (see Figs. 2 and 4). In addition, \({J}_{\max }\) can also be reduced in realistic samples (depending on the distance between neighboring waveguides), thus making the optimal parameter (\(U/{J}_{\max }=0.5\)) within reach. This would also allow us to reduce the number of blocks required to achieve an optimal fidelity close to 100%, which is particularly relevant when losses have to be included in the discussion, as it will be addressed in the next section.
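The order-of-magnitude estimates above reduce to simple arithmetic, which the following sketch reproduces. It uses only the values quoted in the text (nonlinearities of 10, 50, and 100 μeV and ℏJmax of about 1 meV); the helper names are hypothetical. The conversion of the dimensionless time into picoseconds additionally assumes that t = 1 is measured in units of 1/Jmax, which is an assumption of this example rather than a statement from the text.

```python
HBAR_MEV_PS = 0.6582119569  # reduced Planck constant in meV * ps

def nonlinearity_ratio(U_ueV, hbarJ_meV):
    """Dimensionless ratio U / J_max from U in micro-eV and hbar*J_max in meV."""
    return (U_ueV * 1e-3) / hbarJ_meV

def hopping_for_ratio(U_ueV, target_ratio):
    """hbar*J_max (in meV) needed to reach a target U / J_max for a given U."""
    return (U_ueV * 1e-3) / target_ratio

def time_unit_ps(hbarJ_meV):
    """Physical duration of one dimensionless time unit, assumed to be 1 / J_max."""
    return HBAR_MEV_PS / hbarJ_meV

ratios = {U: nonlinearity_ratio(U, 1.0) for U in (10, 50, 100)}
J_needed = hopping_for_ratio(50, 0.5)   # hopping that gives U / J_max = 0.5 at U = 50 micro-eV
unit_ps = time_unit_ps(1.0)             # time unit for hbar*J_max = 1 meV
```

With ℏJmax = 1 meV the three nonlinearities give ratios of 0.01, 0.05, and 0.1, and reaching the ratio 0.5 with U = 50 μeV would require reducing the hopping to about 0.1 meV.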

Losses, decoherence and fabrication imperfections

We have introduced a scheme based on the implementation of a unitary Hamiltonian evolution, but the realization of this paradigm in state-of-the-art photonic devices has to inevitably cope with the effects of losses and decoherence. Moreover, tolerance of the simulated figures of merit, such as gate fidelities, to fabrication imperfections or static fluctuations of structure parameters has to be assessed. Here we provide a short discussion summarizing the main conclusions about these effects. The role of losses and decoherence has been characterized by means of numerical simulations within an open quantum system approach. In particular, we stress that we paid attention to the deviations induced by non-unitary effects on the optimized circuits characterized in the “Results” section. A priori, such a procedure is conceptually different from performing a parameter optimization of the quantum circuit in the presence of incoherent effects. Therefore, it is reasonable to believe that further performance improvements in terms of average gate fidelity could be achieved by accounting for such incoherent effects during the optimization procedure. The details are reported in Supplementary Note 5.

First, we address the issue of population losses. As already analyzed34, propagation losses in such interferometers may affect the actual signal intensity detected at the output of the device, but correlations between single photons properly normalized to the detected intensity are not affected. In fact, provided the whole interferometer length remains within the propagation lifetime, the structure of the quantum gate operation is preserved within the computational basis, albeit with reduced efficiency due to the reduced probability of detecting coincident photons at the output (signal loss). To check this conjecture numerically, we have solved the quantum master equation for our model, including population losses in a Lindblad term (see Supplementary Note 5). The outcome is straightforward: the average gate fidelity has an exactly exponential decay with a lifetime given by the inverse of the two-photon population decay rate, i.e., τ = 1/(2γ). Formally, such an exponential decay implies that in the presence of losses, the circuit does not implement a deterministic gate. On the other hand, we stress that such exponential damping is expected to play a crucial role in the performance of any real-world quantum circuit58, where the computational basis is represented by a collection of states with a given fixed number of fundamental excitations (in our case the two photons propagating within the given channels). In such a case, any input state belonging to the computational subset will drift out of such subspace, leading to a less-than-unity average gate fidelity at the output. In this respect, as shown and discussed in Supplementary Note 5, our circuit displays some interesting features. 
Indeed, even if spurious contributions necessarily appear in the output state, the gate structure in the two photons sector is (almost) preserved, i.e., the lossy circuit reproduces—up to the exponential scaling factor mentioned above—the quantum gate targeted during the optimization procedure.

Other sources of noise, such as thermal noise or pure dephasing, can also be quantified through dissipative terms in the master equation (see Supplementary Note 5). First, photon number fluctuations can be considered negligible, given that low working temperatures in the Kelvin range produce negligible thermal photons in the visible/near-infrared range, and that pure Fock-state injection guarantees the absence of higher photon number states (which would instead be present, e.g., with an attenuated laser source). On the other hand, the effects of number-dependent dephasing might have an impact on the relative coherence between the qubit basis states. In order to properly quantify these effects, we have considered a number-dependent Lindblad term with a pure dephasing rate γdeph. Our numerical results (reported in Supplementary Note 5) suggest that a pure dephasing rate γdeph < 10⁻²γ0 (with γ0 determined as the inverse of the total propagation time within the whole interferometer) has no practical effect on qubit decoherence, independently of the population decay rate γ, thus preserving the structure of the two-qubit entangling operation within the computational basis sector. Moreover, dual-rail encoding naturally minimizes the effects of the loss of relative coherence between the two logical states of a single qubit, which are defined from single photons propagating in different channels. The results are interesting per se, and they might serve as a guideline for experimentalists working on specific material platforms once a proper characterization of dephasing rates is performed.

Finally, we have also considered the effects of static fluctuations of the model parameters on the overall gate performances, as it might occur when fabrication imperfections or structural disorders come into play after the circuit realization. Also, in this case, by considering realistic parameter fluctuations in state-of-the-art semiconductor technology, it is concluded that the optimal gate fidelities predicted in this work can be safely preserved against the main sources of static disorder (see Supplementary Note 6).

To conclude this section, we give more quantitative estimates of the model parameters for prospective polariton integrated circuits. First, let us notice that the intrinsic exciton-polariton lifetime in inorganic semiconductor nanostructures has been measured to be on the order of 100 ps59. Moreover, by exploiting the concept of bound states in the continuum in a patterned waveguide geometry to suppress out-of-plane radiation losses, lifetimes on the order of 300 ps have been measured in polariton condensates (see, e.g., additional material of ref. 60). Recently, such propagation lifetimes have been confirmed in a polariton waveguide geometry57. In addition, different material platforms and designs have been explored for waveguide polaritons61,62,63,64. Based on these results, it is very likely that, by combining suitably engineered photonic lattices with single-quantum-well samples, propagating single-polariton states with ultra-long propagation lengths might be realized65. Hence, the single-polariton decay rate can be estimated in the γ ~ 1–10 μeV energy range. For the results shown, e.g., in Figs. 2 and 4, a sequence of 12 blocks with an average duration of about 8 ps per block (i.e., assuming a 1 ps average propagation time in each of the eight sectors schematically represented in Fig. 1c) amounts to an estimated total propagation time of 96 ps when considering \(U/{J}_{\max }=0.05\). This would make it possible to achieve, within the computational basis subspace, a two-qubit entangling gate with fidelity in excess of 99.5% and a total propagation time well within the polariton lifetime of, e.g., 300 ps, although with reduced efficiency due to the exponential decay of population (see, e.g., Fig. S-2 in Supplementary Note 5).
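The time budget above can be reproduced with a few lines of Python. This is only a rough sketch under two assumptions of ours that are not spelled out in the text: the decay rate γ, quoted in energy units, is converted to a lifetime as ℏ/γ, and the efficiency reduction is taken to be the two-photon survival factor exp(−t/τ) with τ = ℏ/(2γ), following the exponential scaling discussed in the section on losses. All function names are illustrative.

```python
import math

HBAR_UEV_PS = 658.2119569  # reduced Planck constant in micro-eV * ps


def total_time_ps(n_blocks=12, sectors_per_block=8, time_per_sector_ps=1.0):
    """Total propagation time for the block sequence described in the text."""
    return n_blocks * sectors_per_block * time_per_sector_ps


def lifetime_ps(gamma_uev):
    """Single-polariton lifetime hbar / gamma (assumes gamma is an energy linewidth)."""
    return HBAR_UEV_PS / gamma_uev


def two_photon_survival(t_ps, gamma_uev):
    """Survival factor exp(-t / tau), with tau = hbar / (2 * gamma)."""
    return math.exp(-2.0 * gamma_uev * t_ps / HBAR_UEV_PS)


t_total = total_time_ps()  # 12 blocks x 8 sectors x 1 ps
for gamma in (1.0, HBAR_UEV_PS / 300.0, 10.0):  # middle value: 300 ps lifetime
    print(f"gamma = {gamma:5.2f} ueV: lifetime = {lifetime_ps(gamma):6.1f} ps, "
          f"two-photon survival after {t_total:.0f} ps = "
          f"{two_photon_survival(t_total, gamma):.3f}")
```

With this conversion, a 300 ps lifetime corresponds to a decay rate of roughly 2.2 μeV, inside the 1–10 μeV range quoted above. The survival factor is an estimate of the detection efficiency only; it says nothing about the gate structure inside the two-photon sector, which the text reports as being (almost) preserved.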

Comparisons with existing literature

We found conceptual connections with a few works in the recent literature30,37,38, as already mentioned in the Introduction. On the one hand, the analogy with quantum optical neural networks37,38 may be further explored in our case to find a possibly reduced depth of the circuits to implement a targeted two-qubit gate. In particular, layers of linear interferometers combined with localized nonlinearities are considered in ref. 37, where a CNOT gate in dual-rail encoding is optimized in a noiseless quantum optical neural network with a 10⁻⁴ error, which is basically comparable to our result. In a follow-up work38, lossy quantum optical networks are considered and their functionalities are optimized in the context of Bell-state analysis. In the latter work, only population losses are taken into account, which reduces the efficiency of the Bell-state analyzer (similar to our discussion above), while the operation within the computational basis is preserved. Losses are taken into account already at the training stage, which we might consider for future developments of the present work. While the work in ref. 38 does not consider optimization of CNOT or M–S gates, we can infer similar performances of the offline-optimized neural networks (i.e., a perfect network is trained, then losses are added to the solution) when compared to our quantum interferometers, albeit with different circuit depth due to the different model for nonlinearities. In fact, localized Kerr scatterers are considered between purely linear interferometric layers, imparting up to a π phase shift on the two-photon states, and operation efficiency drops significantly to about 30% when a π/4 shift is assumed. In our proposed implementation, the distributed nonlinearity model is far from imparting such a large phase shift, which is the reason why we obtain comparable performances at the expense of an increased number of layers (blocks).

Finally, a CNOT gate is optimized in ref. 30 in continuously coupled waveguides implementing correlated 1D quantum walks. The theoretical model considered there shares many similarities with ours, with competition between nonlinearity and hopping occurring over the whole propagation, and no losses are considered. A similarly optimal U/Jmax = 0.5 is found for the best CNOT performance, with fidelities comparable to ours in the lossless case. However, there does not seem to be a fidelity drop for U/Jmax > 1, as our results suggest; rather, the fidelity asymptotically increases to 100% on increasing U/Jmax.

Conclusions

We have proposed a quantum computing model based on the realization of a set of universal qubit gates in nonlinear photonic interferometers, where a dual-rail type of qubit encoding is assumed. We have shown that the interplay of hopping between nearest-neighbor waveguides and single-photon nonlinearities within the same propagating channel allows one to build robust deterministic entangling gates between two such photonic qubits with high fidelity, the quest for which has been one of the major issues in this field. The optimal realization of this operation on-chip is achieved by a suitable concatenation of between 10 and 20 elementary blocks, each containing all the possible combinations of propagation unitaries defined on a 4-port device, without the need for additional ancillary waveguides. On the quantitative side, we have shown that optimal CNOT and M–S quantum gates can be designed with 99.96% theoretical fidelities. It is worth noting, for comparison, that currently available QC devices have state-of-the-art CNOT fidelities on the order of 99.77% with superconducting circuit architectures66, and M–S fidelities of 99.4% with trapped-ion few-qubit devices (latest data from the IonQ Aria QPU specifications, see, e.g., https://ionq.com/quantum-systems/compare (accessed 16 January 2024)). Finally, while the relevance of the results reported in this work is mainly theoretical, the optimal operations achieved have been tested against the main sources of population loss, thermal noise and pure dephasing, also showing good resilience to static parameter fluctuations arising from, e.g., fabrication imperfections in actual devices. In conclusion, we believe these results might foster further research toward the realization of quantum devices to be used as building blocks of a canonical model of quantum computation employing single propagating photons as information carriers.

Methods

Two qubit gates

In this work, we have targeted two-qubit operations defined as controlled-NOT (CNOT) and Mølmer–Sørensen (M–S)39,40, which are paradigmatic entangling quantum gates. In particular, the CNOT is described by the following ideal operation in matrix representation on the two-qubit basis67

$${\rm {CNOT}}=\left[\begin{array}{cccc}1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0\\ \end{array}\right].$$
(4)

Its action is such that the state of the second (target) qubit is flipped when the first (control) qubit is in its logical state \(\left\vert 1\right\rangle\).

The M–S gate40 is an alternative entangling operation consisting of an RXX gate with a fixed angle of π/2. Differently from the CNOT, the M–S gate has a non-trivial imaginary part:

$${\rm {RXX}}(\pi /2)=\frac{1}{\sqrt{2}}\left[\begin{array}{cccc}1&0&0&-i\\ 0&1&-i&0\\ 0&-i&1&0\\ -i&0&0&1\\ \end{array}\right],$$
(5)

where \({\rm {RXX}}(2\theta )=\exp \left(-i\theta {\sigma }_{X}\otimes {\sigma }_{X}\right)\).
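The two target matrices in Eqs. (4) and (5) can be checked numerically. The sketch below, written in Python with NumPy and SciPy, builds the M–S gate from the exponential definition just given and compares it with the explicit matrix of Eq. (5). The basis ordering |00⟩, |01⟩, |10⟩, |11⟩ and all variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import expm

# CNOT of Eq. (4), in the assumed basis ordering |00>, |01>, |10>, |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
XX = np.kron(SIGMA_X, SIGMA_X)


def rxx(angle):
    """RXX(angle) = exp(-i * (angle / 2) * X (x) X), i.e. RXX(2 theta) = exp(-i theta XX)."""
    return expm(-1j * (angle / 2.0) * XX)


MS = rxx(np.pi / 2)                                  # exponential definition
MS_EXPLICIT = (np.eye(4) - 1j * XX) / np.sqrt(2)     # matrix written in Eq. (5)

print("Eq. (5) matches exp(-i pi/4 XX):", np.allclose(MS, MS_EXPLICIT))
print("CNOT unitary:", np.allclose(CNOT.conj().T @ CNOT, np.eye(4)))
print("M-S unitary:", np.allclose(MS.conj().T @ MS, np.eye(4)))

# Action on basis states: CNOT flips the target when the control is |1>,
# while the M-S gate turns |00> into an equal-weight superposition of |00> and |11>.
basis = np.eye(4, dtype=complex)
print("CNOT |10> =", (CNOT @ basis[:, 2]).real)
print("M-S  |00> =", MS @ basis[:, 0])
```

The last line makes the "non-trivial imaginary part" explicit: the |11⟩ component acquires the amplitude −i/√2.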

In the present implementation, these two gates are realized by exploring the time propagation of a pair of single-photon states into a 4-port quantum photonic interferometer. The general Hamiltonian describing such a system is Eq. (1) with n = 4, which accounts for two main phenomena, i.e., two-photon nonlinear phase shifts and hopping of photons between adjacent channels, respectively. Hence, all possible time-evolution unitary operators can be written as a tensor product of a single-channel operator, \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\), accounting for nonlinear propagation, and a two-channel one describing hopping events, \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}\). The matrix form of these two operators is reported in Supplementary Note 2, where the explicit dependence on the parameters of the Hamiltonian model, as well as on the propagation time in each sector, can be appreciated.

In light of these considerations, these two matrices can be used to define the set of elementary 4-channel operations, that is {\({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {free}}}\), \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {paral}}}\), \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {inter}}}\), \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {down}}}\), \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {up}}}\)}, needed for the parametrization of the fundamental block depicted in Fig. 1c. In particular, their explicit expressions read

$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {free}}}={{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}$$
(6)
$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {paral}}}={{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}$$
(7)
$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {inter}}}={{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}$$
(8)
$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {down}}}={{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}$$
(9)
$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {up}}}={{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}\otimes {{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {FP}}}$$
(10)

where ⊗ denotes the tensor product. For our purposes, it is worth noticing that the two \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {HR}}}\) operators in Eq. (7) are defined, in general, with different values of Jij. This degree of freedom is exploited in the optimization procedure. Once the unitary operator \({{{{{{{{\mathcal{U}}}}}}}}}_{{b}}\) describing the single block depicted in Fig. 1c is parametrized, the total time-propagator \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {tot}}}\) for a structure with M blocks is obtained by considering the ordered product of such block operators, that is

$${{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{{\rm {s}}}\})=\mathop{\prod }\limits_{b = 1}^{M}{{{{{{{{\mathcal{U}}}}}}}}}_{{b}}={{{{{{{{\mathcal{U}}}}}}}}}_{M}\,{{{{{{{{\mathcal{U}}}}}}}}}_{M-1}\cdots {{{{{{{{\mathcal{U}}}}}}}}}_{2}\,{{{{{{{{\mathcal{U}}}}}}}}}_{1},$$
(11)

where {θs} denotes the set of physical parameters used for representing the M-block system.
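The ordering convention in Eq. (11), with the first block acting first and therefore appearing rightmost, can be illustrated with a short Python sketch. The block operators here are random 4×4 unitaries used purely as placeholders: the actual block matrices are built from the operators reported in Supplementary Note 2 and are not reproduced here. The helper name `total_propagator` is hypothetical.

```python
from functools import reduce

import numpy as np
from scipy.stats import unitary_group


def total_propagator(blocks):
    """Return U_M @ ... @ U_2 @ U_1 for blocks = [U_1, U_2, ..., U_M]."""
    dim = blocks[0].shape[0]
    # Each new block multiplies from the left, so U_1 ends up rightmost.
    return reduce(lambda acc, u_b: u_b @ acc, blocks, np.eye(dim, dtype=complex))


# Placeholder blocks: Haar-random unitaries, NOT the interferometer blocks of the paper.
blocks = [unitary_group.rvs(4, random_state=seed) for seed in range(3)]
u_tot = total_propagator(blocks)

print("matches U_3 U_2 U_1:", np.allclose(u_tot, blocks[2] @ blocks[1] @ blocks[0]))
print("differs from U_1 U_2 U_3:",
      not np.allclose(u_tot, blocks[0] @ blocks[1] @ blocks[2]))
print("still unitary:", np.allclose(u_tot.conj().T @ u_tot, np.eye(4)))
```

Because the blocks do not commute in general, reversing the order yields a different operator, which is why the product must be taken in the order of propagation.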

Cost function, minimization scheme, and gate fidelity

In this section, we briefly describe the main ingredients used in the optimization scheme, namely the cost function, the numerical optimizer, and the average gate fidelity used to assess the gate performances described in the previous sections.

The cost function considered in the present work is defined as

$$C(\{{\theta }_{{\rm {s}}}\})=\parallel {\tilde{{{{{{{{\mathcal{U}}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{{\rm {s}}}\})-{{{{{{{\mathcal{T}}}}}}}}{\parallel }_{{\rm {F}}}^{2}\,$$
(12)

in which the subscript F denotes the Frobenius norm, while \({\tilde{{{{{{{{\mathcal{U}}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{{\rm {s}}}\})={{\rm {e}}}^{-i\Delta \phi }{{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{\rm {{s}}}\})\) and \({{{{{{{\mathcal{T}}}}}}}}\) denote the total unitary operator describing the 4-channel system (up to a global phase) and the targeted ideal operation (either the CNOT or the M–S gate in Eqs. (4) and (5), respectively), both restricted to the computational basis space. The parameter Δϕ is real and is used to compensate for a possible global phase difference between \({{{{{{{{\mathcal{U}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{{\rm {s}}}\})\) and the target operation \({{{{{{{\mathcal{T}}}}}}}}\). In practice, such a global phase can always be tuned by means of single-qubit phase-shift operations performed on each channel at the end of the circuit.
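A minimal Python sketch of the cost function in Eq. (12) is given below. The function names are ours. The closed-form choice of the global phase, Δϕ = arg Tr(T†U), is a standard result for minimizing a Frobenius distance over a phase factor and is added here for illustration only; the text simply treats Δϕ as a real parameter.

```python
import numpy as np


def cost(u_tot, target, delta_phi=0.0):
    """Squared Frobenius distance between exp(-i*delta_phi) * u_tot and target."""
    diff = np.exp(-1j * delta_phi) * u_tot - target
    return float(np.linalg.norm(diff, "fro") ** 2)


def best_global_phase(u_tot, target):
    """Phase minimizing the cost for fixed matrices: arg Tr(target^dagger u_tot)."""
    return float(np.angle(np.trace(target.conj().T @ u_tot)))


CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# A CNOT that differs from the target only by a global phase of 0.7 rad.
u_example = np.exp(1j * 0.7) * CNOT
phi = best_global_phase(u_example, CNOT)

print("cost without phase compensation:", cost(u_example, CNOT))
print("compensating phase:", phi)
print("cost with phase compensation:", cost(u_example, CNOT, phi))
```

Without compensation the cost is far from zero even though the two operations are physically equivalent, which is the reason for including Δϕ in the definition.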

The optimization procedure aims at finding the best approximation of the target operator \({{{{{{{\mathcal{T}}}}}}}}\) by iteratively looking for the set of parameters, {θs}, that minimizes the cost function C({θs}). Specifically, this task is performed by means of numerical routines. In particular, we make use of the SciPy68 implementation of the L-BFGS-B69,70 optimizer (whose execution has been accelerated with JAX71,72), which is a limited-memory algorithm for solving large nonlinear optimization problems subject to simple bounds on the variables.
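To show how a bounded L-BFGS-B minimization of this kind of cost can be set up with SciPy, the following sketch optimizes a toy two-parameter ansatz toward the M–S gate. The ansatz `toy_unitary`, the bounds, and the starting point are hypothetical and are not the interferometer parametrization of this work. Gradients are left to SciPy's default finite differences rather than being computed with JAX.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
XX, ZZ = np.kron(X, X), np.kron(Z, Z)

TARGET = expm(-1j * (np.pi / 4) * XX)  # M-S gate of Eq. (5)


def toy_unitary(theta):
    """Hypothetical two-parameter ansatz; a stand-in for U_tot({theta_s})."""
    return expm(-1j * (theta[0] * XX + theta[1] * ZZ))


def cost(theta):
    """Squared Frobenius distance to the target, as in Eq. (12) with zero global phase."""
    return float(np.linalg.norm(toy_unitary(theta) - TARGET, "fro") ** 2)


result = minimize(
    cost,
    x0=np.array([0.5, 0.1]),                  # arbitrary starting point
    method="L-BFGS-B",
    bounds=[(0.0, np.pi / 2), (-1.0, 1.0)],   # simple box bounds on each parameter
)

print("optimal parameters:", result.x)
print("final cost:", result.fun)
```

For this toy problem the minimum lies at θ = (π/4, 0), where the ansatz coincides with the target. In a realistic setting the parameter vector would collect the hopping amplitudes and propagation times of all blocks.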

Once at convergence, the routine returns the optimal set of physical parameters \(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\}\) that minimizes C({θs}) for a given value of the nonlinearity and number of blocks. The actual performances of the optimization procedure are subsequently quantified by computing the average gate fidelity, \(\bar{F}(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\})\), also known as ensemble average fidelity67. In practice, this is equivalent to computing the average fidelity between the targeted quantum state and the one obtained after optimization over a certain ensemble. The explicit expression for this figure of merit reads

$$\bar{F}(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\})=\frac{1}{| S| }\mathop{\sum}\limits_{i\in S}| \langle i| {\tilde{{{{{{{{\mathcal{U}}}}}}}}}}_{{\rm {tot}}}^{{{{\dagger}}} }(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\}){{{{{{{\mathcal{T}}}}}}}}| i\rangle {| }^{2}\,$$
(13)

in which we employ the computational basis S as the ensemble over which the average is calculated, and ∣S∣ represents the number of states in the computational basis S, i.e., ∣S∣ = 4 in the two-qubit case under consideration.

Similarly to what was reported above, the matrix \({{{{{{{\mathcal{T}}}}}}}}\) in Eq. (13) denotes one of the two target operators, as defined, e.g., in Eqs. (4) and (5). Consequently, the particular values \(\{{\theta }_{\rm {{s}}}^{{\rm {opt}}}\}\) depend explicitly on the chosen \({{{{{{{\mathcal{T}}}}}}}}\). An explicit derivation of Eq. (13) is reported in Supplementary Note 3.
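Equation (13) amounts to averaging the squared moduli of the diagonal elements of the product of the adjoint of the optimized operator with the target. A minimal Python sketch follows, with a function name of our choosing and both matrices assumed to be already restricted to the 4-dimensional computational basis.

```python
import numpy as np


def average_gate_fidelity(u_tot, target):
    """Eq. (13): mean over basis states |i> of |<i| u_tot^dagger target |i>|^2."""
    overlaps = np.diag(u_tot.conj().T @ target)
    return float(np.mean(np.abs(overlaps) ** 2))


CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# A perfect gate, a perfect gate with a global phase, and the identity as a poor guess.
print("ideal:", average_gate_fidelity(CNOT, CNOT))
print("ideal up to a global phase:", average_gate_fidelity(np.exp(1j * 0.3) * CNOT, CNOT))
print("identity vs CNOT:", average_gate_fidelity(np.eye(4, dtype=complex), CNOT))
```

The identity reproduces the CNOT only on the two basis states with the control in |0⟩, so it reaches a fidelity of 0.5 with this definition, while a global phase leaves the fidelity unchanged.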

For each numerical result reported in the manuscript (corresponding to a given value of nonlinearity U and to a given number of blocks), we have sampled different initial sets of hopping parameters {Jij}, stored as one-dimensional vectors whose length depends on the number of blocks. When optimizing a circuit with few blocks (i.e., ≤15), we have considered 2000 different initializations, while for a larger number of blocks (i.e., ≥16), we have considered only 200 initial random configurations. A separate optimization run is executed for each initialization. Among the resulting optimization outputs, we have selected the best final configuration in terms of the achieved accuracy, i.e., the one leading to the minimal value of the cost function. We refer to Supplementary Note 4 for additional plots concerning the ensemble mean of the average gate fidelity calculated over different initializations for an increasing number of blocks. The initial set of hopping parameters is sampled from a Gaussian distribution centered at \(0.5{J}_{\max }\) with standard deviation \(0.1{J}_{\max }\).
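The multi-start strategy described above can be sketched in Python as follows. The cost function `toy_cost` is a hypothetical stand-in with several local minima, not the circuit cost of Eq. (12). Clipping the samples and bounding the parameters to the interval between 0 and the maximum hopping is our assumption, as is the small number of starts used to keep the example fast.

```python
import numpy as np
from scipy.optimize import minimize

J_MAX = 1.0  # hopping amplitudes expressed in units of J_max


def sample_initial_hoppings(n_params, rng):
    """Gaussian initialization centered at 0.5 * J_max with std 0.1 * J_max."""
    return rng.normal(loc=0.5 * J_MAX, scale=0.1 * J_MAX, size=n_params)


def toy_cost(j):
    """Hypothetical cost with two local minima per parameter (near 0.3 and 0.8)."""
    return float(np.sum((j - 0.3) ** 2 * (j - 0.8) ** 2) + 0.05 * np.sum(j))


def multi_start(n_starts, n_params, seed=0):
    """Run one bounded L-BFGS-B optimization per initialization; keep the best."""
    rng = np.random.default_rng(seed)
    bounds = [(0.0, J_MAX)] * n_params  # assumed box constraints
    runs = []
    for _ in range(n_starts):
        x0 = np.clip(sample_initial_hoppings(n_params, rng), 0.0, J_MAX)
        runs.append(minimize(toy_cost, x0, method="L-BFGS-B", bounds=bounds))
    best = min(runs, key=lambda r: r.fun)  # minimal final cost wins
    return best, runs


best, runs = multi_start(n_starts=20, n_params=6)
print("best cost:", best.fun)
print("spread of final costs:", min(r.fun for r in runs), max(r.fun for r in runs))
```

Different initializations generally end in different local minima, so the spread of final costs gives a rough idea of how rugged the landscape is and of how many starts are worth running.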

As a final comment, we notice that it might sound appealing to try using the average gate fidelity as a cost function. However, since this quantity is only sensitive to the squared modulus of the overlap amplitudes, it cannot be actually used to optimize \({\tilde{{{{{{{{\mathcal{U}}}}}}}}}}_{{\rm {tot}}}(\{{\theta }_{{\rm {s}}}\})\). Indeed, if on the one hand, it is easy to show that

$$C(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\})=0\Rightarrow \bar{F}(\{{\theta }_{{\rm {s}}}^{{\rm {opt}}}\})=1,$$
(14)

on the other hand, the converse statement does not hold true, in general. In other words, there exist sets of parameter values that maximize the fidelity without simultaneously minimizing C({θs}).
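A concrete counterexample can be checked numerically. In the Python sketch below, the candidate operator is the CNOT followed by a controlled-Z, i.e., the target multiplied by a diagonal matrix of phases. The fidelity of Eq. (13) equals one, yet the cost of Eq. (12) stays finite even after the best possible global phase compensation. The function names and the specific example are ours.

```python
import numpy as np


def average_gate_fidelity(u, target):
    """Eq. (13), with the computational basis as the ensemble."""
    return float(np.mean(np.abs(np.diag(u.conj().T @ target)) ** 2))


def phase_optimized_cost(u, target):
    """Minimum over delta_phi of || exp(-i delta_phi) u - target ||_F^2.

    Expanding the norm gives ||u||^2 + ||target||^2 - 2 |Tr(target^dagger u)|.
    """
    overlap = np.trace(target.conj().T @ u)
    return float(np.linalg.norm(u, "fro") ** 2
                 + np.linalg.norm(target, "fro") ** 2
                 - 2.0 * np.abs(overlap))


CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
CZ = np.diag([1, 1, 1, -1]).astype(complex)

candidate = CNOT @ CZ  # a different gate: relative phase on |11> before the CNOT

print("fidelity:", average_gate_fidelity(candidate, CNOT))
print("phase-optimized cost:", phase_optimized_cost(candidate, CNOT))
```

The fidelity is blind to the relative phase carried by the |11⟩ input, whereas the Frobenius distance is not, which is why the latter is the safer quantity to minimize.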