## Introduction

Sustained progress in the engineering of platforms for quantum information processing has recently achieved a scale that surpasses the capabilities of classical computers to solve specialised and abstract problems1,2,3,4. But while achieving a computational advantage for practical or industrially relevant problems may be possible with further scaling of special purpose NISQ (noisy intermediate scale quantum) devices1, more general purpose quantum computers will require a hardware platform that integrates millions of components, individually operating above some fidelity threshold5,6. Silicon quantum photonics7, which is compatible with complementary metal-oxide-semiconductor (CMOS) fabrication, provides a potential platform for very large-scale quantum information processing8,9,10.

All-photonic quantum computing architectures rely on arrays of many photon sources to achieve combinatorial speed-ups in quantum sampling algorithms11,12, or to approximate an on-demand source of single photons13,14,15, and supply entangling circuitry for general purpose quantum computing8,16. In the former case, the level of indistinguishability among photons upper bounds the computational complexity of sampling algorithms17; in the latter case, photon impurity and distinguishability lead to logical errors16,18. Furthermore, and in general, lossy or inefficient heralding of photons vitiates the scaling of photonic quantum information processing. The lack of a demonstration that simultaneously overcomes all of these challenges has been a bottleneck to scalability for quantum computing in integrated photonics.

Progress in solid-state sources of single photons makes quantum dots an attractive approach for certain NISQ experiments19,20. However, the low-loss integration of solid-state sources into photonic circuitry in a way that maintains indistinguishability over many photons is an ongoing challenge21,22. Integrated sources of photons based on spontaneous processes, such as spontaneous four-wave mixing (SFWM) in single-mode waveguides or micro-ring cavities7,23, are appealing for their manufacturability. However, spontaneous sources incur limitations to purity and to heralding efficiency23, with micro-ring cavities additionally requiring resonance tuning to avoid distinguishability among different cavities24,25.

Here, we demonstrate the engineering of a CMOS-compatible source of heralded single photons using silicon photonics, which simultaneously meets the requirements for scalable quantum photonics: high purity, high heralding efficiency, and high indistinguishability.

The source is based on inter-modal SFWM, where phase-matching is engineered by propagating the pump in different transverse modes of a spiralled multi-mode (MM) waveguide26.

## Results

### Discrete-band inter-modal SFWM phase-matching and delayed-pump scheme

Integrated photon sources in silicon are based on SFWM, where, if phase-matching (momentum conservation) and energy conservation conditions are satisfied, light from a pump laser can be converted into pairs of single photons23,27. In standard SFWM in single-mode waveguides, near-zero dispersion produces broad phase-matching bands around the pump wavelength, such that the process, dominated by energy conservation conditions, induces undesired strong spectral anticorrelations between the emitted photons. In contrast, in this work we suppress such correlations by adopting an inter-modal approach to SFWM. As shown in Fig. 1a, an input pulsed laser coherently pumps the two lowest order transverse magnetic (TM) modes of an MM waveguide, namely TM0 and TM1 (see Fig. 1a inset), and generates pairs of idler and signal photons in these modes via inter-modal phase-matching. The dispersion relations between the TM0 and TM1 modes are such that a discrete narrow phase-matching band appears26. By tailoring the waveguide cross-section, the modal dispersion can be accurately engineered to design the phase-matching band with a bandwidth similar to the pump bandwidth (related to energy conservation). This suppresses the frequency anticorrelations imposed by energy conservation, and enhances the spectral purity of the emitted photons26. In particular, we exploit the different modal group velocities in silicon waveguides to achieve a condition where the idler and signal photons are generated on TM0 at 1516 nm and on TM1 at 1588 nm, respectively, with a bandwidth of approximately 4 nm (see Supplementary Note 1).
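The energy-conservation condition mentioned above can be checked against the quoted wavelengths with a short calculation. The sketch below assumes that both annihilated pump photons share a single central wavelength, so that 2/λ_p = 1/λ_s + 1/λ_i; the function names are illustrative and the calculation ignores the finite pump and photon bandwidths.

```python
def implied_pump_wavelength_nm(signal_nm: float, idler_nm: float) -> float:
    """Pump wavelength satisfying 2/lambda_p = 1/lambda_s + 1/lambda_i."""
    return 2.0 / (1.0 / signal_nm + 1.0 / idler_nm)


def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength fixed by energy conservation for a given pump and signal."""
    inverse_idler = 2.0 / pump_nm - 1.0 / signal_nm
    if inverse_idler <= 0.0:
        raise ValueError("signal photon would carry more energy than two pump photons")
    return 1.0 / inverse_idler


# Central wavelengths quoted in the text: signal at 1588 nm, idler at 1516 nm.
implied_pump = implied_pump_wavelength_nm(1588.0, 1516.0)
# Idler expected for a pump centred exactly at 1550 nm and a signal at 1588 nm.
idler_from_pump = idler_wavelength_nm(1550.0, 1588.0)

print(f"implied pump wavelength: {implied_pump:.2f} nm")
print(f"idler for a 1550 nm pump: {idler_from_pump:.2f} nm")
```

Under this assumption the quoted signal and idler wavelengths imply a pump near 1551 nm, which lies within the pump bandwidth stated in the Methods.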

Moreover, to obtain a near-unit spectral purity, we further suppress residual correlations in the joint spectrum by inserting a delay τ on the TM0 component of the pump (with higher group velocity than TM1) before injecting it in the source. The delay gradually increases and decreases the temporal overlap between the pump in the TM0 and TM1 modes along the multi-modal waveguide source (colour-coded in Fig. 1a). This results in an adiabatic switching of non-linear interactions in the source, which suppresses spurious spectral correlations28,29. Simulations (see Supplementary Note 1) predict a spectral purity of 99.4% in this configuration, in contrast to the case where no delay is applied, which predicts a purity of 84.0%, as shown in Fig. 1b.
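Spectral purity values of this kind are commonly obtained from a Schmidt decomposition of the joint spectral amplitude (JSA). The following is a minimal sketch of that calculation on a toy Gaussian JSA, not the simulation of Supplementary Note 1: the Gaussian model, the grid, and the widths `sigma_sum` and `sigma_diff` are assumptions chosen only to contrast a separable spectrum with an anticorrelated one.

```python
import numpy as np


def schmidt_purity(jsa: np.ndarray) -> float:
    """Purity of the heralded photon, sum_k p_k^2, from the singular values of the JSA."""
    singular_values = np.linalg.svd(jsa, compute_uv=False)
    schmidt_weights = singular_values**2 / np.sum(singular_values**2)
    return float(np.sum(schmidt_weights**2))


def toy_gaussian_jsa(sigma_sum: float, sigma_diff: float, n: int = 101) -> np.ndarray:
    """Toy JSA: a Gaussian in (ws + wi), standing in for energy conservation,
    times a Gaussian in (ws - wi), standing in for phase-matching.
    Frequencies are detunings in arbitrary units (assumed model)."""
    w = np.linspace(-4.0, 4.0, n)
    ws, wi = np.meshgrid(w, w, indexing="ij")
    return np.exp(-((ws + wi) ** 2) / (4 * sigma_sum**2)
                  - ((ws - wi) ** 2) / (4 * sigma_diff**2))


# Matched widths make the Gaussian factorise into f(ws) * g(wi): a single Schmidt mode.
purity_matched = schmidt_purity(toy_gaussian_jsa(sigma_sum=1.0, sigma_diff=1.0))
# A narrow energy-conservation band leaves strong frequency anticorrelations.
purity_anticorrelated = schmidt_purity(toy_gaussian_jsa(sigma_sum=0.3, sigma_diff=1.0))

print(f"matched bandwidths:  purity = {purity_matched:.4f}")
print(f"anticorrelated case: purity = {purity_anticorrelated:.4f}")
```

In this toy model the purity approaches unity only when the two bandwidths match, which mirrors the bandwidth-matching argument made for the inter-modal phase-matching band.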

### Source design

Figure 1c shows the compact footprint for the MM waveguide source obtained by adopting a spiral geometry. The delayed-pump excitation scheme is implemented in three stages, as shown in Fig. 1a. The pump, initially in TM0, is split by a balanced beam-splitter; one arm receives a delay of τ with respect to the other, then the two arms are recombined using a TM0 to TM1 mode converter, and injected in the MM waveguide. Once generated, the signal photon is separated from the idler via a second TM1 to TM0 mode converter. After processing, signal and idler photons are out-coupled to fibres, where the pump is filtered out via broad-band fibre Bragg gratings, and single photons are detected with superconducting-nanowire single photon detectors (SNSPDs).

### Source brightness characterisation

We experimentally characterised the squeezing value ξ of the generated two-mode squeezed state emitted from individual sources with second-order correlation measurements30 (see Supplementary Note 5). Fig. 1e compares results for both the delayed and the non-delayed cases. Experimental results confirm our simulations and additionally demonstrate higher brightness as a benefit of the temporal delay scheme (see Supplementary Note 1). Squeezing values up to $${\left|\tanh (\xi )\right|}^{2}\simeq 0.2$$ are observed using a small input (off-chip) average pump power of 3 mW, corresponding to  >8 MHz photon-pair generation rates on-chip. To reduce noise from multi-photon events, measurements reported from this point on are performed with an input pump power of approximately 0.5 mW: coincidence rates are measured at 15 kHz, with heralded single photon $${g}_{{\rm{h}}}^{(2)}$$ measured at 0.053(1) (Fig. 1e inset).
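As a rough consistency check on these figures, the squeezing value can be related to per-pulse pair statistics. The sketch below assumes an ideal two-mode squeezed vacuum, for which the pair-number distribution is geometric, p(n) = (1 − t) tⁿ with t = |tanh ξ|², and it uses the 50 MHz repetition rate quoted in the Methods. It neglects loss and is not the analysis of Supplementary Note 5.

```python
def pair_statistics(tanh_sq: float, repetition_rate_hz: float) -> dict:
    """Per-pulse statistics of an ideal two-mode squeezed vacuum with t = |tanh(xi)|^2."""
    if not 0.0 <= tanh_sq < 1.0:
        raise ValueError("|tanh(xi)|^2 must lie in [0, 1)")
    prob_at_least_one_pair = tanh_sq           # 1 - p(0) for p(n) = (1 - t) * t**n
    mean_pairs_per_pulse = tanh_sq / (1.0 - tanh_sq)  # equals sinh(xi)**2
    return {
        "prob_at_least_one_pair": prob_at_least_one_pair,
        "mean_pairs_per_pulse": mean_pairs_per_pulse,
        "pulses_with_pairs_hz": prob_at_least_one_pair * repetition_rate_hz,
        "mean_pair_rate_hz": mean_pairs_per_pulse * repetition_rate_hz,
    }


stats = pair_statistics(tanh_sq=0.2, repetition_rate_hz=50e6)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

With t ≃ 0.2 and a 50 MHz pulse train, this idealised model gives roughly 10 MHz of pulses containing at least one pair, in line with the >8 MHz on-chip rate quoted above.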

### Spectral purity characterisation

Source purity is first estimated from a direct measurement of the joint spectral intensity (JSI)31. The JSI reconstruction is implemented using narrow-bandwidth tunable filters to scan the emitted wavelengths of the signal and idler photons, as pictured in Fig. 1f. Data from a source with no temporal delay yields a JSI with a spectral purity of 93.1(2)%, which increases to 99.04(6)% in the scheme with a delay, as shown in Fig. 1g. The contrasting measurements show a clear suppression of spurious correlations with the delay scheme. A second estimation of the emitted single photon purity for the delayed structure is performed via unheralded second-order correlation measurements30, $${g}_{{\rm{u}}}^{(2)}$$. These are implemented by dividing the output signal mode with an off-chip balanced fibre beam-splitter and measuring coincidences between the two output arms (see Fig. 1h). Measured unheralded second-order correlation values are reported in Fig. 1i. We obtain $${g}_{{\rm{u}}}^{(2)}(0)=1.97(3)$$, which corresponds to a single photon purity of 97(3)%, consistent with the value obtained from the JSI.
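The conversion between the unheralded correlation value and the purity follows the standard relation g_u^(2)(0) = 1 + P for one arm of a multi-mode squeezed state. The sketch below illustrates that relation and a common pulsed-source normalisation of the raw counts; the count values in the example are made up for illustration and are not data from the experiment.

```python
def unheralded_g2(singles_a: float, singles_b: float,
                  coincidences: float, n_pulses: float) -> float:
    """g2(0) for a pulsed source: coincidences normalised by the accidental rate."""
    return coincidences * n_pulses / (singles_a * singles_b)


def purity_from_unheralded_g2(g2: float) -> float:
    """Spectral purity P from g_u^(2)(0) = 1 + P (P = 1 means a single Schmidt mode)."""
    return g2 - 1.0


# Value reported in the text.
purity = purity_from_unheralded_g2(1.97)
schmidt_number = 1.0 / purity

# Hypothetical counts, chosen only to exercise the normalisation.
example_g2 = unheralded_g2(singles_a=1000, singles_b=2000,
                           coincidences=4, n_pulses=1_000_000)

print(f"purity = {purity:.2f}, Schmidt number = {schmidt_number:.3f}")
print(f"example g2 from made-up counts = {example_g2:.2f}")
```

Because the relation is linear, the uncertainty of 0.03 on the correlation value carries over directly to the quoted 97(3)% purity.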

### Heralding efficiency characterisation

The capability of the sources to generate pure photons with no requirement for filtering enables the simultaneous achievement of high heralding efficiency and high purity. In our experiment, off-chip filters are used solely for pump rejection: their bandwidth (12 nm, flat transmission) contains >99% of the emitted spectra, which results in ultra-high filtering heralding efficiency32. While the effect of filtering is thus negligible, the intrinsic heralding efficiency of the source is affected by linear and non-linear transmission losses inside the waveguide. These losses are, however, greatly mitigated in MM waveguides (which present <0.5 dB/cm linear loss, see Supplementary Note 2). Taking into account the characterised losses, we estimate a heralding efficiency of approximately 95% for an individual source. The measured heralding efficiency at the off-chip detectors is 12.6(2)%, corresponding to 91(9)% on-chip intrinsic heralding efficiency after correcting for the characterised losses in the channel to the detectors (see Supplementary Note 2). These channel losses can be strongly suppressed by implementing low-loss off-chip couplers33 or integrated detectors34.

### Source indistinguishability characterisation

To experimentally test the source indistinguishability we integrate a reconfigurable photonic circuit to perform quantum interference between different sources. Schematics of the circuit are shown in Fig. 2a–b. Two sources are coherently pumped by splitting the input laser with an on-chip tunable Mach-Zehnder interferometer (MZI); the resulting idler and signal modes from the different sources are grouped and interfered on-chip using additional integrated phase-shifters and MZIs (see Methods). Using this circuit, we experimentally estimate the indistinguishability among the sources using three different types of measurements. First, we reconstruct the JSI of each source by operating the two sources individually. The overlap of the JSIs reconstructed from each source (Fig. 2c) estimates a mutual indistinguishability of 98.5(1)%.

A second measurement of the indistinguishability was performed via reversed Hong-Ou-Mandel (HOM) interference between the two sources9,35,36. Both sources were pumped and the respective idler and signal modes were interfered by tuning the output MZIs to act as 50:50 beam-splitters. The 98.7(2)% visibility of the reversed HOM fringe, shown in Fig. 2d and obtained by scanning the phases ϕ1 = ϕ2 = ϕ, corresponds directly to the source indistinguishability (see Supplementary Note 6).
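A fringe visibility of this kind is conventionally defined as (max − min)/(max + min) of the coincidence counts over the phase scan. The sketch below applies that definition to a synthetic fringe. The sinusoidal model with period π in ϕ and the peak count of 1000 are assumptions made for illustration, not the fitted model of Supplementary Note 6.

```python
import numpy as np


def fringe_visibility(counts: np.ndarray) -> float:
    """Visibility V = (max - min) / (max + min) of a sampled interference fringe."""
    c_max = float(np.max(counts))
    c_min = float(np.min(counts))
    return (c_max - c_min) / (c_max + c_min)


# Synthetic, noise-free fringe with a built-in visibility of 0.987 (assumed model).
phi = np.linspace(0.0, 2.0 * np.pi, 201)
assumed_visibility = 0.987
counts = 1000.0 * (1.0 + assumed_visibility * np.cos(2.0 * phi)) / 2.0

estimated = fringe_visibility(counts)
print(f"estimated visibility = {estimated:.4f}")
```

On real data, a fit to the fringe is usually preferable to taking raw extrema, since the maximum and minimum of noisy counts bias the estimate.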

A further estimate of indistinguishability is obtained by testing the entanglement generated when coherently pumping the two sources9,36. Using quantum state tomography, we experimentally reconstruct the density matrix shown in Fig. 2e, which has a fidelity of 98.9(3)% with the ideal maximally-entangled state $$\left|{\Phi }_{+}\right\rangle =(\left|00\right\rangle +\left|11\right\rangle )/\sqrt{2}$$, and provides an indistinguishability value of 98.2(6)% (see Supplementary Note 6 for details).
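The fidelity of a reconstructed density matrix ρ with a pure target state |ψ⟩ is F = ⟨ψ|ρ|ψ⟩. The sketch below evaluates this for the |Φ₊⟩ state using a toy density matrix, a mixture of |Φ₊⟩ with white noise. The mixing parameter is an assumption chosen for illustration and the matrix is not the tomographic reconstruction of Fig. 2e.

```python
import numpy as np

# |Phi+> = (|00> + |11>) / sqrt(2) in the basis {|00>, |01>, |10>, |11>}.
PHI_PLUS = np.array([1.0, 0.0, 0.0, 1.0], dtype=complex) / np.sqrt(2.0)


def fidelity_with_pure_state(rho: np.ndarray, psi: np.ndarray) -> float:
    """F = <psi| rho |psi> for a density matrix rho and a normalised pure state psi."""
    return float(np.real(np.vdot(psi, rho @ psi)))


def noisy_bell_state(p: float) -> np.ndarray:
    """Toy state: p |Phi+><Phi+| + (1 - p) I/4 (illustrative noise model)."""
    projector = np.outer(PHI_PLUS, PHI_PLUS.conj())
    return p * projector + (1.0 - p) * np.eye(4) / 4.0


rho_toy = noisy_bell_state(p=0.985)
fidelity = fidelity_with_pure_state(rho_toy, PHI_PLUS)
print(f"trace = {np.real(np.trace(rho_toy)):.3f}, fidelity = {fidelity:.5f}")
```

For this noise model the fidelity is p + (1 − p)/4, so a fidelity near 98.9% corresponds to a mixing parameter p of about 0.985.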

### Heralded Hong-Ou-Mandel experiments

A key figure of merit for multi-photon experiments, particularly in the context of many photon quantum information processing, is the heralded Hong-Ou-Mandel visibility, which quantifies the interference of photons heralded from different sources. This quantity, which simultaneously incorporates source indistinguishability, purity and absence of multi-photon noise, determines the stochastic noise in photonic quantum computing architectures8,18, and the computational complexity achievable in photonic sampling algorithms17. We implemented heralded HOM experiments by operating our two-source device in the four-photon regime, as shown in Fig. 3a. The circuit is configured such that idler photons from both sources are directly out-coupled to detectors to herald the signal photons, which are interfered in the MZI (see inset of Fig. 3b). The heralded HOM fringe is measured by scanning the phase θ1 inside the MZI and collecting 4-photon events37,38. The measured on-chip heralded HOM fringe is shown in Fig. 3b. The raw-data visibility (no multi-photon noise correction) is 96(2)%.

## Discussion

Our results have a significant impact on the prospects of quantum information processing in integrated photonics. Photon sources from previous state-of-the-art integrated photonic devices demonstrated an on-chip heralded quantum interference raw visibility of 82%38, which upper-bounds any potential quantum sampling experiment to a computational complexity equivalent to 31-photon interference (considering error bounds of 10%17). Our results lift this bound to a computational complexity equivalent to >150 photon interference (see Supplementary Note 4), deep in the regime of quantum computational supremacy39.

Furthermore, in the context of digital quantum computing, our results make a significant leap toward the 99.9% heralded HOM visibility required to construct lattices of physical qubits with error rates below 1% using current fault-tolerance photonic architectures16,18. Our analysis (see Supplementary Note 3) suggests that heralded HOM visibilities of 99.9% could be achievable with minor modifications to our design; for example by using an improved quality pump laser and by using semiconductor fabrication processes with approximately 4 nm uniformity40,41. Our results represent the near removal of a critical set of physical errors that had limited the scaling of photonic quantum information processing.

## Methods

### Device fabrication

The silicon devices used were fabricated using CMOS-compatible UV-lithography processes in a commercial multi-project wafer run by the Advanced Micro Foundry (AMF) in Singapore. Waveguides are etched in a 220 nm silicon layer atop a 2 μm buried oxide, with an oxide top cladding of 3 μm. The thermo-optic phase-shifters to reconfigure the integrated circuits are formed by TiN heaters positioned 2 μm above the waveguide layer.

### Inter-modal four-wave mixing in silicon waveguides

Inter-modal spontaneous four-wave mixing is performed by propagating the pump, signal and idler waves on different waveguide modes. The spectral properties of the perfect phase matching depend on the group velocity dispersion of the different modes employed in the process, which can be tuned by engineering the waveguide geometry. In our experiment, we operate the pump on the TM0 and TM1 modes, and the signal and idler on the TM1 and TM0, respectively. With these modes, phase matching of the SFWM process is enabled in the 1500–1600 nm spectral window using standard silicon-on-insulator waveguides with a geometry of 2 μm × 0.22 μm, which is used in our source design.

### Pump-delayed generation

When a delay is applied between non-degenerate pump pulses, the effective interaction length depends on the value of such delay. This is due to the walk-off, which limits the nonlinear length. The best scenario is when the overtaking process of the faster pulse over the slower one occurs completely within the waveguide, thus maximising the interaction length and the generation efficiency. In this case the delay is such that the maximum spatial overlap between the pump pulses occurs in the middle of the waveguide. With different delays, the nonlinear medium is not optimally exploited, resulting in reduced generation efficiency. The delay used to optimise the generation efficiency corresponds to the delay for maximum spectral purity (details in Supplementary Note 1).

### Source design

The 2 μm × 0.22 μm multi-mode waveguide in the source is designed with a length of 11 mm and an initial temporal delay of τ = 1.46 ps between the TM0 and TM1 modes. A spiral geometry for the waveguide is used to increase the compactness. Modal cross-talk in the spiral is kept below −25 dB extinction by adopting 90° Euler bends of radius 45 μm (see Supplementary Note 1 for more details). The footprint of an individual silicon-on-insulator source with our design is approximately 200 μm × 900 μm. The TM0–TM1 mode converters used to inject the pump in the MM waveguide and separate the signal and idler photons at the output have <−30 dB characterised modal cross-talk, and >95% conversion efficiency.
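The design length and delay can be related to the group-index mismatch between the two pump modes with a back-of-the-envelope estimate. The sketch below assumes, following the pump-delayed generation discussion, that the chosen delay places the maximum pulse overlap exactly at the waveguide midpoint, so that the total walk-off accumulated over the full length is 2τ. The resulting mismatch is an inferred, illustrative number rather than a value taken from the device simulations.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def walkoff_s(length_m: float, delta_group_index: float) -> float:
    """Temporal walk-off accumulated between two modes over a given length."""
    return length_m * delta_group_index / SPEED_OF_LIGHT_M_PER_S


def implied_group_index_mismatch(delay_s: float, length_m: float) -> float:
    """Group-index difference for which a delay tau is cancelled at the midpoint,
    i.e. the walk-off over half the waveguide equals tau (assumption)."""
    return 2.0 * delay_s * SPEED_OF_LIGHT_M_PER_S / length_m


# Design values quoted in the text: 11 mm length, 1.46 ps initial delay.
delta_ng = implied_group_index_mismatch(delay_s=1.46e-12, length_m=11e-3)
total_walkoff_ps = walkoff_s(11e-3, delta_ng) * 1e12

print(f"implied group-index mismatch = {delta_ng:.4f}")
print(f"total walk-off over 11 mm   = {total_walkoff_ps:.2f} ps")
```

Under this assumption the two pump modes differ in group index by roughly 0.08, so the faster pulse fully overtakes the slower one within the 11 mm spiral.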

### Integrated circuit

The integrated circuit pictured in Fig. 2a (see also Supplementary Fig. 6 for a more detailed schematic) used for the multi-source interference experiments consists of three reconfigurable MZIs (internal phases φ, θ1 and θ2), two phase-shifters (ϕ1, ϕ2), a broad-band waveguide crosser, and two sources. The circuit used is a two-mode version of the circuits implemented, for example, in ref. 9. At the input, the MZI φ is configured to split the pump between the two sources: using φ = 0 (φ = π) we operate the sources individually by pumping only source 1 (source 2), while φ = π/2 implements a balanced pump splitting to coherently operate both sources simultaneously. When uniformly pumping both sources (φ = π/2), each source receives half of the input pump power, and thus the photon-pair generation probability in each source is decreased by a factor of four compared to the single source regime. After photons are generated in the sources, the waveguide crosser allows us to group together the signal modes and the idler modes from the two sources. Arbitrary and reconfigurable two-mode unitary operations are then performed on the signal (idler) modes via the phase ϕ1 (ϕ2) and the MZI θ1 (θ2). Light is coupled in and out of the circuit by means of TM0 focusing grating couplers, which have been individually optimised to maximise their efficiency at the pump, signal and idler wavelengths (6.6 dB loss per coupler). The coupling was observed to be stable over a few hours, but to gradually decrease over longer periods (approximately between 0.5 and 1 dB/day without active coupling optimisation). Insertion losses for the individual on-chip circuit components in our devices are: 0.19 dB/cm (0.40 dB/cm) for TM0 (TM1) transmission in the MM waveguide, 0.1 dB per mode converter, <0.01 dB loss per directional coupler, and 0.4 dB per waveguide crossing.
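The factor of four follows from the quadratic dependence of the SFWM pair-generation probability on pump power in the low-gain regime. A minimal sketch of the pump-splitting arithmetic is given below; it assumes an ideal lossless MZI whose output powers follow cos²(φ/2) and sin²(φ/2), a phase convention chosen here so that φ = 0 sends all the pump to source 1 as described in the text.

```python
import math


def pump_split(phi: float) -> tuple[float, float]:
    """Fraction of pump power sent to (source 1, source 2) by an ideal MZI.
    Assumed convention: phi = 0 routes everything to source 1."""
    return math.cos(phi / 2.0) ** 2, math.sin(phi / 2.0) ** 2


def relative_pair_probability(power_fraction: float) -> float:
    """Pair-generation probability relative to full pump power (scales as power squared)."""
    return power_fraction**2


for label, phi in [("phi = 0", 0.0), ("phi = pi/2", math.pi / 2), ("phi = pi", math.pi)]:
    p1, p2 = pump_split(phi)
    print(f"{label}: power = ({p1:.2f}, {p2:.2f}), "
          f"pair probability = ({relative_pair_probability(p1):.2f}, "
          f"{relative_pair_probability(p2):.2f})")

balanced_power_1, balanced_power_2 = pump_split(math.pi / 2)
balanced_pair_probability = relative_pair_probability(balanced_power_1)
```

At φ = π/2 each source receives half the power and generates pairs with one quarter of the single-source probability, as stated in the text.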

Total insertion losses in the integrated circuit are approximately 14 dB, mostly due to grating couplers. In both the single-source circuit (Fig. 1b) and the two-source circuit (Fig. 2a) the intrinsic heralding efficiencies of the sources have been measured to be approximately the same, with a near-90% efficiency after correcting for the off-chip channel loss. Off-chip pump rejection filters have an insertion loss for the unfiltered photons of 0.4 dB. When including this external transmission loss for pump rejection, the heralding efficiency of the systems is approximately 83%. See Supplementary Note 2 for more details on the design and characterisation of the individual components.
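The step from the intrinsic heralding efficiency to the value including pump-rejection filters is a simple decibel conversion. The sketch below takes the 91% on-chip intrinsic efficiency quoted in the Results as the starting point, which is an assumption about which figure enters the estimate, and applies the 0.4 dB filter insertion loss.

```python
def db_to_transmission(loss_db: float) -> float:
    """Convert an insertion loss in dB to a power transmission factor."""
    return 10.0 ** (-loss_db / 10.0)


def heralding_with_extra_loss(intrinsic_efficiency: float, loss_db: float) -> float:
    """Heralding efficiency after an additional lossy element in the heralded arm."""
    return intrinsic_efficiency * db_to_transmission(loss_db)


filter_transmission = db_to_transmission(0.4)
system_efficiency = heralding_with_extra_loss(0.91, 0.4)

print(f"filter transmission          = {filter_transmission:.3f}")
print(f"heralding efficiency w/ filter = {system_efficiency:.3f}")
```

A 0.4 dB loss corresponds to about 91% transmission, which brings a roughly 91% intrinsic efficiency down to approximately 83%.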

### Experimental set-up

Pump pulses at 1550 nm (4.5 nm bandwidth, 800 fs pulse length, 50 MHz repetition rate) from an erbium-doped fibre laser (Pritel) are filtered via a square-shaped, 5 nm bandwidth filter (Semrock) to eliminate spurious tails at the signal and idler wavelengths, and then injected into the device. A fibre polarisation controller (Lambda) is used to ensure injection of TM0 polarised light to maximise the coupling. After the chip, pump rejection is performed via broadband (>12 nm bandwidth, much larger than the photon spectra) band-pass filters (Opneti), and photons are finally detected using superconducting nanowire single-photon detectors with approximately 80% average efficiency (Photon Spot). For the JSI reconstruction, we use tunable filters with adjustable bandwidth (EXFO XTA-50). Analogue voltage drivers (Qontrol Systems, 300 μV resolution) are used to drive the on-chip phase shifters and reconfigure the integrated circuit.