Interfering trajectories in experimental quantum-enhanced stochastic simulation

Simulations of stochastic processes play an important role in the quantitative sciences, enabling the characterisation of complex systems. Recent work has established a quantum advantage in stochastic simulation, leading to quantum devices that execute a simulation using less memory than possible by classical means. To realise this advantage it is essential that the memory register remains coherent, and coherently interacts with the processor, allowing the simulator to operate over many time steps. Here we report a multi-time-step experimental simulation of a stochastic process using less memory than the classical limit. A key feature of the photonic quantum information processor is that it creates a quantum superposition of all possible future trajectories that the system can evolve into. This superposition allows us to introduce, and demonstrate, the idea of comparing statistical futures of two classical processes via quantum interference. We demonstrate interference of two 16-dimensional quantum states, representing statistical futures of our process, with a visibility of 0.96 $\pm$ 0.02.

M any of the most interesting phenomena are complexwhether in urban design, meteorology or financial prediction, the systems involved feature a vast array of interacting components. Predicting and simulating such systems often requires the use of a prohibitive amount of data, evincing a pressing need for more efficient tools in algorithmic modelling and simulation.
Quantum technologies have shown the potential to dramatically reduce the amount of working memory required to simulate stochastic processes 1,2 . By tracking information about past observations directly within quantum states, a quantum device can replicate the system's conditional future behaviour, using less memory than the provably optimal classical limits. The key to achieving a quantum memory advantage is maintaining coherence of the quantum memory during the simulation process, enabling the encoding of relevant past information into nonorthogonal quantum states. This memory reduction comprises a new application of quantum processing, complementary to computational speedup 3 , cryptography 4 , sensing 5,6 and phase estimation 7 .
This advantage was first illustrated for simulating a particular stochastic process, where past information was encoded within non-orthogonal polarisation states of a single photon 8 . The scheme, however, maintained quantum coherence over only a single simulation cycle. This limitation meant that the resulting simulator exhibited a memory advantage only when simulating a single time step. To simulate multiple time steps, such a device required relevant information to be transferred to classical memory between time steps, negating any quantum advantage.
Here we develop a quantum simulator that overcomes this limitation, such that it exhibits a memory advantage when simulating multiple time steps. As an important additional benefit, our device enables us to create a quantum superposition over all potential future outcomes of a process. We illustrate that such an output lets us estimate the distinguishability in the statistical futures of two stochastic systems via quantum interference. Our experimental approach makes use of temporal (time-bin) encoding in an optical system to experimentally realise a quantum simulation over three consecutive steps, generating a coherent superposition over the process's potential future trajectories. We then implement two such quantum simulations in parallel, simultaneously generating superpositions over the trajectories for each of two independent systems. Experimentally, this corresponds to using our quantum simulators to produce and control high-dimensional quantum states. These are interfered, allowing estimation of how well the corresponding statistical futures coincide.

Results
Framework and tools. In this work, we study a simple stochastic process known as the perturbed coin 1 . It consists of a binary random variable that represents the state of a coin (0 corresponds to heads and 1 to tails) inside a box. At each time step, the box is perturbed, causing the coin to flip with some probability. Afterwards, the state of the coin is emitted. In general the coin may be biased, so the probability of remaining in heads, l, can differ from the probability of remaining in tails, m, as presented in Fig. 1. Repetition of this procedure generates a string of 0s and 1s, whose statistics define the perturbed coin process.
Any device that seeks to replicate correct future statistics must retain relevant past information in a memory. This involves a prescription for configuring its memory in an appropriate state for each possible observed past, such that systematic actions on this memory recover a sequence of future outputs that are faithful to conditional future statistics. The amount of past information stored in memory is quantified by the Shannon entropy C ¼ À P s d s log d s , where d s is the probability that the memory is in state s and the logarithm is in base 2. The minimal possible memory required, C μ , is known as the statistical complexity, and is an important measure of structure in complexity science [9][10][11][12] . For the perturbed coin ( Fig. 1), the minimal information required about the past is the current state of the coin. This induces a statistical complexity of C μ = −q log q − (1 − q) log (1 − q), where q represents the probability that the last outcome was heads (see Eq. (1) in Methods).
A quantum simulator can further reduce memory requirements by encoding the two possible outcomes of the process into mutually non-orthogonal states. Future statistics are then generated by a series of unitary interactions, ensuring that this entropic advantage is maintained at all times during simulation 13 . For the case of the perturbed coin, the quantum simulator can be implemented as shown in Fig. 2. The state of the machine encodes relevant information about past outcomes-here the state of the coin after the last step. It is represented as one of two states, |S 0 〉 or |S 1 〉, of a quantum system that sequentially interacts Fig. 1 Perturbed coin. The process that we study here is a coin with two outcomes, 0 and 1. The transition probabilities, T ij , between different outcomes are determined by l and m for i, j ∈ {0, 1}. The optimal classical model uses the causal states ({S i }) depicted in the circles. There is a simple mapping from the past of the process to the relevant causal state: the last outcome from the coin determines the input causal state. Arrows, with the associated expressions j|T ij , represent the transitions from causal states S i to S j with probability T ij , emitting the classical outcome j. In the quantum model, the causal states become quantum states, Large entangled state Fig. 2 Conceptual representation of the multiple-step quantum simulation of a perturbed coin. The memory system of the simulator is initialised in a qubit state |S i 〉, where i ∈ {0, 1} depends on the past of the process. The ancillary qubits are all initialised in a fixed known state |0〉-the logical zero state-and thus contain no information about the past of the process. At each time step t k , the simulator interacts with the kth ancilla through the same unitary operator U. The inset shows how we implement the relevant unitary operator. The gates are a controlled-X, a single-qubit rotation R such that R|0〉 = |S 0 〉, and a controlled-V such that VR|1〉 = |S 1 〉. This sequence of interactions results in an entangled internal state that includes all the ancillary qubits and the memory state of the simulator. Measuring the ancillas samples the statistical distribution of the process, and at the same time the internal state of the simulator collapses into the correct memory state required for further simulation steps with ancillary systems. Each interaction corresponds to a time step of the stochastic process. All the ancillary systems start in a fixed state, and therefore do not contain any information. The sequence of interactions produces an entangled state. Measuring the ancillary systems after the desired number of steps provides a sample of the statistics.
Experimental implementation. Motivated by recent realisations of quantum walks in linear optical setups with time-bin encoding [14][15][16][17] , we implement the memory system and multiple ancillas -here corresponding to three time steps-by encoding on a single photon. The ancillas, which can be read to obtain the classical outcomes of the process, are encoded in the arrival time of the photon, and the memory state of the simulator is encoded in its polarisation. Thus, for a simulation of M time steps, a 2 Mdimensional system corresponding to 2 M different photon arrival times replaces M distinct ancillary photons. Instead of measuring the classical outcome at each time step, our quantum information processor keeps the photon and builds up a superposition in a high-dimensional Hilbert space; in our case M = 3, and the output of the simulator is 16-dimensional (8 arrival time modes × 2 polarisation modes). The associated memory cost during this process does not increase since all operations remain unitary-and thus conserve entropy. Of course, distinct ancilla qubits could be used instead, but encoding in multiple degrees of freedom provides a convenient, effective and high-fidelity approach for small-to medium-sized photonic systems.
Our experiment demonstrates that high-dimensional (here 16dimensional) quantum states can be encoded and manipulated in photonic temporal and polarisation modes with high fidelity 18,19 . This complements other related works involving hybrid optical states using spatial (path and OAM) and polarisation modes [20][21][22][23] . It also substantiates the oft-repeated claim that combining Step 3 Tomography Filter Translation a b Fig. 3 Experimental setup. a Single photons are generated from a degenerate spontaneous parametric down-conversion (SPDC) source pumped by a 410 nm continuous-wave laser. After filtering the generated photons with (820 ± 1.5) nm bandpass filters, photons in the lower beam (red) and upper beam (orange) are separately prepared in their respective input states, |S 0 〉 or |S 1 〉, using half-wave plates (HWP). Polarisation qubits, one from each beam, are used as memory states for the simulation of two separate and potentially different processes Π 1 (red) and Π 2 (orange). (For the first experiment described in the text, where only one process is required, the second SPDC output (orange) is sent straight to a heralding detector, rather than through the apparatus.) To implement the three-step simulation, three processor blocks are built-labelled Step 1, Step 2 and Step 3. In each step, path and arrival time modes are also employed to realise the relevant physical operation, as explained further in Methods. The output of one of the simulators (lower beam) is used to perform the polarisation tomography and to measure the arrival times of each photon in order to sample the statistical future. To measure the overlap of the future statistics of two processes, both photons are used, and the other outputs of the third beam splitter (BS) are interfered in a fibre BS (yellow box). An automated translation stage is used to move one of the couplers in order to vary the relative delay between the single-photon wave packets. Avalanche photodiodes (APD) and a single-photon counting module are used to count the photons. SMF stands for single-mode fibre, QWP quarter-wave plate, GT Glan-Taylor prism, PBS polarising beam splitter, and FPC fibre polarisation controller. b The inset shows a close-up of two vertically separated beams passing through two HWPs with holes, each of which only acts on one of the beams  30 and complete optical Bell state analysers 31,32 .
Our first task consists of performing the quantum simulation of the perturbed coin. In particular, we seek to verify that the simulator samples from the correct statistical distributions, and to demonstrate the memory advantage due to quantum encoding. The experimental setup is shown in Fig. 3. We generate degenerate pairs of single photons through spontaneous parametric down-conversion. One of the photons (depicted as the red, lower beam in the figure) is prepared in the state |S 0 〉 or |S 1 〉, depending on the past of the process. It then passes through three sequential blocks, which represent the three time steps being simulated. In each block, the short and long paths correspond to outcomes 0 and 1, respectively (details in Methods). For the simulation, only one of the photons (the red beam) is used, and the other photon (orange beam in the figure) is not used except as a herald, and is measured immediately after generation (for this task, it does not go through the apparatus as shown in the figure). We then estimate the polarisation state of the red-beam photon in the tomographic reconstruction at the end of the third block, and also measure its arrival time (using the orange-beam photon as a reference). In this way, we obtain the probability distribution of the stochastic process as simulated by our quantum information processor, together with the final memory state of our simulator, which is needed for further simulation steps.
Experimental results. The experimentally determined outcome probabilities are shown in Fig. 4, and are close to the expected theoretical values. The main discrepancies with theory are due to small differences between nominally identical polarisation elements, and the non-identical single-mode-fibre coupling efficiency of photons taking different paths through the simulator. In order to evaluate how well they agree, we calculate the (classical) fidelity 33 for each set of parameters and initial conditions that we have simulated in our experiment. All the values obtained for this (classical) fidelity are larger than 0.991. Typical uncertainties are around 0.001.
To compare the use of quantum and classical resources, we use C q , the quantum counterpart to the classical statistical complexity (the entropy of the memory register of the quantum simulator), which quantifies the memory requirement of the quantum simulator. We thus calculate C q for this process (details in Methods). The experimental results are shown in Fig. 5a. The corresponding classical statistical complexity is also shown for the sake of comparison, demonstrating that quantum resources dramatically reduce the amount of memory needed for simulating a multi-step stochastic process.
To guarantee that the quantum memory advantage is maintained at all stages of the simulation process, we require the internal dynamics to be close to (ideally, completely) unitary. We can verify this by demonstrating the coherence of the output state that includes all the ancillary qubits and the memory state of the simulator. We observe this coherence via two-photon quantum interference. We use the complete setup of Fig. 3, where the photon depicted by the orange path is no longer measured after generation (as done previously), but also goes through the apparatus. Both the photons pass independently through the three sequential blocks, with each experiencing nominally the same optical elements (although different settings are possible). If the coherence between the different time bins The sampled future of the same process, when the initial state is |S 1 〉. Uncertainties, due to the Poissonian distribution of photon counts, are so small that they are not visible in the graphs-therefore, they are not depicted. Note that the classical probability distribution is determined by the process parameters l and m, as well as the initial causal state. For example, if the last outcome of the coin is 1, the quantum simulator is initialised in state |S 1 〉. The conditional probability of subsequently observing 111 is then m 3 . For a fixed l = 0.4 and increasing m, the average probability of getting 1 in the simulation thus rises accordingly. This can be seen by the higher columns in the right corners of both graphs ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-08951-2 and polarisations exploited in our simulation is maintained, we expect a complete interference, which means that the visibility ideally should be unity. The result in Fig. 5b shows a visibility of 0.96 ± 0.02 for the case where the theoretical output states of the apparatus are uniform superpositions of all time bins and polarisations (which is the scenario where the highest discrepancy from the ideal visibility would be expected as it is most susceptible to imperfections). The high value obtained here indicates that our simulator is (almost) implementing a unitary operator, and the entropy of our system does not significantly increase throughout the simulation process. This requirement is essential for preserving the quantum memory advantage. Moreover, apart from the specific application of this apparatus to simulate classical stochastic processes, this result is also significant in a more general context, since it demonstrates the interference of two discrete high-dimensional states with an extremely high visibility 23 .
Modifying this experimental setup allows us to compare two different processes, Π 1 and Π 2 . Clearly, one way to perform such a statistical comparison is to consider each process individually, and sample its outcomes to reconstruct the corresponding distribution. These two reconstructed distributions can then be compared. However, we notice that in our quantum simulation, all the information about the future statistics is already encoded in the state that exists in our apparatus. Thus, we do not need to collapse the superposition of possible outcomes by sampling, instead we can exploit this superposition for our task of comparing the future of processes. In particular, by simultaneously running quantum simulations of processes Π 1 and Π 2 in parallel and interfering the resulting output states, we can estimate the overlap of their future statistics.
In our experiment, we realise different processes by applying different operations to the two photons (red beam and orange beam) in the three blocks of the setup in Fig. 3. To implement the parameters of each process separately, we use half-wave plates with holes, which allow us to change the polarisation of one beam without affecting the other. We fix one of the processes and change the other process gradually. As the parameters defining the processes become increasingly similar, the two output probability distributions overlap more. This is reflected in the experiment by a higher visibility value, showing how the comparison between two sets of future statistics can be evaluated via interference visibility. Results are shown in Fig. 5c, where the experimental values are close to theoretical predictions. However, there remain slight discrepancies because of experimental imperfections such as small spatial and polarisation mode mismatches. These techniques could be adapted to attain a quantum advantage in estimating the distance between two normalised vectors 34 , which plays an essential role in machine learning tasks such as image recognition 35 .

Discussion
Our multi-step photonic implementation of a stochastic simulation has verified the memory advantage available with quantum resources. We have demonstrated that it is possible to maintain this advantage at all stages of the simulation by preserving quantum coherence, as opposed to previous experiments 8, 36 . Further, we have shown that superpositions of process outcomes can be interfered. These techniques have the potential to reduce memory requirements in simulations of stochastic processes and to provide tools for advances in quantum machine learning and communication complexity.
The time-bin-encoding techniques in our experiment can be extended to other small-and medium-scale simulations by expanding the number of time bins. For example, 10 8 time-bin modes have been realised in the context of communication complexity 37 . However, the number of bins does not scale efficiently with the number of qubits, and thus very-large-scale simulations are not possible with this encoding. This is not a fundamental problem, as the concepts that we demonstrate can be equivalently implemented in other photonic encodings or in other qubit systems. Our current demonstration also uses nondeterministic (post-selected) mode recombination at certain beam splitters within the circuit. This implementation is convenient, but not necessary and thus not a fundamental limitation: a deterministic multi-step simulator could be realised with a step-dependent delay mechanism-for instance, a . Data points are experimental measurements of C q , and the magenta and turquoise curves are theoretical estimations for the quantum and classical complexities C q and C μ , respectively. b Two-photon interference of the superpositions of future trajectories in two implemented stochastic processes, Π 1 and Π 2 , with Π 1 = Π 2 such that l = 0.5 and m = 0.5 and thus |S 0 〉 = |S 1 〉. Since Π 1 = Π 2 , an interference visibility of 100% is theoretically expected, while fitting the experimental data yields a visibility of 0.96 ± 0.02. In the graph, the number of measured twofold coincidences is depicted versus the relative delay between single-photon wave packets. c Magenta and turquoise elements (points-experiment; curves-theory) show the comparison of the statistical futures from two stochastic processes by two-photon interference visibility. In each case, one process (Π 1 ) is fixed, and the other process (Π 2 ) has fixed m but varying l. Magenta represents interference of the output states of the simulators for Π 1 (|S 0 〉 is the input memory state, l = 0.5, and m = 0.5) with Π 2 (|S 0 〉 is the input memory state, m = 0.5, and l varying). Turquoise represents interference of the output states of the simulators for Π 1 (|S 0 〉 is the input memory state, l = 1.0, and m = 1.0) with Π 2 (|S 0 〉 is the input memory state, m = 0.5, and l varying). Uncertainties, from the Poissonian distribution of photon counts, are so small that they are barely visible in some of the graphs controlled fast switch connected to fibre paths of different lengths. The comparison of future statistics has direct relation to other protocols, such as quantum fingerprinting and state comparison in communication complexity 34,37 . Fingerprinting involves estimating the distance between two vectors, where the resource to be minimised is the amount of communication. For the comparison of two vectors, quantum mechanics can reduce the amount of communication required beyond classical limits. In the quantum protocol, Alice and Bob perform a SWAP test-a quantum information primitive, which compares two arbitrary states. Twophoton interference is known to be equivalent to a SWAP test 38 . Our comparison of futures can be cast as a similar problem. In this case, the task would be for Alice and Bob, who each have their future statistics from potentially different processes, to compare the two statistical futures 34 . In principle, for very highdimensional Hilbert spaces, a comparison of statistical futures via two-photon interference can achieve a quantum advantage in communication complexity. The comparison of two vectors is also an important component of many machine learning tasks, and thus a similar advantage could extend to more general settings like speech recognition 35 .

Methods
Theoretical background. A discrete-time stochastic process is generally described by a joint probability distribution, pð X ; X ! Þ, where X ¼ :::; X À1 ; X 0 X ! ¼ X 1 ; X 2 ; ::: denotes the random variables that govern the statistics of past (future) observations. Each past (future) configuration of the random process is denoted by x ð x ! Þ. For an observed past configuration x , the future statistics are dictated by the conditional probability pð X By categorising all sets of past events with the same future statistics into equivalence classes (called causal states, which are encoded as memory states of the simulator), the optimal classical model (called the ε-machine 10,39 ) only needs to store the class εð x Þ that x belongs to. That is, given only εð x Þ, the ε-machine is able to make a statistically accurate inference of the process' conditional future. By observing the outcome of the stochastic process over a long time, one can infer the probability of each causal state and transition probabilities between them. For a stochastic process, the N causal states S ¼ fS i g N i¼1 and their relevant transition probabilities are enough to realise the ε-machine model. The resulting ε-machine requires 33 bits of information about the past, where d i is the probability that the past is in causal state S i . No other predictive model can simulate the future while storing less information about the past. Thus, C μ has been termed the statistical complexity 10,40,41 , and is considered a fundamental measure of complexity that captures how resource-intensive it is to predict the future of a given process. It has been theoretically proven that for many processes, including the one studied here, there exists a quantum ε-machine with entropy C q , such that C q < C μ 1 . Similar to its classical counterpart, this quantum model is defined by its causal states {|S i 〉} and the corresponding transition probabilities. On average, the entropy of such a quantum ε-machine is given by Three-step simulation of a perturbed coin. For the perturbed coin process, the optimal quantum causal states can be written as 1 : To give an example of the output state of our simulator, let us consider a perturbed coin defined by its parameters l and m, which we denote as process Π 1 . The output of the corresponding quantum ε-machine after three time steps is given by the superposition X where n = {1, 2, 3} and p(x 1 , x 2 , x 3 |S i , Π 1 ) is the probability to obtain x 1 , x 2 and x 3 as the outcomes of three time steps of the process Π 1 when the input causal state is |S i 〉. The value of p can be evaluated theoretically from the transition probabilities between causal states (Fig. 1). The variables x n ∈ {0, 1} are the configurations of random variables x 1 , x 2 and x 3 , respectively. To sample from the future statistics of the perturbed coin process, we perform a simultaneous measurement of all the ancillary qubits after the three time steps. By also characterising the polarisation state of the photon in each case, we can tomographically reconstruct the output state associated with each time bin, and thus experimentally determine the statistical complexity of the simulation. To calculate the statistical complexity, C q , for this process, we need to find the state ρ: where and S poljS i is the tomographically reconstructed polarisation state at each arrival time, conditioned on the input memory state being encoded in |S i 〉.
Verifying the unitarity of the processor via two-photon quantum interference.
To verify that the operation is unitary, which guarantees the conservation of the entropy, we need to show that the superposition of different modes, both in time and polarisation, is coherent and that this coherence is maintained throughout the whole process. Using a pure state as the input and viewing the entire simulation as a black box, the output of the unitary operations inside the box should ideally be a pure state. In order to experimentally demonstrate this, we consider the case of simultaneously implementing two setups to model two identical processes, Π 1 = Π 2 . It is possible to verify that two uncorrelated single photons are in identical pure states via two-photon interference-the Hong-Ou-Mandel (HOM) effect. The visibility of the interference, v ¼ P max ÀP min P max , where P max (P min ) is the maximum (minimum) of two-photon coincidence detections measured when varying the delay between the two beams, can only be unity if the photons are in pure and identical states.
Comparison of future statistics. The case of unequal processes also provides useful information. If Π 1 ≠ Π 2 and the output states are pure, the overlap of different future output statistics can be deduced by interfering the output photons. For two photons in states |ψ〉 and |ϕ〉 entering two input ports of a 50:50 beam splitter, the probability of finding a coincidence is 1À ϕjψ h i j j 2 2 , where 〈ϕ|ψ〉 is the overlap of the two states. Therefore, one can use the HOM interference visibility v to estimate overlaps, by noting that v = |〈ϕ|ψ〉| 2 . For our stochastic processes, the overlaps of the photonic output states are directly related to the overlaps of the future statistics produced by the two processes. For two different processes Π 1 and Π 2 , let |S i 〉 be a causal state of Π 1 , and |T j 〉 be a causal state of Π 2 . Using Eq. (5), in general the overlap between the respective outputs of the quantum simulators for Π 1 and Π 2 will be Since the perturbed coin process has Markov order one, and there is a one-toone correspondence between the classical outcome and the causal state the machine transitions to, interfering the output states from a pair of quantum simulators for Π 1 and Π 2 as in Eq. (7), actually results in an overlap X x n ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi That is, in this special case we are able to find the difference between conditional futures up to one additional time step. Therefore, we can use our photonic quantum information processor for two tasks: (1) to simulate the future outcomes over three time steps of the classical stochastic process; and (2) to estimate the overlap of the future output statistics over four time steps.
Details of the experimental design. The schematic in Fig. 3 shows how we implement the multi-step quantum-enhanced stochastic processor. Consider, for instance, the scenario where we want to sample the statistics. For one process (i.e. for one beam) a single photon is injected in the left-hand side of the circuit, from the source, with the state |0〉 (|1〉), which is encoded as |H〉 = horizontal (|V〉 = vertical) polarisation. The first wave plate creates the desired initial causal state of our perturbed coin, either jS 0 i pol ¼ ffi ffi l p jHi pol þ ffiffiffiffiffiffiffiffiffi ffi 1 À l p jVi pol or jS 1 i pol ¼ ffiffiffiffiffiffiffiffiffiffiffiffi 1 À m p jHi pol þ ffiffiffiffi m p jVi pol . The purpose of the first block is to transform a photon with a causal state encoded in polarisation into an appropriately weighted superposition of the classical outcomes of the first step encoded in the arrival time (denoted here as the delay degree of freedom, del), with the corresponding next causal state encoded in the polarisation: This is achieved by temporarily using the photon path as an auxiliary degree of freedom: a polarising beam splitter maps the polarisation degree of freedom onto the path, which is then copied onto the arrival time through the use of different path lengths (|H〉 → short path and |V〉 → long path). By using a wave plate in each of the two paths, a path-dependent (and therefore, arrival-time-dependent) transformation of the polarisation into one of the two causal states is achieved: |S 0 〉 pol in the short path and |S 1 〉 pol in the long path.
Next, the information on the path degree of freedom is erased, to avoid an exponential scaling of the number of paths (and optical elements in the experiment) with the number of time steps. To this end, the paths are recombined in a 50:50 beam splitter, and subsequently post-selected for the photon exiting in the right output arm at the end of the first block (Fig. 3). This means that we will lose half of our photons at the beam splitter, but in each run that we post-select, the evolution is unitary because the post-selection ensures that no photon is detected in the other output arm. By repeating the described block at each time step, we have three blocks to realise a three-step machine. The use of a sequence of interferometers has also been demonstrated in other experiments to study different topics in quantum information, such as non-Markovian dynamics and sequential state discrimination 42,43 .
To be able to attribute a different arrival time to each sequence of classical outcomes, we require a unique path length for every possible combination of short or long paths within the three blocks. The delays are implemented as t 1 = 2 ns at the first step, t 2 = 4 ns at the second and t 3 = 8 ns at the third step. The arrival times are discriminated by time-resolving single-photon detectors. The coincidence window for HOM interference is long enough to include the state which is spread out in a 14 ns time interval.
After the third step, we have the measurement stage at one output arm of the third BS and the circuit continues at the other, which is exploited for the second task of our work. In order to run our simulation and estimate the memory efficiency of this scheme compared to the optimal classical one, we measure the final arrival times (encoding the three ancillary qubits of the original scheme) and reconstruct the final polarisation state of the photon. This can be done simultaneously at the tomography stage, by also measuring the arrival times of the photons, allowing a full reconstruction of the polarisation state and arrival time.
The same apparatus can be exploited for the interference part of our experiment, the only difference being that now two single photons are injected in the setup. They both pass through the three blocks described above. When we want to verify the unitarity of our simulation, the elements in the blocks are the same for both the photons, so as to have Π 1 = Π 2 ; on the other hand, they are different when we want to compare the future statistics of two different processes (Π 1 ≠ Π 2 ). After the output of the third block, the two photons interfere in a fibre BS and the number of coincidences is measured.
Details of the l and m parameters used in the experiment. The simulated process, for which the C q results are depicted in Fig. 5a, is a perturbed coin with parameters l = 0.4 and m ranging from 0.

Data availability
All data, relevant to the information and figures presented in this manuscript, are available upon reasonable request.