Bell’s theorem for temporal order

Time has a fundamentally different character in quantum mechanics and in general relativity. In quantum theory events unfold in a fixed order while in general relativity temporal order is influenced by the distribution of matter. When matter requires a quantum description, temporal order is expected to become non-classical—a scenario beyond the scope of current theories. Here we provide a direct description of such a scenario. We consider a thought experiment with a massive body in a spatial superposition and show how it leads to entanglement of temporal orders between time-like events. This entanglement enables accomplishing a task, violation of a Bell inequality, that is impossible under local classical temporal order; it means that temporal order cannot be described by any pre-defined local variables. A classical notion of a causal structure is therefore untenable in any framework compatible with the basic principles of quantum mechanics and classical general relativity.

Quantum mechanics forces us to question the view that physical quantities (such as spin, position or energy) have predefined values: Bell's theorem shows that if observable quantities were determined by some locally defined classical variables, it would be impossible to accomplish certain tasks, such as the violation of Bell's inequalities, whereas such tasks are possible according to quantum mechanics 1,2 and have been realised in experiments [3][4][5][6] . However, the causal relations between events remain fixed in quantum theory: whether an event A is in the past, in the future, or space-like separated from another event B is predefined by the location of such events in space-time 7,8 . In contrast, in general relativity, space-time itself is dynamical: the presence of massive objects affects local clocks and thus causal relations between events defined with respect to them. Nonetheless, the dynamical causal structure of general relativity is still classically predefined: the causal relation between any pair of events is uniquely determined by the distribution of matter-energy degrees of freedom (DOFs) in their past light cone. In other words, causal relations are always determined by local classical variables. This picture is expected to change if we consider quantum states of gravitating DOFs: if a massive system is prepared in a superposition of two distinct states, each yielding an observably different causal structure for future events, would it be possible to observe causal relations that display genuine quantum features?
A main obstacle in the analysis of macroscopic superpositions of gravitating bodies is that, in the absence of a classical space-time manifold, it becomes unclear how to identify space-like surfaces on which quantum states are defined, or global fields of time-like vectors to define time evolution. Indeed, some models even postulate that such superpositions are simply not valid physical states and must decohere (or collapse) fast enough to preserve a classical description of space-time and dynamical laws [9][10][11][12][13] . A very different mindset underlies various quantum gravity frameworks 14 , where quantum features of the metric, and therefore of the causal relations, are indeed expected. However, to date, none of the quantum gravity frameworks has been applied to analyse such an epitomic example as superpositions of space-times with macroscopically distinct causal structures. Therefore, it is unclear whether there exists any phenomenology unequivocally associated with quantum causal structures, or whether quantum gravity frameworks can circumvent or directly address the objections against superpositions of manifolds. Independently, quantum formalisms have been recently developed to study quantum causal structures at an abstract level in the context of quantum-information processing 8,15,16 . However, although quantum features of space-time are among the motivations for these studies, no direct link with quantum gravity has yet been established.
This work provides the first direct analysis of quantum causal relations arising from a spatial superposition of a massive object. We show how the temporal order between time-like events can become superposed or even entangled. We further discuss a thought experiment, an admissible albeit remote physical scenario, where these non-classical causal relations arise among physical events. In order to prove their non-classicality, we formulate a Bell-type theorem for temporal order: we define a task that cannot be accomplished if the time order between the events is predetermined by local variables, while the task becomes possible if the events are in a space-time region affected by the gravitational field of a massive object in an appropriate quantum state. Our approach provides a method to directly describe scenarios so far considered to be out of reach for standard theoretical physics. We show explicitly how to overcome the difficulties with describing superpositions of metrics that motivated collapse models. On the other hand, our result is independent of the high-energy completion of any specific quantum gravity framework: we do not assume any new physics, and the results are based entirely on well-established, low-energy general relativity and on quantum mechanics. Our results are therefore robust against particular mathematical approaches to quantising gravity, thus providing a benchmark for specific frameworks. Furthermore, the time and energy scale at which entangled temporal order arises is closer than the Planck scale, typically invoked in this context, and is also far removed from the scale given by the decoherence models, which therefore do not preclude quantum features of space-time from arising. Our results thus reveal that both the above approaches are missing crucial intuition and correct physical understanding of the phenomena associated with causal structures at the interface of quantum and gravitational physics.
In turn, our work provides a robust method to quantitatively assess these phenomena, helping to build correct physical intuition for quantum causal structures.

Results
Dynamical causal structure in general relativity. In classical general relativity, the causal structure is the structure of light cones of the space-time metric 17,18 . As the matter-energy DOFs determine the metric through Einstein's equations, the causal structure of a region of space-time is dynamical: it depends on the state of the matter-energy in its past light cone. A major obstacle towards a quantum theory of gravity is that it is not clear how to transpose the mathematical notion of causal relations to scenarios where matter DOFs can be in general quantum states, as such scenarios seem to preclude the use of any underlying space-time manifold with respect to which events, light cones and causal relations could be defined. To overcome this obstacle, our approach is to start from a physical understanding of events and their causal relations. Even in classical general relativity a physical event cannot be directly identified with a point on a space-time manifold, a fundamental aspect of the theory captured mathematically by diffeomorphism invariance 19 . Although it can be debated whether or not space-time points have an intrinsic physical meaning, a natural way to define diffeomorphism-invariant events is to specify them operationally, relative to physical systems; for example, positions and proper times of physical systems used as clocks 20 . We adopt this notion of events throughout the work. Causal relations are then understood as the possibility to exchange non-faster-than-light signals (or, more generally, physical systems) between operationally defined events.
The presence of massive bodies generally alters the relative rates at which clocks tick. For example, in a weak field limit, a clock in a gravitational potential Φ exchanging signals with an identical clock far away from the source of Φ, where the potential effectively vanishes, will appear to tick slower by a factor $\sqrt{-g_{00}} = \sqrt{1 + 2\Phi/c^2} \approx 1 + \Phi/c^2$. In classical physics, this leads to the well-tested time-dilation 21,22 and redshift effects 23 . When the clocks are described as quantum systems, new effects arise from the combination of quantum and general relativistic theories. For a clock in superposition of different distances to the mass, its time-keeping DOFs become entangled with the clock's position [24][25][26] . This entanglement implies a universal decoherence mechanism for generic macroscopic systems under time dilation 27,28 . The regime of low-energy quantum systems in curved space-time can be described within a framework of general relativistic composite quantum particles 29 . Here we additionally exploit the fact that only the distance between a clock and a mass has physical significance, and due to linearity of quantum theory this must hold also for a superposition of different distances. (There is no difference in the relative ticking rates of two clocks whether we think that the clocks are being positioned at different distances, possibly in a superposition, from the mass, or that the mass is positioned at different distances from the clocks 30 .) Consider two agents, a and b, with two initially synchronised clocks, each following a fixed world line. A third agent prepares one of two mass configurations, $K_{A\prec B}$ or $K_{B\prec A}$, so as to induce time dilation between the clocks of a and b. If configuration $K_{A\prec B}$ is prepared, event A, defined by the clock of agent a showing proper time $t_a = \tau^*$, will be in the past light cone of the event B, which is defined in an analogous way: by the clock of agent b showing proper time $t_b = \tau^*$.
If configuration $K_{B\prec A}$ is prepared, event B will be in the past light cone of event A. To keep the world lines of the agents independent of the mass configuration, their laboratories can be embedded in tight enough trapping potentials, that is, much stronger than the gravitational field (which is feasible since our protocol does not require macroscopic source masses, see Methods). In Supplementary Note 4 we discuss other mass configurations, which have the desired effect on temporal order, but for which the agents a, b can remain inertial.
A possible way to realise configuration $K_{A\prec B}$ is to place an approximately point-like body of mass M closer to b than to a, see Fig. 1. The light-cone structure of the resulting space-time is fully determined by the metric tensor $g_{\mu\nu}$, for which we adopt the sign convention (−, +, +, +). In isotropic coordinates in the first-order post-Newtonian expansion the metric components are 31 $g_{00}(r) = -\left(1 + \frac{2\Phi(r)}{c^2}\right)$ and $g_{ij}(r) = \delta_{ij}\left(1 + \frac{2\Phi(r)}{c^2}\right)^{-1}$, $i, j = 1, 2, 3$, where $\Phi(r) = -\frac{GM}{r}$ is the gravitational potential and r is the spatial distance between the mass and the event where the metric is evaluated. For an event with a spatial coordinate $R_a$ and the mass at a spatial coordinate $r_M$ (where the spatial coordinates are defined, for example, by a far-away agent as in Fig. 1), we have $r \equiv |R_a - r_M|$. Note that we use a common coordinate system to describe the different mass configurations and the associated space-time metrics. Operationally, we can associate such coordinates with the far-away agent, whose local clocks are not affected by the change in the matter distribution. However, this is only a convenient interpretation; we can always think of the coordinates in analogy to gauge fixing: any physical prediction regarding proper times of the clocks and exchange of the signals will not depend on the choice of coordinates.
We consider that a and b remain at fixed coordinate distances from the mass, $r_a$ and $r_b = r_a - h$, respectively, and find the parameters for which event A ends up in the past light cone of B for $K_{A\prec B}$ (and vice versa for $K_{B\prec A}$). An infinitesimal proper time element along a world line at a distance r from the mass is given by $d\tau(r) = \sqrt{-g_{00}(r)}\,dt$, where t is the coordinate time, and a photon travelling in the radial direction from $r_a$ reaches $r_b$ after a coordinate time $T_c = \frac{1}{c}\int_{r_b}^{r_a} \frac{dr'}{1 + 2\Phi(r')/c^2}$. Therefore, if the photon is emitted at the local time $t_a = \tau^*$, it reaches $r_b$ when b's local time is $t_b = \sqrt{-g_{00}(r_b)}\left(\frac{\tau^*}{\sqrt{-g_{00}(r_a)}} + T_c\right)$, assuming that the local clocks are synchronised so that $t_a = 0$ and $t_b = 0$ coincide with the coordinate time t = 0. For
$$\tau^* \geq \frac{T_c\,\sqrt{-g_{00}(r_a)}\,\sqrt{-g_{00}(r_b)}}{\sqrt{-g_{00}(r_a)} - \sqrt{-g_{00}(r_b)}} \qquad (1)$$
we have $t_b \leq \tau^*$, which means that there is enough time for a not-faster-than-light signal emitted at event A (defined by $t_a = \tau^*$) to travel the distance h and reach agent b at event B (defined by $t_b = \tau^*$). This means that event A is in the causal past of event B as required. For example, for $h \ll r_a$ condition (1) is satisfied for $\tau^* > \frac{2 r_a^2 c}{GM}$. Configuration $K_{B\prec A}$ can be arranged analogously, by placing the mass closer to a than to b. Then, the condition $\tau^* > \frac{2 r_b^2 c}{GM}$, for $h \ll r_b$, ensures that B is in the causal past of A. Note that with the above conditions on $\tau^*$ the events A and B are always time-like separated, but have different time orders for the two mass configurations: these conditions guarantee that the time order between A and B is swapped in all reference frames.
The example above simply illustrates that in general relativity causal structure is dynamical and depends on the stress-energy tensor of the matter DOFs: preparing different matter distributions on a space-like hypersurface can result in different causal relations between events in its causal future.
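The ordering condition can be checked numerically. The following is a minimal sketch, assuming the first-order weak-field expressions $\sqrt{-g_{00}} \approx 1 + \Phi/c^2$ and $\Phi = -GM/r$; the function names and the numerical values (an Earth-mass body and kilometre-scale clock separation) are hypothetical choices made only for illustration and are not taken from the protocol discussed in the text.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s


def tick_rate(r, mass):
    """First-order sqrt(-g_00) = 1 + Phi/c^2 with Phi = -G*mass/r."""
    return 1.0 - G * mass / (r * C**2)


def light_travel_time(r_a, h, mass):
    """Coordinate time for radial light from r_a to r_b = r_a - h,
    i.e. (1/c) * integral of (1 - 2*Phi/c^2) dr, to first order in Phi/c^2."""
    r_b = r_a - h
    return (h + 2.0 * G * mass / C**2 * math.log(r_a / r_b)) / C


def threshold_tau(r_a, h, mass):
    """Smallest tau* for which light emitted at A reaches b no later than B."""
    r_b = r_a - h
    # Difference of the two tick rates in closed form; subtracting the
    # rates directly would lose precision to floating-point cancellation.
    rate_gap = G * mass * h / (C**2 * r_a * r_b)
    t_c = light_travel_time(r_a, h, mass)
    return t_c * tick_rate(r_a, mass) * tick_rate(r_b, mass) / rate_gap


def a_precedes_b(tau_star, r_a, h, mass):
    """True if event A lies in the causal past of event B."""
    return tau_star >= threshold_tau(r_a, h, mass)


# Hypothetical numbers: Earth-mass body, clocks at 7000 km and 6999 km.
MASS, R_A, H = 5.972e24, 7.0e6, 1.0e3
tau_min = threshold_tau(R_A, H, MASS)
tau_sufficient = 2.0 * R_A**2 * C / (G * MASS)  # sufficient condition for h << r_a
print(tau_min, tau_sufficient, a_precedes_b(tau_sufficient, R_A, H, MASS))
```

In this approximation the threshold is close to $c\,r_a r_b/(GM)$, so the sufficient condition $\tau^* > 2 r_a^2 c/(GM)$ leaves roughly a factor of two of margin.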
Quantum control of temporal order. When A is in the past light cone of B, a physical system can in principle be transferred from A to B. Consider a quantum system S initially prepared in state $|\psi\rangle_S$, which undergoes a unitary $U_A$ at event A (at the space-time location where the clock of agent a marks proper time $\tau^*$) and a unitary $U_B$ at event B. Such ordered events can therefore result in the following state of S:
$$|\psi_1\rangle_S = U_B U_A |\psi\rangle_S. \qquad (2)$$
If B is before A, and S is prepared in the same initial state, the final state of S is
$$|\psi_2\rangle_S = U_A U_B |\psi\rangle_S. \qquad (3)$$
A situation can therefore be arranged such that state (2) is produced for configuration $K_{A\prec B}$ and (3) is produced for $K_{B\prec A}$. (We ignore a possible additional time evolution between the two events for simplicity.) Different mass configurations can result in different temporal orders of local operations, which holds in quantum as well as in classical theory. Let us make the following assumptions: (a) Macroscopically distinguishable states of physical systems can be assigned orthogonal quantum states. (b) Gravitational time dilation in a classical limit reduces to that predicted by general relativity. (c) The quantum superposition principle holds (regardless of the mass or nature of the involved system).
Even though the above assumptions hold in the standard quantum and general relativistic frameworks, it is not known if a fundamental theory of quantum gravity satisfies them. Our aim is to investigate their consequences for the notion of temporal order.
Fig. 1 General relativistic engineering of causal relations between space-time events using a massive body. Initially synchronised clocks a and b are positioned at fixed distances from a far-away agent whose time coordinate is t. Event A (B) is defined by the clock of a (b) showing proper time $\tau^*$. In configuration $K_{A\prec B}$ (left) a mass is placed closer to b than to a. Due to gravitational time dilation, event A can end up in the causal past of event B: for a sufficiently large $\tau^*$ the time difference between the clocks becomes greater than the time it takes light to travel between them. Light emitted at event A reaches clock b before the event B occurs. Configuration $K_{B\prec A}$ (right) is fully analogous to $K_{A\prec B}$: the mass is placed closer to clock a and the event B can end up in the causal past of the event A.
NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-11579-x ARTICLE
The coordinates introduced in the previous section define a foliation of space-time into equal-time slices. As long as no horizons are present in any of the considered configurations, such slices define space-like hypersurfaces. With each hypersurface one can associate a Hilbert space, containing the quantum states of interest at the given time. The time coordinate corresponds to the time t in Fig. 1 and is operationally defined as the time measured by the local clock of the far-away agent (not affected by the mass configurations). These quantum states can be understood operationally as states assigned by the far-away agent. However, as discussed in the previous section, such an interpretation is not strictly necessary but is merely a convenient way to define the relevant mathematical objects and to carry out the calculations.
The two mass configurations $K_{A\prec B}$, $K_{B\prec A}$ can thus be assigned quantum states $|K_{A\prec B}\rangle_M$, $|K_{B\prec A}\rangle_M$. By assumption (a) these states are orthogonal. Since each state individually satisfies the classical limit (mass is sufficiently localised around a single world line), following assumption (b), the system S will evolve as in Eqs. (2) or (3) depending on whether the mass is in state $|K_{A\prec B}\rangle_M$ or $|K_{B\prec A}\rangle_M$, respectively. Finally, by assumption (c), an equal superposition of the two states is a physically allowed mass configuration, and will yield the following final state of the joint system:
$$|\psi_{\mathrm{sup}}\rangle_{MS} = \frac{1}{\sqrt{2}}\left(|K_{A\prec B}\rangle_M |\psi_1\rangle_S + |K_{B\prec A}\rangle_M |\psi_2\rangle_S\right). \qquad (4)$$
An explicit calculation showing how this state arises is presented in Methods. We note that not only classical gravity but also semi-classical 14 and stochastic gravity 32 theories would not yield Eq. (4) since these frameworks describe gravitational interactions in terms of classical, possibly stochastic, variables, thus violating assumption (c).
Note that, given a specific physical system used as a clock, it is possible to simulate its time dilation using non-gravitational interactions. For example, an electric field can shift atomic energy levels and thus "time dilate" a clock based on atomic transitions. Therefore, one can produce a state analogous to (4) without using gravity. However, only gravity can alter the relative ordering of events independently of the nature of the systems and interactions used as clocks, due to the universality of time dilation: the preparation and manipulation of the massive object can be carried out without any knowledge of other aspects of the protocol. Such a universality underpins a fundamental distinction between our gravitational protocol and other, non-gravitational, methods to control causal relations between operationally defined events [33][34][35][36][37][38][39] . (See also Supplementary Note 4 for further discussion.) Finally, the state (4) is the result of a process wherein the order of operations on a target system (S) is determined by the quantum state of a control system (position of the massive body). Such a process is known as a quantum switch 15 and has been studied as a possible quantum-information resource [40][41][42][43][44] . The state $|\psi_{\mathrm{sup}}\rangle_{MS}$ is a superposition of two amplitudes corresponding to different predefined, classical orders between events A and B. Note that, if the control system is discarded, the reduced state of S is
$$\frac{1}{2}\left(|\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2|\right), \qquad (5)$$
which is indistinguishable from a probabilistic mixture of $|\psi_1\rangle$ and $|\psi_2\rangle$. The state in Eq. (5) can be interpreted as arising from events A and B with a classical, albeit unknown, temporal order. Therefore, any protocol aimed at testing operationally quantum features of temporal order necessarily requires a measurement of the control system.
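The role of the control system can be made concrete with a small numerical sketch. It assumes a single qubit as the target system S and a two-level control standing in for the two mass configurations; the choice of unitaries (a Pauli X and a Hadamard) and all function names are hypothetical and serve only as an illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def switch_state(u_a, u_b, psi):
    """Joint control-target state with the order of u_a, u_b set by the control."""
    psi1 = u_b @ u_a @ psi                    # A before B
    psi2 = u_a @ u_b @ psi                    # B before A
    k_ab = np.array([1, 0], dtype=complex)    # control state for order A then B
    k_ba = np.array([0, 1], dtype=complex)    # control state for order B then A
    joint = (np.kron(k_ab, psi1) + np.kron(k_ba, psi2)) / np.sqrt(2)
    return joint, psi1, psi2


def reduced_state_of_target(joint, dim_target):
    """Partial trace over the two-level control."""
    m = joint.reshape(2, dim_target)          # rows: control, columns: target
    return m.T @ m.conj()


ket0 = np.array([1, 0], dtype=complex)
joint, psi1, psi2 = switch_state(X, H, ket0)
rho_target = reduced_state_of_target(joint, 2)
mixture = 0.5 * (np.outer(psi1, psi1.conj()) + np.outer(psi2, psi2.conj()))
print(np.allclose(rho_target, mixture))
```

Because the two control states are orthogonal, tracing out the control removes every cross term, which is why the target alone cannot reveal any coherence between the two orders.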
Bell's theorem for temporal order. The above argument shows that superpositions of massive objects can in principle result in a coherent quantum control of temporal order between events. However, one might question whether such a conclusion has a direct physical meaning or whether it relies on a particular interpretation of state (4). Furthermore, the state assignment is defined in terms of a given coordinate system, while we would like to base our conclusions on coordinate-independent physical events. Since the very meaning of quantum states and measurements might be put into question in the absence of a classical space-time, a proof of non-classical causal relations should not rely on the validity of the quantum formalism. In the following we show that it is possible to probe the nature of temporal order irrespective of the validity of quantum theory. We formulate a theory-independent argument, which does not rely on the quantum framework and provides means to exclude the very possibility of explaining data from a hypothetical experiment in terms of a classical temporal order (which can be stochastic and dynamical) within a broad class of probabilistic theories, not limited to quantum mechanics. Our formulation is analogous to Bell's theorem for local hidden variables 1,2 (see Methods) and we thus refer to the theorem below as Bell's theorem for temporal order of events. The core of the argument is simple: given a bipartite system prepared in a separable state, it is not possible to violate any bipartite Bell inequality by performing local operations (transformations and measurements) on the two parts, as long as the local operations are applied in a definite order. The scenario involves a bipartite system with subsystems $S_1$ and $S_2$ and a system M that can influence the temporal order of events. For j = 1, 2, each system $S_j$ undergoes two transformations, $T_{A_j}$ and $T_{B_j}$, at space-time events $A_j$, $B_j$, respectively.
Each system is then measured at an event $C_j$ according to some measurement setting $i_j$, producing a measurement outcome $o_j$. Additionally, M is measured at an event D, space-like separated from both $C_1$ and $C_2$, producing an outcome z, see Fig. 2. We now define the notion of classical order between events: Definition 1: A set of events is classically ordered if, for each pair of events A and B, there exists a space-like surface and a classical variable λ defined on it that determines the causal relation between A and B: for each given λ, either A ≼ B (A in the past causal cone of B), B ≼ A (B in the past causal cone of A) or A||B (A and B space-like separated).
Classically ordered events do not necessarily form a partially ordered set: classical order can be dynamical (the order between two events can depend on some operation performed in the past, i.e. some agent can prepare λ) and stochastic (λ might be distributed according to some probability, and not specified deterministically) 45,46 .
Bell's theorem for temporal order. No states, sets of transformations and measurements that obey assumptions 1-5 below can result in a violation of the Bell inequalities: 1. Local state: The initial state ω of $S_1$, $S_2$ and M is separable (as defined in Methods). More formally, let us denote by $T = (T_{A_1}, T_{B_1}, T_{A_2}, T_{B_2})$ the set of all local transformations irrespective of their order. The thesis of the theorem can be rephrased as: the conditional probability produced under assumptions 1-5 does not violate Bell's inequalities for any value of z. The proof of the theorem is presented in Methods.
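The theorem itself is theory independent, but its content can be illustrated within quantum theory for two qubits. The sketch below assumes that a classical variable λ selects the same order in both wings with some probability, so that the final state is a mixture of product states. It then evaluates the largest attainable CHSH value with the Horodecki criterion, a standard two-qubit result that is brought in here as an illustration and is not part of the proof in Methods. The unitaries and function names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = (X + Z) / np.sqrt(2)
PAULIS = (X, Y, Z)


def max_chsh(rho):
    """Horodecki criterion: largest CHSH value reachable with a two-qubit state."""
    t = np.array([[np.trace(rho @ np.kron(a, b)).real for b in PAULIS]
                  for a in PAULIS])
    eig = np.linalg.eigvalsh(t.T @ t)         # eigenvalues in ascending order
    return 2.0 * np.sqrt(max(eig[-1] + eig[-2], 0.0))


def projector(vec):
    return np.outer(vec, vec.conj())


def classical_order_state(u_a, u_b, psi, p_ab=0.5):
    """Mixture over a classical order variable shared by both wings."""
    ab = u_b @ u_a @ psi                      # lambda selects A before B
    ba = u_a @ u_b @ psi                      # lambda selects B before A
    return (p_ab * projector(np.kron(ab, ab))
            + (1.0 - p_ab) * projector(np.kron(ba, ba)))


ket0 = np.array([1, 0], dtype=complex)
rho_classical = classical_order_state(X, H, ket0)
bound = max_chsh(rho_classical)
print(bound)
```

For any mixing probability the state stays separable, so the value returned never exceeds the classical CHSH bound of 2.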
Violation of Bell inequalities for temporal order. Here we show how the gravitational quantum control of temporal order from the first section can result in events whose temporal order is entangled: a bipartite quantum system, initially in a product state $|\psi_1\rangle_{S_1}|\psi_2\rangle_{S_2}$, is sent to two different regions of space such that $a_1$, $b_1$ and $c_1$ only interact with $S_1$, while $a_2$, $b_2$ and $c_2$ only interact with $S_2$. Agents $a_1$, $a_2$ perform, respectively, the unitaries $U_{A_1}$, $U_{A_2}$ at the events $A_1$, $A_2$, while agents $b_1$, $b_2$ perform the unitaries $U_{B_1}$, $U_{B_2}$ at the events $B_1$, $B_2$. Finally, $c_1$ and $c_2$ measure $S_1$ and $S_2$ at events $C_1$ and $C_2$, respectively, see Fig. 3. Assume that a massive system can be prepared in two configurations, $K_{A\prec B}$ and $K_{B\prec A}$, such that $A_1 \prec B_1 \prec C_1$ ($A_1$ in the past light cone of $B_1$, etc.) and $A_2 \prec B_2 \prec C_2$ for $K_{A\prec B}$, while $B_1 \prec A_1 \prec C_1$ and $B_2 \prec A_2 \prec C_2$ for $K_{B\prec A}$, and such that the events are space-like separated as per assumption 4, which can always be achieved by having the groups sufficiently separated. If the mass is prepared in the superposition $\frac{1}{\sqrt{2}}\left(|K_{A\prec B}\rangle + |K_{B\prec A}\rangle\right)$, the joint state of the mass and the systems after the application of the unitaries is
$$\frac{1}{\sqrt{2}}\left(|K_{A\prec B}\rangle\, U_{B_1}U_{A_1}|\psi_1\rangle_{S_1}\, U_{B_2}U_{A_2}|\psi_2\rangle_{S_2} + |K_{B\prec A}\rangle\, U_{A_1}U_{B_1}|\psi_1\rangle_{S_1}\, U_{A_2}U_{B_2}|\psi_2\rangle_{S_2}\right).$$
Agent d at the event D measures the mass in the superposition basis $|\pm\rangle = \frac{1}{\sqrt{2}}\left(|K_{A\prec B}\rangle \pm |K_{B\prec A}\rangle\right)$. Conditioned on the outcome, the joint state of $S_1$ and $S_2$ reads
$$\frac{1}{\sqrt{2}}\left(U_{B_1}U_{A_1}|\psi_1\rangle_{S_1}\, U_{B_2}U_{A_2}|\psi_2\rangle_{S_2} \pm U_{A_1}U_{B_1}|\psi_1\rangle_{S_1}\, U_{A_2}U_{B_2}|\psi_2\rangle_{S_2}\right). \qquad (8)$$
If the states $U_{A_1}U_{B_1}|\psi_1\rangle_{S_1}$, $U_{A_2}U_{B_2}|\psi_2\rangle_{S_2}$ are orthogonal to $U_{B_1}U_{A_1}|\psi_1\rangle_{S_1}$, $U_{B_2}U_{A_2}|\psi_2\rangle_{S_2}$, respectively, then the state (8) is maximally entangled. Local measurements can thus be performed on subsystems $S_1$, $S_2$ whose outcomes will violate Bell inequalities, conditioned on the measurement outcome at D (see Supplementary Note 2 for an example). The above thought experiment can in principle be realised in a scenario where it is meaningful to argue that assumptions 1, 2 and 4, 5 are satisfied. Violation of the Bell inequality would then imply that assumption 3 does not hold, proving non-classicality of temporal order.
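A minimal numerical sketch of this construction follows, assuming qubits for $S_1$ and $S_2$, the same pair of unitaries in both wings, and standard CHSH measurement settings. The specific unitaries (Pauli X and a Hadamard acting on the state $|0\rangle$) are a hypothetical choice for which the two orders yield orthogonal states; they are not necessarily those used in Supplementary Note 2.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = (X + Z) / np.sqrt(2)


def conditional_state(u_a, u_b, psi, sign=+1):
    """State of S1 and S2 after the control is found in |+> (sign=+1) or |-> (sign=-1)."""
    order_ab = u_b @ u_a @ psi                # A before B in each wing
    order_ba = u_a @ u_b @ psi                # B before A in each wing
    state = np.kron(order_ab, order_ab) + sign * np.kron(order_ba, order_ba)
    return state / np.linalg.norm(state)


def chsh(state, a0, a1, b0, b1):
    """CHSH combination E(a0,b0) + E(a0,b1) + E(a1,b0) - E(a1,b1)."""
    def corr(a, b):
        return np.real(np.vdot(state, np.kron(a, b) @ state))
    return corr(a0, b0) + corr(a0, b1) + corr(a1, b0) - corr(a1, b1)


ket0 = np.array([1, 0], dtype=complex)
state_plus = conditional_state(X, H, ket0, sign=+1)
s_value = chsh(state_plus, Z, X, (Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2))
print(s_value)
```

With these choices the conditional state is a maximally entangled two-qubit state and the CHSH value reaches the Tsirelson bound $2\sqrt{2}$, above the bound of 2 that holds for any classical temporal order.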
In order to maximally violate the inequality, the time-dilated clocks of the agents need to decorrelate from the systems $S_i$. In the Methods section we present a particular scenario using photons that also satisfies this requirement. In Supplementary Note 3 we present two concrete examples of our thought experiment, using as the systems $S_i$ polarisation states of photons, depicted in Supplementary Fig. 1, or spatial modes of a quantum field, depicted in Supplementary Fig. 2.
Fig. 2 Bell's theorem for temporal order. A bipartite system, made of subsystems $S_1$ and $S_2$, is sent to two groups of agents. Operations on $S_1$ ($S_2$) are performed at events $A_1$, $B_1$ ($A_2$, $B_2$). At event $C_1$ ($C_2$), a measurement with setting $i_1$ ($i_2$) and outcome $o_1$ ($o_2$) is performed. Events $A_1$, $B_1$ are space-like separated from $A_2$, $B_2$ and $C_1$ is space-like to $C_2$; light cones are marked by dashed yellow lines. The order of events $A_j$, $B_j$, j = 1, 2, is described by a variable λ defined by a system M. The system M is measured at event D, producing an output bit z. If the initial state of the systems $S_1$, $S_2$, M is separable, and λ is a classical variable (possibly dynamical and probabilistic), the resulting bipartite statistics of the outcomes $o_1$, $o_2$ cannot violate any Bell inequality, even if conditioned on z.

Discussion
The non-classical causal structures discussed in this work arise in a semi-classical, albeit non-perturbative, regime where no explicit quantisation of the gravitational field is needed (which is complementary to the regime of most quantum gravity frameworks 14 ). Our approach shows that general relativity and standard quantum mechanics are sufficient to analyse scenarios involving superpositions of macroscopically different classical backgrounds. Not only is there no tension between the two frameworks, but there is also no ambiguity in the prediction of physical effects that arise: for each probability amplitude, the time-dilation effects introduced by the mass can be treated classically. The considered processes involve a simple superposition of such amplitudes and the final probability amplitude is given by the usual Feynman sum. Note that, even though no explicit quantisation of the metric is used, the amplitudes in the Feynman sum do correspond to macroscopically distinct space-time metrics: this is because each of these amplitudes contains a different causal structure, which determines the metric up to a conformal factor 17,18 . Quantisation of the metric is therefore implicit in our result, in a similar way as in recently considered witnesses for quantum gravity in interferometric scenarios [47][48][49] .
A practical realisation of the Bell test for time order would be extremely challenging, even in light of current efforts to prepare superposition states of massive objects and test their gravitational interactions [50][51][52][53][54] . However, there would be far-reaching consequences if such a test were fundamentally impossible: this would imply that time order, and thus time itself, can be described with a classical parameter even in space-times originating from a quantum state of a massive object, with no need to invoke any other mechanism, such as refs. [9][10][11][12][13] , that would decohere these states (see also Supplementary Note 5 for further discussion). On the other hand, since these mechanisms postulate a specific decoherence time of spatial superpositions, one could think that they preclude the preparation of non-classical causal structures. This is not the case: the time required to complete our protocol can be shorter than the decoherence time postulated by these models (see Methods). Thus, contrary to some motivations 11,13 , these models do not enforce fundamentally classical space-time with a fixed causal structure (i.e. there is a parameter regime where entangled causal structures could form but decoherence postulated by these models is negligible). Finally, classical temporal order could also not be excluded in a scenario where massive bodies can be prepared in quantum states but one (or more) of the assumptions 1, 2, 4, and 5 cannot be satisfied for some fundamental reason. We note that in particular the notion of locality may be fundamentally limited in the context of quantum gravity 55,56 .
We should note that proof-of-principle realisations of indefinite causal order, analogous to the examples discussed here, have been realised in the laboratory. However, such realisations cannot be interpreted as proofs of non-classical space-time in the sense of general relativity, see Supplementary Note 4 for a discussion of the key differences between the gravitational and other methods for a quantum control of temporal order. The full extent of the relation between gravitational and non-gravitational realisations of quantum causal structures merits an in-depth study on its own.
A crucial aspect of Bell's theorem for temporal order is that it provides a theory-independent result: it applies to any framework where causal relations are described classically, such as classical, semi-classical 14 and stochastic gravity 32 theories. Moreover, joint validity of the quantum superposition principle and gravitational time dilation, assumptions (a)-(c), suffices for a maximal possible violation of the bound. Therefore, a classical notion of temporal order is untenable in any theory compatible with these basic principles. Finally, the way in which a non-classical causal structure can be engineered exploiting time dilation from a massive body in a quantum state reveals a close connection between the information-theoretic framework of quantum combs/process matrices and joint effects of quantum mechanics and general relativity.

Methods
Quantum gravitational control of temporal order. According to the Einstein equations, a massive object gives rise to a space-time metric $g_{\mu\nu}$, $\mu, \nu = 0, ..., 3$, which in isotropic coordinates and a post-Newtonian expansion reads 31 : $g_{00}(r) = -\left(1 + \frac{2\Phi(r)}{c^2}\right)$, $g_{ij}(r) = \delta_{ij}\left(1 + \frac{2\Phi(r)}{c^2}\right)^{-1}$, where r denotes the distance to the location of the mass. In other words, if a test mass or a clock is positioned at a spatial coordinate $R_a$ as described by a far-away agent (as in Fig. 1) and the massive object is at a coordinate $r_M$, then $r = |R_a - r_M|$, which for clarity we denote below by $R_a - r_M$. It is important to note that the same coordinates describe scenarios where the mass is placed at different locations at a finite distance from $r_M$, as long as it remains far from an asymptotic region so that the spatial and temporal coordinates of the far-away agent remain unaffected (i.e. are those of flat Minkowski space-time). In these coordinates, the Hamiltonian of a clock (a particle with internal DOFs) reads (see e.g. refs. [57][58][59]) where $P_i$, i = 1, 2, 3 are the components of the momentum operator, and $\Omega_a$ is the internal Hamiltonian, describing the local time evolution of the internal DOFs. Note that we can restrict ourselves to an effectively one-dimensional scenario, so only one of the spatial coordinates has been kept in the above expression. In the first post-Newtonian expansion and considering that both the mass and the clock follow fixed world lines at constant $R_a$ and $r_M$, respectively, Eq. (9) becomes
$$H_a = \Omega_a\left(1 + \frac{\Phi(R_a - r_M)}{c^2}\right). \qquad (10)$$
The asymptotic time coordinate t defines space-like hypersurfaces that are independent of the location of the mass and on which one can define states of all the involved systems (the clocks, the target systems and the mass itself), and Hamiltonian (10) describes their time evolution with respect to t.
Due to the interactions between the mass and the clocks (effected by the space-time metric, which contains the potential $\Phi(R_a - r_M)$), the time evolution of the clocks depends on their relative distance $R_a - r_M$ to the mass. Crucially, by the definition of $t$ and the Hamiltonian, our description includes both mass configurations considered here: the mass can be semi-classically localised around a single spatial coordinate $r$ or in a superposition of different spatial coordinates, and the associated states belong to the same Hilbert space associated with a space-like hypersurface labelled by $t$. We thus have all the tools to analyse time evolution in the presence of a superposition state of the mass, even though it leads to a quantifiably non-classical causal structure.
With respect to $t$ and the associated foliation of space-time, the evolution of the clock, which at $t = 0$ is in an internal state $|s_a(\tau_0)\rangle$, where $\tau_0$ denotes the clock's proper time at $t = 0$, reads $$e^{-i\Omega_a t\left(1 + \frac{\Phi(R_a - r_M)}{c^2}\right)}|R_a\rangle|s_a(\tau_0)\rangle = |R_a\rangle|s_a(\tau_0 + \tau(R_a - r_M, t))\rangle, \quad (11)$$ where $\tau(R_a - r_M, t) := t\left(1 + \frac{\Phi(R_a - r_M)}{c^2}\right)$ is the proper time elapsing for the clock at a radial distance $|R_a - r_M|$ from the mass when the elapsed coordinate time is $t$; for clarity we set $\hbar = 1$. Before continuing on to the gravitational quantum control, we give an example of an internal Hamiltonian, state, and evolution. Let us take $\Omega_a = E_0|0\rangle\langle 0| + E_1|1\rangle\langle 1|$ and $|s_a(\tau_0 = 0)\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Writing $\tau = \tau(R_a - r_M, t)$, the evolution in Eq. (11) then yields $\frac{1}{\sqrt{2}}\left(e^{-iE_0\tau}|0\rangle + e^{-iE_1\tau}|1\rangle\right)$, which is simply $|s_a(\tau(R_a - r_M, t))\rangle$.
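The clock example above can be checked numerically. The following is a minimal sketch, assuming a Newtonian potential $\Phi = -GM/r$ and restoring SI units only for the time-dilation lag; the function names and the sample numbers are illustrative and not taken from the paper. It returns the lag $\tau - t$ rather than $\tau$ itself, because $1 + \Phi/c^2$ rounds to 1 in double precision for the potentials considered here.

```python
import numpy as np

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s


def proper_time_lag(t, distance, mass):
    """Return tau - t = t * Phi / c^2, assuming Phi = -G * mass / distance."""
    return t * (-G * mass / distance) / C**2


def clock_state(tau, e0, e1):
    """Balanced two-level clock state after proper time tau (hbar = 1)."""
    return np.array([np.exp(-1j * e0 * tau), np.exp(-1j * e1 * tau)]) / np.sqrt(2)


def overlap(tau_1, tau_2, e0, e1):
    """|<s(tau_1)|s(tau_2)>|: 1 for identical readings, 0 for distinguishable ones."""
    return abs(np.vdot(clock_state(tau_1, e0, e1), clock_state(tau_2, e0, e1)))


# Illustrative values (assumptions): a 1e-7 kg mass, a clock 1 fm away, one hour.
lag = proper_time_lag(3600.0, 1e-15, 1e-7)

# A clock with energy gap 2*pi*nu becomes orthogonal after 1 / (2 * nu).
nu = 1e15
distinguishability = overlap(0.0, 1.0 / (2.0 * nu), 0.0, 2.0 * np.pi * nu)
```

The lag is negative, since a clock deeper in the potential runs slow relative to the far-away agent, and the overlap vanishes once the two readings differ by half a period of the clock transition.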
We now use the above to show how the quantum superposition principle and general relativity lead to the prediction that quantised matter acts as a quantum control of temporal order. To this end, we assume conditions (a)-(c) from the Results section and consider two clocks positioned at $R_a$ and $R_b$, respectively. The Hamiltonian of clock $a$ is thus given by Eq. (10), and fully analogously for $b$. The clocks are initially synchronised with each other and with a clock of the distant agent so that at $t_0 = 0$ both clocks are at $\tau_0 = 0$. We further consider a target system, for example, a mode of the electromagnetic field, initially in a state $|\psi\rangle_S$, on which an operation $O_A$ is performed at an event $A = (R_a, \tau_a = \tau^*)$ and an operation $O_B$ at an event $B = (R_b, \tau_b = \tau^*)$, where $\tau_a$, $\tau_b$ refer to the proper times of the clocks $a$, $b$, respectively. We effectively represent these operations as $\mathcal{O}_A = \delta(\tau_a - \tau^*, r - R_a)\,O_A$, where $\delta(\tau_a - \tau^*, r - R_a)$ is a Dirac delta distribution and $O_A$ is an operator (e.g. describing rotation of the polarisation of an electromagnetic field mode by a particular half-wave plate) independent of time and location. The total Hamiltonian reads which for simplicity assumes trivial time evolution of the mass and of the target system between the application of the operations. We furthermore consider the following initial (at $t_0 = 0$) state of the mass, clocks and the target system: where positions $r_L$, $r_R$ of the mass refer to the configurations in the left and the right panel of Fig. 1, respectively, that is, they realise configurations $K_{A \prec B}$ and $K_{B \prec A}$: for $|r_L\rangle$ the mass is at a distance $r_a = r_L - R_a$ from clock $a$ and at $r_b = r_a - h$ from $b$, while for $|r_R\rangle$ the relative distances are swapped and the mass is at a distance $r_a - h$ from $a$ and at $r_a$ from $b$.
After coordinate time $t$ such that $\tau(r_a, t) > \tau^*$ (where $\tau^* > \frac{2 r_b^2 c}{GM}$, see main text) the state evolves to The order of applying unitary transformations $U_A = e^{-iO_A}$ and $U_B = e^{-iO_B}$ to the target system is controlled by the position of the mass, which due to time dilation changes causal relations between events $A$ and $B$. Swapping the mass distribution, $|r_L\rangle \to |r_R\rangle$, $|r_R\rangle \to |r_L\rangle$, and letting the state evolve for another time interval $t$ results in the final state where the clocks become synchronised again, with $\tau_f = \tau(r_a, t) + \tau(r_a - h, t)$. Measuring the mass in a superposition basis $|r_L\rangle_M \pm |r_R\rangle_M$ prepares the target system in the corresponding superposition state $U_B U_A|\psi\rangle_S \pm U_A U_B|\psi\rangle_S$. The above example demonstrates that under very conservative assumptions a spatial superposition of a mass generates a quantum-controlled application of unitary operations. More fundamentally, this effect stems from the superposition of different causal structures associated with the superposed states of the mass.
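The quantum-controlled application of the two unitaries can be illustrated with a small numerical sketch. It models only the mass (as a two-level control) and a qubit target, and leaves out the clocks, which have resynchronised and factor out at the end of the protocol. The specific unitaries and input state are assumptions chosen for illustration, not values fixed by the text.

```python
import numpy as np


def controlled_order_state(u_a, u_b, psi):
    """Joint mass-target state (|r_L>|U_B U_A psi> + |r_R>|U_A U_B psi>) / sqrt(2).

    Row 0 holds the target amplitude for |r_L> (A before B),
    row 1 the amplitude for |r_R> (B before A).
    """
    a_first = u_b @ u_a @ psi
    b_first = u_a @ u_b @ psi
    return np.vstack([a_first, b_first]) / np.sqrt(2)


def project_mass(state, sign):
    """Project the mass onto (|r_L> + sign * |r_R>) / sqrt(2).

    Returns the unnormalised target state; its squared norm is the
    probability of that measurement outcome.
    """
    return (state[0] + sign * state[1]) / np.sqrt(2)


# Illustrative choice (assumption): U_A = sigma_z, U_B = (1 + i sigma_x)/sqrt(2).
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
u_a = sigma_z
u_b = (np.eye(2) + 1j * sigma_x) / np.sqrt(2)
psi = np.array([1, 0], dtype=complex)

state = controlled_order_state(u_a, u_b, psi)
plus = project_mass(state, +1)    # proportional to (U_B U_A + U_A U_B)|psi>
minus = project_mass(state, -1)   # proportional to (U_B U_A - U_A U_B)|psi>
```

With this choice the two orders produce orthogonal target states, so the two measurement outcomes of the mass are equally likely and leave the target in orthogonal states.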
Proof of Bell's theorem for temporal order. Bell's theorem in general asserts that, under certain assumptions, the correlations between the outcomes of independent measurements on two subsystems must satisfy a class of inequalities. The two measuring parties are referred to as Alice and Bob. In every experimental run, each of them measures one of two properties of the subsystem they receive. For each of the properties, one of two outcomes is obtained, for convenience chosen to be ±1. Bell's inequalities follow from the conjunction of the following assumptions: (1) measurement results are determined by properties that exist prior to and independent of the experiment (hidden variables); (2) results obtained at one location are independent of any measurements or actions performed at space-like separation (locality); (3) any process that leads to the choice of which measurement will be carried out is independent from other processes in the experiment (free choice). The outcomes of Alice, $A(i, \lambda)$, and Bob, $B(j, \lambda)$, thus only depend on their own choice of setting, indices $i$ and $j$, and on the property of the system, variable $\lambda$. The correlation between outcomes $A(i, \lambda)$ and $B(j, \lambda)$ for the measurement choices $i$, $j$ is described by $E(A_i, B_j) = \int d\lambda\, P(\lambda) A(i, \lambda) B(j, \lambda)$, where $P(\lambda)$ is the probability distribution over the properties of the systems. It is straightforward to check that one possible inequality satisfied by the correlations $E(A_i, B_j)$ is the so-called Clauser-Horne-Shimony-Holt inequality: $|E(A_1, B_1) + E(A_1, B_2) + E(A_2, B_1) - E(A_2, B_2)| \leq 2$. Crucially, quantum theory allows for the left-hand side of this inequality to reach a value >2, and experimental measurements of this (and other inequalities) have confirmed such a violation [3][4][5][6] . The significance of the violations of Bell's inequalities is in showing that neither nature nor quantum mechanics obeys all three assumptions mentioned above.
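The classical bound can be verified by brute force. The sketch below assumes the common sign convention $E(A_1,B_1) + E(A_1,B_2) + E(A_2,B_1) - E(A_2,B_2)$ for the Clauser-Horne-Shimony-Holt combination and enumerates every deterministic assignment of outcomes ±1. Any distribution $P(\lambda)$ is a convex mixture of these assignments, so the largest deterministic value also bounds every local hidden-variable model.

```python
from itertools import product


def chsh(a, b):
    """CHSH combination for deterministic outcomes a[i], b[j] in {+1, -1}."""
    return a[0] * b[0] + a[0] * b[1] + a[1] * b[0] - a[1] * b[1]


def classical_bound():
    """Largest |CHSH| over all 16 deterministic local strategies."""
    outcomes = (+1, -1)
    return max(
        abs(chsh((a1, a2), (b1, b2)))
        for a1, a2, b1, b2 in product(outcomes, repeat=4)
    )


bound = classical_bound()
```

The reason the bound holds is visible in the algebra: the combination equals $a_1(b_1 + b_2) + a_2(b_1 - b_2)$, and for outcomes ±1 one of the two brackets is always zero while the other is ±2.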
The assumption of classical order is sufficient to derive causal inequalities 16,60 : tasks that, without any further assumptions, cannot be performed on a classical causal structure. However, it is not possible to violate causal inequalities using quantum control of order 45,61 , which is why additional assumptions were required in the present context. It is an open question whether a gravitational implementation of a scenario that does allow for a violation of causal inequalities is possible.
The theorem we have formulated is theory-independent, but not fully device-independent, as it requires the notions of a physical state and a physical transformation (in addition to the measured probability distributions). We introduce these notions below and then proceed to the proof. Discussion of the present work in the context of the theory-dependent framework of causally non-separable quantum processes 16,45,61 and the fully theory- and device-independent approach of causal inequalities 16,60 is presented in Supplementary Note 1.
We consider a sufficiently broad framework to describe physical systems that can undergo transformations and measurements, similar to generalised probabilistic theories [62][63][64] . This framework is more general than quantum or classical theory and we thus need to define key notions required in the proof. In this framework, a state $\omega$ is a complete specification of the probabilities $P(o|i, \omega)$ for observing outcome $o$ given that a measurement with setting $i$ is performed on the system. We are interested in situations where a system can be split up into subsystems, say $S_1$ and $S_2$, with space-like separated agents performing independent operations on $S_1$ and $S_2$. We say $\omega$ is a product state, and write $\omega = \omega_1 \otimes \omega_2$, if probabilities for local measurements factorise as $P(o_1, o_2|i_1, i_2, \omega) = P(o_1|i_1, \omega_1)P(o_2|i_2, \omega_2)$. If state $\omega_1^f$ is prepared for system $S_1$ and state $\omega_2^f$ is prepared for system $S_2$, according to a probability distribution $P(f)$ for some variable $f$, we write $\omega = \int df\, P(f)\, \omega_1^f \otimes \omega_2^f$ and say the state is separable. Probabilities are then given by the corresponding mixture: $P(o_1, o_2|i_1, i_2, \omega) = \int df\, P(f)\, P(o_1|i_1, \omega_1^f) P(o_2|i_2, \omega_2^f)$. Note that for such a decomposition Bell inequalities cannot be violated 1,65 .
A physical transformation of the system is represented by a function $\omega \mapsto T(\omega)$. To make our arguments precise we need a notion of local transformations, namely, realised at the time and location defined by a local clock. If $S_1$ is the subsystem on which a local transformation $T_1$ acts, and $S_2$ labels the DOFs space-like separated from $T_1$, then, by definition, $T_1$ transforms product states as $\omega_1 \otimes \omega_2 \mapsto T_1(\omega_1) \otimes \omega_2$ and separable states by convex extension. How local operations act on general, non-separable states can depend on the particular physical theory; however, action on separable states will suffice for our purposes. We further need to define how the local transformations combine. This depends on their relative spatio-temporal locations: if transformations $T_1$, $T_2$ are space-like separated they combine as $(T_1 \otimes T_2)(\omega_1 \otimes \omega_2) = T_1(\omega_1) \otimes T_2(\omega_2)$, which follows from the definition above; if $T_1$ is in the future of $T_2$, we define their combination as $T_1 \circ T_2(\omega) = T_1(T_2(\omega))$. (For simplicity, we omit possible additional transformations taking place between the specified events, as they are of no consequence for our argument.)
Proof. Assumption (1) says that there is a random variable $f$ determining the local states $\omega_1^f$, $\omega_2^f$ of systems $S_1$, $S_2$, respectively. Assumption (3) says there is a random variable $\lambda$ that determines the order of events. In general, the two variables can be correlated by some joint probability distribution $P(\lambda, f)$. By assumption (4), events labelled $A_1$, $B_1$ are space-like separated from events $A_2$, $B_2$ and the order between events within each set $(A_j, B_j)$, $j = 1, 2$ can be defined by a permutation $\sigma_j$. Most generally, there is a probability $P(\sigma_j|\lambda)$ that the permutation $\sigma_j$ is realised for a given $\lambda$.
By assumption (2), for each given order the system undergoes a transformation $T_{\sigma_1} \otimes T_{\sigma_2}$, where $T_{\sigma_1}$ is the transformation obtained by composing $T_{A_1}$ and $T_{B_1}$ in the order corresponding to the permutation $\sigma_1$ and similarly for $T_{\sigma_2}$. (For example, if $\sigma_1$ corresponds to the order $A_1 \prec B_1$, then $T_{\sigma_1} = T_{B_1} \circ T_{A_1}$.) Furthermore, at event $D$ an outcome $z$ is obtained with a probability $P(z|\lambda, f, \sigma_1, \sigma_2)$. Finally, using assumption (1), we write the probabilities for all outcomes as A simple Bayesian inversion $P(\sigma_1|\lambda)P(\sigma_2|\lambda)P(z|\lambda, f, \sigma_1, \sigma_2)P(\lambda, f) = P(\lambda, f, \sigma_1, \sigma_2|z)P(z)$, where we used $P(\sigma_j|\lambda) = P(\sigma_j|\lambda, f)$, gives the desired probabilities where $\tilde{f}$ is a short-hand for the variables $\lambda$, $f$, $\sigma_1$, and $\sigma_2$. The above probability distribution satisfies the hypothesis of Bell's theorem and thus cannot violate any Bell inequality.
Exemplary scenario realising Bell test for temporal order of events. The protocol allowing for the violation of Bell's inequalities for temporal order exploits correlations between the clocks of the agents $a_1$, $b_1$ and the agents $a_2$, $b_2$, created due to time dilation induced by the mass. It should be noted that the protocol allows maximal violation of the Bell inequality if the joint state of the systems $S_1$ and $S_2$ is pure (and maximally entangled) when the Bell measurements are realised. Thus, for a maximal violation, the clocks need to decorrelate from the mass after the application of the unitaries. Below we sketch a scenario that can achieve this. The space-time arrangement of the mass and the agents in this example is presented in Fig. 4. It can be realised in one spatial dimension: agents acting on the system $S_1$ are located at distance $h$ from each other, and the mass is placed at distance $r$ (configuration $K_{B \prec A}$) or $r + L$ (configuration $K_{A \prec B}$) from agent $a_1$. Agents acting on system $S_2$ are placed symmetrically on the opposite side of the mass, such that the mass is at a distance $r + L$ from $a_2$ in configuration $K_{B \prec A}$ and $r$ in configuration $K_{A \prec B}$. Here, events $A_j$ are defined by the local time $\tau_a$ that differs from the local time $\tau_b$ defining $B_j$, $j = 1, 2$. In such a case, even though the mass is always closer to $a_j$ than to $b_j$, the two mass configurations can lead to different event orders, as they induce different relative time dilations. (Equivalently, one can introduce an initial offset in the synchronisation of the clocks.) Note that the time orders between the two groups are here "anti-correlated": $A_1 \prec B_1$ and $B_2 \prec A_2$ for $K_{A \prec B}$, and vice versa for $K_{B \prec A}$. Since otherwise the scenario is the same for $S_1$ and $S_2$, we focus on the operations performed on $S_1$. The key observation is that swapping the mass distribution, as depicted in Fig. 4, will eventually disentangle the clocks from the mass, and since the clocks must be suitably time dilated when the operations are performed, the operations must not take place in the future light cone of the swapped mass state.
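The difference between entangled and classically mixed temporal orders can be made concrete with a short numerical sketch. It assumes qubit target systems, the illustrative unitaries $U_A = \sigma_z$ and $U_B = (1 + i\sigma_x)/\sqrt{2}$ acting on $|0\rangle$ (chosen here so that the two orders give orthogonal states; they are not specified in this section), and that the mass has already been measured in the superposition basis. The maximal CHSH value of a two-qubit state is evaluated with the Horodecki criterion, $2\sqrt{m_1 + m_2}$, where $m_1$, $m_2$ are the two largest eigenvalues of $T^T T$ and $T_{ij} = \mathrm{Tr}[\rho\, \sigma_i \otimes \sigma_j]$.

```python
import numpy as np

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def max_chsh(rho):
    """Maximal CHSH value of a two-qubit state (Horodecki criterion)."""
    t = np.array([[np.trace(rho @ np.kron(si, sj)).real for sj in PAULIS]
                  for si in PAULIS])
    eig = np.sort(np.linalg.eigvalsh(t.T @ t))
    return 2.0 * np.sqrt(eig[-1] + eig[-2])


def projector(vec):
    """Density matrix |v><v| of the normalised vector v."""
    vec = vec / np.linalg.norm(vec)
    return np.outer(vec, vec.conj())


psi = np.array([1, 0], dtype=complex)
u_a = PAULIS[2]
u_b = (np.eye(2) + 1j * PAULIS[0]) / np.sqrt(2)
a_first = u_b @ u_a @ psi   # order A before B
b_first = u_a @ u_b @ psi   # order B before A

# Anti-correlated orders: S1 sees A before B exactly when S2 sees B before A.
branch_1 = np.kron(a_first, b_first)
branch_2 = np.kron(b_first, a_first)

entangled = projector(branch_1 + branch_2)                       # entangled orders
mixture = 0.5 * projector(branch_1) + 0.5 * projector(branch_2)  # classical mixture

chsh_entangled = max_chsh(entangled)
chsh_mixture = max_chsh(mixture)
```

Under these assumptions the entangled-order state reaches the maximal quantum value $2\sqrt{2}$, whereas the classical mixture of the same two orders is separable and does not exceed 2, in line with the proof above.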
The proper time $\tau_a$ that has to elapse for the clock of $a_1$ such that the order of events is $A_1 \prec B_1$ for $|K_{A \prec B}\rangle$ and $B_1 \prec A_1$ for $|K_{B \prec A}\rangle$ for the present case reads where $T_c(r, L/2)$ is the coordinate travel time of light between radial distances $r$ and $r + L/2$ from the mass. The coordinate time corresponding to $\tau_a$ is $T_a = \tau_a/\sqrt{-g_{00}(r)}$. The proper time of event $B_1$ is then defined as: It can directly be checked that when the mass is placed in configuration $K_{A \prec B}$, at a distance $r + L$ from $a_1$, the event $A_1$ defined by the local clock of $a_1$ showing proper time $\tau_a$ from Eq. (18) is in the past light cone of event $B_1$, which is defined by the local clock of $b_1$ showing proper time $\tau_b$ from Eq. (19). When the mass is placed in configuration $K_{B \prec A}$, event $B_1$ ends up in the past of the event $A_1$. The coordinate time required for the application of the operations can be estimated as twice the travel time of light between the agents, $T_o = 2T_c(r + L/2, h)$.
The world lines of the mass can be arranged such that: (a) the mass is moving slowly, so that the two amplitudes of the mass are swapped in a time interval longer than $T_o$; (b) during the application of the operations the distance of each agent to the mass is approximately the same for both mass configurations (as in Fig. 4). The first condition guarantees that there is enough time to apply the operations after the clocks get correlated; the second guarantees that the slow-down of light in curved space-time, the Shapiro delay 66,67 , can be neglected.
The coordinate-time duration of the entire protocol can be estimated as $T_p = 2T_a + 4L/2c$, where $L/2c$ is the minimal time required to put the mass in superposition of amplitudes separated by the distance $L/2$. Taking as an example $M \sim 0.1$ mg, $L = h \sim 0.1$ μm, $r \sim 1$ fm, the protocol in Fig. 4 takes $T_p \sim 10$ h. Furthermore, we note that a quantum treatment of the local clocks is central to our protocol since the application of the operations on the target systems is conditioned on the states of the clocks. The time-energy uncertainty 68,69 thus poses a limitation to the single-shot precision with which space-time events can be defined with physical clocks. The optimal clock state in this context, that is, the one evolving the fastest, is a balanced superposition of energy eigenstates; for an energy gap $\hbar \cdot 2\pi\nu_c$, where $\nu_c$ is the clock frequency, the smallest time that can be resolved by a single quantum system is the so-called orthogonalisation time 70-72 $t_\perp = 1/2\nu_c$. For the values of parameters quoted above, the coordinate-time difference between the superposed locations of the events $A_i$, $i = 1, 2$ is $\sim 10^{-15}$ s, and we thus need a system with frequency $\nu_c \geq 10^{15}$ Hz such as a clock based on optical transitions in ytterbium 73 or mercury 74 , which both give $t_\perp \sim 10^{-16}$ s. While this ideal limit is not reached with practical systems, the resolution of current atomic clocks based on such atoms far exceeds this theoretical bound due to averaging over many atoms, with $2.5 \times 10^{-19}$ uncertainty of the clock frequency recently demonstrated in ref. 75 . We further note that by using $n$ entangled atoms, the orthogonalisation time of the entire system becomes $t_\perp/n$ and can thus be even a few orders of magnitude smaller 76 than required. Finally, such atoms have masses $\sim 10^{-25}$ kg and their back action on the metric produced by $M \sim 10^{-7}$ kg would thus be negligible.
Since the mass difference between the atom in the two involved energy levels is $2\pi\hbar\nu_c/c^2 \sim 10^{-35}$ kg, quantum effects from the clocks' mutual gravitational interactions 58 can also be neglected.
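The clock-related orders of magnitude quoted above follow from two one-line formulas, as the sketch below shows. It assumes SI values for the constants and takes $\nu_c = 10^{15}$ Hz as a representative optical frequency; the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.998e8              # speed of light, m/s


def orthogonalisation_time(nu_c, n_atoms=1):
    """t_perp = 1 / (2 * nu_c), reduced by a factor n for n entangled atoms."""
    return 1.0 / (2.0 * nu_c * n_atoms)


def mass_difference(nu_c):
    """Mass equivalent of the clock energy gap, 2 * pi * hbar * nu_c / c^2, in kg."""
    return 2.0 * math.pi * HBAR * nu_c / C**2


nu_c = 1e15                               # Hz, representative optical transition
t_perp = orthogonalisation_time(nu_c)     # seconds
delta_m = mass_difference(nu_c)           # kg
```

For this frequency the orthogonalisation time is a fraction of a femtosecond and the mass difference is below $10^{-35}$ kg, some ten orders of magnitude smaller than the atomic mass of $\sim 10^{-25}$ kg.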
We conclude that it is in principle possible to achieve the required entanglement of orders, swap the mass distribution so as to finally disentangle the clocks from the mass, and satisfy the locality conditions on the events. Although a direct experiment in such a regime is not practical, the above example surprisingly shows that the regime where entangled temporal order arises is in no way related to the Planck scale. It is usually assumed that the Planck scale marks the regime where quantum gravity effects become relevant (first discussed in this context by Bronstein 77 ), but this is not the case for the superposition of temporal order. In terms of a potential experiment, one could also take a different (theory-dependent) approach and explore possible witnesses of entangled temporal order 61 , in analogy to witnesses of entanglement in quantum-information theory 78 . A witness would probe the quantum nature of temporal order indirectly and under further assumptions, but in a relaxed parameter range. Such an approach may lead to more feasible experiments, which will be explored in a future study.
A spatial superposition state of a mass such as used in our protocol is postulated to decohere in various gravity-inspired collapse models 9-13 (which thus violate assumption (c) in the first section). However, even if endorsed, these models do not immediately preclude realisation of our protocol: the decoherence time scale in those models is the Diósi-Penrose time 10,11 $T_{DP} = \frac{2\delta^3 \hbar}{G(ML)^2}$, where $\delta$ is a free parameter. For every value of $\delta$ one can find the mass and the relevant distances ($M$, $r$, $L$, $h$) so that the duration of our entire protocol is shorter than $T_{DP}$. For example, following the recent ref. 79 and taking $\delta = 10^{-7}$ m, for $r = 10^{10} R_{Sch}$, $L = 5r$, $h = r$ and $M = 1$ g, where $R_{Sch} \approx 10^{-30}$ m, the protocol from Fig. 4 takes $T_p \approx 7 \times 10^{-18}$ s, while $T_{DP} \approx 0.5$ s. Taking instead the originally proposed value $\delta = 10^{-15}$ m 10 , the desired regime is achieved, for example, for $M = 10^{-7}$ kg, $r = 10^7 R_{Sch}$, $L = 5 \times 10^5 r$, $h = 10^5 r$, with $T_p \sim 10^{-23}$ s and $T_{DP} \sim 10^{-13}$ s. Thus, the above models in principle still allow for events with entangled temporal order, and do not enforce the classicality of the causal structure of space-time.
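The two decoherence estimates can be reproduced with a few lines of arithmetic. The sketch assumes the Diósi-Penrose expression $T_{DP} = 2\delta^3\hbar / (G (ML)^2)$ and the Schwarzschild radius $R_{Sch} = 2GM/c^2$, with SI constants; the function names are illustrative.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34  # reduced Planck constant, J s
C = 2.998e8        # speed of light, m/s


def schwarzschild_radius(mass):
    """R_Sch = 2 G M / c^2 in metres."""
    return 2.0 * G * mass / C**2


def diosi_penrose_time(delta, mass, separation):
    """T_DP = 2 delta^3 hbar / (G (M L)^2) in seconds."""
    return 2.0 * delta**3 * HBAR / (G * (mass * separation) ** 2)


# First parameter set: delta = 1e-7 m, M = 1 g, r = 1e10 R_Sch, L = 5 r.
mass_1 = 1e-3
r_1 = 1e10 * schwarzschild_radius(mass_1)
t_dp_1 = diosi_penrose_time(1e-7, mass_1, 5.0 * r_1)

# Second parameter set: delta = 1e-15 m, M = 1e-7 kg, r = 1e7 R_Sch, L = 5e5 r.
mass_2 = 1e-7
r_2 = 1e7 * schwarzschild_radius(mass_2)
t_dp_2 = diosi_penrose_time(1e-15, mass_2, 5e5 * r_2)
```

Under these assumptions the first set gives a decoherence time of roughly half a second and the second a time between $10^{-13}$ s and $10^{-12}$ s, both many orders of magnitude longer than the quoted protocol durations.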

Data availability
The data that support the plots within this paper and other findings of this study are available from the corresponding author upon reasonable request.

Fig. 4 Space-time diagram of a protocol for disentangling the clocks from the mass. In configuration $K_{A \prec B}$ the mass is at a distance $r + L$ from $a_1$, and at $r + L + h$ from $b_1$. In $K_{B \prec A}$ it is at $r$ from $a_1$ and at $r + h$ from $b_1$. The opposite holds for $a_2$, $b_2$. The initial mass superposition is swapped (after sufficient time to prepare the clocks in the correlated state) so that they finally show the same time. At the local time $\tau_a$ of $a_1$ (at event $A_1$) the agent applies $U_{A_1}$ on $S_1$. At the local time $\tau_b$ of $b_1$ the agent applies $U_{B_1}$ on $S_1$. For the mass configuration $K_{A \prec B}$, $A_1$ is before $B_1$ (orange-coloured events), while for $K_{B \prec A}$ event $B_1$ is before $A_1$ (blue-coloured events). The opposite order holds for events $A_2$, $B_2$ occurring on the opposite side of the mass, where agents $a_2$, $b_2$ act on $S_2$. Unitary operations should be applied in the future light cone of the event where the clocks get correlated and outside the future light cone of the event when the mass amplitudes are swapped, Bell measurements (at $C_1$, $C_2$) should be made when the clocks become disentangled (at future light-like events to when the mass amplitudes are brought together), and the measurement at event $D$ should be space-like to $C_1$, $C_2$; dashed yellow lines represent the relevant light cones.