Cyclic quantum causal models

Causal reasoning is essential to science, yet quantum theory challenges it. Quantum correlations violating Bell inequalities defy satisfactory causal explanations within the framework of classical causal models. What is more, a theory encompassing quantum systems and gravity is expected to allow causally nonseparable processes featuring operations in indefinite causal order, defying that events be causally ordered at all. The first challenge has been addressed through the recent development of intrinsically quantum causal models, allowing causal explanations of quantum processes – provided they admit a definite causal order, i.e. have an acyclic causal structure. This work addresses causally nonseparable processes and offers a causal perspective on them through extending quantum causal models to cyclic causal structures. Among other applications of the approach, it is shown that all unitarily extendible bipartite processes are causally separable and that for unitary processes, causal nonseparability and cyclicity of their causal structure are equivalent.

There has been growing interest in higher-order quantum processes in which separate operations do not occur in a definite causal order (see, e.g., refs.  for a selection). This property, called causal nonseparability 3,6,7,23 , was formalized within the process matrix framework 3 , which describes correlations between quantum nodes of intervention without assuming a predefined order between the nodes. Challenging conventional notions of causality, causally nonseparable processes have been shown to allow informational tasks that cannot be achieved with operations used in a definite order 4,5,8,24 . Such processes have been conjectured to be relevant in the context of quantum gravity [1][2][3]25 and closed time-like curves 2,3,11,22,26,27 , but some are also known to admit realizations in standard quantum mechanics on time-delocalized systems 18 . A prominent example is the quantum SWITCH, which has been demonstrated experimentally [28][29][30][31][32][33] .
On a separate front, there is the recent development of the framework of quantum causal models 34,35 (see, e.g., refs. [36][37][38][39][40][41][42][43][44][45][46] for related, previous work) as a fully quantum version of the classical framework of causal models 47,48 . It is formulated within the formalism of process matrices, but contains the classical causal models as special cases and generalizes many of the fundamental concepts and core theorems of the latter. Quantum causal models thus constitute a general framework for reasoning about quantum systems in causal terms, allowing the rigorous study of the empirical constraints imposed by quantum causal structures; however, this holds only for causal structures that are expressible as directed acyclic graphs (DAGs), i.e., where there is a well-defined causal order. The central idea behind the approach in refs. 34,35 is that causal relations between quantum systems, as encoded in a DAG, correspond to influence through underlying unitary transformations. This facilitated, in particular, a justification of the quantum Markov condition relative to a DAG that underpins the definition of a quantum causal model: any such model can be thought of as arising from a unitary circuit fragment with a compatible causal structure by marginalizing over latent local disturbances 35 .
It is a natural question whether these hitherto separate lines of research can be merged to arrive at a causal model perspective on processes that are not compatible with a fixed order of the quantum nodes. While this direction of thought has been considered in earlier work (see, e.g., refs. 45,49,50 ), it was previously not clear how to take the idea forward due to various conceptual and technical obstacles, including, for example, how quantum nodes and the quantum Markov condition should be defined, how the notion of the autonomy of causal mechanisms should be understood 49 , and how to prevent paradoxes.
This work overcomes these obstacles by generalizing the approach to quantum causal models of refs. 34,35 . A large class of processes that are not compatible with a fixed order of the nodes can then be understood to have a causal structure, albeit one that includes directed cycles. This may appear counterintuitive, but the process matrix framework guarantees that it is free of paradoxes. The motivation for entertaining such a proposal is twofold. First, in light of the puzzling nature of causally nonseparable processes and the open question of which ones are physically possible in nature, a conceptual clarification of causal structure is an important next step. Second, our approach yields mathematical tools facilitating new technical results, including a more fine-grained description of the compositional structure of a process that is implied by its causal properties. One of the implications of the latter derived in this work is a proof that all bipartite processes that admit a unitary extension 51 are causally separable. We also prove that for unitary processes, causal nonseparability and cyclicity of their causal structure are equivalent.

Results
The process formalism, causal order and signalling. Let us start by setting out some necessary background and essential concepts. In quantum theory, a system $A$ is associated with a complex Hilbert space $\mathcal{H}_A$, and its state is a density operator $\rho_A \in \mathcal{L}(\mathcal{H}_A)$, where $\mathcal{L}(\mathcal{H}_A)$ is the space of linear operators over $\mathcal{H}_A$. The most general evolution of a system, assuming that it is initially uncorrelated with its environment, is given by a completely positive trace-preserving (CPTP) map $\mathcal{E} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B)$, where this notation allows the output system to be different from the input system. The most general operation that an agent can perform from an input system $A$ to an output system $B$ has a classical outcome $k$ and specifies the transformation from $A$ to $B$ conditioned on each value of $k$ being obtained. Mathematically, the operation corresponds to a quantum instrument, which is a collection of completely positive (CP) maps $\{\mathcal{E}^k : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B)\}$, such that $\mathcal{E} = \sum_k \mathcal{E}^k$ is a trace-preserving CP map. It is convenient to represent CP maps with operators, via a variant of the Choi-Jamiołkowski (CJ) isomorphism 52,53 , which to a given CP map $\mathcal{E} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B)$ associates the CJ operator $\rho^{\mathcal{E}}_{B|A} := \sum_{i,j} \mathcal{E}(|i\rangle_A\langle j|) \otimes |i\rangle_{A^*}\langle j|$, where $\{|i\rangle_A\}$ is an orthonormal basis of $\mathcal{H}_A$, and $\{|i\rangle_{A^*}\}$ the corresponding dual basis. The CJ operator for a CPTP map $\mathcal{E}$ satisfies $\mathrm{Tr}_B[\rho^{\mathcal{E}}_{B|A}] = \mathbb{1}_{A^*}$. This variant of the CJ isomorphism is used in refs. 34,35 , and has the advantage that the CJ operator is both positive semidefinite and independent of the basis used in its definition.
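The two stated properties of the CJ operator (positive semidefiniteness and the partial-trace condition for CPTP maps) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the map is given by Kraus operators, identifies the dual basis with the computational basis so that the CJ operator becomes an ordinary matrix, and uses an amplitude-damping channel purely as an illustrative input. The helper names `cj_operator` and `trace_out_first` are hypothetical.

```python
import numpy as np

def cj_operator(kraus_ops):
    """Matrix of sum_{i,j} E(|i><j|) (x) |i><j| for the CP map with the
    given Kraus operators. Tensor ordering: output B first, then A*
    (the dual basis is identified with the computational basis here)."""
    d_out, d_in = kraus_ops[0].shape
    rho = np.zeros((d_out * d_in, d_out * d_in), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            image = sum(K @ e_ij @ K.conj().T for K in kraus_ops)
            rho += np.kron(image, e_ij)
    return rho

def trace_out_first(rho, d_first, d_second):
    """Partial trace over the first tensor factor."""
    t = rho.reshape(d_first, d_second, d_first, d_second)
    return np.einsum('iaib->ab', t)

# Illustrative CPTP map: amplitude damping with an assumed strength 0.3.
gamma = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)

rho_BA = cj_operator([K0, K1])
marginal = trace_out_first(rho_BA, 2, 2)       # should be the identity on A*
min_eig = np.linalg.eigvalsh(rho_BA).min()     # should be non-negative
```

Any other set of Kraus operators satisfying the completeness relation can be substituted for `K0`, `K1`; the two checks should continue to hold.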
The idea behind the process formalism 3 is that there is a fixed set of locations $A_i$, $i = 1, \ldots, n$, which in this work we call quantum nodes, at each of which an agent can perform an operation on a quantum system. A quantum node $A_i$ is associated with two Hilbert spaces, an input Hilbert space $\mathcal{H}_{A_i^{\mathrm{in}}}$ and an output Hilbert space $\mathcal{H}_{A_i^{\mathrm{out}}}$ (both here assumed finite-dimensional). The input Hilbert space carries the state of the input system just before the operation by the agent, and the output Hilbert space carries the state of the output system just after the operation. The operation itself corresponds to a quantum instrument $\{\mathcal{E}^{k_A}_A : \mathcal{L}(\mathcal{H}_{A^{\mathrm{in}}}) \to \mathcal{L}(\mathcal{H}_{A^{\mathrm{out}}})\}$. Conceptually, a quantum node is sometimes thought of as representing a small, localized laboratory in some region of spacetime, but may also be conceived more abstractly, for example as occupying a particular position in between the gates of a quantum circuit.
The aim of the process formalism is to describe the correlations between the outcomes of the operations that are performed at the separate quantum nodes. Given a set of instruments at the quantum nodes $A_1, \ldots, A_n$, the joint probability for their outcomes is given by
$$P(k_{A_1}, \ldots, k_{A_n}) = \mathrm{Tr}\left[\sigma_{A_1 \ldots A_n}\left(\tau^{k_{A_1}}_{A_1} \otimes \cdots \otimes \tau^{k_{A_n}}_{A_n}\right)\right], \qquad (1)$$
where $\tau^{k_A}_A := \left(\rho^{\mathcal{E}^{k_A}}_{A^{\mathrm{out}}|A^{\mathrm{in}}}\right)^T$, and $\sigma_{A_1 \ldots A_n} \in \mathcal{L}\left(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{in}}} \otimes \mathcal{H}^*_{A_i^{\mathrm{out}}}\right)$ is called the process operator, which we will also sometimes refer to more simply as the process. A process operator $\sigma_{A_1 \ldots A_n}$, which up to a different convention of the CJ isomorphism is the same as a process matrix 3 , obeys constraints designed to ensure that valid joint probabilities are returned by Eq. (1) for any possible choices of the operations performed by the agents, and that the same holds even when the agents have pre-shared entanglement. These constraints are 3 : $\sigma_{A_1 \ldots A_n} \geq 0$ and $\mathrm{Tr}[\sigma_{A_1 \ldots A_n}(\tau_{A_1} \otimes \cdots \otimes \tau_{A_n})] = 1$, for any set of CPTP maps $\{\tau_{A_i}\}$ at the $n$ nodes. Simple-to-check necessary and sufficient conditions for an operator in $\mathcal{L}\left(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{in}}} \otimes \mathcal{H}^*_{A_i^{\mathrm{out}}}\right)$ to be a process operator are known. When tracing over a node $A$, we will write $\mathrm{Tr}_A[\,\cdot\,] := \mathrm{Tr}_{A^{\mathrm{in}}(A^{\mathrm{out}})^*}[\,\cdot\,]$.
A question of central interest in the study of the process formalism has been whether a given process operator is compatible with the existence of a definite causal order of its nodes. A closely related question concerns how this relates to the possibilities for signalling between different nodes. Let us first make these notions more precise.
Consider the sequence of quantum operations represented in the form of a circuit in Fig. 1.
The gate in the circuit corresponds to an arbitrary CPTP map $\mathcal{E}$, and the initial preparation to an arbitrary bipartite state $\rho$. Quantum nodes $A$ and $B$ correspond to positions in the circuit in between gates, at which an agent can choose to perform a quantum instrument on the system at that position. The nodes are represented as broken wires, with it understood that the agent's instrument mediates the two pieces. The lower piece of the wire corresponds to the input Hilbert space of the quantum node and the upper piece of the wire to the output Hilbert space. Any circuit with some wires broken defines a partial order over the quantum nodes, with a node $N$ preceding node $N'$ in the partial order if and only if there is a path from $N$ to $N'$ along the (broken or unbroken) wires of the circuit. We call this partial order the causal order. In the example, node $A$ precedes node $B$ in the causal order. The circuit defines joint probabilities for the outcomes of any quantum instruments that are performed at nodes $A$ and $B$, and hence defines a process operator over the nodes $A$ and $B$.
An important concept is now that of causal separability, first introduced in ref. 3 for the bipartite case. A bipartite process $\sigma_{AB}$ is called causally separable iff it can be seen to arise as a convex mixture of processes with a fixed causal order between $A$ and $B$, i.e., $\sigma_{AB} = p\,\sigma^{B \npreceq A}_{AB} + (1-p)\,\sigma^{A \npreceq B}_{AB}$ with $p \in [0,1]$, where $\sigma^{B \npreceq A}_{AB}$ is a process that can arise from a sequence of operations of the form of Fig. 1, and $\sigma^{A \npreceq B}_{AB}$ is a process that can arise from a different sequence of operations, of the same form except that $B$ precedes $A$. Otherwise $\sigma_{AB}$ is causally nonseparable. The idea is that a causally separable process can be thought to describe a situation in which a well-defined, though possibly unknown, causal order of the nodes exists, whereas a causally nonseparable process is not compatible with such an interpretation.
The connection with signalling between the nodes is as follows. Given a bipartite process $\sigma_{AB}$, we say that there is no signalling from quantum node $B$ to quantum node $A$ if and only if for all quantum instruments $\{\tau^{k_A}_A\}$ at $A$ and all deterministic quantum instruments $\tau_B$ at $B$, the probability distribution $P(k_A)$ is independent of $\tau_B$. This condition is equivalent to $\sigma_{AB} = \sigma_{AB^{\mathrm{in}}} \otimes \mathbb{1}_{(B^{\mathrm{out}})^*}$ 3 . The connection between signalling and causal order is that in any sequence of operations of the form of Fig. 1, there is no signalling from $B$ to $A$. Moreover, every bipartite process operator with no signalling from $B$ to $A$ is known to have a realisation as the process operator arising from a circuit of the form of Fig. 1 54 .
The formal definition of the multipartite generalization of causal separability is more intricate than in the bipartite case: beyond just convex mixtures of fixed causal orders, the definition allows for a dynamical causal order in which the causal order at later quantum nodes can depend on the events taking place at earlier quantum nodes. This definition is postponed to a later section dedicated to causal separability. See also refs. 7,23 for a detailed discussion. A discussion of signalling in a multipartite process is also more involved, since whether a subset of quantum nodes can signal to another subset of quantum nodes depends on the interventions performed at other quantum nodes not in the two subsets.
Causal influence vs signalling. This work is concerned with a notion of causal structure, which is distinct from the causal order defined by a circuit, and which also needs to be carefully distinguished from the possibilities for signalling afforded by a general process operator. In order to motivate the idea, consider a circuit of the form of Fig. 1, with each wire representing a qubit, with $\rho = \rho_{A^{\mathrm{in}}} \otimes |0\rangle_{A'}\langle 0|$, and with the channel $\mathcal{E}$ being a quantum Controlled-NOT gate with the control on the output wire of the $A$ node. This circuit defines a process operator $\sigma^0_{AB}$ on the $A$ and $B$ nodes, which may easily be computed, and it can be verified that $\sigma^0_{AB}$ allows signalling from $A$ to $B$. Similarly, in the same circuit except with $\rho = \rho_{A^{\mathrm{in}}} \otimes |1\rangle_{A'}\langle 1|$, the process operator $\sigma^1_{AB}$ is easily computed, and it can be verified that signalling is possible from $A$ to $B$. Now consider the same experiment, except with the preparation of the $A'$ system given by flipping a fair coin, and preparing $|0\rangle_{A'}\langle 0|$ on heads and $|1\rangle_{A'}\langle 1|$ on tails. If the outcome of the coin flip is unknown, then the state of the $A'$ system is the maximally mixed state $\mathbb{1}/2$, and the corresponding process operator over $A$ and $B$ is $\sigma^{\mathrm{mix}}_{AB} = \frac{1}{2}\sigma^0_{AB} + \frac{1}{2}\sigma^1_{AB}$. In the process $\sigma^{\mathrm{mix}}_{AB}$, there is no signalling from $A$ to $B$. Indeed the very same process operator could arise from a situation in which $A$ and $B$ are independent and spacelike separated.
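The washing-out of signalling in the coin-flip scenario can be seen in a few lines of linear algebra. The sketch below is a simplification and rests on two assumptions that are not spelled out in the text: that the qubit arriving at $B$ is the target of the Controlled-NOT (with the control discarded), and that it suffices to compare the states reaching $B$ for different states emitted at $A$ rather than to build the full process operators. The function name `state_at_B` is hypothetical.

```python
import numpy as np

# CNOT with the first qubit (A's output) as control and the second (A') as target.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def state_at_B(rho_A_out, rho_aux):
    """State of the target qubit after the CNOT (assumed to be B's input),
    obtained by tracing out the control qubit."""
    joint = np.kron(rho_A_out, rho_aux)
    out = CNOT @ joint @ CNOT.conj().T
    return np.einsum('iaib->ab', out.reshape(2, 2, 2, 2))

zero = np.diag([1, 0]).astype(complex)          # |0><0|
one = np.diag([0, 1]).astype(complex)           # |1><1|
plus = np.full((2, 2), 0.5, dtype=complex)      # |+><+|
mixed = np.eye(2, dtype=complex) / 2            # unknown coin flip

# Known A' = |0>: what B receives depends on what A emits (signalling).
known_differs = not np.allclose(state_at_B(zero, zero), state_at_B(one, zero))

# Unknown coin flip: B always receives the maximally mixed state.
washed_out = all(np.allclose(state_at_B(r, mixed), mixed) for r in (zero, one, plus))
```

With the auxiliary qubit in a known basis state the two outputs at $B$ differ, whereas with the maximally mixed auxiliary state they coincide for every state emitted at $A$ that is tried here.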
In the experiment with the coin flip, it is clear that $A$ has a causal influence on $B$, since agents who know the value of the coin flip would be able to send signals from $A$ to $B$. From the perspective of agents who do not know the value of the coin flip, however, signalling is washed out by the randomness of the unobserved system. A similar phenomenon is well understood in the literature on classical causal modelling. In a canonical example, $A$, $B$, and $C$ are all classical bits, with $A$ and $C$ causes of $B$ such that $B$ is equal to the parity of $A$ and $C$. If $C$ is inaccessible, or hidden, and satisfies $P(C = 0) = P(C = 1) = 1/2$, then $B$ is randomly distributed regardless of the value of $A$. Hence as long as $C$ remains hidden, signals cannot be sent from $A$ to $B$. (See Section 2.4 of ref. 35 .) The conclusion that should be drawn from the example with process $\sigma^{\mathrm{mix}}_{AB}$ is that causal influence between quantum nodes should not be defined in terms of the possibilities for signalling afforded by a process operator, at least not if there is a chance that unobserved systems ($A'$ in the example) are interacting with the systems under study ($A$ and $B$ in the example). Given only a process operator $\sigma_{AB}$ and no other data, although signalling is sufficient for causal influence, it can happen that $A$ has a causal influence upon $B$ even though there is no signalling from $A$ to $B$ in $\sigma_{AB}$.
Quantum causal models. The framework of quantum causal models, introduced in refs. 34,35 , is based on the idea that in an example like that just above, statements about causal influence can be defined in terms of signalling, but only once all relevant systems are included in the description. At this point, the description is of a closed system, and at least in standard quantum theory, evolution of a closed system is unitary. Hence quantum causal models define causal influence in terms of unitary transformations. Reference 35 shows that in the case of unitary circuits with broken wires representing quantum nodes, the causal relations between the quantum nodes can be summarized in the form of a DAG, where the DAG imposes constraints on the process operator over the quantum nodes.
This leaves open the question of what the pattern of causal influence might be in causally nonseparable processes described in the literature. Can it even be well defined or must one conclude that these processes are not amenable to causal explanation at all, or that all that can be discussed is signalling between the nodes? Our idea is that such processes can be understood in causal terms, if the framework of quantum causal modelling is extended to allow causal cycles. We will show that the resulting formalism can be used successfully to describe some of the much-studied instances of causally nonseparable processes from the literature. Later, we show the utility of this approach by using it to settle previously open questions concerning causally nonseparable processes.
The following definition generalizes that of refs. 34,35 , by allowing cyclic graphs (along with a more minor generalization, which is that the input and output Hilbert spaces of a quantum node can here have different dimensions).
Definition 1 (Quantum causal model (QCM), generalized) A QCM is given by: (1) a causal structure represented by a directed graph $G$ with vertices corresponding to quantum nodes $A_1, \ldots, A_n$; (2) for each $A_i$, a quantum channel $\rho_{A_i|Pa(A_i)} \in \mathcal{L}\left(\mathcal{H}_{A_i^{\mathrm{in}}} \otimes \mathcal{H}^*_{Pa(A_i)^{\mathrm{out}}}\right)$, where $Pa(A_i)$ denotes the set of parents of $A_i$ according to $G$, such that $[\rho_{A_i|Pa(A_i)}, \rho_{A_j|Pa(A_j)}] = 0$ for all $i, j$ and such that $\sigma_{A_1 \ldots A_n} = \prod_i \rho_{A_i|Pa(A_i)}$ is a process operator over the quantum nodes $A_1, \ldots, A_n$.
When writing products of the form $\prod_i \rho_{A_i|Pa(A_i)}$, it is understood implicitly that each factor is padded with an identity operator in tensor product for all other spaces. A QCM is called cyclic iff its causal structure contains directed cycles, and acyclic otherwise.
It is useful to define a term to express the fact that a given process operator σ has the correct form with respect to a given causal structure to define a QCM.
Definition 2 (Quantum Markov condition, generalized) A process $\sigma_{A_1 \ldots A_n}$ is called Markov for a directed graph $G$ with quantum nodes $A_1, \ldots, A_n$ as its vertices iff it admits a factorization into pairwise commuting channels of the form $\sigma_{A_1 \ldots A_n} = \prod_{i=1}^{n} \rho_{A_i|Pa(A_i)}$.
Note that the Markov condition of classical causal models 47,48 is a special case of Definition 2, obtained when the graph is acyclic and $\sigma_{A_1 \ldots A_n}$ is diagonal in a product basis, and encodes a classical probability distribution 35 . The following first sets out some further terminology and basic properties of Definition 1 and then turns to motivating and explaining Definition 1, making the link with unitary transformations, and showing why it is that for a particular directed graph, condition (2) should hold.
First, observe that not every cyclic graph supports a QCM in an interesting way. Consider, for example, the two-node cyclic graph of Fig. 2a. A QCM with such a causal structure would come with a process operator
$$\sigma_{AB} = \rho_{A|B}\,\rho_{B|A}. \qquad (3)$$
Here and throughout, channels between the nodes on which a process is defined are written such that anything appearing to the right of the bar refers to the output Hilbert space of the node, and anything appearing to the left of the bar refers to the input Hilbert space of the node. By our conventions $\rho_{A|B}\,\rho_{B|A} = \rho_{A|B} \otimes \rho_{B|A}$. However, this is not a valid process operator unless either $\rho_{A|B} = \rho_A \otimes \mathbb{1}_{(B^{\mathrm{out}})^*}$ or $\rho_{B|A} = \rho_B \otimes \mathbb{1}_{(A^{\mathrm{out}})^*}$. In other words, at least one of the channels $\rho_{A|B}$, $\rho_{B|A}$ carries no information, but simply ignores its input and prepares a fixed state on the output. Intuitively speaking, this is because there would otherwise be logical paradoxes for certain choices of interventions at $A$ and $B$. More generally, we will say that a QCM is faithful iff each of the channels $\rho_{A_i|Pa(A_i)}$ depends non-trivially on each of its parents, i.e., for every $A_j \in Pa(A_i)$, $\rho_{A_i|Pa(A_i)} \neq \frac{1}{d_j}\mathrm{Tr}_{(A_j^{\mathrm{out}})^*}[\rho_{A_i|Pa(A_i)}] \otimes \mathbb{1}_{(A_j^{\mathrm{out}})^*}$, where $d_j$ is the dimension of $A_j^{\mathrm{out}}$. Our claim concerning the causal structure of Fig. 2a can be summarized as:
Proposition 1 There is no faithful cyclic quantum causal model with two nodes.
Proof See Methods.
Now consider the cyclic graph $G'$ in Fig. 2b. A QCM with $G'$ as its causal structure comes with the data given in Eq. (5). Equation (5), compared to Eq. (3), has the key difference that the commuting operators have non-trivial action on $(C^{\mathrm{out}})^*$. As a result, it turns out that faithful cyclic QCMs of this form do exist. An example is described below.
Note that, given a cyclic graph such as that in Fig. 2b, even when a faithful QCM exists it is not in general the case that any set of commuting channels $\rho_{A_i|Pa(A_i)}$ defines a process operator. (See Methods for an explicit demonstration of this fact.) The constraint in the definition of a QCM that $\sigma_{A_1 \ldots A_n} = \prod_i \rho_{A_i|Pa(A_i)}$ is a valid process operator is essential, and is what guarantees that grandfather-type paradoxes do not arise 3 . This is in contrast to the acyclic case, where, given an acyclic causal structure, it is not hard to argue that any product of commuting channels of the form $\prod_i \rho_{A_i|Pa(A_i)}$ is a valid process operator 35 , hence in particular a faithful QCM with that causal structure can always be found.
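The grandfather-type paradoxes that the process-operator constraint rules out can be illustrated with a purely classical, deterministic analogue of the two-node cycle. This is only a toy sketch under stated assumptions, not the quantum argument of the Methods: both channels between the nodes are taken to be identity channels on a single bit, each agent applies a deterministic function to their bit, and a consistent assignment is a fixed point of the resulting loop. The names `fixed_points`, `identity`, `negate` and `constant_zero` are hypothetical.

```python
from itertools import product

def fixed_points(f_A, f_B):
    """Consistent assignments (a_in, b_in) for a two-bit cycle in which
    A's input is B's output and B's input is A's output."""
    return [(a_in, b_in)
            for a_in, b_in in product((0, 1), repeat=2)
            if a_in == f_B(b_in) and b_in == f_A(a_in)]

def identity(bit):
    return bit

def negate(bit):
    return 1 - bit

def constant_zero(bit):
    return 0

# A flips the bit, B forwards it: no consistent assignment (a paradox).
paradox = fixed_points(negate, identity)

# Both forward the bit: two consistent assignments (underdetermined).
ambiguous = fixed_points(identity, identity)

# A ignores its input: exactly one consistent assignment.
consistent = fixed_points(constant_zero, identity)
```

In this toy picture a well-behaved cyclic process is one for which every choice of local functions yields exactly one fixed point; the identity-channel cycle fails that requirement, which mirrors the statement that at least one of the two channels must ignore its input.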
Unitarity and causal structure. The definition of a QCM above is predicated on the idea that causal structure should be represented by a directed graph. This idea, however, along with the stipulation that the accompanying process is Markov for the graph, was presented without much justification or further comment. Why is causal structure represented by a directed graph, for example, as opposed to a different mathematical object, such as a partial order, or a preorder, or some kind of hypergraph? This section considers a subclass of processes (unitary processes, defined momentarily) and shows that a unitary process is associated with a causal structure, which can indeed be represented with a directed graph, and that the unitary process is Markov for that graph. In other words, a unitary process, along with its causal structure, defines a QCM.
In order to define a unitary process, observe that a process operator $\sigma_{A_1 \ldots A_n}$ has the mathematical form of the CJ operator for a channel $\mathcal{P} : \mathcal{L}\left(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{out}}}\right) \to \mathcal{L}\left(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{in}}}\right)$. Where it is convenient to emphasise this form, we will sometimes write $\sigma_{A_1 \ldots A_n} = \rho^{\mathcal{P}}_{A_1 \ldots A_n|A_1 \ldots A_n}$, where it is understood implicitly that an $A_i$ to the right of the bar stands for $A_i^{\mathrm{out}}$, while an $A_i$ to the left of the bar stands for $A_i^{\mathrm{in}}$. A unitary process is a process (where some of the input or output spaces may be trivial, i.e., 1-dimensional) such that the channel $\mathcal{P}$ is a unitary channel.
The first step is to define a notion of causal structure that pertains to the inputs and outputs of a unitary channel.
Definition 3 (Causal structure of a unitary channel) Given a unitary channel $\rho^U_{CD|AB}$, write $A \nrightarrow D$ ("$A$ does not influence $D$") iff the marginal channel $\mathrm{Tr}_C[\rho^U_{CD|AB}]$ has the form $\rho_{D|B} \otimes \mathbb{1}_{A^*}$; otherwise $A$ is a direct cause of $D$. For any unitary channel $\rho^U_{C_1 \ldots C_l|B_1 \ldots B_k}$ with $k$ input and $l$ output subsystems, its causal structure is then the set of causal relations between input and output subsystems and can be represented by a DAG with vertices $B_1, \ldots, B_k$ and $C_1, \ldots, C_l$ and an arrow $B_j \to C_i$ whenever $B_j$ is a direct cause of $C_i$.
This definition (which, given the correspondence between unitary maps $U$ and unitary channels $\mathcal{U}(\cdot) = U(\cdot)U^\dagger$, we let refer to either) lifts naturally to the case of a unitary process, in such a way that causal relationships are defined between the nodes of the process, rather than between inputs and outputs of a channel.
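The influence relation between the inputs and outputs of a unitary can be probed numerically for small systems. The sketch below is an illustration under assumptions rather than the paper's construction: it is restricted to two-qubit unitaries, and it uses the operational reading that an input does not influence an output iff the reduced operator at that output vanishes whenever a traceless operator is fed into that input (by linearity, it suffices to check the Pauli operators on the input and all matrix units on the other input). The function names are hypothetical.

```python
import numpy as np
from itertools import product

PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def matrix_units():
    """The four single-qubit operators |i><j|."""
    units = []
    for i, j in product(range(2), repeat=2):
        e = np.zeros((2, 2), dtype=complex)
        e[i, j] = 1.0
        units.append(e)
    return units

def reduced_output(op, keep):
    """Partial trace of a two-qubit operator, keeping qubit `keep` (0 or 1)."""
    t = op.reshape(2, 2, 2, 2)
    return np.einsum('ajbj->ab', t) if keep == 0 else np.einsum('jajb->ab', t)

def influences(U, in_idx, out_idx):
    """True iff input qubit `in_idx` can influence output qubit `out_idx` under U."""
    for T, E in product(PAULIS, matrix_units()):
        inp = np.kron(T, E) if in_idx == 0 else np.kron(E, T)
        out = U @ inp @ U.conj().T
        if not np.allclose(reduced_output(out, out_idx), 0):
            return True
    return False

SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

swap_arrows = {(i, o): influences(SWAP, i, o) for i, o in product((0, 1), repeat=2)}
cnot_arrows = {(i, o): influences(CNOT, i, o) for i, o in product((0, 1), repeat=2)}
```

For the SWAP gate only the crossed arrows are present. For the Controlled-NOT all four arrows are present, including the one from the target input to the control output, which reflects the phase kickback familiar from the quantum gate and has no counterpart in its classical analogue.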
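Definition 4 (Causal structure of a unitary process) Given a unitary process $\sigma_{A_1 \ldots A_n} = \rho^U_{A_1 \ldots A_n|A_1 \ldots A_n}$, write $A_j \nrightarrow A_i$ ("$A_j$ does not influence $A_i$") iff $A_j^{\mathrm{out}} \nrightarrow A_i^{\mathrm{in}}$ in $\rho^U_{A_1 \ldots A_n|A_1 \ldots A_n}$. If node $A_j$ can influence node $A_i$, then $A_j$ is a direct cause of $A_i$. The causal structure of the unitary process is the set of all causal relations between its quantum nodes, and is representable as the directed graph with vertices $A_1, \ldots, A_n$ and an arrow $A_j \to A_i$ whenever $A_j$ is a direct cause of $A_i$.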
The fact that any unitary process is Markov for its causal structure, hence defines a QCM, is then immediate from the following theorem of refs. 34,35 .
Theorem 1 (References 34,35 ) Given a unitary channel $\rho^U_{C_1 \ldots C_l|B_1 \ldots B_k}$, let $\{Pa(C_i)\}_{i=1}^{l}$ be the parental sets as defined by its causal structure. Then the CJ operator factorizes as $\rho^U_{C_1 \ldots C_l|B_1 \ldots B_k} = \prod_{i=1}^{l} \rho_{C_i|Pa(C_i)}$, where the marginal channels $\rho_{C_i|Pa(C_i)}$ commute pairwise.
The case of non-unitary processes, and their relationship to causal structure, is presented below. First, we describe a well-known example of a causally nonseparable process, the quantum SWITCH 2 , and show explicitly that it defines a unitary process operator with cyclic causal structure, hence a cyclic QCM.
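Example: the quantum SWITCH. The quantum SWITCH 2 was the first example described of a causally nonseparable process. The SWITCH is standardly defined as a higher-order map 2,54,55 that takes as input two CP maps $\mathcal{F}_A : \mathcal{L}(\mathcal{H}_{A^{\mathrm{in}}}) \to \mathcal{L}(\mathcal{H}_{A^{\mathrm{out}}})$ and $\mathcal{G}_B : \mathcal{L}(\mathcal{H}_{B^{\mathrm{in}}}) \to \mathcal{L}(\mathcal{H}_{B^{\mathrm{out}}})$, and gives as an output a CP map from $\mathcal{L}(\mathcal{H}_Q \otimes \mathcal{H}_S)$ to $\mathcal{L}(\mathcal{H}_{Q'} \otimes \mathcal{H}_{S'})$. Here, $\mathcal{H}_Q$ and $\mathcal{H}_{Q'}$ are interpreted as the Hilbert spaces of a control qubit at some initial and some final time, respectively, and $\mathcal{H}_S$ and $\mathcal{H}_{S'}$ as the Hilbert spaces of some target system at the same two times. Intuitively, the effect of the quantum SWITCH is to transform the target system from the initial to the final time by the sequential application of the CP maps $\mathcal{F}_A$ and $\mathcal{G}_B$, where the order in which the two CP maps are applied is conditioned coherently on the logical value of the control qubit.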
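For the special case in which both inserted maps are unitary, the coherent control of order described above reduces to a single controlled unitary on control and target, which is easy to write down. The sketch below makes assumptions beyond the text: the convention that control value 0 selects "A first, then B" is a choice made here for illustration, the target is taken to be a qubit, and `switch_unitary` is a hypothetical helper name. The example uses the Pauli operators X and Z, which anticommute, and a commuting pair for comparison.

```python
import numpy as np

def switch_unitary(U_A, U_B):
    """Controlled-order unitary on control (x) target: control |0> applies
    U_A then U_B, control |1> applies U_B then U_A (assumed convention)."""
    P0 = np.diag([1, 0]).astype(complex)
    P1 = np.diag([0, 1]).astype(complex)
    return np.kron(P0, U_B @ U_A) + np.kron(P1, U_A @ U_B)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
psi = np.array([0.6, 0.8], dtype=complex)       # arbitrary target state

def control_probabilities(U_A, U_B):
    """Probabilities of finding the control in |+> and |-> at the end."""
    W = switch_unitary(U_A, U_B)
    out = (W @ np.kron(plus, psi)).reshape(2, 2)   # rows: control, columns: target
    p_plus = np.linalg.norm(plus.conj() @ out) ** 2
    p_minus = np.linalg.norm(minus.conj() @ out) ** 2
    return p_plus, p_minus

anticommuting = control_probabilities(X, Z)     # control ends in |->
commuting = control_probabilities(Z, Z)         # control stays in |+>
```

Measuring the control in the $|\pm\rangle$ basis thus distinguishes a commuting pair from an anticommuting pair with a single use of each operation, which is the kind of informational advantage over a definite order mentioned in the introduction.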
To formulate this precisely, we will describe the quantum SWITCH directly as a 4-node process (see Fig. 3), which involves the nodes $A$ and $B$, where $\mathcal{F}_A$ and $\mathcal{G}_B$ are inserted, a node $P$ with $P^{\mathrm{out}} = QS$, where the control qubit and target system at the initial time are prepared in some state, and a node $F$ with $F^{\mathrm{in}} = Q'S'$, where the control qubit and the system at the final time are subject to some measurement. The SWITCH is then a unitary four-partite process with process operator $\sigma^{\mathrm{SWITCH}}_{ABPF}$, in which dual spaces appear due to our convention for the CJ isomorphism. It is straightforward to verify that the causal structure of $\sigma^{\mathrm{SWITCH}}_{ABPF}$ is the cyclic directed graph $G_{\mathrm{SWITCH}}$ in Fig. 4. From Theorem 1, it follows that
$$\sigma^{\mathrm{SWITCH}}_{ABPF} = \rho_{F|ABP}\,\rho_{A|BP}\,\rho_{B|AP}\,\rho_P,$$
where we have formally added $\rho_P$ to make the Markovianity of $\sigma^{\mathrm{SWITCH}}_{ABPF}$ for $G_{\mathrm{SWITCH}}$ explicit, but here $\rho_P$ is just the number 1, since $P^{\mathrm{in}}$ is trivial. Hence, the graph $G_{\mathrm{SWITCH}}$ together with $\rho_{F|ABP}$, $\rho_{A|BP}$, $\rho_{B|AP}$, $\rho_P$ forms a faithful cyclic QCM.
Compatibility vs Markovianity. This section extends the discussion of causal structure to non-unitary processes. Briefly, in a QCM involving a non-unitary process $\sigma$, the arrows of the graph are taken to represent facts about the causal structure of some underlying unitary process, with the property that $\sigma$ is recovered from the unitary process when marginalizing over auxiliary systems. The auxiliary systems take the form of a final system $F$, along with uncorrelated local disturbances, where the latter are inputs to the unitary process in a direct product state, with the property that each of them is a direct cause of at most one of the nodes of $\sigma$. As we shall show, it then follows that the process $\sigma$ is Markov for the graph.
The following was introduced in ref. 51 (there under the name purifiability), and will help make these ideas precise.
Definition 5 (Unitary extendibility) A process $\sigma_{A_1 \ldots A_n}$ is called unitarily extendible iff there exists a unitary process $\sigma_{A_1 \ldots A_n PF} = \rho^U_{A_1 \ldots A_n F|A_1 \ldots A_n P}$ on the quantum nodes $A_1, \ldots, A_n$, plus an additional root node $P$ and leaf node $F$, such that $\sigma_{A_1 \ldots A_n} = \mathrm{Tr}_{FP}[\sigma_{A_1 \ldots A_n PF}\,\tau_P]$ for some state $\tau_P \in \mathcal{L}(\mathcal{H}^*_{P^{\mathrm{out}}})$. The process $\sigma_{A_1 \ldots A_n PF}$ is called a unitary extension of $\sigma_{A_1 \ldots A_n}$.
It was found in ref. 51 that not all process operators are unitarily extendible. The reason for this is that, although for any process $\sigma_{A_1 \ldots A_n} = \rho^{\mathcal{P}}_{A_1 \ldots A_n|A_1 \ldots A_n}$, corresponding to a channel $\mathcal{P}$, the channel $\mathcal{P}$ admits a dilation to a unitary channel, this unitary channel does not necessarily correspond to a valid process itself. Process operators that are not unitarily extendible are those for which no dilation exists such that the unitary channel corresponds to a valid process.
Now suppose that a process $\sigma_{A_1 \ldots A_n}$ does have a unitary extension $\sigma_{A_1 \ldots A_n PF}$, involving the additional root node $P$. As per Def. 4, the unitary extension $\sigma_{A_1 \ldots A_n PF}$ has a causal structure given by some directed graph $G$ with nodes $A_1, \ldots, A_n, P, F$. Let $G'$ be the subgraph with nodes $A_1, \ldots, A_n$, along with all arrows that connect only these nodes in $G$. In general, in the graph $G$, the node $P$ will have arrows to several of the $A_i$, meaning that $P$ is a common cause for these nodes. There will then, in general, be correlations in $\sigma$ that are explained by the common cause $P$. This means that the graph $G'$, which omits $P$, is at best an incomplete causal explanation for the correlations in $\sigma$, since it does not explain those correlations due to $P$. In this case, there is no reason why $\sigma$ should be Markov for the graph $G'$.
Consider now a unitary extension of $\sigma_{A_1 \ldots A_n}$ with the feature that the node $P$ can be factored into uncorrelated local disturbances $\lambda_i$, such that each $\lambda_i$ is a direct cause of at most one of the nodes $A_i$. In this case, the graph $G'$, obtained by omitting all of the $\lambda_i$ and the leaf node $F$, can be seen as a causal explanation for correlations described by the process $\sigma_{A_1 \ldots A_n}$, which omits only local disturbances and the final effect $F$, and which does not omit common causes. In this case, we will say that $\sigma$ is compatible with the graph $G'$. In fact, it is more useful to define this term more broadly: we will say that $\sigma$ is compatible with any graph, with nodes $A_1, \ldots, A_n$, that contains $G'$ as a subgraph. The following definition makes this precise, generalizing that of ref. 35 to the cyclic case.
Definition 6 (Compatibility with a directed graph) A process $\sigma_{A_1 \ldots A_n}$ is compatible with a directed graph $G$ with nodes $A_1, \ldots, A_n$, iff $\sigma_{A_1 \ldots A_n}$ is extendible to a unitary process $\sigma_{A_1 \ldots A_n \lambda_1 \ldots \lambda_n F}$, with an extra root node $\lambda_i$ for $i = 1, \ldots, n$ and an extra leaf node $F$, such that: (1) there exists a product state $\tau_{\lambda_1} \otimes \cdots \otimes \tau_{\lambda_n}$ with $\tau_{\lambda_i} \in \mathcal{L}(\mathcal{H}^*_{\lambda_i^{\mathrm{out}}})$ such that $\sigma_{A_1 \ldots A_n} = \mathrm{Tr}_{\lambda_1 \ldots \lambda_n F}[\sigma_{A_1 \ldots A_n \lambda_1 \ldots \lambda_n F}\,(\tau_{\lambda_1} \otimes \cdots \otimes \tau_{\lambda_n})]$; (2) $\sigma_{A_1 \ldots A_n \lambda_1 \ldots \lambda_n F}$ satisfies the following no-influence conditions (with $Pa(A_i)$ referring to $G$): $\{A_j \nrightarrow A_i\}_{A_j \notin Pa(A_i)}$, $\{\lambda_j \nrightarrow A_i\}_{j \neq i}$.
The following then justifies the stipulation, as a part of the definition of a QCM, that the process accompanying a graph is Markov for the graph.
Theorem 2 If a process $\sigma_{A_1 \ldots A_n}$ is compatible with the directed graph $G$, then it is also Markov for $G$.
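Proof Similarly to the acyclic case in ref. 35 , the theorem follows essentially from Theorem 1: the unitary extension, asserted to exist by virtue of the assumed compatibility with $G$, has to factorize into pairwise commuting operators of the form $\sigma_{A_1 \ldots A_n \lambda_1 \ldots \lambda_n F} = \rho_{F|A_1 \ldots A_n \lambda_1 \ldots \lambda_n} \prod_i \rho_{A_i|Pa(A_i)\lambda_i}$. Marginalizing over $F$ and inserting the product state $\tau_{\lambda_1} \otimes \cdots \otimes \tau_{\lambda_n}$ then yields $\sigma_{A_1 \ldots A_n} = \prod_i \rho_{A_i|Pa(A_i)}$ with pairwise commuting channels $\rho_{A_i|Pa(A_i)} := \mathrm{Tr}_{\lambda_i}[\rho_{A_i|Pa(A_i)\lambda_i}\,\tau_{\lambda_i}]$.
Reference 35 also establishes a converse to this result, for the case that $G$ is acyclic. For a general directed graph $G$, however, the same proof does not suffice since, even though a dilation to a unitary channel with the required causal constraints can always be found 35 , it is not immediate whether this channel can be guaranteed to define a valid process. We pose this as a hypothesis:
Hypothesis 1 If a process $\sigma_{A_1 \ldots A_n}$ is Markov for a directed graph $G$, then it is compatible with $G$.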
The hypothesis is satisfied by all examples that we have investigated, but we do not have a proof that it is true in general. Some consequences of the validity or otherwise of this hypothesis are discussed in the "Conclusions" section.
Example: a process that violates a causal inequality. While the quantum SWITCH is causally nonseparable, the correlations that can be established between operations at the nodes of the quantum SWITCH can always be obtained by a causally separable process with sufficiently large input and output dimensions at each node 6,7 . There are, however, causally nonseparable processes that can produce correlations violating causal inequalities 3,7,9,12,[56][57][58][59][60] , which are incompatible with the existence of a definite order between the nodes irrespective of the types of systems or operations performed at those nodes 4,7,61 . In the literature such processes are called noncausal 7 .
An example of a tripartite noncausal process is the one which was found by Araújo and Feix (AF) and then published and further studied by Baumeler and Wolf in refs. 12,62 . It is remarkable in that the process is both classical and deterministic (see below for further discussion of classical processes). Any classical process can be viewed as a quantum process, diagonal with respect to a product basis. The AF process, viewed as a quantum process on nodes A, B and C, each with two-dimensional input and output Hilbert spaces, is described by the process operator … where … As is explicit in this description, the AF process together with the causal structure in Fig. 5 defines a faithful cyclic QCM.
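The classical content of the AF process can be checked by brute force. The sketch below assumes the commonly quoted form of the AF rule, in which each party's input bit is a Boolean function of the other two parties' output bits (for C this matches the rule $\neg i \wedge j$ that appears later in the text; the rules for A and B are filled in by symmetry and are an assumption here). The function names and the comparison process `two_way_loop` are hypothetical. A deterministic classical process is logically consistent if, for every choice of deterministic local operations, the composite map has exactly one fixed point.

```python
from itertools import product

def af_process(a_out, b_out, c_out):
    """Assumed AF rule: each input depends on the other two parties' outputs."""
    a_in = (1 - b_out) & c_out      # NOT b AND c
    b_in = a_out & (1 - c_out)      # a AND NOT c
    c_in = (1 - a_out) & b_out      # NOT a AND b
    return a_in, b_in, c_in

def two_way_loop(a_out, b_out, c_out):
    """Hypothetical invalid process: A and B each receive the other's output."""
    return b_out, a_out, 0

# The four deterministic local operations mapping one input bit to one output bit.
LOCAL_OPS = (lambda x: 0, lambda x: 1, lambda x: x, lambda x: 1 - x)

def count_fixed_points(process, ops):
    """Count output assignments that are reproduced after one round trip
    through the process and the given local operations."""
    count = 0
    for outs in product((0, 1), repeat=len(ops)):
        ins = process(*outs)
        if tuple(op(i) for op, i in zip(ops, ins)) == outs:
            count += 1
    return count

def is_logically_consistent(process, n_parties):
    """True iff every choice of deterministic local operations gives
    exactly one fixed point."""
    return all(count_fixed_points(process, ops) == 1
               for ops in product(LOCAL_OPS, repeat=n_parties))
```

Under these assumptions `is_logically_consistent(af_process, 3)` holds although every input depends on two other outputs, whereas the simple two-way loop fails because identity operations at A and B admit two fixed points.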
It was shown by Baumeler and Wolf (BW) 62 that this process is unitarily extendible (also see refs. 27,51 ) with a unitary extension given by … where the output space of the root node $P$ is a tensor product of three qubits, $\mathcal{H}_{P^{\mathrm{out}}} = \mathcal{H}_{\lambda_A} \otimes \mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{\lambda_C}$, and the unitary $U$ is defined by the following bijection of orthonormal bases: … The original AF process is recovered by marginalizing over $F$ and feeding in the product state $|0, 0, 0\rangle$ for $\lambda_A$, $\lambda_B$ and $\lambda_C$. Formally letting the latter three define distinct root nodes $\lambda_A$, $\lambda_B$ and $\lambda_C$, it is not too hard to show that this BW unitary extension also satisfies the corresponding causal constraints of Definition 13 to establish $\sigma^{AF}_{ABC}$ to be compatible with the graph of Fig. 5, in keeping with Hypothesis 1.
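As a numerical illustration of such an extension, the following sketch builds a 64-dimensional permutation matrix from an assumed basis bijection: the outputs of A, B and C are copied to F, and each $\lambda_X$ is XORed with the assumed AF rule for X before being fed to the input of X. Only the rule for C, $\lambda_C \oplus (\neg i \wedge j)$, is stated explicitly in the text; the other two rules, the bit ordering and all function names are assumptions made for illustration.

```python
import numpy as np
from itertools import product

def bw_map(a, b, c, la, lb, lc):
    """Assumed basis bijection: inputs (A_out, B_out, C_out, lambda_A,
    lambda_B, lambda_C) are sent to (F_in[3 bits], A_in, B_in, C_in)."""
    a_in = la ^ ((1 - b) & c)
    b_in = lb ^ (a & (1 - c))
    c_in = lc ^ ((1 - a) & b)
    return (a, b, c, a_in, b_in, c_in)

def bits_to_index(bits):
    """Read a tuple of bits as a binary number, first bit most significant."""
    index = 0
    for bit in bits:
        index = 2 * index + bit
    return index

def bw_unitary():
    """Permutation matrix implementing bw_map on six qubits."""
    U = np.zeros((64, 64))
    for bits in product((0, 1), repeat=6):
        U[bits_to_index(bw_map(*bits)), bits_to_index(bits)] = 1.0
    return U

U = bw_unitary()
IS_UNITARY = np.allclose(U.T @ U, np.eye(64))

# Feeding in lambda = (0, 0, 0) leaves the assumed AF rule on the inputs.
AF_RECOVERED = all(
    bw_map(a, b, c, 0, 0, 0)[3:] == ((1 - b) & c, a & (1 - c), (1 - a) & b)
    for a, b, c in product((0, 1), repeat=3)
)
```

The map is a bijection because the first three bits are copied unchanged and the XOR with a function of those bits is invertible, which is why the resulting matrix is a permutation and hence unitary.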
Cyclicity and extended circuit diagrams. An essential feature of the Markov condition in Definition 2 is the pairwise commutation relation of the operators of the form $\rho_{A_i|Pa(A_i)}$, where the parental sets in general overlap. That two commuting operators act non-trivially on the same Hilbert space has consequences for the algebraic structure of the operators and leads to an intimate link between causal and compositional structure.
To illustrate the fruitfulness of studying this link, the following revisits the two examples from earlier.
The quantum SWITCH can be considered as a unitary process over 4 nodes, given by $\sigma^{\mathrm{SWITCH}}$ … Fig. 6 together with its causal structure shown in blue. Observe in particular that in $U$, $A^{\mathrm{out}}$ does not influence $A^{\mathrm{in}}$, and similarly $B^{\mathrm{out}}$ does not influence $B^{\mathrm{in}}$, as must be the case for a well-defined process 18 .
Reference 63 shows that any unitary map $U$ with three input and three output systems, and the causal constraints of Fig. 6, has a decomposition of the following form: … where $S$ and $T$ are unitaries, and $\{V_i\}_{i \in I}$ and $\{W_i\}_{i \in I}$ families of unitaries of the form … $T$ : … Such a compositional structure with direct sums over tensor products goes beyond what is expressible with ordinary circuit diagrams. Reference 63 therefore introduced extended circuit diagrams to give a graphical representation of such decompositions. Figure 7 arises from that extended circuit diagram representation of Eq. (14) by bending the wires corresponding to $A^{\mathrm{in}}$ and $B^{\mathrm{in}}$ down to re-identify the quantum nodes $A$ and $B$, thereby filling the black box of the quantum SWITCH from Fig. 3. For details on this diagrammatic language, we refer the reader to ref. 63 , but the essential idea is that individual wires with indices on them, such as those between the circles $S$ and $V_i$ and $W_i$, respectively, represent the families of Hilbert spaces $\{\mathcal{H}_{P_i^L}\}_{i \in I}$ and $\{\mathcal{H}_{P_i^R}\}_{i \in I}$, while the two parallel wires together represent $\bigoplus_{i \in I} \mathcal{H}_{P_i^L} \otimes \mathcal{H}_{P_i^R}$. An implicit summation over orthogonal subspaces indexed by $i$ allows the representation of the intermediate …
It is easy to see what this decomposition is concretely in the case of the quantum SWITCH: the index $i$ takes two values, 0 and 1, corresponding to the logical values of the control qubit, i.e., $\mathcal{H}_P = \mathcal{H}_Q \otimes \mathcal{H}_S \cong (\mathbb{C} \otimes \mathcal{H}_S) \oplus (\mathcal{H}_S \otimes \mathbb{C})$, and the unitaries $V_i$ and $W_i$ are either the SWAP transformation on the respective systems or the identity depending on $i$. We see that even though the causal structure of the full process is cyclic, the process splits into a direct sum of processes in each of which causal influence and the flow of information follow acyclic paths.
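The direct-sum structure over the control value can be made concrete at the level of the gates inserted at the nodes. The sketch below does not implement the unitary $U$ of Eq. (14); it uses the standard gate-level picture in which control value 0 applies the gate of A before that of B and control value 1 the reverse. Qubit systems, the ordering "control tensor target" and all names are assumptions made for illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)   # projector on control value 0
P1 = np.diag([0, 1]).astype(complex)   # projector on control value 1

def switch(U_A, U_B):
    """Coherently controlled order: each control branch has a fixed order."""
    return np.kron(P0, U_B @ U_A) + np.kron(P1, U_A @ U_B)

def prob_control_minus(U_A, U_B, target):
    """Probability of finding the control in |-> for control input |+>."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
    out = switch(U_A, U_B) @ np.kron(plus, target)
    projector = np.kron(np.outer(minus, minus.conj()), np.eye(len(target)))
    return float(np.real(out.conj() @ projector @ out))

TARGET = np.array([1, 0], dtype=complex)
P_ANTICOMMUTING = prob_control_minus(X, Z, TARGET)   # X and Z anticommute
P_COMMUTING = prob_control_minus(X, X, TARGET)       # X commutes with itself
```

Each of the two summands in `switch` is an ordinary circuit with a definite order; only their coherent superposition lets the control qubit reveal whether the two gates commute or anticommute.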
This decomposition of the quantum SWITCH applies more generally: since any unitary process of the type depicted in Fig. 3, with a root node $P$, a leaf node $F$, and two nodes $A$ and $B$ in between, satisfies $A^{\mathrm{out}} \nrightarrow A^{\mathrm{in}}$ and $B^{\mathrm{out}} \nrightarrow B^{\mathrm{in}}$, it follows that any such unitary process has a decomposition as in Fig. 7. It will furthermore be established below (as a direct consequence of the proof of Theorem 3) that for each $i$ the summand $V_i \otimes W_i$ of that corresponding decomposition has to have an acyclic causal structure; that is, any unitary process with nodes $A$, $B$, $P$, $F$, where $P$ is a root node and $F$ a leaf node, is a direct sum of unitary processes in which causal influences flow along acyclic paths.
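The second example concerns the tripartite AF process and its BW unitary extension $\rho^{U}_{ABCF|ABCP}$ (see Eqs. (12)-(13)). The root node $P$ has as output space $\mathcal{H}_{P^{\mathrm{out}}} = \mathcal{H}_{\lambda_A} \otimes \mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{\lambda_C}$, where each $\lambda_X$ influences only $X$ and $F$ for $X = A, B, C$. The associated unitary map $U$ and its causal structure are depicted in Fig. 8.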
The results from ref. 63 allow again the statement of an extended circuit decomposition of U, which is implied by its causal structure and which makes the pathways of causal influence through U graphically evident (the proof is completely analogous to that of Theorem 7 in ref. 63 ).
This decomposition of $U$ is depicted in Fig. 9 and reads: … for (families of) unitary maps … $T$ : … $V$ : … $W$ : … By appropriately bending the wires that correspond to $A^{\mathrm{in}}$, $B^{\mathrm{in}}$ and $C^{\mathrm{in}}$ to re-identify the nodes $A$, $B$ and $C$ (and swapping some wires for better readability) one obtains Fig. 10, revealing a fine-grained compositional structure of the BW unitary extension.
Note that the stated decomposition is general in the sense that a decomposition of the form as in Fig. 9 exists for any unitary with a causal structure as in Fig. 8. However, in the concrete case of the BW unitary extension one can easily see what the components in Eq. (19) correspond to through a comparison with Eq. (13). All three indices $i$, $j$ and $k$ are binary and, via the unitaries $S$, $T$ and $V$, can be seen to correspond to one-dimensional subspaces of $A^{\mathrm{out}}$, $B^{\mathrm{out}}$ and $C^{\mathrm{out}}$. Hence, each indexed space, i.e., each element of a family of Hilbert spaces associated with an indexed wire, is a trivial Hilbert space. For any fixed value $(i, j, k)$, the unitary $P_{ij} \otimes Q_{ik} \otimes R_{jk}$ is of the type … where all spaces are qubits (suppressing all trivial spaces). The unitary $P_{ij} : \mathcal{H}_{\lambda_C} \to \mathcal{H}_{C^{\mathrm{in}}}$ maps $|\lambda_C\rangle \mapsto |\lambda_C \oplus (\neg i \wedge j)\rangle$, i.e., $P_{ij}$ is the identity or the NOT gate depending on the values of $i$ and $j$. The unitaries $Q_{ik}$ and $R_{jk}$ can similarly be identified through comparison with Eq. (13).
One thus finds that the BW unitary extension is a direct sum over unitary processes each of which has an acyclic causal structure. Furthermore, it is natural to wonder whether knowing a decomposition of the form as in Fig. 8 might suggest a way in which the process could be implemented; recall that this is a process that violates a causal inequality.
How about other unitary processes not of the two presented types? Reference 63 provides extended circuit decompositions for many classes of unitary transformations, where the decompositions are causally faithful, meaning that if $A$ is an input to the unitary $U$ and $B$ an output, then there is a path from $A$ to $B$ in the extended circuit iff $A$ can influence $B$ through $U$ (note this is distinct from the notion of faithfulness of a QCM). Consider now a unitary map $U$ that corresponds to a unitary process, in the sense that the output Hilbert spaces of the nodes correspond to the inputs to $U$, and the input Hilbert spaces of the nodes correspond to the outputs of $U$. If $U$ has a causally faithful extended circuit decomposition, then by appropriately bending the wires, as in the above examples, one can always obtain a fine-grained compositional structure of the corresponding unitary process. Reference 63 states the hypothesis that all finite-dimensional unitary transformations (over specified tensor products of input Hilbert spaces and output Hilbert spaces) have a causally faithful extended circuit decomposition. This would mean that all unitary processes, by bending the wires, would admit causally faithful decompositions in a similar manner. At the time of writing, however, the hypothesis remains unproven.
Fig. 8 The unitary map U from Eq. (13) that defines the BW unitary extension of the AF process. Also depicted is its causal structure, where for better visibility, rather than direct cause relations, the no-influence conditions are shown as red dashed arrows.
The bipartite unitarily extendible processes. Understanding which processes have a physical realization is a central open question in the field of indefinite causal order 18,51 . While causally nonseparable processes may have a realization in exotic scenarios involving both quantum systems and gravity, it seems clear that any present-day laboratory experiment admits a description in terms of a straightforward, definite, causal ordering of suitably defined parts of the experiment. Nevertheless, various experiments have been performed that are claimed as realizations of nonseparable processes such as the quantum SWITCH 28-32,64 . This has caused some debate 18,49,65 .
Behind much of this debate, however, lies merely a question of how the abstract mathematical description is assumed to map to physical phenomena. Each of the implementations claimed so far is of a process that involves coherent control over the time-ordering of nodes in a similar manner to the SWITCH, and which cannot therefore violate causal inequalities. Reference 18 shows that any such implementation can be seen as a valid implementation of a nonseparable process, if the process is understood as being defined over time-delocalized systems, where the input and output Hilbert spaces of the nodes of the process correspond to subsystems of tensor products of Hilbert spaces of systems associated with different times. This raises the question: which processes in general admit a laboratory implementation, at least in terms of time-delocalized systems? In particular, can a process violating causal inequalities be implemented?
There was some hope that a process violating causal inequalities could be implemented, because ref. 18 also shows that every unitary extension of a bipartite process has a realization in terms of time-delocalized systems. Hence if there were a unitarily extendible bipartite process violating causal inequalities, then it could be implemented, at least via time-delocalized systems. The following theorem, however, shows that there is no such possibility. Any bipartite unitarily extendible process is causally separable, hence, in particular, cannot violate causal inequalities, as conjectured in ref. 51 ; furthermore, all unitary extensions of bipartite processes are variations of the quantum SWITCH, realizable by coherent control of the times of the operations of A and B. The argument uses the existence of a faithful extended circuit decomposition of the form as in Eq. (14) that is implied by the causal constraints of Fig. 6.
Theorem 3 All unitarily extendible bipartite processes are causally separable. Given a bipartite process, if it is unitarily extendible, then the unitary extension has a realization in terms of coherent control of the order of the node operations.
Proof See Methods.
As one can see, e.g., from the AF process, being unitarily extendible does not imply causal separability in the general multipartite case. However, the decomposition from Fig. 8 of the BW unitary extension of the AF process proved insightful with regard to how the cyclicity of the causal structure comes from different contributions across the direct sum. More generally, suppose a causally faithful extended circuit decomposition of the unitary extension of some multipartite process is known. It is then natural to ask whether some kind of generalization of the constraints established as part of the proof of Theorem 3 could be derived, which in the bipartite case just happen to give causal separability, while in the general case constrain each summand of the decomposition. As is the case with the bipartite processes, to which Theorem 3 applies, one would expect that such constraints on summands of the unitary extension also manifest themselves in interesting ways for the non-unitary marginal process. We leave this question for future investigation.
Causal nonseparability. The definition of causal separability was given above only for bipartite processes and it was mentioned that the multipartite case, with more than two nodes, is more intricate. This section will first give the general definition, following refs. 7,23 , and then present another main result.
Seeing as the idea of causal separability is to capture whether a process is consistent with our intuitions on causal order, it is natural to let it incorporate the following two features. First, in addition to probabilistic mixtures of fixed orders of nodes it allows for a dynamical causal order of them, that is, the overall causal order of some nodes need not be fixed, but may depend on what happens at some earlier nodes. Second, it demands that causal separability is preserved under extending the process with an arbitrary ancillary input state shared between the nodes (a property called extensibility 7 ). A process is thus causally separable if, upon considering arbitrary shared entanglement between auxiliary input systems to all nodes, the resulting extended process can be seen to arise from a probabilistic mixture of particular processes: for each there is a node P in the past such that for all possible interventions at P the marginal process has a fixed causal order, or more generally, is itself again causally separable. Hence, one ends up with an iterative definition of the concept. This notion was originally called extensible causal separability in ref. 7 to distinguish it from the analogous concept without extensibility, but as it is undoubtedly the more natural concept, we here refer to it simply as causal separability, as in ref. 23 . (Note, there have been two equivalent definitions of that notion 23 , which differ by whether extensibility is imposed at the level of the full process 7 or at each level of the iteration 23 . For the present purposes, it is convenient to use the latter one.) Finally, making the concept precise relies on the following notion of no-signalling in a process, which, along with various equivalent statements, was given in ref. 7 .
Definition 7 (No signalling in a process) Given a process $\sigma_{A_1 \ldots A_n}$, we say that there is no signalling from a subset $S \subset \{A_1, \ldots, A_n\}$ of its nodes to the complementary subset $\bar{S} := \{A_1, \ldots, A_n\} \setminus S$, iff the probabilities $P(k_{\bar{S}}) = \ldots$ for the outcomes of operations performed at $\bar{S}$ are independent of the choice of trace-preserving operations $\tau_S = \bigotimes_{A \in S} \tau_A$ performed at $S$.
Now let $\tau_{A_j}$ represent a CP map at the node $A_j$, which is not necessarily trace-preserving. If there is no signalling to a node $A_j$ from $\{A_1, \ldots, A_n\} \setminus \{A_j\}$, then for any $\tau_{A_j}$, the object $\mathrm{Tr}_{A_j}[\sigma_{A_1 \ldots A_n} \tau_{A_j}]$ is proportional to a process operator. In this case, let $\sigma|_{\tau_{A_j}}$ be the corresponding correctly normalized process operator. We refer to $\sigma|_{\tau_{A_j}}$ as a conditional process. We can now state the formal definition of causal separability.
NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-20456-x ARTICLE
Definition 8 (Causal separability 23 ) Every single-node process is causally separable. For $n \geq 2$, a process $\sigma$ on $n$ quantum nodes $A_1, \ldots, A_n$ is said to be causally separable, iff, for any extension of each node $A_j$ with an additional input system $\mathcal{H}_{(A'_j)^{\mathrm{in}}}$ to a new node $\tilde{A}_j$, defined by $\mathcal{H}_{\tilde{A}_j^{\mathrm{in}}} := \mathcal{H}_{A_j^{\mathrm{in}}} \otimes \mathcal{H}_{(A'_j)^{\mathrm{in}}}$ and $\mathcal{H}_{\tilde{A}_j^{\mathrm{out}}} := \mathcal{H}_{A_j^{\mathrm{out}}}$, and any auxiliary quantum state $\rho \in \mathcal{L}(\mathcal{H}_{(A'_1)^{\mathrm{in}}} \otimes \cdots \otimes \mathcal{H}_{(A'_n)^{\mathrm{in}}})$, the process $\sigma \otimes \rho$ on the quantum nodes $\tilde{A}_1, \ldots, \tilde{A}_n$ decomposes as $\sigma \otimes \rho = \sum_{k=1}^{n} q_k\, \sigma^{\rho}_{(k)}$ with $q_k \geq 0$, $\sum_k q_k = 1$, where for each $k$, $\sigma^{\rho}_{(k)}$ is a process in which there can be no signalling to $\tilde{A}_k$ from the rest of the nodes, and where for any CP map $\tau_{\tilde{A}_k}$ that can take place at the node $\tilde{A}_k$, the conditional process on the remaining $n - 1$ nodes, $\sigma^{\rho}_{(k)}|_{\tau_{\tilde{A}_k}}$, is itself causally separable.
An important question then concerns the relation between causal nonseparability and cyclicity of causal structure. For a QCM that involves a generic (not necessarily unitary) process, the cyclicity of its directed graph does not in general imply causal nonseparability of the process, even if the QCM is faithful. Consider, for example, the quantum SWITCH with process operator $\sigma^{\mathrm{SWITCH}}_{ABFP}$. Tracing out the system $F^{\mathrm{in}}$, we obtain a reduced 3-node process that (relabelling C as P) is both faithful and Markov for the graph of Fig. 2b, having the form $\sigma_{ABP} = \rho_{A|BP}\, \rho_{B|AP}\, \rho_P$. This process is causally separable, since it can be understood as describing a situation in which the order between A and B depends in an incoherent manner on the logical value of the control qubit prepared at the initial time. This process thus forms a faithful cyclic QCM and is a canonical example of a process with dynamical causal order (here between nodes A and B).
In fact, one and the same cyclic graph may appear in two distinct faithful QCMs, one involving a causally separable, the other a nonseparable process. An example of this can again be given using the quantum SWITCH. The latter is causally nonseparable and has the graph in Fig. 4 as causal structure, which however also is the causal structure of the classical SWITCH 2 , which in contrast is causally separable (see subsequent discussion of classical processes). What this points at is a well-known fact, namely that causal separability does not track the distinction between cyclicity and acyclicity on the one hand, nor that between classical and quantum causal order on the other.
For the case of unitary processes things are, however, much simpler.
Theorem 4 A unitary process is causally nonseparable iff it has a cyclic causal structure.
Proof See Methods.
If a unitary process has a causal structure given by an acyclic graph, then it is a unitary comb 54 . Hence a unitary process is either a comb or is causally nonseparable; intermediate possibilities, such as dynamical causal order, cannot arise. Note that there is no classical analogue of Theorem 4, i.e., a classical deterministic process is not necessarily causally nonseparable if it has a cyclic causal structure. The classical SWITCH 2 is again an example that establishes this claim. (See below for an introduction of classical deterministic processes.)
Cyclicity and classical processes. If a process operator is diagonal in a basis that is a product of local bases for the input and output Hilbert spaces at each node, it is equivalent to a classical process 3,12,56 , where each node X is associated with a pair of classical variables X in and X out . Following ref. 35 we call such classical nodes classical split nodes. Classical processes are studied in detail in refs. 12,35,56 . (See also refs. 11,22 .) This section presents the main ideas, and defines (possibly cyclic) classical split-node causal models (CSM). For the most part the definitions are the obvious classical analogues of those for the quantum case. While cyclic classical causal models have sometimes been studied (see, e.g., refs. 66,67 ), for example to encompass the possibility of classical feedback loops, they are not of the split-node variety described here, and are not equivalent.
A classical process, defined over classical split-nodes $X_1, \ldots, X_n$, corresponds to a map $\kappa_{X_1 \ldots X_n} : X_1^{\mathrm{in}} \times X_1^{\mathrm{out}} \times \cdots \times X_n^{\mathrm{in}} \times X_n^{\mathrm{out}} \to [0, 1]$, such that $\sum_{X_1^{\mathrm{in}}, X_1^{\mathrm{out}}, \ldots, X_n^{\mathrm{in}}, X_n^{\mathrm{out}}} \kappa_{X_1 \ldots X_n} \prod_i P(X_i^{\mathrm{out}}|X_i^{\mathrm{in}}) = 1$, for any set of classical channels $\{P(X_i^{\mathrm{out}}|X_i^{\mathrm{in}})\}$. A local intervention at a node $X$, with outcome $k_X$, corresponds to a classical instrument $P(k_X, X^{\mathrm{out}}|X^{\mathrm{in}})$. Given a local intervention at each node, the joint probability distribution over the outcomes is $P(k_{X_1}, \ldots, k_{X_n}) = \sum_{X_1^{\mathrm{in}}, X_1^{\mathrm{out}}, \ldots, X_n^{\mathrm{in}}, X_n^{\mathrm{out}}} \kappa_{X_1 \ldots X_n} \prod_i P(k_{X_i}, X_i^{\mathrm{out}}|X_i^{\mathrm{in}})$.
A special case of a classical process is a deterministic process $\kappa^{f}_{X_1 \ldots X_n}$, for which $P(X_1^{\mathrm{in}}, \ldots, X_n^{\mathrm{in}}|X_1^{\mathrm{out}}, \ldots, X_n^{\mathrm{out}}) = \delta((X_1^{\mathrm{in}}, \ldots, X_n^{\mathrm{in}}), f(X_1^{\mathrm{out}}, \ldots, X_n^{\mathrm{out}}))$, where $f : X_1^{\mathrm{out}} \times \cdots \times X_n^{\mathrm{out}} \to X_1^{\mathrm{in}} \times \cdots \times X_n^{\mathrm{in}}$ is a function. When $f$ is bijective, we call such a process reversible. It was shown in ref. 12 that the set of classical processes over nodes $X_1, \ldots, X_n$ forms a polytope, and that the deterministic polytope, defined as all convex mixtures of deterministic processes, is in general a strict subset of it. While all classical processes on two nodes are causally separable 3 , on three or more nodes there exist classical processes, including deterministic classical processes, that are causally nonseparable; the AF process from ref. 56 , described above, is an example.
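The normalization condition on a classical process can be illustrated on two binary split-nodes. In the sketch below, `kappa` is stored as an array indexed as `[a_in, a_out, b_in, b_out]` and channels as stochastic matrices indexed as `[out, in]`; this layout, the two example processes and all names are assumptions made for illustration. Because the normalization sum is linear in each channel separately and every stochastic matrix is a convex mixture of deterministic ones, it suffices to check the deterministic channels.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

def random_channel(d_in, d_out):
    """Random stochastic matrix P[out, in]; each column sums to one."""
    P = rng.random((d_out, d_in))
    return P / P.sum(axis=0, keepdims=True)

def deterministic_channels(d_in, d_out):
    """All 0/1 stochastic matrices P[out, in], one per function in -> out."""
    for images in product(range(d_out), repeat=d_in):
        P = np.zeros((d_out, d_in))
        for x_in, x_out in enumerate(images):
            P[x_out, x_in] = 1.0
        yield P

def normalisation(kappa, channels):
    """Sum over all variables of kappa times the two local channels."""
    P_A, P_B = channels
    return float(np.einsum("iajb,ai,bj->", kappa, P_A, P_B))

def is_classical_process(kappa):
    """Non-negativity plus normalization for all deterministic channel pairs."""
    dets = list(deterministic_channels(2, 2))
    normalised = all(np.isclose(normalisation(kappa, pair), 1.0)
                     for pair in product(dets, repeat=2))
    return bool(np.all(kappa >= 0) and normalised)

# A precedes B: A's input is drawn from p, B's input copies A's output.
p = np.array([0.3, 0.7])
one_way = np.einsum("i,aj,b->iajb", p, np.eye(2), np.ones(2))

# Two-way loop: each node's input copies the other node's output.
loop = np.einsum("ib,aj->iajb", np.eye(2), np.eye(2))
```

With identity channels at both nodes the loop sums to 2 rather than 1, which is the classical counterpart of a paradoxical closed loop and shows why not every non-negative `kappa` defines a process.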
Definition 9 (CSM-generalized) A CSM is given by: (1) a causal structure represented by a directed graph $G$ with vertices corresponding to classical split-nodes $X_1, \ldots, X_n$, (2) for each $X_i$, a classical channel $P(X_i^{\mathrm{in}}|Pa(X_i)^{\mathrm{out}})$, where $Pa(X_i)$ denotes the set of parents of $X_i$ according to $G$, such that $\kappa_{X_1 \cdots X_n} = \prod_i P(X_i^{\mathrm{in}}|Pa(X_i)^{\mathrm{out}})$ is a classical process over $X_1, \ldots, X_n$.
This definition generalizes that of ref. 35 to include the case of cyclic graphs, and classical split nodes where the input and output variables have different cardinalities. Reference 35 presents detailed discussion of the relationship between (acyclic) CSMs and standard classical causal models 47,48 .
In the classical case, causal structure (defined for unitary processes in the quantum case) can be defined for deterministic processes.
Definition 10 (Causal structure of a deterministic classical process) Given a deterministic process $\kappa^{f}_{X_1 \ldots X_n}$, the causal structure of the process is the directed graph with vertices $X_1, \ldots, X_n$ and an arrow $X_i \to X_j$, whenever $X_j^{\mathrm{in}}$ depends on $X_i^{\mathrm{out}}$ through the function $f$.
Definition 11 (Classical Markov condition-generalized) A process $\kappa_{X_1 \ldots X_n}$ is called Markov for a directed graph $G$ with classical split-nodes $X_1, \ldots, X_n$ as its vertices iff it admits a factorization of the form $\kappa_{X_1 \ldots X_n} = \prod_{i=1}^{n} P(X_i^{\mathrm{in}}|Pa(X_i)^{\mathrm{out}})$, where $Pa(X_i)$ denotes the set of parents of $X_i$ according to $G$.
The following is immediate.
Proposition 2 Every deterministic classical process is Markov for its causal structure.
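The causal structure of a deterministic classical process can be read off mechanically from its function $f$. The sketch below adds an arrow $X_i \to X_j$ whenever changing the output of node $i$ alone changes the input of node $j$ for some assignment, and then tests the resulting graph for a directed cycle. The AF rule for A and B and the particular encoding of the classical SWITCH (root node P emitting a control bit and a target bit, leaf node F with trivial output) are assumptions made for illustration, as are all names.

```python
from itertools import product

def causal_structure(f, domains):
    """Edges (i, j) such that the input of node j depends on the output of node i."""
    n = len(domains)
    edges = set()
    for outs in product(*domains):
        base = f(outs)
        for i in range(n):
            for alt in domains[i]:
                if alt == outs[i]:
                    continue
                changed = f(outs[:i] + (alt,) + outs[i + 1:])
                edges.update((i, j) for j in range(n) if changed[j] != base[j])
    return edges

def has_cycle(edges, n):
    """Depth-first search for a directed cycle on nodes 0..n-1."""
    children = {i: [j for (k, j) in edges if k == i] for i in range(n)}
    state = [0] * n  # 0 = unvisited, 1 = on the current path, 2 = finished

    def visit(v):
        state[v] = 1
        for w in children[v]:
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True
        state[v] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in range(n))

def af_process(outs):
    """Assumed AF rule on nodes (A, B, C) = (0, 1, 2)."""
    a, b, c = outs
    return ((1 - b) & c, a & (1 - c), (1 - a) & b)

def classical_switch(outs):
    """Assumed classical SWITCH on nodes (P, A, B, F) = (0, 1, 2, 3)."""
    (control, target), a, b, _ = outs
    a_in = target if control == 0 else b
    b_in = a if control == 0 else target
    f_in = (control, b if control == 0 else a)
    return (0, a_in, b_in, f_in)

BIT = (0, 1)
AF_EDGES = causal_structure(af_process, [BIT, BIT, BIT])
SWITCH_EDGES = causal_structure(
    classical_switch, [tuple(product(BIT, BIT)), BIT, BIT, (0,)])
```

Both graphs contain directed cycles under these assumptions, which illustrates that cyclicity of the causal structure alone does not distinguish a causally separable deterministic process from a nonseparable one.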
In the case of general (i.e., not necessarily deterministic) classical processes, an account of their relationship to causal structure can be given that again mirrors the quantum case. Let us adopt the provisional approach that causal structure always inheres in deterministic reversible processes (where reversibility here may not be essential, but is assumed to provide a closer analogue to the quantum case in which unitarity is assumed). Then compatibility with a given directed graph can be defined in terms of extension to a reversible deterministic process with latent local noise variables.
Definition 12 (Reversible extendibility) A process $\kappa_{X_1 \cdots X_n}$ is reversibly extendible iff there exists a reversible deterministic process $\kappa^{f}_{X_1 \cdots X_n F \lambda}$ with an additional leaf node $F$ and root node $\lambda$, such that $\kappa_{X_1 \cdots X_n} = \sum_{F^{\mathrm{in}}, \lambda^{\mathrm{out}}} [\kappa^{f}_{X_1 \cdots X_n F \lambda}\, P(\lambda^{\mathrm{out}})]$ for some $P(\lambda^{\mathrm{out}})$.
Definition 13 (Compatibility with a directed graph) A process $\kappa_{X_1 \cdots X_n}$ is compatible with a directed graph $G$ with nodes $X_1, \ldots, X_n$, iff $\kappa_{X_1 \cdots X_n}$ is reversibly extendible to a deterministic process $\kappa^{f}_{X_1 \cdots X_n F \lambda_1 \ldots \lambda_n}$, with an additional leaf node $F$, root nodes $\lambda_i$, and a product distribution …
With Proposition 2, the following analogue of Theorem 2 is straightforward.
Theorem 5 If a classical process $\kappa_{X_1 \cdots X_n}$ is compatible with a directed graph $G$, then it is also Markov for $G$.
As in the quantum case, we leave open whether the converse to Theorem 5 holds.
Hypothesis 2 If a process $\kappa_{X_1 \ldots X_n}$ is Markov for a directed graph $G$, then it is compatible with $G$.
We remark only that Hypothesis 2 is not obviously implied by its quantum counterpart, Hypothesis 1. First, it is not known whether reversible extendibility implies unitary extendibility for a classical process when seen as a special case of a quantum process. Second, even if this is the case, it is still conceivable that while a classical process that is Markov for a given graph may admit unitary extensions with the required no-influence properties when viewed as a quantum process, no such extension may be equivalent to a deterministic classical process for the given preferred basis.
We conclude with the following observation.
Theorem 6 Given a set of classical split nodes $X_1, \ldots, X_n$, the set of reversibly extendible classical processes on $X_1, \ldots, X_n$ coincides with the deterministic polytope.
Proof See Methods.
If Hypothesis 2 holds, then Theorem 6 implies in particular that the process defined by a CSM must always belong to the deterministic polytope. An example of a classical process $\kappa_{X_1 \cdots X_n}$ outside of the deterministic polytope is described in ref. 12 (and denoted $\hat{E}_{\mathrm{ex1}}$ therein). It is not too hard to show that this process is not Markov for any directed graph, hence cannot be the process defined by a CSM, in keeping with Hypothesis 2.

Discussion
This work presented an extension of the framework of quantum causal models from refs. 34,35 to include cyclic causal structures. We showed that the quantum SWITCH, and a process that violates causal inequalities, found by Araújo and Feix and described by Baumeler and Wolf, can be seen as the processes defined by cyclic quantum causal models. We also gave decompositions of any SWITCH-type process and of the unitary extension of the aforementioned process by Araújo and Feix, enabling diagrammatic representations that make the internal causal structures evident. Applications of these results included proofs that any unitarily extendible bipartite process is causally separable, and that any unitary process is cyclic if and only if it is causally nonseparable.
What technically comes as the natural generalization of the framework of acyclic quantum causal models is conceptually a substantial step: allowing causal structure to be cyclic. Taking this extended causal model perspective seriously then offers an alternative view of certain processes: a process that is incompatible with definite causal order may now also be seen to have a well-defined cyclic causal structure. That is to say, admitting a partial order is no longer an essential property of being causal. While processes that violate a causal inequality were previously referred to as noncausal processes, suggesting they cannot be understood causally, at least some of them then do admit a causal understanding.
Note that as far as acyclic causal structures are concerned there also is the earlier framework of QCMs by Costa and Shrapnel from ref. 45 , which is related to, in fact strictly contained in that of refs. 34,35 , which the current work extends. The Markov condition of ref. 45 is a special case of Definition 2, restricted to DAGs for which each node's output space factorizes into as many subsystems as the node has children, with each subsystem only influencing the corresponding child. With this idea of a system per arrow, the process operator $\prod_i \rho_{A_i|Pa(A_i)}$ becomes a tensor product. As a consequence, for essentially the same reason as why Proposition 1 holds, the notion of a QCM from ref. 45 does not admit a nontrivial extension to cyclic directed graphs. The extension of faithful QCMs to cyclic graphs relies on the particular nature of our Markov condition that allows the nontrivial action of pairwise commuting operators $\rho_{A_i|Pa(A_i)}$ to overlap on non-factorizing output spaces.
Although we do not provide the details, we note a further application of the generalized framework: it allows an extended version of the causal discovery algorithm sketched in ref. 35 (inspired in turn by the first of its kind in ref. 50 ). While the version in ref. 35 takes a process operator as input, and outputs DAGs as candidate causal explanations, where possible at all, the extended version can discover and output cyclic causal structures. The basic steps of the algorithm in ref. 35 largely remain the same, but, for instance, the algorithm no longer halts when encountering a cyclic graph $G_\sigma$ that encodes the direct signalling relations between pairs of nodes of the given process $\sigma$. Instead, Markovianity for such a cyclic $G_\sigma$ can still be checked to establish whether $G_\sigma$ is a plausible causal explanation.
One of the main questions left open is the validity of our hypothesis that Markovianity implies compatibility for cyclic graphs, which would generalize one of the main results established for the acyclic case in ref. 35 . The validity of this hypothesis has consequences, which we spell out as follows.
Reference 51 , in motivating the study of unitary extendibility of processes, includes the suggestion that unitary extendibility should be regarded as a necessary condition for a process to be realizable in nature. Here, the meaning of 'realizable' is a little vague, but might be taken, for example, to include exotic scenarios involving gravity as well as the time-delocalized sense discussed above in which some processes have been realized in the laboratory. (It does not include realization via postselection, since it is known that all processes can be realized under a suitable postselection 10,13,27,68 .) The suggestion would hold if all processes, once sufficient systems are included, are unitary at the most fundamental level.
Alternatively, under the assumption that the process operator framework provides the most general description of the possible correlation between quantum systems, in non-postselected scenarios, one may speculate that a necessary condition for a process to be realizable in nature is that it can arise from a QCM. Here, 'arise' means that there is a QCM with process $\sigma'$ such that $\sigma$ can be obtained from $\sigma'$ by inserting channels at some of the nodes of $\sigma'$ and marginalizing over them. The idea is that any correlations described by such a process admit a causal explanation, albeit one that may involve cycles. On the other hand, any process that cannot arise from a QCM in this manner describes correlations that are not amenable to an understanding in causal terms.
The connection with unitary extendibility is that any process that is unitarily extendible has the property that it can arise from a QCM. Furthermore, if Hypothesis 1 holds, then any process that is not unitarily extendible cannot arise from a QCM. Hence if Hypothesis 1 holds, the speculation above coincides with the suggestion of ref. 51 .
If Hypothesis 1 fails, there is a peculiar class of cyclic quantum causal models, in which the process is Markov for the graph but not compatible with the graph. There then are two logically conceivable options: one may insist on the notion of compatibility as the essential concept for giving causal explanations, turning the Markov condition into a necessary but insufficient condition; alternatively, one could insist on the Markov condition as the essential concept for giving causal explanations, turning the current notion of compatibility into a sufficient but not necessary condition. We leave open the question whether any meaning can be given to the arrows of the graph in this case, given that there is no suitable unitary extension to define causal relations, and whether such processes might be realizable or not.
Beyond establishing the hypothesis, future work might study the extent to which other core results of the framework of quantum causal models in the acyclic case, such as the d-separation theorem 35 , can be generalized in an appropriate way to the cyclic case, as has been done for the classical framework (see, e.g., ref. 67 ).
Finally, one of the most promising avenues for future work is the general idea behind the above causal decompositions of our example processes together with Theorem 3: to derive further causal decompositions of unitary transformations U, as started in ref. 63, and then study the interplay between the discovered algebraic structure and the condition that U defines a valid unitary process when identifying in- and output spaces of U as the out- and input spaces of quantum nodes. We expect this mathematical tool to lead to insights into which unitarily extendible processes are causally nonseparable and how the cyclicity is distributed, mathematically speaking, across the process, with possible hints for the process' physical realizability.

Methods
Characterisation of process operators. In order to state necessary and sufficient conditions for an operator to be a valid process operator, the following will be useful. Let $\{\eta^X_l\}_{l=0}^{d_X^2-1}$ denote a Hilbert-Schmidt (HS) basis for $\mathcal{L}(\mathcal{H}_X)$, i.e., a set of operators that are orthonormal with respect to the HS inner product and, in addition, traceless for all $l = 1, \ldots, d_X^2 - 1$, while $\eta^X_0$ is proportional to the identity operator. For the bipartite case, consider the expansion of an operator in the corresponding product basis, with $l_1$ labelling the basis element on $A^{in}$ and $l_2$, $l_3$, $l_4$ those on the remaining three spaces. A term of type $A^{in}$ in the expansion is a summand with non-trivial action only on $A^{in}$, i.e., $l_1 \neq 0$ and $l_2 = l_3 = l_4 = 0$. Similarly for types $A^{in}B^{out}$ etc. It was shown in ref. 3 that $\sigma$ being a bipartite process operator is equivalent to $\sigma \geq 0$, $\mathrm{Tr}[\sigma] = d_{A^{out}} d_{B^{out}}$ and the condition that in a HS basis expansion, in addition to a term proportional to the identity operator on all four spaces, only the coefficients of terms of the types $A^{in}$, $B^{in}$, $A^{in}B^{in}$, $A^{in}B^{out}$, $A^{out}B^{in}$, $A^{in}A^{out}B^{in}$ and $A^{in}B^{in}B^{out}$ may be non-vanishing. These conditions were generalized to an arbitrary number $n$ of nodes in ref. 7 and can easily be stated as (1) $\sigma \geq 0$, (2) $\mathrm{Tr}[\sigma] = \prod_{i=1}^{n} d_{A_i^{out}}$ and (3) that in a HS basis expansion the only non-vanishing terms, apart from an overall identity operator, are of a type such that there is at least one node, say $A_i$, on whose out-space, $A_i^{out}$, the action is trivial, but on whose in-space, $A_i^{in}$, the action is non-trivial. Equivalent conditions were presented in ref. 6, where the projector onto the linear subspace of process operators was defined explicitly, giving a basis-independent characterization.
Proof of Proposition 1. Suppose a bipartite cyclic QCM is given by the (unique) cyclic graph $G$ with two nodes $A$ and $B$ from Fig. 2a and a process $\sigma_{AB} = \rho_{A|B}\,\rho_{B|A}$, Markov for $G$. It follows that $\sigma_{AB} = \rho_{B|A} \otimes \rho_{A|B}$, as both factors act on distinct Hilbert spaces. Now suppose that this is a faithful QCM, i.e., both channels $\rho_{A|B}$ and $\rho_{B|A}$ are signalling channels. One way to see that this contradicts the assumption that $\sigma_{AB}$ is a valid process is by analyzing the non-vanishing types of terms in an expansion of $\sigma_{AB}$ relative to a HS product basis (see above). If signalling from $B^{out}$ to $A^{in}$ is possible in $\rho_{A|B}$, then an expansion of just $\rho_{A|B}$ has to contain a non-vanishing term of type $A^{in}B^{out}$. Similarly, if signalling from $A^{out}$ to $B^{in}$ is possible in $\rho_{B|A}$, then an expansion of $\rho_{B|A}$ has to contain a non-vanishing term of type $B^{in}A^{out}$. Consequently, $\sigma_{AB}$ has to contain a non-vanishing term of type $A^{in}B^{out}B^{in}A^{out}$, which is forbidden for a process operator 3.
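The term-type criterion used in this proof can be checked numerically in a small case. The following is a minimal sketch, assuming qubit systems, the (unnormalised) Pauli operators as HS basis, and the tensor-factor order $A^{in}$, $A^{out}$, $B^{in}$, $B^{out}$. The function names and the two example operators are illustrative choices, not taken from the article, and the Choi operator of the identity channel is written without the transposition conventions of the main text, which do not affect which term types occur.

```python
import itertools
import numpy as np

# Pauli operators: an orthogonal (not normalised) HS basis for one qubit,
# with index 0 the identity and indices 1..3 traceless.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
SPACES = ("A_in", "A_out", "B_in", "B_out")  # factor order: a convention chosen here


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


def term_types(sigma, tol=1e-9):
    """Types (sets of spaces acted on non-trivially) with non-zero coefficient."""
    types = set()
    for idx in itertools.product(range(4), repeat=4):
        coeff = np.trace(kron_all([PAULIS[i] for i in idx]) @ sigma) / 16
        if abs(coeff) > tol:
            types.add(frozenset(s for s, i in zip(SPACES, idx) if i != 0))
    return types


def type_allowed(term_type):
    """Identity term, or some node with non-trivial in-space and trivial out-space."""
    if not term_type:
        return True
    return any(f"{n}_in" in term_type and f"{n}_out" not in term_type for n in ("A", "B"))


def is_process_operator(sigma, tol=1e-9):
    """Check positivity, trace d_Aout * d_Bout = 4, and the allowed term types."""
    hermitian = np.allclose(sigma, sigma.conj().T, atol=tol)
    positive = hermitian and np.linalg.eigvalsh(sigma).min() > -tol
    normalised = abs(np.trace(sigma) - 4) < tol
    return bool(positive and normalised and all(type_allowed(t) for t in term_types(sigma, tol)))


# Choi operator of the qubit identity channel: (1/2) * sum_k s_k P_k (x) P_k.
SIGNS = (1, 1, -1, 1)
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)

# Fixed order A -> B: state on A_in, identity channel A_out -> B_in, identity on B_out.
fixed_order = sum(s / 2 * kron_all([rho0, P, P, PAULIS[0]]) for s, P in zip(SIGNS, PAULIS))

# Two-node cycle: identity channels A_out -> B_in and B_out -> A_in.
cyclic = sum(
    sk * sm / 4 * kron_all([Pm, Pk, Pk, Pm])
    for sk, Pk in zip(SIGNS, PAULIS)
    for sm, Pm in zip(SIGNS, PAULIS)
)

print(is_process_operator(fixed_order))  # True
print(is_process_operator(cyclic))       # False
```

The cyclic example is positive and correctly normalised; it fails only because its expansion contains terms acting non-trivially on all four spaces, which is the obstruction identified in the proof.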
Product of commuting operators not necessarily a process operator. As established by Proposition 1, not all cyclic graphs support a faithful cyclic QCM. Here we show that, given a cyclic graph $G$ that does support a faithful cyclic QCM, it is not true that any product of commuting operators $\prod_i \rho_{A_i|Pa(A_i)}$, with parental sets as in $G$, constitutes a process operator. Consider for instance the graph $G$ in Fig. 2b (and see the discussion below Definition 8 for an example of a faithful cyclic QCM over $G$). Letting the three nodes $A$, $B$ and $C$ be classical split nodes, with classical bits $A^{in}$, $A^{out}$, $B^{in}$, $B^{out}$, $C^{in}$ and $C^{out}$, define classical channels as in Eqs. (29)-(30). It is easy to see that the signalling relations through the channels $P(A^{in}|B^{out}, C^{out})$ and $P(B^{in}|A^{out}, C^{out})$ are indeed as in Fig. 2b. At the same time, for any choice of probability distribution $P(C^{in})$, the product $P(A^{in}|B^{out}, C^{out})\,P(B^{in}|A^{out}, C^{out})\,P(C^{in})$ cannot be a classical process: consider an intervention at $C$ which fixes $C^{out}$ to be 0; then $P(A^{in}|B^{out}, 0)\,P(B^{in}|A^{out}, 0)$ is still a product of two signalling classical channels, which (seeing them as special cases of quantum channels) was already established in the proof of Proposition 1 to be in contradiction with being a process. This establishes the claim.
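The last step of this argument, that a product of two signalling classical channels arranged in a cycle is not a process, can be illustrated by brute force. The sketch below does not use the channels of Eqs. (29)-(30); it assumes two hypothetical bit-copying channels and checks whether the total probability equals 1 for every choice of deterministic local operations, as it must for a valid classical process.

```python
import itertools


def total_probability(chan_a, chan_b, op_a, op_b):
    """Sum over all bits of P(a_in|b_out) P(b_in|a_out) P(a_out|a_in) P(b_out|b_in)."""
    return sum(
        chan_a(a_in, b_out) * chan_b(b_in, a_out) * op_a(a_out, a_in) * op_b(b_out, b_in)
        for a_in, a_out, b_in, b_out in itertools.product((0, 1), repeat=4)
    )


def copy_bit(output, given):
    """Deterministic conditional distribution: output equals the conditioning bit."""
    return 1.0 if output == given else 0.0


def flip_bit(output, given):
    """Deterministic conditional distribution: output is the negated conditioning bit."""
    return 1.0 if output != given else 0.0


def constant_zero(output, given):
    """Non-signalling channel: output is 0 whatever the conditioning bit."""
    return 1.0 if output == 0 else 0.0


# Cycle of two signalling channels: normalisation depends on the local operations.
print(total_probability(copy_bit, copy_bit, copy_bit, copy_bit))  # 2.0
print(total_probability(copy_bit, copy_bit, copy_bit, flip_bit))  # 0.0

# One-way signalling (A before B): normalised for every pair of local operations.
print(total_probability(constant_zero, copy_bit, copy_bit, flip_bit))  # 1.0
```

In the cyclic case the local operations either admit two consistent assignments or none, so no normalisation can turn the product into a process, whereas the one-way case always has exactly one consistent assignment.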
Proof of Theorem 3. Suppose the bipartite quantum process operator $\sigma_{AB}$ is unitarily extendible. Consider an arbitrary unitary extension of it, $\sigma_{ABFP} = \rho^U_{ABF|ABP}$. From Eq. (14) it follows that the reduced process obtained by tracing out $F^{in}$ has the form [...], where each term of the decomposition is taken as an operator on the whole space, acting as the zero map on all but the $i$th subspace. Note that from $\sigma_{ABP}$ being a process operator it follows that feeding in any $\tau_P \in \mathcal{L}(\mathcal{H}^*_{P^{out}})$ gives a quantum process operator on the nodes $A$ and $B$. Let $i \in I$ be some fixed index and suppose that through the channel $\rho_{A|BP^L_i}$ system $B^{out}$ can signal to $A^{in}$ and similarly, through the channel $\rho_{B|AP^R_i}$, system $A^{out}$ can signal to $B^{in}$. Then there exists an appropriate state $\tau_P$, which has support only on the $i$th subspace and which is of a product form $\gamma_{P^L_i} \otimes \phi_{P^R_i}$, such that in [...] both the marginal channel on the left is signalling from $B^{out}$ to $A^{in}$ and the one on the right from $A^{out}$ to $B^{in}$. Since the expression in Eq. (32) has to give a process operator over $A$ and $B$, this yields a contradiction due to Prop. 1. Hence, for each $i$ at most one of the channels $\rho_{A|BP^L_i}$ and $\rho_{B|AP^R_i}$ allows signalling from $B^{out}$ to $A^{in}$ or from $A^{out}$ to $B^{in}$, respectively. By assumption there exists an appropriate $\tau_P \in \mathcal{L}(\mathcal{H}^*_{P^{out}})$ such that [...]. By the above analysis, it also follows that each summand in Eq. (33) has to be a process operator up to normalization. Since they sum up to a process operator, the inverses of the normalization constants have to form a probability distribution and one can therefore write $\sigma_{AB} = \sum_i p_i\, \sigma^{(i)}_{AB}$, where each $\sigma^{(i)}_{AB}$ is a process operator with at most $A$ signalling to $B$ or vice versa. This is the form of a bipartite causally separable process operator.
Note further that if $\rho_{A|BP^L_i}$ is non-signalling from $B^{out}$ to $A^{in}$, then in $V_i$ there is no influence from $B^{out}$ to $A^{in}$, and similarly, if $\rho_{B|AP^R_i}$ is non-signalling from $A^{out}$ to $B^{in}$, then in $W_i$ there is no influence from $A^{out}$ to $B^{in}$. Therefore, the above constraints mean that each term $V_i \otimes W_i$ in Eq. (14) corresponds to a process over nodes including $A$ and $B$ that allows signalling in at most one direction between $A$ and $B$. The latter always admits an implementation as a unitary circuit fragment with nodes $A$ and $B$ in a fixed order 54. Since the full unitary $U$ of the unitary extension is a direct sum of such fixed-order unitary processes taking place in the different orthogonal subspaces, and every operation at the nodes $A$ and $B$ can be dilated to a unitary, the full unitary process $\sigma_{ABFP} = \rho^U_{ABF|ABP}$ can be realized by coherently conditioning which of the corresponding fixed-order unitary circuits takes place on the logical value of some control $n$-level quantum system, where $n$ is the number of different subspaces. Note that since the systems involved in the fixed-order circuits may have different dimensions, this implementation in practice may require bringing in different systems depending on the control variable $i$, but this can always be seen as part of a process on a larger system of a fixed dimension. Moreover, the fixed-order processes in the different orthogonal subspaces can be grouped into two sets: one in which $A$ is before $B$ and another one in which $B$ is before $A$. This allows embedding the process into another one where one of two possible circuits (in which $A$ and $B$ occur in different orders) is applied in a coherently controlled fashion based on the logical value of a control qubit, similarly to the quantum SWITCH. This yields another possible unitary extension $\sigma_{AB\tilde{F}\tilde{P}}$ of the original bipartite process, where $\tilde{F}^{in}$ and $\tilde{P}^{out}$ would contain $F^{in}$ and $P^{out}$, respectively, as subspaces. The originally assumed unitary extension $\sigma_{ABFP}$ can then be seen to take place effectively as part of $\sigma_{AB\tilde{F}\tilde{P}}$.
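The coherent control of two fixed-order circuits by a control qubit can be illustrated with a small numerical sketch. It assumes that the operations at the two nodes are fixed single-qubit unitaries acting on one target qubit, so it shows only the direct-sum (block-diagonal) structure of such a controlled-order unitary and not the process operator $\sigma_{AB\tilde{F}\tilde{P}}$ itself. The function name and the choice of Pauli operations are illustrative.

```python
import numpy as np


def quantum_switch(u_a, u_b):
    """Block-diagonal unitary: control 0 applies A then B, control 1 applies B then A."""
    p0 = np.diag([1.0, 0.0]).astype(complex)
    p1 = np.diag([0.0, 1.0]).astype(complex)
    return np.kron(p0, u_b @ u_a) + np.kron(p1, u_a @ u_b)


X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

switch = quantum_switch(X, Z)

# The direct sum of two fixed-order unitaries is itself unitary.
is_unitary = np.allclose(switch.conj().T @ switch, np.eye(4))

# Control in |+>, target in |0>: both orders are applied in superposition.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
target = np.array([1, 0], dtype=complex)
state = switch @ np.kron(plus, target)

# Probability of finding the control in |->. For anticommuting operations
# such as X and Z the two orders differ by a sign, so this equals 1.
proj_minus = np.kron(np.outer(minus, minus.conj()), np.eye(2))
prob_minus = float(np.real(state.conj() @ proj_minus @ state))

print(is_unitary)
print(round(prob_minus, 12))
```

The two diagonal blocks play the role of the two groups of fixed-order circuits described above, one with $A$ before $B$ and one with $B$ before $A$.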
Proof of Theorem 4. The proof of Theorem 4 below uses the following two concepts. First, generalizing the notion of a process being unitary, a process is called isometric if its induced channel from the output systems of all nodes to the input systems of all nodes arises from an isometry. Second, a quantum comb, as defined in ref. 54 (provided first input and last output system are trivial), is a special kind of quantum process: a process $\sigma_{A_1 \ldots A_n}$ over $n$ quantum nodes is, for the given total order of its nodes $A_1, \ldots, A_n$, a quantum comb (an $(n+1)$-comb) iff [...]
Proof of Theorem 4. Let $\sigma_{A_1 \ldots A_n}$ be a unitary process. The following will establish what is equivalent to Theorem 4, namely that acyclicity of its causal structure is equivalent to $\sigma_{A_1 \ldots A_n}$ being causally separable. First, suppose $\sigma_{A_1 \ldots A_n}$ has an acyclic causal structure. There then exists a total order of the quantum nodes $A_1, \ldots, A_n$ (appropriately relabelled) such that $A_j \nrightarrow A_i$ $\forall j \geq i$ (see Def. 4). This implies that the conditions in Eqs. (34)-(35) are satisfied (note that $d_{A^{out}}$ [...]). Hence, $\sigma_{A_1 \ldots A_n}$ is a quantum comb 54. Such a process is a special case of a causally separable process since in a quantum comb there can be no signalling from $\{A_{j+1}, \ldots, A_n\}$ to $\{A_1, \ldots, A_j\}$ for any $j = 1, \ldots, n-1$, and this remains true under extending the process with arbitrary shared input ancillary states.
For the converse direction, suppose the unitary process $\sigma_{A_1 \ldots A_n}$ is causally separable. In order to show that it then has an acyclic causal structure we will prove that it is a quantum comb. In fact we will prove the following more general statement concerning isometric processes, which gives the claim as a special case.
Lemma 1. Every causally separable isometric process is a quantum comb.
Proof of Lemma 1. The main idea of the following proof is the observation that the process operator of an isometric process is proportional to a rank-1 projector and hence cannot be written as a nontrivial convex mixture of different positive semi-definite operators. The proof proceeds by induction.
An isometric process over one single node is a 2-comb. Assume that all causally separable isometric processes on $n$ nodes are quantum combs. Let $\sigma_{A_1 \ldots A_{n+1}}$ be an isometric process over $n + 1$ nodes, which is causally separable. Let us extend it by adding auxiliary input systems for all $n + 1$ nodes with the following pure state shared among them:
$$|\Psi\rangle = \bigotimes_{1 \leq i < j \leq n+1} |\phi^+\rangle_{ij}, \qquad (36)$$
where each $|\phi^+\rangle_{ij} = \frac{1}{\sqrt{n!}} \sum_{l=1}^{n!} |l\rangle |l\rangle$ is a maximally entangled state, shared between node $A_i$ and node $A_j$. Thus, $|\Psi\rangle$ is a tensor product of $\frac{1}{2} n(n+1)$ maximally entangled bipartite states, such that every pair of nodes indexed by $(i, j)$ shares one such state of Schmidt rank $n!$. Using the notation of Definition 8, $\tilde{\sigma} := \sigma \otimes |\Psi\rangle\langle\Psi|$ is an extended process over the extended nodes $\tilde{A}_1, \ldots, \tilde{A}_{n+1}$, with $|\Psi\rangle \in \bigotimes_{i=1}^{n+1} \mathcal{H}_{(A'_i)^{in}}$, where each $\mathcal{H}_{(A'_i)^{in}}$ is an $n$-fold tensor product of $(n!)$-dimensional systems.
By assumption, $\tilde{\sigma}$ is causally separable, too, while it also is proportional to a rank-1 projector. In the decomposition as in Eq. (27), implied by causal separability, there therefore is only one summand. Hence, there exists one node, let this be $\tilde{A}_1$ (for an appropriate relabelling), such that $\tilde{A}_2, \ldots, \tilde{A}_{n+1}$ cannot signal to $\tilde{A}_1$ and for all CP maps $\tau_{\tilde{A}_1}$ at that node the conditional process $\tilde{\sigma}|_{\tau_{\tilde{A}_1}}$ is causally separable. Now consider a CP map such that $\tau_{\tilde{A}_1} = |\tau\rangle\langle\tau|_{\tilde{A}_1}$ itself is a rank-1 projector. The process operator $\tilde{\sigma}|_{\tau_{\tilde{A}_1}}$ then still is proportional to a rank-1 projector and, hence, represents an isometric process on the remaining $n$ nodes $\tilde{A}_2, \ldots, \tilde{A}_{n+1}$. As argued above it also is causally separable. By the induction hypothesis, such an isometric, causally separable process $\tilde{\sigma}|_{\tau_{\tilde{A}_1}}$ on $n$ nodes is then a quantum comb.
Notice first that if there is no signalling to $\tilde{A}_1$ from all other nodes in the extended process $\tilde{\sigma}$, then there is no signalling to $A_1$ from all other nodes in the original process $\sigma$. Consider $\tau_{\tilde{A}_1} = \tau_{A_1} \otimes |\phi\rangle\langle\phi|$, where $|\phi\rangle\langle\phi|$ is a projector on the ancillary input system $(A'_1)^{in}$ and $\tau_{A_1} = |\tau\rangle\langle\tau|_{A_1}$ has rank 1. Since projecting the ancillary systems via $|\phi\rangle\langle\phi|$ leaves the ancillary systems on the remaining nodes in some pure state $|\Phi\rangle\langle\Phi|$, the conditional process on the remaining nodes has the form $\sigma|_{\tau_{A_1}} \otimes |\Phi\rangle\langle\Phi|$. Since the latter is a quantum comb for every $|\tau\rangle\langle\tau|_{A_1}$, so must be $\sigma|_{\tau_{A_1}}$. There are $n!$ different possible total orders of the nodes, given by $A_{\pi(2)}, \ldots, A_{\pi(n+1)}$ for $\pi$ being one of the $n!$ different permutations of $2, \ldots, n + 1$. We will now show (by contradiction) that there exists a reordering $A_{\pi(2)}, \ldots, A_{\pi(n+1)}$ with which the quantum comb $\sigma|_{\tau_{A_1}}$ is compatible for any choice of $|\tau\rangle\langle\tau|_{A_1}$.
Suppose there does not exist one such appropriate total order. Then for every permutation $\pi$, there exists $\tau^\pi_{A_1} := |\tau^\pi\rangle\langle\tau^\pi|_{A_1}$, such that the corresponding quantum comb $\sigma|_{\tau^\pi_{A_1}}$ is incompatible with the total order of the remaining nodes defined by $\pi$. Let $C^\pi_l(\sigma) = 0$ for $l = 1, \ldots, n$ be the linear constraint corresponding to the $l$th condition in Eqs. (34)-(35) for a process operator $\sigma$ over $n$ nodes to be a valid quantum comb for the total order $\pi$.
Consider a process operator $\bar{\sigma} := \sum_\pi q_\pi\, \sigma|_{\tau^\pi_{A_1}}$, where $q_\pi \geq 0$ $\forall \pi$, and $\sum_\pi q_\pi = 1$ (letting $\pi$ be both a permutation and an index enumerating those permutations). By construction, for every $\pi$ at least one of the conditions in $\{C^\pi_l(\sigma|_{\tau^\pi_{A_1}}) = 0\}_{l=1}^n$ fails. Therefore, one can then choose the weights $q_\pi$ such that for every $\pi$ the process operator $\bar{\sigma}$ violates at least one of the constraints $\{C^\pi_l(\bar{\sigma}) = 0\}_{l=1}^n$, establishing that $\bar{\sigma}$ is not a quantum comb for any possible order of the $n$ nodes. More precisely, by linearity, the condition that $\bar{\sigma}$ respects the constraints $\{C^\pi_l(\bar{\sigma}) = 0\}_{l=1}^n$ for a given $\pi$ can be written as $\sum_{\pi'} q_{\pi'}\, C^\pi_l(\sigma|_{\tau^{\pi'}_{A_1}}) = 0$, $l = 1, \ldots, n$, which implies that $(q_1, \ldots, q_{n!})$, viewed as a point in an $(n!)$-dimensional Euclidean space, must belong to a specific hyperplane in that space. Our assumption that at least one of $C^\pi_l(\sigma|_{\tau^\pi_{A_1}})$ must be nonzero makes it a proper hyperplane. Then, in order for $\bar{\sigma}$ to be compatible with the quantum-comb conditions for at least one $\pi$, the point $(q_1, \ldots, q_{n!})$ must belong to the union of the hyperplanes corresponding to the different values of $\pi$. Since this is a finite set of hyperplanes, it is possible to find (a continuum of) points in the positive orthant that are outside of this union. Since rescaling $(q_1, \ldots, q_{n!})$ by a constant factor, which amounts to rescaling $\bar{\sigma}$ by a constant factor, does not change the fact of whether any of the above constraints is violated or not, there exists a $(q_1, \ldots, q_{n!})$ with the required properties, such that $\bar{\sigma}$ is not a quantum comb for any total order $\pi$.
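The geometric step, that a finite union of proper hyperplanes cannot cover the positive orthant, can be made constructive. The sketch below is one possible way to find such a point and is not the construction used in the article: it assumes each hyperplane is given by a hypothetical non-zero normal vector with exact (integer or rational) entries, and it searches along the curve $(1, t, t^2, \ldots)$, on which each hyperplane vanishes for at most finitely many $t$ because a non-zero polynomial has finitely many roots.

```python
from fractions import Fraction


def point_off_hyperplanes(normals):
    """Return a normalised point with positive entries lying on none of the hyperplanes.

    Each normal c defines the hyperplane {q : sum_k c[k] * q[k] = 0}. On the
    curve q(t) = (1, t, ..., t**(dim - 1)) this becomes a non-zero polynomial
    of degree < dim, which has at most dim - 1 roots, so trying
    len(normals) * (dim - 1) + 1 distinct values of t is enough.
    """
    dim = len(normals[0])
    for t in range(1, len(normals) * (dim - 1) + 2):
        q = [Fraction(t) ** k for k in range(dim)]
        if all(sum(c * x for c, x in zip(normal, q)) != 0 for normal in normals):
            total = sum(q)
            return [x / total for x in q]
    raise ValueError("some normal vector is zero, so its hyperplane is not proper")


# Hypothetical normals in a 3-dimensional weight space.
normals = [[1, -1, 0], [0, 1, -1], [1, 0, -1], [1, -2, 1]]
q = point_off_hyperplanes(normals)
print(q)  # [Fraction(1, 7), Fraction(2, 7), Fraction(4, 7)]
```

The returned weights are strictly positive and sum to 1, matching the requirements on $(q_1, \ldots, q_{n!})$ in the argument above.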
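We will now use this fact to construct the contradiction with the assumption that there is no single order $\pi$ with which all isometric quantum combs $\sigma|_{\tau_{A_1}}$ are compatible. To this end, we will first show that, starting from our extended process $\tilde{\sigma} = \sigma \otimes |\Psi\rangle\langle\Psi|$, for any $j \in \{2, \ldots, n + 1\}$ it is possible to apply a suitable CP map $|\tau\rangle\langle\tau|_{\tilde{A}_1}$ such that this yields a conditional process of the form $\tilde{\sigma}|_{\tau_{\tilde{A}_1}} = |\bar{\sigma}_j\rangle\langle\bar{\sigma}_j| \otimes |\Phi\rangle\langle\Phi|_{\mathrm{rest}'^{in}}$, where $|\bar{\sigma}_j\rangle = \sum_\pi \sqrt{q_\pi}\, |\pi\rangle_{a_j} |\sigma|_{\tau^\pi_{A_1}}\rangle$ with, recalling Eq. (36), $\mathcal{H}_{a_j}$ the factor of $\mathcal{H}_{(A'_j)^{in}}$ sharing the state $|\phi^+\rangle_{1j}$ with $\mathcal{H}_{a_1}$ of the node $A_1$ and $|\sigma|_{\tau^\pi_{A_1}}\rangle$ the vector representing the rank-1 operator $\sigma|_{\tau^\pi_{A_1}}$, the conditional process on the remaining $n$ of the original $n + 1$ nodes, and where $|\Phi\rangle\langle\Phi|_{\mathrm{rest}'^{in}}$ is some pure state on the remaining auxiliary input systems (i.e. $|\Phi\rangle_{\mathrm{rest}'^{in}}$ is in $\bigotimes_{i \neq 1} \mathcal{H}_{(A'_i)^{in}}$ excluding the subfactor $\mathcal{H}_{a_j}$). To see this, let $j \neq 1$. If we apply a CP map of the form $|\tau\rangle\langle\tau|_{\tilde{A}_1} = |\chi\rangle\langle\chi|_{a_1 A_1} \otimes |\phi\rangle\langle\phi|_{\mathrm{rest}\,\tilde{A}_1}$, where $|\chi\rangle = \sum_{\pi=1}^{n!} \sqrt{\epsilon_\pi}\, |\pi\rangle_{a_1} |\tau^\pi\rangle_{A_1}$, and $|\phi\rangle\langle\phi|_{\mathrm{rest}\,\tilde{A}_1}$ is some projector on the remaining ancillary input systems in $(A'_1)^{in}$, then we will obtain a conditional process of the form $\tilde{\sigma}|_{\tau_{\tilde{A}_1}} = |\sigma_j\rangle\langle\sigma_j| \otimes |\Phi\rangle\langle\Phi|_{\mathrm{rest}'^{in}}$, with $|\sigma_j\rangle = \frac{1}{\sqrt{\sum_{\pi=1}^{n!} \epsilon_\pi \gamma_\pi}} \sum_{\pi=1}^{n!} \sqrt{\epsilon_\pi \gamma_\pi}\, |\pi\rangle_{a_j} |\sigma|_{\tau^\pi_{A_1}}\rangle$, where $\gamma_\pi := \mathrm{Tr}[(|\tau^\pi\rangle\langle\tau^\pi|_{A_1})\, \sigma]$. Therefore, by choosing $\epsilon_\pi = q_\pi/(c\gamma_\pi)$, for some large enough constant $c$ to ensure that $|\tau\rangle\langle\tau|_{\tilde{A}_1}$ is appropriately normalised to represent a CP map, we can make $|\sigma_j\rangle = |\bar{\sigma}_j\rangle$ as desired. (Note that $\forall \pi$, $\gamma_\pi \neq 0$ since $\mathrm{Tr}_{A_1}[(|\tau^\pi\rangle\langle\tau^\pi|_{A_1})\, \sigma]$ is proportional to a process operator on the remaining $n$ nodes, the trace over which gives $\prod_{i=2}^{n+1} d_{A_i^{out}}$.)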
By our main assumption, the $n$-node process $\tilde{\sigma}|_{\tau_{\tilde{A}_1}} = |\bar{\sigma}_j\rangle\langle\bar{\sigma}_j| \otimes |\Phi\rangle\langle\Phi|_{\mathrm{rest}'^{in}}$ must be a quantum comb, and since $|\Phi\rangle\langle\Phi|_{\mathrm{rest}'^{in}}$ is just a state on some input systems, $|\bar{\sigma}_j\rangle\langle\bar{\sigma}_j|$ must also be a quantum comb (on the nodes $A_i \neq A_1$, $i \neq j$, and the node $A_j$ extended via the ancillary input system $a_j$). But tracing out the system $a_j$ from the latter quantum comb must also yield a quantum comb on the nodes $A_i \neq A_1$, which can easily be seen from the quantum-comb conditions. However, by construction, $\mathrm{Tr}_{a_j} |\bar{\sigma}_j\rangle\langle\bar{\sigma}_j| = \bar{\sigma}$, where $\bar{\sigma}$ is not supposed to be a quantum comb, which is a contradiction. Therefore, there must exist a total order $\pi$, such that $\sigma|_{\tau_{A_1}}$ is a quantum comb compatible with $\pi$ for every rank-1 $\tau_{A_1}$. By the convexity of the set of $n$-node operators that are quantum combs compatible with $\pi$, this automatically extends to all CP maps $\tau_{A_1}$.
So far we have shown that the process $\sigma$ is such that there is a node $A_1$ to which the rest of the nodes cannot signal, and the remaining nodes can be put in a total order $A_2, \ldots, A_{n+1}$, such that for every CP map $\tau_{A_1}$, the conditional process $\sigma|_{\tau_{A_1}}$ is a quantum comb compatible with that order. Now observe that this implies that the full process $\sigma$ is a quantum comb compatible with the total order $A_1, A_2, \ldots, A_{n+1}$. Since for all possible CP maps $\tau_{A_1}$ it holds that $C_l(\sigma|_{\tau_{A_1}}) = 0$ for $l = 2, \ldots, n + 1$, it follows from the linearity of these constraints that the corresponding quantum comb conditions hold for $\sigma$, i.e., $C_l(\sigma) = 0$ for $l = 2, \ldots, n + 1$. Finally, that $C_1(\sigma) = 0$ holds follows from just $\sigma$ being a process, since it is equivalent to the statement that if in $\sigma$ we trace out all of the nodes $A_2, \ldots, A_{n+1}$, we should be left with, up to normalization, a valid single-node process on $A_1$ (ref. 3). Therefore, the isometric process $\sigma$ on $n + 1$ nodes is a quantum comb, too, which completes the proof of Lemma 1 and thereby also that of Theorem 4.
Proof of Theorem 6. First, suppose $\kappa_{X_1 \ldots X_n}$ is a reversibly extendible process, that is, there exists a reversible deterministic process $\kappa^g_{X_1 \ldots X_n \lambda F}$ for some bijection $g: X_1^{out} \times \cdots \times X_n^{out} \times \lambda^{out} \to X_1^{in} \times \cdots \times X_n^{in} \times F^{in}$, such that
$$\kappa_{X_1 \ldots X_n} = \sum_{\lambda^{out}, F^{in}} \kappa^g_{X_1 \ldots X_n \lambda F}\, P(\lambda^{out}) \qquad (37)$$
for some probability distribution $P(\lambda^{out})$. It follows from the fact that $\kappa^g_{X_1 \ldots X_n \lambda F}$ is a classical process that marginalization as in Eq. (37) has to yield a classical process over nodes $X_1, \ldots, X_n$ for arbitrary distributions $P(\lambda^{out})$, in particular for every point-distribution. Hence, for every value $\lambda'$ of $\lambda^{out}$, the induced function $g_{\lambda'}(\cdot) := g(\cdot, \lambda')$ has to define a deterministic process for $n + 1$ nodes and furthermore, also once marginalizing over $F$ it still has to be a deterministic process for the $n$ nodes $X_1, \ldots, X_n$. Hence, Eq. (37) can be read as establishing that the given $\kappa_{X_1 \ldots X_n}$ is a convex mixture of deterministic processes over the nodes $X_1, \ldots, X_n$, i.e., $\kappa_{X_1 \ldots X_n}$ lies in the deterministic polytope.
Conversely, suppose $\kappa_{X_1 \ldots X_n}$ lies inside the deterministic polytope, that is, there exists a family of deterministic processes $\{\kappa^{f_i}_{X_1 \ldots X_n}\}_{i=1}^m$, defined by the functions $f_i: X_1^{out} \times \cdots \times X_n^{out} \to X_1^{in} \times \cdots \times X_n^{in}$, such that $\kappa_{X_1 \ldots X_n} = \sum_{i=1}^m q_i\, \kappa^{f_i}_{X_1 \ldots X_n}$ for some probability distribution $\{q_i\}$. The proof will proceed by first observing that such a process can be seen to arise from one single deterministic process on $n + 2$ nodes. Together with the fact that every deterministic process is reversibly extendible, proven in ref. 11, this establishes the claim. In order to see that indeed an appropriate deterministic process on $n + 2$ nodes exists, let $\lambda^{out}$ and $F^{in}$ be variables with cardinality $m$ and define the function
$$f: X^{out} \times \lambda^{out} \to X^{in} \times F^{in}, \qquad (x, i) \mapsto (f_i(x), i),$$
where $X^{out} = X_1^{out} \times \cdots \times X_n^{out}$ (similarly for $X^{in}$) and $x = (x_1, \ldots, x_n)$. Together with setting $P(\lambda^{out} = i) := q_i$, $f$ defines a deterministic classical process over the nodes $X_1, \ldots, X_n$, $\lambda$ and $F$, which gives back $\kappa_{X_1 \ldots X_n}$ upon marginalization over $\lambda$ and $F$. That $f$ indeed defines a process follows from the fact that arbitrary variation of the distribution $P(\lambda^{out})$ corresponds to an arbitrary weighting $\{q_i\}$ in the originally given mixture, each case of which has to be a classical process. This concludes the proof.
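The construction $(x, i) \mapsto (f_i(x), i)$ can be traced through on a small example. The sketch below assumes two bit-valued nodes $X_1$, $X_2$ and two hypothetical fixed-order deterministic processes with weights 0.3 and 0.7; these choices, like the function names, are illustrative and not taken from the article. It checks that marginalizing the extended deterministic process over $\lambda$ and $F$ reproduces the original mixture.

```python
import itertools
import math

BITS = (0, 1)


# Hypothetical deterministic processes on two bit-valued nodes X1, X2,
# written as functions from outputs (x1_out, x2_out) to inputs (x1_in, x2_in).
def f_1(x_out):
    """X1 before X2: X1 receives 0, X2 receives the output of X1."""
    return (0, x_out[0])


def f_2(x_out):
    """X2 before X1: X2 receives 0, X1 receives the output of X2."""
    return (x_out[1], 0)


FUNCTIONS = (f_1, f_2)
WEIGHTS = (0.3, 0.7)  # mixture weights q_i, also used as P(lambda_out = i)


def f(x_out, i):
    """Extended function (x, i) -> (f_i(x), i) on the nodes X1, X2, lambda and F."""
    return FUNCTIONS[i](x_out), i


def mixture(x_in, x_out):
    """kappa(x_in | x_out) = sum_i q_i [x_in == f_i(x_out)]."""
    return sum(q for q, f_i in zip(WEIGHTS, FUNCTIONS) if f_i(x_out) == x_in)


def marginal_of_extension(x_in, x_out):
    """Sum over lambda_out and F_in of [(x_in, F_in) == f(x_out, lambda_out)] P(lambda_out)."""
    total = 0.0
    for lam, f_in in itertools.product(range(len(FUNCTIONS)), repeat=2):
        if f(x_out, lam) == (x_in, f_in):
            total += WEIGHTS[lam]
    return total


settings = list(itertools.product(BITS, repeat=2))
agree = all(
    math.isclose(mixture(x_in, x_out), marginal_of_extension(x_in, x_out))
    for x_in in settings
    for x_out in settings
)
print(agree)  # True
```

Because $F^{in}$ simply records the value of $\lambda^{out}$, summing over it removes the record and leaves each $f_i$ weighted by $q_i$.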

Data availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.