Post-quantum steering is a stronger-than-quantum resource for information processing

We present the first instance where post-quantum steering is a stronger-than-quantum resource for information processing -- remote state preparation. In addition, we show that the phenomenon of post-quantum steering is not a mere mathematical curiosity allowed by the no-signalling principle, but may arise within compositional theories beyond quantum theory, hence making its study fundamentally relevant. We show these results by formulating a new compositional general probabilistic theory -- which we call Witworld -- with strong post-quantum features, which proves to be an intuitive and useful tool for exploring steering and its applications beyond the quantum realm.


I. INTRODUCTION
A striking property of nature is that it is non-classical. Entanglement [1, 2], Bell nonlocality [3], and steering [4][5][6] are examples of quantum phenomena that can be observed experimentally [7][8][9][10][11][12] and which cannot be explained by classical physics. Besides their foundational relevance, with the advent of quantum information theory we learned a valuable lesson: these seemingly bizarre quantum features can be exploited to process information more efficiently, even in ways that could never be possible with classical resources alone [13][14][15].
A ubiquitous framework in which the scope of quantum advantage in information processing is explored is the so-called device-independent framework, where the parties executing protocols only rely on the classical inputs and outputs with which they operate their shared (and possibly quantum) devices. Such a framework is particularly well suited to the necessarily paranoid perspective on cryptographic tasks [13], and is almost always underpinned by a Bell nonlocality setup. A special milestone in the research of non-classical resources for device-independent information processing was the realisation that there exist correlations beyond what is quantumly admissible (a.k.a. post-quantum correlations), but which nonetheless are consistent with special relativity [16]. These so-called no-signalling correlations, moreover, were shown to be consistent with alternative theories of nature, confirming the necessity of their study. Exploring these general no-signalling correlations enabled, for example, the design of quantum cryptographic protocols that are robust against powerful adversaries that are not bounded by the laws of quantum theory [17,18], and also the formulation of physical principles that a quantum world must satisfy [16,19-25]. These post-quantum correlations are hence studied beyond philosophical motivations, and from the perspective of the resources they provide for operational tasks.
* paulojcvf@gmail.com † john.h.selby@gmail.com ‡ jamiesikora@gmail.com § tgalley1@perimeterinstitute.ca ¶ sainz.ab@gmail.com
Device-independent frameworks for information processing, however, are substantially demanding to implement experimentally. Indeed, for practical purposes, even if cryptographically secure, device-independent protocols are yet to move beyond 'proof of principle' applications into scalable and easily accessible technologies. There are situations, however, where one may argue that the quantum description of some of the parties involved can be leveraged in the protocols: in the simplest case where two parties are involved and a single party is assigned a quantum description this is usually referred to as a one-sided device-independent framework [14,15]. In such scenarios, the non-classical phenomenon providing quantum advantage is steering rather than Bell nonlocality. Recently, it has been shown that steering beyond that which quantum theory allows, whilst still consistent with special relativity, may exist [26,27], which opens up a plethora of new questions, such as (i) can post-quantum steering provide an advantage beyond what is possible with quantum theory for some information processing task?; (ii) is post-quantum steering just a mathematical curiosity, or may it emerge within alternative physical theories?
In this work we tackle those two questions. First, we show that there are alternative theories beyond quantum which feature post-quantum steering, making the phenomenon physically relevant for post-quantum information processing and motivating its exploration. Second, we find a task for which post-quantum steering is a stronger-than-quantum resource: remote state preparation. Remote state preparation (RSP) is a task similar in spirit to teleportation: the goal is to transmit quantum states from one party to another distant party using only shared entanglement and classical communication. Unlike teleportation, though, in RSP the sender has knowledge of the transmitted state, which makes RSP protocols more economical than teleportation in terms of resources (e.g., classical communication) needed to succeed at the task [28,29]. In addition, RSP protocols do not necessarily require the ability to experimentally implement Bell (entangling) measurements, which makes them potentially more feasible experimentally [30]. The kind of RSP protocols that we focus on are the so-called oblivious ones, namely those where no information about the state is leaked to the receiver, apart from the state itself, something that is relevant for certain applications such as blind quantum computation [31]. RSP is indeed an insightful task to explore from both a fundamental and applied viewpoint.
In order to prove our results we define a generalised probabilistic theory (GPT) that we name Witworld, given its strong connection to entanglement witnesses. Witworld combines system types of three well-known GPTs (classical, quantum, and Boxworld) in a simple mathematical way, via the so-called max tensor product. Remarkably, even though Witworld cannot reproduce all the phenomenology of quantum theory, it does realise all quantum predictions for Bell and steering experiments. Hence, we can learn about the limitations of quantum advantage in one-sided and fully device-independent protocols by exploring the performance of Witworld. Despite its simplicity, Witworld displays powerful post-quantum features: not only can it realise all non-signalling correlations in Bell experiments, but it also displays post-quantum steering. As Witworld is a fully compositional theory, it comes equipped with an intuitive diagrammatic calculus [32,33]. This provides a convenient toolkit for exploring other applications of post-quantum phenomena for information processing.
The paper is organised as follows. In what remains of this section we present a brief introduction to the three main topics of this paper: generalised probabilistic theories, Bell nonlocality, and steering. Section II A presents the definition of Witworld, assuming basic knowledge of GPTs; the reader who is not familiar with them may consult Sec. A of the Supplemental Material. Section II B discusses the post-quantum properties of Witworld with a focus on Bell and steering experiments, while Section II C discusses how Witworld outperforms quantum theory at certain information processing tasks. Section II C also presents the first task where post-quantum steering outperforms quantum steering as a resource. Technicalities, as well as a brief review on steering, are included in the appendices.
Generalised probabilistic theories.-The framework of generalised probabilistic theories [34,35] provides tools with which to explore the operational features of candidate theories in a unified fashion. Classical theory, as well as quantum theory, may be recast within the language of GPTs [34,35], which enables their unified and comparative study. The GPT framework has proven useful not only from a foundational perspective (e.g., for developing axiomatic reconstructions of quantum theory [33,35-38]), but also when exploring the quantum information capabilities of post-quantum theories, such as their computational power [39][40][41][42][43][44][45] or cryptographic security [34,46-49].
Bell nonlocality.-One example of a non-classical phenomenon of foundational and applied relevance is Bell nonlocality. Bell experiments are ubiquitous in the fields of quantum foundations and quantum information processing. On the one hand, Bell's Theorem [3] established a precise sense in which quantum theory requires a departure from a classical worldview, and violations of Bell inequalities provide a means for certifying the nonclassicality of nature. On the other hand, the correlations observed in a Bell test have become a resource for certain tasks [13], and the violation of so-called Bell inequalities by these correlations has become a standard certification tool for security in cryptographic protocols [17,18,50]. In brief, a Bell scenario consists of a set of distant parties that perform space-like separated actions on their part of a physical system, and the objects of study are the correlations they observe among their measurement outcomes. In the case of a bipartite scenario, let x ∈ X and a ∈ A denote the classical variables that label the measurement choices and produced outcomes, respectively, corresponding to the first party (hereon, Alice), and, respectively, y ∈ Y and b ∈ B those for the second party (hereon, Bob). The correlations observed in this bipartite Bell experiment are captured by the conditional probability distribution {p(ab|xy)}. It is therefore natural to ask ourselves which possible {p(ab|xy)} may be generated, and at what cost. Given the space-like separation constraints, the largest set of correlations observable in a Bell scenario corresponds to those that satisfy the No-Signalling Principle, and it is known that correlations allowed by quantum theory are a strict subset of those correlations. Notably, a GPT colloquially referred to as Boxworld [51,52] has been defined [34], which can realise all the correlations compatible with the no-signalling principle via its bipartite states and local measurements.
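As an illustration of a no-signalling but post-quantum correlation, the following minimal Python sketch builds the Popescu-Rohrlich correlations p(ab|xy) = 1/2 δ_{a⊕b=xy} for a, b, x, y ∈ {0, 1}, checks the no-signalling conditions, and evaluates the CHSH expression. The function names (`p_pr`, `is_no_signalling`, `chsh`) are illustrative choices and do not come from the paper or from any library.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def p_pr(a, b, x, y):
    """PR-box correlations: p(ab|xy) = 1/2 if a XOR b == x*y, else 0."""
    return Fraction(1, 2) if (a ^ b) == (x * y) else Fraction(0)

def is_no_signalling(p):
    """Check that each party's marginal is independent of the other's input."""
    for a, x in product(BITS, BITS):
        # Alice's marginal p(a|x) must not depend on Bob's setting y.
        if len({sum(p(a, b, x, y) for b in BITS) for y in BITS}) != 1:
            return False
    for b, y in product(BITS, BITS):
        # Bob's marginal p(b|y) must not depend on Alice's setting x.
        if len({sum(p(a, b, x, y) for a in BITS) for x in BITS}) != 1:
            return False
    return True

def chsh(p):
    """CHSH value E00 + E01 + E10 - E11, with E_xy = sum_ab (-1)^(a+b) p(ab|xy)."""
    def corr(x, y):
        return sum((-1) ** (a + b) * p(a, b, x, y)
                   for a, b in product(BITS, BITS))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)

print(is_no_signalling(p_pr))
print(chsh(p_pr))
```

The CHSH value obtained here is 4, the algebraic maximum, which exceeds the quantum (Tsirelson) bound of 2√2 while every marginal remains independent of the distant party's input.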
Steering.-Steering is another non-classical phenomenon of foundational and practical relevance, which was identified back in the 1930s [4] but, unlike Bell nonlocality, only recently caught the attention of the quantum information community [5,6]. Steering captures the idea that Alice seemingly remotely 'steers' the state of a distant Bob, in a way which has no classical explanation. A main feature of a steering experiment is the asymmetric role that the parties play, which makes it particularly suitable as a resource for certain asymmetric information processing tasks [14,15]. In brief, the simplest steering experiment consists of two distant parties, Alice and Bob, who perform local actions on their part of a physical system. Unlike in a Bell experiment, though, the parties here perform different types of transformations in their labs: Alice performs a measurement, labelled by x ∈ X, on her system, and obtains a classical outcome a ∈ A, whereas Bob performs full tomography of the quantum system and so describes it via a density matrix $\rho^B_{a|x}$ that is effectively prepared in his lab after Alice's actions. In this way, the objects of study in these experiments are the ensembles of ensembles (a.k.a. assemblages [53]) given by $\{\{\sigma_{a|x}\}_{a \in A}\}_{x \in X}$, where $\mathrm{tr}(\sigma_{a|x}) = p(a|x)$ and $\sigma_{a|x} = p(a|x)\,\rho^B_{a|x}$. While nonclassical properties of steering within quantum theory have been considerably explored, not much is known about steering beyond quantum theory [26,27,54,55]. One main obstacle for this is the complexity of capturing fundamentally what could be post-quantum about an assemblage of quantum states. An operational recast of the steering phenomenon has been recently put forward [26,27], which facilitates a way to articulate the concept of post-quantum assemblages. The study of post-quantum steering has only just begun, and, unlike for Bell nonlocality, important fundamental and practical questions are yet to be answered.
One such question is: does there exist a GPT that realises all these post-quantum assemblages?
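To make the notion of an assemblage concrete, here is a minimal NumPy sketch of a quantum one: Alice and Bob share a two-qubit singlet and Alice measures Pauli Z (x = 0) or Pauli X (x = 1). The choice of state and measurements, and the helper names `projectors` and `assemblage`, are assumptions made purely for illustration. Each element is computed as σ_{a|x} = tr_A[(M_{a|x} ⊗ I) ρ_AB].

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def projectors(observable):
    """Projectors onto the +1 (a=0) and -1 (a=1) eigenspaces of a Pauli observable."""
    return [(I2 + (-1) ** a * observable) / 2 for a in (0, 1)]

def assemblage(rho_ab, measurements):
    """sigma[x][a] = tr_A[(M_{a|x} (x) I) rho_AB] for a two-qubit state rho_AB."""
    sigma = []
    for povm in measurements:
        row = []
        for m in povm:
            # Index order after reshape: (Alice row, Bob row, Alice col, Bob col).
            joint = (np.kron(m, I2) @ rho_ab).reshape(2, 2, 2, 2)
            row.append(np.einsum('ijik->jk', joint))  # partial trace over Alice
        sigma.append(row)
    return sigma

# Singlet state (|01> - |10>)/sqrt(2) shared by Alice and Bob.
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_singlet = np.outer(psi, psi.conj())
sigma = assemblage(rho_singlet, [projectors(PAULI_Z), projectors(PAULI_X)])

for x, row in enumerate(sigma):
    # tr(sigma_{a|x}) is Alice's outcome probability p(a|x).
    print(x, [np.trace(s).real for s in row])
```

Summing the elements over a gives Bob's reduced state, which is the same for both values of x: Alice's choice of measurement cannot signal to Bob.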

A. Witworld
In this section we provide a simple and concise introduction to Witworld, which should enable the understanding of the subsequent results. We moreover provide a detailed formal definition in Sec. B of the Supplemental Material.
In Witworld, there are three types of basic systems, which can be composed to construct more general system types. The basic systems are classical systems, quantum systems, and Boxworld systems [34]. (One could easily modify the theory to allow for further system types. However, it is not clear that this would provide any further benefit to the study of steering.) Systems that are of one of those three types are called atomic. Witworld features a composition rule (which we define shortly) by which these simple system types can form new ones that are neither classical, quantum, nor Boxworld. We denote the atomic types diagrammatically with different types of wires by:
[diagram omitted]
where $C_v$ denotes a classical system of dimension $v$, $Q_d$ denotes a quantum system of dimension $d$, and $B_{n,k}$ denotes a Boxworld system of dimension $(n, k)$ (these two integers relate to the input/output cardinality of the correlations in Bell scenarios that the system is tailored to [34]). Moreover, when we need to use a generic system type (which can be either simple or composite), we denote this by
[diagram omitted]
We can also explicitly denote the components of a composite system by using parallel wires, for example:
[diagram omitted] (3)
corresponds to a system composed of a qubit, a (2, 2) Boxworld system, a qutrit, and a classical system of dimension 5.
The state space of a given system type is represented by a convex set living inside some real vector space. For instance, an atomic quantum system $Q_2$ has states living inside a Bloch sphere in a 4-dimensional real vector space, an atomic classical system $C_3$ has states living inside a triangle in a 3-dimensional vector space, and an atomic Boxworld system $B_{2,2}$ has states inside a square in a 3-dimensional vector space. These examples are depicted in Figure 1. Diagrammatically we denote a state σ of a system S by
[diagram omitted]
Regarding the effects, Witworld includes all the elements of the dual of the vector space which evaluate to valid probabilities for every state. That is, Witworld satisfies the no-restriction hypothesis (NRH) [32]. For example, for a system of the type $Q_2$, the effects correspond to POVM elements and are represented as a particular region of $(\mathbb{R}^4)^*$. This region can be defined as the intersection of the cone of linear functionals which evaluate to non-negative reals on the set of state vectors with the set of linear functionals which evaluate to at most 1 for all state vectors. For $C_3$, the effects live in a cube, and for $B_{2,2}$, the effects live in an octahedron. A pictorial representation of these can be seen in Figure 1. An effect $e$ for a system $S$ is diagrammatically denoted by $e_S$.
Since effects belong to the dual vector space, when we compose them with a state we obtain a real number, which, by assumption, must give a valid probability. That is, for all states $\sigma_S$ and effects $e_S$ we have $e_S(\sigma_S) \in [0, 1]$. Diagrammatically this is written as
[diagram omitted]
As mentioned previously, we define Witworld to satisfy the NRH. This, however, is not the only simplifying assumption that we make in this construction. Additionally, we define the composition of systems to be via the so-called max tensor product [46], which implies that the theory is locally tomographic [35]. Moreover, we demand that Witworld satisfies the generalised no-restriction hypothesis (GNRH). Intuitively, the GNRH is the NRH together with the requirement that every transformation that takes every element of a valid state space to an element of another valid state space is a valid transformation in the theory, that is, every completely positive transformation is considered valid.
The max tensor product (see Def. A.8 in the Supplemental Material) is a composition rule that assigns as valid states of a composite system A · B any vector in the product vector space that is assigned a valid probability by every product of local effects. From an intuitive point of view, the max tensor product gives rise to a GPT that maximises the set of states that the system can be prepared in, whilst strongly restricting the set of measurements that one may perform on it. As a matter of fact, even though Witworld might appear to be a more general theory than quantum theory, these two are actually incomparable: quantum theory allows for measurements that cannot be performed on Witworld systems (the latter having a more restricted set of measurements on collections of quantum atomic system types), whilst Witworld allows for more states in which the composition of quantum atomic system types can be prepared (Witworld allows for two qubits to be prepared in a state mathematically corresponding to a quantum entanglement witness, whereas in quantum theory this is not an allowed state of a two-qubit system).
Because we have defined composition via the max tensor product and demand that the theory satisfies the GNRH, defining the atomic states defines the whole theory: the atomic states and the max tensor product determine every possible state, and the states together with the GNRH determine the effects and transformations.
Finally, as mentioned above, the max tensor product is tomographically local, that is to say, the states of a composite system are uniquely determined by the information obtained from performing local measurements on its parts. Using the example above, this means that $\rho_{AB}$ is completely determined by the set of values $(e^A_i \otimes e^B_j)(\rho_{AB})$. At this point it is worth mentioning some further consequences of our definitions. The first one is that the use of the max tensor product to compose systems implies that every effect in Witworld is separable (see Lem. A.14). Therefore, an important feature of Witworld is that it does not contain entangling effects.
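Tomographic locality can be illustrated for a pair of qubits with a short NumPy sketch. As an assumption made for convenience, the local data are collected as expectation values of products of Pauli operators; each Pauli operator is a real linear combination of qubit effects, so these numbers are fixed by the statistics of local effects. The helper names are illustrative only.

```python
import numpy as np
from itertools import product

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def local_statistics(rho_ab):
    """Table t[i, j] = tr[(P_i (x) P_j) rho_AB] of local product expectation values."""
    t = np.zeros((4, 4))
    for i, j in product(range(4), repeat=2):
        t[i, j] = np.trace(np.kron(PAULIS[i], PAULIS[j]) @ rho_ab).real
    return t

def reconstruct(t):
    """Rebuild rho_AB = (1/4) sum_ij t[i, j] P_i (x) P_j from the local table."""
    return sum(t[i, j] * np.kron(PAULIS[i], PAULIS[j])
               for i, j in product(range(4), repeat=2)) / 4

# A random two-qubit density matrix used as a test state.
rng = np.random.default_rng(0)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = g @ g.conj().T
rho /= np.trace(rho).real

print(np.allclose(reconstruct(local_statistics(rho)), rho))
```

The same reconstruction applies to any Hermitian operator on two qubits, including operators that are not positive semidefinite, which is the case relevant for composite quantum-type systems in Witworld.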
A second important fact about Witworld is that the combination of two atomic quantum systems yields systems whose state spaces are larger than the joint state space obtained from the standard quantum composition rule (see Thms. B.5 and B.6 in the Supplemental Material). For example, in the bipartite case, we have that the composite of $Q_n$ and $Q_m$, denoted by $Q_n \cdot Q_m$, has as its state space the set of entanglement witnesses (including density matrices [56]) for the quantum bipartite states, which strictly contains the set of bipartite quantum states $Q_{nm}$. Therefore, whilst Witworld does contain arbitrary quantum systems, quantum theory is not a compositional subtheory within Witworld. Note that if one were to allow quantum systems in Witworld (where these quantum systems do not have the additional dynamics of Witworld quantum systems) to be composable both according to the standard quantum rule as well as to the Witworld composition rule, one could construct a protocol giving negative probabilities [57]. As such one cannot extend Witworld in such a manner as to contain quantum theory as a compositional subtheory.
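A small numerical sketch of this point, under the assumption that we take as example the partial transpose of the maximally entangled two-qubit state, W = SWAP/2 (a unit-trace entanglement witness chosen here for illustration, not taken from the paper). W is not a density matrix, yet it assigns probabilities in [0, 1] to every product of local effects, so it qualifies as a state under the max tensor product.

```python
import numpy as np

# Partial transpose of the maximally entangled two-qubit state: W = SWAP / 2.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
W = SWAP / 2

def random_effect(rng):
    """A random qubit effect E with 0 <= E <= I (a POVM element)."""
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    e = g @ g.conj().T                      # positive semidefinite
    return e / np.linalg.eigvalsh(e).max()  # rescale so the largest eigenvalue is 1

rng = np.random.default_rng(1)
# Values tr[(E (x) F) W] on randomly sampled product effects.
probs = [np.trace(np.kron(random_effect(rng), random_effect(rng)) @ W).real
         for _ in range(1000)]

min_eigenvalue = np.linalg.eigvalsh(W).min()
print(min_eigenvalue, min(probs), max(probs))
```

The negative eigenvalue shows that W is not a quantum state, while the sampled product-effect values stay within [0, 1]; analytically, tr[(E ⊗ F) SWAP] = tr(EF), which is non-negative for positive semidefinite E and F. Random sampling is of course only a sanity check, not a proof.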
A third important feature is that Boxworld-type (resp. classical-type) systems in Witworld compose exactly as they do in Boxworld (resp. classical theory). Therefore, both Boxworld and classical theory are indeed full subtheories of Witworld. Here, by full subtheory we mean that you can recover Boxworld or classical theory from Witworld by suitably restricting it to a particular collection of system types. This restriction recovers all and only the states, effects, transformations, and the composition rules of Boxworld or classical theory.
Finally, another important feature of Witworld is that because of the combination of GNRH and max tensor product, there is no difference between positive and completely positive maps (see Lem. A.15). Of course, for systems that are not quantum, a more general notion (relative to that of positive operators in quantum theory) of positivity must be used in order to make that statement (see Def. A.6 in the Supplemental Material). Now, in the case of atomic quantum systems, this means that the valid Witworld transformations correspond to positive, but not necessarily completely positive, quantum transformations. Hence, this means that in Witworld there are more transformations available to local agents (i.e., to Alice and Bob) than would be available in quantum theory.
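The gap between positive and completely positive maps in quantum theory can be seen in a few lines of NumPy. The sketch below uses the transpose map as the standard example (an illustrative choice) and tests complete positivity through the Choi matrix; the function names are hypothetical.

```python
import numpy as np

def choi(linear_map, dim=2):
    """Choi matrix sum_ij |i><j| (x) map(|i><j|); the map is CP iff this is PSD."""
    c = np.zeros((dim * dim, dim * dim), dtype=complex)
    for i in range(dim):
        for j in range(dim):
            unit = np.zeros((dim, dim), dtype=complex)
            unit[i, j] = 1
            c += np.kron(unit, linear_map(unit))
    return c

def transpose_map(rho):
    """Matrix transposition in the computational basis."""
    return rho.T

def identity_map(rho):
    """The identity channel, included as a completely positive reference."""
    return rho

# A random qubit density matrix to probe positivity of the transpose map.
rng = np.random.default_rng(2)
g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = g @ g.conj().T
rho /= np.trace(rho).real

min_eig_output = np.linalg.eigvalsh(transpose_map(rho)).min()
min_eig_choi_transpose = np.linalg.eigvalsh(choi(transpose_map)).min()
min_eig_choi_identity = np.linalg.eigvalsh(choi(identity_map)).min()
print(min_eig_output, min_eig_choi_transpose, min_eig_choi_identity)
```

The transpose sends the sampled state to a positive semidefinite operator, yet its Choi matrix (the SWAP operator) has eigenvalue -1, so it is positive without being completely positive. In quantum theory this rules it out as a physical operation, whereas the text above explains why such maps are valid local transformations in Witworld.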

B. Post-quantum phenomena: Bell non-classicality and steering
In this section we explore the non-classical features that Witworld displays, starting with the case of Bell scenarios. One can readily see that Witworld can realise all non-signalling correlations in arbitrary Bell scenarios (see Fig. 2), since Boxworld is a full subtheory of Witworld. Therefore, one can leverage the Boxworld realisations of any non-signalling correlation, and translate them straightforwardly to a realisation within Witworld. For example, take the case of Popescu-Rohrlich (PR) correlations, which read $p_{PR}(ab|xy) = \frac{1}{2}\,\delta_{a \oplus b = xy}$ with a, b, x, y ∈ {0, 1} and ⊕ denoting modulo-2 addition; these correlations can be realised in Witworld as follows:
[diagram omitted] (8)
with the state $s_{PR}$ and the controlled measurements $M_{PR}$ of Alice and Bob as introduced in Ref. [34], whose explicit forms we present in Eqs. C8, C3, and C6 in the Supplemental Material. Note that for simplicity of notation we will often label the classical systems by an outcome or setting variable such as X; in this case the relevant GPT system is $C_{|X|}$.
The situation slightly changes when we instead focus on the non-classical phenomenon of steering (see Sec. D of the Supplemental Material for a comprehensive introduction). In brief, a traditional bipartite steering experiment consists of two distant parties, Alice and Bob, who share a physical system, perform space-like separated actions, and, unlike in Bell experiments, play asymmetric roles in the experiment. On the one hand, Alice (sometimes referred to as the black-box party in the steering literature) chooses a measurement labelled by x ∈ X to perform on her share of the system, and obtains a classical outcome a ∈ A with probability p(a|x). Bob, on the other hand, merely characterises the quantum state $\rho^B_{a|x}$ to which his subsystem is steered. The information collected in this experiment (Alice's probabilities and Bob's conditional states) is expressed concisely as an assemblage [53]:
[equation omitted]
Note that in Ref. [54] it is shown how assemblages can be equivalently represented by so-called 'causal' channels. With a slight abuse of notation we therefore diagrammatically represent the assemblage by the causal classical-quantum channel:
[diagram omitted]
To see that this is indeed a good representation, note that we can extract the elements of the assemblage, i.e., the subnormalised steered states, as:
[diagram omitted]
and then the probabilities as:
[diagram omitted]
where the effect applied to Bob's system is the so-called unit effect (see Eq. A11 and its preceding paragraph in the Supplemental Material), which in quantum theory corresponds to the partial trace of the relevant subsystem. Analogously, we can view non-signalling correlations as particular channels, in this case channels with classical inputs and outputs which correspond to stochastic maps. Using this we can rewrite Eq. (8) as
[diagram omitted]
Beyond the traditional scenario, one may have steering experiments with more black-box parties also in a space-like separated configuration [26,58], or even situations where Bob may influence the state preparation of his system by choosing a classical variable y (Bob-with-input scenarios) [27].
In a similar fashion to Bell non-classicality, one can define what "classical" (a.k.a. LHS), quantum, and non-signalling assemblages are [26,27]. Notice that the differences in all these kinds of steering are not related to the type of system prepared in Bob's lab, but rather to the types of shared resources that are used to prepare those quantum systems in Bob's lab. From the point of view of Witworld, then, an assemblage in a steering experiment is produced by the parties performing local operations on a shared arbitrary composite multipartite system, which may include classical, quantum, and Boxworld systems. One fascinating property of Witworld is that it not only features all LHS and quantum assemblages (see Defs. D.5 and D.6, respectively, in the Supplemental Material), but may also realise post-quantum assemblages. That is, Witworld features post-quantum steering. In this section we present a few key examples of this. Whether Witworld can realise all non-signalling assemblages is still an open question (see Fig. 3).
The first example we present is in a tripartite steering scenario, since in traditional bipartite steering scenarios post-quantum steering is forbidden by the Gisin [59] and Hughston, Jozsa and Wootters [60] theorems. In a tripartite scenario, it is enough to consider the simplest setup with two black-box parties choosing among two dichotomic measurements each, so X = A = {0, 1}, and where Bob's subsystem is a qubit. The particular assemblage we present is the PR-box assemblage $\Sigma^{PR}_{AA|XX}$, whose elements are $\sigma^{*B}_{a_1 a_2|x_1 x_2} = p_{PR}(a_1 a_2|x_1 x_2)\,\frac{\mathbb{I}}{2}$. This assemblage cannot be realised by the three parties sharing quantum resources [26], i.e., it is post-quantum. $\Sigma^{PR}_{AA|XX}$ can however be realised within Witworld when the parties share the following multipartite system: a bipartite Boxworld system of dimension (2, 2) in a PR state shared by the black-box parties, composed in parallel with a quantum state $\rho^*_B = \mathbb{I}/2$ for Bob. Leveraging the realisation of PR-box correlations as in Eq. (8), the assemblage $\Sigma^{PR}_{AA|XX}$ can be realised by:
[diagram omitted]
The second example we present is in a bipartite Bob-with-input steering scenario, where Alice has X = A = {0, 1}, Bob's subsystem is a qubit, and Bob's input is dichotomic (i.e., y ∈ Y = {0, 1}). The particular post-quantum assemblage $\Sigma^*_{A|XY}$ we consider has elements defined by $\sigma^{*B}_{a|xy} = \frac{1}{2}\left(|a\rangle\langle a|\,\delta_{xy=0} + |a \oplus 1\rangle\langle a \oplus 1|\,\delta_{xy=1}\right)$ [27]. This assemblage can be realised in Witworld by Alice and Bob sharing a bipartite Boxworld system of dimension (2, 2) prepared in a PR state, and implementing the following protocol. On the one hand, Alice performs the measurement $M_{PR}$ of Eq. (8) controlled on her classical input x, and obtains the output a. On the other hand, here the state preparation of Bob's system further depends on Bob's choice of a classical variable y which he inputs into a device. In this protocol, this device has a two-stage process: first it implements the measurement $M_{PR}$ of Eq. (8) on the Boxworld system, conditioned on y; second, there is a controlled state preparation P which prepares the quantum state $|b\rangle\langle b|$ conditioned on b, the classical output of $M_{PR}$. Diagrammatically, the whole protocol reads:
[diagram omitted] (17)
One can readily see that Eq. (17) indeed holds, since the assemblage elements of $\Sigma^*_{A|XY}$ can be rewritten as $\sigma^{*B}_{a|xy} = \frac{1}{2}\,|a \oplus xy\rangle\langle a \oplus xy|$, and PR-box correlations satisfy $b = a \oplus xy$ and $\frac{1}{2} = \sum_b p_{PR}(ab|xy)$.
The third example we present is that of Gleason assemblages [55]. In short, Gleason assemblages are those that can be mathematically expressed in the language of quantum theory by having the parties measure a shared system whose state preparation is represented by a normalised quantum entanglement witness. Gleason assemblages are particularly useful, since there are constructions that yield provably post-quantum Gleason assemblages. More importantly, the post-quantumness of some Gleason assemblages is not implied by post-quantum Bell non-locality, which renders post-quantum steering as a genuinely new effect [55]. Witworld readily provides realisations of any Gleason assemblage, by noticing two facts: (i) any quantum entanglement witness is a valid state of composite quantum-type systems in Witworld (Thm. B.6), and (ii) in Witworld, any local quantum measurement is a valid Witworld measurement (Lem. 8.14). The explicit diagram for a Witworld realisation of a generic Gleason assemblage $\Sigma^{G}_{A_1 A_2|X_1 X_2}$ in a tripartite scenario with two black-box parties is:
[diagram omitted] (18)
The fourth example that we present is in a bipartite Bob-with-input steering scenario, with A = Y = {0, 1} and X = {1, 2, 3}. The particular post-quantum assemblage $\Sigma^{**}_{A|XY}$ here has elements defined by [27]:
[equation omitted] (19)
where $(\Sigma_1, \Sigma_2, \Sigma_3)$ are the Pauli X, Y, and Z operators, respectively. $\Sigma^{**}_{A|XY}$ is the first assemblage found in the Bob-with-input scenario whose post-quantumness cannot be proven directly from leveraging post-quantum Bell non-locality, which renders this type of post-quantum steering as a genuinely new effect. To see that Witworld can realise this assemblage, first notice that its elements can be mathematically written as $\sigma^{**B}_{a|xy} = (\sigma^B_{a|xy})^{T^y}$, where $\sigma^B_{a|xy} = \frac{1}{4}\left(\mathbb{I} + (-1)^a \Sigma_x\right)$ are the elements of a quantum assemblage (see Def. D.6 in the Supplemental Material), and $T^y$ is the identity operator for y = 0 and the transpose operator (denoted $T$) for y = 1. The final step is to observe that all of these mathematical objects are acceptable physical operations in Witworld:
[diagram omitted] (20)
The final example that we consider is steering in the instrumental scenario [27]. This can be seen as an adaptation of the Bob-with-input scenario, in which Alice's output a determines the setting y for Bob. The particular example of post-quantum steering in this scenario that we present here is given by modifying our previous example, by wiring Alice's output to Bob's input. That is, it can be shown that the assemblage which is obtained by setting y = a in Eq. (19) is post-quantum [27]. It is then a simple modification of Eq. (20) to see that this too can be realised in Witworld:
[diagram omitted]
where the small white circle splitting the classical system is the copy operation. With this we see that Witworld features a variety of non-classical and post-quantum properties, both in Bell and steering scenarios, and hence is the first GPT that has been shown to display post-quantum steering.
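For the second example above (the Bob-with-input assemblage built from a PR box), the identity between the protocol's output and the target elements can be verified by brute force. The sketch below, in plain Python with exact fractions, computes Σ_b p_PR(ab|xy) |b⟩⟨b| and compares it with 1/2 |a ⊕ xy⟩⟨a ⊕ xy|. Helper names such as `steered` and `target` are illustrative, and the Boxworld measurement is modelled only through the correlations it produces, which is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
HALF = Fraction(1, 2)

def p_pr(a, b, x, y):
    """PR-box correlations: 1/2 when a XOR b == x*y, and 0 otherwise."""
    return HALF if (a ^ b) == (x * y) else Fraction(0)

def ket_bra(b):
    """The qubit projector |b><b| as a 2x2 nested list."""
    return [[Fraction(int(i == j == b)) for j in BITS] for i in BITS]

def add(m1, m2):
    """Entrywise sum of two 2x2 matrices."""
    return [[m1[i][j] + m2[i][j] for j in BITS] for i in BITS]

def scale(c, m):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * m[i][j] for j in BITS] for i in BITS]

def steered(a, x, y):
    """sum_b p_PR(ab|xy) |b><b|: Bob prepares |b><b| from his PR-box output b."""
    total = [[Fraction(0)] * 2 for _ in BITS]
    for b in BITS:
        total = add(total, scale(p_pr(a, b, x, y), ket_bra(b)))
    return total

def target(a, x, y):
    """The assemblage element (1/2) |a XOR xy><a XOR xy| quoted in the text."""
    return scale(HALF, ket_bra(a ^ (x * y)))

matches = all(steered(a, x, y) == target(a, x, y)
              for a, x, y in product(BITS, repeat=3))
half_identity = scale(HALF, add(ket_bra(0), ket_bra(1)))
bob_marginal_fixed = all(add(steered(0, x, y), steered(1, x, y)) == half_identity
                         for x, y in product(BITS, repeat=2))
print(matches, bob_marginal_fixed)
```

The second check confirms that Bob's state averaged over Alice's outcome is the maximally mixed state for every x and y, so the assemblage is no-signalling from Alice to Bob.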

C. Post-quantum advantage for information processing
Post-quantum resources may outperform quantum ones for information processing tasks [13,17,18,50,61,62]. A natural question then is whether the post-quantum features of Witworld enable this theory to be more powerful than quantum theory in this respect. First, one can focus on device-independent information processing tasks, such as quantum cryptography [17,18,50], which rely on the use of correlations in Bell scenarios. Here, it is known that Boxworld may outperform quantum theory, since it realises any non-signalling correlation. Since Boxworld is a subtheory of Witworld, the latter inherits these properties; that is, Witworld outperforms quantum theory in those device-independent information processing tasks. A more relevant question then is whether such advantage persists when moving on from device-independent tasks. Hence in this section we investigate whether Witworld provides an advantage for tasks that go beyond the processing of Bell-type correlations.
There are two features of Witworld that go beyond Boxworld which are noteworthy when looking for an information task where Witworld is resourceful. One is the fact that Witworld has quantum systems as atomic system types, and the other is the fact that positive (but not necessarily completely positive) quantum operations are allowed physical operations in Witworld. Using these two facts we first show that Witworld outperforms Quantum Theory in the task of Remote State Preparation, and then we show that the resource underlying this advantage is post-quantum steering.
Remote State Preparation (RSP) [28,29] is a protocol with a similar flavour to state teleportation. A main difference between teleportation and RSP is that in the former, Alice can send to Bob a state she knows nothing about, whereas in the latter she may require a complete classical description of $|\psi\rangle$. We denote this complete classical description by ψ. In both cases, the main goal is for Alice to deterministically prepare a state $|\psi\rangle$ in Bob's lab, such that he gets no additional information about $|\psi\rangle$. In RSP (see Fig. 4), however, Alice does not need to perform experimentally challenging entangling measurements (as in a full Bell-state analysis) [30]. Instead, she can directly encode the information about the state she wishes to send onto her share of an entangled state shared with Bob. When Alice and Bob use quantum resources, the minimum amount of classical information that she needs to send him for the protocol to succeed is $2 \log_2 d$ bits, where d is the dimension of the Hilbert space containing $|\psi\rangle$ [29]. Here we present a protocol using Witworld resources which may prepare an arbitrary qubit state in Bob's lab using only 1 bit (instead of 2) of classical communication.
Consider the following protocol in Witworld. Alice and Bob share the two-qubit state $|\Phi_s\rangle = (|01\rangle - |10\rangle)/\sqrt{2}$. Alice performs the unitary $U_\psi = |0\rangle\langle\psi^\perp| + |1\rangle\langle\psi|$ on her qubit, which encodes the state $|\psi\rangle$ to be sent. This effectively applies $U_\psi^\dagger$ to Bob's half of the state (the singlet state $|\Phi_s\rangle$ transforms trivially under $U \otimes U$, implying that $(U_\psi \otimes I)|\Phi_s\rangle = (I \otimes U_\psi^\dagger)|\Phi_s\rangle$). Next, she performs the measurement given by $B = \{|0\rangle\langle 0|, |1\rangle\langle 1|\}$, whose outcome consists of one classical bit $a$ which indicates exactly whether Bob now has the post-measurement state $-|\psi\rangle$ (if $a = 0$) or $|\psi^\perp\rangle$ (if $a = 1$). Then, Alice sends $a$ to Bob, who now knows whether he is holding $-|\psi\rangle$ or $|\psi^\perp\rangle$. The task can be completed if Bob has access to a universal-NOT operation, which maps an arbitrary input $|\phi\rangle$ into a state orthogonal to it (which is unique up to global phases for qubits). The universal-NOT operation is not valid in quantum theory since it is a positive transformation, but not a completely positive transformation. However, in Witworld, this is an allowable transformation. Thus in Witworld Bob can apply the universal-NOT gate when $a = 1$, leaving him with a perfect copy of $|\psi\rangle$ (up to a physically irrelevant global phase). Diagrammatically, this protocol is represented in Eq. (23), where cUNOT is the controlled-universal-NOT operation. The diagrammatic manipulations that prove that Eq. (23) holds are presented in Sec. E of the Supplemental Material. Through this protocol, Witworld performs RSP of a qubit deterministically with the transmission of only one classical bit from Alice to Bob, outperforming quantum theory at the task.
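The protocol above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the particular choice $|\psi^\perp\rangle = (-\bar{b}, \bar{a})$ for $|\psi\rangle = (a, b)$, models the universal-NOT as the map $\rho \mapsto \mathrm{tr}(\rho)\,\mathbb{1} - \rho$, and uses hypothetical helper names (`orthogonal`, `universal_not`, `rsp_outputs`).

```python
import numpy as np

def random_qubit(rng):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

def orthogonal(psi):
    # Assumed convention for |psi_perp>: (-conj(b), conj(a)) for psi = (a, b).
    return np.array([-np.conj(psi[1]), np.conj(psi[0])])

def universal_not(rho):
    # Positive but not completely positive map rho -> tr(rho) * 1 - rho.
    return np.trace(rho) * np.eye(2) - rho

def rsp_outputs(psi):
    """Return (probability, Bob's final state) for Alice's outcomes a = 0, 1."""
    ket0, ket1 = np.eye(2)
    singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
    # U_psi = |0><psi_perp| + |1><psi| acting on Alice's qubit.
    U = np.outer(ket0, np.conj(orthogonal(psi))) + np.outer(ket1, np.conj(psi))
    state = np.kron(U, np.eye(2)) @ singlet
    outputs = []
    for a in (0, 1):
        # Row a of the reshaped vector holds Bob's amplitudes given outcome a.
        bob = state.reshape(2, 2)[a]
        prob = np.vdot(bob, bob).real
        rho = np.outer(bob, np.conj(bob)) / prob
        if a == 1:
            rho = universal_not(rho)  # Bob's correction, allowed in Witworld
        outputs.append((prob, rho))
    return outputs
```

Under these assumptions, both outcomes occur with probability 1/2 and Bob's final density matrix equals $|\psi\rangle\langle\psi|$ in either branch, so the global sign in $-|\psi\rangle$ plays no role.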
We now move on to unveiling the critical resource underlying the success of RSP in Witworld. For this, it is convenient to rewrite the diagram on the left-hand side of Eq. (23) as in Eq. (24), where $\psi$ is not the state $|\psi\rangle$, but simply a classical label corresponding to it, used to determine the unitary $U_\psi$ that the transformation cU implements. In addition, $B \circ cU$ is the process that first implements the controlled unitary cU and then the measurement $B$. The crucial step here is to notice that each term in the sum in Eq. (24) can be identified with an element of an assemblage $\{\sigma_{a|\psi}\}$ in an instrumental steering scenario (see Def. D.4 in the Supplemental Material), where $a$ denotes Alice's dichotomic outcome, and $\psi$ is the classical variable that denotes her measurement choice. That is, RSP is ultimately an instance of an instrumental steering scenario, and the possible assemblages that Alice can prepare dictate whether RSP is possible for the given cardinality of $a$. For the particular RSP protocol discussed above, it is readily seen that the assemblage $\{\sigma_{a|\psi}\}$ has no quantum realisation: if it had one, this assemblage would provide a quantum RSP protocol that succeeds deterministically with 1 bit of communication, which is fundamentally impossible. We see therefore how instrumental steering powers RSP, and how the post-quantum steering featured in Witworld makes this theory more efficient than quantum theory at the task of Remote State Preparation.
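As a worked instance, and assuming that Bob's controlled universal-NOT is counted as part of his device in the instrumental scenario, the assemblage elements produced by the protocol can be written explicitly. Alice's two outcomes are equiprobable and, after the correction, Bob holds $|\psi\rangle$ in both branches, so that (in our own notation, which need not match that of Eq. (24)):

```latex
\sigma_{0|\psi} \;=\; \sigma_{1|\psi} \;=\; \tfrac{1}{2}\,|\psi\rangle\langle\psi| ,
\qquad
\sum_{a \in \{0,1\}} \sigma_{a|\psi} \;=\; |\psi\rangle\langle\psi| .
```

The sum over the single bit $a$ reproduces the target state for every label $\psi$, which is what deterministic RSP with one bit of communication amounts to.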
Let us observe that quantum theory restricted to the reals [63], which has mixed states given by symmetric matrices (a subset of quantum states), also requires only a single bit of communication for RSP. A rebit (2-dimensional real quantum system) has mixed states given by the X-Z plane of the Bloch sphere (a disk). The universal NOT is just a rotation by $\pi$ around the Y axis, and is completely positive. Since the singlet state $|\Phi_s\rangle\langle\Phi_s|$ is a real-valued density operator (i.e., it is a symmetric matrix) it follows that it is a valid entangled state of two rebits. Hence the protocol outlined above in Witworld can be applied to real quantum theory as well, to give RSP with a single bit of communication.
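The rebit remark can be illustrated with a short sketch. It assumes the standard form $e^{-i\pi Y/2} = -iY$ for the rotation by $\pi$ about the Y axis; the function name `overlap_after_flip` is ours.

```python
import numpy as np

# Rotation by pi about the Y axis: exp(-i*pi*Y/2) = -iY, a real orthogonal matrix.
R_Y_PI = np.array([[0.0, -1.0], [1.0, 0.0]])

def overlap_after_flip(psi):
    """|<psi| R_y(pi) |psi>|: 0 means the state was sent to an orthogonal one."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return abs(np.vdot(psi, R_Y_PI @ psi))
```

For a real state such as `[0.6, 0.8]` the overlap vanishes, so the rotation acts as a NOT on the rebit disk. For a state with complex amplitudes such as `[1, 1j]` the rotation only adds a phase, which is why this unitary is not a universal NOT on the full qubit.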

III. DISCUSSION
In this work we explored the scope of post-quantum steering as a stronger-than-quantum resource for information processing. We particularly focused the search on tasks beyond device-independent ones or those that ultimately rely on Bell correlations (such as random access codes [64,65] or device-independent quantum key distribution): we aimed at finding tasks that intrinsically leveraged quantum systems and non-classical steering. We discovered that remote state preparation of qubit systems provides a friendly proof-of-principle of a general phenomenology: steering assemblages in the instrumental scenario serve as a resource for the task, and post-quantum assemblages perform better than quantum ones at it. This is the first time that post-quantum steering, as opposed to post-quantum Bell non-classicality, has been identified as a resource powering information processing which can provably outperform quantum theory.
In order to prove our claims, we defined a generalised probabilistic theory, which we call Witworld, by combining classical, quantum, and Boxworld systems in a simple mathematical way, via the max tensor product. The intuitive formulation of Witworld allowed us to present its powerful post-quantum features in an accessible way: one can readily see how post-quantum Bell nonlocality, post-quantum steering, and post-quantum states emerge within Witworld. The task of remote state preparation can be studied diagrammatically within Witworld, and by doing so we showed how the post-quantum assemblages allowed by the theory make Witworld perform better at it than quantum theory does.
A feature of Witworld is that, even though it is built in part on quantum systems, it does not contain quantum theory as a subtheory: there are tasks, such as quantum teleportation, that quantum theory can perform whilst Witworld cannot. The reason for this is the choice of composition rule: Witworld composes via the max tensor product, and hence no entangling measurements are allowed in the theory. Nonetheless, Witworld remarkably succeeds at reproducing all the quantum entangled states, quantum steering assemblages, and quantum correlations in Bell scenarios. That is, for the non-classical phenomena usually leveraged in quantum information tasks, Witworld is at least as good as quantum theory at manifesting them.
If we turn our attention to a particular subtheory of Witworld by restricting the system types to classical and quantum only (that is, by removing Boxworld from the theory), we find that this subtheory still features post-quantum properties, such as Bell non-classicality in multipartite scenarios (for example, by utilizing the results of Ref. [56]), as well as post-quantum steering and post-quantum states even in bipartite scenarios. Remarkably, the post-quantum advantage for remote state preparation is also featured by this subtheory of Witworld, since the advantage stems from the enlarged set of operations allowed on local quantum systems. We leave it as an open question whether other previously defined GPTs (e.g., Refs. [66,67]) may provide such an advantage for this information processing task.
It is worth noticing that Witworld's simple formulation does not make the theory intrinsically groundbreaking from the perspective of generalised probabilistic theories; however, its relevance is not grounded in its appeal as a standalone GPT. Rather, Witworld shows that there exists a compositional theory that could underpin post-quantum effects such as post-quantum steering. This shows that the latter phenomenon is not in principle unrealisable, and hence its study should not be simply dismissed.
Looking ahead, there are a variety of open questions that can be studied, especially about the extent to which post-quantum steering compatible with special relativity can be underpinned by some generalised theory. We know that Witworld may display post-quantum steering but, unlike the case of Bell non-classicality, it is still unknown whether any no-signalling assemblage may have a realisation within Witworld. Any answer to this question would be of interest: if Witworld can realise all no-signalling assemblages, then this theory becomes the first GPT to accommodate steering fully in a common-cause resource theoretic framework [68]; otherwise, understanding the reason behind the gap between no-signalling realisable and Witworld realisable assemblages may lead to an operational principle that could shed light on the characterisation of quantum phenomena.
Finally, the exploration of the information processing power of steering (quantum and beyond) has only just begun. Since Witworld is formulated in an intuitive way leveraging a diagrammatic representation [32,33,69-71], there is plenty of scope for investigating other post-quantum advantages of this theory, and of post-quantum steering, for information processing.

DATA AVAILABILITY
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

CODE AVAILABILITY
Code sharing not applicable to this article as no codes were used during the current study.

COMPETING INTERESTS
The authors declare no competing financial or nonfinancial interests.

Supplemental Material
Appendix A: Generalised Probabilistic Theories

The basics
In this section, we provide a description of generalised probabilistic theories (GPTs) that have the property of being locally tomographic [35]. Roughly speaking, the GPT framework is a general framework to formulate and describe theories (including quantum theory), that allows for the calculation of probabilities of measurement outcomes when system preparations have states associated to them. In this appendix, we aim for a description of GPTs which connects the diagrammatic [32,33] and the linear algebraic notations, hoping to make it useful to a broader audience. Although a more general kind of GPT could be defined, for the purpose of defining Witworld, restricting the present discussion to locally tomographic GPTs significantly simplifies the task in hand. The interested reader can find a more general definition of GPTs in, for example, Ref. [32].
In general, a GPT consists of collections of states that are associated to different system types, a rule for combining state spaces of simple systems into state spaces of composite systems, collections of allowed transformations between these states, and effects, i.e., functions that associate probabilities to each outcome of each measurement for each state in the theory. Here, as mentioned above, we focus on locally tomographic GPTs, which are those where the states of composite systems can be uniquely determined by the information given by local measurements on their parts. Each of these ingredients is defined and compared to quantum theory in what follows.
We start with the states. For each system $A$ of a GPT, there is a vector space $V_A$ associated to it. A convex subset $\Omega_A \subset V_A$, called the state space, defines the allowed states of the system $A$. This subset has dimension $\dim(\Omega_A) = \dim(V_A) - 1$. We require every state in $\Omega_A$ to be normalised in a sense to be defined later in this section when we introduce effects. The convexity property means that, if $\sigma \in \Omega_A$ and $\rho \in \Omega_A$, then $p\sigma + (1 - p)\rho \in \Omega_A$ for any $p \in [0, 1]$. We require convexity so that the GPT accommodates statistical mixtures of state preparations in a natural way. Diagrammatically, the system $A$ and its associated vector space $V_A$ are represented by a labelled wire, Eq. (A1).
A remark on notation is in order: throughout this section (i.e., Sec. A), we review definitions and properties of a general class of GPTs, which Witworld belongs to, but we do not restrict the presentation to the latter. Hence, the wire type in Eq. (A1) should here be understood as a generic system rather than as a classical system in Witworld (as per Eq. (1)). From Sec. B we will shift the focus back to Witworld, and hence the notation from Eq. (1) will take precedence again.
In the case of quantum theory, the wires in Eq. (A1) represent real vector spaces of Hermitian operators on Hilbert spaces. For instance, if $A$ is a qubit system, the wire labelled by $A$ represents the vector space $V_A$ of Hermitian linear operators on $\mathbb{C}^2$. Then, $\Omega_A$ is the set of positive operators whose trace is 1, that is, $\Omega_A = \{\rho \in V_A : \rho \geq 0, \ \mathrm{tr}\,\rho = 1\}$, which is indeed a convex set as required. It is sometimes useful, moreover, to include within the GPT formulation of quantum theory some wires that represent classical variables which store the results of measurements; see, for example, Refs. [70][71][72]. This is done in section II A of the present work.
As mentioned previously, one can construct composite systems by the combination of simpler systems. We denote by $A \cdot B$ the system composed of a system $A$ and a system $B$. Its states belong to the set $\Omega_{A \cdot B} \subset V_{A \cdot B}$, which is represented diagrammatically by multiple wires side by side, Eq. (A2). In the locally tomographic GPTs that we consider here, such as quantum theory, $V_{A \cdot B} = V_A \otimes V_B$. For convenience, we omit the label when the exact system being discussed is not relevant or is clear from the context, or we use different kinds of wires to highlight the distinctions, as we do in Sec. II A. If we want to refer diagrammatically to a specific state of $A$, that is, some element $s$ of $\Omega_A$, we use a box (usually, but not necessarily, a triangular box) with an output wire $A$ connected to its top.
The transformations in a (tomographically local) GPT are linear functions from the vector space $V_A$ associated to a system of some type $A$ to the vector space $V_B$ associated to some system of type $B$. Hence, the set of transformations of type $A \to B$, denoted by $T_{A \to B}$, is a subset of $L(V_A, V_B)$, the set of linear transformations from $V_A$ to $V_B$. Diagrammatically, a particular transformation $T \in T_{A \to B}$ is denoted by a box with an input wire $A$ connected to its bottom and an output wire $B$ connected to its top. For quantum theory, the set of transformations $T_{A \to B}$ is the set of quantum operations, which correspond to completely positive trace-non-increasing maps from $V_A$ to $V_B$.
Of course, we may want to represent not just the transformation itself, but its action on a specific state. This is done by connecting the input wire of the transformation with a state of matching type, Eq. (A5). Note that, with this, viewing $T(s)$ as a vector can be expressed in diagrams by "sliding" the box representing $T$ until it merges with the box representing $s$. This is a manipulation of diagrams that is used often in this work.
The converse operation, in which a diagram containing only a state is split into one where a transformation $T$ is connected to a different state $s$, is also a valid manipulation. Notice that this mirrors exactly linear algebraic operations where an equation like $s' = T(s)$ is used for substitutions. Furthermore, boxes representing transformations can also be connected, when the wire types match, to represent their sequential composition. Because a sequence of linear transformations $T$ and $U$ can also be viewed as a single transformation $U \circ T$, the composition of both, the merging of boxes can also be done with transformations that are connected.
Naturally, one may need to represent transformations that happen in parallel on the parts of a composite system $A \cdot C$. While in linear algebraic notation this is done with the tensor product $\otimes$, so that, for $T$ of type $A \to B$ and $V$ of type $C \to D$, we write $T \otimes V$, in diagrammatic notation we simply put $T$ and $V$ side by side, where the order of the wires (from left to right) matters just like the order of the product $T \otimes V$.
The effects of a system $A$ in a GPT are linear functionals over $V_A$ that evaluate to probabilities, i.e., numbers in $[0, 1]$, for every valid state. This means that the set $E_A$ of effects of a system $A$ is a subset of $(V_A)^*$, the dual of $V_A$, such that $e \in E_A$ implies $e(s) \in [0, 1]$ for every $s \in \Omega_A$. Diagrammatically, the effects are represented as boxes with only inputs, so that applying an effect $e$ to a state $s$ gives Diagram (A10). Notice that Diagram (A10), unlike those in the previous examples, contains no loose wires. This means that $e(s)$ is a real number, and, similarly, any diagram in this formalism without loose wires represents a real number. Moreover, diagrams with only output (top) loose wires are always states, diagrams with only input (bottom) loose wires are always effects, and diagrams with both input and output loose wires are transformations.
Now that we have discussed the effects, we can define what it means for a state to be normalised in a GPT. This is done through a special effect, called the unit effect, which, for a system $A$, we denote by $u_A$. We say that a vector $s \in V_A$ is normalised if and only if $u_A(s) = 1$. Therefore, by our definition of the set of states $\Omega_A$, if $s \in \Omega_A$, then $u_A(s) = 1$. This special effect is denoted by a special diagram; diagrammatically, state normalisation is captured by the condition $u_A(s) = 1$. In the example of quantum theory, the unit effect of any system type is the trace operation, or, in other words, the trace inner product with the identity operator.
An important definition to be made, and one that appears nicely in diagrammatic notation, is that of separable effects and states. In quantum theory, a separable state is one which can be written as a convex combination of product quantum states. Here, we simply generalise that notion to any GPT state: $s \in \Omega_{A \cdot B}$ is separable if $s = \sum_i p_i \, r_i^A \otimes r_i^B$, with $p_i \in [0, 1]$ and $\sum_i p_i = 1$, $r_i^A \in \Omega_A$, and $r_i^B \in \Omega_B$. In diagrammatic notation, a separable state can be viewed as in Eq. (A13). Separable effects are defined similarly, with their diagrammatic representation being like the one above but where the loose wires come from the bottom instead of the top.
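To make the linear-algebraic side of these notions concrete, here is a small sketch for the quantum case, in which states are density matrices and effects are functionals $\mathrm{tr}(E\,\cdot)$. The helper names `effect` and `separable_state` are ours, and the particular states chosen are only illustrative.

```python
import numpy as np

def effect(E):
    """Effect e(.) = tr(E .) for an operator 0 <= E <= 1."""
    return lambda s: float(np.trace(E @ s).real)

def separable_state(weights, locals_A, locals_B):
    """s = sum_i p_i r_i^A (x) r_i^B for a probability vector p."""
    return sum(p * np.kron(a, b) for p, a, b in zip(weights, locals_A, locals_B))

unit = effect(np.eye(4))              # unit effect u_A (x) u_B on two qubits
zero = np.diag([1.0, 0.0])
one = np.diag([0.0, 1.0])
s = separable_state([0.25, 0.75], [zero, one], [zero, one])
e = effect(np.kron(zero, zero))       # a product effect e_A (x) e_B
```

Here `unit(s)` evaluates to 1, which is the normalisation condition, and `e(s)` evaluates to the weight 0.25 of the first product term: a closed diagram with no loose wires is a single real number.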

Further definitions
We can use these fundamental notions to define some concepts that are necessary in this work. These are positive cones, positive vectors, local tomography, the no-restriction hypothesis, the generalised no-restriction hypothesis, the maximal tensor product, trace-preserving transformations, and positive and completely positive transformations. Some of these are present in quantum theory, but here we need definitions that generalise them to arbitrary GPTs.
Definition A.1 (Positive Cone). A positive cone $X^+$ generated by a subset $X$ of a vector space $V$ is the set of nonnegative multiples of the elements of $X$. That is, $X^+ = \{\lambda x : x \in X, \ \lambda \geq 0\}$.
Definition A.2 (Positive Vector). A vector $v$ of a vector space $V_A$ associated to a system $A$ of a GPT is said to be positive, denoted $v \geq 0$, if $v \in \Omega_A^+$, the cone generated by the set of states.
Note that in quantum theory our notion of positive cones recovers the sets of positive operators from the sets of density matrices. This is because, for any quantum system $A$, the density matrices $\rho \in \Omega_A$ satisfy $\rho \geq 0$ and $\mathrm{tr}(\rho) = 1$, so by multiplying them by nonnegative numbers $\lambda$, we are simply dropping the unit trace assumption. Hence, the cone generated is $\Omega_A^+ = \{\rho \in V_A : \rho \geq 0\}$, that is, the set of positive operators on the Hilbert space corresponding to $A$. Conversely, using the unit effects, it is always possible to recover the states from the positive cones by restricting them to normalised vectors.
Diagrammatically, that $T$ is trace non-increasing means, for all $s \in \Omega_A$, that $u_B(T(s)) \leq u_A(s)$.
Definition A.4 (Trace-Preserving Operation). A transformation $T \in L(V_A, V_B)$ is said to be trace preserving if $\forall s \in \Omega_A$, $u_B(T(s)) = u_A(s)$.
Diagrammatically, that $T$ is trace preserving means, for all $s \in \Omega_A$, that $u_B(T(s)) = u_A(s)$. Diagrammatically, for all $s \in \Omega_A$, a positive transformation satisfies $T(s) \in \Omega_B^+$.
Definition A.6 (Completely Positive Transformation). A transformation $T \in L(V_A, V_B)$ is said to be completely positive if it is positive and, for any system $C$, the transformation $T \otimes \mathbb{1}_C \in L(V_A \otimes V_C, V_B \otimes V_C)$ is also positive. Diagrammatically, for all systems $C$ and all bipartite states $s \in \Omega_{A \cdot C}$, this means that $(T \otimes \mathbb{1}_C)(s) \in \Omega_{B \cdot C}^+$.
Any time that a transformation appears in a diagram, it is implied that it is a completely positive transformation for the corresponding GPT because it is an allowed transformation in said theory. The same applies for states: if they appear in a diagram, they must be positive in the corresponding GPT. So, the diagrams drawn in the beginning of this appendix are examples of positive transformations and states. Moreover, note that these notions recover those of positive, completely positive, trace preserving, and trace non-increasing maps when applied to quantum theory, because the cones are generated by the sets of states, and the unit effects are the trace operations.
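The gap between positivity and complete positivity in quantum theory (with the standard tensor product) can be seen numerically for the universal-NOT map $N(\rho) = \mathrm{tr}(\rho)\,\mathbb{1} - \rho$. The sketch below is illustrative only; the function names are ours, and the extension to two qubits uses the identity $(N \otimes \mathbb{1})(\rho_{AB}) = \mathbb{1}_A \otimes \mathrm{tr}_A(\rho_{AB}) - \rho_{AB}$, which follows by linearity from the action on product operators.

```python
import numpy as np

def universal_not(rho):
    """N(rho) = tr(rho) * 1 - rho on a single qubit."""
    return np.trace(rho) * np.eye(2) - rho

def universal_not_on_first(rho_ab):
    """(N (x) 1)(rho_AB) = 1_A (x) tr_A(rho_AB) - rho_AB for two qubits."""
    rho_b = np.einsum('ijik->jk', rho_ab.reshape(2, 2, 2, 2))  # partial trace over A
    return np.kron(np.eye(2), rho_b) - rho_ab

singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
rho_singlet = np.outer(singlet, singlet)

# Local action keeps the spectrum nonnegative ...
local_min = min(np.linalg.eigvalsh(universal_not(np.diag([0.3, 0.7]))))
# ... but acting on half of an entangled quantum state does not.
global_min = min(np.linalg.eigvalsh(universal_not_on_first(rho_singlet)))
```

With these inputs `local_min` is nonnegative while `global_min` equals $-1/2$, so the output of $N \otimes \mathbb{1}$ lies outside the quantum positive cone. Whether such a map is allowed therefore depends on which composite cone the theory uses.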
Definition A.7 (Local Tomography). A GPT is said to be locally tomographic if any state $\rho_{A_1 \cdot \ldots \cdot A_n}$ of a composite system $A_1 \cdot \ldots \cdot A_n$ can be uniquely determined by the information obtained from local effects $\{e^{A_1}\}, \ldots, \{e^{A_n}\}$ on its parts $A_1, \ldots, A_n$. When this holds, the unit effect for the composite system, $u_{A_1 \cdot \ldots \cdot A_n}$, is given by $u_{A_1} \otimes \ldots \otimes u_{A_n}$.
As an example of a GPT satisfying local tomography we have quantum theory. There, any $\rho_{A_1 \cdot \ldots \cdot A_n}$ is completely determined by a set of probabilities $(e^{A_1}_{i_1} \otimes \ldots \otimes e^{A_n}_{i_n})[\rho_{A_1 \cdot \ldots \cdot A_n}]$, where each local effect $e^{A_j}_{i_j}$ denotes the inner product of the quantum state with the corresponding POVM element.
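Definition A.8 (Maximal Tensor Product). The maximal tensor product, $\otimes_{\max}$, is a rule for the combination of two systems into one, say, $A$ and $B$ into $A \cdot B$, that defines the positive cone of the composite system as the largest set of vectors in $V_A \otimes V_B$ that is consistent (in the sense of producing sensible probabilities) with all the separable effects of $A \cdot B$. That is, $\Omega_{A \cdot B}^+ = \Omega_A^+ \otimes_{\max} \Omega_B^+ := \{ s \in V_A \otimes V_B : (e_A \otimes e_B)(s) \geq 0 \ \ \forall e_A \in E_A, \ e_B \in E_B \}$. The maximal tensor product $\otimes_{\max}$ is associative [46], hence one can unambiguously write $\Omega_{A_1}^+ \otimes_{\max} \cdots \otimes_{\max} \Omega_{A_n}^+$, which has the explicit form $\{ s \in V_{A_1} \otimes \cdots \otimes V_{A_n} : (e_{A_1} \otimes \cdots \otimes e_{A_n})(s) \geq 0 \ \ \forall e_{A_i} \in E_{A_i} \}$. As discussed after Def. A.2, this operation fixes the state spaces for the composite systems, because the cone $\Omega_{A \cdot B}^+$ and the unit effect $u_A \otimes u_B$ (which is the unit effect $u_{A \cdot B}$ in locally tomographic GPTs [32]) can be used to construct the set of states $\Omega_{A \cdot B}$.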
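A simple way to see that the maximal tensor product of two qubit cones is strictly larger than the quantum cone is to test a candidate vector against product effects. The sketch below uses the swap operator divided by 2 as the candidate (our choice of example), relies on the identity $\mathrm{tr}[(E_A \otimes E_B)\,\mathrm{SWAP}] = \mathrm{tr}(E_A E_B) \geq 0$, and samples random effects with a hypothetical helper `random_effect`; the sampling is numerical evidence, not a proof.

```python
import numpy as np

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)
candidate = SWAP / 2  # unit trace, but with an eigenvalue -1/2

def random_effect(rng):
    """Random qubit operator with 0 <= E <= 1 (hypothetical sampler)."""
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    E = g @ g.conj().T                       # positive semidefinite
    return E / np.linalg.eigvalsh(E)[-1]     # rescale largest eigenvalue to 1

def min_product_probability(s, rng, samples=2000):
    """Smallest value of (e_A (x) e_B)(s) over randomly sampled product effects."""
    return min(np.trace(np.kron(random_effect(rng), random_effect(rng)) @ s).real
               for _ in range(samples))
```

The candidate is normalised and gives nonnegative values on every sampled product effect, yet it is not a positive semidefinite operator, so it would be a state of the max-tensor composite of two qubits without being a quantum state.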
Definition A.9 (No-Restriction Hypothesis [32]). A theory is said to satisfy the no-restriction hypothesis (NRH) if any element $e \in (V_A)^*$ that gives $e(\rho) \in [0, 1]$ for every $\rho \in \Omega_A$ is an element of $E_A$. That is, if $E_A = \{ e \in (V_A)^* : e(\rho) \in [0, 1] \ \ \forall \rho \in \Omega_A \}$. This is to say, the NRH is the statement that, given the set of states, the set of effects is the largest possible that still gives sensible probabilities for every state.
Definition A.10 (Generalised No-Restriction Hypothesis). A theory is said to satisfy the Generalised No-Restriction Hypothesis (GNRH) if it satisfies the NRH and every completely positive trace non-increasing transformation is an allowed transformation. That is, for any two systems $A$ and $B$, every transformation $T \in L(V_A, V_B)$ such that $T \otimes \mathbb{1}_C$ takes elements in $\Omega_{A \cdot C}^+$ to elements in $\Omega_{B \cdot C}^+$ for any third system $C$, and that satisfies $u_B(T(\rho)) \leq u_A(\rho)$ for any $\rho \in \Omega_A$, is a valid transformation from $A$ to $B$. This is a convenient assumption to make about a theory because it simplifies its description, since it implies that the specification of the state spaces uniquely fixes both the effects and transformations. Again, we can use quantum theory as an example, as it does satisfy the GNRH.
Our last definition in this section is that of entanglement witnesses in a generic GPT. This definition follows closely to that found in the context of quantum theory.
Definition A.11 (Generic Bipartite Entanglement Witness). The set $W_{A \cdot B}$ of entanglement witnesses of a bipartite system $A \cdot B$ for a generic locally tomographic GPT is given by $W_{A \cdot B} = \{ w \in V_A \otimes V_B : \langle w, s_A \otimes s_B \rangle \geq 0 \ \ \forall s_A \in \Omega_A, \ s_B \in \Omega_B \}$. Note that this will depend on the choice of inner product. In the quantum case the standard choice will be the Hilbert-Schmidt inner product.
That is, an entanglement witness is a vector associated, through the Riesz representation, to a linear functional which evaluates to positive numbers for every product state of the bipartite system. This is a simple generalization of the quantum entanglement witnesses that uses arbitrary GPT states instead of quantum states. The generalization for multipartite entanglement witnesses is straightforward.
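Definition A.12 (Generic Multipartite Entanglement Witness). The set $W_{A_1 \cdot \ldots \cdot A_n}$ of the entanglement witnesses of a multipartite system $A_1 \cdot \ldots \cdot A_n$ for a generic locally tomographic GPT is given by $W_{A_1 \cdot \ldots \cdot A_n} = \{ w \in V_{A_1} \otimes \cdots \otimes V_{A_n} : \langle w, s_{A_1} \otimes \cdots \otimes s_{A_n} \rangle \geq 0 \ \ \forall s_{A_i} \in \Omega_{A_i} \}$. Like in the bipartite case, this will depend on the choice of inner product. Again, in the quantum case the standard choice will be the Hilbert-Schmidt inner product.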
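For the two-qubit quantum case with the Hilbert-Schmidt inner product, the behaviour of a witness can be sketched in a few lines. The swap operator is used here as an example witness (our choice, not taken from the text), since $\mathrm{tr}[\mathrm{SWAP}\,(\rho_A \otimes \rho_B)] = \mathrm{tr}(\rho_A \rho_B) \geq 0$ on product states; the helper names are ours.

```python
import numpy as np

SWAP = np.eye(4)[[0, 2, 1, 3]]  # permutation matrix swapping the two qubits

def hs_inner(w, s):
    """Hilbert-Schmidt inner product <w, s> = tr(w^dagger s)."""
    return np.trace(w.conj().T @ s).real

def pure(v):
    """Density matrix of the normalised vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

product = np.kron(pure([1, 1j]), pure([2, -1]))
singlet = pure([0, 1, -1, 0])
```

Evaluating `hs_inner(SWAP, product)` gives a nonnegative number (the squared overlap of the two local states), whereas `hs_inner(SWAP, singlet)` gives $-1$: the functional is nonnegative on product states but negative on an entangled one.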

Some useful results
We now prove (or reprove) various results which are useful later on. Firstly we note an important consistency condition for the max tensor product. This is well known in the literature (see, e.g., Ref. [73]) but we reproduce it here for completeness.
Proposition A.13. In a theory where systems compose via the max tensor product, and that satisfies the NRH, the vectors $v \in V_A$ that can be steered to from bipartite states in $\Omega_{A \cdot B}$ correspond to subnormalised states, i.e., they live in $\Omega_A^+$ and satisfy $u_A(v) \leq 1$.
Proof. Steered states are of the form $v_A = (\mathbb{1}_A \otimes e_B)(s_{AB})$ for a bipartite state $s_{AB}$, an effect $e_B$, and the identity transformation $\mathbb{1}_A$. Note that a special case of this is the reduced state, which is given by taking $e_B = u_B$. We want these vectors to be local (subnormalised) states. Note that, by definition of the max tensor product, we have for all $e_A$ that $(e_A \otimes e_B)(s_{AB}) \in [0, 1]$ and hence that $e_A(v_A) \in [0, 1]$. As in our theory the local effects $e_A$ are defined via the NRH, this means that $v_A$ must be in the cone $\Omega_A^+$. Moreover, it is easy to compute that it is (sub)normalised, as $u_A(v_A) = (u_A \otimes e_B)(s_{AB}) \leq (u_A \otimes u_B)(s_{AB}) = 1$.
Another well-known result (see, e.g., Ref. [46]) is the following.
Proposition A.14. In a GPT composed by the max tensor product, every effect on a composite system is a separable effect [46].
Proof. It follows straightforwardly by noticing that the use of ⊗ max to combine systems implies that the set of effects of the combined system A · B is just the set of separable effects.
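As a numerical illustration of Proposition A.13, one can steer from a vector that belongs to the max tensor product of two qubit cones but is not a quantum state. The sketch below takes the swap operator divided by 2 as that vector (our choice of example) and uses the identity $\mathrm{tr}_B[(\mathbb{1} \otimes E_B)\,\mathrm{SWAP}] = E_B$; the function name `steer` is ours.

```python
import numpy as np

SWAP = np.eye(4)[[0, 2, 1, 3]]
s_ab = SWAP / 2  # normalised vector in the max tensor product of two qubits

def steer(s_ab, E_B):
    """v_A = (1_A (x) e_B)(s_AB) with e_B = tr(E_B .), via a partial trace over B."""
    t = (np.kron(np.eye(2), E_B) @ s_ab).reshape(2, 2, 2, 2)
    return np.einsum('ijkj->ik', t)

E_B = np.array([[0.8, 0.1], [0.1, 0.3]])  # an operator with 0 <= E_B <= 1
v_A = steer(s_ab, E_B)
```

The steered vector equals `E_B / 2`, which is positive semidefinite with trace at most 1, so even though `s_ab` is not a quantum state, what Alice ends up with is a valid subnormalised local state.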
Next we prove a lemma which is useful for proving a key observation.
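Lemma 1. In a GPT containing systems $A$ and $B$, if the no-restriction hypothesis holds and the map $T \in L(V_A, V_B)$ is positive, then $e_B \circ T \in E_A^+$ for all $e_B \in E_B^+$.
Proof. Recall that positivity of $T \in L(V_A, V_B)$ means that $T(s) \in \Omega_B^+$ for all $s \in \Omega_A^+$. Now, using this positivity and noting that $e_B \in E_B^+$, we find for all $s \in \Omega_A^+$ that $(e_B \circ T)(s) = e_B(T(s)) \geq 0$. Hence, by the no-restriction hypothesis, $e_B \circ T \in E_A^+$, which completes the proof.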
The lemma above can be used to prove a useful fact for our work.
Theorem A.15. In any GPT that combines systems through $\otimes_{\max}$, the max tensor product, and satisfies the no-restriction hypothesis, if a map $T$ is positive, then it is completely positive.
Proof. For the sake of contradiction, assume that, in a GPT satisfying the NRH where $\otimes_{\max}$ is the combination rule, the map $T \in L(V_A, V_B)$ is positive but not completely positive. That is, there exists a system $C$ such that $T \otimes \mathbb{1}_C \in L(V_A \otimes V_C, V_B \otimes V_C)$ is not positive. That means that there must exist some bipartite state $s \in \Omega_{A \cdot C} \subset \Omega_A^+ \otimes_{\max} \Omega_C^+$ such that $(T \otimes \mathbb{1}_C)(s) \notin \Omega_B^+ \otimes_{\max} \Omega_C^+$. By the definition of $\otimes_{\max}$, this means there exist $t \in E_B$ and $v \in E_C$ such that $(t \otimes v)\big((T \otimes \mathbb{1}_C)(s)\big) < 0$, Eq. (A30). However, as $t \in E_B \subset E_B^+$, we know from Lemma 1 that $t \circ T \in E_A^+$, and hence that there exist $\lambda \geq 0$ and $t' \in E_A$ such that $t \circ T = \lambda t'$. Substituting this into Eq. (A30) gives us that $\lambda \, (t' \otimes v)(s) < 0$, but, as $t' \in E_A$ and $v \in E_C$, this means that $s \notin \Omega_A^+ \otimes_{\max} \Omega_C^+$ and, hence, we have reached a contradiction.

Appendix B: Formal definition and features of Witworld
In this Appendix we provide the formal definition of Witworld as a GPT, and the proofs that it does indeed possess the features mentioned in the main text. That task amounts to explicitly saying what are the states, effects and transformations of Witworld, following the formalities of the GPT framework mentioned in App. A. For Witworld, this is simplified because we define it to satisfy the GNRH, so by providing just the state spaces for each system type (including the multipartite ones, which requires the combination rule), we determine the complete GPT.
As was said in the main text, Witworld contains systems which we call atomic systems, and systems which we call composite systems. A system $A$ is atomic if it cannot be considered as being the result of combining a system $B$ with another system $C$. That is, there are no $B$ and $C$ such that $A = B \cdot C$. This means that any system type can be built from the atomic systems, so to determine all the types in Witworld, we only need to say what the atomic systems are and what the combination rule $\cdot$ is. Regarding the latter, we choose $\cdot$ to be the maximal tensor product $\otimes_{\max}$. Regarding the former, we now formally define the atomic systems:
Definition B.1 (Quantum system $Q_d$, $d \in \mathbb{N}$). A quantum system of type $d$ has as its vector space $V_{Q_d}$, as positive cone $\Omega_{Q_d}^+$, as effects set $E_{Q_d}$, and as unit effect the function $u_{Q_d}(\cdot)$, all defined as follows:
• $V_{Q_d}$: the real vector space of Hermitian operators on a Hilbert space $H_d$ of dimension $d$.
• $\Omega_{Q_d}^+ = \{\rho \in V_{Q_d} : \rho \geq 0\}$: the set of positive operators on $H_d$.
• $u_{Q_d}(\cdot) = \mathrm{tr}(\mathbb{1} \, \cdot)$: the trace inner product with the identity operator $\mathbb{1}$ in $L(H_d, H_d)$.
• $E_{Q_d} = \{\mathrm{tr}(E \, \cdot) : E \in V_{Q_d}, \ 0 \leq E \leq \mathbb{1}\}$: the set of trace inner products with operators in $V_{Q_d}$ that are positive and smaller than or equal to the identity, as required by the NRH.
For atomic systems, the quantum type coincides with those of single systems in traditional quantum theory, and indeed local states and effects of atomic systems coincide for quantum types in both theories. However, as we see later on, this no longer holds for either transformations or composite systems. For the latter, this follows from the fact that the combination rule in Witworld, the maximal tensor product, is not the same as in quantum theory, so $Q_d \cdot Q_{d'} \neq Q_{dd'}$.
Definition B.2 (Classical system $C_v$, $v \in \mathbb{N}$). A classical system of type $v$ has as its vector space $V_{C_v}$, as positive cone $\Omega_{C_v}^+$, as effects set $E_{C_v}$, and as unit effect the function $u_{C_v}(\cdot)$, all defined as follows:
• $V_{C_v} = \mathbb{R}^{v-1} \oplus \mathbb{R} \cong \mathbb{R}^v$: the direct sum of a $(v-1)$-dimensional real vector space with the real numbers. We can work with the isomorphic space $\mathbb{R}^v$ to simplify notation.
• $\Omega_{C_v}^+$: the set of vectors in $\mathbb{R}^v$ that are the null vector or have a positive last component and whose first $v - 1$ components divided by the last give the probabilities for $v - 1$ outcomes of a measurement with $v$ possible outcomes.
• $u_{C_v}(\cdot) = \langle (0, \ldots, 0, 1)^T, \cdot \rangle$: the Euclidean inner product with the vector in $\mathbb{R}^v$ whose only nonzero component is the last one, which is 1.
• $E_{C_v} = \{ \langle e, \cdot \rangle : e \in V_{C_v}, \ \langle e, s \rangle \in [0, 1] \ \forall s \in \Omega_{C_v} \}$: the set of Euclidean inner products with vectors in $V_{C_v}$ that evaluate to probabilities for every vector in $\Omega_{C_v}$, as required by the NRH.
Those are the traditional classical systems: probability distributions written as vectors. Such vectors can always be seen as a convex combination of deterministic states. Note that, unlike for quantum systems, for classical systems we do have that $C_d \cdot C_{d'} = C_{dd'}$. Writing classical states in this form allows us to further notice that they are just particular cases of the Boxworld type.
Definition B.3 (Boxworld system $B_{n,k}$, $(n, k) \in \mathbb{N}^2$). A Boxworld system of type $(n, k)$ has as its vector space $V_{B_{n,k}}$, as positive cone $\Omega_{B_{n,k}}^+$, as effects set $E_{B_{n,k}}$, and as unit effect the function $u_{B_{n,k}}(\cdot)$, all defined as follows:
• $V_{B_{n,k}} = (\mathbb{R}^n \otimes \mathbb{R}^{k-1}) \oplus \mathbb{R}^1 \cong \mathbb{R}^{n(k-1)+1}$: the direct sum of the real numbers with the tensor product of two real vector spaces of dimensions $n$ and $k - 1$. We can work with the isomorphic space $\mathbb{R}^{n(k-1)+1}$ to simplify notation.
• $\Omega_{B_{n,k}}^+$: the vectors in $\mathbb{R}^{n(k-1)+1}$ which are the null vector or that have a positive last component and whose first $n(k - 1)$ components, divided by the last component, can be viewed as the probabilities for the first $k - 1$ outcomes of $n$ measurements of $k$ possible outcomes, stacked in a list.
• $u_{B_{n,k}}(\cdot) = \langle u, \cdot \rangle$, where $u = (0_{\mathbb{R}^n} \otimes 0_{\mathbb{R}^{k-1}}) \oplus 1$: the inner product with the vector in $\mathbb{R}^{n(k-1)+1}$ with only null components except for the last, which is 1.
• $E_{B_{n,k}} = \{ \langle e, \cdot \rangle : e \in V_{B_{n,k}}, \ \langle e, s \rangle \in [0, 1] \ \forall s \in \Omega_{B_{n,k}} \}$: the set of inner products with vectors in $V_{B_{n,k}}$ that satisfies the NRH.
The Boxworld systems can be viewed as classical systems that require many measurements to uniquely determine a state, rather than just 1, and the probability distributions for those measurements are independent of each other. A classical system of type $v$, then, can be viewed as a Boxworld system of type $(1, v)$. Note, however, that unlike classical systems, the composite of two more general Boxworld systems is no longer an atomic Boxworld system, that is, $B_{n,k} \cdot B_{n',k'} \neq B_{n'',k''}$.
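The vector encoding used in Definitions B.2 and B.3 can be sketched as follows. This is only an illustration of the bookkeeping, with hypothetical helper names (`boxworld_vector`, `unit_effect`, `recover`); it does not check membership in the state space beyond what the caller supplies.

```python
import numpy as np

def boxworld_vector(dists, scale=1.0):
    """Cone vector of B_{n,k} from n distributions over k outcomes.

    Keeps the first k-1 probabilities of each measurement and appends the
    normalisation component; scale > 0 rescales along the cone.
    """
    dists = np.asarray(dists, dtype=float)
    return scale * np.append(dists[:, :-1].ravel(), 1.0)

def unit_effect(vec):
    """Inner product with (0, ..., 0, 1): reads off the last component."""
    return vec[-1]

def recover(vec, n, k):
    """Invert boxworld_vector: rebuild the n full distributions."""
    head = (vec[:-1] / vec[-1]).reshape(n, k - 1)
    return np.column_stack([head, 1.0 - head.sum(axis=1)])

classical = boxworld_vector([[0.2, 0.5, 0.3]])       # C_3 viewed as B_{1,3}
gbit = boxworld_vector([[1.0, 0.0], [0.0, 1.0]])     # a state of B_{2,2}
```

The classical example has $v = 3$ components and the $B_{2,2}$ example has $n(k-1)+1 = 3$ components. Dividing by the last component before reading off probabilities is what makes `recover` insensitive to rescaling along the cone.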
When using diagrams, we denote the atomic classical, quantum, and Boxworld systems by different types of wires:

[Diagram (B1): wires for the atomic classical, quantum, and Boxworld systems.]

When we need to talk about an arbitrary kind of system, the wire we use is the following:

[Diagram: wire for an arbitrary system.]

As stated previously, from the atomic systems any general system in Witworld can be constructed as an arbitrary composite of the three fundamental system types, $Q_d$, $C_v$, $B_{n,k}$. For instance, $Q_d \cdot Q_d$ and $Q_d \cdot C_v \cdot C_v \cdot B_{n,k} \cdot Q_d$ would both be systems within our theory. More generally, systems correspond to arbitrary strings of elements from the set $\{Q_d, C_v, B_{n,k}\}_{d,v,n,k \in \mathbb{N}}$. The positive cones for these composite systems are obtained through the max tensor product of the cones of the atomic types, and from those we can obtain the set of states by taking the intersection of the cone with the set of normalised vectors in the product vector space. Here, by normalised vectors we mean vectors for which the unit effect evaluates to 1. Since Witworld is a locally tomographic GPT, the unit effect for $A_1 \cdot \ldots \cdot A_n$ is simply $u_{A_1} \otimes \ldots \otimes u_{A_n}$.
Since Witworld, by definition, satisfies the NRH, once we establish what the states for every type of system are, the effects are also determined. Given any system like $Q_d \cdot B_{n,k} \cdot \ldots$, every linear functional on $V_{Q_d} \otimes V_{B_{n,k}} \otimes \ldots$ that gives probabilities for every vector in $\Omega_{Q_d \cdot B_{n,k} \cdot \ldots}$ is a valid effect.
The definition of the transformations in Witworld is similar to that of the effects. Here, we require the theory to satisfy the GNRH, so when we determine the states, the transformations are fixed. In Witworld, any completely positive transformation is allowed. We prove later that this, together with the fact that the combination rule is ⊗ max , implies that any positive transformation is an allowed transformation for arbitrary systems in Witworld.
Formally, and concisely, Witworld is therefore defined as follows:

Definition B.4 (Witworld). Witworld is the locally-tomographic GPT that satisfies the generalised no-restriction hypothesis, and whose systems are arbitrary combinations under the max tensor product $\otimes_{\max}$ of the atomic system types described in definitions B.1, B.2, and B.3.
Now that we have presented the definition of the theory, we can move on to observing or proving the various features of Witworld which we used in the main text. Note that, because Witworld composes via the max tensor product and satisfies the GNRH, all of the results of App. A 3 hold.
Firstly, Proposition A.14 tells us that in Witworld there are only separable effects. Therefore, for systems that are the combination of atomic quantum systems, there are fewer effects in Witworld than in quantum theory: effects from measurements in an entangled basis are not present in Witworld. For Boxworld and classical systems, such a difference does not exist: that is, Boxworld and classical system types feature separable-only effects both in Witworld and in their respective traditional frameworks.
Theorem A.15, together with the GNRH, means that local transformations which in quantum theory are positive but not completely positive maps are indeed valid transformations in Witworld. This is important for constructing many examples of post-quantum assemblages. Again, for Boxworld and classical systems this distinction does not exist, as for them the notions of positivity and complete positivity already coincide in their respective traditional frameworks.
Next we show that bipartite states for quantum systems within Witworld correspond to quantum entanglement witnesses.
Theorem B.5. In Witworld, the bipartite system resulting from the combination of atomic quantum systems $Q_d$ and $Q_{d'}$ contains every bipartite entanglement witness in its positive cone.
Proof. Take the set of effects $E_{Q_d}$. To each effect $\phi_e$ in it, there is a vector $e \in V_{Q_d}$ associated to it by its Riesz representation with the Hilbert-Schmidt inner product. Let us call the set of all $e \in V_{Q_d}$ associated to some $\phi_e \in E_{Q_d}$ by $\tilde{E}_{Q_d}$. Do the same to define $\tilde{E}_{Q_{d'}}$. In quantum theory, $\tilde{E}_+ = \Omega_+$ because the effects are inner products with positive operators. Now, note that by changing $E^A \to E^A_+$ and $E^B \to E^B_+$ we do not change the set defined by equation A.11. Hence, with $A = Q_d$ and $B = Q_{d'}$, we can write it, already using the action of the effect as an inner product, as

$$\left\{ s \in V^{A} \otimes V^{B} : \langle e^{A} \otimes e^{B}, s\rangle \geq 0 \ \ \forall e^{A} \in \Omega^{A}_+,\ e^{B} \in \Omega^{B}_+ \right\} = W_{A \otimes B}.$$

This theorem means that if we compare the positive cones in quantum theory with the positive cones in Witworld when combining two atomic quantum systems, we see that the positive cone in Witworld is larger than the positive cone in quantum theory. To see this more explicitly, suppose $A$ and $B$ are two atomic quantum systems, and denote $A$ and $B$ combined as prescribed by quantum theory by $A \otimes B$, and as prescribed by Witworld by $A \cdot B$. Then, since definition A.11 is independent of the combination rule and is equivalent to what is used in quantum theory, theorem B.5 implies

$$\Omega^{A \cdot B}_+ \supseteq W_{A \otimes B} \supsetneq \Omega^{A \otimes B}_+,$$

where the last inclusion is given from quantum theory. Nevertheless, since classical and Boxworld systems originally combine through $\otimes_{\max}$, in Witworld the combination of atomic systems of said types does not build more states than what we would normally have. Finally, states that are the combination of different types of atomic systems are incomparable to states in quantum, Boxworld or classical systems.
The fact that we can view quantum entanglement witnesses as valid states in Witworld is a key feature which underpins many of our realisations of post-quantum assemblages. This is also true in the multipartite generalisation which we now prove.
Theorem B.6. In Witworld, the multipartite system resulting from the combination of atomic quantum systems $Q_{d_1} = A_1, \ldots, Q_{d_n} = A_n$ contains every multipartite entanglement witness in its positive cone.
Proof. This is a straightforward generalization of the bipartite case:

$$\left\{ s \in \bigotimes_{i=1}^{n} V^{A_i} : \Big\langle \bigotimes_{i=1}^{n} e^{A_i}, s \Big\rangle \geq 0 \ \ \forall e^{A_i} \in \Omega^{A_i}_+ \right\} = W_{A_1 \otimes \ldots \otimes A_n} \qquad \text{(B4)}$$

where $e^{A_i} \in V^{A_i}$ is associated to $\phi^{A_i}_e \in (V^{A_i})^*$ by the Riesz representation and $\tilde{E}^{A_i}_+ = \Omega^{A_i}_+$ because the $A_i$ are atomic quantum systems.
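As a hedged numerical illustration of Theorem B.5, the sketch below uses the two-qubit swap operator, rescaled to unit trace, as an example entanglement witness. The choice of this particular witness, the random sampling of positive operators, and all variable names are assumptions made for illustration only. The operator gives non-negative values on all sampled product effects, so it lies in the max tensor product cone, yet it has a negative eigenvalue and is therefore not a quantum state.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2

# Swap operator F = sum_{ij} |ij><ji| on C^d (x) C^d.
F = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        F[i * d + j, j * d + i] = 1.0

# tr(F) = d, so W has unit trace (normalised with respect to the unit effect).
W = F / d

def random_psd(dim):
    """Random positive semidefinite matrix (unnormalised effect direction)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T

# Hilbert-Schmidt inner product <e_A (x) e_B, W> on sampled product effects.
values = [np.trace(np.kron(random_psd(d), random_psd(d)) @ W).real
          for _ in range(1000)]

min_product_value = min(values)           # non-negative up to rounding
min_eigenvalue = np.linalg.eigvalsh(W).min()  # negative: W is not PSD
```

Analytically, $\langle e^A \otimes e^B, F\rangle = \mathrm{tr}(e^A e^B) \geq 0$ for positive operators, which is what the sampled values reflect; the negative eigenvalue belongs to the antisymmetric (singlet) vector.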

Appendix C: How to realise a PR-box in Boxworld and Witworld
To make explicit that Witworld can, in fact, realise a PR-box, we explicitly write down the elements in the diagram and thereby show that the following holds:

$$p(ab|xy) = \tfrac{1}{2}\,\delta_{a \oplus b,\, xy},$$

where $\oplus$ is addition modulo 2, and $a, b, x, y \in \{0, 1\}$.
The first step is to define the measurements $M_{PR}$ and $M'_{PR}$. We characterise these by their associated sets of effects. For example, $M_{PR}$ is characterised by $\{e^A_{a|x}\}$, for the outcome $a$ when measurement $x$ is performed; diagrammatically these are defined as:

[Diagram (C3): definition of the effects $e^A_{a|x}$.]

Similarly, we can characterise the measurement $M'_{PR}$ by the set of effects $\{e^B_{b|y}\}$, which are defined analogously. We therefore want to verify that there exist measurements and states such that the PR-box diagram above is reproduced, which, to make this more explicit, can be rewritten symbolically as:

$$(e^A_{a|x} \otimes e^B_{b|y})(s_{PR}) = \tfrac{1}{2}\,\delta_{a \oplus b,\, xy}.$$

Following definition B.3, the states of $B_{2,2}$ are three-component real vectors, whose first component is $p(0|0)$, second component is $p(0|1)$ and last component is 1. Therefore, for $e_{a|x}(s) = p(a|x)$ to hold, we need the vectors $\tilde{e}_{a|x}$, associated to $e_{a|x}$ by the Riesz representation, to be given by

$$\tilde{e}_{0|0} = (1, 0, 0)^T, \quad \tilde{e}_{1|0} = (-1, 0, 1)^T, \quad \tilde{e}_{0|1} = (0, 1, 0)^T, \quad \tilde{e}_{1|1} = (0, -1, 1)^T.$$

Also by definition B.3, the Riesz representation of the unit effect $u^A$ is given by

$$\tilde{u}^A = (0, 0, 1)^T,$$

so that $\tilde{e}_{0|x} + \tilde{e}_{1|x} = \tilde{u}^A$, which makes $\{e_{a|x}\}$ a valid measurement in $B_{2,2}$ for each $x$. From the vectors above, we can, using the Kronecker product, write $\tilde{e}_{a|x} \otimes \tilde{e}_{b|y}$, which are associated to the product effects of the composite system $B_{2,2} \cdot B_{2,2}$. Now, any vector $s_{AB}$ in $V_{B_{2,2}} \otimes V_{B_{2,2}} \cong \mathbb{R}^9$ such that $(e_{a|x} \otimes e_{b|y})(s_{AB}) = \langle \tilde{e}_{a|x} \otimes \tilde{e}_{b|y}, s_{AB}\rangle \geq 0$ for all $a, b, x, y \in \{0, 1\}$ is in the positive cone $\Omega^{B_{2,2}}_+ \otimes_{\max} \Omega^{B_{2,2}}_+$. We now show that the following vector describes a normalised state in the positive cone of bipartite states and, moreover, reproduces the statistics of the PR box as we desire. That is, consider the vector

$$s_{PR} = \tfrac{1}{2}\,(1, 1, 1, 1, 0, 1, 1, 1, 2)^T,$$

and note that it satisfies $\langle \tilde{e}_{a|x} \otimes \tilde{e}_{b|y}, s_{PR}\rangle = \tfrac{1}{2}\,\delta_{a \oplus b,\, xy}$, which can be verified by direct calculation. Therefore $s_{PR}$ is in the positive cone of $B_{2,2} \cdot B_{2,2}$ and moreover reproduces the PR box statistics. Finally, it is also normalised, as

$$(u^A \otimes u^B)(s_{PR}) = \langle \tilde{u}^A \otimes \tilde{u}^B, s_{PR}\rangle = 1.$$

Hence, $s_{PR}$ is a valid state of $B_{2,2} \cdot B_{2,2}$ which recovers the PR-box under the separable measurements $\{e^A_{a|x} \otimes e^B_{b|y}\}$, which proves that Witworld can indeed realise a PR-box.
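The direct calculation mentioned above can be sketched numerically. The explicit vectors in the code are the ones fixed by the requirement $e_{a|x}(s) = p(a|x)$ on states of the form $(p(0|0), p(0|1), 1)$, together with the PR-box statistics in the Kronecker-product ordering; the variable names and the final CHSH check are illustrative additions rather than part of the original appendix.

```python
import numpy as np
from itertools import product

# Riesz vectors of the effects e_{a|x} on B_{2,2}; keys are (a, x).
e = {(0, 0): np.array([1.0, 0.0, 0.0]),
     (1, 0): np.array([-1.0, 0.0, 1.0]),
     (0, 1): np.array([0.0, 1.0, 0.0]),
     (1, 1): np.array([0.0, -1.0, 1.0])}
u = np.array([0.0, 0.0, 1.0])  # unit effect

# Candidate PR-box state in R^9 (Kronecker ordering of the two R^3 factors).
s_pr = 0.5 * np.array([1, 1, 1, 1, 0, 1, 1, 1, 2], dtype=float)

def prob(a, b, x, y):
    """p(ab|xy) as the inner product of a product effect with s_pr."""
    return np.kron(e[(a, x)], e[(b, y)]) @ s_pr

table = {(a, b, x, y): prob(a, b, x, y)
         for a, b, x, y in product((0, 1), repeat=4)}

# Expected PR-box statistics: 1/2 if a XOR b == x*y, else 0.
expected = {(a, b, x, y): 0.5 if (a ^ b) == x * y else 0.0
            for a, b, x, y in product((0, 1), repeat=4)}

normalisation = np.kron(u, u) @ s_pr

# CHSH value: sum of correlators, with a minus sign for x = y = 1.
chsh = sum((-1) ** (x * y) * (-1) ** (a ^ b) * table[(a, b, x, y)]
           for a, b, x, y in product((0, 1), repeat=4))
```

All sixteen probabilities are non-negative, which is the cone-membership condition for the max tensor product, and the CHSH value reaches the algebraic maximum of 4.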
and, instead, merely keeps his system, which is by assumption a quantum one. Then Bob has the subnormalised states $\sigma_{a|x}$, which are given by the following scheme:

$$\sigma_{a|x} = p(a|x)\,\rho_{a|x},$$

where $\rho_{a|x}$ is the normalised state in possession of Bob when Alice obtains outcome $a$ upon measuring $x$ with probability $p(a|x)$. The complete description of this scenario is specified by the set $\Sigma_{A|X} = \{\sigma_{a|x}\}_{a \in A, x \in X}$ of subnormalised quantum states, which contains the information about the states that Bob can have by the end of Alice's measurement, each of them conditioned on some measurement $x$ and outcome $a$ on Alice's side, together with the probability of the outcome $a$ happening. This set of subnormalised quantum states, $\Sigma_{A|X}$, is known as an assemblage [53]. Note that the assemblage elements $\sigma_{a|x}$ indeed contain the complete information about the scenario because $p(a|x) = \mathrm{tr}(\sigma_{a|x})$ and $\rho_{a|x} = \sigma_{a|x} / \mathrm{tr}(\sigma_{a|x})$.

Of course, if signalling is permitted between Alice and Bob, then (within quantum theory) any assemblage can be trivially prepared, so we restrict our discussion to the non-signalling scenarios. The assemblages that can possibly be produced in this case are called non-signalling assemblages. We define them as being those satisfying conditions analogous to those that define non-signalling boxes in Bell scenarios:

Definition D.1 (No signalling bipartite assemblages). An assemblage $\Sigma_{A|X}$ is no signalling iff

$$\sum_{a} \sigma_{a|x} = \sum_{a} \sigma_{a|x'} = \rho_B \quad \forall x, x' \in X, \qquad \mathrm{tr}(\rho_B) = 1.$$

In the channel based picture, these constraints can be captured diagrammatically by the condition that $\Sigma$ is a 'causal' channel [74]:

[Diagram (D6): causality condition for the channel $\Sigma$.]

It can be seen that this is equivalent to the standard no-signalling condition by noting that whatever input $x$ is chosen for the classical system $X$ can have no influence over the quantum system, as it is in a fixed normalised state $\rho_B$. The study of steering is the study of the properties of assemblages or, equivalently, the study of the properties of 'causal' classical-quantum channels.
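A minimal numerical sketch of a bipartite assemblage and its no-signalling property follows. The shared two-qubit singlet state and Alice's Pauli $Z$ and $X$ measurements are assumptions chosen purely for illustration, as are the helper names `projectors` and `partial_trace_A`; the outcome labels follow the eigenvalue ordering returned by `numpy.linalg.eigh` and are otherwise arbitrary.

```python
import numpy as np

# Assumed shared state: two-qubit singlet (|01> - |10>)/sqrt(2).
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

def projectors(observable):
    """Rank-one eigenprojectors of a qubit observable (hypothetical helper)."""
    _, vecs = np.linalg.eigh(observable)
    return [np.outer(vecs[:, i], vecs[:, i].conj()) for i in range(2)]

Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
measurements = [projectors(Z), projectors(X)]  # x = 0 and x = 1

def partial_trace_A(op):
    """Trace out the first qubit of a 4x4 operator on A (x) B."""
    return np.einsum('abad->bd', op.reshape(2, 2, 2, 2))

# Assemblage elements sigma_{a|x} = tr_A[(M_{a|x} (x) 1) rho]; keys are (a, x).
sigma = {(a, x): partial_trace_A(np.kron(M, np.eye(2)) @ rho)
         for x, Ms in enumerate(measurements)
         for a, M in enumerate(Ms)}

# p(a|x) = tr(sigma_{a|x}); Bob's marginal state for each x.
p = {key: np.trace(val).real for key, val in sigma.items()}
marginals = [sigma[(0, x)] + sigma[(1, x)] for x in range(2)]
```

No signalling shows up as `marginals[0]` and `marginals[1]` being the same normalised state, here the maximally mixed state, regardless of Alice's choice of $x$.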
We use this to classify assemblages in a meaningful way. Notice that while the set $\Sigma_{A|X}$ is a set of (subnormalised) quantum states, the fact that the system that Alice measures is not specified opens up the possibility for post-quantumness in the joint scenario, while keeping the quantum theoretical description valid for the local state of Bob. Of course, like in the study of Bell non-locality, this scenario can be generalised. The two generalisations that we consider here are: i) adding more parties that are steering Bob; and ii) allowing for Bob to have a setting variable $y \in Y$. For the purpose of investigating post-quantumness, this is not only possible, but necessary, as it has been proven [59,60] that every assemblage in the standard bipartite scenario can be realised in quantum theory. That is, any standard bipartite no-signalling assemblage can be constructed by Alice performing a controlled measurement on one half of a bipartite quantum state shared with Bob. We call the assemblages which are beyond the powers of quantum theory in non-signalling scenarios, that is, which cannot be realised in this way, post-quantum assemblages.
In multipartite scenarios, the assemblage elements now carry the labels for the outcomes $a_i \in A_i$ for multiple parties $i \leq N$, and similarly for the measurement choices $x_i \in X_i$. They form assemblages $\Sigma_{A_1 \ldots A_N | X_1 \ldots X_N} = \{\sigma_{a_1 \ldots a_N | x_1 \ldots x_N}\}$ and are given in diagrammatic notation by

[Diagram: multipartite assemblage element.]

The no-signalling constraints of Def. D.2 can alternatively be expressed in diagrammatic notation in a simple way. For each arbitrary partitioning of $\{1, \ldots, N\} = \{s_1, \ldots, s_r\} \cup \{t_1, \ldots, t_{N-r}\}$, with $S = \{s_1, \ldots, s_r\}$ and now $0 \leq r \leq N$, describe this partitioning via a physical splitting of the wires into a left hand group (the $s_i$) and a right hand group (the $t_j$) depicted by the process $\mathrm{Part}_S$. Then diagrammatically, Eqs. (D9), (D10), and (D11) read:

[Diagram (D12): no-signalling constraints for the partition $S$.]

The other kind of generalization of the bipartite scenario that we consider is to allow Bob to, instead of staying passive, perform a local transformation labelled by $y \in Y$ on his share of the system. Here, we require that Bob's system is locally a quantum one only after his transformation. In this case, the assemblage is denoted by $\Sigma_{A|XY} = \{\sigma_{a|xy}\}$, and pictorially is represented by

[Diagram: Bob-with-input assemblage element.]

Definition D.7 (Local Hidden State (LHS) Bob-with-input Assemblage). An assemblage $\Sigma_{A|XY}$ in the Bob-with-input steering scenario has a local hidden state model if and only if it can be prepared by the parties performing local operations on a shared classical system. That is,

$$\sigma_{a|xy} = \sum_{\lambda} p(\lambda)\, p(a|x, \lambda)\, \rho^B_{\lambda, y},$$

where $\rho^B_{\lambda,y}$ are local quantum states.
Definition D.8 (Quantum Bob-with-input Assemblage). An assemblage $\Sigma_{A|XY}$ in the Bob-with-input steering scenario has a quantum realization if and only if it can be prepared by the parties performing local operations on a shared quantum system. That is,

$$\sigma_{a|xy} = T_y\!\left( \mathrm{tr}_A\!\left[ (M^x_a \otimes \mathbb{1}_B)\, \rho_{AB} \right] \right)$$

for some local POVM $\{M^x_a\}$, joint quantum state $\rho_{AB}$, and quantum operations $\{T_y\}$.
The LHS and quantum assemblages in the instrumental scenario are defined just like in the Bob-with-input scenario, but with the constraint that y = a.
With the steering scenarios and assemblages defined in a general way, we can proceed to describe how these appear within the GPT framework. We follow a similar path, starting from Bell nonlocality scenarios and using the corresponding GPT diagrams for each case.
In a GPT, a Bell scenario where the no-signalling condition is satisfied is produced when Alice (Bob) makes local measurements $M_A$ ($M_B$) with input $x$ ($y$) on a shared state $s$. The set of no-signalling boxes that can be realised in such a way is equivalent to the set of 'causal' classical channels, $N$, that can be realised by:

[Diagram: GPT realisation of a no-signalling box.]

Again, we now let Bob be passive and perform no measurement, under the assumption that his system is a quantum one, and the resulting diagram is an assemblage element in the bipartite steering scenario:

[Diagram: GPT realisation of a bipartite assemblage element.]

Definition D.9 (GPT realisable assemblages).
i) A bipartite assemblage $\Sigma_{A|X}$ is GPT realisable for a given GPT iff the channel associated to it can be written as:

[Diagram (D25).]

Note that the transformation $T$ can be viewed as the process by which Bob characterises his system, which could be a post-quantum system, as a quantum system. This could be incorporated into the state $s$, and we could view Bob as being given a quantum system to start with. The present picture, however, is useful for later generalisations.
ii) A multipartite assemblage $\Sigma_{A_1 \ldots A_N | X_1 \ldots X_N}$ is GPT realisable if and only if its associated causal channel can be written as:

[Diagram.]

iii) A Bob-with-input assemblage $\Sigma_{A|XY}$ is GPT realisable iff its associated channel can be written as:

[Diagram.]

iv) An Instrumental assemblage $\Sigma^I_{A|X}$ is GPT realisable if and only if it is a wiring of a GPT realisable assemblage in the Bob-with-input scenario:

[Diagram (D28).]

With the definition of GPT realisable assemblages in place, we can revisit the LHS and quantum assemblages, and see how these amount to restrictions on the shared state $s$ and the GPT to which it belongs. That is, if we consider the assemblages that are realisable in the GPT of quantum theory, then, within this GPT, the GPT realisable assemblages are exactly the Quantum assemblages. If we moreover restrict to the state $s$ being a separable state, then we recover the LHS assemblages.

In this section, we prove that Eq. (23) is true.
Namely, that for all $\psi$ we have:

[Diagram: Eq. (23).]

To see this we note some basic results regarding the various components of the diagram on the left hand side. First note that, as we are considering a qubit system $Q_2$, then for each pure state $\psi$ there is a unique orthogonal state, which we denote as $\psi^\perp$. This uniqueness, in particular, means that:

[...]

Now, turning to basic properties of the diagrammatic elements we have:
i. the singlet state [...]

Finally, note that we can decompose the classical system as a sum of projectors. Consider the basis for classical system $C_v$ labelled by $i \in \{1, \ldots, v\}$, then

[...]

From these, together with linearity of GPTs, we can conclude that the protocol does indeed work as we want:

[...]

which is the state $|\psi\rangle\langle\psi|$ chosen by Alice but prepared at Bob's lab, as required.