The computational landscape of general physical theories

The emergence of quantum computers has challenged long-held beliefs about what is efficiently computable given our current physical theories. However, going back to the work of Abrams and Lloyd, changing one aspect of quantum theory can result in yet more dramatic increases in computational power, as well as violations of fundamental physical principles. Here we focus on efficient computation within a framework of general physical theories that make good operational sense. In prior work, Lee and Barrett showed that in any theory satisfying the principle of tomographic locality (roughly, local measurements suffice for tomography of multipartite states) the complexity bound on efficient computation is AWPP. This bound holds independently of whether the principle of causality (roughly, no signalling from the future) is satisfied. In this work we show that this bound is tight: there exists a theory satisfying both the principles of tomographic locality and causality which can efficiently decide everything in AWPP, and in particular can simulate any efficient quantum computation. Thus the class AWPP has a natural physical interpretation: it is precisely the class of problems that can be solved efficiently in tomographically-local theories. This theory is built upon a model of computing involving Turing machines with quasi-probabilities, to wit, machines with transition weights that can be negative but sum to unity over all branches. In analogy with the study of non-local quantum correlations, this leads us to question what physical principles recover the power of quantum computing. Along this line, we give some computational complexity evidence that quantum computation does not achieve the bound of AWPP.

There is ever-growing evidence that quantum computers are more powerful than classical computers [1][2][3][4]. However, an understanding of the source of this power remains elusive. Many features of quantum mechanics have been posited as the origin of this so-called "speedup" [45][46][47][48][49] but the debate is far from resolved [50][51][52]. In recent years, one way of examining this power has been to ask how the computational power changes as features of quantum theory are altered. Beginning with the work of Abrams and Lloyd, it was shown that allowing more exotic transformations in quantum theory can result in easily solving hard problems [5]. This has even motivated the belief that quantum theory is an "island" within the space of all possible theories; alter quantum mechanics and we obtain dramatic consequences [6].
Another possibility is that our understanding of computation in possible physical theories is couched too much in the language of quantum theory. For example, it could be entirely possible to have a theory that has the same computational power as quantum theory but barely resembles it. We thus require an abstract framework in which to study the power of computation, where quantum and classical computation are special cases.
The study of generalised probabilistic theories provides us with a suitable framework for the study of information processing based on operational principles [10,12,13,62,63]. That is, we can make statements about the limits and power of information processing without referring explicitly to quantum theory. For example, a no-broadcasting theorem can be proven from fundamental properties that reasonable operational probabilistic theories should satisfy [61]. Features thought unique to quantum theory (as opposed to classical physics) can be seen to be ubiquitous within these theories. This then raises the question of what fundamental principles uniquely single out quantum physics from these myriad possibilities. Indeed, starting from various frameworks of generalised probabilistic theories there have been many derivations of quantum theory from information processing principles (e.g. Refs. [11,13,64]).
* Electronic address: ciaran.lee@ucl.ac.uk
Recently a circuit-based model of computation has been defined and studied in a broad operationally-defined framework for physical theories [21,23,24]. Informally, a theory in this framework specifies a set of laboratory devices that can be connected together to form experiments, and assigns probabilities to experimental outcomes. Whilst many such theories may not correspond to descriptions of our physical world, they nevertheless make good operational sense, and allow one to systematically assess how computational power depends on the underlying physical theory.
One can identify physical principles that theories may or may not satisfy, such as causality (no signalling from future to past), or tomographic locality (local measurements suffice for tomography of joint states). An important result of [21] was to show that for theories satisfying tomographic locality, whether or not causality is satisfied, computational problems that can be solved efficiently are contained in the classical complexity class AWPP, a bound first proved for the quantum case by Fortnow and Rogers [16].
An important open problem is to determine whether the bound of AWPP is tight for all possible theories. If there existed a theory that could decide all problems in AWPP, then it would have computational power beyond that which we expect from quantum mechanics and could simulate any quantum computation. In this paper we resolve this open problem and show that there does indeed exist a non-quantum theory which can decide everything in AWPP. We may then consider this theory as a "foil" theory, which can be used to deepen our understanding of the limitations of quantum computers. Remarkably, this foil theory is constructed from a computational model using quasi-probabilities, i.e. an affine combination of weights assigned to particular events. Furthermore, this theory satisfies both tomographic locality and causality. This then naturally motivates the study of what minimal set of information principles is needed to recover the power of quantum computation.
arXiv:1702.08483v1 [quant-ph] 27 Feb 2017
To outline the paper: in Sec. I we present the framework for computation in general physical theories, in particular defining efficient computation in such a theory, and give computational complexity bounds on this notion of computation. In Sec. II we describe the theory that attains the bound of AWPP, generating the theory from a circuit construction related to a family of computational models called Affine Turing Machines. In Sec. II, we also give some computational complexity evidence that quantum computers will not achieve AWPP. Sec. III and IV contain the details of our constructions and results. Finally, in Sec. V, we conclude with some remarks about the problem of determining the computational complexity of quantum computers from basic informational principles, in analogy with the characterisation of quantum non-locality [32,33,36-39].

I. THE FRAMEWORK
The fundamental goal of any physical theory is to provide a consistent account of experimental data. This constitutes the core idea underlying the framework of generalised probabilistic theories [10,11,13,21], where the primitive notions are operational in nature.
Informally, a theory in this framework specifies a set of laboratory devices that can be connected together in certain ways, and assigns probabilities to different experimental outcomes. A laboratory device comes equipped with "input ports", "output ports", and a "classical pointer". We consider the input and output ports to correspond to physical systems. Each physical system has a particular type, denoted A, B, C, . . . . Devices may be connected by taking the output port of some device and feeding it to the input port of another device, provided that in doing so we do not introduce cycles of input/output dependencies. Moreover, the types of the output and input systems must match. An "experiment" consists of such a closed circuit of devices, in which every output port has been fed to the input port of some other device in this way. When a device is used in an experiment, the classical pointer comes to rest in one of a number of positions, indicating that a particular outcome has occurred. The theory defines a joint probability on the pointer outcomes for all of the devices in the circuit, corresponding to the outcome probabilities of the experiment represented by the closed circuit.
We elaborate on the above summary as follows. Laboratory devices can be classified into preparations, which have no input ports, transformations, which have both input and output ports, and measurements, which have no output ports. Informally, each use of a preparation device outputs a "physical system" in some particular state, determined by the variety of device used. After a given system is prepared in some state, it can pass through a transformation device which can alter the system and its state, in a possibly non-deterministic manner which is indicated by the classical pointer of the transformation. Finally, the system can enter a measurement device, on which the final resting place of the classical pointer denotes the measurement outcome.
By standard operational arguments [10-13], each system gives rise to a finite dimensional real vector space $V_A$ which contains the set of states as a subset. Moreover, transformations and single measurement outcomes, called effects, act linearly on this vector space [12]. Hence each state has a representation as a vector living in this real vector space, each transformation as a matrix acting on this vector space, and each effect as a vector in the dual space [10,11].
We denote by $|s)_A$ the vector corresponding to a state of system $A$. Passing a system through a transformation device results in an output state $|s')_B$, related to the input state via $|s')_B = T_A^B |s)_A$, where $T_A^B$ is the matrix corresponding to the transformation. When a system of type $B$ enters a measurement device, the dual vector representing the effect corresponding to a specific measurement outcome will be denoted ${}_B(e|$. The probability of preparing a system in state $|s)_A$, having transformation $T_A^B$ applied to it, and getting measurement outcome ${}_B(e|$ on entering a measurement device corresponds to
$${}_B(e|\, T_A^B \,|s)_A.$$
One can define a notion of causality for theories in the framework: the probabilities of present experiments are independent of future measurement choices [10]. This requirement is equivalent to the existence of a unique deterministic effect (meaning that the device it corresponds to only has one possible outcome) for each system, denoted ${}_B(u|$ [10], such that $\sum_e {}_B(e| = {}_B(u|$, where the sum is over all outcomes of a particular measurement, and ${}_B(u|s)_B \le 1$ for all states. Note, however, that consistent theories can be constructed which have more than one deterministic effect and which violate causality [14].
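As a minimal illustration of the pairing of effects, transformations, and states described above, the following sketch treats a classical bit as a system in the framework. The specific state, the noisy flip channel, and all function and variable names are hypothetical choices made for this example; they do not come from the text.

```python
from fractions import Fraction as F

def apply(T, s):
    """Matrix-vector product: the state obtained when transformation T acts on s."""
    return [sum(t_ij * s_j for t_ij, s_j in zip(row, s)) for row in T]

def pair(e, s):
    """Pairing (e|s) of an effect (dual vector) with a state vector."""
    return sum(e_i * s_i for e_i, s_i in zip(e, s))

# Hypothetical example system: a classical bit.
state = [F(3, 4), F(1, 4)]               # |s): the bit is 0 with probability 3/4
noisy_flip = [[F(1, 10), F(9, 10)],      # T: flips the bit with probability 9/10
              [F(9, 10), F(1, 10)]]
effect_zero = [F(1), F(0)]               # effect for reading outcome 0
effect_one = [F(0), F(1)]                # effect for reading outcome 1
unit_effect = [F(1), F(1)]               # deterministic effect (u|

p_zero = pair(effect_zero, apply(noisy_flip, state))   # (e_0| T |s)
p_one = pair(effect_one, apply(noisy_flip, state))     # (e_1| T |s)
print(p_zero, p_one)
```

In this toy case the two effects sum to the unit effect and the unit effect evaluates to 1 on the state, which is the causality condition specialised to a normalised classical state.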
In this work we assume that the vector space arising from the composite of multiple systems corresponds to the tensor product of vector spaces of the component systems. That is, $V_{AB} = V_A \otimes V_B$. This requirement is implied by the principle of tomographic locality [10,12], which says that multipartite states can be uniquely specified by the results of local measurements on each component system. In particular, this implies that the matrix representation of a transformation corresponding to simultaneously acting with transformation $T_A$ on subsystem $A$ of composite system $AB$ and transformation $T_B$ on subsystem $B$ is given by $T_A \otimes T_B$, where $\otimes$ is the vector space tensor product.
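The tensor product rule can be checked on a small example. The sketch below, using two hypothetical classical-bit systems with made-up states, transformations, and effects, builds the Kronecker products by hand and confirms that a parallel transformation on a product state gives the product of the local probabilities.

```python
from fractions import Fraction as F

def kron_vec(u, v):
    """Tensor product of two vectors, ordered as (i, k) -> i * len(v) + k."""
    return [a * b for a in u for b in v]

def kron_mat(A, B):
    """Tensor (Kronecker) product of two matrices, with the same index ordering."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(T, s):
    return [sum(t * x for t, x in zip(row, s)) for row in T]

def pair(e, s):
    return sum(a * b for a, b in zip(e, s))

# Hypothetical local data for systems A and B (illustrative values only).
s_A, s_B = [F(3, 4), F(1, 4)], [F(1, 3), F(2, 3)]
T_A = [[F(1, 10), F(9, 10)], [F(9, 10), F(1, 10)]]
T_B = [[F(1), F(1, 2)], [F(0), F(1, 2)]]
e_A, e_B = [F(1), F(0)], [F(0), F(1)]

# Probability computed on the composite system AB ...
joint = pair(kron_vec(e_A, e_B), apply(kron_mat(T_A, T_B), kron_vec(s_A, s_B)))
# ... and as a product of probabilities computed locally.
local = pair(e_A, apply(T_A, s_A)) * pair(e_B, apply(T_B, s_B))
print(joint, local)
```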
An explicit example serves to illustrate the framework. In standard finite dimensional quantum theory for instance, systems correspond to complex, finite dimensional Hilbert spaces, their type corresponding to the dimension of this space. States correspond to positive semi-definite operators acting on the underlying Hilbert space, effects to POVM elements with the deterministic effect being given by the identity matrix, and transformations to quantum instruments, i.e. a collection of completely positive maps summing to a completely positive and trace-preserving map. Quantum states are elements of the real vector space of Hermitian matrices with the vector spaces for distinct system types composing via the vector space tensor product.
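For a single qubit this real-vector representation can be made concrete through the Pauli expansion. The sketch below is an illustration under the assumption that states are written as $\rho = \tfrac{1}{2}(I + r \cdot \sigma)$ and effects as $E = e_0 I + e \cdot \sigma$, so that the pairing is $e_0 + e \cdot r$; the particular Bloch vector and effect are arbitrary choices, and the helper names are hypothetical.

```python
# Pauli basis for 2x2 Hermitian matrices.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

def from_coeffs(c):
    """Hermitian matrix c[0]*I + c[1]*X + c[2]*Y + c[3]*Z."""
    return [[sum(ck * P[i][j] for ck, P in zip(c, PAULIS)) for j in range(2)]
            for i in range(2)]

def trace_of_product(A, B):
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))

r = (0.6, 0.0, 0.8)                        # Bloch vector of an example pure state
rho = from_coeffs([0.5] + [0.5 * x for x in r])
state_vec = (1.0,) + r                     # |s) as a real 4-vector
effect_vec = (0.5, 0.0, 0.0, 0.5)          # E = (I + Z)/2 as a dual vector
E = from_coeffs(effect_vec)

p_matrix = complex(trace_of_product(E, rho)).real               # Born rule tr(E rho)
p_vector = sum(e * s for e, s in zip(effect_vec, state_vec))    # pairing (e|s)
print(p_matrix, p_vector)
```

The two numbers agree up to floating-point rounding, and the unit effect corresponds to the dual vector (1, 0, 0, 0) in this parametrisation.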

A. Free and non-free theories
In the standard definition of a generalised theory [10-13,21], a theory specifies a set of laboratory devices from which one can construct closed circuits by connecting preparation devices to transformation and measurement devices, and assigns a probability distribution over the possible outcomes of each closed circuit. Moreover, the set of devices, and device outcomes, is closed under such sequential and parallel composition. For the purposes of this paper, we refer to such theories as "free" generalised probabilistic theories.
One can consider a modified definition of a generalised probabilistic theory in which a theory specifies a set of devices, a set of allowed closed circuits which can be built from those devices, and assigns a probability distribution over the outcomes of allowed closed circuits. Note that this modified definition is slightly more general than the one standardly discussed in the literature, as it only assigns a probability distribution to the set of allowed closed circuits specified by the theory. Our definition is not unmotivated if one takes the viewpoint that a physical theory corresponds both to a consistent account of experimental data and to which experiments are implementable in principle.
Such "non-free" theories will be the focus of the current work. Note that even in non-free theories, states are represented by vectors in a real vector space, transformations by matrices acting on this space, and effects by vectors in the dual space [22]. However, composing states, transformations, and effects in sequence and parallel to form closed circuits may only result in a valid probability when the closed circuit is allowed by the theory.

B. Computation
The class of "yes/no" problems that a quantum computer can solve efficiently is denoted by BQP and much research has been concerned with how large this class is. At present, the best known upper bound is BQP ⊆ AWPP, where AWPP is a (slightly obscure) classical complexity class, known to be contained in PP, hence in PSPACE. This class will be formally defined in Sec. III A. That is to say: a quantum computer cannot efficiently solve any problem outside AWPP, but it is unknown whether a quantum computer can solve every problem in AWPP.
In order to define efficient computation in theories belonging to the framework introduced above, we need the notion of a (polynomially sized) uniform circuit family, and a condition for a circuit to accept an input. A polynomially sized uniform circuit family is a set of closed circuits $\{C_x\}$ such that:
1. There is a gate set $\mathcal{G}$, consisting of laboratory devices, such that each circuit in the family is built from elements of $\mathcal{G}$.
2. The number of gates in the circuit $C_x$ is bounded by a polynomial in $|x|$.
3. For each type of system, there is a fixed choice of basis, relative to which transformations are associated with matrices. Given the matrix $M$ representing (a particular outcome of) a gate in $\mathcal{G}$, a Turing machine can output a matrix $\tilde{M}$ with rational entries, such that $|(M - \tilde{M})_{ij}| \le \epsilon$, in time polynomial in $\log(1/\epsilon)$.
4. There is a Turing machine that, acting on input $x = x_1 x_2 \ldots x_n$, outputs a classical description of $C_x$ in time bounded by a polynomial in $|x|$.
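Condition 3 can be illustrated for a single irrational matrix entry. The sketch below approximates $1/\sqrt{2}$ by a rational to within a requested $\epsilon$ using only integer arithmetic, so the work grows polynomially with $\log(1/\epsilon)$. The choice of entry and the function name `approx_inv_sqrt2` are assumptions made for illustration; the text does not prescribe a particular approximation procedure.

```python
from fractions import Fraction
from math import isqrt

def approx_inv_sqrt2(eps):
    """Return a rational q with 0 <= 1/sqrt(2) - q < eps, using integers only."""
    k = 0
    while Fraction(1, 2 ** (k + 1)) > eps:   # O(log(1/eps)) iterations
        k += 1
    # isqrt(2 * 4**k) equals floor(sqrt(2) * 2**k); dividing by 2**(k+1)
    # therefore underestimates sqrt(2)/2 by less than 2**-(k+1) <= eps.
    return Fraction(isqrt(2 * 4 ** k), 2 ** (k + 1))

eps = Fraction(1, 10 ** 6)
q = approx_inv_sqrt2(eps)
# Rational approximation of a hypothetical gate matrix with entries +-1/sqrt(2).
M_tilde = [[q, q], [q, -q]]
print(q, float(q))
```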
This produces a description of an experiment, whose devices produce classical outcomes (corresponding to the particular state prepared, transformation applied, or measurement outcome observed). The theory defines a joint probability for these outcomes, which we may use to perform computation as follows. Denoting the string of observed outcomes by $z$, we define the final output of the computation to be given by an acceptor function $a(z) \in \{0, 1\}$, where there must exist a Turing machine that computes $a$ in time polynomial in the length of the input $|x|$. We say that a run of the experiment accepts an input string $x$ if the outcome string $z$ of the circuit $C_x$ satisfies $a(z) = 0$. The probability that a computation accepts the input string $x$ is therefore given by
$$P_x(\mathrm{accept}) = \sum_{z \,:\, a(z) = 0} P(z),$$
where $P(z)$ is the joint probability the theory assigns to the outcome string $z$, and the sum ranges over all possible outcome strings $z$ of the circuit $C_x$ for which $a(z) = 0$.
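The acceptance probability is a plain sum over outcome strings. A minimal sketch follows, with a made-up joint distribution over two-bit outcome strings and a parity acceptor; both are hypothetical, as is the function name.

```python
from fractions import Fraction as F

def acceptance_probability(joint, acceptor):
    """Sum the joint probability P(z) over outcome strings z with a(z) = 0."""
    return sum(p for z, p in joint.items() if acceptor(z) == 0)

# Hypothetical joint distribution over outcome strings of a closed circuit.
joint = {"00": F(1, 2), "01": F(1, 8), "10": F(1, 8), "11": F(1, 4)}

def parity_acceptor(z):
    """Toy acceptor: returns 0 (accept) exactly when z has an even number of 1s."""
    return z.count("1") % 2

p_accept = acceptance_probability(joint, parity_acceptor)
print(p_accept)
```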
Definition 1. For a theory $\mathbf{G}$, a language $L$ is in the class $\mathbf{BGP}$ if there exists a poly-sized uniform family of circuits in $\mathbf{G}$, and an efficient acceptor, such that
1. $x \in L$ is accepted with probability at least $2/3$.
2. $x \notin L$ is accepted with probability at most $1/3$.
The constants in the above definition can be chosen arbitrarily as long as they are bounded away from $1/2$ by some inverse polynomial [21]. The following theorem was proved for free theories in [21], and follows without modification for non-free theories as well:
Theorem 1. For any (free or non-free) theory $\mathbf{G}$ satisfying tomographic locality, the following holds:
$$\mathbf{BGP} \subseteq \mathbf{AWPP} \subseteq \mathbf{PP} \subseteq \mathbf{PSPACE}.$$

II. ACHIEVING THE UPPER BOUND
The main result of this work is the construction of a non-free theory, satisfying causality and tomographic locality, that has exactly the power of this upper bound.
Theorem 2. There exists a non-free theory $\mathbf{G}$, satisfying causality and tomographic locality, such that
$$\mathbf{BGP} = \mathbf{AWPP}.$$
Hence AWPP, despite having a slightly involved definition in terms of gap functions for non-deterministic Turing machines (see Sec. III A), can be thought of much more intuitively as the class of problems efficiently solvable by tomographically local physical theories.
The class AWPP contains problems for which an efficient quantum solution is unknown. Notable among these is the Graph Isomorphism problem, which asks for an efficient procedure to determine if two given graphs are equivalent. It is unknown whether a quantum computer can solve the Graph Isomorphism problem, and so our result provides a theory which can act as a "foil" to deepen our understanding of the limitations of quantum computers. Moreover, the promise version of AWPP contains the Unique Satisfiability Problem (or UNIQUE-SAT), which asks whether a given Boolean formula has a single satisfying assignment or no satisfying assignment at all, given the promise that one of these two cases holds. This is a very important problem whose complexity is closely related to the class of NP-complete problems [25].
We now provide an intuitive sketch of this construction, but defer the formal definitions and proofs to Sec. III. We start by introducing a quasi-probabilistic model of computation, taking the form of a Turing Machine with quasi-probabilistic transition weights, with the constraint that the total weight of transitions from a given state must sum to +1. We refer to this model as an Affine Turing Machine and provide a schematic illustration in Fig. 1. We show that the class of problems which can be efficiently solved by this model, with bounded error, perfectly captures the class AWPP. We then construct uniform poly-size circuits, in which the gates are certain affine transformations, that can simulate (and be simulated by) this affine Turing Machine, and hence AWPP. This construction results in a collection of closed circuits which correspond to the probability that the final result of the affine Turing Machine is "yes" or "no" on inputs of different lengths. We prove that these closed circuits correspond to closed circuits in a causal and tomographically local non-free theory, thereby proving Theorem 2.
One might wonder if efficient quantum computation can achieve the bound of Theorem 1. We now present a complexity-theoretic argument which may be considered weak evidence against such a possibility. Here, the classes PromiseBQP and PromiseAWPP are promise versions of the classes BQP and AWPP, meaning that they contain promise rather than decision problems. An example of a decision problem is deciding whether some bit string $x$ is in a particular language; either $x$ is in the language, or it is not. A promise problem is a generalisation of a decision problem where the input is promised to belong to a subset of all possible inputs, so that there are disjoint subsets $\Pi_{\mathrm{ACCEPT}}, \Pi_{\mathrm{REJECT}} \subseteq \Sigma^*$ of inputs to be accepted or rejected (respectively), but which do not exhaust the set of all inputs. If an input belonging to neither $\Pi_{\mathrm{ACCEPT}}$ nor $\Pi_{\mathrm{REJECT}}$ is given to an algorithm for a certain promise problem, no requirements are placed on the output, i.e. the algorithm is allowed to output anything.
While PromiseBQP = PromiseAWPP may hold independently of whether BQP = AWPP holds, these statements are at least conceptually related. Intuitively, if one of these statements appears unlikely, the other one should also be considered unlikely, although to a lesser degree. Indeed, problems which are complete for BQP (i.e. the hardest problems in BQP) are in fact promise problems. Hence, PromiseBQP and PromiseAWPP can be loosely thought of as characterising the power of BQP and AWPP respectively. Since UNIQUE-SAT lies in PromiseAWPP, and every problem in NP reduces to UNIQUE-SAT under randomised reductions, the equality PromiseBQP = PromiseAWPP would place NP inside BQP. It is believed unlikely [17,18,40] that NP is contained in either BQP or AWPP. This can, in some sense, be taken as evidence against the assertion that the computational power of quantum theory (in the promise problem setting) exactly equals PromiseAWPP.

III. PROOF OF THEOREM 2
The proof of Theorem 2 is contained in the following sections. As informally discussed above, we introduce a quasi-probabilistic model of computation, taking the form of a Turing Machine with quasi-probabilistic transition weights, and consider the class of problems which can be efficiently solved by this model. We refer to this model as an Affine Turing Machine. We show that this model perfectly captures the class AWPP. We then show that one can construct uniform poly-size circuits, where the gates are certain affine transformations, that can simulate (and be simulated by) this efficient quasi-probabilistic Turing Machine. Thus, this affine circuit construction also captures the power of AWPP. We finish by showing that these circuits give rise to a non-free theory that is causal and tomographically local.

A. Definition of AWPP
Let $\Sigma$ be a finite set of symbols, e.g. $\Sigma = \{0, 1\}$, and let $\Sigma^*$ be the set of all finite sequences over $\Sigma$ (commonly referred to as strings). For a string $x \in \Sigma^*$, we let $|x|$ denote its length. A gap function over $\Sigma$ is a function $g : \Sigma^* \to \mathbb{Z}$ which computes the difference between the number of accepting branches and rejecting branches of some non-deterministic Turing machine $\mathbf{N}$, where, for some polynomial $T$, $\mathbf{N}$ takes no more than $T(|x|)$ computational steps on whatever input $x$ it is given.
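A gap function can be evaluated by brute force on small instances. The sketch below enumerates every branch of a toy non-deterministic machine, modelled simply as a predicate on the sequence of non-deterministic choices, and returns the number of accepting branches minus the number of rejecting ones. The predicates and names are hypothetical, and the enumeration takes exponential time, so it is only meant to make the definition concrete.

```python
from itertools import product

def gap(accepts, num_choices, steps):
    """Accepting minus rejecting branches over all choice sequences."""
    g = 0
    for branch in product(range(num_choices), repeat=steps):
        g += 1 if accepts(branch) else -1
    return g

# Toy machine 1: guesses three bits, accepts unless all of them are 0.
g_or = gap(lambda b: any(b), num_choices=2, steps=3)
# Toy machine 2: accepts exactly when the guessed bits have even parity.
g_parity = gap(lambda b: sum(b) % 2 == 0, num_choices=2, steps=3)
print(g_or, g_parity)
```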
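Fenner [17, Theorem 1.3] characterised AWPP as the class of languages $L \subseteq \Sigma^*$ for which there is a gap-function $g : \Sigma^* \to \mathbb{Z}$ and a polynomial $p$, such that
$$x \in L \implies \tfrac{2}{3} \le \frac{g(x)}{2^{p(|x|)}} \le 1, \qquad x \notin L \implies 0 \le \frac{g(x)}{2^{p(|x|)}} \le \tfrac{1}{3}. \tag{1}$$
Combining this with [17, Theorem 3.1], more generally we have $L \in$ AWPP if and only if
$$x \in L \implies \tfrac{2}{3} \le \frac{g(x)}{h(|x|)} \le 1, \qquad x \notin L \implies 0 \le \frac{g(x)}{h(|x|)} \le \tfrac{1}{3} \tag{2}$$
for some gap-function $g$ and some poly-time computable function $h : \mathbb{N} \to \mathbb{N}$ (not necessarily a power of two). While the original definition of AWPP [19] further required there to exist a gap-function $g$ and a poly-time computable function $h$ for any polynomial $r$, such that $g(x)/h(|x|)$ lies in $[0, 2^{-r(|x|)}]$ or $[1 - 2^{-r(|x|)}, 1]$ according to whether $x \notin L$ or $x \in L$, we instead use the characterisations of both Eqns. (1) and (2) in our results.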

B. Affine Turing Machines
We define an Affine Turing Machine (AffTM) to be a non-deterministic Turing Machine, in which every transition has an associated real-valued (possibly negative) weight. The weight of a given computational branch is then the product of the weights of the transitions involved. We require that for each symbol being read, the total weight of transitions from a given (non-halting) state is +1. In this article we consider only rational transition weights, but expect that similar results would obtain for algebraic real coefficients.
We interpret AffTMs as a model of quasi-probabilistic computation, as follows. Given an AffTM $\mathbf{M}$ whose branches all halt in a finite number of steps, the acceptance weight $\alpha_{\mathbf{M}}(x)$ of $\mathbf{M}$ on an input $x$ is the total weight of the accepting paths on input $x$. We say that an AffTM $\mathbf{M}$ is proper if $0 \le \alpha_{\mathbf{M}}(x) \le 1$ for all inputs, and that it decides a language $L$ with bounded error if furthermore
$$\alpha_{\mathbf{M}}(x) \ge \tfrac{2}{3} \ \text{ if } x \in L, \qquad \alpha_{\mathbf{M}}(x) \le \tfrac{1}{3} \ \text{ if } x \notin L.$$
An AffTM is efficient if the number of computational steps it takes in any computational path on any input $x$ is bounded by some polynomial in $|x|$. The first step towards Theorem 2 is to establish the following:
Lemma 1. The class of languages decided with bounded error by some efficient AffTM is equal to AWPP.
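To make the acceptance weight concrete, the sketch below represents the computation of an AffTM on one fixed input as a weighted tree and sums the weights of accepting paths. The tree encoding, the example weights, and the function name are all hypothetical; in particular, the example uses a negative transition weight to show that the result can still lie in [0, 1].

```python
from fractions import Fraction as F

def acceptance_weight(tree):
    """Total weight of accepting paths in a weighted computation tree.

    A tree is "accept", "reject", or a list of (weight, subtree) pairs whose
    weights must sum to +1 (the affine condition at a non-halting state).
    """
    if tree == "accept":
        return F(1)
    if tree == "reject":
        return F(0)
    assert sum(w for w, _ in tree) == 1, "transition weights must sum to +1"
    return sum(w * acceptance_weight(sub) for w, sub in tree)

# Hypothetical two-level tree with one negative transition weight.
left = [(F(1, 2), "accept"), (F(1, 2), "reject")]
right = [(F(1, 4), "accept"), (F(3, 4), "reject")]
tree = [(F(2), left), (F(-1), right)]

alpha = acceptance_weight(tree)   # 2 * 1/2 - 1 * 1/4
print(alpha)
```

Here the weight is 3/4, so on this input the toy machine behaves like a proper AffTM that accepts with bounded error.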
The proof of this result is contained in the following two sections.

Solving AWPP problems with an affine Turing machine
For $L \in$ AWPP, let $g : \Sigma^* \to \mathbb{Z}$ be a gap-function satisfying Eqn. (1) for some polynomial $p$. Also let $\mathbf{N}$ be the non-deterministic Turing machine whose accepting/rejecting branches determine the gap-function $g$, and $T$ be the polynomial bounding the number of computational steps of $\mathbf{N}$ on its input. By standard results [19], we may require that $\mathbf{N}$ have the same number of non-deterministic transitions at each step, which we denote by $N \ge 1$, and that all computational branches of $\mathbf{N}$ have the same length on input $x$. We suppose that each transition of $\mathbf{N}$ is associated with some label in $\{1, 2, \ldots, N\}$: the computational branches of $\mathbf{N}$ are then in one-to-one correspondence with sequences in $\{1, 2, \ldots, N\}^{T(|x|)}$. We may then consider an AffTM $\mathbf{M}$ which simulates $\mathbf{N}$, in the following sense:
1. $\mathbf{M}$ first makes $T(|x|)$ non-deterministic transitions, writing a sequence of symbols $\beta_1, \beta_2, \ldots, \beta_{T(|x|)} \in \{0, 1, 2, \ldots, N\}$ on the tape to produce a string $\beta \in \{0, 1, 2, \ldots, N\}^{T(|x|)}$. The weights of these transitions are $+1$ for each choice $\beta_t \neq 0$, and $(1 - N)$ for each choice $\beta_t = 0$, so that the transition weights sum to $+1$.
2. In branches with one or more symbols $\beta_t = 0$, $\mathbf{M}$ transitions deterministically with weight $+1$ to a state reject. All other branches of $\mathbf{M}$ have weight $+1$ and record a string $\beta \in \{1, 2, \ldots, N\}^{T(|x|)}$ indexing some computational branch of $\mathbf{N}$. In these branches, $\mathbf{M}$ simulates the computational branch of $\mathbf{N}$ whose transitions are indexed by $\beta$.
3. For any branch in which the simulation of $\mathbf{N}$ rejects, $\mathbf{M}$ makes a non-deterministic transition to a state dampen with weight $-1$, and to the reject state with weight $+2$. For the branches in which the simulation of $\mathbf{N}$ accepts, $\mathbf{M}$ transitions deterministically to dampen with weight $+1$.
By the construction of the branch weights, $\mathbf{M}$ is an AffTM; and as the number of transitions that $\mathbf{M}$ makes is $O(T + p)$, it is efficient. By construction, the total weight of the branches which transition to the dampen state is $g(x)$. From the dampen state, $\mathbf{M}$ samples a string $\delta \in \{0, 1\}^{p(|x|)}$ one bit at a time, each bit value having weight $1/2$, and rejects unless $\delta = 11 \cdots 1$; this ensures that the acceptance weight is $\alpha_{\mathbf{M}}(x) = g(x)/2^{p(|x|)}$. By hypothesis, this is bounded between 0 and 1, is at least $2/3$ if $x \in L$, and is at most $1/3$ otherwise. Thus $\mathbf{M}$ decides $L$ with bounded error.
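The bookkeeping in the construction above can be checked numerically on a tiny instance. The sketch below is a brute-force accounting of the branch weights in steps 1 to 3 and in the final dampening stage (modelled here as uniform sampling with weight 1/2 per bit, which is an assumption about how the dampen state operates). The toy acceptance predicate, the parameter values, and the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def afftm_weights(accepts, N, T, p):
    """Return (acceptance weight, total weight) of the simulating AffTM."""
    accept_w, total_w = F(0), F(0)
    for beta in product(range(N + 1), repeat=T):
        w = F(1)
        for b in beta:                      # step 1: +1 if b != 0, (1 - N) if b == 0
            w *= (1 - N) if b == 0 else 1
        if 0 in beta:                       # step 2: invalid label, reject
            total_w += w
            continue
        if accepts(beta):                   # step 3: accepting branch of the NTM
            dampen_w, reject_w = w, F(0)
        else:                               # step 3: rejecting branch of the NTM
            dampen_w, reject_w = -w, 2 * w
        accept_w += dampen_w / 2 ** p       # only delta = 11...1 accepts
        total_w += dampen_w + reject_w      # all delta strings together carry dampen_w
    return accept_w, total_w

def gap(accepts, N, T):
    """Gap of the toy NTM: accepting minus rejecting branches."""
    return sum(1 if accepts(b) else -1 for b in product(range(1, N + 1), repeat=T))

toy_accepts = lambda beta: beta != (1, 1, 1)   # 7 accepting branches, 1 rejecting
alpha, total = afftm_weights(toy_accepts, N=2, T=3, p=3)
g = gap(toy_accepts, N=2, T=3)
print(alpha, total, g)
```

For this toy instance the acceptance weight equals the gap divided by $2^p$, and the weights of all branches sum to +1, as the affine condition requires.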

Simulating an Affine Turing Machine in AWPP
Suppose that $\mathbf{M}$ is a proper and efficient AffTM which has transitions with rational weights. Let $M$ be the common denominator of the transition weights of $\mathbf{M}$, $T \in O(\mathrm{poly}(n))$ the running time of $\mathbf{M}$ on an input of length $n$, and $m > 0$ be an integer such that $2^m \ge M^T$ and $2^m \ge (|u| M)^T$ for all transition weights $u$ of $\mathbf{M}$; it follows that $m \in O(T)$. We may obtain an AWPP algorithm to approximately simulate $\mathbf{M}$, as follows. We define a non-deterministic machine $\mathbf{N}$, which simulates $\mathbf{M}$ in the following sense.
1. The machine $\mathbf{N}$ reserves some space on the tape to represent some weight $\Omega \in \mathbb{Q}$ for each branch. We call this the recorded weight of the branch.
4. $\mathbf{N}$ determines whether to accept or reject, treating $c_{m-1} c_{m-2} \cdots c_1 c_0$ as the binary expansion of an integer $0 \le C < 2^m$, as follows.
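Consider the gap function $g(x)$ of the machine $\mathbf{N}$. From Step 4, it is clear that if $C \ge |\Omega|$ in any particular branch, $\mathbf{N}$ accepts and rejects with equal measure, contributing nothing to $g(x)$. The significance of the contribution of any simulated branch of $\mathbf{M}$ is then in proportion to its recorded weight in $\mathbf{N}$, which in absolute value is $2M^T$ times its weight in $\mathbf{M}$ (arising from the systematic omission of the division of the recorded weight by $M$ at each of the $T$ transitions, and from the two values of $b$). Let $\alpha_+(x)$ be the total weight of those accepting branches of $\mathbf{M}$ with positive weight, and $\alpha_-(x)$ be the total (absolute value of the) weight of accepting branches with negative weight; and similarly $\rho_+(x)$ and $\rho_-(x)$ for rejecting branches of positive and negative weight. Then $\alpha(x) = \alpha_+(x) - \alpha_-(x)$ and $\rho(x) = \rho_+(x) - \rho_-(x)$ are the total acceptance and rejection weights of $\mathbf{M}$, and $g(x) = g_0(x) + g_1(x)$, where $g_0(x)$ is the contribution to the gap from branches in which $a = 0$, and $g_1(x)$ is the contribution to the gap from branches in which $a = 1$. We then have
$$g_0(x) = 2M^T \big[ \alpha_+(x) - \alpha_-(x) + \rho_+(x) - \rho_-(x) \big] = 2M^T \big[ \alpha(x) + \rho(x) \big] = 2M^T,$$
as $\alpha(x) + \rho(x) = 1$. In the branches where $a = 1$, the sign of the contribution from rejecting branches is negated, so that
$$g_1(x) = 2M^T \big[ \alpha_+(x) - \alpha_-(x) - \rho_+(x) + \rho_-(x) \big] = 2M^T \big[ \alpha(x) - \rho(x) \big] = 2M^T \big[ 2\alpha(x) - 1 \big],$$
again using $\alpha(x) + \rho(x) = 1$. Then $g(x) = 4M^T \alpha(x)$, and for $h(n) = 4M^T$, we have $0 \le g(x)/h(|x|) \le 1$ as $\mathbf{M}$ is proper. Furthermore, if $\mathbf{M}$ decides a language $L$ with bounded error, then either $2/3 \le g(x)/h(|x|) \le 1$ or $0 \le g(x)/h(|x|) \le 1/3$ according to whether $x \in L$ or $x \notin L$; then $L \in$ AWPP as well.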

C. Constructing affine circuits
The next step towards Theorem 2 is to construct a family of circuits that can simulate an arbitrary AffTM. The final step will then be to show that the collection of all such circuit families is available in a specific non-free theory that satisfies tomographic locality and causality.
The construction of the circuits is based on that used by Yao in [44] to construct quantum circuits that simulate a quantum Turing Machine (and also on that of [58,59] for circuits that simulate a probabilistic Turing machine). As before, let $\mathbf{M}$ be a proper and efficient AffTM with alphabet $\Sigma$, set of states $Q$ and transition weights $\delta(q, a, \tau, q', a') \in \mathbb{Q}$ with $\tau \in \{\leftarrow, \bullet, \rightarrow\}$; the symbols $\leftarrow$, $\rightarrow$ and $\bullet$ are interpreted as the tape head of the AffTM moving to the left, moving to the right, and remaining stationary. Here $\delta$ is the transition weight of $\mathbf{M}$ to change to state $q'$, print $a'$ on the tape and move according to $\tau$, if the machine is currently in state $q$ and reading $a$. The condition on the weights in order for $\mathbf{M}$ to be an AffTM is:
$$\sum_{\tau, q', a'} \delta(q, a, \tau, q', a') = 1 \quad \text{for all } q \in Q,\ a \in \Sigma.$$
We may denote any configuration of the AffTM by a real basis vector
$$|s_{-t}, q_{-t}, a_{-t}, \ldots, s_t, q_t, a_t),$$
where the index $-t \le i \le t$ denotes the $i$th cell of the tape and $t$ is the run time of the AffTM (there are $2t + 1$ cells, numbered from $-t$ to $t$). Here $s_i$ takes on value 0 when the head is not at cell $i$, value 1 when it is at cell $i$ and the transition step has not occurred, and value 2 when the head has just moved according to a transition and is now at cell $i$. Note that we can represent $s_i$ with two bits. The label $q_i$ denotes the internal state of the machine at cell $i$, so $q_i \in Q \cup \{\emptyset\}$, where $q_i = \emptyset$ if and only if $s_i = 0$; and $a_i \in \Sigma$ denotes the alphabet character printed on cell $i$. It is clear that $\ell$ bits, where $\ell = 2 + \lceil \log(|Q| + 1) \rceil + \lceil \log(|\Sigma|) \rceil$, are required to represent the information at each cell. One can thus think of these basis vectors as being encoded by strings in $\{0, 1\}^{(2t+1)\ell}$.
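The per-cell encoding can be sketched as follows. The code computes the number of bits per cell as 2 + log(|Q|+1) + log(|Σ|), with each logarithm taken base 2 and rounded up (the rounding is an assumption needed to obtain an integer bit count), and packs a tuple (s, q, a) into a bit string. The example state set, alphabet, field ordering, and function names are hypothetical.

```python
def ceil_log2(n):
    """Smallest k with 2**k >= n, for n >= 1, using exact integer arithmetic."""
    return (n - 1).bit_length()

def bits_per_cell(states, alphabet):
    return 2 + ceil_log2(len(states) + 1) + ceil_log2(len(alphabet))

def to_bits(value, width):
    return format(value, "b").zfill(width) if width else ""

def encode_cell(s, q, a, states, alphabet):
    """Encode (s, q, a); q = None plays the role of the empty symbol."""
    q_index = 0 if q is None else states.index(q) + 1   # index 0 is reserved for "empty"
    return (to_bits(s, 2)
            + to_bits(q_index, ceil_log2(len(states) + 1))
            + to_bits(alphabet.index(a), ceil_log2(len(alphabet))))

states = ["q0", "q1", "halt"]     # hypothetical state set Q
alphabet = ["0", "1"]             # hypothetical alphabet
ell = bits_per_cell(states, alphabet)
t = 2
total_bits = (2 * t + 1) * ell    # length of the string encoding a configuration

head_cell = encode_cell(1, "q1", "1", states, alphabet)
blank_cell = encode_cell(0, None, "0", states, alphabet)
print(ell, total_bits, head_cell, blank_cell)
```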
The transitions made along any one branch are represented by a sequence of these vectors, where each element of the sequence is the configuration of the machine at a given moment in time. The full state of the AffTM corresponds to an affine combination of such configurations, and the evolution of the AffTM corresponds to affine transformations of these configurations in superposition. We may then simulate the AffTM by a uniform family of affine circuits.
Here, an "affine circuit" (in analogy to quantum circuits) refers to an acyclic network of "affine gates", each of which represents an affine transformation acting on real vectors. We demand that the matrices corresponding to these affine transformations have entries (with respect to the standard basis) that can be computed efficiently, i.e. in poly-time, by an ordinary Turing Machine. We also demand that the description of the circuit can be computed efficiently, and in particular that it contain only a polynomial number of gates.
A specific affine circuit in this family will correspond to the concatenation of $t$ identical sub-circuits, which we denote by $B$. Each sub-circuit $B$ simulates one time-step of the AffTM $M$. To construct these circuits, each tape cell of the AffTM is associated with $\ell = 2 + \lceil \log(|Q|+1) \rceil + \lceil \log(|\Sigma|) \rceil$ wires in the circuit, which are sufficient to encode a tuple $(s_i, q_i, a_i) \in \{0,1,2\} \times (Q \cup \{\emptyset\}) \times \Sigma$, as described above. We build the sub-circuit $B$ from copies of two gates, $G$ (with $3\ell$ input wires and $3\ell$ output wires) and $I$, as follows. We first perform a cascading sequence of $2t-1$ copies of $G$ (whose behaviour we describe below), with each one shifted right by $\ell$ wires from the preceding one. We then perform $2t+1$ copies of a gate $I$, in parallel, each acting on $\ell$ wires. The gate $I$ acting on the $i$th cell changes the value of $s_i$ from 2 to 1 and from 1 to 2, leaving a value of $s_i = 0$ alone. It is clear that $I$ is an affine transformation, and that the layer of $I$ gates can be built using $O(t)$ gates whose function is to implement the change in $s_i$ for a specific $i$. We denote the $i$th instance of $G$ as $G_i$. See Fig. 2 for a pictorial representation of $B$.
The intuitive idea behind this construction is as follows. The $3\ell$ inputs to $G$ should be thought of as describing the contents of three consecutive cells of the AffTM, including the information about the position of the head. We want $G$ to transform the contents of these cells if the head is at the middle cell and the transition step has not occurred (i.e. $s_i = 1$ with $i$ being the middle cell) according to how the AffTM would transform the contents. Thus we design $G$ to act as the identity on basis states with $s_i \neq 1$, and to map a basis state with $s_i = 1$ to the affine combination of its successor configurations, weighted by the corresponding values of $\delta$, with the new head position marked by the value 2. We can think of $G$ as a controlled affine transformation that does nothing if the input has $s_i \neq 1$ and performs the transition step of the AffTM otherwise. (We may extend this to define $G|y) = |y)$ for any other basis state $|y)$, where $y \in \{0,1\}^{3\ell}$ does not encode a valid tuple $(s_{i-1}, q_{i-1}, a_{i-1}, s_i, q_i, a_i, s_{i+1}, q_{i+1}, a_{i+1})$.) As the configuration of the AffTM is an affine combination of vectors encoding tuples $|s_{-t}, q_{-t}, a_{-t}, \ldots, s_t, q_t, a_t)$, and as we have defined the action of $G$ (when tensored with the identity on the cells on which it does not act) on all such vectors, extending linearly uniquely defines the action of $G$ on all configurations of the AffTM. Note that some linear combination of vectors with $s_i \neq 1$ can lead to the same output as when $G$ is applied to a vector with $s_i = 1$, so that $G$ may not be reversible. This may be expected, as affine transformations are not reversible in general; nor is there any requirement, in the setting of GPTs, to realise transformations reversibly. We construct $B$ using a cascading sequence of $G$ gates, acting on the wires 1 through $3\ell$ (representing cells $-t$, $-t+1$, and $-t+2$), then on the wires $\ell+1$ through $4\ell$, then $2\ell+1$ through $5\ell$, and so forth, as illustrated in Fig. 2. This in effect scans over the contents of the tape of the AffTM $M$, doing nothing in most cases but simulating one transition of $M$ on the triple whose middle cell contains the head at the beginning of the transition.
The $I$ gates then flip the value of each nonzero $s_i$ between 1 and 2, so that the next simulation step can be performed. In this way, $B$ simulates one step of the AffTM.
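The layout of one block $B$ can be sketched in a few lines. The function names are hypothetical, wires are numbered from 1 as in the text, and `l` stands for the number of wires per cell.

```python
def g_wire_ranges(t, l):
    """1-indexed, inclusive wire ranges for the 2t - 1 copies of G in one block B.

    Each copy covers three consecutive cells (3 * l wires) and is shifted
    right by l wires relative to the previous copy.
    """
    return [(k * l + 1, (k + 3) * l) for k in range(2 * t - 1)]

def apply_I(s):
    """Action of the gate I on a single head marker: swap 1 and 2, keep 0."""
    return {0: 0, 1: 2, 2: 1}[s]
```

With `t = 2` and `l = 5`, the three copies of $G$ cover wires 1 to 15, 6 to 20 and 11 to 25, so the last copy ends exactly at wire $(2t+1)\ell$. Applying `apply_I` twice returns the original marker, which reflects the fact that $I$ is a permutation of basis states and hence affine.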
We describe the initial state of the tape of $M$ by setting $a_0 a_1 \cdots a_{n-1} = x_1 x_2 \cdots x_n$ (where $x \in \Sigma^*$ is the input of length $n$), and setting $a_i$ to the blank symbol for $i < 0$ and $i \ge n$. We describe the initial head position of $M$ by setting $s_0 = 1$ and $s_i = 0$ for $i \neq 0$; similarly we set $q_0 \in Q$ to the initial state of $M$ and $q_i = \emptyset$ for $i \neq 0$. This describes the initial state $|s_{-t}, q_{-t}, a_{-t}, \ldots, s_t, q_t, a_t)$ which is the input to the affine circuit. The run time of the simulated machine is $t$, so by concatenating $t$ instances of $B$ acting on the initial state, we obtain an affine circuit simulating the entire run of $M$, producing a distribution $|\psi_x)$, which is an affine combination of basis vectors $|s_{-t}, q_{-t}, a_{-t}, \ldots, s_t, q_t, a_t)$ representing the final configurations of all of the branches of the AffTM.
As the position of the head of $M$ in each branch may be different, we define another gate which will allow us to localise the final state of $M$ in a definite subsystem. We define a gate $S$ acting on $2\ell$ wires which transforms $|0, q_i, a_i, 1, q_{i+1}, a_{i+1}) \mapsto |1, q_{i+1}, a_{i+1}, 0, q_i, a_i)$, and leaves all other basis states unchanged. By performing a cascade of $S$ gates, first on the wires $(2t-1)\ell+1$ through $(2t+1)\ell$ (representing cells $t-1$ and $t$), then on the wires $(2t-2)\ell+1$ through $2t\ell$ (representing cells $t-2$ and $t-1$), and so forth, each standard basis state is mapped to one of the form $|1, \bar{q}, \bar{a}, s_0, q_0, a_0, \ldots)$ for some $\bar{q}$ which is either the accept state $A$ or the reject state $R$. Acting on $|\psi_x)$, this cascade of $S$ gates produces a vector $|\varphi_x) = |1, A)|\varphi_{A,x}) + |1, R)|\varphi_{R,x})$.
By the conditions on the acceptance weight of $M$, the sum $w_{A,x}$ of the coefficients of $|\varphi_{A,x})$ satisfies either $w_{A,x} \in [0, \tfrac{1}{3}]$ or $w_{A,x} \in [\tfrac{2}{3}, 1]$; the same holds for the sum $w_{R,x}$ of the coefficients of $|\varphi_{R,x})$. Applying the operator ${}_j(u| = {}_j(0| + {}_j(1|$ on all wires $j$, except for the wires 3 through $\lceil \log(|Q|+1) \rceil + 2$ representing the final state $A$ or $R$ of $M$, we then obtain a state which is a distribution representing the probability with which $M$ accepts $x$. The entire affine circuit constructed in this way is illustrated in Fig. 3. The probability to accept is then just the factor in front of the basis state corresponding to the accepting configuration. We may thus simulate $M$ by the $t$-fold application of $B$ on the initial configuration, followed by the cascade of $S$ gates and the application of unit effects described above.

D. A tomographically local theory
The construction in the preceding section resulted in a collection of closed (affine) circuits which correspond to the probability that the AffTM accepts or rejects given a specific input. To work towards Theorem 2, we now show that such an affine circuit family can be simulated in a non-free theory that satisfies tomographic locality and causality.
As discussed in Sec. I, the correspondence between probabilities and closed circuits in a physical theory gives rise to a vector space structure for the states, transformations, and effects. To find a theory which "represents" an affine circuit such as the ones described above, one must ensure that the vector spaces which emerge from that theory have the same structure as that of the affine circuits, so that the states, effects and transformations in the theory correspond to the real vectors and matrices involved in the construction of the affine circuits.
This requirement is stronger than it may at first appear, for the following reason. For a proper AffTM $M$, the affine circuits to simulate $M$ merely represent preparations of distributions over $|A)$ and $|R)$, despite the fact that we may describe them as involving intermediate stages involving distributions over basis vectors $|s_{-t}, q_{-t}, a_{-t}, \ldots, s_t, q_t, a_t)$ in a $2^{(2t+1)\ell}$-dimensional vector space. Without some appropriate additional structure, a physical theory representing these affine circuits would be a theory of elaborate preparations of a 2-dimensional system, corresponding to the acceptance and rejection outcomes. To ensure that we may attribute non-trivial vector spaces to the input and output wires of each gate in the affine circuit in an appropriate way (representing the intermediate stages of the affine circuits), we could allow effects $(0|$, $(1|$ on each wire. However, if some of the affine transitions involve "improper" distributions (where some coefficients have negative weights), this will not produce a theory in which all outcomes result in well-defined probabilities. Furthermore, while we know that the gates $G$, $S$, and $I$ may be composed with appropriate preparations and effects to give rise to probability distributions (from the premise that $M$ is a "proper" AffTM), we have no particular bounds on the coefficients of arbitrary compositions of these affine gates.
To solve this problem, the theory we construct to represent these affine circuits is taken to be a non-free theory, in which the gates are not "freely composable" in the theory but are instead limited to certain allowed compositions. At the same time, to associate each wire in a circuit with the same vector space in the physical theory as it has in the affine circuit, we introduce a collection of "noisy" measurements. These measurements, consisting of "noisy" effects that can be applied to single systems, are constructed to wash out any possible negative or super-normalised weights arising in the measurement process and to allow tomography to be performed locally on any preparation.
The allowed closed circuits $M_n$ in our non-free theory will consist of those corresponding to the closed affine circuits from the previous section which simulate AWPP computations on inputs of length $n$, and (making use of the noisy measurements) those sufficient for tomography to be performed locally on any state preparation. In our theory, preparations consist of semi-constructed versions of the circuits $M_n$, possibly followed by measurements on some of the systems.

System types, states, and transformations
Let $\{M_n\}_{n \ge 1}$ be the family of affine circuits, simulating a proper AffTM $M$ on inputs of length $n \ge 1$, described in Sec. III C. At any specific step along one branch of the computation, the contents of cell $i$ of the AffTM tape is described by a bit string $c_i \in \{0,1\}^{\ell}$. For each $n$, we will define types such that each wire gets a type $\nu_n$, each parallel composition of $\ell$ wires (corresponding to a bit string $c_i \in \{0,1\}^{\ell}$ and representing one cell of the AffTM) gets a type $A_n$, and the parallel composition of $(2t+1)\ell$ wires (representing the memory of a single copy of the circuit $M_n$) gets a type $\bar{A}_n$. For simplicity, we will here only discuss the states, transformations, and effects for a fixed $n$; all other types will be constructed in the same manner. We will henceforth write $\nu_n$, $A_n$, and $\bar{A}_n$ simply as $\nu$, $A$, and $\bar{A}$.
We define vectors ${}_A(d_i|$ dual to the states $|c_i)_A$ in the standard way: ${}_A(d_i|c_i)_A = 1$ if $d_i = c_i$ and $0$ otherwise, for $c_i, d_i \in \{0,1\}^{\ell}$. We will use these dual vectors as the starting point for constructing noisy measurements in the following section. Given the construction of the real vectors from Sec. III C, the state corresponding to the entire contents of the tape at the very start of the AffTM computation is the product $|b)_{\bar{A}} = |c_{-t})_A \otimes \cdots \otimes |c_t)_A$ of the states describing the initial contents of each cell.
The non-free theory we consider only allows a limited set of conceivable compositions of devices as valid circuits. Specifically, the only allowed closed circuits are those consisting of those compositions of the gates $G$, $I$, and $S$ which form an "initial segment" of the affine circuit $M_n$ defined in Sec. III C (Fig. 3), followed by measurements on any remaining unmeasured systems. We define an initial segment of $M_n$ to consist of the first $m_i$ gates to act on each wire $i \in \{1, 2, \ldots, (2t+1)\ell\}$, where the number of gates $m_i \ge 0$ may vary for each $i$, and may include all or none of the gates acting on wire $i$. (Thus, an initial segment is one which may be completed to the whole circuit $M_n$ by post-composition of an appropriate sequence of gates.) The preparations allowed by the theory are precisely those which correspond to such initial segments of $M_n$, again possibly followed by measurements on some of the systems. The objective is to define this theory in such a way that it satisfies local tomography.
We now describe how to construct such a theory. Note that states of system $\bar{A}$ include ones of the form $B \cdots B|b)_{\bar{A}}$, where $B$ is the circuit fragment depicted in Fig. 2. As $M$ is allowed to assign negative weights to certain transitions, ${}_{\bar{A}}(e|B \cdots B|b)_{\bar{A}}$ may be negative. Thus effects of the form ${}_{\bar{A}}(e|$, for $e \in \{0,1\}^{(2t+1)\ell}$, may not be admissible effects on all states of system $\bar{A}$. Instead, we use versions of the effects ${}_A(e|$ with added "noise" (consisting of ${}_A(e|$ precomposed with a mixing operator) to wash out negative or super-normalised weights without losing the vector space structure of the individual system.
Before we introduce noisy measurements, note that the effect ${}_\nu(u|$ on system $\nu$ given by ${}_\nu(u| = {}_\nu(0| + {}_\nu(1|$ is allowed, because the result of applying this effect to any affine state preserves the property of the coefficients adding to 1, and so will yield an affine state on a smaller system. By parallel composition, we may also define the effects ${}_X(u|$ on any composite system $X$; in particular, it is easy to show that ${}_X(u|T_X|s)_{\bar{A}} = {}_X(u|s)_{\bar{A}}$ for any affine transformation $T$ we may perform on any subsystem $X$, as ${}_X(u|$ is a left $+1$-eigenvector of any such operator $T$. Thus, ${}_X(u|$ is a deterministic effect on systems of type $X$.
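The eigenvector property can be checked numerically on a small example. In the sketch below, the matrix `T` and the vector `state` are hypothetical; the only property used is that each column of `T` sums to 1, which is what it means for `T` to map affine combinations to affine combinations.

```python
from fractions import Fraction

def apply(matrix, vec):
    """Matrix-vector product for a small dense matrix given as a list of rows."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def unit_effect(vec):
    """The effect (u| = (0| + (1| + ... : the sum of all coefficients."""
    return sum(vec)

def is_affine_map(matrix):
    """True if every column sums to 1, i.e. (u| T = (u|."""
    return all(sum(row[j] for row in matrix) == 1
               for j in range(len(matrix[0])))

# Hypothetical affine map and affine state, both with negative entries.
T = [[Fraction(3, 2), Fraction(-1, 4)],
     [Fraction(-1, 2), Fraction(5, 4)]]
state = [Fraction(3, 2), Fraction(-1, 2)]
```

Here `unit_effect(state)` and `unit_effect(apply(T, state))` both equal 1, even though the individual coefficients lie outside the interval [0, 1].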

Noisy measurements
We construct effects for the noisy measurements, so that when applied at the end of an initial segment of $M_n$, the measurement statistics are not negative values resulting from the affine transformations in the circuit. We define two effects ${}_\nu(a_0| = {}_\nu(0|D_\nu$ and ${}_\nu(a_1| = {}_\nu(1|D_\nu$, in terms of a suitable stochastic $2 \times 2$ operator $D_\nu$. The role of $D_\nu$ is to provide a veil of propriety, so that for any preparation $|s)$ on $k$ bits included in the theory, all of the coefficients of the vector $(D_\nu^{\otimes k})|s)$ are in the interval $[0, 1]$. If we extend ${}_X(a_e| = \bigotimes_{i \in X} {}_\nu(a_{e_i}|$ to define noisy effects on a subsystem $X$, it follows that $(a_e|s) \in [0, 1]$ for all $e \in \{0,1\}^{|X|}$.
Our strategy is to construct $D_\nu = p_\nu I_2 + \frac{1-p_\nu}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ for some $p_\nu \in [0, 1]$, where $I_2$ is the $2 \times 2$ identity matrix. As $p_\nu$ decreases, the effect of $D_\nu$ is to decrease the "bias" of the distribution over outcomes $\{{}_\nu(a_0|, {}_\nu(a_1|\}$ for any single system of type $\nu$. The "bias" is just the difference between the two weights described by the inner products $(a_i|x)$ for $i \in \{0, 1\}$, where $|x)$ is a state of type $\nu$. If this bias is in the interval $[0, 1]$, then the weights form a probability distribution and we can consider $\{{}_\nu(a_0|, {}_\nu(a_1|\}$ to be a well-defined measurement acting on system $\nu$. We prove that there exists a non-zero value of $p_\nu$ small enough to ensure that $D_\nu$ provides a veil of propriety for all allowed preparations of the system (which in our theory correspond to partial constructions of the affine circuits to simulate AffTMs, followed by some number of measurements on some of its subsystems). It is clear that if $p_\nu = 0$ then the bias will be 0, regardless of which state $|s)_A$ we apply the measurement to and which system we measure, so that we get sensible outcome probabilities. Recall that effects are functions which take all allowed states to allowed outcome probabilities; and note that in our non-free theory, there are only finitely many valid preparations and finitely many systems (in the partial construction of a single circuit on inputs of length $n$) on which to perform a measurement. Hence, continuity of the outcome probabilities in the effects ensures that there exists a value $p_\nu > 0$ such that the bias is contained in the interval $[0, 1]$, for all preparations and measured systems. Fixing such a value of $p_\nu$ results in noisy measurements that are sufficient for tomography on system $\nu$.
Thus, we obtain two distinct effects for systems of type $\nu$, which allows us to associate a non-trivial vector space structure to each wire. One can readily check that ${}_\nu(a_0| + {}_\nu(a_1| = {}_\nu(u|$, since $D_\nu$ is stochastic. Thus the deterministic effect arising from summing over noisy effects from each noisy measurement is the same as that introduced in the previous section.
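The way in which mixing washes out negative weights can be seen on a single wire. The sketch below assumes that the mixing operator has the form $D = p\,I_2 + \frac{1-p}{2}J$, with $J$ the all-ones $2 \times 2$ matrix; this form, the function name and the example state are assumptions made for illustration.

```python
from fractions import Fraction

def noisy_outcomes(state, p):
    """Return ((a_0|s), (a_1|s)) for a single-wire state s = (s0, s1).

    Assumes D = p * I + (1 - p)/2 * J, with J the all-ones 2x2 matrix,
    and noisy effects (a_i| = (i| D.
    """
    total = state[0] + state[1]      # equals 1 for an affine state
    mix = (1 - p) * total / 2        # contribution of the uniform mixing term
    return (p * state[0] + mix, p * state[1] + mix)

# Hypothetical "improper" affine state with a negative coefficient.
improper = (Fraction(3, 2), Fraction(-1, 2))
```

With `p = 1` the weights are the raw coefficients (3/2, -1/2); with `p = 1/4` they become (3/4, 1/4), a valid probability distribution; and with `p = 0` they are (1/2, 1/2) for every affine state. For this particular state any `p` up to 1/2 suffices, and in each case the two weights sum to 1.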

Accepting/rejecting measurement
Let $(A|$ and $(R|$ denote the duals of the basis states $|A)$ and $|R)$, representing the accept and reject states (respectively) of the AffTM, defined at the end of Sec. III C. The accept and reject measurement effects ${}_{\bar{A}}(e_{\rm acc}|$ and ${}_{\bar{A}}(e_{\rm rej}|$ are defined by applying the effect $(A|$ or $(R|$, respectively, to the appropriate part of the first cell representing the state of the AffTM $M$, and ${}_\nu(u|$ to all other wires. We also define a third effect ${}_{\bar{A}}(e_{\rm none}| = {}_{\bar{A}}(u| - {}_{\bar{A}}(e_{\rm acc}| - {}_{\bar{A}}(e_{\rm rej}|$ to obtain a set of effects which sum to ${}_{\bar{A}}(u|$. Recall that the configuration of the AffTM at step $t$ is such that the sum of the amplitudes of the accepting paths and the sum of the amplitudes of the rejecting paths are both bounded within $[0, 1]$, and together sum to 1. Hence, ${}_{\bar{A}}(e_{\rm acc}|$ and ${}_{\bar{A}}(e_{\rm rej}|$ define valid effects, and the result of applying ${}_{\bar{A}}(e_{\rm none}|$ at the end of the affine circuit will in fact always be the scalar 0. Then we may define a three-valued measurement $\{{}_{\bar{A}}(e_{\rm acc}|, {}_{\bar{A}}(e_{\rm rej}|, {}_{\bar{A}}(e_{\rm none}|\}$ whose effects sum to ${}_{\bar{A}}(u|$. Performing the three-valued measurement at the end of the circuit simulating the affine circuit will yield a probability distribution $(w_{A,x}, w_{R,x}, 0)$ which indicates the result of the computation.
This measurement, and the noisy measurements that are allowed for the other initial segments of the affine circuit, are all of the allowed measurements in the theory; and in each case the sum of the effects for those measurements add to (u|. Then (u| is the unique deterministic effect, and so the theory we are constructing is causal.

Allowed closed circuits
As discussed at the start of this section, the allowed closed circuits in the theory are those corresponding to initial segments of the circuits $M_n$ simulating AffTMs (including the complete circuits $M_n$ themselves), with noisy measurements or the deterministic effect on any remaining free wires. The set of closed circuits involving only noisy measurements ensures one can perform full local tomography of all the preparations in the theory, ensuring the theory inherits the vector space structure of the affine circuits.
This implies, among other things, that vector spaces arising in the theory have a tensor product structure; we have thus constructed a causal and tomographically local non-free theory. Furthermore, given that we can efficiently compute a time $t$ by which $M$ has halted, the circuit $M_n$ may be assumed to have an efficiently computable depth, so that it is easy to determine whether a given circuit is an initial segment of $M_n$. This yields a non-free theory to simulate $M$.

E. An operational theory for all of AWPP
The final step in the proof of Theorem 2 is to show how to combine the preceding constructions to describe a (non-free) operational theory G not just for a single language in AWPP, but for the entire class.
As we show in Sec. III B, every problem in AWPP can be solved with bounded error by a proper affine Turing machine (AffTM) which halts in polynomial time. Conversely, any poly-time proper AffTM which has an acceptance weight either $\ge \tfrac{2}{3}$ or $\le \tfrac{1}{3}$ for all inputs defines a language $L \in$ AWPP. We then define a theory $\mathcal{G}$ which simply contains enough devices and system types to simulate every such AffTM, and only these AffTMs. In this theory, each system type is parameterised by a (polytime, proper, bounded-error) AffTM $M$ and an input size $n \ge 1$; and each device is one of the sort described in the previous sections, also parameterised by $(M, n)$. The devices $G_{M,n}$, $S_{M,n}$, $I_{M,n}$, and the various preparations and measurements for each system type, may then be used to construct circuits $C_{M,n}$ to simulate the AffTM $M$ on inputs of size $n$; and for each such $M$, there will be a deterministic Turing machine $U$ which can generate $C_{M,n}$ in poly($n$) time.
To summarise: for any $L \in$ AWPP, there is a polytime, proper AffTM $M$ which decides $L$ with bounded error, which may be simulated by an affine circuit family $\{M_n\}_{n \ge 1}$. This affine circuit family may be constructed uniformly, by the fact that it simulates an AffTM which halts in polynomial time. The family $\{M_n\}_{n \ge 1}$ may itself be simulated by a uniform circuit family $\{C_{M,n}\}_{n \ge 1}$ consisting of allowed experiments in the theory $\mathcal{G}$. Then $\mathcal{G}$ is a non-free theory in which AWPP ⊆ BGP. Together with Theorem 1, this concludes the proof of Theorem 2.

IV. PROOF OF THEOREM 3
In this section we present a proof of Theorem 3.
Proof. Recall that UNIQUE-SAT is the problem of deciding whether a given Boolean formula has exactly one satisfying truth assignment, or no satisfying assignment at all, promised that one of these is the case. It is known that UNIQUE-SAT is contained in PromiseUP, which is a subset of PromiseAWPP [26]. The Valiant-Vazirani theorem [25] says that if one has an efficient algorithm for solving UNIQUE-SAT in conjunction with the ability to perform random reductions, then one can solve any problem in NP. (More precisely, the Valiant-Vazirani theorem says the standard Boolean Satisfiability Problem SAT can be randomly reduced to UNIQUE-SAT.) Now, if PromiseBQP = PromiseAWPP then UNIQUE-SAT ∈ PromiseBQP, so that there is a uniform family of quantum circuits which solve an instance of the promise-problem UNIQUE-SAT (with no requirements made on inputs which do not respect the promise). However, a crucial point is that, as gates in quantum theory are closed under composition, the output of the algorithm will always result in sensible probabilities, regardless of the input. One can therefore perform the random reduction of Valiant-Vazirani in quantum theory (randomly generating an appropriate instance of SAT, and using this to generate an appropriate experiment of the sort that solves UNIQUE-SAT with bounded error), and run the algorithm many times on each input produced by the reduction to test whether it is a YES instance of UNIQUE-SAT. By performing this reduction many times, we may then solve SAT with bounded error in BQP. It then follows that NP ⊆ BQP, which using Theorem 1 gives NP ⊆ AWPP.
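The isolation idea behind the Valiant-Vazirani reduction can be illustrated on toy instances. The sketch below is a simplification under stated assumptions: it finds the satisfying assignments by brute force and then filters them with random parity constraints, whereas the actual reduction encodes such constraints into the formula without knowing its solutions. The function names, the CNF encoding (signed integers for literals) and the way the number of constraints is sampled are all hypothetical choices.

```python
import random
from itertools import product

def satisfying_assignments(clauses, n):
    """Brute-force all satisfying assignments of a CNF over variables 1..n."""
    sols = []
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            sols.append(bits)
    return sols

def random_parity_filter(sols, n, k, rng):
    """Keep the assignments that satisfy k random parity (XOR) constraints."""
    constraints = [([rng.random() < 0.5 for _ in range(n)], rng.random() < 0.5)
                   for _ in range(k)]
    def ok(bits):
        return all(sum(b and c for b, c in zip(bits, coeffs)) % 2 == int(rhs)
                   for coeffs, rhs in constraints)
    return [s for s in sols if ok(s)]

def isolate(clauses, n, trials, seed=0):
    """Try to isolate a single satisfying assignment; return it, or None."""
    rng = random.Random(seed)
    sols = satisfying_assignments(clauses, n)
    for _ in range(trials):
        k = rng.randint(0, n + 1)  # number of parity constraints (simplified)
        kept = random_parity_filter(sols, n, k, rng)
        if len(kept) == 1:
            return kept[0]
    return None
```

An unsatisfiable formula never yields an isolated assignment, while for a satisfiable one each trial has some chance of leaving exactly one solution. This is why the argument in the proof repeats the reduction many times.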

V. DISCUSSION
A. On computation in non-free theories
We have described a theory G, within the framework of Generalized Probabilistic Theories, which satisfies tomographic locality and allows the solution of any L ∈ AWPP, providing a converse to the results of [21]. To describe this construction, we introduce a new possibility: that of a "non-free" theory, in which the possible transformations of systems are not necessarily closed under sequential and parallel composition.
Before discussing broader implications, we address an issue that might arise with non-free theories: what if an agent can solve a hard problem (say, outside of AWPP) simply by observing whether a certain type of system exists in the universe or not? Similarly: might an agent be able to solve problems outside of AWPP by simply observing whether a given circuit is an allowed experiment? This would amount to a form of 'cheating', similar in some ways to the construction of non-uniform circuits in the classical or quantum cases, but is consistent with an operational account of the limits of agents in the theory. If such 'cheating' were possible in a universe described by a non-free theory G, this would not contradict the claim that BGP ⊆ AWPP, which is a formal mathematical theorem about uniformly specifiable families of allowed experiments. But it would undermine the significance of the claim, since the definition of BGP could not be said to accurately capture the set of problems that an agent can efficiently solve by operational means.
Concerning the first possibility above, of computation by testing for the existence of devices: our answer is that we have not said anything about how difficult it is to determine whether a given type of system exists in the universe or not. We can suppose, for instance, that the universe is infinite; and that given a classical description of an affine Turing Machine, there is no step-by-step procedure that an agent can follow to determine if a corresponding type of system exists. Hence a non-free theory does not admit an easy way for an agent to solve the (uncomputable!) problem of whether a given affine Turing machine is proper or not.
The second possibility, of an agent solving difficult problems by testing whether an experiment is allowed, is more problematic. It is easy to show that for any PromiseAWPP problem, there is an AffTM which may not be proper, but which halts in polynomial time and produces a bounded-error "proper" distribution for inputs which satisfy the promise; and it is easy to apply the constructions of Sections III C and III D to produce non-free theories to simulate such an AffTM, on those inputs for which M does produce a proper distribution. Consider the problem UNIQUE-SAT ∈ PromiseAWPP, of determining whether or not an instance of Boolean Satisfiability has a satisfying assignment, given the promise that it has at most one such assignment. It is not difficult to show that evaluating whether this promise holds is itself an NP-complete problem. Then, if N is the AffTM corresponding to the PromiseAWPP algorithm for UNIQUE-SAT, we might conceive of an enterprising experimenter who realises that they can solve NP-complete problems by testing whether a given instance of SAT maps to an allowed experiment in the non-free theory to simulate N. As it is considered unlikely [17,18,40] that NP is contained in AWPP, this raises the issue of whether BGP ⊆ AWPP adequately captures the notion of efficient computation in an operational theory.
We do not have a complete account of how the experimenter can be prevented from obtaining computational power from testing the boundaries of an arbitrary non-free theory. However, for the non-free theories which we have explicitly described to solve problems in AWPP, we may show that such probing does not provide any power beyond AWPP. Consider attempts by an experimenter to test the boundaries of the AWPP-hard theory G which we have constructed. A closed circuit C is an allowed experiment if it simulates a circuit $C_{M,n}$ simulating some affine Turing Machine M (or an initial segment thereof, with subsequent noisy measurements), on systems of a single explicitly specified type, which is parameterised by (M, n) for some polytime proper AffTM M and input-size n ≥ 1. Whether or not C is allowed is easy to check with a classical computation from M and n. Hence the observation that a given circuit can or cannot be constructed cannot solve any harder problem. It also seems likely that a tighter definition of "a non-free theory" could be found, which explicitly prevents an agent from being able to solve difficult problems by probing the boundaries of a theory, while still permitting a theory which is AWPP-hard [65]. We leave the demonstration of such a definition as an open problem.

B. Conclusion and outlook
An interesting feature of the AWPP-complete theory G constructed in this paper is that it satisfies the principle of causality. The main result of [21] was that for any theory satisfying tomographic locality, whether or not causality is satisfied, efficiently solvable computational problems are contained in AWPP. Taken together, these results show that computational circuits in any non-causal theory can always be efficiently simulated by circuits in a causal theory. Hence, in the landscape of general theories, "acausality" does not appear to be a resource for computation.
Theorem 2 is reminiscent of a result encountered when quantum correlations are viewed in the context of the set of non-signalling theories [32]. This set consists of theories satisfying the no-signalling principle, ranked according to the strength of their correlations (quantified by the violation of certain Bell Inequalities) [32]. Quantum theory is ranked above classical theory, but there exists a theory colloquially known as "Boxworld" [12,15] which has the strongest correlations consistent with the no-signalling principle and is thus ranked above quantum theory. In the current paper, we considered the set of theories satisfying tomographic locality and ranked them according to the power of efficient computation. As quantum computers can efficiently simulate classical ones, classical theory is ranked below quantum theory, but here we showed the existence of a theory which has the strongest possible computational power and is hence ranked above quantum theory. In Fig. 4 we schematically represent this analogy between the sets of correlations satisfying the non-signalling conditions, and the computational complexity classes of theories satisfying tomographic locality, along with the quantum and classical cases for each.
Moreover, Refs. [27][28][29] have shown that methods employing quasi-probability distributions can simulate arbitrary non-signalling correlations. The quasi-probabilistic model of computation introduced here to build a theory with maximal computational power bears an intriguing resemblance to these approaches, providing another similarity between the set of all non-signalling correlations and the computational landscape of general theories.
Fig. 4. On the left-hand side is the schematic of the set of all non-signalling probability distributions (correlations) obtained in a Bell test, with the sets of quantum and classical correlations strictly contained inside (as is the case in general). On the right-hand side is another schematic of the computational complexity classes associated with theories that satisfy tomographic locality, with the theory presented in this paper saturating the whole of AWPP, and the classes associated with quantum and classical theory contained in this class. Note that we can only conjecture that each of these containments is strict, but there is evidence to support this conjecture (see, for example, [1,3] for the containment of classical computation within quantum, and Theorem 3 for the containment of quantum computation within AWPP).
Many attempts at providing reasonable physical principles that uniquely characterise the set of quantum correlations as a subset of the set of all non-signalling correlations have been made [33,36,38,39]. These principles, while not fully capturing the exact quantum boundary [37], have deepened our understanding of quantum correlations and provided connections between physical principles and information-theoretic advantages. Insights garnered from these connections have also led to the development of Device-Independent Cryptography. So while investigating such connections has foundational interest, it has also been shown to have practical implications.
It seems prudent to ask the analogous question for the set of tomographically local theories: can the class of efficient quantum computation be characterised by some set of physical principles? Such a characterisation would deepen our understanding of quantum computation and may also be of practical relevance; if one uncovers the necessary and sufficient physical requirements for universal quantum computation one could design algorithms that optimally take advantage of them. The results presented in this paper provide one with the language and tools to pose these questions in a rigorous fashion. One approach to such a characterisation would be to find the minimal set of physical principles that imply the quadratic speed-up over classical computation offered by Grover's search algorithm [7]. This speed-up is optimal for quantum computers [40], so any set of physical principles which imply it could be argued to capture some of the essence of quantum computation. Work in this direction has appeared in [54][55][56][57], where the quadratic lower bound to searching an unstructured database has been shown to hold for a large class of theories.
Recently, methods have been proposed that make use of quasi-probability distributions to classically estimate the output of a quantum computer [30]. These classical estimates converge on the true quantum output probabilities in a time quantified by the "negativity" of the quasi-probability distribution. The larger the negativity, the harder it is for a classical computer to estimate the output probability of a quantum computer. As we have provided an interpretation of the class AWPP in terms of quasi-probabilities, it would be interesting to determine if quantum algorithms can be constructed that estimate the output probability of this quasi-probabilistic computational model. In analogy with the classical estimation algorithms of [30], the quantum algorithms may converge to the true output probability at a rate governed by the negativity of the quasi-probability distribution. Determining how hard it is for a quantum computer to simulate AWPP would provide a way of determining if quantum theory is powerful for computation in the landscape of general theories.
The theory constructed in this paper-while having the maximal computational power of any tomographically local theory-does not exhibit many physically interesting features. This feature is again reminiscent of Boxworld, which is severely restricted in its computational ability (at least for the case of reversible computation) [41,42] as well as in its entangling dynamics [31]. Having a more refined approach to constructing theories in this landscape may allow us to investigate nonlocality, or other notions of physical relevance, along with the computational power of these theories, which would deepen our understanding of how computation and other information-theoretic advantages are connected.
Finally, the distinction introduced in this paper between free and non-free theories appears to be important for the study of computation in general physical theories. Indeed, it is still an open question whether there exists a free theory whose computational power equals AWPP. The important distinction between free and non-free theories is that transformations in free theories are closed under composition, implying a bound on the set of states. This need not be the case in non-free theories. Could it be the case that a quantum computer can exploit this fact and efficiently simulate computation in all tomographically-local free theories? If this conjecture was borne out, it could shed light on which physical features give rise to the quantum speed-up.
Note added. While writing up the current work we became aware of the related but independent work [60], on a similar characterisation of AWPP.