Quantum computation is the unique reversible circuit model for which bits are balls

The computational efficiency of quantum mechanics can be defined in terms of the qubit circuit model, which is characterized by a few simple properties: each computational gate is a reversible transformation in a connected matrix group; single wires carry quantum bits, i.e. states of a three-dimensional Bloch ball; states on two or more wires are uniquely determined by local measurement statistics and their correlations. In this paper, we ask whether other types of computation are possible if we relax one of those characteristics (and keep all others), namely, if we allow wires to be described by d-dimensional Bloch balls, where d is different from three. Theories of this kind have previously been proposed as possible generalizations of quantum physics, and it has been conjectured that some of them allow for interesting multipartite reversible transformations that cannot be realized within quantum theory. However, here we show that all such potential beyond-quantum models of computation are trivial: if d is not three, then the set of reversible transformations consists entirely of single-bit gates, and not even classical computation is possible. In this sense, qubit quantum computation is an island in theoryspace.


I. INTRODUCTION
Since the discovery of quantum algorithms that outperform all known classical ones in certain tasks [1], improving our understanding of the possibilities and limitations of quantum computation has become one of the central goals of quantum information theory. While it is notoriously difficult to prove an unconditional separation of polynomial-time classical and quantum computation [2], an approach that is often regarded as more tractable is to analyze how certain modifications of quantum computing affect its computational power. For instance, one may consider restrictions on the set of allowed quantum resources, and ask under which conditions the possibility of universal quantum computation is preserved despite the restriction. Notable results along these lines, among many others, include the Gottesman-Knill theorem [3][4][5], insights on the necessity of contextuality as a resource for magic state distillation [6], and bounds on the noise threshold of quantum computers [7].
In a complementary and in some sense more radical approach, going back to Abrams and Lloyd [8], one considers modifications of the quantum formalism itself and studies the impact of those modifications on the computational efficiency, resembling strategies of classical computer science like e.g. the introduction of oracles [9]. For example, it has been shown that availability of closed timelike curves leads to implausible computational power [10], that stronger-than-quantum nonlocality reduces the set of available transformations [11][12][13][14], that tomographic locality forces computations to be contained in a class called AWPP [15,16], and that higher-order interference does not lead to a speed-up in Grover's algorithm [17].
FIG. 1. The circuit model that we consider in this paper. We have an arbitrary finite number n of wires (here n = 4), and each wire carries a "gbit", which is a state in a d-dimensional Bloch ball state space. Initially, a product state is prepared (encoding, for example, the classical input to the algorithm), then a finite number of gates G_i is applied, each acting on an arbitrary number of gbits, and finally local measurements are performed. We assume that the G_i are elements of an (arbitrary, unspecified) closed continuous matrix group, and that the global state of n wires is uniquely determined by the statistics and correlations of single-wire measurements ("tomographic locality"). If d = 3, i.e. if the gbits are qubits, it has been shown in [44] that these assumptions uniquely characterize unitary quantum computation. Here we analyze the case d ≠ 3, and prove that, despite conjectures to the contrary [19], the corresponding models do not allow for any non-trivial computation at all. We do not assume that wires can be swapped, or that all transformations can be composed out of two-gbit transformations. See the main text for details.
In this paper, we consider a specific modification of the quantum formalism that is arguably among the simplest and most conservative possibilities. This modification dates back to ideas by Jordan, von Neumann, and Wigner [18], and it has several independent motivations as we will explain further below. This generalization keeps all characteristic properties of quantum computation unchanged, but modifies a single aspect: namely, it allows the quantum bit to have any number of d ≥ 2 degrees of freedom, instead of standard quantum theory's d = 3 (or the classical bit's d = 1). It has been conjectured [19] that the resulting theories allow for interesting "beyond quantum" reversible multipartite dynamics, which would make the corresponding models of computation highly relevant objects of study within the research program mentioned above. However, here we show that, quite on the contrary, these models are so constrained that they do not even allow for classical computation; hence, in Aaronson's terminology, the d = 3 case of the standard qubit circuit model can be seen as an "island in theoryspace" [20].
Our paper is organized as follows. Section II gives the mathematical framework. We define single bits that generalize the qubit ("gbits"), and then give three postulates that allow us to reason about circuits that are constructed out of n of these gbits. We formulate the problem that is addressed in this work and describe how it relates to earlier results in the literature. In Section III, we state and prove our main result: namely, while our principles uniquely determine quantum computation in the case that the single gbits have d = 3 degrees of freedom, any other value of d does not even allow for classical computation. We give the full proof for the case d ≥ 4 (the d = 2 case is deferred to the appendix), and illustrate the main idea of some of the proof steps by a circuit diagram, before concluding in Section IV.

II. GENERALIZED BITS AND GBIT CIRCUITS
In both classical and quantum computation, we can restrict our attention to the circuit model (as in Figure 1) where each of the wires (the single systems that enter and exit logical gates) corresponds to a two-level system. Quantum two-level systems (qubits) are different from classical ones (bits): they allow for a more complex behavior which encompasses phenomena like coherent superposition, interference, or uncertainty relations. Yet, both classical and quantum bits can be formalized in a unified way that we now describe.

A. Single gbits
To any d ∈ N, we associate a "generalized bit" (gbit) that has the d-dimensional Bloch ball, $B^d = \{\vec a \in \mathbb{R}^d \mid |\vec a| \le 1\}$, as its state space. Every vector $\vec a$ in the Bloch ball $B^d$ corresponds to a possible state of the generalized bit. Two-outcome measurements are described by vectors $\vec b \in \mathbb{R}^d$ with $|\vec b| = 1$, such that the probability of the first outcome if performed on state $\vec a$ is $(1 + \vec a \cdot \vec b)/2$, and that of the second outcome is $(1 - \vec a \cdot \vec b)/2$. In the following, it will be convenient to use the notation $\nu(\vec a) = (1, \vec a)^\top \in \mathbb{R}^{d+1}$, such that these two probabilities become $\frac{1}{2}\,\nu(\vec a) \cdot \nu(\pm\vec b)$. Reversible transformations of states are given by $\vec a \mapsto R\vec a$, where R ∈ SO(d) is a rotation matrix. These transformations map states to states and can be inverted (by applying $R^{-1}$), hence we can interpret them as closed-system time evolutions or, equivalently, reversible gates on single generalized bits.
For d = 3, this formalism recovers the qubit of standard quantum theory [5]: as is well-known, every 2 × 2 density matrix ρ can be written in the form $\rho = \frac{1}{2}(\mathbb{1} + \vec a_\rho \cdot \vec\sigma)$ with a Bloch vector $\vec a_\rho \in \mathbb{R}^3$, where $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ denotes the Pauli matrices. It is automatic in this representation that tr ρ = 1, and positivity ρ ≥ 0 is equivalent to $|\vec a_\rho| \le 1$. Hence the set of states of a quantum bit can be represented by the Bloch ball $B^3$. This representation has the important property that statistical mixtures correspond to convex combinations: if a state ρ is prepared with probability p and another state ρ' is prepared with probability 1 − p, then the total state pρ + (1 − p)ρ' corresponds to the Bloch vector $\vec a_{p\rho + (1-p)\rho'} = p\,\vec a_\rho + (1-p)\,\vec a_{\rho'}$. This statistical interpretation of convex mixtures is also adopted for balls of other dimensions d ≠ 3, hence these Bloch balls can be regarded as state spaces of generalized probabilistic theories [11].
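The measurement rule for a single gbit can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the helper names `nu`, `dot` and `outcome_probabilities`, as well as the example vectors for a d = 4 gbit, are hypothetical and not taken from the paper.

```python
import math

def nu(a):
    """Embed a Bloch vector a in R^d as nu(a) = (1, a) in R^(d+1)."""
    return [1.0] + list(a)

def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(u, v))

def outcome_probabilities(a, b):
    """Return (p_first, p_second) for measurement b on state a,
    computed as (1/2) nu(a) . nu(+b) and (1/2) nu(a) . nu(-b)."""
    minus_b = [-x for x in b]
    return 0.5 * dot(nu(a), nu(b)), 0.5 * dot(nu(a), nu(minus_b))

# Hypothetical example values for a d = 4 gbit.
a = [0.3, -0.2, 0.5, 0.1]   # a mixed state, |a| < 1
b = [0.5, 0.5, 0.5, 0.5]    # a measurement direction, |b| = 1
assert math.sqrt(dot(a, a)) <= 1.0

p_first, p_second = outcome_probabilities(a, b)
print(p_first, p_second)
```

The two numbers agree with (1 ± a·b)/2 and sum to one, which is all that the ν notation encodes.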
In the d = 3 case, projective measurements are represented by unit vectors $\vec b$, $|\vec b| = 1$, with outcome probabilities $(1 \pm \vec a \cdot \vec b)/2$ as described above. Unitary transformations U on states, acting as $\rho \mapsto U\rho U^\dagger$, are described in the Bloch ball picture by orthogonal maps $R_U$, $R_U^\top R_U = \mathbb{1}$, such that $\vec a_{U\rho U^\dagger} = R_U\,\vec a_\rho$. More general measurements (positive operator-valued measures) or transformations (completely positive maps) can also be described in the Bloch ball representation, but they are not needed in what follows and thus omitted.
The simplest case of d = 1 corresponds to the classical bit: there are two possible configurations, a = +1 and a = −1, and further states that represent classical uncertainty about the configuration. Namely, if we have +1 with probability p (and thus −1 with probability 1 − p), this corresponds to the state a = p · (+1) + (1 − p) · (−1) = 2p − 1, which for 0 < p < 1 lies in the interior of the one-dimensional "Bloch ball".
There is one speciality in the d = 1 case: instead of SO(1) = {1}, we should allow the group O(1) = {−1, 1} as Bloch ball transformations such that also the bit flip is allowed.
What is the significance of the d-dimensional Bloch balls if d is neither one nor three? These gbits have appeared in various places in quantum information theory and the foundations of quantum mechanics. Historically, they first showed up as precisely those two-level state spaces that can be described as (formally real, irreducible) Jordan algebras [18], a natural algebraic generalization of standard quantum theory. In fact, quantum theory with real amplitudes, i.e. over the field R instead of C, has a (d = 2)-dimensional Bloch ball as its "quantum bit", and the bits of quaternionic and octonionic quantum theory correspond to $B^d$ for d = 5 and d = 9 respectively. Furthermore, the fact that a two-level system should have a Euclidean ball state space can be derived from a variety of different sets of natural assumptions. In many reconstructions of quantum theory from physical or information-theoretic principles [21][22][23][24][25][26][27][28][30][31][32], this fact is derived as a first step. A direct geometrical motivation can be found by considering spin-1/2 particles (compare e.g. [19]): under rotations in SO(3), they transform via SU(2). The density matrix transforms under the adjoint representation, which means that the Bloch vectors transform via the same rotation as in physical space. Therefore, the Bloch vector $\vec b$ can be seen as defining an oriented axis in physical space. The model considered in this paper is a direct generalization of the Bloch ball and this interpretation to arbitrary spatial dimensions. Indeed, the possibility that space might have more than three dimensions has appeared in a large variety of physical theories, see e.g. [33][34][35][36][37][38]. It has also been argued that these generalized bits can be interpreted as "information quasiparticles" in some sense [39]. In summary, these gbits are among the simplest and most natural generalizations of the classical bit and the qubit of quantum mechanics.

B. Several gbits and computation
To describe circuit computation, we need to define the state space, measurements, and transformations of several gbits. In standard quantum theory, where the gbits are qubits, there is a unique definition of these notions: the states of n qubits are exactly the $2^n \times 2^n$ density matrices, the reversible transformations are the unitaries, and the measurements are described by collections of projection operators. Similar definitions apply to n classical bits. But if the gbits are Bloch balls of dimension d ∉ {1, 3}, then it is a priori unclear what the composite state space should be.
Since we would like to be as general as possible, we will not make any attempt to fix the composite state space from the outset. Instead, we will work with a small set of principles that the composite n-gbit system is supposed to satisfy. While these principles will constrain the n-gbit state space, it is by no means obvious that they determine it uniquely. However, we will show below that they are indeed constraining enough to allow us to derive the full set of states and transformations.
An important principle is the no-signalling principle [11]: the outcome statistics of measurements on any group of gbits do not depend on any other operations (e.g. measurements) that are performed on the remaining gbits. This is a physically well-motivated constraint that lies at the heart of what we mean by "different wires" (i.e. subsystems) of the circuit in the first place.
This principle is satisfied by classical as well as quantum computation, and so is our second postulate of tomographic locality [21,40]: every state on n gbits is uniquely characterized by the statistics and correlations of the local gbit measurements. In other words, a global n-gbit state is nothing but a catalog of probabilities for the outcomes of all the single-gbit measurements and their correlations.
It is not only classical and quantum theory that satisfies the principle of tomographic locality, but also more general probabilistic theories like boxworld [12]. If this principle was violated, then a collection of gbits would in some counterintuitive sense be "more" than a composition of its building blocks. Even though this formulation makes tomographic locality sound very natural, there are simple examples of theories that violate it. One such example is given by quantum theory over the real numbers R [41,42]. This is because observables of two single real qubits do not linearly generate all observables of two real qubits. In particular, if σ y is the Pauli matrix with purely imaginary entries, then σ y is not a real qubit observable, but σ y ⊗σ y is a real two-qubit observable. Intuitively, it represents a novel "holistic" degree of freedom that cannot be constructed out of local degrees of freedom and their correlations.
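The failure of tomographic locality in real quantum theory can be checked directly. The sketch below is a minimal illustration, not part of the original argument: the helpers `kron` and `is_real` are hypothetical names, and the parameter count uses the standard fact that real symmetric k × k matrices form a space of dimension k(k + 1)/2.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in A for row_b in B]

def is_real(M):
    """True if every entry of M has vanishing imaginary part."""
    return all(complex(z).imag == 0 for row in M for z in row)

# Pauli matrix with purely imaginary entries.
sigma_y = [[0, -1j], [1j, 0]]
yy = kron(sigma_y, sigma_y)

print(is_real(sigma_y))   # sigma_y itself is not a real-qubit observable
print(is_real(yy))        # but sigma_y (x) sigma_y has only real entries

# Parameter counting: observables of one real qubit are real symmetric
# 2 x 2 matrices (dimension 3), so products of local observables span at
# most 3 * 3 dimensions, while real symmetric 4 x 4 matrices need 10.
local_dim = 2 * (2 + 1) // 2
product_span_dim = local_dim ** 2
global_dim = 4 * (4 + 1) // 2
print(product_span_dim, global_dim)
```

The one missing dimension is accounted for by the "holistic" observable σ_y ⊗ σ_y described in the text.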
Not only is the postulate of tomographic locality very intuitive, but it is also very powerful: it allows us to represent states of n gbits as tensors [11]. That is, even if we do not know what the set of n-gbit states is, we know that every such state can be written as an element of the linear space $(\mathbb{R}^{d+1})^{\otimes n}$ (in the quantum case, where d = 3, this amounts to the $4^n$-dimensional real linear space of Hermitian $2^n \times 2^n$ matrices; for classical bits, where d = 1, it is the $2^n$-dimensional space that contains the probability vectors over $2^n$ configurations). In particular, an n-gbit product state with local Bloch vectors $\vec a_1, \ldots, \vec a_n$ is represented by $\nu(\vec a_1, \ldots, \vec a_n) := (1, \vec a_1)^\top \otimes \cdots \otimes (1, \vec a_n)^\top$, and all other states ω are vectors in the same space (but not of this product form). Tomographic locality then amounts to the fact that all these states are uniquely determined by the numbers $2^{-n}\,\nu(\vec b_1, \ldots, \vec b_n)^\top \omega$, which are the outcome probabilities of local gbit measurements corresponding to the Bloch vectors $\vec b_1, \ldots, \vec b_n$ on the state ω. This mathematical property has many intuitive consequences that are not otherwise guaranteed, e.g. the property that products of pure states are pure. It is also the reason why the mathematical literature has focused almost entirely on this notion of composite state space (cf. e.g. [43]): it leads to notions of "tensor products" of ordered linear spaces that make it possible to prove general statements that are otherwise unavailable. In the context of this paper, it would seem extremely difficult to make any meaningful statements whatsoever if not even the linear space on which the global states live could be fixed from the outset.
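The tensor representation of product states can be sketched numerically. The code below is an illustration under assumptions: the function names and the example Bloch vectors (d = 2, n = 3) are hypothetical, and the normalization 2^(-n) follows from the single-gbit rule (1/2) ν(a)·ν(b).

```python
def nu(a):
    """nu(a) = (1, a) in R^(d+1)."""
    return [1.0] + list(a)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def kron_vec(u, v):
    """Tensor (Kronecker) product of two vectors."""
    return [x * y for x in u for y in v]

def product_state(bloch_vectors):
    """nu(a_1, ..., a_n) = (1, a_1) (x) ... (x) (1, a_n)."""
    state = [1.0]
    for a in bloch_vectors:
        state = kron_vec(state, nu(a))
    return state

def joint_probability(measurements, omega):
    """Probability that all local measurements b_1, ..., b_n give 'yes' on omega."""
    return dot(product_state(measurements), omega) / 2 ** len(measurements)

# Hypothetical example: n = 3 gbits with d = 2.
states = [[0.6, 0.0], [0.0, -0.8], [0.3, 0.4]]
effects = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]   # unit vectors
omega = product_state(states)

p_joint = joint_probability(effects, omega)
p_factorized = 1.0
for a, b in zip(states, effects):
    p_factorized *= (1 + dot(a, b)) / 2
print(len(omega), p_joint, p_factorized)
```

For a product state the joint probability factorizes into the single-gbit probabilities, and the state vector has (d + 1)^n components, as stated in the text.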
We need one further ingredient to arrive at a model of computation, namely a set of reversible transformations. In analogy to standard quantum computation (where these are the unitaries), we postulate that the transformations form a closed continuous matrix group, and thus Lie group, G: they form a group since they can be composed; they must be linear maps since if we prepare a state ω with probability p and ω' with probability (1 − p), they must act on the components of the convex combination pω + (1 − p)ω' individually, to be consistent with the probabilistic interpretation [11]. Moreover, it is physically meaningful to model the group as closed since whenever we can approximate a transformation to arbitrary accuracy by gates, it makes sense to declare this transformation as in principle implementable. This postulate is almost, but not quite, satisfied by classical computation, i.e. the d = 1 case: the transformations on n classical bits are the permutations of the $2^n$ configurations. They form a closed matrix group of linear maps, but this group is discrete and not continuous. This discreteness is already reflected in the fact that the one-dimensional "Bloch ball" is discrete, i.e. has only a finite number (two) of pure states. In the case d ≥ 2 to which we thus restrict our attention in the following, continuity of the group of transformations is well-motivated (though not logically implied) by the continuity of the single gbit Bloch balls and their rotations. It is also motivated by the continuity of time evolution which is ultimately assumed to generate the logical gates that we apply in our circuits.
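All gates in a circuit will be elements of G. This group must in particular contain the local gbit rotations: for R ∈ SO(d), write $\hat R\,(1, \vec a)^\top := (1, R\vec a)^\top$; then the subgroup of local transformations is $G_{\rm loc} := \{\hat R_1 \otimes \cdots \otimes \hat R_n \mid R_1, \ldots, R_n \in SO(d)\}$. Note that we have used tomographic locality in deriving this prescription: since a local transformation acts like a product of transformations on the product states, it must act like this on all other states too since they live on the vector space that is spanned by the product states. Tomographic locality hence enforces that we can represent any linear map $X : (\mathbb{R}^{d+1})^{\otimes n} \to (\mathbb{R}^{d+1})^{\otimes n}$ as a tensor with n upper and n lower indices; that is, $X = \sum_{\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n} X^{\alpha_1 \ldots \alpha_n}_{\beta_1 \ldots \beta_n}\, (e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_n})(e_{\beta_1} \otimes \cdots \otimes e_{\beta_n})^\top$, where $0 \le \alpha_i, \beta_i \le d$, and $e_\gamma$ denotes the γ-th unit vector, e.g. $e_0 = (1, 0, \ldots, 0)^\top$. This is in contrast to Bloch vectors $\vec b \in \mathbb{R}^d$, where we use the notation $\mathbb{R}^d \ni \vec e_1 = (1, 0, \ldots, 0)^\top$.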
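The action of a local transformation on a product state can be sketched as follows. This is a minimal illustration under assumptions: the helper names (`rotation`, `hat`, `kron`, `matvec`, `product_state`) and the example rotations for d = 4, n = 2 are hypothetical choices, not taken from the paper.

```python
import math

def rotation(d, i, j, theta):
    """Rotation by angle theta in the (i, j) coordinate plane of R^d."""
    R = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    c, s = math.cos(theta), math.sin(theta)
    R[i][i], R[i][j], R[j][i], R[j][j] = c, -s, s, c
    return R

def hat(R):
    """Extend R in SO(d) to the (d+1) x (d+1) matrix mapping (1, a) to (1, R a)."""
    d = len(R)
    return [[1.0] + [0.0] * d] + [[0.0] + list(row) for row in R]

def kron(A, B):
    return [[x * y for x in row_a for y in row_b]
            for row_a in A for row_b in B]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def product_state(bloch_vectors):
    state = [1.0]
    for a in bloch_vectors:
        state = [x * y for x in state for y in [1.0] + list(a)]
    return state

d = 4
R1 = rotation(d, 0, 1, 0.7)
R2 = rotation(d, 2, 3, -1.1)
a1 = [1.0, 0.0, 0.0, 0.0]
a2 = [0.0, 0.0, 0.6, 0.8]

T = kron(hat(R1), hat(R2))                       # an element of G_loc
lhs = matvec(T, product_state([a1, a2]))         # T applied to the product state
rhs = product_state([matvec(R1, a1), matvec(R2, a2)])
max_dev = max(abs(x - y) for x, y in zip(lhs, rhs))
print(max_dev)
```

The two vectors agree up to rounding, i.e. the tensor product of the extended rotations maps a product state to the product state of the rotated Bloch vectors.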
We demand that G loc ⊆ G, but do not make any further assumptions on G. In particular, we do not assume that the n gbits play physically identical roles: our assumptions allow in principle composite state spaces of n gbits that are not symmetric with respect to permutations of the gbits. Hence we are also not assuming that gbits can be reversibly swapped, or that other natural choices of transformations such as extensions of classical reversible gates (like CNOT) can necessarily be implemented.
C. The trivial case G = G_loc
For any Bloch ball dimension d, there is a trivial computational model: namely the choice G = G_loc. This describes a theory where the only possible reversible transformations are independent local transformations of the single gbits. Such a model does not even allow for classical gates like the CNOT; it only admits gates and computations that evolve the gbits independently from each other without ever correlating them, i.e. products of single-gbit gates. A state space that is compatible with this choice of global transformations is simply $\mathrm{conv}\{\nu(\vec a_1, \ldots, \vec a_n) \mid \vec a_1, \ldots, \vec a_n \in B^d\}$, i.e. all convex combinations of product states. This is a state space that does not contain entanglement.

D. d = 3 equals quantum computation, and relation to earlier work
For the case of the standard qubit, i.e. of d = 3, it has been proven in [44] that there is only a single possible non-trivial ($G_{\rm loc} \subsetneq G$) theory that satisfies the assumptions from above: namely, standard quantum theory over n qubits, with the $2^n \times 2^n$ density matrices as the states, and the projective unitary group $G = PU(2^n)$ of transformations. That is, the postulates on composition of gbits from above, together with the structure of the single qubit, are sufficient to determine qubit quantum computation uniquely.
While this result is interesting in its own right, it also represents the main motivation for the present work: if quantum computation is characterized by such a simple list of principles, then maybe one obtains other interesting models of computation by slightly tweaking one of the postulates. Since large parts of the mathematical structure are determined by the postulates on composition (no-signalling and tomographic locality), the most promising road towards modifying the setup while keeping important mathematical tools seems to be to modify the structure of the single qubit. Technically as well as conceptually (as explained in Subsection II A), the most natural way to do this is by changing the dimension d of the Bloch ball.
In the special case of n = 2 gbits, the consequences of the above postulates have been explored in [45,46]. There it has been proven that the only consistent choice of transformations for Bloch ball dimension d ≠ 3 is given by the trivial choice G = G_loc. However, computation typically takes place on a large number n ≫ 2 of gbits, and the techniques of [45,46] cannot readily be generalized to n > 2.
In fact, it has been suggested in [19] that it is essential for Bloch ball dimensions d ≥ 4 to allow for genuine m-partite interaction of the gbits, where m ≥ d − 1 ≥ 3. Without a conclusive proof or explicit construction of the state space, the authors conjectured that interesting multipartite reversible dynamics is possible for such systems. In contrast to quantum theory, this m-partite dynamics would not be decomposable into two-gbit interactions. While tomographic locality has not been assumed in [19], it is an important first step to verify their conjecture under this additional assumption. In fact, it has been argued in [47] that in the context of spacetime physics (the Bloch balls are interpreted in [19] as carrying some sort of d-dimensional spin degrees of freedom), tomographic locality is to be expected due to arguments from group representation theory. This gives us another, independent motivation to ask the main question of this paper: if d ≠ 3 and n is any finite number of gbits, then what are the possible theories that satisfy the assumptions of Subsection II B?

III. MAIN RESULT
The main result of this work is an answer to the question posed at the end of the previous section:
Theorem 1. Consider a theory of n gbits, where single gbits are described by a (d ≥ 2)-dimensional Bloch ball state space, subject to the single-gbit transformation group SO(d). As described above, let us assume no-signalling, tomographic locality, and that the global transformations form a closed continuous matrix group G.
If d ≠ 3, then necessarily G = G_loc, i.e. the only possible gates are (independent combinations of) single-gbit gates. No transformation can correlate gbits that are initially uncorrelated; hence not even classical computation is possible.
We will now prove this result for the case d ≥ 4. The proof in the d = 2 case uses similar techniques, but differs in several details for group-theoretic reasons. It will hence be deferred to the appendix.
As a first step, we will consider the generators of global transformations and show that there exists at least one that is of a certain normal form. This part of the proof is valid for all dimensions d ≥ 2.

A. Generator normal form for all dimensions d ≥ 2
Let $G \in \mathcal{G}$ be a transformation of the composite system, i.e. an element of the global transformation group. Suppose we prepare the n gbits initially in states with Bloch vectors $\vec a_1, \ldots, \vec a_n$, evolve the resulting product state via G, and perform a final local n-gbit measurement with Bloch vectors $\vec b_1, \ldots, \vec b_n$. The probability that all the n outcomes on the n gbits are "yes" is $2^{-n}\,\nu(\vec b_1, \ldots, \vec b_n)^\top G\,\nu(\vec a_1, \vec a_2, \ldots, \vec a_n) \in [0, 1]$.
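Let us consider a group element $G = e^{\varepsilon X}$ with $X \in \mathfrak{g}$ (the corresponding Lie algebra) and $\varepsilon \in \mathbb{R}$, and expand:
$$2^{-n}\,\nu(\vec b_1, \ldots, \vec b_n)^\top \left(\mathbb{1} + \varepsilon X + \frac{\varepsilon^2}{2} X^2 + O(\varepsilon^3)\right) \nu(\vec a_1, \ldots, \vec a_n) \in [0, 1].$$
From now on we restrict ourselves to unit length Bloch vectors, i.e. $|\vec a_i| = |\vec b_j| = 1$ for all i, j. We obtain
$$C[\vec a_1] := \nu(-\vec a_1, \vec b_2, \ldots, \vec b_n)^\top X\, \nu(\vec a_1, \vec a_2, \ldots, \vec a_n) = 0,$$
since the zeroth order is zero, which is a local minimum as a function of ε (see Figure 2 for an interpretation). Thus the second-order contribution has to be non-negative: $\nu(-\vec a_1, \vec b_2, \ldots, \vec b_n)^\top X^2\, \nu(\vec a_1, \vec a_2, \ldots, \vec a_n) \ge 0$, or more generally, with the roles of the gbits exchanged,
$$\nu(\vec b_1, \ldots, \vec b_{k-1}, -\vec a_k, \vec b_{k+1}, \ldots, \vec b_n)^\top X^2\, \nu(\vec a_1, \ldots, \vec a_n) \ge 0. \qquad (1)$$
FIG. 2. We are using configurations like this one to derive constraints on the generators $X \in \mathfrak{g}$. In the special case ε = 0, the transformation exp(εX) reduces to the identity. Hence, if we prepare the first wire in the (pure) state with Bloch vector $\vec a_1$, and perform a final measurement of that wire with Bloch vector $-\vec a_1$, the corresponding outcome will have probability zero, regardless of which local measurements we choose for the other wires. But probability zero is a local minimum, which implies that the derivative of this probability with respect to ε must be zero (yielding $C[\vec a_1] = 0$), and the second derivative must be non-negative (yielding constraint (1) in the case k = 1).
Other first and second order constraints are
$$\nu(\vec a_1, \vec a_2, \ldots, \vec a_n)^\top X\, \nu(\vec a_1, \vec a_2, \ldots, \vec a_n) = 0, \qquad (2)$$
$$\nu(\vec a_1, \vec a_2, \ldots, \vec a_n)^\top X^2\, \nu(\vec a_1, \vec a_2, \ldots, \vec a_n) \le 0 \qquad (3)$$
for analogous reasons as above. For fixed Bloch vectors $\vec a_2, \ldots, \vec a_n, \vec b_2, \ldots, \vec b_n$, define $W^\alpha_\beta$ as
$$W^\alpha_\beta := \sum_{\alpha_2, \ldots, \alpha_n, \beta_2, \ldots, \beta_n} \nu(\vec b_2)_{\alpha_2} \cdots \nu(\vec b_n)_{\alpha_n}\, X^{\alpha\, \alpha_2 \ldots \alpha_n}_{\beta\, \beta_2 \ldots \beta_n}\, \nu(\vec a_2)_{\beta_2} \cdots \nu(\vec a_n)_{\beta_n}, \qquad (4)$$
so that $C[\vec a_1] = \nu(-\vec a_1)^\top W \nu(\vec a_1)$. The equation $C[\pm\vec e_i] = 0$ reads $W^0_0 \pm W^0_i \mp W^i_0 - W^i_i = 0$, hence $W^i_i = W^0_0$ and $W^0_i = W^i_0$. Since the vectors $\nu(\vec a_k)$ and $\nu(\vec b_k)$ with unit Bloch vectors linearly span $\mathbb{R}^{d+1}$, this yields
$$X^{i\, \alpha_2 \ldots \alpha_n}_{i\, \beta_2 \ldots \beta_n} = X^{0\, \alpha_2 \ldots \alpha_n}_{0\, \beta_2 \ldots \beta_n}, \qquad (5)$$
$$X^{0\, \alpha_2 \ldots \alpha_n}_{i\, \beta_2 \ldots \beta_n} = X^{i\, \alpha_2 \ldots \alpha_n}_{0\, \beta_2 \ldots \beta_n} \qquad (6)$$
for all i ≥ 1 and all $\alpha_2, \ldots, \alpha_n, \beta_2, \ldots, \beta_n \ge 0$. Similarly, $C[\frac{1}{\sqrt{2}}(\vec e_i + \vec e_j)] = 0$ for i ≠ j, i, j ≥ 1 yields
$$W^0_0 + \frac{1}{\sqrt{2}}\left(W^0_i + W^0_j - W^i_0 - W^j_0\right) - \frac{1}{2}\left(W^i_i + W^i_j + W^j_i + W^j_j\right) = 0.$$
Using the results on $W^i_i$ and $W^0_i$ further above, this reduces to $-\frac{1}{2} W^j_i - \frac{1}{2} W^i_j = 0$, and thus
$$X^{i\, \alpha_2 \ldots \alpha_n}_{j\, \beta_2 \ldots \beta_n} = -X^{j\, \alpha_2 \ldots \alpha_n}_{i\, \beta_2 \ldots \beta_n} \qquad (7)$$
for all i, j ≥ 1 and $\alpha_2, \ldots, \alpha_n, \beta_2, \ldots, \beta_n \ge 0$. While we have derived (5), (6) and (7) for the first gbit, analogous equations hold for all other gbits with labels 2, …, n.
Let us denote by A the antisymmetric (d + 1) × (d + 1)-matrices of the form
$$A_{\bar A} := \begin{pmatrix} 0 & 0 \\ 0 & \bar A \end{pmatrix}, \qquad \bar A = -\bar A^\top \in \mathbb{R}^{d \times d},$$
and by B the symmetric (d + 1) × (d + 1)-matrices of the form
$$B_{\vec b} := \begin{pmatrix} 0 & \vec b^\top \\ \vec b & 0 \end{pmatrix}, \qquad \vec b \in \mathbb{R}^d.$$
Furthermore, let $I := \mathbb{R} \cdot \mathbb{1}$, i.e. all multiples of the (d + 1) × (d + 1) identity matrix. The sets A, B and I are real linear matrix subspaces. Note that these three spaces are pairwise orthogonal with respect to the Hilbert-Schmidt inner product $\langle X, Y \rangle := \mathrm{tr}(X^\top Y)$.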
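As a sanity check, the first and second order constraints and the orthogonality of the three subspaces can be verified numerically in the simplest situation. The sketch below is an illustration under assumptions: it treats only a single gbit (n = 1, d = 4), where every generator is local and of the antisymmetric block form; the helper names and example matrices are hypothetical.

```python
def nu(a):
    return [1.0] + list(a)

def embed_A(A_bar):
    """Antisymmetric d x d matrix placed in the lower-right block (element of A)."""
    d = len(A_bar)
    return [[0.0] * (d + 1)] + [[0.0] + list(row) for row in A_bar]

def embed_B(b):
    """Symmetric matrix with b in the first row and first column (element of B)."""
    d = len(b)
    M = [[0.0] * (d + 1) for _ in range(d + 1)]
    for i, x in enumerate(b, start=1):
        M[0][i] = x
        M[i][0] = x
    return M

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def bilinear(u, M, v):
    """u^T M v."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def hs_inner(X, Y):
    """Hilbert-Schmidt inner product tr(X^T Y), i.e. the sum of entrywise products."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical antisymmetric generator of a d = 4 gbit rotation.
A_bar = [[0.0, 1.0, 2.0, 0.0],
         [-1.0, 0.0, 0.0, 3.0],
         [-2.0, 0.0, 0.0, -1.0],
         [0.0, -3.0, 1.0, 0.0]]
X = embed_A(A_bar)
X2 = matmul(X, X)
a = [0.5, 0.5, 0.5, 0.5]            # unit Bloch vector
minus_a = [-x for x in a]

first_order = bilinear(nu(minus_a), X, nu(a))    # C[a], should vanish
second_order = bilinear(nu(minus_a), X2, nu(a))  # constraint (1), should be >= 0
same_first = bilinear(nu(a), X, nu(a))           # constraint (2), should vanish
same_second = bilinear(nu(a), X2, nu(a))         # constraint (3), should be <= 0

B_example = embed_B([0.2, -0.4, 0.1, 0.3])
overlaps = (hs_inner(X, B_example),
            hs_inner(X, identity(5)),
            hs_inner(B_example, identity(5)))
print(first_order, second_order, same_first, same_second, overlaps)
```

For this generator the second-order terms equal plus and minus the squared length of the rotated Bloch vector, so the signs required by the constraints come out automatically.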
The matrix W defined in (4) must then be an element of A ⊕ B ⊕ I due to the identities for its components that we have derived above. More generally, since the same identities hold for every index i ∈ {1, …, n} for the tensor X, we obtain $X \in (A \oplus B \oplus I)^{\otimes n}$. Since $X \in \mathfrak{g}$ was arbitrary, this tells us that
$$\mathfrak{g} \subseteq (A \oplus B \oplus I)^{\otimes n} = \bigoplus_x S_x,$$
where x = x_1 x_2 … x_n ranges over all words of length n in the letters A, B, I, and $S_x := x_1 \otimes \cdots \otimes x_n$. Now let $X \in \mathfrak{g} \setminus \mathfrak{g}_{\rm loc}$ be an arbitrary generator which is not in the local Lie algebra (here we explicitly make the assumption that such an X exists). Since X ≠ 0, there must exist x such that $\Phi_x(X) \ne 0$ for the orthogonal projection $\Phi_x$ into $S_x$, and since $X \notin \mathfrak{g}_{\rm loc}$, at least one of those x must satisfy x ∉ {AI…I, IAI…I, …, I…IA}.
Reordering the gbits, we may assume that $x = A^{n_A} B^{n_B} I^{n_I}$, where $n_A + n_B + n_I = n$ and one of the following three cases applies: (i) $n_A = 0$, (ii) $n_A = 1$ and $n_B \ge 1$, (iii) $n_A \ge 2$. Since $S_x$ has an orthonormal basis of matrices of the form […] where $B := B_{\vec e_1}$. Set $X' := T X T^{-1}$, then since $T \in G_{\rm loc} \subset G$, we have $X' \in \mathfrak{g} \setminus \mathfrak{g}_{\rm loc}$, and $\langle X', M'_x \rangle = \mathrm{tr}(T X^\top T^{-1}\, T M_x T^{-1}) = \langle X, M_x \rangle \ne 0$. Similar argumentation allows us to bring the $A_{\bar A_i}$ into a standard form.
Since the d × d-matrices $\bar A_i$ are antisymmetric, one can infer from the results in [60] that there are orthogonal transformations $R_i \in SO(d)$ such that […] To save space, we will use the following notation in the remainder of the paper, where $\sigma = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$: […] Now consider the corresponding (d + 1) × (d + 1)-matrices $A_{R_i \bar A_i R_i^\top}$, for which we will introduce the following notation. By $A_j$, denote the matrix for which only the j-th block is non-zero, with $\lambda_j = 1$. That is, for even d, we have the (d + 1) × (d + 1)-matrices […] and for odd d, we have an extra initial zero, namely […] The local transformation $\tilde T$ […] where the $\lambda^{(i)}_j$ are real numbers. Set $X'' := \tilde T X' \tilde T^{-1}$, then since $\tilde T \in G_{\rm loc} \subset G$, we have $X'' \in \mathfrak{g} \setminus \mathfrak{g}_{\rm loc}$, and $\langle X'', M''_x \rangle = \mathrm{tr}(\tilde T X'^\top \tilde T^{-1}\, \tilde T M'_x \tilde T^{-1}) = \langle X', M'_x \rangle \ne 0$. In summary, we have shown that if there exist any nonlocal generators at all, then there is one (denoted $X''$) that has non-zero overlap with a matrix $M''_x \in S_x$ of the simple form (8).
Next we will show that this implies that g = g loc for all Bloch ball dimensions d ≥ 4.

B. Proof of Theorem 1 for d ≥ 4
We now use Schur's Lemma to construct orthogonal projectors (with respect to the Hilbert-Schmidt inner product) onto the subspaces of A ⊕ B ⊕ I. First, define […] For j = 1, …, d, consider the stabilizer subgroup $G_j := \{R \in SO(d) \mid R\vec e_j = \vec e_j\}$, where $\vec e_j$ denotes the j-th standard unit vector in $\mathbb{R}^d$. Every $G_j$ is isomorphic to SO(d − 1), whose fundamental representation is irreducible (note that this is not true […]), and, similarly as above, Schur's Lemma implies that $\Phi_{\vec e_1}$ is the orthogonal projector onto span(B) ⊕ I. Hence $\Phi_B := \Phi_{\vec e_1} - \Phi_I$ is the orthogonal projector onto span(B).
Finally, we will construct the orthogonal projector onto […] where $M_{0,0}$ is a y × y-matrix, all $M_{i,j}$ for i, j ≥ 1 are 2 × 2-matrices, and the other matrices are y × 2 and 2 × y-matrices. Then, the action of Φ becomes […] Hence Φ is an orthogonal projection that acts as the identity on I (i.e. Φ(1) = 1), and it projects A into its subspace $A_{\rm blocks}$. Furthermore, if d is even, then Φ annihilates B, and if d is odd, then Φ projects B into its subspace span(B). Thus, for d even, the orthogonal projector […]
Note that all these statements are only claimed to hold for the case that the maps are applied to operators in A ⊕ B ⊕ I. The projectors $\Phi_I$, $\Phi_B$ and $\Phi_A$ map the Lie algebra $\mathfrak{g}$ into itself, if we apply different products of those projectors to the n sites. For example, consider the special case n = 1. Then $Z \in \mathfrak{g}$ implies $\Phi_I[Z] \in \mathfrak{g}$ since $\mathfrak{g}$ is closed with respect to conjugations and integrals. Similarly, $\Phi_{\vec e_1}[Z] \in \mathfrak{g}$, and since $\mathfrak{g}$ is a linear space, we also have $\Phi_B[Z] = \Phi_{\vec e_1}[Z] - \Phi_I[Z] \in \mathfrak{g}$, and similarly for the projector $\Phi_A$. If n ≥ 2, then we can successively apply the projectors to one of the sites, using that tensoring local rotations with identities gives local transformations in $G_{\rm loc}$. Thus, if we define […] and thus Y ≠ 0 (we have used that Φ is an orthogonal projection and thus in particular self-adjoint with respect to the Hilbert-Schmidt inner product). In particular, $Y \in \mathrm{Im}(\Phi) = A_{\rm blocks}^{\otimes n_A} \otimes \mathrm{span}(B)^{\otimes n_B} \otimes I^{\otimes n_I}$. Consequently, there are real numbers $\lambda_{j_1, \ldots, j_{n_A}}$ such that […]
Now we apply the identities $A_j A_k = -\delta_{jk} P_j$ and $B^2 = P_B$, where […] and so on, up to $P_z$. This gives us […]
Suppose that $n_A$ is even such that $(-1)^{n_A} = 1$. We will now show that constraint (3) gets violated. To this end, fix some $j^0_1, \ldots, j^0_{n_A}$ such that $\lambda_{j^0_1, \ldots, j^0_{n_A}} \ne 0$. For i = 1, …, $n_A$, choose some unit vector $\vec a_i \in \mathbb{R}^d$ such that […]; for i ≥ $n_A + n_B + 1$, choose $\vec a_i$ arbitrarily such that […] Altogether, we obtain $\nu(\vec a_1, \ldots, \vec a_n)^\top Y^2\, \nu(\vec a_1, \ldots, \vec a_n) > 0$, which violates constraint (3). Thus $n_A$ must be odd, and $(-1)^{n_A} = -1$.
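The two algebraic identities used in this step can be checked in a small example. The sketch below is an illustration under assumptions: it takes d = 4 (even), so that there are two 2 × 2 blocks; it assumes that $P_j$ is the diagonal projector onto the j-th block and that $P_B$ is the projector onto the span of $e_0$ and $e_1$, which is what the identities themselves suggest; all function names are hypothetical.

```python
def zeros(n):
    return [[0 for _ in range(n)] for _ in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, M):
    return [[c * x for x in row] for row in M]

def block_generator(d, j):
    """A_j for even d: sigma = [[0, 1], [-1, 0]] in the j-th 2 x 2 block.
    Tensor index 0 is the constant component; block j uses indices 2j-1, 2j."""
    M = zeros(d + 1)
    r = 2 * j - 1
    M[r][r + 1] = 1
    M[r + 1][r] = -1
    return M

def block_projector(d, j):
    """Assumed form of P_j: diagonal projector onto the j-th block."""
    M = zeros(d + 1)
    r = 2 * j - 1
    M[r][r] = 1
    M[r + 1][r + 1] = 1
    return M

def B_e1(d):
    """B := B_{e_1}, the symmetric matrix coupling index 0 and index 1."""
    M = zeros(d + 1)
    M[0][1] = 1
    M[1][0] = 1
    return M

def projector_B(d):
    """Assumed form of P_B: projector onto span{e_0, e_1}."""
    M = zeros(d + 1)
    M[0][0] = 1
    M[1][1] = 1
    return M

d = 4
A1, A2 = block_generator(d, 1), block_generator(d, 2)
P1, P2 = block_projector(d, 1), block_projector(d, 2)

check_A11 = matmul(A1, A1) == scale(-1, P1)   # A_1 A_1 = -P_1
check_A22 = matmul(A2, A2) == scale(-1, P2)   # A_2 A_2 = -P_2
check_A12 = matmul(A1, A2) == zeros(d + 1)    # A_1 A_2 = 0
check_B = matmul(B_e1(d), B_e1(d)) == projector_B(d)
print(check_A11, check_A22, check_A12, check_B)
```

All four comparisons are exact because the matrices have integer entries.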
Recall constraint (1) in the special case k = 2: […] We will now distinguish two cases for $n_A$. First, consider the case $n_A = 1$. Since our original generator X was chosen nonlocal, it follows that $n_B \ge 1$, as explained in Subsection III A. Thus, the second tensor factor in (10) must be $P_B$. We will now choose $\vec a_2 = \vec e_2$, which implies that $\begin{pmatrix} 1 \\ -\vec a_2 \end{pmatrix}^{\top} P_B \begin{pmatrix} 1 \\ \vec a_2 \end{pmatrix} = 1$. But then we may still choose $\vec b_1, \vec a_1$ arbitrarily, and by choosing these two unit vectors suitably from the subspace $\mathrm{Im}(P_{j_1})$, we may generate an arbitrary sign for […] Thus, we can break constraint (11) by a suitable choice of these two unit vectors, which yields a contradiction. Second, suppose that $n_A \ge 3$ (we already know that $n_A$ must be odd). Then we can choose $\vec a_2$ such that $\begin{pmatrix} 1 \\ -\vec a_2 \end{pmatrix}^{\top} P_{j_2} \begin{pmatrix} 1 \\ \vec a_2 \end{pmatrix} = -1$. We have even more freedom than in the previous case: for all $i \in [1, n_A] \setminus \{2\}$, we can choose $\vec b_i, \vec a_i$ from the subspace $\mathrm{Im}(P_{j_i})$ such that we get an arbitrary sign for every […] This also leads to a violation of constraint (11), and we obtain a contradiction as well.
This means that our initial assumption must have been wrong, namely that there exists a generator in $\mathfrak{g} \setminus \mathfrak{g}_{\rm loc}$. We conclude that instead this set must be empty, hence $\mathfrak{g} = \mathfrak{g}_{\rm loc}$. But since G is compact and connected, it follows from [48, Theorem VII.2.2 (v)] that G cannot be larger than $G_{\rm loc}$. This proves our main result, Theorem 1, for Bloch ball dimensions d ≥ 4. The proof for d = 2 is given in Appendix A.
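The two sign evaluations quoted in this argument can be reproduced in a small example. The sketch is an illustration under assumptions: d = 4, $P_B$ is taken as the projector onto the span of $e_0$ and $e_1$, and the block projector is taken to act on the last two Bloch coordinates; these forms and all names are hypothetical choices consistent with the identities $B^2 = P_B$ and $A_j^2 = -P_j$.

```python
def nu(a):
    return [1.0] + list(a)

def diag_projector(size, indices):
    """Diagonal projector onto the given tensor indices."""
    return [[1.0 if (i == j and i in indices) else 0.0 for j in range(size)]
            for i in range(size)]

def bilinear(u, M, v):
    """u^T M v."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

d = 4
P_B = diag_projector(d + 1, {0, 1})       # assumed form of P_B
P_block2 = diag_projector(d + 1, {3, 4})  # assumed projector onto the second 2 x 2 block

# Choice a_2 = e_2: the Bloch vector is orthogonal to e_1, so only the
# constant component contributes and the value is +1.
e2 = [0.0, 1.0, 0.0, 0.0]
value_B = bilinear(nu([-x for x in e2]), P_B, nu(e2))

# Choice of a unit vector inside the block: the constant component is
# projected away and the value is -|a|^2 = -1.
a_in_block = [0.0, 0.0, 0.6, 0.8]
value_block = bilinear(nu([-x for x in a_in_block]), P_block2, nu(a_in_block))
print(value_B, value_block)
```

The opposite signs of these two values are what the case distinction on $n_A$ exploits.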

IV. CONCLUSIONS
Given a few simple properties that turn out to characterize qubit quantum computation, we have considered a natural modification: allowing the single bits to have more or fewer than the qubit's d = 3 degrees of freedom. We have analyzed the set of possible reversible transformations in the resulting theories, motivated by the conjecture [19] (and the hope) that this would reveal novel computational models that differ in interesting ways from quantum computation. Unfortunately, it turns out that the resulting models do not allow for any non-trivial reversible gates whatsoever. This reinforces earlier intuition [20] that quantum theory, or in this context quantum computation, is an "island in theoryspace".
While we have made an effort to be as careful and parsimonious in our assumptions as possible, it is still interesting to ask whether there are any remaining "loopholes" that could in principle leave some wiggle room for non-trivial beyond-quantum computation: can any of the assumptions of Subsection II B be dropped or weakened, while insisting that single bits are described by Bloch balls? We discuss several options in Appendix B; in short, the most promising (but difficult) approaches would be to drop tomographic locality, and/or to drop reversibility or continuity of transformations. Both options present formidable mathematical challenges and are thus deferred to future work.
The "rigidity" of quantum theory, i.e. the difficulty of modifying it in consistent ways, has been recognized in different contexts for a long time, see e.g. Weinberg's proposal of a nonlinear modification of quantum mechanics [49], and Gisin's subsequent discovery [50] that this modification allows for superluminal signalling. The research presented in this paper and in other work (like [51,52]) makes this intuition more rigorous by specifying which combination of principles already enforces the familiar behavior of quantum theory. These insights also illuminate our understanding of quantum computation, since they tell us which physical principles enforce its properties, and/or which other theoretical models of computation are plausibly conceivable. Finally, it is interesting to speculate that the result of this paper is indirectly related to spacetime physics. After all, it is the fact that a qubit is represented as a 3-ball $B^3$, with SO(3) as its transformation group, which allows for spin-1/2 particles that couple to rotations in three-dimensional space. Given the popularity of approaches in which spacetime emerges in some way from an underlying quantum theory [53][54][55], this observation can perhaps be regarded as more than a coincidence. In fact, it has been argued more rigorously that the structures of quantum theory and spacetime mutually constrain each other [19,47,56,57,58]. This suggests a slogan that also fits some other ideas from quantum information [59]: the limits of computation are the limits of our world.
It turns out that this map leaves not only $\mathcal{I}$ but also $\mathcal{A}$ (which is now one-dimensional) invariant. Since it still annihilates $\mathcal{B}$, it is the orthogonal projector onto $\mathcal{A} \oplus \mathcal{I}$. We can still use $\Phi_B := 1 - \Phi_{AI}$ as the projector onto $\mathcal{B}$, but we cannot construct a projector onto ${\rm span}(B)$ in a similar way. Now set $n_{AI} := n_A + n_I$, and reorder the gbits such that $\mathcal{A}$ comes first, and then $\mathcal{I}$, and then $\mathcal{B}$ (in contrast to the previous subsections). Next define the orthogonal projector
(9) proves that $Y \neq 0$. It also follows that $Y \in \mathfrak{g} \setminus \mathfrak{g}_{\rm loc}$ since $Y$ has non-zero overlap with $M_x$, which in turn is orthogonal to $\mathfrak{g}_{\rm loc}$. Defining
where the $\alpha_{k_1,\ldots,k_n}$ are real numbers and $m := n_{AI} + 1$. Now we will apply the first-order constraint (2) for some special choice of unit vectors $a_i$. First, fix $j_1, j_2, \ldots, j_n \in \{0, 1\}$ arbitrarily. For $i \leq n_{AI}$ set $a_i := e_1$, and for $i \geq m$ set
We obtain the following two equations
Thus $\alpha_{0,\ldots,0,j_m,\ldots,j_n} = 0$, i.e. every non-vanishing summand in (A1) contains at least one $A^{(1)}$-term. Furthermore, in the special case that $n_B = 0$, all summands with a single $A^{(1)}$-term are themselves elements of $\mathfrak{g}_{\rm loc}$, and by subtracting those elements, we obtain another non-zero generator (which we now also call $Y$) for which every non-vanishing summand has at least two $A^{(1)}$-terms.
Next we slightly generalize constraint (1):
$\ldots\, a_1, a_2, \ldots, a_n) \geq 0 \qquad \text{(A2)}$
also holds if we replace one or more of the unit vectors $b_j$, $a_j$, but not $\pm a_k$, by the zero vector.
Proof. We start with constraint (1), where all vectors are assumed to be unit vectors. To replace, for example, $b_j$ (for $j \neq k$) by $0$, consider (1) and its version with $b_j$ replaced by $-b_j$. Adding up the two inequalities (and dividing the result by two) proves (A2) for $b_j = 0$. We can similarly replace any of the $a_j$ (for $j \neq k$) by $0$, and do so recursively.
Now we are ready to state and prove the main result of the appendix:
Proof. Our strategy is to prove the following claim:
Claim: Let $0 \leq \ell \leq n_{AI}$ be an integer. Then $Y$ does not contain any summand in (A1) which has exactly $\ell$ occurrences of $A^{(0)}$. In more formal words, if $j_1, \ldots, j_n$ has the property that $\#\{i \in [1, n_{AI}] \mid j_i = 0\} = \ell$, then $\alpha_{j_1,\ldots,j_n} = 0$.
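The averaging step in the proof of (A2) can be checked numerically. The sketch below assumes that $\nu(b_1, \ldots, b_n)$ is the tensor product of the local vectors $(1, b_i)$, which is how the expressions in this section read; the function names `nu` and `form`, the random matrix `M` standing in for the squared generator, and the sizes $d = 4$, $n = 3$ are hypothetical choices made only for illustration.

```python
from functools import reduce

import numpy as np

def nu(vectors):
    """Tensor (Kronecker) product of the local vectors (1, v_i). Assumed form of nu."""
    return reduce(np.kron, [np.concatenate(([1.0], v)) for v in vectors])

def form(left, right, M):
    """Bilinear form nu(left)^T M nu(right)."""
    return float(nu(left) @ M @ nu(right))

rng = np.random.default_rng(0)
d, n = 4, 3  # hypothetical Bloch-ball dimension and number of sites

def random_unit_vector():
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

bs = [random_unit_vector() for _ in range(n)]
avs = [random_unit_vector() for _ in range(n)]
M = rng.normal(size=((d + 1) ** n, (d + 1) ** n))  # stand-in for the squared generator

j = 1  # site whose vector b_j is replaced
flipped = list(bs)
flipped[j] = -bs[j]
zeroed = list(bs)
zeroed[j] = np.zeros(d)

# Averaging the expressions for b_j and -b_j gives the expression for b_j = 0,
# because nu is affine in each local vector.
average = 0.5 * (form(bs, avs, M) + form(flipped, avs, M))
averaging_works = bool(np.isclose(average, form(zeroed, avs, M)))

print(averaging_works)
```

Since both averaged expressions are non-negative whenever constraint (1) holds for unit vectors, their average is non-negative as well, which is the content of the lemma for a single replaced vector.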
This claim will then imply that $Y = 0$, which is a contradiction (we have shown further above that $Y \neq 0$). We will prove this claim for two different cases separately; in both cases, our proof will be by induction. Note that we have already shown the claim above for $\ell = n_{AI}$ (since there must be at least one $A^{(1)}$-term in every summand).
Case 1: $n_B = 0$ (such that $n_{AI} = n$).
Induction start: We know the claim is true for $\ell = n$. Furthermore, since $n_B = 0$, we have constructed $Y$ such that no summand contains exactly one $A^{(1)}$-term, hence the claim is also true for $\ell = n - 1$.
Induction hypothesis: Consider an arbitrary integer $\ell$ with $0 \leq \ell \leq n - 2$. Let us assume that for any integer $\ell'$ with $0 \leq \ell' \leq n$ and $\ell' > \ell$ we know that $Y$ contains no summand with exactly $\ell'$ occurrences of $A^{(0)}$.
Induction step: Using the induction hypothesis, we will now show that the Claim also holds for $\ell$ itself. We do so by contradiction. Suppose there was at least one non-vanishing summand in $Y$ with exactly $\ell$ occurrences of $A^{(0)}$. That is, there exist $j_1^0, \ldots, j_n^0$ such that $\alpha_{j_1^0,\ldots,j_n^0} \neq 0$ and exactly $\ell$ of the $j_i^0$ are equal to zero. We will apply constraint (A2) for some choice of vectors $a_i, b_i$. To this end, for every $i$ with
Now consider $w := Y^2\, \nu(a_1, \ldots, a_n)$. If a summand of $Y^2$ has fewer than $\ell$ indices $k_i$ with $k_i = 0$ then it does not contribute to $w$; also, there are no summands with more than $\ell$ indices $k_i$ with $k_i = 0$. Among those summands with exactly $\ell$ indices $k_i$ with $k_i = 0$, these indices must occur in exactly those places $i$ where $j_i^0 = 0$, otherwise those summands do not contribute to $w$. But this enforces that only the summand with $(k_1, \ldots, k_n) = (j_1, \ldots, j_n) = (j_1^0, \ldots, j_n^0)$ contributes to $w$, and we get
$Y^2\, \nu(a_1, \ldots, a_n) = \alpha^2 \cdots$
There are at least two indices $z$ with $j_z^0 = 1$; let $k$ be one of those indices, and define $a_k := e_1$. Then $(1, -a_k)\, \big(A^{(j_k^0)}\big)^2\, (1, a_k)^{\top} = 1$. Among the remaining places $z$ with $j_z^0 = 1$, we can choose $a_z$ and $b_z$ such that the corresponding factor takes any sign we like. This will allow us to violate constraint (A2), and we have a contradiction.
Case 2: $n_B \geq 1$.
Induction start: We have already shown the claim for $\ell = n_{AI}$.
Induction hypothesis: Consider an arbitrary integer $\ell$ with $0 \leq \ell \leq n_{AI} - 1$. Let us assume that for any integer $\ell'$ with $0 \leq \ell' \leq n_{AI}$ and $\ell' > \ell$ we know that $Y$ contains no summand with exactly $\ell'$ occurrences of $A^{(0)}$.
Induction step: We proceed similarly as in Case 1. Using the induction hypothesis, we will now show that the Claim also holds for $\ell$ itself.
We do so by contradiction. Suppose there was at least one non-vanishing summand in $Y$ with exactly $\ell$ occurrences of $A^{(0)}$. That is, there exist $j_1^0, \ldots, j_n^0$ such that $\alpha_{j_1^0,\ldots,j_n^0} \neq 0$ and exactly $\ell$ of the $j_i^0$ among $i \in [1, n_{AI}]$ are equal to zero. We will apply constraint (A2) for some choice of vectors $a_i, b_i$. To this end, for every $i$ with $j_i^0 = 0$ set $a_i := 0$ and choose $b_i$ arbitrarily. For those $i$, it follows that
In Case 2, $Y^2$ is of the form
Again, we have to choose which place corresponds to the $k$ in constraint (A2). This time, we will choose $k = m$, and set $a_k = e_1$ if $j_k^0 = 1$ and $a_k = e_2$ if $j_k^0 = 0$, which implies
Regardless of how we choose the remaining $a_i$, we obtain
$\nu(b_1, b_2, \ldots, b_{k-1}, -a_k, b_{k+1}, \ldots, b_n)^{\top}\, Y^2\, \nu(a_1, a_2, \ldots, a_n) = \sum_{j_1,\ldots,j_n=0}^{1} \sum_{k_1,\ldots,k_n=0}^{1} \alpha_{j_1,\ldots,j_n}\, \alpha_{k_1,\ldots,k_n} \cdots \qquad \text{(A4)}$
Consider the different possibilities for $k_1, \ldots, k_n$ for which $\alpha_{k_1,\ldots,k_{n_{AI}},j_m^0,\ldots,j_n^0} \neq 0$. There are at most $\ell$ occurrences of $k_i$ ($1 \leq i \leq n_{AI}$) with $k_i = 0$. If there are fewer, then (A3) implies that the final product in (A4) vanishes, hence the corresponding summand does not contribute to (A4). On the other hand, if there are exactly $\ell$, then (A3) implies that this product vanishes unless the occurrences of $k_i = 0$ agree with the occurrences of $j_i^0 = 0$. A similar argument works for the $j_1, \ldots, j_n$, and if we also use (A2), we finally get