Abstract
We present a heuristic derivation of Born’s rule and unitary transforms in Quantum Mechanics, from a simple set of axioms built upon a physical phenomenology of quantization. This approach naturally leads to the usual quantum formalism, within a new realistic conceptual framework that is discussed in details. Physically, the structure of Quantum Mechanics appears as a result of the interplay between the quantized number of “modalities” accessible to a quantum system, and the continuum of “contexts” that are required to define these modalities. Mathematically, the Hilbert space structure appears as a consequence of a specific “extracontextuality” of modalities, closely related to the hypothesis of Gleason’s theorem, and consistent with its conclusions.
Introduction
A series of recent experimental tests of Bell’s theorem^{1,2,3} have been said to close the door on Einstein’s and Bohr quantum debate^{4}. It is generally considered that Einstein lost the case, by advocating a notion of “local realism” incompatible with quantum mechanics (QM)^{5}. However, Bohr^{6} also presented himself as a realist as far as physics is concerned, and QM has no direct conflict with relativistic causality. One may thus wonder whether a deep – but philosophically sound – redefinition of physical reality might provide a way to reconcile the founding fathers of quantum physics. In refs 7, 8, 9 and 10 we argued that this can be done, under the condition that fully predictable physical properties (called “elements of physical reality” in ref. 5) are attached not to a system alone, but to a system within a given experimental context^{6}.
In this paper, we further exploit this idea, in order to present a heuristic derivation of the quantum formalism, understood as a nonclassical way to calculate probabilities. An outstanding feature of our approach is that the superposition principle and Born’s rule appear as consequences of the quantized number of states accessible to a quantum system, without any appeal to “wave functions” – but we do recover projective measurements. Our approach bears some relationship with Gleason’s theorem^{11,12,13}, as it will be discussed below (see also Methods).
We shall start without formalism, but from a few definitions and hypotheses, presented here as axioms. These axioms are based on standard quantum phenomenology, and they have been introduced and discussed in refs 7 and 8 under the acronym “CSM”, meaning Context, System, Modality. In the present article we will not repeat this discussion, but rather use the the following axioms to summarize the main features of our approach. Though the formulation of the axioms contains very little mathematics, they have deep mathematical consequences, that will be spelled out in the “Results” section below.
Axiom 1 (modalities): (i) Given a physical system, a modality is defined as the values of a complete set of physical quantities that can be predicted with certainty and measured repeatedly on this system. (ii) Here a “complete set” means the largest possible set compatible with certainty and repeatability, for all possible modalities attached to this set. This complete set of physical quantities is called a context, and the modality is attributed jointly to the system and the context. (iii) Modalities cannot show up independently of a context, but the same modality may appear in different contexts, with the same conditions of repeatability and certainty.
Axiom 2 (quantization): (i) For a given context, there exist N distinguishable modalities {u_{i}}, that are mutually exclusive: if one modality is true, or realized, the others are wrong, or not realized. (ii) The value of N, called the dimension, is a characteristic property of a given quantum system, and is the same in all relevant contexts.
Axiom 3 (changing contexts): Given axioms 1 and 2, the different contexts relative to a given quantum system are related between themselves by continuous transformations g which are associative, have a neutral element (no change), and an inverse. Therefore the set of context transformations g has the structure of a continuous group .
For the sake of clarity, we note that, within the usual QM formalism (not used so far), a modality and a context correspond respectively to a pure quantum state, and to a complete set of commuting observables. The axioms are formulated for a finite N, but this restriction will be lifted below. Intuitively, as discussed in details in ref. 7, a context can be seen as a given “knob settings” of the measurement apparatus. We will not repeat this discussion here, but we want to consider the following question: it is postulated in Axiom 2 that there are N mutually exclusive modalities associated to each given context, but there are many more modalities, corresponding to all possible contexts, related according to Axiom 3. These modalities are generally not mutually exclusive, but are incompatible: it means that if one is true, one cannot tell whether the other one is true or wrong. Then, how to relate between themselves all these modalities?
A first crucial result already established in ref. 7 is that this connection can only be a probabilistic one, otherwise the axioms would be violated; the argument is as follows. Let us consider a single system, two different contexts C_{u} and C_{v}, and the associated modalities u_{i} and v_{j}, where i and j go from 1 to N. The quantization principle (Axiom 2) forbids to gather all the modalities u_{i} and v_{j} in a single set of more than N mutually exclusive modalities, since their number is fixed to N. Therefore the only relevant question to be answered by the theory is: If the initial modality is u_{i} in context C_{u}, what is the conditional probability for obtaining modality v_{j} when the context is changed from C_{u} to C_{v}? We emphasize that this probabilistic description is the unavoidable consequence of the impossibility to define a unique context making all modalities mutually exclusive, as it would be done in classical physics. It appears therefore as a joint consequence of the above Axioms 1 and 2, i.e. that modalities are quantized, and require a context to be defined.
Now, according to Axiom 3, changing the context results from changing the measurement apparatus at the macroscopic level, that is, “turning knobs”. A typical example is changing the orientations of a SternGerlach magnet. These context transformations have the mathematical structure of a continuous group, denoted : the combination of several transformations is associative and gives a new transformation, there is a neutral element (the identity), and each transformation has an inverse. Generally this group is not commutative: for instance, the threedimensional rotations associated with the orientations of a SternGerlach magnet do not commute. For a given context, there is a given set of N mutually exclusive modalities, denoted {u_{i}}. By changing the context, one obtains N other mutually exclusive modalities, denoted {v_{j}}, and one needs to build up a mathematical formalism, able to provide the probability that a given initial modality u_{i} ends up in a new modality v_{j}.
The standard approach at this point is to postulate that each modality u_{i} is associated with a vector u_{i}〉 in a Ndimensional Hilbert space, and that the set of N mutually exclusive modalities in a given context is associated to a set of N orthonormal vectors. Rather than vectors u_{i}〉 and v_{j}〉, one can equivalently use rankone projectors and , and Born’s rule giving the conditional probability p(v_{j}u_{i}) can be written as
In this article, we will postulate neither Born’s rule nor even Hilbert spaces, but we will derive them as the consequence of the previous Axioms. Then we will discuss the relation with Gleason’s theorem, as well as some consequences of our approach.
Results
In this part, we start from Axioms 1–3 and construct a consistent probability theory, by imposing some requirements on what it should describe. The first steps will thus be to translate the Axioms into mathematical constraints on probabilities relating modalities. This will lead us to manipulate N × N probability matrices, in a general way not restricted to the quantum formalism. Using our Axioms to obtain physicallybased constraints, we will finally get Born’s rule and unitary transforms.
First consequence of the Axioms: the general probability matrix
Let us denote {u_{i}} and {v_{j}} the respective modalities of the initial and final context, and define the probability of finding the particular modality v_{j}, when starting from modality u_{i}. There are N^{2} such probabilities, that can be arranged in a N × N matrix , containing all probabilities connecting the N modalities in each context {u_{i}} and {v_{j}}. Since one has obviously and , the matrix Π_{vu} is said to be a stochastic matrix (see Methods for definitions).
For clarity, let us emphasize the interpretation of the conditional probability notation: in agreement with the definition of modalities as certainties, the meaning of is that “if we start (with certainty) from modality u_{i} in the old context, then the probability to get modality v_{j} in the new context is ”. These probabilities provide the connection between theoretical predictions and experiments, and correspond to relative frequencies in repeated experiments starting from the same u_{i}. It is not critical whether they are interpreted in a frequentist or Bayesian sense, but it is critical to acknowledge that they are intrinsic consequences of our axioms on quantized modalities, and thus are not associated to any “missing information”.
For N = 3, one has for instance
As we will see below, N ≥ 3 is required because some crucial properties of Π_{vu} do not show up for N = 2. Let us also define a “return” probability matrix Π_{uv}, by exchanging the roles of the initial and final contexts. The matrix Π_{uv} is stochastic like Π_{vu}, but these two matrices are a priori unrelated, whereas it is known that in standard QM, they are transpose of each other.
Second consequence of the Axioms: the extracontextuality of modalities
An essential ingredient for determining the mathematical structure of Π_{vu} is provided by a physical constraint on the probability . This is found in Axiom 1 (iii) where it was stated that “Modalities cannot show up independently of a context, but the same modality may appear in different contexts, with the same conditions of repeatability and certainty”. This means in particular that if , then v_{j} ≡ u_{i}, i.e. v_{j} and u_{i} represent the same modality, within two different contexts. This claim may seem surprising since the measured physical quantities in the two contexts can be quite different (see example in Methods); but what matters here is that the certainty and reproducibility are transmitted from one context to the other, hence the idea that the modality is conserved. This has also a major mathematical consequence, which is that when , we will require that the same mathematical object is associated with the modality (v_{j} or u_{i}) in the two contexts.
More generally, and again in agreement with the physical reality of modalities, we will require that the probability depends only on the particular modalities u_{i} and v_{j} being considered, and not on the whole contexts in which they are embedded. Importantly, to build our formalism, we shall apply this requirement not only to the value of , but also to its mathematical expression; how to do that will be spelled out below. This property will be called “extracontextuality” (see relation with other works in Methods), and it means also that a modality can be defined independently of the (N − 1) other modalities which appear in a given context. Such an extracontextuality is fully compatible with contextual objectivity^{7,9,10}: the latter states that a modality needs a context to be defined, whereas the former tells that the same modality can show up in several contexts (a simple example is given in Methods, as well as an interesting link with a proof by John Bell^{14}).
The mathematical translation of the Axioms
Summarizing the previous discussions, we want to calculate probabilities relating physical events called modalities, occurring for a given physical system in a given physical context. Given the physical system, the rules are:
For any given physical context, there are exactly N mutually exclusive modalities. As a consequence, the N^{2} probabilities connecting the N modalities {u_{i}} (resp. {v_{j}}) in two different contexts can be arranged in a N × N stochastic matrix Π_{vu}. A similar “return” probability matrix Π_{uv} is defined by exchanging the roles of the initial and final contexts, and the set of context transformations has the structure of a continuous group.
If , then v_{j} and u_{i} are the same modality, and will be associated with the same mathematical object. This rule applies within a given context, where one has . If all probabilities are either zero or one between two different contexts, one will say that this is the same context, up to relabelling the modalities.
Extracontextuality constraint (ECC): the probability depends only on the two modalities u_{i} and v_{j} being considered, and not on the whole contexts in which they are embedded. In mathematical terms, depends only on the two mathematical objects associated with the two modalities.
These rules have obtained from the Axioms, though not by a fully formal deduction. They may thus be considered as additional principles, deduced from the nonmathematical Axioms, and leading to exploitable mathematical consequences.
From there, the main idea of our derivation is the following: we will first write a general parametrization of stochastic matrices, which is mathematically and physically neutral, i.e., it is just a rewriting. Nevertheless, this parametrization provides a simple criterion for the stochastic matrix to be unistochastic, i.e. that its coefficients are the square moduli of the coefficients of a unitary matrix (see definitions in Methods). Then we will translate the extracontextuality constraint into an equation, from which we will show that the matrices Π_{vu} and Π_{uv} are unistochastic. Finally the usual formalism of quantum mechanics (Born’s rule, unitary transforms, link between Π_{vu} and Π_{uv}) will follow automatically.
Mathematical lemmas on stochastic matrices
The theorems below are valid both for Π_{uv} and Π_{vu}, so u and v will be omitted whenever clarity allows.
Lemma 1: The elements p_{ji} of a N × N stochastic matrix can always be written under the general form
where and are two sets of N hermitian projectors of dimension N × N, mutually orthogonal within each set, and where R is a real nonnegative diagonal matrix such that Tr(R^{2}) = N, and Tr( R^{2}) = 1 for all projectors within .
Proof. Let us first introduce the orthogonal (N × N) projectors P_{i}, that are zero everywhere, except for the i^{th} term on the diagonal that is equal to 1; one has obviously P_{i}P_{j} = P_{i}δ_{ij}. A useful operation is then to extract the particular probability from Π_{vu}, or from Π_{uv}, and one has the following identities:
where Tr is the Trace, ^{†} is the Hermitian conjugate, and
are N × N matrices formed by square roots of the probabilities, and by arbitrary phase factors which are introduced here for the sake of generality, and cancel out when calculating the matrices Π_{vu} and Π_{uv}.
From Eqs (3,4,5) the elements p_{ji} of a general stochastic matrix Π can be written as (the subcripts uv or vu are omitted for simplicity):
Now, according to the singular values theorem (see Methods), there must exist two unitary matrices U and V, and a real diagonal matrix R, such that
where the diagonal values of R are the square roots of the (real) eigenvalues of ΣΣ^{†}, equal to those of Σ^{†}Σ, and are called the singular values of Σ (see proof in Methods). The value of Tr(R^{2}) is the sum of the square moduli of all the coefficients of Σ, and is therefore equal to N.
Using Eqs (6 and 7) p_{ji} can now be written as:
where { = U^{†}P_{i}U} and { = V^{†}P_{j}V} are two sets of projectors, mutually orthogonal within each set. Finally, the normalization condition ∑_{j}p_{ji} = 1 implies that
hence the diagonal elements of R^{2} in the basis associated with the projectors {} are all equal to one. ◻
Let us show now that very different situations occur, depending on the fact that the matrix R is (or is not) equal to the identity matrix . This is related to:
Lemma 2: The matrix Σ is unitary iff .
Proof. Eq. (7) shows that Σ is unitary if , and if Σ is unitary then and , so . ◻
An important corollary is that the matrix Π is unistochastic if . The reciprocal is not true, because Π being unistochastic does not imply that any matrix Σ defined by Eq. (5) is unitary (the phases may be wrong).
An obvious consequence of Lemma 2 is that if for all possible pairs of contexts, then the matrix Π is unistochastic for all pairs of context; we will show below that this corresponds to the usual quantum formalism. The opposite case is that for some pairs of contexts, but we will show that this contradicts our basic constraint that should depend only on the particular modalities u_{i} and v_{j} being considered. First let us establish the following mathematical Lemma:
Lemma 3: If then = 0, where is the determinant of the unistochastic matrix obtained from the square moduli of the coefficients of U introduced in Lemma 1.
Proof. Let us consider the normalization conditions Tr( R^{2}) = 1 obtained from Eq. (10), where the N projectors = U^{†}P_{k}U correspond to the initial context. Since one has also Tr() = 1, the N homogeneous equations must be verified by the N diagonal values of R^{2}. This set of equations admits a nontrivial solution if its determinant is equal to zero, and it is easy to check that is the determinant of the unistochastic matrix obtained from U. ◻
Summarizing the previous results, Lemma 1 tells us that for any stochastic matrix Π, one can parametrize the probabilities p_{ji} by using the diagonal matrix R and two sets of projectors and . Then according to Lemmas 2 and 3, two situations are possible:
 either for all pairs of contexts, and the matrix Π is always unistochastic as shown in Lemma 2.
 or for some pairs of contexts, and a stochastic (but generally not unistochastic) matrix Π is obtained for appropriate projectors, with = 0 as shown in Lemma 3.
The Fundamental Theorem
We are now in a position to use the extracontextuality constraint (ECC), which says that the expression of p_{ji} = Tr( R R) should depend only on the particular modalities u_{i} and v_{j} being considered, and not on the whole contexts in which they are embedded. A first step is the following Lemma:
Lemma 4: Given a N dimensional system, each context must be associated with a set of N mutually orthogonal projectors, each projector corresponding to one of the N mutually exclusive modalities.
Proof. In the case where the initial and final contexts are the same, then , Σ is unitary and diagonal, and . From its definition V can be any unitary matrix, and U = ΣV, so the two sets of projectors {} and {} are identical, and are associated with the current context. In addition, since Lemma 2 gives p_{ji} = Tr() = δ_{ij}, each modality u_{k} (k = i or j) must be associated with a projector of the set {} corresponding to the current context. ◻
For N ≥ 3, these N projectors may be part of other orthogonal sets, and the corresponding modalities may be part of other contexts. Again for consistency with the ECC, we will require that the same projector always corresponds to the same modality. This will extend first to all contexts containing one (or several) of the initial modalities, giving new projectors and new modalities, and then to the whole space of all N × N projectors, which will thus be associated to all possible modalities. This association has to be consistent when the contexts are changed; this will be discussed in eqs (14 and 15).
Let us emphasize that at this point we don’t have QM yet; in some sense, we have justified the Hilbert space framework of Gleason’s theorem, as the space of N × N projectors, but we still miss the main hypothesis and the result of the theorem, i.e. Born’s law. More precisely, we have justified that and depend solely on u_{i} and v_{j} in eq. (2); however, it is still possible that R depends on the whole contexts C_{u} and C_{v} in which u_{i} and v_{j} are embedded, and not on these two modalities only.
So we will use again the ECC to require that not only and but also R depend solely on u_{i} and v_{j}; more explicitly, this can be written:
where depends only on the two specific modalities u_{i}, v_{j} associated with the projectors , and not on the contexts in which they are embedded.
Fundamental Theorem: If each modality is bijectively associated with a rankone projector, and if p_{ji} is given by Eq. (11), then for all pairs of contexts.
Proof. Let us assume that ; then one has = 0 according to Lemma 3. But one can change the initial context by choosing new projectors to replace the {}, keeping for the modality i of interest, whereas the other projectors are different, but still mutually orthogonal (this is possible only if N ≥ 3). Then will generally not be zero any more (see Methods), so the assumption is not acceptable. More generally, the same reasoning is valid for any pair of modalities u_{i}, v_{j} therefore one has for all pairs of modalities, and also for all pairs of contexts. ◻
Intuitively, the theorem says that the resulting formula provides the unique way to express the coefficients p_{ji} of a stochastic matrix as a function of the sole modalities u_{i} and v_{j}, satisfying the ECC as expressed by Eq. (11). As we will show now Born’s formula and unitary transforms directly follow from this result.
Unitary matrices and Born’s formula
From now on we will take according to the previous Theorem. Therefore the matrix Σ_{vu} = UV^{†} is unitary, but one may wonder whether orthogonal (real) matrices might be enough. In order to justify that the full unitary set is required, we will use Axiom 3, telling that the change of contexts corresponds to a continuous group, to require that the set of matrices Σ_{vu} is connected in a topological sense, and contains the identity matrix. This set must contain permutation matrices, because they correspond simply to “relabelling” the modalities, i.e. to a trivial change of context. One has then
Lemma 5: If the set of matrices Σ_{vu}, including permutation matrices, is connected in a topological sense, and contains the identity matrix, then the matrices Σ_{vu} must be complex unitary matrices.
Proof. The set of real orthogonal matrices is topologically disconnected in two parts with determinant +1 and −1, whereas permutation matrices may have determinant −1, and the identity has determinant +1. On the other hand, all (complex) unitary matrices are connected to the identity, hence the result (see also refs 15 and 16). ◻
We are thus lead to the conclusion that Σ_{vu} must be a unitary matrix S_{vu}, with . Then Eqs (3) for picking up a particular probability become:
As said above Π_{vu} is unistochastic, and we can define
It is clear that these operators are all Hermitian projectors, i.e. one has P^{†} = P and P^{2} = P for each of them, and also that all sets {} and {} have the same orthogonality properties as the initial set of projectors {P_{i}}, i.e. P_{i}P_{j} = P_{i}δ_{ij}. One can thus rewrite Eq. (3) as:
which is just Born’s formula (Eq. 1). Eqs (12 and 13) are consistent with our initial requirement associating a projector with a modality in any context, but make clear that this association is up to a global unitary transform, related to the choice of a fiducial context. In particular, there are two possible choices for the matrix Π_{vu}:
One can now come back to the matrix Π_{uv}, for which the same reasoning is valid, and leads to a unitary matrix S_{uv}. By reverting the contexts one has thus:
Again, the projectors should be the same for a given modality in a given context, i.e. one should have (for the same P_{i} in the other context), and (for the same P_{j} in the other context). This is obtained if S_{uv} is the inverse of S_{vu}, leading to a last lemma:
Lemma 6: If and as defined above, then (up to global phase factors), and the matrices Π_{uv} and Π_{vu} are related by .
Proof. Obvious from the relations (14) and (15). ◻
Then the various points of view represented in the relations (14, 15) are all consistent and give the same values for the probabilities, because each S_{vu} can be associated to an element of the group of context transformations , and its inverse is . For the general consistency of the approach including Axiom 3, this set of matrices gives a N × N (projective) representation of the group of context transformations; this is fully consistent with the well known Wigner theorem^{17}. This continuous unitary evolution will be essential to describe the evolution of the system (translation in time)^{17}. Since we have now reached the starting point of most QM textbooks^{18}, it should be clear that the standard structure of QM can be obtained from this construction. In particular, one can associate the N orthogonal projectors {P_{i}} to the N orthonormal vectors which are eigenstates of these projectors up to a phase factor, i.e., to rays in the Hilbert space. Similarly, the expected probability law for the measurement results {a_{i}} will be obtained by writing any physical quantity A as an operator , this is the usual spectral theorem. The tensor product structure for composite systems can also be introduced in the usual way (see Methods).
Discussion
An interesting outcome of our derivation is that the usual Hilbert space structure (for N × N matrices) shows up, without any initial assumption of a superposition principle, interference effect, or wave function^{19,20,21,22,23,24,25,26}. This structure comes directly from requirements on probabilities, implying that the matrices Π_{uv} and Π_{vu} belong to the unistochastic subset of stochastic matrices. This appears as the mathematical consequence of the joint physical requirements of contextuality of the theory (contexts are needed to define modalities), quantization of modalities (making probabilities necessary), and extracontextuality of modalities (probabilities depend on modalities, that may belong to different contexts).
Extracontextuality is also a crucial hypothesis for Gleason’s theorem^{11,12}, which is deeply related to our derivation; however, the reasonings proceed in quite different ways. The Hilbert space structure is an assumption in Gleason’s theorem, whereas in our case it appears more heuristically. Rather than reconstructing “from scratch” the Trace formula, as done by Gleason, we introduce it as a general parametrization of stochastic matrices; this avoids the heavy machinery of Gleason’s theorem^{11,12}, in particular the demonstration of continuity (see Methods). Then we use extracontextuality to restrict acceptable matrices to unistochastic ones, ending up again with Born’s formula in finite dimension. Using explicitly Gleason’s theorem is also possible^{13}, and has the advantage of lifting the restriction on a finite N (see Methods).
We emphasize that we do not need any additional “measurement postulate”, since measurement is already included in Axiom 1, i.e. in the very definition of a modality^{7,8,9,10}. Quantum superposition are here as usual, but they are not spooky “deadandalive” concepts: they are rather the manifestation of a modality (i.e., a certainty) in another context. Entanglement is also present as linear superpositions of tensor product states, corresponding to modalities in a “joint” context, and the specific case of twoparticle BellEPR experiment is discussed in ref. 8. Since a modality requires both a context and a system, it embeds nonlocal features corresponding to quantum nonlocality, but it is fully compatible with relativistic causality^{7,8}, and operationally agrees with nosignaling, just like QM does. From a foundational point of view, our approach also provides a clear distinction between the modality, which is a real physical phenomenon, or a physical event in the sense of probability theory, and the projector, which is a mathematical tool for calculating nonclassical probabilities.
To conclude, let us emphasize that we discussed a very idealized version of QM, based on pure states and orthogonal measurements. Nevertheless, this idealized version does provide the basic quantum framework, and connects the experimental definition of a physical quantity and the measurement results in a consistent way, both physically and philosophically^{7}. Adding more refined tools such as density matrices, imperfect measurements, POVM, open systems, decoherence, is of great practical interest and use, but this will not “soften” the basic ontology of the theory, as it is presented here. The present work, deeply rooted in ontology, is thus complementary to many recent related proposals^{19,20,21,22,23,24,25,26,27,28,29,30,31,32,33}.
Methods
Stochastic matrices
A stochastic matrix has real positive coefficients, with all lines summing up to 1. Bistochastic matrices are stochastic ones, with both lines and columns summing to 1. Orthostochastic and unistochastic matrices are obtained by taking the square moduli of the coefficients of respectively an orthogonal or a unitary matrix^{34}. For N = 2, all bistochastic matrices are also ortho and unistochastic, and for N ≥ 3, the set of unistochastic matrices is larger than the orthostochastic set, but smaller than the bistochastic set. For instance, the simple matrix
is a well known example of a bistochastic matrix, which is neither orthostochastic, nor unistochastic; therefore it is not an acceptable (quantum) probability matrix Π_{vu}.
Singular values theorem and the invariance of R
To obtain the singular values decomposition, diagonalize the Hermitian matrix Σ^{†}Σ, get the real diagonal matrix R and the unitary matrix V so that V^{†}Σ^{†}ΣV = R^{2}. Then define another unitary matrix U such that UR = ΣV, and RU^{†} = V^{†}Σ^{†}, so that U^{†}ΣΣ^{†}U = R^{2}. One gets thus the decomposition Σ = URV^{†} as expected.
In the demonstration of the fundamental theorem, we note that one might restrict the new projectors to be such that . Then R can be different of , but it has still to be modified to some to fulfill the normalization conditions with the . Since the hypotheses is that R should be constant, this case is excluded also.
One may also wonder what would happen if no phase factors were included in the definition of Σ. Then the Lemmas are still valid, but Σ cannot be unitary, and not even orthogonal. Then according to Lemma 2, R cannot be the identity, and therefore the extracontextuality constraint cannot be satisfied.
Extracontextuality and Gleason’s theorem
Extracontextuality is not a new concept, but it is a new name given to a known concept, called noncontextuality in articles dealing with Gleason’s theorem^{12}, or “measurement noncontextuality” in ref. 31. Extracontextuality is not the contrary of contextuality, and it avoids confusion arising when using “noncontextuality”. In particular, contexts are needed to define modalities, and modalities are extracontextual, without any contradiction with refs 7, 9, 10 and 13 or with the KochenSpecker theorem. As a simple example of extracontextuality, consider a system of two spin 1/2 particles, and define . Using standard notations for coupled and uncoupled basis, the m_{1} = 1/2, m_{2} = 1/2〉 modality in the context {S_{z1}, S_{z2}} is the same as the S = 1, m_{S} = 1〉 modality in the context , though other modalities in the same two contexts are different.
Demonstrating continuity of the probability formula is an essential step of Gleason’s theorem. In our derivation continuity appears formally in Axiom 3, and is embedded in the matrix formalism that we are using. It is used in the Fundamental theorem to to build a new set of projectors , keeping one of them constant, and in Lemma 5 to get complex unitary matrices. So it does play a role, but does not have to be demonstrated. The explicit use of Gleason’s theorem for allowing the dimension N to be infinite is spelled out in ref. 13. This requires to introduce an Axiom 4 associating modalities and projectors in an Hilbert space; such an Axiom is not formally required in the present heuristic derivation, but it provides a useful “backup”.
It is interesting to note that John Bell demonstrated explicitly in ref. 14 (Section V) that if extracontextuality is accepted as it is done in the hypothesis of Gleason’s theorem, then the impossibility of hidden variables (HV) automatically follows (more technically, Bell showed that there is no dispersionfree state). Then he wrote at the end of his proof : It was tacitly assumed that measurement of an observable must yield the same value independently of what other (commuting) measurements may be made simultaneously. Thus as well as P(Φ_{3}) say (projector on vector Φ_{3}), one might measure either P(Φ_{2}) or P(Ψ_{2}), where Φ_{2} and Ψ_{2} are orthogonal to Φ_{3} but not to one another. These different possibilities require different experiment arrangements; there is no a priori reason to believe that the results for P(Φ_{3}) should be the same. So Bell rejected the “tacit assumption”, and therefore also the conclusion that there is no dispersionfree states. But extracontextuality, seen as a consequence of the reality of modalities, may provide the missing “a priori reason” to accept the assumption, and thus also Bell’s proof.
Obviously Bell’s goal was very different from ours, since he was investigating the possibility of HV, whereas we want to recover the quantum formalism. Nevertheless, we conclude that if we accept extracontextuality, then the quantum formalism follows (from the present article), and consistently with this result, HV are excluded (from ref. 14).
Relations with textbook quantum mechanics
In this section we outline various issues relating our approach to standard QM. First, we considered only pure states and orthogonal (projective) measurements. This fits with the usual view that mixed states (density matrices) and nonorthogonal measurements (POVM) correspond to more classical aspects of probabilities, and can be introduced at a later stage. This is possible because in each context, a classical probability distribution can be built upon the N mutually exclusive modalities.
In our approach entanglement appears naturally in the following way: let us consider two systems 1 and 2 with N_{1} and N_{2} mutually exclusive modalities. If both systems are considered together, but each one in its own context, there are clearly N = N_{1} × N_{2} mutually exclusive modalities. But from Axiom 2, the value of N does not depend on the context, so the global system must be described by N × N projectors and unitary matrices. Many of these projectors cannot be split into projectors acting separately on system 1 or system 2, and are associated to entangled states.
In this article the postulate on time evolution is not spelled out, but it enters in the same framework, by including translations in time in the group . For instance, if is the Galileo group, standard nonrelativistic QM can be recovered, including Schrödinger’s equation^{17}. Also, we did not discuss the known connection between the physical quantities and the infinitesimal generators of , or the role of “projective” representations. Finally, we considered only nonrelativistic quantum mechanics, and therefore “type I” von Neumann algebra, see e.g. ref. 35.
Additional Information
How to cite this article: Auffèves, A. and Grangier, P. Recovering the quantum formalism from physically realist axioms. Sci. Rep. 7, 43365; doi: 10.1038/srep43365 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Hensen, B. et al. “Loopholefree Bell Inequality Violation Using Electron Spins Separated by 1.3 Kilometres”. Nature 526, 682 (2015).
 2.
Giustina, M. et al. “SignificantLoopholeFree Test of Bell’s Theorem with Entangled Photons”. Phys. Rev. Lett. 115, 250401 (2015).
 3.
Shalm, L. K. et al. “Strong LoopholeFree Test of Local Realism”. Phys. Rev. Lett. 115, 250402 (2015).
 4.
Aspect, A. “Closing the Door on Einstein and Bohr’s Quantum Debate”. Physics 8, 123 (2015).
 5.
Einstein, A., Podolsky, B. & Rosen, N. “Can Quantum Mechanical Description of Physical Reality Be Considered Complete?”. Phys. Rev. 47, 777 (1935).
 6.
Bohr, N. “Can QuantumMechanical Description of Physical Reality be Considered Complete?”. Phys. Rev. 48, 696 (1935).
 7.
Auffèves, A. & Grangier, P. “Contexts, Systems and Modalities: a new ontology for quantum mechanics”. Found. Phys. 46, 121 (2016), eprint arXiv:1409.2120 [quantph] (2014).
 8.
Auffèves, A. & Grangier, P. “Violation of Bell’s inequalities in a quantum realistic framework”. Int. J. Quantum Inform. 14, 1640002 (2016); eprint arXiv:1601.03966 [quantph] (2016).
 9.
Grangier, P. “Contextual objectivity: a realistic interpretation of quantum mechanics”. European Journal of Physics 23(3), 331 (2002); eprint arXiv:quantph/0012122 (2000).
 10.
Grangier, P. “Contextual objectivity and the quantum formalism”. International Journal of Quantum Information 3(1), 17–22 (2005); eprint arXiv:quantph/0407025 (2004).
 11.
Gleason, A. M. “Measures on the Closed Subspaces of a Hilbert Space”. J. Math. Mech. 6(6), 885–893 (1957).
 12.
Granström, H. “Gleason’s theorem”. Master Thesis, Department of Physics, Stockholm University (2006); http://kof.physto.se/theses/helenamaster.pdf.
 13.
Auffèves, A. & Grangier, P. “A simple derivation of Born’s rule with and without Gleason’s theorem”. eprint arXiv:1505.01369 [quantph] (2015).
 14.
Bell, J. S. “On the Problem of Hidden Variables in Quantum Mechanics”. Rev. Mod. Phys. 38, 447 (1966).
 15.
David, F. “A short introduction to the quantum formalism[s]”. Lectures notes for a course given at IPhTSaclay and Ecole Doctorale ED107 de Physique de la Region Parisienne, eprint arXiv:1211.5627 [mathph] (2012).
 16.
Aaronson, S. “Is quantum mechanics an island in theory space”. eprint arXiv:quantph/0401062 (2004).
 17.
Laloë, F. “Les symétries en mécanique quantique”. Cours de DEA, Ecole Normale Supérieure (1980).
 18.
CohenTannoudji, C. Diu, B. & Laloe, F. “Mécanique Quantique”. Hermann, 1977.
 19.
Hardy, L. “Why is nature described by quantum theory?” In “Science and Ultimate Reality”. pages 45–71, Cambridge University Press, 2004.
 20.
Hardy, L. Reformulating and Reconstructing Quantum Theory. eprint arXiv:1104.2066v1 (2011).
 21.
Popescu, S. & Rohrlich, D. “Causality and Nonlocality as Axioms for Quantum Mechanics”. Proceedings of the Symposium on Causality and Locality in Modern Physics and Astronomy (York University, Toronto, 1997).
 22.
Brukner, C. & Zeilinger, A. “Information Invariance and Quantum Probabilities”. Found. Phys. 39, 677 (2009).
 23.
Pawlowski, M. et al. “A new physical principle: Information Causality”. Nature 461, 1101 (2009).
 24.
Chiribella, G., D’Ariano, G. M. & Perinotti, P. “Informational derivation of quantum theory”. Phys. Rev. A 84, 012311 (2011).
 25.
Colbeck, R. & Renner, R. “No extension of quantum theory can have improved predictive power”. Nature Commun. 2, 411 (2011).
 26.
Masanes, L., Müller, M. P., Augusiak, R. & PerezGarcia, D. “Existence of an information unit as a postulate of quantum theory”. PNAS 110:41, 16373 (2013).
 27.
Zurek, W. H. “Quantum darwinism, classical reality, and the randomness of quantum jumps”. Phys. Today 67, 44–50 (October 2014).
 28.
Auffèves, A., Grangier, P., Kastner, R. E. & Zurek, H. W. “Classical selection and quantum Darwinism”. Phys. Today 68, 8–10 (May 2015).
 29.
Amaral, B., Terra Cunha, M. & Cabello, A. “The exclusivity principle forbids sets of correlations larger than the quantum set”. Phys. Rev. A 89, 030101 (2014).
 30.
Sainz, A. B. et al. “Exploring the Local Orthogonality Principle”. Phys. Rev. A 89, 032117 (2014).
 31.
Acin, A., Fritz, T., Leverrier, A. & Sainz, A. B. “A Combinatorial Approach to Nonlocality and Contextuality”. Commun. Math. Phys. 334(2), 533 (2015).
 32.
Ekert, A. “Quantum Paths” (eds Hui Khoon, N. g. & Rui, Han), 199–203 (World Scientific, 2015).
 33.
Horodecki, R., Korbicz, K. & Horodecki, P. “Quantum origins of objectivity”. Phys. Rev. A 91, 032122 (2015).
 34.
Bengtsson, I., Ericsson, A., Ku, s M., Tadej, W. & Zyczkowski, K. “Birkhoff’s polytope and unistochastic matrices, N = 3 and N = 4”. Commun. Math. Phys. 259, 307–324 (2005); eprint arXiv:math/0402325 (2004).
 35.
Rèdei, M. & Summers, S. J. “Quantum probability theory”. Studies in the History and Philosophy of Modern Physics 38, 390–417 (2007); eprint arxiv:quantph/0601158 (2006).
Acknowledgements
The authors thank François Dubois, Cyril Branciard and Anthony Leverrier for many discussions, and Nayla Farouki for continuous support. P.G. also thanks Adàn Cabello and Alastair Abbott for useful comments and discussions.
Author information
Affiliations
Institut Néel, BP 166, 25 rue des Martyrs, F38042 Grenoble Cedex 9, France
 Alexia Auffèves
Laboratoire Charles Fabry, IOGS, CNRS, Université Paris Saclay, F91127 Palaiseau, France
 Philippe Grangier
Authors
Search for Alexia Auffèves in:
Search for Philippe Grangier in:
Contributions
A.A. and P.G. both wrote and reviewed the manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to Philippe Grangier.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Further reading

1.
Relational Quantum Mechanics and Probability
Foundations of Physics (2018)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.