Introduction

Do pure quantum states correspond one-to-one to real physical states? The claim that they do not—that they represent states of incomplete knowledge or information about the actual physical states—dates back to the first years of quantum mechanics1,2. In recent decades, this claim has inspired a research program3,4,5,6,7,8,9,10,11,12 taking distinct pure quantum states to be compatible with a single physical state. That is, even pure quantum states in quantum mechanics could play a role similar to phase space distributions in classical statistical mechanics. Following refs. 7,13,14, we shall call the underlying physical state the “ontic state” and the probability distribution over the ontic states associated with a given quantum state the “epistemic state”.

This program prompts the following question: Can we reconstruct quantum mechanics as a clear, physical modification of classical statistical mechanics? (An answer could be of practical interest toward understanding the physical resource responsible for the advantages of quantum over classical information protocols.) To answer this question, we note two essential features of quantum mechanics that distinguish it from classical mechanics: entanglement and the uncertainty principle. Can these two features lead us toward a reconstruction of quantum mechanics from classical statistical mechanics? Our only hope for a positive answer is to first adapt entanglement and the uncertainty principle to the framework of classical statistical mechanics. There have been attempts to explain (parts of) quantum mechanics starting from classical statistical models by assuming a fundamental restriction on the class of possible epistemic states that can be prepared6,7,9,10,11,12,15. Such an “epistemic restriction” captures, to an extent, the uncertainty principle. Indeed, it is remarkable that this approach has been shown to reproduce a substantial portion of quantum phenomena traditionally judged incompatible with any classical world view.

In general, a model or theory that assumes that physical reality exists independent of measurement is called an “ontological” (or “ontic”) model13,14,16. For example, Bohmian mechanics17 is an “ontic extension” of classical mechanics: it posits a physically real or ontic and in general nonseparable wave function defined in a multi-dimensional configuration space, satisfying the Schrödinger equation, that guides the dynamics of the particles. (For an alternative nomological interpretation of the wave function within Bohmian mechanics, see ref. 18). Conversely, Pusey, Barrett, and Rudolph have shown that any ontological model of quantum mechanics in which the wave function is not physically real must violate a statistical independence requirement called preparation independence19. Of course, such a model must also violate Bell’s inequalities20 and satisfy the Bell–Kochen–Specker contextuality theorem15,21 and its generalization16.

Here we derive nonrelativistic quantum mechanics (without quantum spin) in a way closely paralleling the derivation of classical statistical mechanics. That is, we derive the two theories within a common axiomatic framework, imposing conservation of average energy and probability current. Within this framework, two axioms distinguish quantum from classical statistical mechanics. One axiom creates an “ontic extension” in the form of “a global-nonseparable random variable;” the other axiom imposes a specific “epistemic restriction” on what probability distributions of momenta can be prepared, given a distribution of conjugate positions, and vice versa. (See Eqs. (5) and (6) below.) We obtain the mathematics and rules of quantum mechanics in a complex Hilbert space from this model as we average over the nonseparable random variable. We find that the ontic extension and epistemic restriction are together deeply related to two distinctive features of quantum mechanics: entanglement and the uncertainty principle, which arguably are responsible for the classically puzzling features of microscopic phenomena22. We conjecture and argue that, unlike Bohmian mechanics, the wave function in our ontological model is not physically real; what is real is a nonseparable, global random variable.

Results

Classical statistical mechanics of ensemble of trajectories

Let us start from the conventional classical statistical mechanics of ensembles of trajectories. We work within the Hamilton–Jacobi formalism23, proposing a set of specific modifications. One of the reasons we take the Hamilton–Jacobi formalism as our starting point is that in some formal classical limit, the Schrödinger equation (in the position representation) reduces to the Hamilton–Jacobi equation, with the quantum phase reducing to Hamilton’s principal function. This limit is formally nontrivial; see for example the discussion in ref. 6. We will argue that this formal limit can be extended conceptually, i.e., that Schrödinger’s equation can likewise be interpreted as describing the dynamics of an ensemble of trajectories. Indeed, we will derive both fundamental equations within one axiomatic framework.

Let us consider a general many-particle system with N degrees of freedom, in a spatial configuration denoted as \(q = \left( {{q_1}, \ldots ,{q_N}} \right) \in {{\Bbb R}^N}.\) Let t denote the time parameterizing its evolution. Within the Hamilton–Jacobi formalism, the central result of classical mechanics, based on the principle of least action, is that there is a differentiable real-valued scalar function of the configuration and time \({S_{\rm{C}}}(q;t):{{\Bbb R}^N} \times {\Bbb R} \mapsto {\Bbb R}\) with the dimensions of action—Hamilton’s principal function—so that the momentum field p = (p 1, …, p N ) canonically conjugate to q is given by its gradient as:

$${p_i}(q;t) = {\partial _{{q_i}}}{S_{\rm{C}}}{\rm{,}}$$
(1)

i = 1, …, N; and the time evolution of S C(q;t) is generated by the classical Hamiltonian H(p,q) obeying the Hamilton–Jacobi equation:

$$ - {\partial _t}{S_{\rm{C}}} = H(p,q).$$
(2)

Equations (1) and (2) describe the dynamics of an ensemble of trajectories characterized by a single function S C(q;t). A choice of q at a single t specifies the dynamics of a single trajectory.

As long as we consider a classical Hamiltonian up to second order in momentum, there is another way to derive the Hamilton–Jacobi Eq. (2), without resort to the least action principle. The derivation, given in Theorem 1 below, will provide physical insight for our reconstruction of quantum mechanics. Suppose we are given a continuous and deterministic momentum field in configuration space p i  = p i (q;t), i = 1, …, N. Let us construct a differentiable function S C(q;t) such that its spatial gradient at (q;t) is precisely equal to p as prescribed by Eq. (1). Next, let ρ(q;t) denote the probability distribution over the configurations q at time t. The phase space distribution, given a pair of functions S C and ρ, can then be written as \({\rm{P}}\left( {p,q\left| {{S_{\rm{C}}},\rho } \right.} \right) = {\rm{P}}\left( {p\left| {q,{S_{\rm{C}}}} \right.} \right)\rho (q)\), where the conditional probability distribution of p given q and S C reads, noting Eq. (1),

$${\rm{P}}\left( {p\left| {q,{S_{\rm{C}}}} \right.} \right) = \mathop {\prod}\limits_{i = 1}^N \delta \left( {{p_i} - {\partial _{{q_i}}}{S_{\rm{C}}}} \right);$$
(3)

time is implicit. For a pair S C and ρ, the ensemble (phase space) average of any physical quantity \({\cal O}\left( {p,q} \right):{\rm{\Omega }} \mapsto {\Bbb R}{\rm{,}}\) where \({\rm{\Omega }} \equiv {{\Bbb R}^{2N}}\) is the phase space, is thus expressible as \({\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{C}}},\rho } \right\}}} \equiv {\int} {\rm{d}}q{\rm{d}}p\,{\cal O}(p,q){\rm{P}}\left( {p,q\left| {{S_{\rm{C}}},\rho } \right.} \right)\) = \({\int} {\rm{d}}q{\kern 1pt} {\cal O}\left( {{\partial _q}{S_{\rm{C}}},q} \right)\rho (q)\) with dq ≡ dq 1…dq N , dp ≡ dp 1…dp N .

How does the phase space distribution evolve over time? It is clear from the above construction that it is determined by the time evolution of S C and ρ, since this pair yields the phase space distribution. The time evolution of ρ depends, via the continuity equation, on the velocity field, which in turn is determined by S C via Eq. (1). Hence, to obtain the time evolution of the phase space distribution, it suffices to know the dynamical equation for S C(q;t). Of course, we need constraints, characterizing the dynamics of the ensemble of trajectories in phase space, to single out the dynamics of S C(q;t). The following theorem, proved in “Methods” subsection “Proof of Theorem 1”, then applies:

Theorem 1. Consider an ensemble of trajectories satisfying Eq. (1) or equivalently Eq. (3). For a classical Hamiltonian H(p,q) up to second order in momentum, the constraint that the ensemble of trajectories conserves the probability current and average energy implies a unique dynamics for S C(q;t), given by the Hamilton–Jacobi equation, Eq. (2):

$$ - {\partial _t}{S_{\rm{C}}} = H(p,q).$$

The Hamilton–Jacobi equation immediately implies the usual Liouville equation describing the time evolution of the phase–space distribution.

In the course of the paper, we shall not explicitly use Theorem 1. However, as mentioned earlier, it provides physical insight into, and thus anticipates, Theorem 3 below, which states that the Schrödinger equation can likewise be derived by imposing conservation of average energy and probability current. The only difference is that rather than constraining the ensemble of trajectories to satisfy Eq. (3), which is necessary for the derivation of the Hamilton–Jacobi equation, for deriving the Schrödinger equation we need a different class of trajectories, satisfying certain constraints to be discussed in the next subsection (cf. Eq. 6). On the one hand, this line of reasoning will provide a smooth formal and conceptual quantum-classical correspondence. On the other hand, we expect it to also tell us what really distinguishes quantum mechanics from classical statistical mechanics.

Now we make two remarks on classical statistical mechanics. First, in classical mechanics, the ontic (physical/microscopic) state is completely specified by the values of a pair of canonically conjugate variables (p,q) ∈ Ω, or a point in the phase space Ω. In a system having many subsystems, the space of the ontic states of the whole system is thus always “separable” into the Cartesian product of spaces of subsystems: Ω = Ω1 × … × Ω N . Also, the time evolution of the ontic state is deterministic. Second, the epistemic (macroscopic) state in classical statistical mechanics is given by the probability distribution over the phase space. One can then explicitly see from Eq. (1) or (3) that in classical statistical mechanics there is an “epistemic” (statistical) freedom to choose an arbitrary momentum field p(q) consistent with a given ρ(q). In other words, we are free to select the conditional probability distribution over the momentum independently of the distribution over the canonically conjugate position:

$${\rm{P}}\left( {p\left| {q,{S_{\rm{C}}},\rho } \right.} \right) = {\rm{P}}\left( {p\left| {q,{S_{\rm{C}}}} \right.} \right){\rm{.}}$$
(4)

Conversely, given a momentum field p(q), there is a freedom to prepare an ensemble of trajectories compatible with p(q) with arbitrary ρ(q). That is, each trajectory in the momentum field can be weighted arbitrarily. We show below how to derive quantum mechanics by giving up these two basic features of classical statistical mechanics.

Microscopic ontic extension and epistemic restriction

Evidently, we need new physical axioms if we are to recover quantum mechanics. Here we introduce the following two innovations. First, we make an ontic extension by introducing a hypothetical global-nonseparable ontic variable ξ. It is real valued with dimensions of action, and depends only on time—it is spatially uniform. By “global-nonseparable,” we mean that at any time, two arbitrarily separated physical objects are subject to the same simultaneous value of ξ. We assume that ξ fluctuates “randomly” on a microscopic time scale with a probability distribution at any time given by μ(ξ), so that each single run of an experiment, in an ensemble of identical experiments, is parameterized by an independent realization of ξ.

Second, we assume the following epistemic restriction on the possible phase space probability distributions that nature allows us to prepare. We assume that, given a momentum field, it is not possible to prepare an ensemble of trajectories compatible with the momentum field with arbitrary distribution over configurations, again denoted by ρ(q;t). In other words, each trajectory in the momentum field can no longer be assigned arbitrary weight. In turn, the conditional distribution of p, given q at time t, therefore depends on the choice of ρ(q;t), so that we have:

$${\rm{P}}\left( {p\left| {q{\rm{, \ldots ,}}\rho } \right.} \right) \, \ne \, {\rm{P}}\left( {p\left| {q{\rm{, \ldots }}} \right.} \right)$$
(5)

(in contrast to Eq. 4). Hence, we shall consider a model which lacks the epistemic freedom of classical statistical mechanics.

Let us then assume that an ensemble of identical preparations (defined by the same set of macroscopic parameters) will generate a conditional probability distribution for the momenta which depends on ρ in accord with Eq. (5) as follows:

$${\rm{P}}\left( {\left. p \right|q,\xi ,{S_{\rm{Q}}},\rho } \right) = \mathop {\prod}\limits_{i = 1}^N \delta \left( {{p_i} - \left( {{\partial _{{q_i}}}{S_{\rm{Q}}} + \frac{\xi }{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)} \right);$$
(6)

time is implicit. Here \({S_{\rm{Q}}}(q;t):{{\Bbb R}^N} \times {\Bbb R} \mapsto {\Bbb R}\) is a real-valued scalar function with dimensions of action; it replaces S C(q;t) of the corresponding classical case (cf. Eq. 3). In general, S Q may have a different form from S C: the time evolution of S Q, as will be shown later, will have to satisfy a modified Hamilton–Jacobi equation with a ρ-dependent term. (See “Methods” subsections “Proof of Theorem 3” and “Schrödinger’s equation for measurement of angular momentum”). The ensemble of trajectories must satisfy Eq. (6). Further discussion of the consistency of the statistical interpretation of Eq. (6) is given in “Methods” subsection “Statistical interpretation of the epistemic restriction.”

For a smooth macroscopic classical limit, Eq. (6) must reduce to the classical form of Eq. (3). Thus, the classical limit must have \(\left| {{\partial _q}{S_{\rm{Q}}}} \right| \gg \left| {\xi {\rm{/}}2} \right|\left| {{\partial _q}\rho {\rm{/}}\rho } \right|\), and S Q → S C. This condition suggests that the fluctuations of ξ, characterizing the strength of both the ontic extension and the epistemic restriction, must be microscopic. Hence, let us take the universal mean and variance of ξ to be

$$\overline \xi \equiv {\int} d\xi \,\xi \,\mu (\xi ) = 0,\quad \sigma _{{\rm{ns}}}^{\rm{2}} \equiv \overline {{\xi ^2}} - {\overline \xi ^2} = \overline {{\xi ^2}} = {\hbar ^2}{\rm{,}}$$
(7)

respectively. One then sees that the formal limit \(\left| \xi \right|\) → 0 is equivalent to σ ns = ħ → 0.

We have thus, in ξ, introduced a fundamental concept of “ontic nonseparability”8,24,25, with strength given by the Planck constant. The nonseparability of ξ will be shown in the next subsection to be necessary for determining the correlations among systems and the average interaction energy between two systems, and to obtain the correct (linear) Schrödinger equation governing interacting systems, generating quantum entanglement. The ontic extension introducing ξ with statistics given in Eq. (7), and the epistemic restriction of Eq. (6), together distinguish the quantum from the classical world. Let us mention that a toy model combining an epistemic restriction with an ontic extension in the form of a “relational stochastic variable” was recently proposed in ref. 25 to address the Pusey–Barrett–Rudolph theorem19. Note that, since ξ does not depend on q, also σ ns = ħ does not depend on q. Below, we will further assume that σ ns = ħ does not depend on time, either. We will see that the spatiotemporal neutrality of σ ns = ħ is crucial for reconstructing quantum mechanics.

Let us illustrate the epistemic restriction of Eqs. (6) and (7) in one spatial dimension, as follows. A trajectory passing randomly through q acquires a fluctuating momentum p from ξ. The strength of the fluctuation of p is proportional to the strength σ ns = ħ of the fluctuation of ξ, as in Eq. (7), and to the normalized slope of ρ(q). We exhibit the meaning of these fluctuations in two extreme cases that rule out any epistemic state that is sharp both in p and q. First, suppose we fix p to be \(\bar p\), that is \({\rm{P}}\left( {\left. p \right|q,\xi ,{S_{\rm{Q}}},\rho } \right) = \delta \left( {p - \bar p} \right)\). Equation (6) then implies that the term ∂ q ρ/ρ on its right-hand side must vanish; otherwise p would fluctuate because ξ does. For ∂ q ρ/ρ to vanish, ρ(q) must not depend on q. Hence, any attempt to fix p inevitably implies complete uncertainty about q. Conversely, suppose we fix q at \(\bar q\), i.e., \(\rho (q) = \delta \left( {q - \bar q} \right)\). Then the term ∂ q ρ/ρ in Eq. (6) diverges, implying a random fluctuation in p with infinite strength. Hence, sharp knowledge about q implies complete ignorance about p.

As a simple, concrete example, let us consider an ensemble of trajectories in a one-dimensional space with a distribution of position given by a Gaussian ρ(q) of a vanishing mean and a variance \(\sigma _q^2\), namely \(\rho (q) = {\rm{exp}}( { - {q^2}{\rm{/}}2\sigma _q^2} )\) (up to normalization), and take S Q independent of q, for simplicity. Substituting into Eq. (6) and noting Eq. (7), we obtain the variance \(\sigma _p^2\) of p as \(\sigma _p^2 = {\hbar ^2}{\rm{/}}4\sigma _q^2\) and an uncertainty relation σ q σ p  = ħ/2, showing that it is impossible to prepare an epistemic state sharp both in q and p (via squeezing along both p and q axes). This uncertainty relation will be derived in general in the next subsection.
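The Gaussian example can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original derivation. It assumes a two-valued ξ = ±ħ with equal probability, which is only one of many distributions μ(ξ) with the mean and variance required by Eq. (7). The function name `gaussian_uncertainty_product`, the units ħ = 1, and the sample size are illustrative choices.

```python
import math
import random


def gaussian_uncertainty_product(sigma_q=1.0, hbar=1.0, n_samples=200_000, seed=0):
    """Estimate sigma_q * sigma_p for a Gaussian rho(q) with S_Q independent of q."""
    rng = random.Random(seed)
    qs, ps = [], []
    for _ in range(n_samples):
        q = rng.gauss(0.0, sigma_q)  # position drawn from rho(q)
        # Assumption: xi takes the values +hbar or -hbar, so its mean is 0
        # and its variance is hbar^2, as required by Eq. (7).
        xi = hbar if rng.random() < 0.5 else -hbar
        # Eq. (6) with dS_Q/dq = 0 and (d rho/dq) / rho = -q / sigma_q^2.
        p = 0.5 * xi * (-q / sigma_q**2)
        qs.append(q)
        ps.append(p)

    def std(xs):
        mean = sum(xs) / len(xs)
        return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

    return std(qs) * std(ps)


if __name__ == "__main__":
    for sigma_q in (0.5, 1.0, 2.0):
        print(sigma_q, gaussian_uncertainty_product(sigma_q))
```

Up to sampling error, the estimated product should stay close to ħ/2 for every choice of σ q. Narrowing ρ(q) widens the momentum spread by the same factor.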

The example suggests that our epistemic restriction of Eqs. (6) and (7) is closely related to the knowledge-balance principle7, the uncertainty principle11, and the principle of classical complementarity12, adopted by Spekkens et al. as a fundamental epistemic restriction for reconstructing a significant part of quantum mechanics from the statistical theory of some classical models; all these principles, as well, assert that complete knowledge of both q and p is impossible. Note, however, that none of these restrictions leads to a full reconstruction of quantum mechanics.

We will show that our epistemic restriction allows derivation of the uncertainty relation postulated as an epistemic restriction in ref. 11. Thus, we adopt the epistemic restriction of Eqs. (6) and (7) as an axiom for reconstructing quantum mechanics. It clearly shows that maximal knowledge is always incomplete2. However, unlike the models in refs. 7,11,12, we also introduce an ontic extension—a fundamentally nonseparable random variable ξ—that induces an epistemic restriction of strength σ ns = ħ. Moreover, the nonseparability of ξ will prove to be crucial for generating quantum entanglement via interaction.

Emergent quantum kinematics and dynamics

We assume that physical quantities \({\cal O}(p,q)\) in our model are real-valued functions of phase space, \({\cal O}(p,q):{\rm{\Omega }} \mapsto {\Bbb R}{\rm{,}}\) having the same form as those in classical mechanics. Hence, they depend on ξ only implicitly via p as in Eq. (6). We further confine ourselves to physical quantities that are at most of second order in momentum. The two theorems presented below then assert that averaging over the fluctuations of the random variable ξ leads to the mathematics and rules of quantum kinematics and dynamics, including the dynamics of measurement interaction.

Theorem 2. Assume that an ensemble of trajectories satisfies the epistemic restrictions of Eqs. (6) and (7), where σ ns = ħ is constant in space. The phase space (ensemble) average \({\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) of any physical quantity \({\cal O}(p,q)\) up to second order in p is then equal to the quantum mechanical expectation value \(\langle {\psi | {\hat {\cal O}} |\psi } \rangle \):

$${{\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} \equiv {\int} {\rm{d}}q{\rm{d}}\xi {\rm{d}}p\,{\cal O}(p,q){\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\rho (q) = \left\langle {\psi \left| {\hat {\cal O}} \right|\psi } \right\rangle ,}$$
(8)

where \(\hat {\cal O}\) is the Hermitian operator obtained by applying the Dirac canonical quantization scheme to \({\cal O}(p,q)\) with a specific ordering of operators, and the wave function \(\psi (q;t) = \left\langle {q\left| \psi \right.} \right\rangle \) is defined as:

$$\psi (q;t) \equiv \sqrt {\rho (q;t)} {\rm{exp}}\left( {i{S_{\rm{Q}}}(q;t){\rm{/}}\hbar } \right).$$
(9)

See the proof of Theorem 2 in the “Methods” section. (Here we have assumed that the joint probability distribution over (q, ξ) is factorizable: P(ξ, q) = μ(ξ)ρ(q). In general, it may not be, and one has instead \({\rm{P}}(\xi ,q) = \mu \left( {\xi \left| q \right.} \right)\rho (q)\), where μ \(\left( {\xi \left| q \right.} \right)\) is the conditional probability of ξ given q. All the results of calculations in this paper still apply unmodified in this general case if we replace the averages over μ(ξ) with those over μ \(\left( {\xi \left| q \right.} \right)\), as long as the first two moments of μ \(\left( {\xi \left| q \right.} \right)\) are given by Eq. (7) independent of q).

Note first that, from Eq. (9), Born’s statistical interpretation of the wave function is valid by construction: \({\rm{P}}\left( {q\left| \psi \right.} \right) = \rho (q) = {\left| {\psi (q)} \right|^2}\). Moreover, from Eqs. (6) and (9), each pure quantum state ψ is associated with a phase–space distribution conditioned on ξ, namely \({\rm{P}}\left( {p,q\left| {\xi ,\psi } \right.} \right) = {\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\rho (q)\) = \(\mathop {\prod}\nolimits_{i = 1}^N \delta \left( {{p_i} - \left( {{\partial _{{q_i}}}{S_{\rm{Q}}} + \xi {\partial _{{q_i}}}\rho {\rm{/}}2\rho } \right)} \right)\rho (q)\). These results show the statistical aspect of the wave function. Although a consistent statistical interpretation of Eq. (6) can be made (as argued in “Methods” subsection “Statistical interpretation of the epistemic restriction”), this observation does not allow us to conclude that the wave function within the model defined in Eq. (9) is purely statistical with no physical (ontic) role. As discussed in that subsection, to have such a purely statistical ψ, we need to show that Eq. (6) can be derived from the ontic dynamics of individual systems with transparent causation, leading “effectively” to a statistical correlation between p and ρ with no causal relation. We conjecture that this is indeed the case.
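Equation (8) can be checked numerically for the simplest nontrivial quantity, \({\cal O} = {p^2}\), in one dimension. The sketch below is illustrative only. It assumes ħ = 1, a unit-variance Gaussian ρ(q), and a hypothetical phase S Q(q) = p0 q + a q²/2; none of these choices come from the text. On the ensemble side, averaging Eq. (6) over ξ using Eq. (7) gives ∫dq [(∂ q S Q)² + ħ²(∂ q ρ)²/4ρ²]ρ. On the quantum side, partial integration with vanishing boundary terms gives ⟨ψ|p̂²|ψ⟩ = ħ²∫dq |∂ q ψ|², with ψ built from Eq. (9).

```python
import cmath
import math

HBAR = 1.0  # assumed units


def rho(q):
    """Unit-variance Gaussian position distribution (illustrative choice)."""
    return math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)


def drho(q):
    """Derivative of rho with respect to q."""
    return -q * rho(q)


def S_Q(q, p0=0.7, a=0.3):
    """Hypothetical phase function with dimensions of action."""
    return p0 * q + 0.5 * a * q * q


def dS_Q(q, p0=0.7, a=0.3):
    return p0 + a * q


def psi(q):
    """Wave function of Eq. (9): sqrt(rho) * exp(i S_Q / hbar)."""
    return math.sqrt(rho(q)) * cmath.exp(1j * S_Q(q) / HBAR)


def ensemble_p2(L=10.0, n=4000):
    """Average of p^2 over q and xi, using mean(xi) = 0 and mean(xi^2) = hbar^2."""
    h = 2.0 * L / n
    total = 0.0
    for k in range(n):
        q = -L + (k + 0.5) * h  # midpoint rule
        total += (dS_Q(q) ** 2 + HBAR**2 * (drho(q) / rho(q)) ** 2 / 4.0) * rho(q) * h
    return total


def quantum_p2(L=10.0, n=4000):
    """hbar^2 * integral of |d psi/dq|^2, with a central finite difference."""
    h = 2.0 * L / n
    total = 0.0
    for k in range(n):
        q = -L + (k + 0.5) * h
        dpsi = (psi(q + 0.5 * h) - psi(q - 0.5 * h)) / h
        total += HBAR**2 * abs(dpsi) ** 2 * h
    return total


if __name__ == "__main__":
    print(ensemble_p2(), quantum_p2())
```

For these parameters the analytic value is p0² + a² + ħ²/4 = 0.83. The two numerical estimates should agree with it, and with each other, up to discretization error.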

As can be seen explicitly from the proof of Theorem 2 in the “Methods” section, the fundamental “nonseparability” of ξ is necessary for obtaining Eq. (8). Suppose instead that ξ is separable into N random variables ξ = (ξ 1, …, ξ N ), with ξ i associated with the i-th degree of freedom and replacing ξ in Eq. (6). Assume that they have vanishing averages \({\overline \xi _i} = 0\), i = 1, …, N, and that there is an “independent” pair ξ i , ξ j for some i ≠ j, so that they are uncorrelated: \(\overline {{\xi _i}{\xi _j}} = {\overline \xi _i}{\overline \xi _j} = 0\). Then, the last term in Eq. (27) associated with the pair of indices i, j vanishes.

For a concrete simple example showing the crucial role of the nonseparability of ξ, let us consider a pair of particles and compute the ensemble average of \({\cal O} = {p_1}{p_2}\). Using Eq. (6) with ξ i , i = 1, 2, for each degree of freedom replacing ξ, one directly gets \({\left\langle {{p_1}{p_2}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) = \({\int} {\rm{d}}q{\kern 1pt} \left[ {\left( {{\partial _{{q_1}}}{S_{\rm{Q}}}} \right)\left( {{\partial _{{q_2}}}{S_{\rm{Q}}}} \right) + \overline {{\xi _1}{\xi _2}} \left( {{\partial _{{q_1}}}\rho } \right)\left( {{\partial _{{q_2}}}\rho } \right){\rm{/}}4{\rho ^2}} \right]\rho \), where we have made use of \({\overline \xi _1} = 0 = {\overline \xi _2}\). In the nonseparable case, namely ξ 1 = ξ 2 = ξ, this gives us the quantum expectation value: \({\left\langle {{p_1}{p_2}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) = \({\int} {\rm{d}}q\left[ {\left( {{\partial _{{q_1}}}{S_{\rm{Q}}}} \right)\left( {{\partial _{{q_2}}}{S_{\rm{Q}}}} \right) + {\hbar ^2}\left( {{\partial _{{q_1}}}\rho } \right)\left( {{\partial _{{q_2}}}\rho } \right){\rm{/}}4{\rho ^2}} \right]\rho \) = \({\int} {\rm{d}}q{\kern 1pt} \left[ {\left( {{\partial _{{q_1}}}{S_{\rm{Q}}}} \right)\left( {{\partial _{{q_2}}}{S_{\rm{Q}}}} \right) - {\hbar ^2}\left( {{\partial _{{q_1}}}{\partial _{{q_2}}}{R_{\rm{Q}}}} \right){\rm{/}}{R_{\rm{Q}}}} \right]\rho \) = \(\left\langle \psi \right|{\hat p_1}{\hat p_2}\left| \psi \right\rangle ,\) where R Q ≡ \(\sqrt \rho \) and we have used Eq. (7) in the first equality; the identity of Eq. (24) (see the proof of Theorem 2 in the “Methods” section), and partial integration to get the second equality; and the definition of the wave function of Eq. (9), and \(\left\langle {q_i^\prime \left| {{{\hat p}_i}} \right|{q_i}} \right\rangle \equiv - i\hbar {\partial _{{q_i}}}\delta \left( {q_i^\prime - {q_i}} \right),\) to arrive at the last equality. 
If we instead assume that ξ 1 and ξ 2 are independent (thus uncorrelated) random variables, namely \(\overline {{\xi _1}{\xi _2}} = {\overline \xi _1}{\overline \xi _2} = 0,\) then we get \({\left\langle {{p_1}{p_2}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = {\int} {\rm{d}}q{\kern 1pt} \left( {{\partial _{{q_1}}}{S_{\rm{Q}}}} \right)\left( {{\partial _{{q_2}}}{S_{\rm{Q}}}} \right)\rho (q),\) which is just the value obtained in conventional classical statistical mechanics (identifying S Q with S C).
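The contrast between a shared ξ and independent ξ 1, ξ 2 can also be seen in a small Monte Carlo sketch. It is an illustration under stated assumptions: ħ = 1, S Q constant, a hypothetical correlated Gaussian ρ(q 1, q 2) with unit variances and correlation coefficient c, and two-valued ξ = ±ħ. For this ρ the quantum value is \(\left\langle \psi \right|{\hat p_1}{\hat p_2}\left| \psi \right\rangle = - {\hbar ^2}c{\rm{/}}4(1 - {c^2})\), which follows from the expression above with \(\psi = \sqrt \rho \).

```python
import math
import random


def mean_p1p2(nonseparable, c=0.5, hbar=1.0, n_samples=200_000, seed=1):
    """Ensemble average of p1*p2 from Eq. (6) with S_Q constant.

    nonseparable=True  : both degrees of freedom share one xi.
    nonseparable=False : xi_1 and xi_2 are drawn independently.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample (q1, q2) from a Gaussian with unit variances and correlation c.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        q1 = z1
        q2 = c * z1 + math.sqrt(1.0 - c * c) * z2
        # Normalized slopes (d rho/dq_i) / rho for this Gaussian.
        g1 = -(q1 - c * q2) / (1.0 - c * c)
        g2 = -(q2 - c * q1) / (1.0 - c * c)
        # Assumption: two-valued xi with mean 0 and variance hbar^2, Eq. (7).
        xi1 = hbar if rng.random() < 0.5 else -hbar
        if nonseparable:
            xi2 = xi1
        else:
            xi2 = hbar if rng.random() < 0.5 else -hbar
        total += (0.5 * xi1 * g1) * (0.5 * xi2 * g2)
    return total / n_samples


if __name__ == "__main__":
    print("shared xi:     ", mean_p1p2(True))
    print("independent xi:", mean_p1p2(False))
```

With c = 0.5 the shared-ξ estimate should approach −ħ²/6, while the independent-ξ estimate should be consistent with zero, the classical value for constant S Q.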

Notice in particular that, in the above example, if ξ were separable into a pair of independent random variables, the quantum correction term \({\int} {\rm{d}}q\,{\hbar ^2}\left[ {\left( {{\partial _{{q_1}}}\rho } \right)\left( {{\partial _{{q_2}}}\rho } \right){\rm{/}}4{\rho ^2}} \right]\rho \) = \( - {\int} {\rm{d}}q\,{\hbar ^2}\left[ {\left( {{\partial _{{q_1}}}{\partial _{{q_2}}}{R_{\rm{Q}}}} \right){\rm{/}}{R_{\rm{Q}}}} \right]\rho \) would be missing. We will show later that this term generates a quantity, the quantum potential of Bohmian mechanics, that is crucial for the description of two particles17. (See “Methods” subsection “Schrödinger’s equation for measurement of angular momentum” for a concrete example.) In contrast to our model, Bohmian mechanics postulates a quantum potential, taking ρ = \({\left| \psi \right|^2}\) and ψ physically real. (One can equivalently postulate that ψ follows the Schrödinger equation.) It is well known that the quantum potential plays a decisive role in any realist account of quantum mechanics, and is commonly regarded as responsible for many classically puzzling features of the microscopic world. In our model, the quantum correction term arises effectively from the epistemic restriction of Eqs. (6) and (7) underlying the kinematics of the ensemble of trajectories. Moreover, for many-particle systems, as shown above, the nonseparability of ξ is indispensable for the emergence of the quantum correction.

Let us note that the quantity \({\left\langle {{p_1}{p_2}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = \left\langle {\psi \left| {{{\hat p}_1}{{\hat p}_2}} \right|\psi } \right\rangle \) above can be regarded either as the momentum “correlation” between two arbitrarily separated particles, or as proportional to the “average interaction energy” between the two particles, which arises, e.g., in von Neumann’s prescription for the measurement interaction. Hence, the fluctuation of ξ not only fixes the strength of the epistemic restriction as discussed in the previous subsection; the irreducible nonseparability of ξ also plays a crucial role in describing the correlation between two particles (or subsystems) and their interaction. This role will become more prominent in the derivation of Schrödinger’s equation for many interacting subsystems, in Theorem 3 below, in which we show that the nonseparability of ξ is crucial for obtaining the correct Schrödinger equation describing interactions, and hence for obtaining quantum entanglement. Otherwise, if ξ is separable, one will instead get a classical Hamilton–Jacobi equation. (See “Methods” subsection “Schrödinger’s equation for measurement of angular momentum”).

As an important corollary of Theorem 2, substituting \({( {p - {{\langle p \rangle }_{\{ {{S_{\rm{Q}}},\rho } \}}}} )^2}\) for \({\cal O}\) in Eq. (8) yields the standard deviation σ p of p in the ensemble of trajectories, and shows that σ p equals the quantum mechanical standard deviation \({\sigma _{\hat p}}\) of \(\hat p\). Namely, \(\sigma _p^2 \equiv {\langle {{{( {p - {{\langle p \rangle }_{\{ {{S_{\rm{Q}}},\rho } \}}}} )}^2}} \rangle _{\{ {{S_{\rm{Q}}},\rho } \}}}\) = \(\langle \psi |{( {\hat p - \langle {\psi | {\hat p} |\psi } \rangle } )^2}| \psi \rangle \equiv \sigma _{\hat p}^2.\) Likewise, the standard deviation σ q of q equals the quantum mechanical standard deviation \({\sigma _{\hat{{ q}}}}\) of \(\hat q\), i.e., \(\sigma _{{q}}^2 \equiv {\langle {{{( {q - {{\langle q \rangle }_{\{ {{S_{\rm{Q}}},\rho } \}}}} )}^2}} \rangle _{\{ {{S_{\rm{Q}}},\rho } \}}}\) = \(\langle \psi |{( {\hat q - \langle \psi |\hat q| \psi \rangle } )^2}| \psi \rangle \equiv \sigma _{\hat{{q}}}^2.\) Hence, the standard deviations σ p , σ q of the ensemble of trajectories satisfying Eqs. (6) and (7) always formally satisfy the Heisenberg–Kennard uncertainty relation26,27,28:

$${\sigma _q}{\sigma _p} = {\sigma _{\hat q}}{\sigma _{\hat p}} \ge \hbar {\rm{/}}2.$$
(10)

An alternative derivation of this uncertainty relation, which does not refer to Eq. (8), but directly applies the epistemic restriction given by the pair of Eqs. (6) and (7), appears in “Methods” subsection “An alternative derivation of the uncertainty relation.”

The uncertainty relation of Eq. (10) describes a constraint on the epistemic states that can be prepared, rather than on simultaneous values of position and momentum. A similar uncertainty relation, together with the maximum entropy principle, is used in the ontological model of ref. 11 to derive a simplified quantum mechanics, called Gaussian quantum mechanics, from classical statistical mechanics. Unlike ref. 11, however, we do not impose the principle of maximum entropy. Thus, we recover the non-Gaussian regime.
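To illustrate the non-Gaussian regime, the sketch below evaluates σ q σ p by quadrature directly from Eqs. (6) and (7), for a Gaussian ρ and for a non-Gaussian ρ(q) ∝ q² exp(−q²/2). It is a minimal example under assumptions not taken from the text: ħ = 1, S Q constant (so the mean momentum vanishes and \(\sigma _p^2 = ({\hbar ^2}{\rm{/}}4){\int} {\rm{d}}q\,{({\partial _q}\rho )^2}{\rm{/}}\rho \)), and hypothetical function names.

```python
import math


def uncertainty_product(rho, drho, hbar=1.0, L=12.0, n=24000):
    """sigma_q * sigma_p from Eqs. (6) and (7), assuming S_Q is constant.

    rho and drho are callables returning the (possibly unnormalized) position
    distribution and its derivative. The midpoint grid avoids q = 0 exactly.
    """
    h = 2.0 * L / n
    grid = [-L + (k + 0.5) * h for k in range(n)]
    norm = sum(rho(q) for q in grid) * h
    mean_q = sum(q * rho(q) for q in grid) * h / norm
    var_q = sum((q - mean_q) ** 2 * rho(q) for q in grid) * h / norm
    # p = (xi/2) * rho'/rho, so mean(p) = 0 and var(p) = (hbar^2/4) * <(rho'/rho)^2>.
    var_p = 0.25 * hbar**2 * sum(drho(q) ** 2 / rho(q) for q in grid) * h / norm
    return math.sqrt(var_q * var_p)


def rho_gauss(q):
    return math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)


def drho_gauss(q):
    return -q * rho_gauss(q)


def rho_excited(q):
    """Non-Gaussian example: rho proportional to q^2 * exp(-q^2 / 2)."""
    return q * q * math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)


def drho_excited(q):
    return (2.0 * q - q**3) * math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    print("Gaussian:    ", uncertainty_product(rho_gauss, drho_gauss))
    print("non-Gaussian:", uncertainty_product(rho_excited, drho_excited))
```

Analytically, the Gaussian saturates the bound of Eq. (10) with σ q σ p = ħ/2, while the non-Gaussian distribution gives 3ħ/2, strictly above it.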

The next question is how the ensemble of trajectories in our model evolves with time and how the evolution transforms the corresponding phase space distribution. Equations (6) and (9) tell us that the time evolution of the phase–space distribution is determined by that of ψ. Moreover, it is also clear that any type of time evolution for ψ will preserve the uncertainty relation of Eq. (10). What, then, is the dynamical equation governing ψ? In the classical case, as mentioned in Theorem 1, the Hamilton–Jacobi equation—and thus the Liouville equation—are obtained by imposing the requirement that the ensemble of trajectories conserves the average energy and probability current. To have a conceptually smooth classical correspondence, we want the same requirement to single out the dynamical equation for ψ. And it does, as follows:

Theorem 3. Consider an ensemble of trajectories satisfying Eqs. (6) and (7), where σns = ħ is constant in space and time. Given a classical Hamiltonian H(p,q) that is at most quadratic in p, and an ensemble of trajectories conserving the average energy and probability current, there is a unique time evolution for ψ given by the (linear and unitary) Schrödinger equation:

$$i\hbar \frac{{\rm{d}}}{{{\rm{d}}t}}\left| \psi \right\rangle = \hat H\left| \psi \right\rangle ,$$
(11)

where \(\hat H\) is a Hermitian operator again having the same form as that obtained by applying the Dirac canonical quantization procedure to H(p,q) with a specific ordering of operators. See the proof of Theorem 3 in the “Methods” section.

As a first corollary to Theorem 3, in the macroscopic classical physical regime \(\left| {{\partial _q}{S_{\rm{Q}}}} \right| \gg \left| {\xi {\rm{/}}2} \right|\left| {{\partial _q}\rho {\rm{/}}\rho } \right|,\) we regain the dynamical equation governing classical statistical mechanics, the Hamilton–Jacobi equation. For a proof, recall that in this limit, the epistemic restriction of Eq. (6) reduces to the conditional probability distribution over p in classical statistical mechanics, given by Eq. (3); so the average energy defined as in Eq. (8) must also reduce to the value obtained in classical statistical mechanics. Accordingly, by Theorem 1, the Schrödinger equation of Eq. (11) in the position representation must reduce in this limit to the classical Hamilton–Jacobi equation of Eq. (2), where S Q → S C.
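As an illustration only, consider the special case of a single particle in one dimension with \(H = {p^2}{\rm{/}}2m + V(q)\) (an assumption made here; the corollary itself is not restricted to this case). Writing \(\psi = \sqrt \rho \,{\rm{exp}}(i{S_{\rm{Q}}}{\rm{/}}\hbar )\) as in Eq. (9) and separating Eq. (11) in the position representation into real and imaginary parts gives the standard pair of equations

$$ - {\partial _t}{S_{\rm{Q}}} = \frac{{{{\left( {{\partial _q}{S_{\rm{Q}}}} \right)}^2}}}{{2m}} + V - \frac{{{\hbar ^2}}}{{2m}}\frac{{\partial _q^2\sqrt \rho }}{{\sqrt \rho }},\qquad {\partial _t}\rho + {\partial _q}\left( {\rho \frac{{{\partial _q}{S_{\rm{Q}}}}}{m}} \right) = 0.$$

The first equation differs from the Hamilton–Jacobi equation only by its last term, which is of order ħ² and depends on ρ alone. In the regime considered above this term is expected to be negligible, which leaves the Hamilton–Jacobi equation with S Q in place of S C, together with the continuity equation.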

As a second corollary, one can show that interaction in the past in general implies a non-factorizable (entangled) wave function. Consider the limiting classical case where one has, via Eqs. (1) and (2), the equality \({S_{\rm{C}}}(q;t) = {\int}^{(q;t)} L{\kern 1pt} {\rm{d}}t,\) where \(L(q,\dot q) = p \cdot \dot q - H\) is the classical Lagrangian. Next, consider two subsystems with a configuration q = (q 1,q 2). Assume that they interacted, so that there was an interval of time in the past during which the total Lagrangian was not (additively) decomposable into that of the two subsystems: \(L(q,\dot q) \ne {L_1}\left( {{q_1},{{\dot q}_1}} \right) + {L_2}\left( {{q_2},{{\dot q}_2}} \right).\) It follows that also Hamilton’s principal function does not, in general, decompose: \({S_{\rm{C}}}(q;t) = {\int}^{(q;t)} L{\kern 1pt} {\rm{d}}t \ne {S_{{{\rm{C}}_1}}}\left( {{q_1};t} \right) + {S_{{{\rm{C}}_2}}}\left( {{q_2};t} \right).\) Accordingly, for a smooth classical limit, also S Q(q;t) must not, in general, decompose: \({S_{\rm{Q}}}(q;t) \ne {S_{{{\rm{Q}}_1}}}\left( {{q_1};t} \right) + {S_{{{\rm{Q}}_2}}}\left( {{q_2};t} \right).\) The wave function defined in Eq. (9) is therefore, in general, non-factorizable: ψ(q;t) ≠ ψ 1(q 1;t)ψ 2(q 2;t).
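The factorization argument can be illustrated with a toy computation. A nowhere-vanishing function of (q 1, q 2) can be written as a product ψ 1(q 1)ψ 2(q 2) only if ψ(a,b)ψ(c,d) = ψ(a,d)ψ(c,b) for all points a, b, c, d. The sketch below applies this test to a wave function of the form of Eq. (9). The Gaussian ρ, the linear terms in S Q, the cross term `coupling * q1 * q2` (standing in for the effect of a past interaction) and all function names are hypothetical choices made for illustration, not taken from the model.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption for illustration)


def psi(q1, q2, coupling):
    """Toy wave function sqrt(rho) * exp(i S_Q / hbar) for two subsystems.

    rho is a product of two unit Gaussians; S_Q contains a cross term
    coupling * q1 * q2 that mimics a non-decomposable phase.
    """
    rho = math.exp(-(q1 ** 2 + q2 ** 2) / 2.0) / (2.0 * math.pi)
    s_q = 0.3 * q1 + 0.7 * q2 + coupling * q1 * q2
    return math.sqrt(rho) * cmath.exp(1j * s_q / HBAR)


def factorization_defect(coupling, a=0.4, b=-0.9, c=1.1, d=0.2):
    """|psi(a,b) psi(c,d) - psi(a,d) psi(c,b)|, which is zero for any product state."""
    return abs(psi(a, b, coupling) * psi(c, d, coupling)
               - psi(a, d, coupling) * psi(c, b, coupling))


defect_decomposable = factorization_defect(0.0)  # S_Q = S_1(q1) + S_2(q2)
defect_coupled = factorization_defect(1.0)       # S_Q with a cross term
print(defect_decomposable, defect_coupled)
```

With a decomposable S Q the defect vanishes up to rounding, while the cross term makes it clearly nonzero, so the toy ψ is then not a product of single-subsystem functions.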

Note that for the two interacting subsystems above, to get quantum entanglement via the Schrödinger equation, the nonseparability of ξ is crucial. If instead we assume that ξ is separable into two independent random variables ξ 1 and ξ 2 with vanishing average \({\overline \xi _1} = 0 = {\overline \xi _2},\) so that \(\overline {{\xi _1}{\xi _2}} = {\overline \xi _1}{\overline \xi _2} = 0,\) we will not get the correct quantum mechanical entangled wave function. Instead, as shown above, the average interaction energy is given by the classical statistical mechanics value, rather than the quantum expectation value. Hence, imposing the principle of conservation of average energy leads, as Theorem 1 shows, to the Hamilton–Jacobi equation with a classical interaction Hamiltonian instead of the Schrödinger equation for interacting subsystems. (For a concrete example, see the discussion at the end of “Methods” subsection “Schrödinger’s equation for measurement of angular momentum”.) Indeed, in the ontological model, the nonseparability of ξ is crucial for obtaining non-factorizable (entangled) wave functions.

We obtain yet another corollary of Theorem 3 if we couple a system to a measuring device via the von Neumann measurement-interaction Hamiltonian \({H_{\rm{I}}} = g{{\cal O}_{\rm{S}}}{p_\Sigma },\) where \({{\cal O}_{\rm{S}}}\left( {{p_{\rm{S}}},{q_{\rm{S}}}} \right),\) the physical quantity of the system to be measured, is linear in the momentum p S, p Σ is the momentum of the apparatus pointer, and g is the coupling strength. (Measuring a physical quantity \({{\cal O}_{\rm{S}}}\left( {{p_{\rm{S}}},{q_{\rm{S}}}} \right)\) that is second order in momentum requires a different measurement interaction, to make H I altogether only second order in momentum.) We get the Schrödinger equation of Eq. (11) with the quantum Hamiltonian \({\hat H_{\rm{I}}} = g{\hat {\cal O}_{\rm{S}}}{\hat p_\Sigma }\). The “Methods” subsection “Schrödinger’s equation for measurement of angular momentum” provides an example of deriving the Schrödinger equation with a measurement interaction to measure angular momentum. From this result, given that the particles in our model always have definite positions and momenta as in Bohmian mechanics, it follows that each single measurement run will yield an outcome given by one of the eigenvalues of \({\hat {\cal O}_{\rm{S}}}\) with statistics following Born’s rule; and that the “effective” wave function of the system after the measurement is given by the eigenfunction of \({\hat {\cal O}_{\rm{S}}}\) associated with the measurement outcome. We derive this rule explicitly in “Methods” subsection “Derivation of Born’s rule”, and discuss Wallstrom’s critique29 of this program for reconstructing quantum mechanics.

Discussion

Among attempts to clarify the meaning and foundations of quantum mechanics, and to pinpoint its place among possible theories, there has been much interest recently in deriving quantum mechanics from physically transparent axioms30,31,32,33,34,35,36,37,38,39,40,41. In the present paper, partly inspired by the successes of the research program of epistemically restricted classical statistical models6,7,9,10,11,12,15,25, which reproduce some quantum phenomena usually regarded as classically inexplicable, we have attempted to provide axioms for quantum mechanics that closely parallel the axioms of classical statistical mechanics, i.e., axioms within the same conceptual framework. Specifically, as Theorem 1 and Theorem 3 show, the dynamics of classical statistical mechanics and of our ontological model of quantum mechanics (which correspond, respectively, to the Hamilton–Jacobi equation and the Schrödinger equation) follows directly from axioms of conservation of average energy and probability current.

What transforms classical statistical mechanics into quantum mechanics, in our model, is the structure of the space of ontic and epistemic states and the dynamics of the ontic states. While in conventional classical statistical mechanics the ontic state follows deterministic dynamics and the space of the ontic states is separable, in our model the ontic extension arises from a nonseparable random variable ξ. Moreover, while classical mechanics allows preparing an ensemble of trajectories with an arbitrary distribution of positions, independently of a given momentum field and vice versa, in our model, quantum mechanics emerges when this independence is partially sacrificed in accordance with the epistemic restriction of Eqs. (6) and (7). The epistemic restriction of Eq. (6) can be generalized to any pair of canonically conjugate variables. We thus claim that two outstanding and paradoxical features of quantum mechanics, entanglement and uncertainty relations15,16,19,20,21,22, are fundamentally related to this ontic extension and epistemic restriction.

We have presented the epistemic restriction of Eqs. (6) and (7) as a novel objective-realist approach to the Heisenberg uncertainty principle. The Heisenberg uncertainty principle connects the standard deviations of position and momentum measurement outcomes, whereas our approach connects probability distributions for momenta with probability distributions for positions, independent of measurement. Moreover, unlike the Heisenberg uncertainty principle from which, to the best of our knowledge, no one has derived Schrödinger’s equation, here we have shown that the epistemic restriction and axioms of conservation of average energy and probability current do imply Schrödinger’s equation. In this sense, the epistemic restrictions of Eqs. (6) and (7) are more powerful than the Heisenberg uncertainty principle.

Within the ontological model, conventional classical statistical mechanics emerges in the deterministic and separable physical regime when \(\left| {{\partial _q}{S_{\rm{Q}}}} \right| \gg \left| {\xi {\rm{/}}2} \right|\left| {{\partial _q}\rho {\rm{/}}\rho } \right|,\) so that Eq. (6) reduces to Eq. (3). In this limit, evidently the ontic extension and the epistemic restriction vanish smoothly and jointly (they stand or fall together). These features of the model are appealing in the context of the long-standing problem of the quantum-classical correspondence: trajectories do not emerge as approximations to a macroscopic classical world; rather, they are well defined even in the microscopic world. Thus, we can also regard Theorem 2 and Theorem 3 as a novel quantization scheme12 that applies only to systems for which the Hamiltonian is at most second order in momentum. Note that unlike the Dirac canonical quantization procedure, as shown in the proof of Theorem 2 (see “Methods”), our scheme yields a unique Hermitian operator \(\hat {\cal O}\) with a specific ordering of operators. Moreover, while the Dirac quantization procedure is mathematically inspired, our scheme is physically and conceptually motivated. While we have focused on particles, this quantization scheme might find direct application in linear quantum optics, with (p,q) as the field quadratures.

Many important questions are left for future work. Whence the specific epistemic restriction of Eqs. (6) and (7)? How does the model account for the quantum phenomena that seem least compatible with classical mechanics22? The answers must include an explanation of well-known no-go theorems such as Bell’s theorem20, the Bell–Kochen–Specker contextuality theorem15,21 and its generalization16, and the recent Pusey–Barrett–Rudolph theorem19. Our model suggests tracing these nonclassical phenomena and no-go theorems to the ontic extension and epistemic restriction imposed on an otherwise-conventional classical statistical mechanics. To address quantum paradoxes, it might be necessary to obtain the epistemic restriction of Eq. (6) from a deeper causal model for individual systems. (See the discussion in “Methods” subsection “Statistical interpretation of the epistemic restriction”.) Our model may stimulate novel ideas for simulating quantum information processing, shed new light on the physical nature of Planck’s constant, and suggest a natural and consistent extension of quantum mechanics. Finally, it is necessary to extend our model to include spin and, ultimately, to confront relativistic invariance.

Methods

Proof of Theorem 1

Theorem 1: Consider an ensemble of trajectories satisfying Eq. (1) or equivalently Eq. (3). For a classical Hamiltonian H(p,q) up to second order in momentum, the constraint that the ensemble of trajectories conserves the probability current and average energy implies a unique dynamics for S C(q,t), given by the Hamilton–Jacobi equation, Eq. (2):

$$ - {\partial _t}{S_{\rm{C}}} = H(p,q).$$

Proof. We shall prove the theorem by considering a simple example of a single particle in three-dimensional space. The proof for the general case is completely analogous. Let us consider a single particle with mass m subjected to a time-independent scalar potential V(q) and a vector potential A(q) = (A 1, A 2, A 3). The Hamiltonian thus reads

$$H(p,q) = \mathop {\sum}\limits_{i = 1}^3 \frac{{{{\left[ {{p_i} - {A_i}(q)} \right]}^2}}}{{2m}} + V(q).$$
(12)

From Eq. (12), the velocity field is related to the momentum field as \({\dot q_i}(q;t) \equiv {\rm{d}}{q_i}{\rm{/d}}t = \partial H{\rm{/}}\partial {p_i} = \left( {{p_i} - {A_i}} \right){\rm{/}}m,\) so that, noting Eq. (1), we get \({\dot q_i} = \left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right){\rm{/}}m.\) Assuming that the probability density current is conserved (i.e., no creation or annihilation of trajectories), which is a natural assumption for a closed system, the probability density ρ(q;t) of q at time t satisfies a continuity equation:

$$0 = {\partial _t}\rho + \mathop {\sum}\limits_i {\partial _{{q_i}}}\left( {\rho {{\dot q}_i}} \right) = {\partial _t}\rho + \mathop {\sum}\limits_i \frac{1}{m}{\partial _{{q_i}}}\left[ {\rho \left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)} \right].$$
(13)

On the other hand, from Eqs. (3) and (12), the average energy of the ensemble of trajectories characterized by the same S C(q;t) and ρ(q;t) is:

$$\begin{array}{*{20}{l}} {{{\left\langle H \right\rangle }_{\left\{ {{S_{\rm{C}}},\rho } \right\}}}} \equiv {{\int} {\rm{d}}q{\rm{d}}p\,H(p,q){\rm{P}}\left( {p\left| {q,{S_{\rm{C}}}} \right.} \right)\rho (q;t)} \\ = {{\int} {\rm{d}}q{\kern 1pt} \left[ {\mathop {\sum}\limits_i \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)}^2}}}{{2m}} + V} \right]\rho .} \hfill\end{array}$$
(14)

Next, differentiating Eq. (14) with respect to time, one gets \(({\rm{d/d}}t){\left\langle H \right\rangle _{\left\{ {{S_{\rm{C}}},\rho } \right\}}}\) = \({\int} {\rm{d}}q{\kern 1pt} \{ {( {{\partial _t}\rho } )[ {\mathop {\sum}\nolimits_i {{( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} )}^2}{\rm{/}}2m + V} ]+[ {\mathop {\sum}\nolimits_i \rho ( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} ){\partial _{{q_i}}}{\partial _t}{S_{\rm{C}}}{\rm{/}}m} ]} \}.\) Integrating the last term on the right-hand side by parts and using the continuity equation of Eq. (13), we obtain:

$$\frac{{\rm{d}}}{{{\rm{d}}t}}{\left\langle H \right\rangle _{\left\{ {{S_{\rm{C}}},\rho } \right\}}} = {\int} {\rm{d}}q{\kern 1pt} {\partial _t}\rho \left[ {\mathop {\sum}\limits_i \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)}^2}}}{{2m}} + V + {\partial _t}{S_{\rm{C}}}} \right].$$
(15)

The above relation clearly shows that the average energy is conserved for any time, i.e., \(({\rm{d/d}}t){\left\langle {H(t)} \right\rangle _{\left\{ {{S_{\rm{C}}},\rho } \right\}}} = 0\) for any \({\partial _t}\rho \), if and only if the term inside the bracket vanishes:

$${\partial _t}{S_{\rm{C}}} + \mathop {\sum}\limits_i \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)}^2}}}{{2m}} + V = 0.$$
(16)

This equation is just the Hamilton–Jacobi equation of Eq. (2), as we see by noting Eqs. (1) and (12).

We obtained the Hamilton–Jacobi equation of Eq. (2) by positing the kinematics of Eq. (1) or equivalently Eq. (3) and imposing the principles of conservation of average energy and probability current. The “Methods” subsections “Proof of Theorem 3” and “Schrödinger’s equation for measurement of angular momentum” show that the same axiomatic framework yields, instead, the Schrödinger equation, if one posits the alternative kinematics of Eq. (6) or equivalently Eq. (17) below.
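For completeness, the integration by parts that leads to Eq. (15) can be spelled out as follows. The sketch assumes that ρ vanishes sufficiently fast at the boundary of configuration space so that surface terms drop out (an assumption not stated explicitly in the proof), and uses the time independence of V and A:

$${\int} {\rm{d}}q{\kern 1pt} \mathop {\sum}\limits_i \frac{\rho \left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)}{m}{\partial _{{q_i}}}{\partial _t}{S_{\rm{C}}} = - {\int} {\rm{d}}q{\kern 1pt} \left( {{\partial _t}{S_{\rm{C}}}} \right)\mathop {\sum}\limits_i \frac{1}{m}{\partial _{{q_i}}}\left[ {\rho \left( {{\partial _{{q_i}}}{S_{\rm{C}}} - {A_i}} \right)} \right] = {\int} {\rm{d}}q{\kern 1pt} \left( {{\partial _t}\rho } \right){\partial _t}{S_{\rm{C}}},$$

where the last equality uses the continuity equation, Eq. (13). Adding this to the term proportional to \({\partial _t}\rho \) in the time derivative of Eq. (14) reproduces Eq. (15).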

Statistical interpretation of the epistemic restriction

We now discuss a conceptual issue that may arise in the epistemic restriction of Eq. (6). First, Eq. (6) can be equivalently written as:

$${p_i} = {\partial _{{q_i}}}{S_{\rm{Q}}} + \frac{\xi }{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho } ,$$
(17)

i = 1, …, N. A similar momentum fluctuation is also postulated in ref. 42, but there no specific relation with ρ is proposed and no global nonseparable variable is introduced, as it is in our model. It seems from the above equation that the momentum p associated with q for a given ξ is in part determined by the probability density ρ(q) for q. How can this be? Initially, this formal relation between p and ρ might give the impression that the dynamics of the particle is being guided causally by ρ. But such an interpretation grants causal power to mere epistemic possibilities (the probability density ρ(q)), which is unacceptable from the standpoint of statistical mechanics. We avoid this bizarre interpretation by denying ρ the ontic status it has in Bohmian mechanics17 (in which ρ determines the energy density via a term called quantum potential) or in the many interacting worlds interpretation43 (which assumes that all possible alternative realities co-exist).

Instead, as discussed in “Results” subsection “Microscopic ontic extension and epistemic restriction,” we interpret the relation between p and ρ in Eq. (6) or (17) as describing a statistical constraint or correlation, rather than as a causal relation, between p(q;ξ) and ρ(q). Namely, given a momentum field p(q;ξ), among all possible classes of ensembles of trajectories with different weighting given by the different probability densities ρ(q) that are compatible with p(q;ξ), we choose a specific one with ρ(q) that satisfies the constraint given by Eq. (17) for some S Q(q). From Eq. (17) (or equivalently Eq. (6)) and Eq. (7), the form of S Q is determined by the average of p over ξ as \({\overline p _i}(q) \equiv {\int} {\rm{d}}\xi \,{p_i}(q;\xi )\mu (\xi ) = {\partial _{{q_i}}}{S_{\rm{Q}}},\) i = 1, …, N. Recall that in the classical case, any form of ρ(q) is allowed (each trajectory belonging to the given momentum field can be weighted arbitrarily); here we have sacrificed part of this freedom. To stress this (non-bizarre) interpretation, we have formulated the epistemic restriction as a delta-functional conditional probability density of p given q, ξ, S Q and ρ in Eq. (6), as well as in the equivalent form of a direct relation among p, q, ξ, S Q and ρ in Eq. (17). The correlation between p(q;ξ) and the gradient of ρ(q) does not imply causation.

As an example, let us suppose that we are given a one-dimensional momentum field \(p(q;\xi ) = - \xi q{\rm{/}}\left( {2\sigma _q^2} \right),\) where σ q is a constant. Within the model, since \(\overline p = 0 = {\partial _q}{S_{\rm{Q}}},\) this is the case when S Q is independent of q; it may still depend on time. In classical statistical mechanics, one is free to prepare any ensemble of trajectories compatible with a given momentum field with arbitrary weighting given by ρ(q), i.e., any form of ρ(q) is allowed. In our model, however, such epistemic freedom is no longer granted. Instead, among ensembles of trajectories following the given momentum field \(p(q;\xi ) = - \xi q{\rm{/}}\left( {2\sigma _q^2} \right),\) we select one with ρ(q) that satisfies the epistemic restriction. Inserting this momentum field into the epistemic restriction of Eq. (17) and noting that S Q is independent of q, we have to choose ρ(q) to solve the following differential equation: \({\partial _q}\rho {\rm{/}}\rho = - q{\rm{/}}\sigma _q^2.\) This equation yields a Gaussian distribution of q: \(\rho (q)\sim {\rm{exp}}\left( { - {q^2}{\rm{/}}\left( {2\sigma _q^2} \right)} \right).\) As shown in “Results” subsection “Microscopic ontic extension and epistemic restriction”, assuming Eq. (7) makes this ensemble of trajectories automatically satisfy the uncertainty relation σ q σ p  = ħ/2.
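The Gaussian example can be checked numerically. The sketch below is a minimal illustration under stated assumptions: units with ħ = 1, a hypothetical width `SIGMA_Q`, and a two-valued ξ = ±ħ with equal weights, which is only one possible distribution with vanishing mean and mean square ħ² (the two moments of ξ that the text uses when invoking Eq. (7)). The helper names are illustrative.

```python
import math

HBAR = 1.0     # units with hbar = 1 (assumption)
SIGMA_Q = 0.8  # hypothetical width of the position distribution


def rho(q):
    """Normalized Gaussian selected by the epistemic restriction."""
    return math.exp(-q ** 2 / (2.0 * SIGMA_Q ** 2)) / math.sqrt(2.0 * math.pi * SIGMA_Q ** 2)


def momentum(q, xi):
    """Momentum field p(q; xi) = -xi q / (2 sigma_q^2)."""
    return -xi * q / (2.0 * SIGMA_Q ** 2)


def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


XI_VALUES = (+HBAR, -HBAR)  # assumed two-valued xi, probability 1/2 each
LIMIT = 10.0 * SIGMA_Q      # integration cutoff far in the Gaussian tails

# The means of q and p vanish by symmetry, so the variances are second moments.
var_q = integrate(lambda q: q ** 2 * rho(q), -LIMIT, LIMIT)
var_p = integrate(
    lambda q: sum(0.5 * momentum(q, xi) ** 2 for xi in XI_VALUES) * rho(q),
    -LIMIT, LIMIT)
product = math.sqrt(var_q * var_p)
print(var_q, var_p, product)
```

Under these assumptions the product of standard deviations comes out at ħ/2, independently of the chosen width.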

Hence, within our model, Eqs. (6) and (17) have physical meaning only for ensembles of identically prepared systems, and not for any individual systems. To have a complete realistic model, we need to provide a dynamics for the time evolution of individual systems (comparable to Newton’s equation or the Langevin equation), with transparent causal structure; in particular, such an ontic dynamics must not grant ρ a causal role. We do not provide it here, but we “conjecture” that such an ontic dynamics consistent with Eq. (6) exists. This dynamics must therefore lead “effectively” to the emergence of the statistical correlations between p and ρ given in Eq. (6). This conjecture on the existence of ontic dynamics for individual systems with no causal role of ρ implies that the wave function within our model is “epistemic.” Remarkably, as we show in “Results” subsection “Emergent quantum kinematics and dynamics,” the explicit ontic dynamics of individual systems is not needed for deriving the formal-mathematical concepts and operational rules of quantum mechanics.

Markopoulou and Smolin44 and Smolin45 suggest a similar notion by arguing that the dependence of energy density in Nelson’s stochastic mechanics46 (which corresponds to the quantum potential in Bohmian mechanics and is thus an ontic variable) on the spatial derivative of ρ(q) (an epistemic parameter) arises effectively in a cosmological model where quantum mechanics is an approximation that applies only to a subsystem of the universe. See also ref. 47 in which the dependence arises effectively due to interactions between the particle and a zero-point radiation field, after averaging over the latter.

We end this subsection by presenting two more simple examples of how, for a given momentum field, to define an ensemble of trajectories that satisfies the epistemic restriction of Eq. (6) or (17). As a first example, let us suppose that we are given a spatially uniform one-dimensional momentum field p = p 0, independent of q and ξ. Clearly in this case, any ensemble of trajectories compatible with the momentum field must have a sharp distribution of momentum: they must all have momentum p = p 0. Again, recall that in classical mechanics, we are free to prepare an ensemble of trajectories compatible with a given momentum field with arbitrary ρ(q). (Each trajectory belonging to the momentum field can be assigned arbitrary weight.) By contrast, our model offers no such freedom. The ensemble of trajectories must satisfy the statistical restriction of Eq. (17).

First, inserting p = p 0 into Eq. (17) and averaging over ξ, one gets \(\overline p = {\partial _q}{S_{\rm{Q}}} = {p_0},\) which can be integrated to yield S Q(q) = p 0 q + f(t), where f(t) depends only on the time t. Inserting this back into Eq. (17), we see that ρ(q) must therefore satisfy the differential equation ∂ q ρ/ρ = 0, which gives a spatially uniform ρ(q), i.e., ρ does not depend on q. Hence, only a spatially uniform ρ(q) is allowed, consistent with our earlier result derived in “Results” subsection “Microscopic ontic extension and epistemic restriction,” that an ensemble of trajectories with a sharp distribution of momentum must have a completely uncertain position. In this case, noting Eq. (9), the corresponding wave function is thus given by a plane wave \(\psi \sim {\rm{exp}}\left( {i{p_0}q{\rm{/}}\hbar } \right)\) in accordance with the quantum mechanical notion that a plane wave function describes a spatially uniform ensemble of particles with a sharp momentum.

For another example, consider a particle in a one-dimensional box of unit length, −1/2 ≤ q ≤ 1/2. Let us assume that we are given a random momentum field of the form p(q;ξ) = −ξπ sin(πq)/cos(πq). Again, in the classical case, given a momentum field, one is free to prepare any ensemble of trajectories compatible with the momentum field with arbitrary ρ(q). By contrast, in our model, given the momentum field, only ρ(q) satisfying the epistemic restriction of Eq. (17) is allowed. Notice first that at the two boundaries of the box, i.e., q = ±1/2, the momentum field is infinite. However, we shall soon see that the allowed probability density ρ(q) for the particle to reach the wall of the box, which satisfies the epistemic restriction of Eq. (17), vanishes, i.e., ρ(±1/2) = 0.

Namely, since the average of the momentum over ξ is vanishing, \(0 = \overline p = {\partial _q}{S_{\rm{Q}}},\) then S Q must be independent of q (but may still depend on time). Noting this and inserting the momentum field into the epistemic restriction of Eq. (17), we find that the probability density of q must satisfy the following differential equation: ∂ q ρ/ρ = −2π sin(πq)/cos(πq). Integrating this equation, we get \(\rho (q) = 2\,{\rm{co}}{{\rm{s}}^2}(\pi q),\) which is just the probability density of q corresponding to the quantum mechanical ground state of a particle in the box. One can check that the above momentum field, with the corresponding probability distribution of the position, automatically satisfies the uncertainty relation as shown in general in the main text in Eq. (10) and also in “Methods” subsection “An alternative derivation of the uncertainty relation.” In fact, calculating the variance of q, one directly gets \(\sigma _q^2 = {\int}_{ - 1/2}^{1/2} {\rm{d}}q\,{q^2}\rho (q)\) = \(2{\int}_{ - 1/2}^{1/2} {\rm{d}}q\,{q^2}{\rm{co}}{{\rm{s}}^2}(\pi q) = ({\pi ^2} - 6){\rm{/}}12{\pi ^2}.\) On the other hand, calculating the variance of p one obtains \(\sigma _p^2 = {\int}_{ - 1/2}^{1/2} {\rm{d}}q{\int} {\rm{d}}\xi \,p{(q;\xi )^2}\mu (\xi )\rho (q)\) = \(2{\pi ^2}{\hbar ^2}{\int}_{ - 1/2}^{1/2} {\rm{d}}q\,{\rm{si}}{{\rm{n}}^{\rm{2}}}(\pi q) = {\pi ^2}{\hbar ^2}\) where we have used Eq. (7). Hence, one has \({\sigma _q}{\sigma _p} = \sqrt {({\pi ^2} - 6){\rm{/}}3} \hbar {\rm{/}}2 \ge \hbar {\rm{/}}2\).
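The box example lends itself to a direct numerical check. The following sketch (units with ħ = 1, a two-valued ξ = ±ħ with equal weights as one assumed distribution with zero mean and mean square ħ², and illustrative helper names) verifies by finite differences that ρ(q) = 2 cos²(πq) solves the differential equation above, and then evaluates the two variances with a midpoint rule whose nodes never touch the walls.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption)


def rho(q):
    """Candidate density 2 cos^2(pi q) on the box -1/2 <= q <= 1/2."""
    return 2.0 * math.cos(math.pi * q) ** 2


def momentum(q, xi):
    """Random momentum field p(q; xi) = -xi * pi * tan(pi q)."""
    return -xi * math.pi * math.tan(math.pi * q)


def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule; the midpoints avoid the singular walls."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


# 1) rho'/rho = -2 pi tan(pi q), checked with central differences at sample points.
EPS = 1e-6
ode_residual = max(
    abs((rho(q + EPS) - rho(q - EPS)) / (2.0 * EPS) / rho(q)
        + 2.0 * math.pi * math.tan(math.pi * q))
    for q in (-0.4, -0.1, 0.05, 0.3))

# 2) Normalization and variances (the means of q and p vanish by symmetry).
norm = midpoint(rho, -0.5, 0.5)
var_q = midpoint(lambda q: q * q * rho(q), -0.5, 0.5)
var_p = midpoint(
    lambda q: 0.5 * (momentum(q, HBAR) ** 2 + momentum(q, -HBAR) ** 2) * rho(q),
    -0.5, 0.5)
product = math.sqrt(var_q * var_p)
print(ode_residual, norm, var_q, var_p, product)
```

Under these assumptions the computed variances agree with the closed-form values quoted above, and the product of standard deviations exceeds ħ/2.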

Proof of Theorem 2

Theorem 2: Assume that an ensemble of trajectories satisfies the epistemic restrictions of Eqs. (6) and (7), where σns = ħ is constant in space. The phase space (ensemble) average \({\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) of any physical quantity \({\cal O}(p,q)\) up to second order in p is then equal to the quantum mechanical expectation value \(\left\langle \psi \right|\hat {\cal O}\left| \psi \right\rangle \):

$${\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} \equiv {\int} {\rm{d}}q{\rm{d}}\xi {\rm{d}}p\,{\cal O}(p,q){\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\rho (q) = \left\langle {\psi \left| {\hat {\cal O}} \right|\psi } \right\rangle ,$$
(18)

where \(\hat {\cal O}\) is the Hermitian operator obtained by applying the Dirac canonical quantization scheme to \({\cal O}(p,q)\) with a specific ordering of operators, and the wave function \(\psi (q;t) = \left\langle {q\left| \psi \right.} \right\rangle \) is defined as:

$$\psi (q;t) \equiv \sqrt {\rho (q;t)} {\rm{exp}}\left( {i{S_{\rm{Q}}}(q;t){\rm{/}}\hbar } \right).$$
(19)

Proof. Let us first calculate, within the ontological model developed in the main text, the phase space (ensemble) average of a general classical physical quantity up to second order in momentum:

$${\cal O}(p,q) = \left( {{g^{ij}}(q){\rm{/}}2} \right)\left( {{p_i} - {A_i}(q)} \right)\left( {{p_j} - {A_j}(q)} \right) + V(q),$$
(20)

where g ij(q) = g ji(q), A j (q) and V(q) are real-valued functions and summation over repeated indices is assumed. One must evaluate the following integral:

$${\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} \equiv {\int} {\rm{d}}q{\rm{d}}\xi {\rm{d}}p\,{\cal O}(p,q){\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\rho (q).$$
(21)

First, inserting Eqs. (6) and (20) into Eq. (21) one directly obtains, after a trivial integration over p,

$${{\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = {\int} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi \left[ {\frac{{{g^{ij}}}}{2}\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i} + \frac{\xi }{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)\left( {{\partial _{{q_j}}}{S_{\rm{Q}}} - {A_j} + \frac{\xi }{2}\frac{{{\partial _{{q_j}}}\rho }}{\rho }} \right) + V} \right] \mu (\xi )\rho (q).}$$
(22)

Expanding the multiplication in the bracket, integrating over ξ and noting Eq. (7), one gets

$${\left\langle {\cal O} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = {\int} {\rm{d}}q{\kern 1pt} \left[ {\frac{{{g^{ij}}}}{2}\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)\left( {{\partial _{{q_j}}}{S_{\rm{Q}}} - {A_j}} \right) + V + \frac{{{\hbar ^2}}}{2}\frac{{{g^{ij}}}}{4}\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)\left( {\frac{{{\partial _{{q_j}}}\rho }}{\rho }} \right)} \right]\rho .$$
(23)

Now let us proceed to evaluate the last term on the right-hand side of Eq. (23). Using the following mathematical identity

$$\frac{1}{4}\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)\left( {\frac{{{\partial _{{q_j}}}\rho }}{\rho }} \right) = - \frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} + \frac{1}{2}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}\rho }}{\rho },$$
(24)

where R Q ≡ \(\sqrt \rho \), we first have

$$I \equiv {\int} {\rm{d}}q{\kern 1pt} \frac{{{\hbar ^2}}}{2}\frac{{{g^{ij}}}}{4}\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)\left( {\frac{{{\partial _{{q_j}}}\rho }}{\rho }} \right)\rho = - {\int} {\rm{d}}q\frac{{{\hbar ^2}}}{2}\left( {{g^{ij}}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}\rho - \frac{{{g^{ij}}}}{2}{\partial _{{q_i}}}{\partial _{{q_j}}}\rho } \right).$$
(25)

Integrating the second term by parts once, noting that σ ns = ħ is spatially uniform (i.e., ∂ q ħ = 0), and that \(\rho = R_{\rm{Q}}^2,\) we obtain:

$$\begin{array}{*{20}{l}} I \hfill & \hskip-8pt = \hfill &\hskip-7pt { - \frac{{{\hbar ^2}}}{2}{\int} {\rm{d}}q{\kern 1pt} \left( {{g^{ij}}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}\rho + \frac{{{\partial _{{q_i}}}{g^{ij}}}}{2}{\partial _{{q_j}}}\rho } \right)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt { - \frac{{{\hbar ^2}}}{2}{\int} {\rm{d}}q{\kern 1pt} \left( {{g^{ij}}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}\rho + {\partial _{{q_i}}}{g^{ij}}\frac{{{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}\rho } \right).} \hfill \\ \end{array}$$
(26)
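The identity of Eq. (24) used in this step can be checked numerically in one dimension. The sketch below uses central finite differences on a hypothetical smooth, strictly positive, non-Gaussian density; the density and the helper names are assumptions made only for illustration, and normalization plays no role in the identity.

```python
import math


def rho(q):
    """Hypothetical positive, non-Gaussian density (unnormalized)."""
    return (1.0 + 0.5 * q * q) * math.exp(-q * q / 2.0)


def root(q):
    """R_Q = sqrt(rho)."""
    return math.sqrt(rho(q))


def d1(f, q, h=1e-4):
    """Central first derivative."""
    return (f(q + h) - f(q - h)) / (2.0 * h)


def d2(f, q, h=1e-4):
    """Central second derivative."""
    return (f(q + h) - 2.0 * f(q) + f(q - h)) / (h * h)


def identity_residual(q):
    """Left-hand side minus right-hand side of Eq. (24) in one dimension."""
    lhs = 0.25 * (d1(rho, q) / rho(q)) ** 2
    rhs = -d2(root, q) / root(q) + 0.5 * d2(rho, q) / rho(q)
    return lhs - rhs


worst = max(abs(identity_residual(q)) for q in (-1.5, -0.3, 0.0, 0.7, 2.0))
print(worst)
```

The residual is at the level of the finite-difference error, as expected for an algebraic identity that follows from \(\rho = R_{\rm{Q}}^2\) alone.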

Substituting back into Eq. (23) yields

$$\begin{array}{*{20}{l}} {{{\left\langle {\cal O} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\rm{d}}q{\kern 1pt} \left[ {\frac{{{g^{ij}}}}{2}\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)\left( {{\partial _{{q_j}}}{S_{\rm{Q}}} - {A_j}} \right) + V} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. { - \frac{{{\hbar ^2}}}{2}\left( {{g^{ij}}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} + {\partial _{{q_i}}}{g^{ij}}\frac{{{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}} \right)} \right]\rho .} \hfill \\ \end{array}$$
(27)

We can show that Eq. (27), which is the ensemble average of \({\cal O}(p,q)\) of Eq. (20) within the ontological model, is exactly equal to the quantum mechanical expectation value given by the right-hand side of Eq. (18), as mentioned in Theorem 2. To do this, let us calculate the quantum mechanical expectation value of the following Hermitian operator (quantum observable):

$$\hat {\cal O} = (1{\rm{/}}2)\left( {{{\hat p}_i} - {A_i}\left( {\hat q} \right)} \right){g^{ij}}\left( {\hat q} \right)\left( {{{\hat p}_j} - {A_j}\left( {\hat q} \right)} \right) + V\left( {\hat q} \right),$$
(28)

over a quantum state \(\left| \psi \right\rangle \). Note that we have chosen a specific ordering in which \({g^{ij}}(\hat q)\) is sandwiched between two operators \(\left( {{{\hat p}_i} - {A_i}(\hat q)} \right).\) In the position representation, writing \(\left\langle {{q_i}\left| {{{\hat p}_i}} \right|q_i^\prime } \right\rangle = - i\hbar {\partial _{{q_i}}}\delta \left( {q_i^\prime - {q_i}} \right),\) we have to compute

$$\left\langle {\psi \left| {\hat {\cal O}} \right|\psi } \right\rangle = {\int} {\rm{d}}q{\kern 1pt} {\psi ^*}\left[ {\frac{1}{2}\left( { - i\hbar {\partial _{{q_i}}} - {A_i}} \right){g^{ij}}(q)\left( { - i\hbar {\partial _{{q_j}}} - {A_j}} \right) + V} \right]\psi .$$
(29)

Now inserting the wave function in Eq. (19), namely ψ = R Q exp(iS Q/ħ), R Q = \(\sqrt \rho ,\) evaluating the spatial differentiations straightforwardly, and again recalling that σ ns = ħ is spatially uniform, we can divide the integral into real and imaginary parts I r and I i:

$$\left\langle {\psi \left| {\hat {\cal O}} \right|\psi } \right\rangle = {I_{\rm{r}}} + {I_{\rm{i}}},$$
(30)

where the real part is

$$\begin{array}{*{20}{l}} {{I_{\rm{r}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\rm{d}}q{\kern 1pt} \left[ {\frac{{{g^{ij}}}}{2}\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)\left( {{\partial _{{q_j}}}{S_{\rm{Q}}} - {A_j}} \right) + V} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. { - \frac{{{\hbar ^2}}}{2}\left( {{g^{ij}}\frac{{{\partial _{{q_i}}}{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} + {\partial _{{q_i}}}{g^{ij}}\frac{{{\partial _{{q_j}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}} \right)} \right]\rho ,} \hfill \\ \end{array}$$
(31)

and, since \({g^{ij}} = {g^{ji}},\) the imaginary part is

$$\begin{array}{*{20}{l}} {{I_{\rm{i}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {i\frac{\hbar }{2}{\int} {\rm{d}}q{\kern 1pt} \left[ { - {g^{ij}}{\partial _{{q_i}}}R_{\rm{Q}}^2{\partial _{{q_j}}}{S_{\rm{Q}}} - {\partial _{{q_i}}}{g^{ij}}R_{\rm{Q}}^2{\partial _{{q_j}}}{S_{\rm{Q}}} - {g^{ij}}R_{\rm{Q}}^2{\partial _{{q_i}}}{\partial _{{q_j}}}{S_{\rm{Q}}}} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. { + {g^{ij}}{\partial _{{q_i}}}R_{\rm{Q}}^2{A_j} + {\partial _{{q_i}}}{g^{ij}}R_{\rm{Q}}^2{A_j} + {g^{ij}}R_{\rm{Q}}^2{\partial _{{q_i}}}{A_j}} \right].} \hfill \\ \end{array}$$
(32)

The real part \({I_{\rm{r}}}\) is already identical to the right-hand side of Eq. (27), so we only need to check that the imaginary part \({I_{\rm{i}}}\) vanishes. Integrating the third term on the right-hand side of Eq. (32) by parts, and assuming that the boundary terms vanish, cancels the first and second terms; likewise, integrating the sixth term by parts cancels the fourth and fifth terms. Hence \({I_{\rm{i}}} = 0,\) as it must, since \(\hat {\cal O}\) as defined in Eq. (28) is a Hermitian operator and its quantum mechanical expectation value is therefore real.
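The equality of the quantum expectation value and the ensemble average can also be checked numerically in a simple special case. The following is a minimal sketch, not part of the derivation: it assumes one dimension, natural units, a constant metric \({g^{ij}} = 1{\rm{/}}m,\) a vanishing vector potential, a harmonic test potential, and a Gaussian density with a quadratic test phase. All names and numerical values are hypothetical choices made for illustration.

```python
import cmath
import math

# All numbers below are illustrative assumptions, not values from the text.
HBAR = 1.0            # natural units
MASS = 1.0            # g^{ij} = 1/m is constant, so its derivative term drops out
SIGMA = 0.8           # width of the Gaussian density rho
K0, CHIRP = 1.3, 0.4  # test phase S_Q(q) = K0*q + CHIRP*q**2


def amplitude(q):
    """R_Q = sqrt(rho) for a normalized Gaussian rho of width SIGMA."""
    return (2.0 * math.pi * SIGMA**2) ** -0.25 * math.exp(-q * q / (4.0 * SIGMA**2))


def psi(q):
    """psi = R_Q * exp(i S_Q / hbar)."""
    return amplitude(q) * cmath.exp(1j * (K0 * q + CHIRP * q * q) / HBAR)


def potential(q):
    """Harmonic test potential V(q) = q^2 / 2 (assumption)."""
    return 0.5 * q * q


def quantum_expectation(q_max=10.0, n=8001):
    """<psi| p^2/(2m) + V |psi>, cf. Eq. (29), with a central-difference Laplacian."""
    h = 2.0 * q_max / (n - 1)
    total = 0j
    for i in range(n):
        q = -q_max + i * h
        laplacian = (psi(q + h) - 2.0 * psi(q) + psi(q - h)) / (h * h)
        h_psi = -HBAR**2 / (2.0 * MASS) * laplacian + potential(q) * psi(q)
        total += psi(q).conjugate() * h_psi * h
    return total


def ensemble_average(q_max=10.0, n=8001):
    """Right-hand side of Eq. (27) for constant g^{ij} = 1/m and A = 0."""
    h = 2.0 * q_max / (n - 1)
    total = 0.0
    for i in range(n):
        q = -q_max + i * h
        ds = K0 + 2.0 * CHIRP * q                                   # dS_Q/dq
        d2r_over_r = q * q / (4.0 * SIGMA**4) - 1.0 / (2.0 * SIGMA**2)  # R_Q''/R_Q
        rho = amplitude(q) ** 2
        integrand = ds * ds / (2.0 * MASS) + potential(q) - HBAR**2 / (2.0 * MASS) * d2r_over_r
        total += integrand * rho * h
    return total


qm = quantum_expectation()
print("quantum  :", qm.real, "imaginary part:", qm.imag)
print("ensemble :", ensemble_average())
```

In this sketch the real part of the quantum expectation value agrees with the ensemble average up to the discretization error of the finite-difference Laplacian, and the imaginary part is zero up to rounding, in line with the cancellation argument above.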

An alternative derivation of the uncertainty relation

In the main text, the uncertainty relation of Eq. (10) is derived via Theorem 2. (See Eq. 8.) Here we show that the uncertainty relation is in fact directly implied by the choice of the epistemic restriction of Eqs. (6) and (7).

Let us consider the pair of canonically conjugate variables \(({p_i},{q_i})\) corresponding to the i-th degree of freedom. First, the normalization condition \({\int} {\rm{d}}q{\kern 1pt} \rho (q) = 1\) of the probability density ρ(q) can be written, via integration by parts and assuming that ρ vanishes sufficiently fast at the boundary, as

$$ - 1 = {\int} {\rm{d}}q{\kern 1pt} \left( {{q_i} - {q_{{0_i}}}} \right){\partial _{{q_i}}}\rho = {\int} {\rm{d}}q\left( {{q_i} - {q_{{0_i}}}} \right){\sqrt \rho}\, {\frac {{\partial _{{q_i}}}\rho}{{\sqrt \rho}}} ,$$

where \({q_{{0_i}}}\) is an arbitrary real number. Applying the Cauchy–Schwarz inequality to the integral on the right, we find

$${\int} {\rm{d}}q{\kern 1pt} {\left( {{q_i} - {q_{{0_i}}}} \right)^2}\rho (q){\int} {\rm{d}}q{\kern 1pt} {\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)^2}\rho (q) \ge 1.$$
(33)
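As a quick sanity check of the normalization identity and of Eq. (33), both can be evaluated numerically for a density that is not Gaussian. The sketch below assumes a one-dimensional two-component Gaussian mixture; the mixture weights, centers, widths, and function names are hypothetical and chosen only for illustration.

```python
import math


def gauss(q, mu, s):
    """Normalized Gaussian with mean mu and standard deviation s."""
    return math.exp(-(q - mu) ** 2 / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))


def rho(q):
    """Hypothetical test density: a mixture of two Gaussians."""
    return 0.6 * gauss(q, -1.0, 0.5) + 0.4 * gauss(q, 1.5, 0.9)


def drho(q):
    """Analytic derivative of rho with respect to q."""
    return (0.6 * gauss(q, -1.0, 0.5) * (-(q + 1.0) / 0.5**2)
            + 0.4 * gauss(q, 1.5, 0.9) * (-(q - 1.5) / 0.9**2))


def integrate(f, lo=-12.0, hi=12.0, n=12001):
    """Plain Riemann sum; adequate here because the integrands decay rapidly."""
    h = (hi - lo) / (n - 1)
    return h * sum(f(lo + i * h) for i in range(n))


def check(q0):
    """Return (integral of (q - q0) * drho, left-hand side of Eq. (33))."""
    identity = integrate(lambda q: (q - q0) * drho(q))
    spread = integrate(lambda q: (q - q0) ** 2 * rho(q))
    fisher = integrate(lambda q: (drho(q) / rho(q)) ** 2 * rho(q))
    return identity, spread * fisher


for q0 in (-2.0, 0.0, 3.0):
    identity, product = check(q0)
    print(f"q0 = {q0:+.1f}: identity = {identity:.6f}, product = {product:.4f}")
```

For every choice of the reference point the first number should be −1 up to numerical error, and the product of the two integrals should not fall below 1.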

Now we choose \({q_{{0_i}}} \!=\! {\int} {\rm{d}}q{\kern 1pt} {q_i}\rho (q) \equiv {\left\langle {{q_i}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) and we write \(\sigma _{{{{q}}_i}}^2 \equiv {\int} {\rm{d}}q{( {{q_i} - {{\langle {{q_i}} \rangle }_{\{ {{S_{\rm{Q}}},\rho } \}}}} )^2}\rho (q).\) Multiplying both sides of Eq. (33) by \(\sigma _{{\rm{ns}}}^2{\rm{/}}4\) and recalling that \({\sigma _{{\rm{ns}}}}\) is independent of q, we get:

$$\sigma _{{{{q}}_i}}^2{\int} {\rm{d}}q{\kern 1pt} {\left( {\frac{{{\sigma _{{\rm{ns}}}}}}{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)^{\!\!2}}\rho (q) \ge \frac{{\sigma _{{\rm{ns}}}^2}}{4}.$$
(34)

On the other hand, the variance of \({p_i}\) at any time can be evaluated as:

$$\begin{array}{*{20}{l}} {\sigma _{{{{p}}_i}}^2} \equiv {{\int} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi {\kern 1pt} {\rm{d}}p{\kern 1pt} {{\left( {{p_i} - {{\left\langle {{p_i}} \right\rangle }_{\{ {S_{\rm{Q}}},\rho \} }}} \right)}^{\!\!2}}{\rm{P}}\left( {{p_i}\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\rho (q)} \hfill \\ = {{\int} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi {\kern 1pt} {{\left( {\frac{\xi }{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho } + \left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {{\left\langle {{p_i}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \right)} \right)}^{\!\!2}}\mu (\xi )\rho (q)} \hfill \\ = {{\int} {\rm{d}}q{\kern 1pt} \left( {\frac{{\overline {{\xi ^2}} }}{4}} \right){{\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^{\!2}}\rho (q) + {\int} {\rm{d}}q{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {{\left\langle {{p_i}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \right)}^2}\rho (q)} \hfill \\ \quad \ge {{\int} {\rm{d}}q\left( {\frac{{\overline {{\xi ^2}} }}{4}} \right){{\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^{\!2}}\rho (q) = {\int} {\rm{d}}q{{\left( {\frac{{{\sigma _{{\rm{ns}}}}}}{2}\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^{\!2}}\rho (q),} \hfill \end{array}$$
(35)

where from the first to the second line we have used Eq. (6), and to get the third line we have imposed Eq. (7). Finally, multiplying both sides of Eq. (35) by \(\sigma _{{q_i}}^2,\) using Eq. (34), and taking the square root, one obtains

$${\sigma _{{q_i}}}{\sigma _{{p_i}}} \ge \frac{{{\sigma _{{\rm{ns}}}}}}{2} = \frac{\hbar }{2},$$
(36)

where we have again used \({\sigma _{{\rm{ns}}}} = \hbar \) from Eq. (7).
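The two contributions to the momentum variance in Eq. (35) can be made concrete with a small numerical sketch. It assumes a one-dimensional Gaussian density and a quadratic test phase \({S_{\rm{Q}}}(q) = a{q^2}\); the width, the chirp parameter a, and the function names are hypothetical. With a = 0 only the fluctuation term survives and the bound of Eq. (36) is saturated, whereas a nonzero a adds the non-negative phase-gradient term.

```python
import math

HBAR = 1.0   # natural units (assumption)
SIGMA = 0.7  # width of the Gaussian test density (assumption)


def rho(q):
    """Normalized Gaussian density centered at the origin."""
    return math.exp(-q * q / (2.0 * SIGMA**2)) / (SIGMA * math.sqrt(2.0 * math.pi))


def integrate(f, lo=-10.0, hi=10.0, n=20001):
    """Plain Riemann sum over a window that contains essentially all of rho."""
    h = (hi - lo) / (n - 1)
    return h * sum(f(lo + i * h) for i in range(n))


def sigma_product(chirp):
    """sigma_q * sigma_p for S_Q(q) = chirp * q**2, using the two terms of Eq. (35)."""
    def ds(q):
        return 2.0 * chirp * q  # dS_Q/dq

    mean_q = integrate(lambda q: q * rho(q))
    mean_p = integrate(lambda q: ds(q) * rho(q))  # the xi-term averages to zero
    var_q = integrate(lambda q: (q - mean_q) ** 2 * rho(q))
    # For the Gaussian density, (d rho/dq) / rho = -q / SIGMA**2.
    noise = integrate(lambda q: (0.5 * HBAR * q / SIGMA**2) ** 2 * rho(q))
    drift = integrate(lambda q: (ds(q) - mean_p) ** 2 * rho(q))
    return math.sqrt(var_q * (noise + drift))


print("no phase gradient:", sigma_product(0.0))  # saturates hbar/2
print("chirped phase    :", sigma_product(0.5))  # strictly larger than hbar/2
```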

Proof of Theorem 3

Theorem 3: Consider an ensemble of trajectories satisfying Eqs. (6) and (7), where σns = ħ is constant in space and time. Given a classical Hamiltonian H(p,q) that is at most quadratic in p, and an ensemble of trajectories conserving the average energy and probability current, there is a unique time evolution for ψ given by the (linear and unitary) Schrödinger equation:

$$i\hbar \frac{{\rm{d}}}{{{\rm{d}}t}}\left| \psi \right\rangle = \hat H\left| \psi \right\rangle ,$$
(37)

where \(\hat H\) is a Hermitian operator again having the same form as that obtained by applying the Dirac canonical quantization procedures to H(p,q) with a specific ordering of operators.

Proof. Again, we prove the theorem by considering an ensemble for a single particle of mass m moving in three dimensions in time-independent scalar and vector potentials V(q) and \(A(q) = ({A_1},{A_2},{A_3}),\) so that the classical Hamiltonian is given by Eq. (12). First, from Eq. (12), the velocity field is related to the momentum as \({\dot q_i}(p) = \partial H{\rm{/}}\partial {p_i} = \left( {{p_i} - {A_i}} \right){\rm{/}}m,\) so that noting Eqs. (6) and (7), we obtain the average velocity field over the fluctuations of ξ as \(\overline {{{\dot q}_i}} (q;t) \equiv {\int} {\rm{d}}\xi {\kern 1pt} {\rm{d}}p{\kern 1pt} {\dot q_i}\left( {{p_i}} \right){\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\) = \({\int} {\rm{d}}\xi {\kern 1pt} {\rm{d}}{p_i}{\kern 1pt} \left( {{p_i} - {A_i}} \right)\delta \left( {{p_i} - \left[ {{\partial _{{q_i}}}{S_{\rm{Q}}} + (\xi {\rm{/}}2){\partial _{{q_i}}}\rho {\rm{/}}\rho } \right]} \right)\mu (\xi ){\rm{/}}m\) = \(\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right){\rm{/}}m.\) In this case, the assumption of conservation of probability current (no creation or annihilation of trajectories) implies that ρ(q;t) satisfies the following continuity equation:

$${\partial _t}\rho + \mathop {\sum}\limits_i \frac{1}{m}{\partial _{{q_i}}}\left( {\rho \left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)} \right) = 0.$$
(38)

On the other hand, the conservation of average energy requires the ensemble of trajectories to satisfy the following equation:

$$\frac{{\rm{d}}}{{{\rm{d}}t}}{\left\langle H \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = 0,$$
(39)

where \({\left\langle H \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) is the ensemble (phase space) average of the classical energy within the ontological model defined in Eq. (8).

We impose the constraints of conservation of probability current and average energy (Eqs. (38) and (39)) on the dynamics of the ensemble of trajectories. First, from Eqs. (6) and (12), the ensemble average of energy is

$$\begin{array}{*{20}{l}} {{{\left\langle H \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi {\kern 1pt} {\rm{d}}p{\kern 1pt} \left[ {\mathop {\sum}\limits_i \frac{{{{\left( {{p_i} - {A_i}(q)} \right)}^2}}}{{2m}} + V(q)} \right]{\rm{P}}\left( {p\left| {q,\xi ,{S_{\rm{Q}}},\rho } \right.} \right)\mu (\xi )\rho (q)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi {\kern 1pt} \left[ {\mathop {\sum}\limits_i \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} + \xi \left( {{\partial _{{q_i}}}\rho } \right){\rm{/}}2\rho - {A_i}} \right)}^2}}}{{2m}} + V} \right]\mu (\xi )\rho (q)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\rm{d}}q{\kern 1pt} \rho \left[ {\mathop {\sum}\limits_i \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}^2}}}{{2m}} + \frac{{{\hbar ^2}}}{{8m}}{{\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^2} + V} \right],} \hfill \end{array}$$
(40)

where the second line follows from a trivial integration over p and the third line is due to Eq. (7). Differentiating with respect to time and assuming that \({\sigma _{{\rm{ns}}}} = \hbar \) is constant in time, we get

$$\begin{array}{*{20}{l}} {\frac{{\rm{d}}}{{{\rm{d}}t}}{{\left\langle H \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} ({\partial _t}\rho )\,\left[ {\mathop {\sum}\limits_i {\kern 1pt} \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}^2}}}{{2m}} + \frac{{{\hbar ^2}}}{{8m}}{{\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^2} + V} \right]} \hfill \\ {} \hfill & {} \hfill & { + {\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} \rho \left[ {\mathop {\sum}\limits_i {\kern 1pt} \frac{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}}{m}{\partial _{{q_i}}}{\partial _t}{S_{\rm{Q}}} + \frac{{{\hbar ^2}}}{{4m}}\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right){\kern 1pt} {\partial _{{q_i}}}\left( {\frac{{{\partial _t}\rho }}{\rho }} \right)} \right].} \hfill \end{array}$$
(41)

Integrating by parts the two terms in the second line, noting that \({\partial _q}\hbar = 0,\) and using Eq. (38), we can rewrite Eq. (41) as:

$$\frac{{\rm{d}}}{{{\rm{d}}t}}{\left\langle H \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = {\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} {\partial _t}\rho {\kern 1pt} \left( {{\partial _t}{S_{\rm{Q}}} + \mathop {\sum}\limits_i {\kern 1pt} {\textstyle{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}^2}} \over {2m}}} + {\textstyle{{{\hbar ^2}} \over {2m}}}{\kern 1pt} \left[ {\frac{1}{4}{{\left( {\frac{{{\partial _{{q_i}}}\rho }}{\rho }} \right)}^{\!\!2}} - \frac{1}{2}\frac{{\partial _{{q_i}}^2\rho }}{\rho }} \right] + V} \right).$$
(42)

Using the identity of Eq. (24) and imposing the requirement of the conservation of average energy (Eq. 39) we obtain:

$${\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} {\partial _t}\rho \left( {{\partial _t}{S_{\rm{Q}}} + \mathop {\sum}\limits_i {\kern 1pt} \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}^2}}}{{2m}} - \frac{{{\hbar ^2}}}{{2m}}\frac{{\partial _{{q_i}}^2{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} + V} \right) = 0.$$
(43)

For Eq. (43) to hold for arbitrary \({\partial _t}\rho ,\) the expression in parentheses in the integrand must vanish identically. We finally get

$${\partial _t}{S_{\rm{Q}}} + \mathop {\sum}\limits_i {\kern 1pt} \frac{{{{\left( {{\partial _{{q_i}}}{S_{\rm{Q}}} - {A_i}} \right)}^2}}}{{2m}} - \frac{{{\hbar ^2}}}{{2m}}\frac{{\partial _{{q_i}}^2{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} + V = 0.$$
(44)

We thus have a pair of coupled equations, Eqs. (38) and (44), which govern the time evolution of ρ(q;t) and \({S_{\rm{Q}}}(q;t),\) respectively, and which arise from the assumptions of conservation of probability current and conservation of average energy. Using the definition of the wave function given by Eq. (9), noting that \({R_{\rm{Q}}} = \sqrt \rho ,\) and bearing in mind the assumption that \({\sigma _{{\rm{ns}}}} = \hbar \) is constant in space and time, we can recast Eqs. (38) and (44) into the following compact form:

$$i\hbar {\partial _t}\psi = \left( {\mathop {\sum}\limits_i {\kern 1pt} \frac{1}{{2m}}{{\left( { - i\hbar {\partial _{{q_i}}} - {A_i}} \right)}^2} + V} \right){\kern 1pt} \psi .$$
(45)

Equation (45) is just the familiar Schrödinger equation in the position representation, for a quantum particle of mass m in three-dimensional space subject to a scalar potential V(q) and a vector potential \(A(q) = ({A_1},{A_2},{A_3}),\) with a Hermitian quantum Hamiltonian \(\hat H = \mathop {\sum}\nolimits_i {\kern 1pt} {\left( {{{\hat p}_i} - {A_i}\left( {\hat q} \right)} \right)^2}{\rm{/}}\left( {2m} \right) + V\left( {\hat q} \right).\)
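A familiar stationary state gives a simple consistency check of the pair of Eqs. (38) and (44). The sketch below assumes a one-dimensional harmonic oscillator with vanishing vector potential and takes its ground state, \({R_{\rm{Q}}} \propto \exp ( - m\omega {q^2}{\rm{/}}2\hbar )\) and \({S_{\rm{Q}}} = - (\hbar \omega {\rm{/}}2)t\); the parameter values and function names are hypothetical. For this state ρ is time independent and \({\partial _q}{S_{\rm{Q}}} = 0,\) so Eq. (38) holds trivially, and only Eq. (44) needs to be tested.

```python
import math

# Illustrative parameters (assumptions, natural units).
HBAR, MASS, OMEGA = 1.0, 1.0, 2.0

DT_S = -0.5 * HBAR * OMEGA  # dS_Q/dt for S_Q = -E t with E = hbar * omega / 2
DQ_S = 0.0                  # the ground-state phase has no spatial gradient


def amplitude(q):
    """Unnormalized ground-state amplitude R_Q (normalization cancels in R''/R)."""
    return math.exp(-MASS * OMEGA * q * q / (2.0 * HBAR))


def potential(q):
    """Harmonic potential V(q) = m omega^2 q^2 / 2."""
    return 0.5 * MASS * OMEGA**2 * q * q


def quantum_term(q, h=1e-3):
    """-(hbar^2 / 2m) R_Q'' / R_Q with a central-difference second derivative."""
    d2r = (amplitude(q + h) - 2.0 * amplitude(q) + amplitude(q - h)) / (h * h)
    return -HBAR**2 / (2.0 * MASS) * d2r / amplitude(q)


def residual_eq44(q):
    """Left-hand side of Eq. (44) in one dimension with A = 0."""
    return DT_S + DQ_S**2 / (2.0 * MASS) + quantum_term(q) + potential(q)


def residual_classical(q):
    """Same expression without the hbar^2 term, i.e. the Hamilton-Jacobi form."""
    return DT_S + DQ_S**2 / (2.0 * MASS) + potential(q)


for q in (-1.5, -0.5, 0.0, 0.7, 1.5):
    print(f"q = {q:+.1f}: Eq. (44) residual = {residual_eq44(q):+.2e}, "
          f"without hbar^2 term = {residual_classical(q):+.3f}")
```

The residual of Eq. (44) vanishes at every sampled point up to the finite-difference error, while the expression without the \({\hbar ^2}\) term does not, which illustrates that the same phase function cannot solve the classical Hamilton–Jacobi equation.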

This derivation of the Schrödinger equation closely parallels the derivation of the classical Hamilton–Jacobi equation given in “Methods” subsection “Proof of Theorem 1.” Both fundamental equations are developed within the same framework and are singled out by imposing two common axioms: conservation of average energy and conservation of trajectories (probability current). The only difference is that to derive the Hamilton–Jacobi equation one starts with Eq. (3), whereas to derive the Schrödinger equation one replaces the classical kinematics of Eq. (3) with Eq. (6). As discussed in “Results” subsection “Microscopic ontic extension and epistemic restriction” and in “Methods” subsection “Statistical interpretation of the epistemic restriction,” Eq. (6) can be interpreted as the manifestation of an epistemic restriction that is absent in classical mechanics.

Exactly the same framework and derivations apply to many-particle systems with a general classical Hamiltonian up to second order in momentum, as spelled out in Theorem 3. In particular, when there are interactions among degrees of freedom, the fundamental nonseparability of ξ plays a crucial role. An example of special interest, because it generates entanglement, appears in the next subsection. Various methods for deriving Schrödinger’s equation are also reported in refs. 42,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63. Our method is distinguished by modifying classical statistical mechanics via an ontic extension (introducing a global, nonseparable random variable ξ) and a specific form of epistemic restriction (Eq. 6), and by imposing the principles of conservation of average energy and conservation of probability current. We emphasize that without the ontic extension and the epistemic restriction, our derivation yields the classical Hamilton–Jacobi equation.

Schrödinger’s equation for measurement of angular momentum

Let us apply our model to derive the Schrödinger equation for a measurement of angular momentum, adopting von Neumann’s prescription for a measurement interaction. We will follow all the steps of the previous subsection. For simplicity, let us model the measurement setup via two interacting particles with positions denoted by \({q_{\rm{S}}}\) and \({q_{\rm{\Sigma }}}\); they represent, respectively, the position of the measured system and the position of the pointer of a measuring device. Let us denote the corresponding conjugate momenta by \({p_{\rm{S}}}\) and \({p_{\rm{\Sigma }}}.\) Without loss of generality, we consider a measurement of the z-component of angular momentum, \({l_{{z_{\rm{S}}}}} = {\left( {{q_{\rm{S}}} \times {p_{\rm{S}}}} \right)_z} = {x_{\rm{S}}}{p_{{y_{\rm{S}}}}} - {y_{\rm{S}}}{p_{{x_{\rm{S}}}}},\) where \({q_{\rm{S}}} \equiv \left( {{x_{\rm{S}}},{y_{\rm{S}}},{z_{\rm{S}}}} \right)\) and \({p_{\rm{S}}} \equiv \left( {{p_{{x_{\rm{S}}}}},{p_{{y_{\rm{S}}}}},{p_{{z_{\rm{S}}}}}} \right).\) The classical Hamiltonian corresponding to the von Neumann interaction is then

$${H_{\rm{I}}} = g{l_{{z_{\rm{S}}}}}{p_{\rm{\Sigma }}} = g\left( {{x_{\rm{S}}}{p_{{y_{\rm{S}}}}} - {y_{\rm{S}}}{p_{{x_{\rm{S}}}}}} \right){\kern 1pt} {p_{\rm{\Sigma }}},$$
(46)

where g is the interaction coupling. For simplicity, we take g to be a nonzero constant during the measurement and zero otherwise.

Using this interaction Hamiltonian \({H_{\rm{I}}}\) to express the velocity in terms of the momentum via \({\dot q_i}(p) = \partial {H_{\rm{I}}}{\rm{/}}\partial {p_i}\) and using Eqs. (6) and (7), we obtain the velocity field averaged over the fluctuations of ξ, i.e., \(\overline {{{\dot q}_i}} (q;t) \equiv {\int} {\kern 1pt} {\rm{d}}\xi {\rm{d}}p{\kern 1pt} {\dot q_i}(p){\kern 1pt} {\rm{P}}\left( {p{\rm{|}}q,\xi ,{S_{\rm{Q}}},\rho } \right){\kern 1pt} \mu (\xi ),\) which is given by:

$${\overline {\dot x} _{\rm{S}}} = - g{y_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}},\quad {\overline {\dot y} _{\rm{S}}} = g{x_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}},\quad {\overline {\dot q} _{\rm{\Sigma }}} = g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right),$$
(47)

and \({\overline {\dot z} _{\rm{S}}} = 0.\) Assuming that the probability current is conserved, we arrive at the following continuity equation:

$$\begin{array}{*{20}{l}} {{\partial _t}\rho - g{y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}\left( {\rho {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right) + g{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}\left( {\rho {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right) + g{x_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}\left( {\rho {\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}}} \right)} \hfill \\ {\quad - g{y_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}\left( {\rho {\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right) = 0.} \hfill \end{array}$$
(48)
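The averaged velocities of Eq. (47) follow because each velocity \(\partial {H_{\rm{I}}}{\rm{/}}\partial {p_i}\) is linear in the momenta, so the term proportional to ξ averages out. A minimal sketch of this step at a single configuration point is given below. It assumes, purely for illustration, that ξ takes the two values ±ħ with equal probability (one distribution with zero mean and second moment \({\hbar ^2}\)); the coupling, the local gradients, and all names are hypothetical.

```python
HBAR = 1.0  # natural units (assumption)
G = 0.7     # hypothetical interaction coupling


def momenta(grad_s, grad_log_rho, xi):
    """Momentum field used in the text: p_i = dS_Q/dq_i + (xi/2) (d rho/dq_i) / rho."""
    return {k: grad_s[k] + 0.5 * xi * grad_log_rho[k] for k in grad_s}


def mean_velocities(x, y, grad_s, grad_log_rho):
    """Average dH_I/dp_i over xi = +hbar and xi = -hbar (equal weights, an assumption)."""
    totals = {"x": 0.0, "y": 0.0, "sigma": 0.0}
    for xi in (HBAR, -HBAR):
        p = momenta(grad_s, grad_log_rho, xi)
        totals["x"] += -G * y * p["sigma"]                   # dH_I/dp_x
        totals["y"] += G * x * p["sigma"]                    # dH_I/dp_y
        totals["sigma"] += G * (x * p["y"] - y * p["x"])     # dH_I/dp_Sigma
    return {k: v / 2.0 for k, v in totals.items()}


# Hypothetical local values of the gradients of S_Q and of log(rho).
x_s, y_s = 0.4, -1.1
grad_s = {"x": 0.2, "y": -0.5, "sigma": 0.9}
grad_log_rho = {"x": 1.7, "y": -0.3, "sigma": 0.6}

print(mean_velocities(x_s, y_s, grad_s, grad_log_rho))
```

The printed values depend only on the gradients of \({S_{\rm{Q}}},\) in agreement with Eq. (47); the gradients of ρ enter the average only through quantities quadratic in the momenta, such as the energy.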

Now let us impose conservation of average energy. First, using Eq. (46) the ensemble average of energy reads

$$\begin{array}{*{20}{l}} {{{\left\langle {{H_{\rm{I}}}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} {\rm{d}}\xi {\kern 1pt} {\rm{d}}p{\kern 1pt} g\left( {{x_{\rm{S}}}{p_{{y_{\rm{S}}}}} - {y_{\rm{S}}}{p_{{x_{\rm{S}}}}}} \right){p_{\rm{\Sigma }}}} \hfill \\ {} \hfill & {} \hfill & { \times{\rm{P}}\left( {{p_{{x_{\rm{S}}}}},{p_{{y_{\rm{S}}}}},{p_{\rm{\Sigma }}}{\rm{|}}q,\xi ,{S_{\rm{Q}}},\rho } \right){\kern 1pt} \mu (\xi )\rho (q).} \hfill \end{array}$$
(49)

Inserting Eq. (6) and evaluating the integrations over p and ξ, we get, noting Eq. (7),

$$\begin{array}{*{20}{l}} {{{\left\langle {{H_{\rm{I}}}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} \left\{ {g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}\rho } \right.} \hfill \\ {} \hfill & {} \hfill & {\left. {\quad + g\frac{{{\hbar ^2}}}{4}\left[ {{x_{\rm{S}}}\left( {\frac{{{\partial _{{y_{\rm{S}}}}}\rho }}{\rho }} \right)\,\left( {\frac{{{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{\rho }} \right) - {y_{\rm{S}}}\left( {\frac{{{\partial _{{x_{\rm{S}}}}}\rho }}{\rho }} \right)\,\left( {\frac{{{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{\rho }} \right)} \right]{\kern 1pt} \rho } \right\}.} \hfill \end{array}$$
(50)

Taking the time derivative of both sides and noting that \({\sigma _{{\rm{ns}}}} = \hbar \) is constant in time, we obtain, after a long but straightforward calculation and some rearrangement,

$$\begin{array}{*{20}{l}} {\frac{{\rm{d}}}{{{\rm{d}}t}}{{\left\langle {{H_{\rm{I}}}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} \left\{ {\left( {{\partial _t}\rho } \right){\kern 1pt} g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right.} \hfill \\ {} \hfill & {} \hfill & { + \left( {g{y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}\left( {\rho {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right) - g{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}\left( {\rho {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right)} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. { - g{x_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}\left( {\rho {\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}}} \right) + g{y_{\rm{S}}}{\partial _{{q_{\rm{\Sigma }}}}}\left( {\rho {\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right)} \right){\kern 1pt} {\partial _t}{S_{\rm{Q}}}{\kern 1pt} } \hfill \\ {} \hfill & {} \hfill & { + g{\hbar ^2}{\kern 1pt} \left[ {{x_{\rm{S}}}{\kern 1pt} \left( {\frac{1}{4}\left( {\frac{{{\partial _{{y_{\rm{S}}}}}\rho }}{\rho }} \right)\,\left( {\frac{{{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{\rho }} \right) - \frac{{{\partial _{{y_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{{2\rho }}} \right)} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. {\left. { - {y_{\rm{S}}}{\kern 1pt} \left( {\frac{1}{4}\left( {\frac{{{\partial _{{x_{\rm{S}}}}}\rho }}{\rho }} \right)\,\left( {\frac{{{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{\rho }} \right) - \frac{{{\partial _{{x_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}\rho }}{{2\rho }}} \right)} \right]{\kern 1pt} {\partial _t}\rho } \right\}.} \hfill \end{array}$$
(51)

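Here, bearing in mind that ħ is constant in space, we have performed integrations by parts where appropriate. By Eq. (48), the factor multiplying \({\partial _t}{S_{\rm{Q}}}\) equals \({\partial _t}\rho ,\) so the corresponding terms reduce to \({\partial _t}\rho {\kern 1pt} {\partial _t}{S_{\rm{Q}}}\); moreover, the terms proportional to \({\hbar ^2}\) can be simplified by virtue of the identity of Eq. (24), so that the whole equation simplifies into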

$$\begin{array}{*{20}{l}} {\frac{{\rm{d}}}{{{\rm{d}}t}}{{\left\langle {{H_{\rm{I}}}} \right\rangle }_{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} \left( {{\partial _t}\rho } \right){\kern 1pt} \left\{ {{\partial _t}{S_{\rm{Q}}} + g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}} \right.} \hfill \\ {} \hfill & {} \hfill & {\left. { - g{\hbar ^2}\left( {{x_{\rm{S}}}\frac{{{\partial _{{y_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} - {y_{\rm{S}}}\frac{{{\partial _{{x_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}} \right)} \right\},} \hfill \end{array}$$
(52)

where \({R_{\rm{Q}}} \equiv \sqrt \rho .\) Conservation of average energy, \(({\rm{d/d}}t)\,{\left\langle {{H_{\rm{I}}}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = 0\) for any \({\partial _t}\rho ,\) means that the expression inside the braces in the integrand must vanish. So we have

$${\partial _t}{S_{\rm{Q}}} + g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}} - g{\hbar ^2}{\kern 1pt} \left( {{x_{\rm{S}}}\frac{{{\partial _{{y_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}} - {y_{\rm{S}}}\frac{{{\partial _{{x_{\rm{S}}}}}{\partial _{{q_{\rm{\Sigma }}}}}{R_{\rm{Q}}}}}{{{R_{\rm{Q}}}}}} \right) = 0.$$
(53)

We thus have a pair of coupled equations, Eqs. (48) and (53), arising, respectively, from the conservation of probability current and the conservation of average energy. Finally, applying the definition \(\psi = {R_{\rm{Q}}}\exp \left( {i{S_{\rm{Q}}}{\rm{/}}\hbar } \right)\) of the wave function in Eq. (9), and noting that \({\sigma _{{\rm{ns}}}} = \hbar \) is constant in space and time, we recast Eqs. (48) and (53) into

$$i\hbar \frac{{\rm{d}}}{{{\rm{d}}t}}\left| \psi \right\rangle = {\hat H_{\rm{I}}}\left| \psi \right\rangle .$$
(54)

Here \({\hat H_{\rm{I}}}\) is a Hermitian operator defined as:

$${\hat H_{\rm{I}}} \equiv g{\hat l_{{z_{\rm{S}}}}}{\hat p_{\rm{\Sigma }}},$$
(55)

where \({\hat p_i}\) is the quantum momentum operator for the i-th degree of freedom and \({\hat l_{{z_{\rm{S}}}}} \equiv {\hat x_{\rm{S}}}{\hat p_{{y_{\rm{S}}}}} - {\hat y_{\rm{S}}}{\hat p_{{x_{\rm{S}}}}}\) is the z-component of the quantum angular momentum operator of the measured system. Equation (54) is just the Schrödinger equation for a measurement of angular momentum via the von Neumann measurement interaction \({\hat H_{\rm{I}}}.\)

In this derivation of Schrödinger’s equation for a measurement interaction, the nonseparability of ξ plays a crucial role. To see this, let us instead suppose that ξ is separable into three random variables \(\xi = ( {{\xi _{{x_{\rm{S}}}}},{\xi _{{y_{\rm{S}}}}},{\xi _{{q_{\rm{\Sigma }}}}}} ),\) each associated with the respective degree of freedom \(\left( {{x_{\rm{S}}},{y_{\rm{S}}},{q_{\rm{\Sigma }}}} \right),\) with vanishing averages \({\overline \xi _{{x_{\rm{S}}}}} = {\overline \xi _{{y_{\rm{S}}}}} = {\overline \xi _{{q_{\rm{\Sigma }}}}} = 0,\) and such that the two variables in each of the pairs \(( {{\xi _{{x_{\rm{S}}}}},{\xi _{{q_{\rm{\Sigma }}}}}} )\) and \(( {{\xi _{{y_{\rm{S}}}}},{\xi _{{q_{\rm{\Sigma }}}}}} )\) are independent of each other (and thus uncorrelated): \(\overline {{\xi _{{x_{\rm{S}}}}}{\xi _{{q_{\rm{\Sigma }}}}}} = {\overline \xi _{{x_{\rm{S}}}}}{\overline \xi _{{q_{\rm{\Sigma }}}}} = 0 = {\overline \xi _{{y_{\rm{S}}}}}{\overline \xi _{{q_{\rm{\Sigma }}}}} = \overline {{\xi _{{y_{\rm{S}}}}}{\xi _{{q_{\rm{\Sigma }}}}}} .\) In this case, the last term in Eq. (50) (explicitly proportional to \({\hbar ^2}\)) vanishes, yielding

$${\left\langle {{H_{\rm{I}}}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = {\int} {\kern 1pt} {\rm{d}}q{\kern 1pt} g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}}\rho .$$
(56)

Identifying \({S_{\rm{Q}}}\) with the classical Hamilton’s principal function \({S_{\rm{C}}}\) and recalling Eq. (3), the above expression is just the conventional classical average energy.

Then, imposing the conservation of average energy \(({\rm{d/d}}t)\,{\left\langle {{H_{\rm{I}}}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}} = 0\) with \({\left\langle {{H_{\rm{I}}}} \right\rangle _{\left\{ {{S_{\rm{Q}}},\rho } \right\}}}\) given by Eq. (56) and using Eq. (48), instead of Eq. (53) we obtain

$${\partial _t}{S_{\rm{Q}}} + g\left( {{x_{\rm{S}}}{\partial _{{y_{\rm{S}}}}}{S_{\rm{Q}}} - {y_{\rm{S}}}{\partial _{{x_{\rm{S}}}}}{S_{\rm{Q}}}} \right){\kern 1pt} {\partial _{{q_{\rm{\Sigma }}}}}{S_{\rm{Q}}} = 0.$$
(57)

This is just the classical Hamilton–Jacobi equation, as can be seen by identifying \({S_{\rm{Q}}}\) with Hamilton’s principal function \({S_{\rm{C}}}\) and noting Eqs. (1) and (46). Comparing Eq. (57) with Eq. (53), we see that the last term in Eq. (53), which depends explicitly on \({\hbar ^2}\) (obtained with nonseparable ξ), is no longer present in Eq. (57) (obtained with separable ξ). This \({\hbar ^2}\)-dependent term is called the quantum potential in Bohmian mechanics and is generally argued to be responsible for classically puzzling quantum phenomena. Hence, the fundamental nonseparability of ξ plays a crucial role in the derivation of the Schrödinger equation for interacting systems. Since such an interaction implies quantum entanglement (for the measurement interaction above, we obtain entanglement between the system and the apparatus in the next subsection), the nonseparability of ξ is indeed crucial for obtaining quantum entanglement.
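The role of nonseparability can be illustrated by averaging the product of two momenta at a fixed configuration point, which is the building block of the \({\hbar ^2}\) term in Eq. (50). The sketch below compares a single shared ξ with two independent copies. It assumes, for illustration only, that each variable takes the values ±ħ with equal probability, so that its mean is zero and its second moment is \({\hbar ^2}\); the local gradients and the function name are hypothetical.

```python
from itertools import product

HBAR = 1.0  # natural units (assumption)


def mean_momentum_product(s_y, s_sigma, u_y, u_sigma, shared):
    """Average of p_y * p_Sigma over the fluctuations of xi.

    p_i = s_i + (xi_i / 2) * u_i, where s_i stands for dS_Q/dq_i and
    u_i for (d rho/dq_i) / rho at the chosen point.
    shared=True  : one global xi enters both momenta (nonseparable case).
    shared=False : independent xi_y and xi_Sigma (separable case).
    """
    if shared:
        samples = [(xi, xi) for xi in (HBAR, -HBAR)]
    else:
        samples = list(product((HBAR, -HBAR), repeat=2))
    total = 0.0
    for xi_y, xi_sigma in samples:
        p_y = s_y + 0.5 * xi_y * u_y
        p_sigma = s_sigma + 0.5 * xi_sigma * u_sigma
        total += p_y * p_sigma
    return total / len(samples)


# Hypothetical local gradients.
s_y, s_sigma, u_y, u_sigma = 0.3, -1.2, 0.8, 0.5

nonseparable = mean_momentum_product(s_y, s_sigma, u_y, u_sigma, shared=True)
separable = mean_momentum_product(s_y, s_sigma, u_y, u_sigma, shared=False)
print("shared xi      :", nonseparable)  # s_y*s_sigma + (hbar^2/4)*u_y*u_sigma
print("independent xi :", separable)     # s_y*s_sigma only
```

With a shared ξ the average retains the cross term \(({\hbar ^2}{\rm{/}}4)\) times the product of the two logarithmic gradients of ρ, whereas with independent variables only the classical product of phase gradients survives, as in Eq. (56).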

Derivation of Born’s rule

Let us apply our ontological model to a measurement of \({{\cal O}_{\rm{S}}}({p_{\rm{S}}},{q_{\rm{S}}})\) using a von Neumann measurement interaction Hamiltonian \({H_{\rm{I}}} = g{{\cal O}_{\rm{S}}}{p_{\rm{\Sigma }}}.\) Here \({p_{\rm{\Sigma }}}\) is the momentum of the pointer on a measuring device, conjugate to the pointer position \({q_{\rm{\Sigma }}},\) and g is the interaction coupling. As spelled out in Theorem 3, the resulting Schrödinger equation reads \(i\hbar ({\rm{d/d}}t)\,\left| \psi \right\rangle = {\hat H_{\rm{I}}}\left| \psi \right\rangle ,\) where the quantum Hamiltonian is \({\hat H_{\rm{I}}} = g{\hat {\cal O}_{\rm{S}}}{\hat p_{\rm{\Sigma }}}.\) An example of the derivation of the Schrödinger equation for the measurement of angular momentum is given in the previous subsection. From the Schrödinger equation governing the time evolution of the wave function during the measurement interaction, we can proceed to describe measurements reproducing the predictions of quantum mechanics as prescribed by Born’s rule, as follows.

We let \({\psi _{\rm{S}}}({q_{\rm{S}}})\) denote the wave function of the system at the initial measurement interaction time t = 0. It can be expanded as \({\psi _{\rm{S}}}({q_{\rm{S}}}) = \mathop {\sum}\nolimits_k {\kern 1pt} {c_k}{\phi _{{{\rm{S}}_k}}}({q_{\rm{S}}}),\) where \(\left\{ {\left| {{\phi _{{{\rm{S}}_k}}}} \right\rangle } \right\},\) k = 0, 1, 2, …, is the complete set of orthonormal eigenvectors of the Hermitian operator \({\hat {\cal O}_{\rm{S}}}\) with the corresponding eigenvalues \(\{ {o_k}\} ,\) satisfying \({\hat {\cal O}_{\rm{S}}}\left| {{\phi _{{{\rm{S}}_k}}}} \right\rangle = {o_k}\left| {{\phi _{{{\rm{S}}_k}}}} \right\rangle .\) The expansion coefficient is then \({c_k} = {\int} {\kern 1pt} {\rm{d}}{q_{\rm{S}}}\phi _{{{\rm{S}}_k}}^*\left( {{q_{\rm{S}}}} \right){\psi _{\rm{S}}}\left( {{q_{\rm{S}}}} \right) = \left\langle {{\phi _{{{\rm{S}}_k}}}{\rm{|}}{\psi _{\rm{S}}}} \right\rangle .\) Let \({\varphi _{\rm{\Sigma }}}({q_{\rm{\Sigma }}})\) denote the initial wave function of the pointer of a measuring device, and assume that the total wave function of the system and device at t = 0 is factorizable: \(\psi \left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}};0} \right) = {\psi _{\rm{S}}}({q_{\rm{S}}}){\kern 1pt} {\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}}} \right) = \mathop {\sum}\nolimits_k {\kern 1pt} {c_k}{\phi _{{{\rm{S}}_k}}}\left( {{q_{\rm{S}}}} \right){\kern 1pt} {\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}}} \right).\) It evolves in time in accordance with Schrödinger’s equation \(i\hbar ({\rm{d/d}}t)\,\left| \psi \right\rangle = {\hat H_{\rm{I}}}\left| \psi \right\rangle \) with the measurement-interaction quantum Hamiltonian \({\hat H_{\rm{I}}} = g{\hat {\cal O}_{\rm{S}}}{\hat p_{\rm{\Sigma }}}\); thus the total wave function at the end of the measurement interaction at time t = T is entangled:

$$\psi \left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}};T} \right) = \mathop {\sum}\limits_k {\kern 1pt} {c_k}{\phi _{{{\rm{S}}_k}}}{\kern 1pt} \left( {{q_{\rm{S}}}} \right){\kern 1pt} {\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_k}T} \right).$$
(58)
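The shifted pointer packets in Eq. (58) arise because, on the k-th eigenbranch, the Hamiltonian \(g{o_k}{\hat p_{\rm{\Sigma }}}\) generates a rigid translation: the Schrödinger equation reduces to the transport equation \({\partial _t}\varphi = - g{o_k}{\partial _{{q_{\rm{\Sigma }}}}}\varphi ,\) which is solved by \({\varphi _{\rm{\Sigma }}}({q_{\rm{\Sigma }}} - g{o_k}t).\) A minimal finite-difference check of this statement is sketched below; the Gaussian pointer profile, the coupling, the eigenvalue, and the function names are hypothetical choices.

```python
import math

G = 2.0    # hypothetical interaction coupling
O_K = 1.5  # hypothetical eigenvalue o_k of the measured observable


def pointer(q):
    """Initial pointer wave packet (an assumed Gaussian profile)."""
    return math.exp(-q * q / 2.0)


def branch(q, t):
    """Pointer factor of the k-th branch at time t, cf. Eq. (58)."""
    return pointer(q - G * O_K * t)


def transport_residual(q, t, h=1e-5):
    """d(phi)/dt + g * o_k * d(phi)/dq, which should vanish for the shifted packet."""
    dt = (branch(q, t + h) - branch(q, t - h)) / (2.0 * h)
    dq = (branch(q + h, t) - branch(q - h, t)) / (2.0 * h)
    return dt + G * O_K * dq


for q, t in ((0.3, 0.0), (2.0, 0.5), (4.1, 1.2)):
    print(f"q = {q}, t = {t}: residual = {transport_residual(q, t):.2e}")
```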

Let us assume that the interaction coupling g is strong enough that at t = T the device wave packets \(\left\{ {{\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)} \right\},\) j = 0, 1, 2, …, for different values of j effectively do not overlap. If so, when the pointer position \({q_{\rm{\Sigma }}}\) at the end of the measurement process belongs to the support of \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right),\) we can unambiguously register the outcome of the measurement as \({o_j},\) one of the eigenvalues of \({\hat {\cal O}_{\rm{S}}}\). The probability that the measurement yields \({o_j}\) given \(q = \left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}}} \right)\) and ψ(T) is thus

$${\rm{P}}\left( {{o_j}{\rm{|}}{q_{\rm{S}}},{q_{\rm{\Sigma }}},\psi (T)} \right) = 1{\kern 1pt} \left\{ {{q_{\rm{\Sigma }}} \in {{\rm{\Lambda }}_j}} \right\},$$
(59)

where \({{\rm{\Lambda }}_j}\) is the support of \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)\) and 1{“event”} is an indicator function, which equals 1 if the event occurs and 0 otherwise.

Furthermore, from the definition of the wave function in Eq. (9) and from Eq. (58), the probability that the configuration of the system and device at t = T is \(q = \left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}}} \right)\) is given by:

$$\begin{array}{*{20}{l}} {{\rm{P}}\left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}}{\rm{|}}\psi (T)} \right)} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{{\left| {\psi \left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}};T} \right)} \right|}^2}} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {\mathop {\sum}\limits_{(j,k)} {\kern 1pt} c_j^*{c_k}\phi _{{{\rm{S}}_j}}^*\left( {{q_{\rm{S}}}} \right){\kern 1pt} {\phi _{{{\rm{S}}_k}}}\left( {{q_{\rm{S}}}} \right){\kern 1pt} \varphi _{\rm{\Sigma }}^*\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right){\kern 1pt} {\varphi _{\rm{\Sigma }}}\left( {{q_\Sigma } - g{o_k}T} \right)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {\mathop {\sum}\limits_k {\kern 1pt} {{\left| {{c_k}} \right|}^2}\,{{\left| {{\phi _{{{\rm{S}}_k}}}\left( {{q_{\rm{S}}}} \right)} \right|}^2}\,{{\left| {{\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_k}T} \right)} \right|}^2},} \hfill \end{array}$$
(60)

where in the last equality we have used the fact that, since the supports of \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_k}T} \right)\) for different k do not overlap, the cross terms (j ≠ k) in the double sum all vanish.

Finally, it follows straightforwardly from Eqs. (59) and (60), via conventional probability theory, that the probability of obtaining \({o_j}\) when the initial wave function of the system is \({\psi _{\rm{S}}} = \mathop {\sum}\nolimits_k {\kern 1pt} {c_k}{\phi _{{{\rm{S}}_k}}}\) is given by the celebrated Born rule:

$$\begin{array}{*{20}{l}} {{\rm{P}}\left( {{o_j}{\rm{|}}{\psi _{\rm{S}}}} \right)} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}{q_{\rm{S}}}{\rm{d}}{q_{\rm{\Sigma }}}{\kern 1pt} {\rm{P}}\left( {{o_j}{\rm{|}}{q_{\rm{S}}},{q_{\rm{\Sigma }}},\psi (T)} \right)\,{\rm{P}}\left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}}{\rm{|}}\psi (T)} \right)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}{q_{\rm{S}}}{\rm{d}}{q_{\rm{\Sigma }}}{\kern 1pt} 1{\kern 1pt} \left\{ {{q_{\rm{\Sigma }}} \in {{\rm{\Lambda }}_j}} \right\}{\kern 1pt} \mathop {\sum}\limits_k {\kern 1pt} {{\left| {{c_k}} \right|}^2}\,{{\left| {{\phi _{{{\rm{S}}_k}}}\left( {{q_{\rm{S}}}} \right)} \right|}^2}\,{{\left| {{\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_k}T} \right)} \right|}^2}} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {{\int} {\kern 1pt} {\rm{d}}{q_{\rm{S}}}{\rm{d}}{q_{\rm{\Sigma }}}{{\left| {{c_j}} \right|}^2}\,{{\left| {{\phi _{{{\rm{S}}_j}}}\left( {{q_{\rm{S}}}} \right)} \right|}^2}\,{{\left| {{\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)} \right|}^2} = {{\left| {{c_j}} \right|}^2} = {{\left| {\left\langle {{\phi _{{{\rm{S}}_j}}}{\rm{|}}{\psi _{\rm{S}}}} \right\rangle } \right|}^2},} \hfill \end{array}$$
(61)

where in the last line we have used the normalizations of \({\phi _{{{\rm{S}}_j}}}({q_{\rm{S}}})\) and \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)\) as implied by the definitions given in Eq. (9).
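The chain of Eqs. (58) to (61) can be checked numerically on a toy example. The following is only a minimal sketch under assumptions that are not part of the model above: the two system eigenfunctions are taken to be the first two harmonic-oscillator functions, the pointer packet is a Gaussian (so its "support" is approximated by a window), and the values of `c`, `o`, `g`, and `T` are arbitrary illustrative choices.

```python
import numpy as np

# Grids for the system coordinate q_S and the pointer coordinate q_Sigma.
q_s = np.linspace(-10.0, 10.0, 801)
q_p = np.linspace(-10.0, 30.0, 1601)
dq_s = q_s[1] - q_s[0]
dq_p = q_p[1] - q_p[0]

def normalize(f, dq):
    """Scale f so that its squared modulus sums to 1 on the grid."""
    return f / np.sqrt(np.sum(np.abs(f) ** 2) * dq)

# Assumed stand-ins for the eigenfunctions phi_{S_k}(q_S) (hypothetical choice).
phi = [normalize(np.exp(-q_s**2 / 2), dq_s),
       normalize(q_s * np.exp(-q_s**2 / 2), dq_s)]
o = [0.0, 1.0]              # assumed eigenvalues o_k
c = np.array([0.6, 0.8j])   # assumed coefficients, |c_k|^2 = 0.36, 0.64
g, T = 20.0, 1.0            # assumed coupling and duration; shifts are g*o_k*T

def pointer(shift):
    """Gaussian pointer packet varphi_Sigma(q_Sigma - shift), unit width."""
    return normalize(np.exp(-(q_p - shift) ** 2 / 4), dq_p)

packets = [pointer(g * o_k * T) for o_k in o]

# Eq. (58): psi(q_S, q_Sigma; T) = sum_k c_k phi_k(q_S) varphi(q_Sigma - g o_k T)
psi = sum(c[k] * np.outer(phi[k], packets[k]) for k in range(2))
density = np.abs(psi) ** 2  # first line of Eq. (60)

# Last line of Eq. (60): only the diagonal terms survive.
diagonal = sum(abs(c[k]) ** 2 * np.outer(phi[k] ** 2, packets[k] ** 2)
               for k in range(2))
max_cross = np.max(np.abs(density - diagonal))

# Eq. (61): integrate the density over the pointer window Lambda_j.
# The windows approximate the supports of the two well-separated packets.
windows = [q_p < 10.0, q_p >= 10.0]
p = [np.sum(density[:, w]) * dq_s * dq_p for w in windows]

print("P(o_j | psi_S):", p)
print("largest cross-term contribution:", max_cross)
```

With the packets separated by twenty widths, the printed probabilities should agree with |c_j|² (0.36 and 0.64) to high accuracy, and the cross-term contribution should be negligible. Reducing `g` until the packets overlap makes the pointwise equality in Eq. (60) fail, which illustrates why the non-overlap assumption is needed there.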

Suppose that the position of the pointer on the measuring device belongs to the support of \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)\), i.e., that the outcome of measurement is \({o_j}\). If the measurement is not destructive, then, since the packets \({\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)\) for different j do not overlap, Eq. (58) implies that the “effective” wave function of the system and device becomes \({\phi _{{{\rm{S}}_j}}}({q_{\rm{S}}}){\kern 1pt} {\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_j}T} \right)\). Hence, when the outcome of measurement is \({o_j}\), the effective wave function of the system alone is \({\phi _{{{\rm{S}}_j}}}({q_{\rm{S}}})\), i.e., the eigenstate of \({\hat {\cal O}_{\rm{S}}}\) associated with the eigenvalue \({o_j}\).
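As a consistency check, offered here as a sketch and not as part of the original derivation, the same conclusion can be read off from Eqs. (59) to (61) using Bayes' rule: the distribution of the system configuration conditioned on the outcome \({o_j}\) is

$$\begin{array}{lll} {\rm{P}}\left( {{q_{\rm{S}}}{\rm{|}}{o_j},\psi (T)} \right) & = & \frac{1}{{{\rm{P}}\left( {{o_j}{\rm{|}}{\psi _{\rm{S}}}} \right)}}{\int} {\rm{d}}{q_{\rm{\Sigma }}}\,{\rm{P}}\left( {{o_j}{\rm{|}}{q_{\rm{S}}},{q_{\rm{\Sigma }}},\psi (T)} \right)\,{\rm{P}}\left( {{q_{\rm{S}}},{q_{\rm{\Sigma }}}{\rm{|}}\psi (T)} \right) \\ & = & \frac{1}{{{{\left| {{c_j}} \right|}^2}}}{\int} {\rm{d}}{q_{\rm{\Sigma }}}\,1\left\{ {{q_{\rm{\Sigma }}} \in {{\rm{\Lambda }}_j}} \right\}\mathop {\sum}\limits_k {{\left| {{c_k}} \right|}^2}\,{{\left| {{\phi _{{{\rm{S}}_k}}}\left( {{q_{\rm{S}}}} \right)} \right|}^2}\,{{\left| {{\varphi _{\rm{\Sigma }}}\left( {{q_{\rm{\Sigma }}} - g{o_k}T} \right)} \right|}^2} \\ & = & {{\left| {{\phi _{{{\rm{S}}_j}}}\left( {{q_{\rm{S}}}} \right)} \right|}^2}, \end{array}$$

where only the term k = j contributes inside \({{\rm{\Lambda }}_j}\) and the pointer packet is normalized. This is the configuration distribution associated with the effective wave function \({\phi _{{{\rm{S}}_j}}}({q_{\rm{S}}})\).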

In this connection, Wallstrom29 has argued that a derivation of the Schrödinger equation based on the combination of a modified Hamilton–Jacobi equation and a continuity equation via Eq. (9), as in our model (in the previous two subsections), will have to allow many more wave functions than those allowed in quantum mechanics. In particular, the wave function ψ defined in Eq. (9) is in general not single-valued (since the phase function \({S_Q}\) is in general many-valued, for example for wave functions with angular momentum). This is also a feature of Nelson’s stochastic mechanics46 and many other approaches42,44,45,47. He went on to argue that unless one imposes, by hand, a quantization condition as in the old quantum theory (to ensure the single-valuedness of the wave function), one has, for example, to allow a particle to have a non-integral (continuum) value of angular momentum. Such an ad hoc condition would, in our model, physically translate into an additional statistical constraint that selects a yet narrower class of ensembles of trajectories.

Within our model, by construction, a particle may indeed have a non-integral value of angular momentum (or a continuum value of energy) if left unmeasured. In this case, the wave function may indeed not be single-valued. Nevertheless, as shown above, a measurement of angular momentum will yield only discrete (quantized) outcomes, as in quantum mechanics. Hence, discrete quantum numbers are an emergent feature of measurement, rather than an objective property of the system regardless of measurement. Remarkably, as shown by Theorem 2 and above in this subsection, the ensemble average of the angular momentum prior to measurement (in which the angular momentum may take continuum values) is well defined and is equal to the quantum mechanical expectation value obtained in measurement (in which each single shot yields a discrete integral value). A similar answer to Wallstrom’s objection is given in refs. 45,47.

Data availability

Data sharing is not applicable to this article as no data sets were generated or analysed during the current study.