Introduction

Although quantum mechanics is one of the most successful physical theories and has been experimentally confirmed extensively, many fundamental questions remain unanswered. For instance, the origin of probability in quantum mechanics is not clearly understood. It is still unclear why the probability is calculated as the absolute square of a complex number. The meaning of the wave function, especially the interpretation of wave function collapse in a measurement, has always been a debated topic. These questions were not fully addressed by the traditional Copenhagen Interpretation1. Over the modern history of quantum physics, many more theories and interpretations have been developed. These include the many-worlds interpretation2,3,4, consistent histories5,6,7,8, decoherence theory9,10,11, relational interpretations12,13, quantum Bayesian theory14,15, and many others. In the development of these interpretations, one noticeable idea is the realization that a quantum state is relative in nature. That is, an observer-independent quantum state is not necessarily the basic description of a quantum system. In the early days of quantum mechanics, Bohr had already emphasized that the description of a quantum system depends on the measuring apparatus16,17. Ref.2 recognized that a quantum state of a subsystem is only meaningful relative to a given state of the rest of the system. Similarly, in developing the theory of decoherence induced by the environment, ref.9 concluded that correlation information between two quantum systems is more basic than the properties of the quantum systems themselves. Relational Quantum Mechanics (RQM) has pursued this idea to the fullest extent. RQM is inspired by the basic principle of Einstein’s Special Relativity. In the context of RQM, a quantum system should be described relative to another system; there is no absolute state for a quantum system. Specifically, the main idea of RQM is stated as follows:

Quantum mechanics is a theory about the physical description of physical systems relative to other systems, and this is a complete description of the world12.

This statement appears radical but reflects the fact that quantum mechanics was originally developed as a theory to describe the experimental observations of a quantum system in a measurement. When we state that the observing system records the measurement results of a variable of the observed system, it means that a correlation between the observed system and the observing system is established through physical interaction. By reading the pointer variable in the observing system, one can infer the value of the variable in the observed system. In this sense, quantum theory does not describe the independent properties of a quantum system. Instead, it describes the relations among quantum systems, and how correlation is established through physical interaction during measurement. The reality of a quantum system is only meaningful in the context of measurement by another system.

The idea that relational properties are more basic than the independent properties of a quantum system is profound. It should be considered a starting point for constructing the formulation of quantum mechanics. However, traditional quantum mechanics always starts with an observer-independent quantum state. It is of interest to see whether a quantum theory constructed from relational properties can address some of the unanswered fundamental questions mentioned earlier. Such a reconstruction program was initiated in ref.12 and had some successes, for example, in deriving the Schrödinger Equation. However, the origin of quantum probability, Born’s rule, and the measurement theory were not yet fully developed. The term correlation in refs2,9,12 essentially refers to the entanglement between two quantum systems. How entanglement affects the calculation of probability was not analyzed in ref.12.

The goal of this paper is to continue the program of reconstructing the formulation of quantum mechanics with the starting point that the relational properties are the most basic elements. What is novel in our approach is a new framework for calculating the probability of an outcome when measuring a quantum system. Such a framework is fundamental in deriving basic laws of quantum mechanics, so we briefly describe it here. In searching for the appropriate relational properties as the starting elements for the reconstruction, we recognize that a physical measurement is a probe-response interaction process between the measured system and the measuring apparatus. This important aspect of the measurement process seems to have been overlooked in other reconstruction efforts. Our framework for calculating the probability, on the other hand, explicitly models this bidirectional process. As such, the probability can be derived from the product of two quantities, each associated with a unidirectional process. We call such a quantity a relational probability amplitude. When two quantum systems interact, there are many alternative configurations for such a two-way process. Each alternative is assigned a weight that is the product of the two relational probability amplitudes associated with the configuration. The probability of a measurement outcome is then postulated to be proportional to the sum of such weights over all the applicable configurations. Thus, the task of calculating the probability is reduced to counting the applicable alternatives. The properties of the measured system are manifested through the rules for counting the alternatives. Another novel aspect of this framework is the introduction of the concept of entanglement to the relational properties. When the quantum system is entangled with the apparatus, the rule for counting the alternatives is adjusted accordingly due to the availability of inference information. Furthermore, the entanglement measure quantifies the difference between time evolution and quantum measurement. Lastly, we show that such a framework for calculating the probability amplitude can be explicitly implemented using the Feynman Path Integral18.

The impact of this framework is fundamental, as it is the basis for deriving formulations that are mathematically equivalent to the laws of traditional quantum mechanics. The formulation for calculating the probability of finding the system in an eigenstate is equivalent to Born’s rule, but with a new insight: the fact that it is an absolute square of a complex number is a consequence of quantum measurement being a bidirectional process. The wave function is found to be a mathematical tool representing the summation of relational probability amplitudes. Thus, the notion of wave function collapse during measurement is just a consequence of changes in relational properties. The Schrödinger equation can be derived when the entanglement measure between the observed quantum system and the observing system is zero and unchanged. On the other hand, when the entanglement measure changes, quantum measurement theory is obtained.

Although the formulation presented here is mathematically equivalent to traditional quantum mechanics, it provides a new understanding of the origin of quantum probability. It shows that an essence of quantum mechanics is a new set of rules for calculating the measurement probability from an interaction process. The most important outcome of this paper is that quantum mechanics can be constructed with the relational properties among quantum systems as the most fundamental building blocks.

The paper is organized as follows. We first clarify the definitions of key terminologies. In the Result Section, we introduce the postulates and frameworks to calculate quantum probability. We then apply this framework to develop the formulation for time evolution of a quantum system and derive the conditions under which the Schrödinger equation can be recovered. In the Discussion and Conclusion Section, we provide a comparison between this work and the original RQM theory, discuss the limitations, and summarize the conclusions. An explicit calculation of the relational probability amplitude using the Feynman Path Integral formulation is presented in the Method Section.

Definitions of Key Terminologies

Quantum System, Apparatus, and Observer

To avoid potential confusion, it is useful to define several key terms before proceeding. A Quantum System, denoted by the symbol S, is an object under study and follows the postulates that will be introduced in the next section. An Apparatus, denoted as A, can refer to the measuring devices, the environment that S is interacting with, or the system from which S is created. It is another quantum system that can interact with S and can acquire or encode information from S. We will strictly follow the assumption that all systems are quantum systems, including any apparatus. Depending on the selection of observer, the boundary between a system and an apparatus can change. For example, in a measurement setup, the measuring system is an apparatus A and the measured system is S. However, the composite system S + A as a whole can be considered a single system, relative to another apparatus A′. In an ideal measurement of an observable of S, the apparatus is designed in such a way that at the end of the measurement, the pointer state of A has a distinguishable, one-to-one correlation with the eigenvalue of the observable of S.

The definition of Observer is associated with an apparatus. An observer, denoted as O, is an intelligent entity who can operate and read the pointer variable of the apparatus. This can be a human being, or an artificially intelligent computer. The distinction between an observer and an apparatus is that an apparatus directly interacts with S, while an observer does not. Whether or not this observer is a quantum system is irrelevant in our formulation. However, there is a restriction that is imposed by the principle of locality. An observer is defined to be physically local to the apparatus it is associated with. This prevents the situation in which O instantaneously reads the pointer variable of an apparatus that is space-like separated from O. Receiving the information from A at a speed faster than the speed of light is prohibited. This locality requirement is crucial in resolving the EPR argument13,19. An observer cannot be associated with two or more apparatuses at the same time if these apparatuses are space-like separated.

Quantum Measurement

Given the hypothesis that a quantum system should be described relative to another system, the first question to ask is: relative to which other system is the description given? A quantum system, at any given time, is either being measured by an apparatus, interacting with its environment, or in an isolated environment. It is intuitive to select a reference system that has previously interacted with the quantum system. A brief review of the traditional quantum measurement theory is helpful since it brings important insights into the meaning of a quantum state.

In the traditional quantum measurement theory proposed by von Neumann20, both the quantum system and the measuring apparatus obey the same quantum mechanical laws. Von Neumann further distinguished two separate measurement stages, Process 1 and Process 2. Mathematically, an ideal measurement process is expressed as

$$\begin{array}{l}|{\rm{\Psi }}{\rangle }_{SA}=|{\psi }_{S}\rangle |{a}_{0}\rangle =\sum _{i}\,{c}_{i}|{s}_{i}\rangle |{a}_{0}\rangle \\ \,\to \sum _{i}\,{c}_{i}|{s}_{i}\rangle |{a}_{i}\rangle \to |{s}_{n}\rangle |{a}_{n}\rangle \end{array}$$
(1)

Initially, S and A are in a product state described by |Ψ〉SA. In Process 2, referring to the first arrow in Eq. (1), the quantum system S and the apparatus A interact. However, as a combined system they are isolated from interaction with any other system. Therefore, the dynamics of the total system is determined by the Schrödinger Equation. Process 2 establishes a one-to-one correlation between the eigenstates of the observable of S and the pointer states of A. After Process 2, there are many possible outcomes to choose from. In the next step, which is called Process 1 and corresponds to the second arrow in Eq. (1), one of these possible outcomes (labeled with eigenvalue n) emerges from the rest (See Note 1 in Supplemental Information). An observer knows the outcome of the measurement via the pointer variable of the apparatus. The two systems encode information about each other, allowing an observer to infer measurement results of S by reading the pointer variable of A. This observation is also applicable in the case that a quantum system is prepared in a particular state. The term preparation refers to the situation in which S is measured by an apparatus, or is prepared with a particular lab setup (for instance, a spin-half particle passes through a Stern-Gerlach apparatus), such that its state is explicitly known to an observer. The measuring system and the environment that S interacts with are collectively termed the apparatus A. Because of the correlation established between S and A during the state preparation process, it is natural to describe the state of S in reference to A.
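
The two stages in Eq. (1) can be illustrated with a minimal numerical sketch. It is not taken from the paper: the coefficients `c`, the function names, and the particular correlating map (a cyclic shift of the pointer index) are hypothetical choices made only for illustration, and the outcome probabilities are computed with the traditional rule |c_n|².

```python
import math

def process_2(c):
    """Correlate S with A: sum_i c_i |s_i>|a_0>  ->  sum_i c_i |s_i>|a_i>.

    The joint state is stored as {(i, j): amplitude}. The map
    (i, j) -> (i, (j + i) mod N) is an assumed interaction; it permutes the
    joint basis states and is therefore unitary.
    """
    n = len(c)
    initial = {(i, 0): amp for i, amp in enumerate(c)}
    return {(i, (j + i) % n): amp for (i, j), amp in initial.items()}

def process_1_probabilities(state):
    """Traditional probability that the pair (|s_n>, |a_n>) emerges: |c_n|^2."""
    return {pair: abs(amp) ** 2 for pair, amp in state.items()}

# Hypothetical normalized coefficients: |c_0|^2 + |c_1|^2 + |c_2|^2 = 1.
c = [1 / math.sqrt(2), 1j / math.sqrt(3), 1 / math.sqrt(6)]
correlated = process_2(c)
probs = process_1_probabilities(correlated)
print(sorted(correlated))   # only perfectly correlated pairs (n, n) remain
print(sum(probs.values()))
```

After the first stage every surviving pair has equal indices, which is the one-to-one correlation that lets an observer infer the result for S from the pointer of A.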

After the state preparation, suppose the interaction Hamiltonian between S and A vanishes and S starts its unitary time evolution. During time evolution, S can still be described in reference to the original apparatus A. The dynamics is deterministically governed by the Schrödinger Equation, but there is no change of correlation between S and A because there is no interaction. When the next measurement occurs, or when the unitary time evolution stops because S starts to interact with another apparatus A′, the relational properties are updated. As a result, the quantum state of S is updated in reference to A′. After the interaction finishes, S enters unitary time evolution again. This process can be repeated continuously.

The key insight of quantum measurement is that it is a bidirectional question-and-answer process. The measuring system interacts with (or disturbs) the measured system. The interaction in turn alters the state of the measuring system. As a result, a correlation is established, allowing the measurement result for S to be inferred from the pointer variable of A.

Quantum State

The notion of information in ref.12 is closely related to the concept of correlation. Information exchange between the observed system and the observing apparatus implies a change of correlation between the two systems. Correlation is relational and observer-dependent. There are many ways to define correlation mathematically; one of them is introduced in the Result Section. However, in this paper, we use the notion of information in a more general sense. It can be understood as data that represents values attributed to parameters or properties, or knowledge that describes understanding of physical systems or abstract concepts. Correlation is one type of information.

A Quantum State of S describes the complete information an observer O can know about S. From the examination of the measurement process and the interaction history of a quantum system, we consider that a quantum state encodes the information relative to the measuring system or the environment that the system previously interacted with. In this sense, the quantum state of S is described relative to A. The idea that a quantum state encodes information from previous interactions is also proposed in ref.13. The information encoded in the quantum state is the complete knowledge an observer can have about S, as it determines the possible outcomes of the next measurement. When the next measurement with another apparatus A′ is completed, the description of the quantum state is updated to be relative to A′.

In traditional quantum mechanics, the quantum state is described through an observer-independent variable, the wave function |ψ〉. Its meaning is assigned through Born’s rule21, which states that the probability to find S in an eigenvector |si〉 is given by pi = |〈si|ψ〉|², and \(\sum \,{p}_{i}=1\). However, in this paper we consider observer-dependent relational properties more basic. By developing a framework to calculate the quantum probability, the meaning of |ψ〉 emerges naturally as a secondary concept, as shown in the next Section.

With the clarifications of the key terminologies, we can proceed to introduce the postulates and start the reformulation of quantum mechanics.

Results

Probability in a Measurement

Suppose there is no quantum mechanics formulation yet and the goal is to construct a quantum theory that describes a quantum system S in the context of measurement by an apparatus A. We start the reconstruction process by asking a basic question: what are the possible outcomes if one performs a measurement on S using apparatus A? More specifically, if one measures a variable q of S by referring to a pointer variable q′ of A, what are the expected outcomes?

From experimental observations, the measurement yields multiple possible outcomes randomly. Each potential outcome is obtained with a certain probability. We call each measurement with a distinct outcome a quantum event. Denote these alternative events with a set of kets {|si〉} for S, where (i = 0, …, N − 1), and a set of kets {|aj〉} for A, where (j = 0, …, M − 1). A potential measurement outcome is represented by a pair of kets (|si〉, |aj〉). The ket |si〉 is introduced not to represent a quantum state of S, but as an abstract notation for a quantum event. The kets reflect the experimental observation that there can be many distinct measurement outcomes when a variable q of S is measured. |si〉 is associated with one of the outcomes with a certain probability, with qi as the measured value of variable q. Similarly, a ket |aj〉 represents a measurement outcome in which the pointer variable q′ equals \({q^{\prime} }_{j}\). Here a finite number of measurement outcomes is assumed. It is always possible to extend the notation to an infinite number of events.

With such a representation, the next step is to develop a mathematical framework to calculate the probabilities of possible events. This prediction is carried out before a measurement is actually performed. For instance, what is the probability of a joint event |si〉 and |aj〉, denoted as pij? Assigning a probability to an outcome of a quantum measurement process is subtle. As mentioned earlier, a measurement is an inferring process that depends on the physical interaction between S and A. The interaction process consists of A probing (or disturbing) S, and S at the same time altering A. In other words, it is a bidirectional process. We denote this as \(A\rightleftharpoons S\). Accordingly, pij is called an interactional probability. This process takes place in measurements in both classical and quantum mechanics. The difference is that in classical mechanics, the measurement can be set up such that there is only one measurement outcome deterministically. This means there is a one-to-one correlation between the macroscopic state of the measured object and the pointer variable in the measuring device. The probability of this correlation always equals one. On the other hand, in quantum mechanics, measurement of a variable q of the quantum system S yields multiple possible results. To calculate the probability of the joint event |si〉 and |aj〉, the process \(A\rightleftharpoons S\) at the macroscopic level should be replaced by \(|{a}_{j}\rangle \rightleftharpoons |{s}_{i}\rangle \) at the quantum level. This is a two-way process, or a question-and-answer pair in terms of the quantum logic approach12. We expect that the framework to calculate pij should model this bidirectional process. This implies pij should be derived as a product of two numbers, with each number associated with one direction. Here we assume the processes in the two directions are independent of each other (See Note 2 in Supplemental Information). The requirements for the interactional probability pij can be summarized as follows:

  1. pij should be a product of two numbers that are associated with a bidirectional process.

  2. pij should be symmetric with respect to either S or A. What this means is that the probability is the same for both processes |aj〉 → |si〉 → |aj〉 that is viewed from A and |si〉 → |aj〉 → |si〉 that is viewed from S.

  3. pij should be a non-negative real number.

Mathematically, the first requirement can be expressed as

$${p}_{ij}\propto {Q}^{A\to S}(|{a}_{j}\rangle \cap |{s}_{i}\rangle ){R}^{S\to A}(|{s}_{i}\rangle \cap |{a}_{j}\rangle )$$
(2)

where \({Q}^{A\to S}(|{a}_{j}\rangle \cap |{s}_{i}\rangle )\) is a relational quantity representing that, viewed from the A to S direction, joint event \(|{a}_{j}\rangle \cap |{s}_{i}\rangle \) occurs. Similarly, \({R}^{S\to A}(|{s}_{i}\rangle \cap |{a}_{j}\rangle )\) is a relational quantity representing that, confirmed from the S to A direction, joint event \(|{s}_{i}\rangle \cap |{a}_{j}\rangle \) occurs. To satisfy requirement 2, we rewrite these two quantities as matrix elements, i.e., \({Q}^{A\to S}(|{a}_{j}\rangle \cap |{s}_{i}\rangle )={Q}_{ji}^{AS}\), and \({R}^{S\to A}(|{s}_{i}\rangle \cap |{a}_{j}\rangle )={R}_{ij}^{SA}\). Equation (2) becomes

$${p}_{ij}\propto {Q}_{ji}^{AS}{R}_{ij}^{SA}.$$
(3)

The probability for the process |si〉 → |aj〉 → |si〉 is \({p}_{ij}\propto {R}_{ij}^{SA}{Q}_{ji}^{AS}\), the same as Eq. (3). Thus, requirement 2 is satisfied.

Now let’s consider the third requirement for pij, that it should be a non-negative real number. We should impose the weakest possible restrictions on the variables \({Q}_{ji}^{AS}\) and \({R}_{ij}^{SA}\). The three requirements for pij are not necessarily applicable to \({Q}_{ji}^{AS}\) and \({R}_{ij}^{SA}\). First, a unidirectional process |aj〉 → |si〉 does not constitute a complete physical measurement process. We should not consider these variables themselves as probability quantities in the classical sense. That is, \({Q}_{ji}^{AS}\) and \({R}_{ij}^{SA}\) are not necessarily non-negative real numbers. They can be complex numbers (See Note 3 in Supplemental Information). Second, there is no reason to assume \({R}_{ij}^{SA}={Q}_{ji}^{AS}\). The direction from S to A is significant here and is explicitly called out in the superscript. In this notation, index i is reserved for S while index j is reserved for A.

Given Eq. (3), there are many ways to satisfy the third requirement for pij. Since \({Q}_{ji}^{AS}\) and \({R}_{ij}^{SA}\) can be complex numbers, the simplest condition to ensure pij as a non-negative real number is

$${Q}_{ji}^{AS}={({R}_{ij}^{SA})}^{\ast }.$$
(4)

Written in a different format, \({Q}_{ji}^{AS}={({R}^{SA})}_{ji}^{\dagger }\). This means \({Q}^{AS}={({R}^{SA})}^{\dagger }\). Equation (3) then becomes

$${p}_{ij}=|{R}_{ij}^{SA}{|}^{2}/{\rm{\Omega }}$$
(5)

where Ω is a real number normalization factor. Equation (4) can be intuitively understood as follows: viewed from A or viewed from S, the probabilistic quantities have the same magnitude but differ in phase. Physically it ensures there is no preferred choice of S and A in defining the relational variables (See Note 4 in Supplemental Information). Eq. (4) is a weaker version of the requirement for \({R}_{ij}^{SA}\) compared to the second requirement for pij. \({Q}_{ji}^{AS}\) and \({R}_{ij}^{SA}\) are called relational probability amplitudes. In the Method Section, we will give an explicit calculation of \({R}_{ij}^{SA}\) using the Path Integral formulation. Given the relation in Eq. (4), we will not distinguish the notation R versus \(Q\), and will only use R.
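
As a minimal numerical sketch of Eqs. (3) to (5), the snippet below starts from a hypothetical 2 × 3 relational matrix (the entries are arbitrary and not taken from the paper), forms the products of the two directional amplitudes with Q chosen according to Eq. (4), and normalizes. The choice of Ω as the sum of all |R_ij|², so that the pij sum to one, is an assumption made here for illustration.

```python
# Hypothetical relational matrix R^{SA}: rows index events of S (i),
# columns index events of A (j). Values are illustrative only.
R = [
    [0.5 + 0.5j, 0.1 - 0.3j, 0.2 + 0.0j],
    [0.0 + 0.4j, 0.3 + 0.3j, -0.2 + 0.1j],
]

def interactional_probabilities(R):
    """Return p_ij = |R_ij|^2 / Omega, following Eqs. (3)-(5)."""
    # Eq. (4): the A -> S amplitude Q_ji is the complex conjugate of R_ij,
    # so each product Q_ji * R_ij is a non-negative real number.
    weights = [[(r.conjugate() * r).real for r in row] for row in R]
    # Assumed normalization: Omega is the sum of all unnormalized weights.
    omega = sum(sum(row) for row in weights)
    return [[w / omega for w in row] for row in weights]

p = interactional_probabilities(R)
total = sum(sum(row) for row in p)
print(p)
print(total)
```

Although every entry of R is complex, each pij is real and non-negative, which is the third requirement listed above.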

The relational matrix RSA gives the complete description of S. It provides a framework to derive the probability of future measurement outcome. We summarize the ideas presented in this section with the following two postulates.

Postulate 1

A quantum system S is completely described by a matrix RSA relative to an apparatus A, where the matrix element \({R}_{ij}^{SA}\) is the relational probability amplitude for the joint events |si〉 and |aj〉.

Postulate 2

Probability of a measurement outcome is calculated by modeling the potential interaction process, i.e., by multiplying two relational probability amplitudes representing the bidirectional process.

There are two important notes. First, \({R}_{ij}^{SA}\) is a probabilistic quantity, not a quantity associated with a particular physical variable. \({R}_{ij}^{SA}\) should not be considered as a coupling strength between S and A. In the Method Section, in the context of the path integral, \({R}_{ij}^{SA}\) is defined as the sum of the quantity \({e}^{i{S}_{p}/\hslash }\), where Sp is the action of the composite system S + A along a path. Physical interaction between S and A may change Sp, which determines the phase of the probability amplitude. But \({e}^{i{S}_{p}/\hslash }\) itself is a probabilistic quantity. Second, although \({R}_{ij}^{SA}\) is a probability amplitude rather than a real-valued probability, we assume it follows certain rules of classical probability theory, such as the multiplication rule and the sum over alternatives in the intermediate steps.

Wave Function and Born’s Rule

So far, we have not yet introduced the notion of quantum state for S. We only describe S and A with sets of events and the relational matrix RSA. The next step is to derive the properties of S based on RSA. This can be achieved by examining how the probability of measuring S with a particular outcome of variable q is calculated.

Before proceeding further, we make a change in mathematical notation. It is more convenient to introduce a Hilbert space for the quantum system S. The set of kets {|si〉}, previously considered as representing distinct measurement events for S, can be considered as an eigenbasis of the Hilbert space \({ {\mathcal H} }_{S}\) with dimension N, and |si〉 is an eigenvector. Since each measurement outcome is distinguishable, 〈si|sj〉 = δij. Similarly, the set of kets {|aj〉} is an eigenbasis of the Hilbert space \({ {\mathcal H} }_{A}\) with dimension M for the apparatus system A. The bidirectional process \(|{a}_{j}\rangle \rightleftharpoons |{s}_{i}\rangle \) is called a potential measurement configuration in the joint Hilbert space \({ {\mathcal H} }_{S}\otimes { {\mathcal H} }_{A}\).

In the previous section, we argued that the probability of the joint events |si〉 and |aj〉 is given, up to normalization, by \({p}_{ij}={Q}_{ji}^{AS}{R}_{ij}^{SA}=|{R}_{ij}^{SA}{|}^{2}\), because the corresponding measurement configuration is |aj〉 → |si〉 → |aj〉. Here we clearly specify that the probability is for the joint event |si〉 and |aj〉. But there is a limitation to such a specification if we wish to calculate the probability of measuring S with a particular outcome of variable q. In that case, the measurement configuration used earlier, |aj〉 → |si〉 → |aj〉, over-describes the situation because no measurement is actually performed. We do not know which event will occur to the quantum system A since it is completely probabilistic. The only way an observer can determine which event occurs is to perform an actual measurement, or to infer it from another system. Therefore, it is legitimate to generalize the potential measurement configuration to |aj〉 → |si〉 → |ak〉. In other words, the measurement configuration in the joint Hilbert space starts from |aj〉, but can end at |aj〉 or at any other event |ak〉. Correspondingly, we generalize Eq. (3) by introducing a quantity for such a configuration

$${W}_{jik}^{ASA}={Q}_{ji}^{AS}{R}_{ik}^{SA}={({R}_{ij}^{SA})}^{\ast }{R}_{ik}^{SA}.$$
(6)

The second step utilizes Eq. (4). We interpret this quantity as a weight associated with the potential measurement configuration |aj〉 → |si〉 → |ak〉. The probability for a measurement outcome can be calculated by identifying the appropriate alternatives and summing up their weights. The classical macroscopic configuration A → S → A can be considered as a special case when the dimension of the Hilbert space is one for either S or A. Indeed, the most general form of measurement configuration in a bipartite system can be |aj〉 → |sm〉 → |sn〉 → |ak〉, and its weight is given by

$${W}_{jmnk}^{ASSA}={Q}_{jm}^{AS}{R}_{nk}^{SA}.$$
(7)

The indeterminacy of which event will occur to a quantum system influences the way possible measurement configurations can be arranged. Consequently, it influences how the applicable configurations are counted and then how the probability is calculated (See Note 5 in Supplemental Information). If no actual measurement is performed and inference is not available, the probability of finding S in a future measurement outcome can be calculated by summing \({W}_{jmnk}^{ASSA}\) over all applicable alternatives of measurement configurations. Such a generalized framework for calculating probability is stated by extending Postulate 2.

Postulate 2e

Probability of a measurement outcome is calculated by modeling the potential interaction process. The probability is proportional to the sum of weights from all applicable measurement configurations, where the weight is defined as the product of two relational probability amplitudes corresponding to the configuration.

With this framework, the remaining task to calculate the probability is to correctly count the applicable alternatives of measurement configuration. This task depends on the expected measurement outcome. Some typical cases are analyzed next.

Case 1.

Suppose the expected outcome of an ideal measurement is event |si〉, i.e., measuring variable q gives eigenvalue qi. The probability that event |si〉 occurs, pi, is proportional to the sum of \({W}_{jmnk}^{ASSA}\) over all the possible configurations related to |si〉. Mathematically, we select all \({W}_{jmnk}^{ASSA}\) with m = n = i, sum over the indices j and k, and obtain the probability pi.

$${p}_{i}\propto \sum _{j,k=0}^{M-1}\,{({R}_{ij}^{SA})}^{\ast }{R}_{ik}^{SA}.$$
(8)

To see why this quantity can be considered a probability, we note that swapping the indices \(j\leftrightarrow k\) in Eq. (8) replaces each term by its complex conjugate, so the sum is real. Moreover, the double sum factorizes and can be rewritten as

$${p}_{i}\propto \sum _{j}\,{({R}_{ij}^{SA})}^{\ast }\,\sum _{k}\,{R}_{ik}^{SA}=|\sum _{j}\,{R}_{ij}^{SA}{|}^{2}.$$
(9)

It is a non-negative real number. The normalization condition is given by

$$\begin{array}{rcl}\sum _{i}\,|\sum _{j}\,{R}_{ij}{|}^{2} & = & \sum _{jk}\,\sum _{i}\,{R}_{ij}^{\ast }{R}_{ik}\\ & = & \sum _{jk}\,{({R}^{\dagger }R)}_{jk}=1.\end{array}$$
(10)

A notational simplification is made in the above equation by omitting the superscript in RSA, with the convention that R refers to the relational matrix from S to A. The definition of the wave function emerges naturally from Eq. (9). Define a variable \({\phi }_{i}={\sum }_{j}\,{R}_{ij}\), then \({p}_{i}=|{\phi }_{i}{|}^{2}\). The quantum state can be described either by the relational matrix R, or by a set of variables \(\{{\phi }_{i}\}\). We call \({\phi }_{i}\) the wave function for eigenvector |si〉. The quantum state of S is a vector representation of the variable set \(\{{\phi }_{i}\}\), i.e., the state vector of S relative to A is \(|\psi {\rangle }_{S}^{A}={({\phi }_{0},{\phi }_{1},\ldots ,{\phi }_{N-1})}^{T}\), where the superscript T denotes transposition. In summary,

$$|\psi {\rangle }_{S}^{A}={({\phi }_{0},{\phi }_{1},\ldots ,{\phi }_{N-1})}^{T}\,{\rm{where}}\,{\phi }_{i}=\sum _{j}\,{R}_{ij}\mathrm{.}$$
(11)

Note that we have not yet written \(|\psi {\rangle }_{S}^{A}\) as linear combination of \({\phi }_{i}\).
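
The counting in Case 1 can be checked numerically with a short sketch. The 2 × 3 matrix below is a hypothetical example rather than data from the paper, and the function names are introduced here only for illustration. The snippet confirms that the double sum of weights in Eq. (8) equals |φi|² as in Eq. (9), and uses the sum over (R†R)jk from Eq. (10) as the normalization.

```python
# Hypothetical relational matrix R: rows index events of S, columns events of A.
R = [
    [0.3 + 0.1j, 0.2 - 0.2j, -0.1 + 0.4j],
    [0.1 + 0.3j, -0.2 + 0.1j, 0.4 + 0.0j],
]

def wave_function(R):
    """phi_i = sum_j R_ij, as in Eq. (11)."""
    return [sum(row) for row in R]

def summed_weights(R, i):
    """Double sum over j, k of (R_ij)^* R_ik, the right-hand side of Eq. (8)."""
    cols = range(len(R[i]))
    return sum(R[i][j].conjugate() * R[i][k] for j in cols for k in cols)

def gram_sum(R):
    """Sum over j, k of (R^dagger R)_jk, the quantity set to 1 in Eq. (10)."""
    return sum(summed_weights(R, i) for i in range(len(R)))

phi = wave_function(R)
norm = gram_sum(R).real
probabilities = [abs(a) ** 2 / norm for a in phi]

for i, a in enumerate(phi):
    # Eq. (9): the double sum factorizes into |phi_i|^2.
    print(i, summed_weights(R, i), abs(a) ** 2)
print(probabilities)
```

Dividing by `norm` plays the role of rescaling R so that Eq. (10) holds; the resulting list sums to one.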

Case 2.

Suppose the expected ideal measurement outcome is that S is in a superposition of eigenvectors |s0〉 and |s1〉. This means one cannot determine whether event |s0〉 or |s1〉 occurs. To compute the probability, the applicable weights should include not only \({\sum }_{jk}\,{R}_{0j}^{\ast }{R}_{0k}=|{\sum }_{j}\,{R}_{0j}{|}^{2}=|{\phi }_{0}{|}^{2}\) and \({\sum }_{jk}\,{R}_{1j}^{\ast }{R}_{1k}=|{\sum }_{j}\,{R}_{1j}{|}^{2}=|{\phi }_{1}{|}^{2}\), but also the terms in which indices 0 and 1 are interchanged due to the indeterminacy, i.e., \({\sum }_{jk}\,{R}_{0j}^{\ast }{R}_{1k}\) and \({\sum }_{jk}\,{R}_{1j}^{\ast }{R}_{0k}\). Adding these terms together, the probability is

$${p}_{\{0,1\}}={|\sum _{j}{R}_{0j}+\sum _{j}{R}_{1j}|}^{2}=|{\phi }_{0}+{\phi }_{1}{|}^{2}$$
(12)

Equation (12) captures the characteristics of superposition. The wave function for a superposition of eigenvectors |s0〉 and |s1〉 is the linear combination of φ0 and φ1. Based on this observation, it is mathematically convenient to write the state vector of S as a linear combination of the eigenvectors |si〉 with coefficients φi,

$$|\psi {\rangle }_{S}^{A}=\sum _{i}\,{\phi }_{i}|{s}_{i}\rangle \,{\rm{where}}\,{\phi }_{i}=\sum _{j}\,{R}_{ij}\mathrm{.}$$
(13)

The justification for the above definition is that the probability can be calculated from it by defining a projection operator \({\hat{P}}_{i}=|{s}_{i}\rangle \langle {s}_{i}|\). Noting that {|si〉} form an orthogonal eigenbasis, the probability is rewritten as:

$${p}_{i}=\langle \psi |{\hat{P}}_{i}|\psi \rangle =|{\phi }_{i}{|}^{2}$$
(14)

Similarly, introducing a projection operator \({\hat{P}}_{\{0,1\}}=(|{s}_{0}\rangle +|{s}_{1}\rangle )(\langle {s}_{0}|+\langle {s}_{1}|)\), we can rewrite the probability as

$${p}_{\{0,1\}}=\langle \psi |{\hat{P}}_{\{0,1\}}|\psi \rangle =|{\phi }_{0}+{\phi }_{1}{|}^{2}.$$
(15)

Equations (13) and (14) give the same results as Born’s Rule, but with more physical insight into how the quantum measurement probability is calculated, based on a detailed analysis of the interaction process during measurement.
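The operator expressions in Eqs. (14) and (15) can be checked numerically. The sketch below assumes a hypothetical four-component wave function `phi` and the standard basis as the eigenbasis {|si〉}; the helper `expectation` is an illustrative name.

```python
import numpy as np

# Hypothetical normalized wave function in a 4-dimensional eigenbasis.
phi = np.array([0.5, 0.5j, 0.5, -0.5])
basis = np.eye(4)                       # column i plays the role of |s_i>

def expectation(P, psi):
    """Return <psi|P|psi> as a real number."""
    return np.vdot(psi, P @ psi).real

# Eq. (14): P_i = |s_i><s_i|, here for i = 0.
P_0 = np.outer(basis[:, 0], basis[:, 0].conj())

# Eq. (15): P_{0,1} = (|s_0> + |s_1>)(<s_0| + <s_1|), as defined in the text.
v = basis[:, 0] + basis[:, 1]
P_01 = np.outer(v, v.conj())

p_0 = expectation(P_0, phi)             # should equal |phi_0|^2
p_01 = expectation(P_01, phi)           # should equal |phi_0 + phi_1|^2
```

The second value contains the cross terms between φ0 and φ1, which is what distinguishes it from the simple sum of the two individual probabilities.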

Case 3.

Given a relational matrix R with corresponding state vector |ψ〉 of S, suppose the expected measurement outcome is described by another relational matrix Q with corresponding state vector |χ〉 of S. The probability is then

$$p(Q|R)={|\sum _{i,j}{({Q}^{\dagger }R)}_{i,j}|}^{2}.$$
(16)

The proof is given in the Method Section. Using the state vector notation of S, the probability can be equivalently expressed as \(p(\chi |\psi )=\langle \psi |{\hat{P}}_{\chi }|\psi \rangle ={|\langle \chi |\psi \rangle |}^{2}\), where \({\hat{P}}_{\chi }=|\chi \rangle \langle \chi |\). This is a generalization of Eq. (15).
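The link between the matrix form and the state vector form in Case 3 rests on the identity \({\sum }_{i,j}\,{({Q}^{\dagger }R)}_{ij}=\langle \chi |\psi \rangle \) with \({\phi }_{i}={\sum }_{j}\,{R}_{ij}\) and \({\chi }_{i}={\sum }_{j}\,{Q}_{ij}\). A small sketch with randomly generated, hypothetical unentangled matrices illustrates it; the helper `random_product_matrix` is an assumption made for this example only, and the matrices are not normalized.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_product_matrix(n, m):
    """Hypothetical helper: an unentangled matrix with elements c_i * d_j."""
    c = rng.normal(size=n) + 1j * rng.normal(size=n)
    d = rng.normal(size=m) + 1j * rng.normal(size=m)
    return np.outer(c, d)

R = random_product_matrix(3, 4)
Q = random_product_matrix(3, 4)

psi = R.sum(axis=1)                  # phi_i = sum_j R_ij
chi = Q.sum(axis=1)                  # chi_i = sum_j Q_ij

lhs = (Q.conj().T @ R).sum()         # sum over all elements of Q^dagger R
rhs = np.vdot(chi, psi)              # <chi|psi>

# Squared modulus versus the projector expectation <psi|chi><chi|psi>.
p_matrix = abs(lhs) ** 2
p_projector = np.vdot(psi, np.outer(chi, chi.conj()) @ psi).real
```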

Although the introduction of wave function φi brings much mathematical convenience, the relational matrix R is a more fundamental quantity. φi is introduced as a byproduct of the derivation instead of as a fundamental variable.

Equations (8) and (13) are introduced on the condition that there is no correlation between quantum systems S and A. If there is a correlation between them, the summation in Eq. (8) over-counts the applicable alternatives of measurement configurations and should be modified accordingly. But first, given the relational matrix R, how can one determine whether there is a correlation between S and A?

Entanglement

Correlation between two quantum systems means one can infer information about one system from information about the other. The relational variable \({R}_{ij}\) itself does not quantify an inference relation between S and A. The quantity \(|{R}_{ij}{|}^{2}\) is the measurement probability for the joint events |si〉 for S and |aj〉 for A. However, given \({R}_{ij}\), one cannot infer that event |si〉 occurs to S from knowing that event |aj〉 occurs to A. We need to define a different parameter that can quantify the quantum correlation between S and A.

The capability of inferring information about the quantum state of one system from information about the other system is defined as entanglement. Since S and A are both quantum systems, they form a bipartite quantum system. Entanglement between the two subsystems of a composite system is quantified by an entanglement measure E. There are many forms of entanglement measure22,23; the simplest one is the von Neumann entropy. Given the relational matrix R, the von Neumann entropy is defined as follows. For reasons that will become obvious in the next subsection, we first define a product matrix \(\rho =R{R}^{\dagger }\), and denote the eigenvalues of ρ as {λi}. The von Neumann entropy for the relational matrix R is then

$$H(R)=-\,\sum _{i}\,{\lambda }_{i}\ln {\lambda }_{i}\mathrm{.}$$
(17)

A change in H(R) implies there is change of entanglement between S and A. Unless explicitly pointed out, we only consider the situation that S is described by a single relational matrix R. In this case, the entanglement measure E = H(R).
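Equation (17) translates directly into a few lines of NumPy. The sketch below assumes that R is scaled so that the trace of ρ = RR† equals 1, and it drops numerically zero eigenvalues (using the convention 0 ln 0 = 0); the function name and the two example matrices are illustrative choices.

```python
import numpy as np

def von_neumann_entropy(R, tol=1e-12):
    """H(R) = -sum_i lambda_i ln(lambda_i), with lambda_i the eigenvalues of R R^dagger."""
    rho = R @ R.conj().T
    lam = np.linalg.eigvalsh(rho)        # rho is Hermitian, so eigvalsh applies
    lam = lam[lam > tol]                 # convention: 0 * ln(0) = 0
    return float(-(lam * np.log(lam)).sum())

# Hypothetical examples, both with Tr(R R^dagger) = 1.
R_correlated = np.eye(2) / np.sqrt(2)            # one-to-one correlation of S and A
R_product = np.outer([1.0, 0.0], [1.0, 0.0])     # R_ij = c_i d_j

H_correlated = von_neumann_entropy(R_correlated)
H_product = von_neumann_entropy(R_product)
```

For the first matrix the two eigenvalues of ρ are both 1/2, so the entropy is ln 2; for the product matrix there is a single eigenvalue 1 and the entropy vanishes.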

The definition of H(R) enables us to distinguish different quantum dynamics. Given a quantum system S and its referencing apparatus A, there are two types of the dynamics between them. In the first type of dynamics, there is no physical interaction and no change in the entanglement measure between S and A. S is not necessarily isolated in the sense that it can still be entangled with A, but the entanglement measure remains unchanged. This type of dynamics is defined as time evolution. In the second type of dynamics, there is a physical interaction and correlation information exchange between S and A, i.e., the von Neumann entropy H(R) changes. This type of dynamics is defined as quantum operation. Quantum measurement is a special type of quantum operation with a particular outcome. Whether the entanglement measure changes distinguishes a dynamic as either a time evolution or a quantum operation. This is summarized in the following postulate.

Postulate 3

In a time evolution process, the entanglement measure of relational matrix is unchanged, while in a quantum operation process, there is change in the entanglement measure of relational matrix.

The following theorem allows us to detect whether relational matrix R is entangled. The theorem will be used extensively later.

Theorem 1

H(R) = 0 if and only if the matrix element \({R}_{ij}\) can be decomposed as \({R}_{ij}={c}_{i}{d}_{j}\), where \({c}_{i}\) and \({d}_{j}\) are complex numbers.

The proof is left to the Method Section. The wave function in this case is simplified to \({\phi }_{i}={\sum }_{j}\,{c}_{i}{d}_{j}={c}_{i}\,{\sum }_{j}\,{d}_{j}={c}_{i}d\) where \(d={\sum }_{j}\,{d}_{j}\) is a constant. If we choose \({\sum }_{i}\,|{c}_{i}{|}^{2}=1\), then normalization requires |d| = 1, i.e., \(d={e}^{i\varphi }\). For simplicity, let d = 1 so that φi = ci.

When there is entanglement between S and A, A and S can infer information from each other. The way probability is calculated in Eqs (8) and (12) must be modified because the summation in Eq. (8) over-counts the alternatives that are due to indeterminacy. Some of the potential measurement configurations should be excluded in order to calculate the probability correctly.

To see this more clearly, we decompose the relational matrix R as R = UDV by virtue of the singular value decomposition, where U and V are two unitary matrices, and D is a diagonal matrix. Applying the two unitary matrices is equivalent to changing the eigenbases of S and A to \(|{\tilde{s}}_{i}\rangle \) and \(|{\tilde{a}}_{i}\rangle \) such that R is diagonal. D is an irreducible diagonal matrix. H(R) > 0 implies that D has more than one nonzero diagonal element. Each element corresponds to a one-to-one correlation between \(|{\tilde{s}}_{i}\rangle \) and \(|{\tilde{a}}_{i}\rangle \). One can infer that S is in \(|{\tilde{s}}_{i}\rangle \) from knowing that A is in \(|{\tilde{a}}_{i}\rangle \), and vice versa. Effectively, neither S nor A is in a superposition state anymore. The contributions in calculating probability due to indeterminacy of eigenvectors must be excluded. This results in a different rule to count the applicable alternatives.
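The singular value decomposition also gives a practical test for Theorem 1: a matrix of the form \({R}_{ij}={c}_{i}{d}_{j}\) has exactly one nonzero singular value, while an entangled matrix has more. The sketch below uses `numpy.linalg.svd`, which returns factors satisfying R = U diag(s) V; the tolerance, the helper name, and the example matrices are assumptions made for illustration.

```python
import numpy as np

def nonzero_singular_values(R, tol=1e-12):
    """Singular values of R above a tolerance (the nonzero diagonal of D in R = U D V)."""
    s = np.linalg.svd(R, compute_uv=False)
    return s[s > tol]

# Hypothetical examples.
R_product = np.outer([0.6, 0.8j], [0.5, 0.5])        # R_ij = c_i d_j
R_entangled = np.array([[0.6, 0.0],
                        [0.0, 0.8]])

n_product = len(nonzero_singular_values(R_product))      # one diagonal element
n_entangled = len(nonzero_singular_values(R_entangled))  # more than one

# Full decomposition: U and V are unitary and U @ diag(s) @ V rebuilds R.
U, s, V = np.linalg.svd(R_entangled)
reconstructed = U @ np.diag(s) @ V
```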

Case 1e.

When there is an entanglement between S and A, to calculate the probability of finding S in eigenvector |si〉, one should only select the measurement configurations |aj〉 → |si〉 → |aj〉. The corresponding weight is \({R}_{ij}^{\ast }{R}_{ij}=|{R}_{ij}{|}^{2}\). Summing over all possible indices j gives the probability

$${p}_{i}=\sum _{j}\,|{R}_{ij}{|}^{2}$$
(18)

Case 2e.

Suppose we want to calculate the probability of finding S in eigenvectors |s0〉 or |s1〉 when there is an entanglement between S and A. Since there is inference information on whether S is in eigenvector |s0〉 or |s1〉, to calculate the probability we can only count the weights \({\sum }_{j}\,{R}_{0j}^{\ast }{R}_{0j}\) and \({\sum }_{j}\,{R}_{1j}^{\ast }{R}_{1j}\), and must not include the interference terms such as \({\sum }_{j}\,{R}_{0j}^{\ast }{R}_{1j}\). The probability is therefore

$${p}_{\{0,1\}}=\sum _{j}\,|{R}_{0j}{|}^{2}+\sum _{j}\,|{R}_{1j}{|}^{2}={p}_{0}+{p}_{1}.$$
(19)

As a consequence, we cannot define a wave function as in Eq. (13) to describe the state of S when H(R) > 0. To describe S without explicitly referencing A when S and A are entangled, an alternative formulation is needed. This is the reduced density matrix approach.

Reduced Density Matrix

To describe S without explicitly referencing A when S and A are entangled, we first describe the composite system S + A as an isolated system such that Eq. (13) is applicable. We need to describe S + A relative to another apparatus A′ that is unentangled with S + A. Suppose an observer \({{\mathscr{O}}}_{\varepsilon }\) is local to apparatus A′, and has the same information of the relational matrix R. \({{\mathscr{O}}}_{\varepsilon }\) wishes to describe the composite system S + A using Postulate 2e. In order to describe a quantum state of a composite system, another postulate is needed, which is commonly found in standard textbooks, for example,

Postulate 4

Let \({S}_{12}\) be the composite system of quantum systems \({S}_{1}\) and \({S}_{2}\) with Hilbert spaces \({ {\mathcal H} }_{1}\) and \({ {\mathcal H} }_{2}\). Then the associated Hilbert space of \({S}_{12}\) is a tensor product Hilbert space \({ {\mathcal H} }_{1}\otimes { {\mathcal H} }_{2}\). A physical variable of \({S}_{1}\) represented by Hermitian operator \({A}_{1}\) on \({ {\mathcal H} }_{1}\) is identified with the physical variable of \({S}_{12}\) represented by \({A}_{1}\otimes {I}_{2}\) on \({ {\mathcal H} }_{1}\otimes { {\mathcal H} }_{2}\), where \({I}_{2}\) is the identity operator on \({ {\mathcal H} }_{2}\)24.

An eigenvector denotes a distinct quantum event in which a measurement of a variable yields a distinct eigenvalue. If there are N orthogonal eigenvectors for S, {|si〉}, and M orthogonal eigenvectors for A, {|ai〉}, then according to Postulate 4, the orthogonal basis set for the composite S + A system should have N × M eigenvectors, {|si〉 ⊗ |aj〉}. \({{\mathscr{O}}}_{\varepsilon }\) would describe S + A with a higher order relational matrix, denoted as R′, with matrix element \({R}_{mn}^{^{\prime} }\). Index m is defined in Hilbert space \({ {\mathcal H} }_{S}\otimes { {\mathcal H} }_{A}\), (m = 0, …, NM − 1), while index n is defined in Hilbert space \({ {\mathcal H} }_{A^{\prime} },(n=0,\ldots ,M^{\prime} -1)\) and \(M^{\prime} ={\rm{\dim }}\,{ {\mathcal H} }_{A^{\prime} }\). Since there is no entanglement between A′ and S + A, R′ can be used to define the wave function of the composite system as \({\phi }_{m}^{A^{\prime} }={\sum }_{n}\,{R}_{mn}^{^{\prime} }\) according to Eq. (13). However, there are restrictions on \({R}_{mn}^{^{\prime} }\) because the relational matrix between S and A has been established. The relational matrix R′ must satisfy the following condition (see the proof in Note 6 in Supplemental Information)

$${\phi }_{m}=\sum _{n}\,{R}_{mn}^{^{\prime} }={R}_{ij}.$$
(20)

Therefore, relative to \({{\mathscr{O}}}_{\varepsilon }\), the state vector of the composite system S + A is

$$\begin{array}{rcl}|{\rm{\Psi }}\rangle & = & \sum _{m}\,{\phi }_{m}|m\rangle =\sum _{ij}\,{\phi }_{ij}|{s}_{i}{a}_{j}\rangle \\ & = & \sum _{ij}\,{R}_{ij}|{s}_{i}\rangle |{a}_{j}\rangle .\end{array}$$
(21)

Next, we ask how to describe S itself. To answer this, we examine how the probability of the system S being in an eigenvector |si〉 can be calculated. We know that the probability of S in eigenvector |si〉 and A in eigenvector |aj〉 is \({p}_{ij}=|{R}_{ij}{|}^{2}\). If event 1, S in eigenvector |si〉 and A in eigenvector |aj〉, and event 2, S in eigenvector |si〉 and A in eigenvector |ak〉 (k ≠ j), are mutually exclusive, the probability of S in eigenvector |si〉 is then just the sum of pij over index j, i.e., \({p}_{i}={\sum }_{j}\,{p}_{ij}={\sum }_{j}\,|{R}_{ij}{|}^{2}\). This gives the desired result, Eq. (18), for the case when S and A are entangled. This is not a surprise, since the assumption that event-1 and event-2 are mutually exclusive implies there is no event such that S is in eigenvector |si〉 while A is in eigenvectors |aj〉 and |ak〉. In other words, the mutual exclusivity of event-1 and event-2 eliminates the potential measurement configuration |aj〉 → |si〉 → |ak〉, and thus satisfies the requirement for calculating probability when there is entanglement between S and A. The mathematical tool to implement this requirement is the reduced density operator of S, defined as

$$\begin{array}{rcl}{\hat{\rho }}_{S} & = & T{r}_{A}|{\rm{\Psi }}\rangle \langle {\rm{\Psi }}|=\sum _{ii^{\prime} }\,(\sum _{k}\,{R}_{ik}{R}_{i^{\prime} k}^{\ast })|{s}_{i}\rangle \langle {s}_{i^{\prime} }|\\ & = & \sum _{ii^{\prime} }\,{(R{R}^{\dagger })}_{ii^{\prime} }|{s}_{i}\rangle \langle {s}_{i^{\prime} }|.\end{array}$$
(22)

The partial trace over A, \(T{r}_{A}(\,.\,\,)={\sum }_{k}\,\langle {a}_{k}|.\,|{a}_{k}\rangle \), ensures the mutual exclusivity of event-1 and event-2 since only the diagonal elements are selected in the sum. To obtain the desired probability \({p}_{i}={\sum }_{j}\,|{R}_{ij}{|}^{2}\), we define a projection operator \({\hat{P}}_{i}=|{s}_{i}\rangle \langle {s}_{i}|\), so that

$${p}_{i}=T{r}_{S}({\hat{P}}_{i}{\hat{\rho }}_{S})=\sum _{j}\,|{R}_{ij}{|}^{2}.$$
(23)

Since the information of A is traced out in \({\hat{\rho }}_{S}\), we find a mathematical tool to describe the state of S without explicitly referring to A. Eq. (22) gives a clear meaning of the matrix product \(R{R}^{\dagger }\) as the reduced density matrix of S, i.e., \({\rho }_{S}=R{R}^{\dagger }\). Thus, the entanglement measure defined in (17) is the von Neumann entropy for the reduced density matrix of S.

Similarly, the probability of event |aj〉 for A is \({p}_{j}^{A}={\sum }_{i}\,{p}_{ij}={\sum }_{i}\,|{R}_{ij}{|}^{2}\). This can be more elegantly written by introducing a partial projection operator \({I}^{S}\otimes {\hat{P}}_{j}^{A}\) where \({\hat{P}}_{j}^{A}=|{a}_{j}\rangle \langle {a}_{j}|\). It is easy to verify that

$$\begin{array}{rcl}{p}_{j}^{A} & = & \langle {\rm{\Psi }}|{I}^{S}\otimes {\hat{P}}_{j}^{A}|{\rm{\Psi }}\rangle \\ & = & \langle {\rm{\Psi }}|{a}_{j}\rangle \langle {a}_{j}|{\rm{\Psi }}\rangle =\sum _{i}\,|{R}_{ij}{|}^{2}\end{array}$$
(24)

To calculate the probability of finding S in |s0〉 or |s1〉, we use the projection operator defined as \({\hat{P}}_{\{01\}}=\) \(|{s}_{0}\rangle \langle {s}_{0}|+|{s}_{1}\rangle \langle {s}_{1}|\), then

$$\begin{array}{rcl}{p}_{\{0,1\}} & = & T{r}_{S}({\hat{P}}_{\{01\}}{\hat{\rho }}_{S})\\ & = & \sum _{j}\,|{R}_{0j}{|}^{2}+\sum _{j}\,|{R}_{1j}{|}^{2}={p}_{0}+{p}_{1}.\end{array}$$
(25)

which is the same as Eq. (19) in Case 2e. The trace operation over S in Eqs. (23) and (25) takes only diagonal matrix elements, effectively eliminating the indeterminacy with respect to eigenvector |si〉 in the information acquisition flows. Together with the partial trace operation in the definition of \({\hat{\rho }}_{S}\), it excludes the interference terms in the summation of weights for the calculation of probability, thus effectively factoring in the inference information between S and A and yielding the same results as deduced from Postulate 2e.
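The reduced density matrix construction of Eqs. (21) to (25) can be reproduced numerically. The sketch below assumes a hypothetical 3 × 2 entangled relational matrix, stores |Ψ〉 as the row-major flattening of R (index m = iM + j), and performs the partial trace over A with `numpy.einsum`.

```python
import numpy as np

# Hypothetical entangled relational matrix with sum_ij |R_ij|^2 = 1.
R = np.array([[0.5, 0.0],
              [0.0, 0.5j],
              [0.5, 0.5]])
N, M = R.shape

# Eq. (21): |Psi> = sum_ij R_ij |s_i>|a_j>, stored with composite index m = i*M + j.
Psi = R.reshape(-1)

# Eq. (22): partial trace over A of |Psi><Psi|.
rho_SA = np.outer(Psi, Psi.conj()).reshape(N, M, N, M)
rho_S = np.einsum('ikjk->ij', rho_SA)

# Eq. (23) and Eq. (25): probabilities from projectors on S.
P_0 = np.diag([1.0, 0.0, 0.0])
P_01 = np.diag([1.0, 1.0, 0.0])
p_0 = np.trace(P_0 @ rho_S).real
p_01 = np.trace(P_01 @ rho_S).real

# Direct sums of |R_ij|^2 over j, as in Eqs. (18) and (19).
p_rows = (np.abs(R) ** 2).sum(axis=1)
```

The partial trace reproduces RR†, and the projector traces agree with the row sums of |Rij|², with no cross terms between different rows.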

Normalization of |Ψ〉 requires

$$\begin{array}{rcl}Tr({\rho }_{S}) & = & Tr(R{R}^{\dagger })=\sum _{i}\,(\sum _{j}{R}_{ij}{R}_{ji}^{\dagger })\\ & = & \sum _{ij}\,|{R}_{ij}{|}^{2}=1\end{array}$$
(26)

Note that Eqs (10) and (26) cannot both be true at the same time. Eq. (10) is true only when the relational matrix R is unentangled. When S + A is entangled, Eq. (13) cannot be used to describe S. This is evident if we rewrite Eq. (13) in the density matrix operator format,

$$\begin{array}{rcl}{\hat{\rho }^{\prime} }_{S} & = & |\psi {\rangle }_{S}\langle \psi |=\sum _{ii^{\prime} }\,(\sum _{jj^{\prime} }{R}_{ij}{R}_{i^{\prime} j^{\prime} }^{\ast })|{s}_{i}\rangle \langle {s}_{i^{\prime} }|\\ & = & {\hat{\rho }}_{S}+\sum _{ii^{\prime} }\,(\sum _{j\ne j^{\prime} }\,{R}_{ij}{R}_{i^{\prime} j^{\prime} }^{\ast })|{s}_{i}\rangle \langle {s}_{i^{\prime} }|.\end{array}$$
(27)

Clearly, \({\hat{\rho }^{\prime} }_{S}\ne {\hat{\rho }}_{S}\) in general. The second term in Eq. (27) comes from the indeterminacy of eigenvector for A. This term should be discarded when H(R) > 0. This confirms that S should be described by Eq. (22) instead of Eq. (13) when H(R) > 0. The second term in Eq. (27) is related to the coherence of the quantum state of S. When H(R) = 0, it turns out both density matrices are mathematically equivalent, as shown in the following theorem.

Theorem 2

If the entanglement measure H(R) = 0, \({\hat{\rho }^{\prime} }_{S}={\hat{\rho }}_{S}\).

Proof is left to the Method Section. Essentially, when H(R) = 0, the coherence term in Eq. (27) is equal to \({\hat{\rho }}_{S}\) multiplied by a constant, effectively making \({\hat{\rho }^{\prime} }_{S}\) and \({\hat{\rho }}_{S}\) differ only by a constant.
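A quick numerical illustration of Theorem 2 and of the coherence term in Eq. (27): for a product matrix \({R}_{ij}={c}_{i}{d}_{j}\) the two density matrices are proportional, with ratio \(|{\sum }_{j}{d}_{j}{|}^{2}/{\sum }_{j}|{d}_{j}{|}^{2}\), and coincide once each is scaled to unit trace, whereas for an entangled matrix they differ. The vectors `c`, `d` and the helper names are hypothetical choices for this sketch.

```python
import numpy as np

def rho_reduced(R):
    """Eq. (22): rho_S = R R^dagger."""
    return R @ R.conj().T

def rho_from_wave_function(R):
    """Eq. (27): |psi><psi| with phi_i = sum_j R_ij."""
    phi = R.sum(axis=1)
    return np.outer(phi, phi.conj())

def unit_trace(rho):
    return rho / np.trace(rho)

# Hypothetical product (unentangled) matrix R_ij = c_i d_j.
c = np.array([0.6, 0.8j])
d = np.array([0.5, 0.3 + 0.4j])
R_product = np.outer(c, d)
ratio = abs(d.sum()) ** 2 / (np.abs(d) ** 2).sum()

# Hypothetical entangled matrix.
R_entangled = np.eye(2) / np.sqrt(2)

same_when_product = np.allclose(unit_trace(rho_from_wave_function(R_product)),
                                unit_trace(rho_reduced(R_product)))
same_when_entangled = np.allclose(unit_trace(rho_from_wave_function(R_entangled)),
                                  unit_trace(rho_reduced(R_entangled)))
```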

We see that there are three mathematical tools to describe a quantum system, namely, the relational matrix R, the reduced density matrix \({\rho }_{S}\), and the wave function \(|\psi {\rangle }_{S}\). They are equivalent in terms of calculating the probability of a future measurement outcome. The wave function can only be used when H(R) = 0. The reduced density matrix, on the other hand, can describe the quantum state of S regardless of whether H(R) = 0 or H(R) > 0. It is more generic in the formulation of quantum mechanics. However, in the case of H(R) = 0, the wave function defined in Eq. (13) better reflects the physical meaning of a superposition quantum state. Both \({\rho }_{S}\) and \(|\psi {\rangle }_{S}\) are derived from R. This confirms that R is a more fundamental variable in the formulation of quantum mechanics.

In deriving Eq. (13), we assume that observer \({{\mathscr{O}}}_{ {\mathcal E} }\), who is local to apparatus A′, has the latest information of the relational matrix R. \({{\mathscr{O}}}_{ {\mathcal E} }\) then comes to an equivalent description of S using the reduced density matrix, as shown in Eq. (23). This is significant since it gives the meaning of objectivity of a quantum state. Objectivity can be defined as the ability of different observers to come to a consensus independently10. On the other hand, if \({{\mathscr{O}}}_{ {\mathcal E} }\) is out of synchronization on the latest information of R, for instance, if there is an update on R due to a measurement that is not known to \({{\mathscr{O}}}_{ {\mathcal E} }\), then \({{\mathscr{O}}}_{ {\mathcal E} }\) can have a different description of S. This synchronization of the latest information is operational, but it is necessary. One can argue that the quantum state is absolute to any observer, but the statement is non-operational if two observers are space-like separated, and it leads to the EPR paradox13.

Operator

Although \({R}_{ij}^{SA}\) itself is not a probability quantity, we assume it follows some of the rules for probability calculation, for example, the multiplication rule, as seen in Eq. (3). Another important rule is the summation of alternatives in the intermediate steps. Let’s denote the initial relational matrix for S as \({R}_{init}^{SA}\). Suppose there is a dynamic (either a local operation or a time evolution) that changes S to a new state. The effect of the dynamics connects the initial state and the new state through a matrix \({R}_{p}^{SS}\). The new relational matrix element between A and S is

$${({R}_{new}^{SA})}_{ij}=\sum _{k}{({R}_{p}^{SS})}_{ik}{({R}_{init}^{SA})}_{kj}$$
(28)

Figure 1 shows the meaning of Eq. (28). The new matrix element \({({R}_{new}^{SA})}_{ij}\) is obtained by multiplying the initial relational matrix element \({({R}_{init}^{SA})}_{kj}\) and the local dynamics matrix element \({({R}_{p}^{SS})}_{ik}\), then summing over all possible intermediate steps.

Figure 1. Summation of alternatives for probability amplitude.

With the notation of wave function φi and reduced density matrix ρS, it is mathematically convenient to rewrite Eq. (28) without referring to A. Defining an operator \(\hat{M}\) in Hilbert space \({ {\mathcal H} }_{S}\) as \(\langle {s}_{i}|\hat{M}|{s}_{k}\rangle ={({R}_{p}^{SS})}_{ik}\), Eq. (28) becomes

$${({R}_{new}^{SA})}_{ij}=\sum _{k}{M}_{ik}{({R}_{init}^{SA})}_{kj},\,{\rm{or}}\,{R}_{new}=M{R}_{init}.$$
(29)

If R is not an entangled matrix, we can sum over index j on both sides of the above equation. Referring to the definition of φi, we obtain \({({\phi }_{i})}_{new}={\sum }_{k}\,{M}_{ik}{({\phi }_{k})}_{init}\). Substituting this into Eq. (13) gives

$$|\psi {\rangle }_{new}=\hat{M}|\psi {\rangle }_{init}$$
(30)

which is a familiar formulation. If R is an entangled matrix, we use the reduced density formulation,

$${\rho }_{new}={R}_{new}{({R}_{new})}^{\dagger }=M{\rho }_{init}{M}^{\dagger }.$$
(31)

Once again, we see that change of a quantum system can be described by either the relational matrix R, or the reduced density matrix that traces out the information of the reference system. Both descriptions are equivalent.
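The equivalence of Eqs. (28) to (31) is a matter of matrix algebra and can be sketched as follows. The matrices are random, hypothetical inputs; the explicit index sum is written with `numpy.einsum` to mirror the summation over intermediate steps in Eq. (28).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical initial relational matrix (3 x 4) and local dynamics matrix M (3 x 3).
R_init = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# Eq. (28): sum over the intermediate index k.
R_new_sum = np.einsum('ik,kj->ij', M, R_init)
# Eq. (29): the same statement as a matrix product.
R_new = M @ R_init

# Eq. (30): summing over j on both sides gives phi_new = M phi_init.
phi_init = R_init.sum(axis=1)
phi_new = R_new.sum(axis=1)

# Eq. (31): rho_new = M rho_init M^dagger.
rho_init = R_init @ R_init.conj().T
rho_new = R_new @ R_new.conj().T
```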

General Formulation of Time Evolution

Without loss of generality, we will only consider discrete time evolution here, which describes the state change from an initial time \({t}_{0}\) to some finite later time t. By definition, there is no physical interaction between S and A; S and A evolve independently. According to Eq. (28), the new state that S is changed to is related to the original state through a local evolution matrix \({R}_{p}^{SS}\). \({R}_{p}^{SS}\) is independent of A since there is no interaction between S and A. Similarly, the new state that A is changed to is related to the original state through a local evolution matrix \({R}_{p}^{AA}\). \({R}_{p}^{AA}\) is independent of S. To simplify the notation, we rewrite \(Q(t-{t}_{0})={R}_{p}^{SS}\) and \(O(t-{t}_{0})={R}_{p}^{AA}\), and denote the initial relational matrix between S and A as \(R({t}_{0})\). The time evolution of the relational matrix R(t) is depicted in Fig. 2. The matrix element at time t, \({R}_{ij}(t)\), shown as the dotted line in Fig. 2, is calculated by summing over all the possible intermediate steps between eigenvector |si(t)〉 and eigenvector |aj(t)〉:

$$\begin{array}{rcl}{R}_{ij}^{{S}_{t}{A}_{t}}(t) & = & \sum _{m,n}{Q}_{im}^{{S}_{t}{S}_{0}}(t-{t}_{0}){R}_{mn}^{{S}_{0}{A}_{0}}({t}_{0}){O}_{nj}^{{A}_{0}{A}_{t}}({t}_{0}-t)\\ & = & {(Q(t-{t}_{0})R({t}_{0})O({t}_{0}-t))}_{ij}\end{array}$$
(32)

Figure 2. Time evolution of probability amplitude.

The superscripts ensure the consistency of notation for the process \(({S}_{t}\to {S}_{0}\to {A}_{0}\to {A}_{t})\), and in the last step, they are omitted. Thus, the general formulation of the time evolution for the relational matrix is given by

$$R(t)=Q(t-{t}_{0})R({t}_{0}){O}^{\dagger }(t-{t}_{0}),$$
(33)

where we assume the property (see Note 7 in Supplemental Information) \(O({t}_{0}-t)={O}^{\dagger }(t-{t}_{0})\). For simplicity, let \({t}_{0}=0\); the reduced density matrix at time t is then \(\rho (t)=R(t){R}^{\dagger }(t)=Q(t)R(0){O}^{\dagger }(t)O(t){R}^{\dagger }(0){Q}^{\dagger }(t)\). According to Postulate 3, in time evolution the entanglement measure is unchanged. This means the von Neumann entropy is invariant during time evolution, H(R(t)) = H(R(0)). One sufficient condition to meet this requirement is that \(Q(t)\) and O(t) are unitary matrices; consequently ρ(t) and ρ(0) are unitarily similar matrices and have the same von Neumann entropy. However, the converse is not necessarily true. The condition H(R(t)) = H(R(0)) is too weak to lead to the conclusion that \(Q(t)\) and O(t) are unitary matrices. We wish to find additional conditions such that \(Q(t)\) and O(t) are unitary (see Note 8 in Supplemental Information).
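The sufficiency claim, that unitary Q(t) and O(t) leave the von Neumann entropy unchanged under Eq. (33), can be illustrated numerically. In this sketch the entropy helper rescales ρ to unit trace (an assumption made so that arbitrary matrices can be compared), and `random_unitary` is a hypothetical helper that takes the unitary factor of a QR decomposition. A simple non-unitary Q is included to show that the entropy is not preserved in general.

```python
import numpy as np

rng = np.random.default_rng(1)

def entropy(R, tol=1e-12):
    """von Neumann entropy of rho = R R^dagger, with rho rescaled to unit trace."""
    lam = np.linalg.eigvalsh(R @ R.conj().T)
    lam = lam / lam.sum()                 # assumption: normalize so Tr(rho) = 1
    lam = lam[lam > tol]
    return float(-(lam * np.log(lam)).sum())

def random_unitary(n):
    """Hypothetical helper: unitary factor of the QR decomposition of a random matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

# Eq. (33) with t0 = 0 and unitary Q, O.
R0 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Q, O = random_unitary(3), random_unitary(3)
Rt = Q @ R0 @ O.conj().T
H0, Ht = entropy(R0), entropy(Rt)

# A non-unitary Q changes the entropy in this deterministic example.
R_corr = np.eye(2) / np.sqrt(2)
Q_nonunitary = np.diag([1.0, 0.5])
H_before = entropy(R_corr)
H_after = entropy(Q_nonunitary @ R_corr)
```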

Schrödinger Equation

In the case that the initial states of S and A are unentangled, ρ(0) has only one nonzero eigenvalue, λ = 1, and H(R(0)) = 0. From Theorem 1, \({R}_{mn}(0)={c}_{m}{d}_{n}\), and Eq. (32) becomes

$$\begin{array}{rcl}{R}_{ij}(t) & = & \sum _{m,n}{Q}_{im}(t){c}_{m}{d}_{n}{O}_{nj}^{\dagger }(t)\\ & = & (\sum _{m}{Q}_{im}(t){c}_{m})(\sum _{n}{d}_{n}{O}_{nj}^{\dagger }(t)).\end{array}$$
(34)

The last expression of Rij(t) shows it can still be decomposed into the product of two separate terms, therefore H(R(t)) = 0 as expected. By definition, the initial wave function is \({\phi }_{m}(0)={\sum }_{n}\,{c}_{m}{d}_{n}={c}_{m}{d}_{0}\), where \({d}_{0}={\sum }_{n}\,{d}_{n}\). At time t it becomes

$$\begin{array}{rcl}{\phi }_{i}(t) & = & \sum _{j}{R}_{ij}(t)=\sum _{m}{Q}_{im}(t){c}_{m}\sum _{j,n}{d}_{n}{O}_{nj}^{\dagger }(t)\\ & = & d(t)\sum _{m}{Q}_{im}(t){\phi }_{m}(0)\end{array}$$
(35)

where \(d(t)={\sum }_{jn}\,({d}_{n}/{d}_{0}){O}_{nj}^{\dagger }(t)\) is a constant independent of S. Defining a linear operator \(\hat{Q}(t)\) in Hilbert space \({ {\mathcal H} }_{S}\) as \(\langle {s}_{i}|\hat{Q}(t)|{s}_{k}\rangle ={Q}_{ik}(t)\) and substituting Eq. (35) into Eq. (13), the state vector becomes

$$|\psi (t)\rangle =d(t)\hat{Q}(t)|\psi (0)\rangle .$$
(36)

Since the total probability should be preserved, \(\langle \psi (t)|\psi (t)\rangle =|d{|}^{2}\langle \psi (0)|{\hat{Q}}^{\dagger }\hat{Q}|\psi (0)\rangle =1\). This is true for any initial state |ψ(0)〉, thus \({\hat{Q}}^{\dagger }\hat{Q}=I/|d{|}^{2}\). There is an undetermined constant |d|. In general, one cannot conclude that Q(t) is a unitary matrix unless one chooses |d| = 1. If |d| = 1, \(d={e}^{i\varphi (t)}\) is an arbitrary phase. Rewriting \(\hat{Q}\) as \(\hat{U}\), Eq. (36) becomes

$$|\psi (t)\rangle ={e}^{i\varphi (t)}\hat{U}(t)|\psi (0)\rangle ={e}^{i\varphi (t)}{e}^{-i\hat{H}t/\hslash }|\psi (0)\rangle $$
(37)

where \(\hat{H}\) is a Hermitian operator for S. Omitting the arbitrary phase, Eq. (37) is the Schrödinger Equation. The derivation here does not give the actual expression of the Hamiltonian operator, but it manifests the fact that there is no change of entanglement measure between the observed system and the observing apparatus.

The above derivation depends on several conditions. First, there is no physical interaction between S and A; second, S and A are not entangled; third, the total probability is preserved; lastly, we choose |d(t)| = 1. The first two conditions are usually what one refers to as S being in an isolated state. In summary, given H(R(t)) = H(R(0)), if two more conditions are added, H(R(0)) = 0 and |d(t)| = 1, it is shown that matrix Q(t) is unitary, which leads to the Schrödinger Equation.
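The factorization in Eqs. (34) and (35) can be checked with a short sketch. It assumes a hypothetical product matrix \({R}_{mn}(0)={c}_{m}{d}_{n}\) and random unitary matrices for Q(t) and O(t) (generated by a hypothetical `random_unitary` helper), and it verifies that R(t) stays unentangled and that the wave function evolves as φ(t) = d(t)Qφ(0).

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unitary(n):
    """Hypothetical helper: unitary factor of the QR decomposition of a random matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

# Hypothetical unentangled initial state, R_mn(0) = c_m d_n.
c = np.array([0.6, 0.8j, 0.0])
d = np.array([0.5, 0.3, 0.2])
R0 = np.outer(c, d)

Q, O = random_unitary(3), random_unitary(3)
O_dag = O.conj().T

Rt = Q @ R0 @ O_dag                      # Eq. (34)
phi0 = R0.sum(axis=1)                    # phi_m(0) = c_m d_0
phit = Rt.sum(axis=1)                    # phi_i(t) = sum_j R_ij(t)

d0 = d.sum()
d_t = (d @ O_dag).sum() / d0             # d(t) = sum_{j,n} (d_n / d_0) O^dagger_{nj}(t)

# R(t) is still a product matrix, i.e. it has rank one.
rank_t = np.linalg.matrix_rank(Rt, tol=1e-10)
```

In this sketch |d(t)| is generally not equal to 1; setting it to 1 is the additional choice discussed in the text.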

A special case for Eq. (33) to be reduced to the Schrödinger Equation is when O(t) = I. With O(t) = I, \(R(t)={U}_{S}(t)R(0)\). Since H(R) = 0, we can use Eq. (13) to calculate the wave function

$$\begin{array}{rcl}{\phi }_{i}(t) & = & \sum _{j}{R}_{ij}(t)=\sum _{m}{Q}_{im}(t)\sum _{j}{R}_{mj}(0)\\ & = & \sum _{m}{Q}_{im}(t){\phi }_{m}(0)\end{array}$$
(38)

which is the same as Eq. (35) with d(t) = 1. The same reasoning from Eqs (35) to (37) applies here. O(t) = I is a very strong condition, and it may not be physically sensible because any quantum system evolves as time elapses. However, it may be considered an approximation: for a macroscopic classical apparatus, the change relative to its overall state is so small compared to the change for a microscopic quantum state that it can be ignored.

Generalized Differential Equation

Next, we consider the more general case that S and A are not interacting but initially entangled, i.e., H(R(0)) > 0. Since entanglement measure is unchanged, H(R(t)) = H(R(0)) > 0. S and A stay entangled at time t. S is not in an isolated state. We need to describe the composite system S + A as a whole relative to another unentangled apparatus A′. To proceed further the following theorem is introduced.

Theorem 3

Applying operator \(\hat{Q}\otimes \hat{O}\) over the composite system S + A is equivalent to changing the relational matrix R to \(R^{\prime} =QR{O}^{T}\), where the superscript T represents a transposition.

The proof is left to the Method Section. Since the composite system S + A is in an isolated state, based on the result in the previous Section, the overall time evolution operator \({\hat{U}}_{SA}\) is unitary. The state vector of the composite system at time t should be \(|{\rm{\Psi }}(t)\rangle ={\hat{U}}_{SA}|{\rm{\Psi }}(0)\rangle ={\hat{U}}_{SA}\,{\sum }_{ij}\,{R}_{ij}^{SA}|{s}_{i}\rangle |{a}_{j}\rangle \). Let \({\hat{U}}_{SA}=\exp (\,-\,i{\hat{H}}_{SA}t/\hslash )\) where \({\hat{H}}_{SA}\) is the Hamiltonian of the composite system. Since there is no interaction between S and A, \({\hat{H}}_{SA}={\hat{H}}_{S}+{\hat{H}}_{A}\) where \({\hat{H}}_{S}\) and \({\hat{H}}_{A}\) are the Hamiltonian operators in their respective Hilbert spaces. As shown in the Method Section, \({\hat{U}}_{SA}\) can be decomposed into \({\hat{U}}_{SA}={\hat{U}}_{S}\otimes {\hat{U}}_{A}\). According to Theorem 3, this effectively changes the relational matrix to \(R(t)={U}_{S}(t)R(0){U}_{A}^{T}(t)\). Note that \({U}_{A}^{T}(t)\) is also a unitary matrix. This is equivalent to the general time evolution Eq. (33) with the condition that both \(\hat{Q}(t)\) and \(\hat{O}(t)\) are unitary.
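Theorem 3 can be illustrated with the Kronecker product. The sketch assumes that the composite state vector is stored as the row-major flattening of R, so that the component for |si〉|aj〉 sits at index iM + j; the matrices are random, hypothetical inputs and need not be unitary for the identity to hold.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 2, 3

def random_complex(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

R = random_complex((N, M))        # relational matrix of S and A
Q = random_complex((N, N))        # operator acting on S
O = random_complex((M, M))        # operator acting on A

# |Psi> = sum_ij R_ij |s_i>|a_j>, stored with composite index i*M + j.
Psi = R.reshape(-1)

# Apply Q (x) O to the composite state and read the result back as a matrix.
Psi_new = np.kron(Q, O) @ Psi
R_from_composite = Psi_new.reshape(N, M)

# Theorem 3: the same result as R' = Q R O^T.
R_prime = Q @ R @ O.T
```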

Let’s rewrite the general time evolution dynamics, Eq. (33), in operator notation by introducing a linear operator \(\hat{R}={\sum }_{ij}\,{R}_{ij}|{s}_{i}\rangle \langle {a}_{j}|\). Since \(\hat{Q}(t)={\hat{U}}_{S}(t)=exp\{\,-\,i{\hat{H}}_{S}t/\hslash \}\) and \({\hat{O}}^{\dagger }(t)={\hat{U}}_{A}^{T}(t)=exp\{\,-\,i({\hat{H}}_{A}^{T})t/\hslash \}\), Eq. (33) becomes

$$\hat{R}(t)={e}^{-i{\hat{H}}_{S}t/\hslash }\hat{R}(0){e}^{-i({\hat{H}}_{A}^{T})t/\hslash }.$$
(39)

Because H(R) > 0, the wave function φ(t) of S cannot be defined. Consequently we cannot obtain a dynamics equation of wave function. Instead, a dynamics equation for \(\hat{R}\) can be derived. Taking the derivative over t of both sides of Eq. (39) and noting \([exp\{i({\hat{H}}_{A}^{T})t/\hslash \},{\hat{H}}_{A}^{T}]=0\), one gets

$$i\hslash \frac{d\hat{R}(t)}{dt}={\hat{H}}_{S}\hat{R}(t)+\hat{R}(t){\hat{H}}_{A}^{T}.$$
(40)

Note that \([\hat{R},{\hat{H}}_{A}^{T}]\ne 0\), i.e., \(\hat{R}\) and \({\hat{H}}_{A}^{T}\) are non-commutative (See Note 9 in Supplemental Information). Eq. (40) is a more general form of differential equation that describes the time evolution of R. Once solving the above equation and obtaining R(t), one can calculate the probability according to Postulate 2e.
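Equation (40) can be checked against Eq. (39) with a central finite difference. The sketch below sets ħ = 1 (an assumption for convenience), uses random Hermitian matrices as hypothetical Hamiltonians, and builds exp(−iHt) from the eigendecomposition of H; the helper names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_hermitian(n):
    """Hypothetical helper: a random Hermitian matrix standing in for a Hamiltonian."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def exp_minus_i(H, t):
    """exp(-i H t) for Hermitian H via its eigendecomposition (hbar = 1)."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

H_S, H_A = random_hermitian(2), random_hermitian(3)
R0 = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))

def R_at(t):
    """Eq. (39): R(t) = exp(-i H_S t) R(0) exp(-i H_A^T t)."""
    return exp_minus_i(H_S, t) @ R0 @ exp_minus_i(H_A.T, t)

t, h = 0.7, 1e-5
dR_dt = (R_at(t + h) - R_at(t - h)) / (2 * h)     # central difference

lhs = 1j * dR_dt                                  # i * dR/dt
rhs = H_S @ R_at(t) + R_at(t) @ H_A.T             # H_S R + R H_A^T
```

The two sides agree up to the discretization error of the finite difference.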

To derive a differential equation without explicitly referring to the apparatus A, we should use the reduced density matrix approach since S and A can be entangled. Given the dynamics of the relational matrix is \(R(t)={U}_{S}(t)R(0){U}_{A}^{T}(t)\), the reduced density matrix of S is \(\rho (t)=R(t){R}^{\dagger }(t)={U}_{S}(t)\rho (0){U}_{S}^{\dagger }(t)\). Defining density operator \(\hat{\rho }(t)\) for S such that \(\langle {s}_{i}|\hat{\rho }(t)|{s}_{j}\rangle ={[R(t){R}^{\dagger }(t)]}_{ij}\), we can rewrite the equation to

$$\hat{\rho }(t)={e}^{-i{\hat{H}}_{S}t/\hslash }\hat{\rho }(0){e}^{i{\hat{H}}_{S}t/\hslash }$$
(41)

Taking the derivative over t of both sides, we obtain the Liouville–von Neumann equation

$$i\hslash \frac{d\hat{\rho }(t)}{dt}={\hat{H}}_{S}\hat{\rho }(t)-\hat{\rho }(t){\hat{H}}_{S}=[{\hat{H}}_{S},\hat{\rho }(t)].$$
(42)

Eqs (40) and (42) give equivalent descriptions of the dynamics of the quantum state of S. Eq. (42) has the advantage of describing the time evolution of S without reference to A and is therefore mathematically more elegant. However, it can give the impression that the quantum system can be described independently of the reference system.
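The way the apparatus drops out of the reduced description can also be sketched numerically. With ħ = 1 and random Hermitian matrices as hypothetical Hamiltonians, the snippet below evolves R as in Eq. (39), forms ρ(t) = R(t)R†(t), and compares it with Eq. (41); it then checks the Liouville–von Neumann equation (42) with a central finite difference.

```python
import numpy as np

rng = np.random.default_rng(5)

def random_hermitian(n):
    """Hypothetical helper: a random Hermitian matrix standing in for a Hamiltonian."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def exp_minus_i(H, t):
    """exp(-i H t) for Hermitian H via its eigendecomposition (hbar = 1)."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

H_S, H_A = random_hermitian(2), random_hermitian(3)
R0 = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))
R0 = R0 / np.linalg.norm(R0)                 # so that Tr(R0 R0^dagger) = 1
rho0 = R0 @ R0.conj().T

def rho_at(t):
    """rho(t) = R(t) R(t)^dagger with R(t) evolved as in Eq. (39)."""
    Rt = exp_minus_i(H_S, t) @ R0 @ exp_minus_i(H_A.T, t)
    return Rt @ Rt.conj().T

t, h = 0.4, 1e-5
U_S = exp_minus_i(H_S, t)
rho_eq41 = U_S @ rho0 @ U_S.conj().T         # Eq. (41): no reference to H_A

drho_dt = (rho_at(t + h) - rho_at(t - h)) / (2 * h)
lhs = 1j * drho_dt
rhs = H_S @ rho_at(t) - rho_at(t) @ H_S      # Eq. (42): [H_S, rho]
```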

Eqs (40) and (42) also confirm two equivalent methodologies to describe the change of the quantum state of S: (1) calculate the change of the relational matrix R and compute the quantum probability based on Postulate 2e; (2) derive the wave function of the composite state for S + A, then trace out A over the composite state to obtain ρS.

Discussion and Conclusion

Hypotheses in the Reconstruction

The reconstruction of quantum theory presented in this paper is based on two hypotheses. First, a quantum system should be described relative to a reference system. This implies the relational properties between two quantum systems are more basic than the properties of one system. We take this hypothesis as a starting point to reformulate quantum mechanics. This reference system is not arbitrary. It is the apparatus, or environment, A, that the system S has previously interacted with. Although the reference system A is unique and objectively selected, it is possible that another observer does not have complete information of the interaction (or measurement) results between A and S. In such a case she can describe S differently using a different set of relational properties between S and A. It is in this sense that we say the relational properties themselves are observer-dependent. This is indeed the main thesis of ref.12. In the example of ideal measurement described by Eq. (1), suppose the measurement outcome corresponds to eigenvector |sn〉. An observer who operates and reads the pointer variable of A knows the measurement outcome. At the end of the measurement, her relational description is given by |sn〉 |an〉. On the other hand, for another observer who only knows there is interaction between S and A, but does not know the measurement outcome, the relational description is given by \({\sum }_{i}\,{c}_{i}|{s}_{i}\rangle |{a}_{i}\rangle \). Both descriptions are based on relational properties, and they are observer-dependent. Thus, there are two layers of relativity here. In this paper, we assume observers share the same information of the relational matrix, and focus on formulating quantum mechanics based on the relational probability amplitude. The observer-dependent aspect of the formulation is more relevant to measurement theory, which will be analyzed in an upcoming paper.

The second hypothesis is due to the realization that a real physical measurement is a bidirectional process. It is a question-and-answer, or a probe-and-response, interaction process. This bidirectional aspect of a physical measurement seems to be overlooked in other formulations of quantum mechanics. Here we mandate that a framework for calculating the probability of a potential outcome from a physical interaction must explicitly model the bidirectional process. A variable that models only one direction of the process cannot be considered a real probability number, because a one-way process does not model an actual measurement. Instead, such a unidirectional quantity is called a probability amplitude and is not necessarily a real non-negative number. The distinction between unidirectional and bidirectional processes allows us to relax the mathematical requirement on the probability amplitude and to consider it as a complex number. However, we assume it still follows some of the rules of probability calculation, such as multiplication and summation over alternatives of intermediate steps.

With these two hypotheses, a framework is developed such that the task of calculating probability in a specific measurement setup is reduced to counting the applicable measurement configurations in the joint Hilbert space for the measured system and the apparatus. It is interesting to notice that the two hypotheses philosophically echo the ideas expressed in ref.26 that the physical world is made of processes instead of objects, and the properties are described in terms of relationships between events.

The Reference System

Although a quantum system should be described relative to a reference system, and the relational probability amplitude matrix R is considered as the most basic variable, there are mathematical tools that allow a quantum system S to be described without explicitly calling out the reference system A. When S and A are unentangled, S is described by a wave function that sums out the information of the reference system. When S and A are entangled, S is described by a reduced density matrix that traces out the information of A. These mathematical tools enable us to develop the formulations for time evolution and measurement theory that are equivalent to those in the traditional quantum mechanics.

Except in some special scenarios, such as the one described in the EPR argument, there is no need to explicitly call out the reference quantum system A. Mathematically it is more convenient and elegant to trace out the information of the reference system. However, explicitly including the reference system allows us to develop a framework to explain the origin of the quantum probability and to quantify the difference between time evolution and quantum measurement.

It is interesting to note that ref.27 also proposes to use relational logic and category theory to deduce the basic laws of quantum mechanics. However, the formulation in ref.27 is rather abstract. Quantum probability is introduced purely from a mathematical perspective, instead of being associated with the process of an actual physical measurement. How entanglement influences the probability calculation is neither discussed nor formulated in ref.27.

Comparison with the Original RQM Theory

The work presented here is inspired by the main idea of the original RQM theory12. However, there are several significant improvements that should be pointed out.

The works of refs9,12 establish the idea that relational properties are more basic, and that a quantum system must be described relative to another quantum system. However, they do not provide a clear formulation of how a quantum system should be described relative to another system, or of what the basic relational properties are. In contrast, our work gives a clear quantification of the relational property, which is the relational probability amplitude. The introduction of the relational probability amplitude is based on a detailed analysis of the measurement process. It enables us to develop a framework to calculate probability during quantum measurement. We further show in the Methods Section that the relational probability amplitude can be calculated using the Feynman path integral.

The second improvement in this work comes from the introduction of the concept of entanglement to the RQM theory. We recognize not only that a quantum system must be described relative to another quantum system, but also that the entanglement between these two systems affects how the observed system is described. If there is no entanglement, the observed system can be described by a pure wave function. If there is entanglement, a reduced density matrix is the more appropriate mathematical tool. In addition, the entanglement measure plays a pivotal role in determining whether a system is undergoing a time evolution or a measurement process. This allows us to reconstruct both the Schrödinger equation and the measurement theory (the reconstruction of quantum measurement theory is presented in an upcoming paper). When one states that a quantum system must be described relative to another quantum system, one can further quantify this relativity via the entanglement measure between these two systems. However, the concept of entanglement is not present in ref.12. The attempt in ref.12 to derive the laws of quantum mechanics based on quantum logic is rather limited, since only the Schrödinger equation is reconstructed.

Thirdly, although a quantum system must be described relative to another quantum system, our work shows that there are mathematical tools that can describe the observed system without explicitly calling out the observing system, such as the wave function and the reduced density matrix shown in the Result Section. Therefore, RQM and traditional QM are compatible mathematically. This is important because it confirms that although the main idea of RQM seems radical, it does not change the practical application of quantum mechanics. Again, this point was not clear in ref.12.

Limitations

The formulation presented in this paper has several limitations. It assumes a finite-dimensional Hilbert space for both the observed quantum system and the observing apparatus. It is desirable to extend the formulation to Hilbert spaces of infinite dimension. It is mathematically more cumbersome to calculate the wave function from a relational matrix than to just assume an observer-independent wave function. More mathematical rigor is needed for some of the derivations. For instance, given the general time evolution dynamics in Eq. (33), it remains an open question what the necessary and sufficient conditions are for keeping the entanglement measure invariant. The Result Section only gives several sufficient conditions that lead to the Schrödinger Equation. Conceptually, the path integral is only one way to implement the relational probability amplitude. There could be other implementations with sound physical foundations. Furthermore, implementing the relational properties between S and A with one definite matrix implies that the composite system S + A is in a pure state. S + A can also be in a mixed state and described by an ensemble of relational matrices. A rigorous mathematical treatment of the mixed state is desirable, especially when ρSA is not a separable mixed state. It should also be noted that only non-relativistic quantum mechanics is considered here.

Conclusions

We have shown that quantum mechanics can be constructed by shifting the starting point from the independent properties of a quantum system to the relational properties among quantum systems. This idea, combined with the emphasis that a physical measurement is a bidirectional interaction process, enables us to propose a framework to calculate the probability of an outcome when measuring a quantum system. Quantum probability is proportional to the summation of weights representing the bidirectional measurement process from all applicable configurations in the joint Hilbert space of the measured and measuring composite system. This postulate explains why the quantum probability is the absolute square of a complex number when there is no entanglement. The wave function of the observed system is simply the summation of relational probability amplitudes. If there is entanglement between the measured and measuring systems, the way probability is calculated is adjusted due to the presence of correlation. In essence, quantum mechanics demands a new set of rules to calculate the probability of a potential outcome from a physical interaction in the quantum world. Quantum theory does not directly describe measurable physical properties such as force, length, etc. Instead it deals with quantities such as the probability amplitude, and provides a set of rules to connect them to those measurable physical properties. In this sense, quantum mechanics is a probability theory for describing the process of measuring a quantum system through interaction.

Based on the postulates, formulations for time evolution and quantum measurement can be reconstructed. The Schrödinger Equation is derived when the observing system is in an isolated state. Although the theory developed in this work is mathematically equivalent to traditional quantum mechanics, there are several significant implications of this formulation. First, the reformulation shows that the relational property can be the most fundamental element from which to construct quantum mechanics. Second, it brings new insight into the origin of the quantum probability. Third, the path integral formulation is generalized to formulate the reduced density matrix of a quantum system. This may pave the way to extend the reformulation to quantum field theory and deserves further research. Finally, as with other efforts to reformulate quantum mechanics, it is always interesting to gain a new perspective on a traditional theory. The hope is that the reformulation presented here can be one step towards a better understanding of quantum mechanics.

Methods

Proof of Eq. (16)

To prove Eq. (16), we perform a transformation of eigenbasis. The initial eigenbasis for S is {|si〉} and the relational matrix is R. We introduce another eigenbasis \(\{|{s^{\prime} }_{i}\rangle \}\) such that its first eigenvector \(|{s^{\prime} }_{0}\rangle \) is |χ〉. Denote the unitary matrix that relates the two eigenbases as U. We have \(U|\chi \rangle =|{s^{\prime} }_{0}\rangle \), or,

$$|\chi \rangle ={U}^{\dagger }|{s^{\prime} }_{0}\rangle $$
(43)

From the definition of the wave function, we have \(|\chi \rangle ={\{{\varphi }_{0},{\varphi }_{1},\ldots ,{\varphi }_{N}\}}^{T}\), where \({\varphi }_{i}={\sum }_{j}\,{Q}_{ij}\). Substituting this into the above equation, we get \({U}_{i0}^{\dagger }={\varphi }_{i}\), i.e.,

$${U}_{0i}=\sum _{j}\,{Q}_{ij}^{\ast }$$
(44)

In the new eigenbasis, the original relational matrix R is transformed to R′ = UR. The probability of finding S described by the state vector χ corresponds to the probability of finding S in eigenvector \(|{s^{\prime} }_{0}\rangle \), which according to Eq. (9) is

$$\begin{array}{rcl}p(\chi |\psi ) & = & {|\sum _{j}{R^{\prime} }_{0j}|}^{2}={|\sum _{j}{(UR)}_{0j}|}^{2}\\ & = & {|\sum _{j}\sum _{i}{U}_{0i}{R}_{ij}|}^{2}={|\sum _{ij}\sum _{k}{Q}_{ik}^{\ast }{R}_{ij}|}^{2}\\ & = & {|\sum _{jk}(\sum _{i}{Q}_{ki}^{\dagger }{R}_{ij})|}^{2}={|\sum _{jk}{({Q}^{\dagger }R)}_{kj}|}^{2}\\ & = & {|\sum _{i}(\sum _{k}{Q}_{ik}^{\ast })(\sum _{j}{R}_{ij})|}^{2}\\ & = & {|\sum _{i}{\varphi }_{i}^{\ast }{\psi }_{i}|}^{2}={|\langle \chi |\psi \rangle |}^{2}.\end{array}$$
(45)

In the second line, the replacement of \({U}_{0i}\) by \({\sum }_{k}\,{Q}_{ik}^{\ast }\) uses the relation Eq. (44).
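The chain of equalities in Eq. (45) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: Q and R are hypothetical random complex matrices of the same shape, and the wave functions are obtained by summing each matrix over its second (apparatus) index, as in the definitions \({\varphi }_{i}={\sum }_{j}\,{Q}_{ij}\) and \({\psi }_{i}={\sum }_{j}\,{R}_{ij}\).

```python
import numpy as np

def wave_function(M):
    """Sum a relational matrix over the apparatus index j."""
    return M.sum(axis=1)

def prob_from_matrices(Q, R):
    """|sum_{jk} (Q^dagger R)_{kj}|^2, the matrix form in Eq. (45)."""
    return abs((Q.conj().T @ R).sum()) ** 2

def prob_from_inner_product(Q, R):
    """|<chi|psi>|^2 with chi and psi built from Q and R."""
    chi, psi = wave_function(Q), wave_function(R)
    return abs(np.vdot(chi, psi)) ** 2   # vdot conjugates its first argument

# Hypothetical test matrices (not taken from the paper).
rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))
R = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))

p_matrix = prob_from_matrices(Q, R)
p_inner = prob_from_inner_product(Q, R)
print(np.isclose(p_matrix, p_inner))
```

No normalization is applied here, so the two numbers agree as unnormalized weights; normalizing both wave functions would rescale each side by the same factor.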

Proof of Theorem 1

According to the singular value decomposition, the relational matrix R can be decomposed as R = UDV, where D is rectangular diagonal, and U and V are N × N and M × M unitary matrices, respectively. This gives \(\rho =R{R}^{\dagger }=U(D{D}^{\dagger }){U}^{\dagger }\). If H(R) = 0, matrix ρ is a rank-one matrix, and therefore \(D{D}^{\dagger }\) is diag{1, 0, 0 …}. This means D is a rectangular diagonal matrix with only one nonzero element, \({e}^{i\varphi }\). Expanding the matrix product R = UDV gives

$${R}_{ij}=\sum _{nm}\,{U}_{in}{D}_{nm}{V}_{mj}={U}_{i1}{e}^{i\varphi }{V}_{1j}.$$
(46)

We can simply choose \({c}_{i}={U}_{i1}\) and \({d}_{j}={e}^{i\varphi }{V}_{1j}\) to get \({R}_{ij}={c}_{i}{d}_{j}\). Conversely, if \({R}_{ij}={c}_{i}{d}_{j}\), R can be written as the outer product of two vectors,

$$R={({c}_{1}{c}_{2}\ldots {c}_{n})}^{T}\times ({d}_{1}\,{d}_{2}\,\ldots \,{d}_{m}).$$
(47)

Considering vector C1 = {c1, c2, …, cn} as an eigenvector in Hilbert space \({ {\mathcal H} }_{S}\), one can use the Gram-Schmidt procedure22 to find orthogonal basis set C2, …, Cn. Similarly, considering vector D1 = {d1, d2, …, dm} as an eigenvector in Hilbert space \({ {\mathcal H} }_{A}\), one can find orthogonal basis set D2, …, Dm. Under the new orthogonal eigenbasis, R becomes a rectangular diagonal matrix D = diag{1, 0, 0…}. Therefore R = UDV where U and V are two unitary matrices associated with the eigenbasis transformations. Then \(\rho =R{R}^{\dagger }=U(D{D}^{\dagger }){U}^{\dagger }\), and \(D{D}^{\dagger }=diag\{1,0,0\,\ldots \}\) is a square diagonal matrix. Since the eigenvalues of similar matrices are the same, the eigenvalues of ρ are (1, 0, …), thus H(R) = 0.
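Theorem 1 can be illustrated numerically through the singular values of R. The following is a minimal sketch under stated assumptions: the entropy is computed with the natural logarithm from the normalized eigenvalues of RR† (the squared singular values of R), and the function name and the test matrices are hypothetical.

```python
import numpy as np

def entanglement_entropy(R, tol=1e-15):
    """Von Neumann entropy of rho = R R^dagger / Tr(R R^dagger), natural log."""
    s = np.linalg.svd(R, compute_uv=False)   # singular values of R
    p = s**2 / np.sum(s**2)                  # eigenvalues of normalized rho
    p = p[p > tol]                           # drop numerical zeros
    return float(-(p * np.log(p)).sum())

# Product form R_ij = c_i d_j (hypothetical vectors): entropy should vanish.
rng = np.random.default_rng(2)
c = rng.normal(size=3) + 1j * rng.normal(size=3)
d = rng.normal(size=4) + 1j * rng.normal(size=4)
R_product = np.outer(c, d)

# Maximally correlated 2 x 2 example: entropy should equal log(2).
R_entangled = np.eye(2) / np.sqrt(2)

H_product = entanglement_entropy(R_product)
H_entangled = entanglement_entropy(R_entangled)
print(H_product, H_entangled)
```

Using the singular values directly avoids forming and diagonalizing ρ, and mirrors the decomposition R = UDV used in the proof.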

Proof of Theorem 2

Assuming S + A is in a pure state, we use the Von Neumann entropy H(R) as the entanglement measure. If H(R) = 0, by virtue of Theorem 1, Rij = cidj. Assume that both \({{\mathscr{O}}}_{ {\mathcal I} }\) and \({{\mathscr{O}}}_{ {\mathcal E} }\) share the same knowledge of Rij. The reduced density matrices relative to each observer are calculated as

$$\begin{array}{rcl}{\hat{\rho }}_{S}^{^{\prime} } & = & \sum _{ii^{\prime} }(\sum _{jj^{\prime} }{R}_{ij}{R}_{i^{\prime} j^{\prime} }^{\ast })|i\rangle \langle i^{\prime} |\\ & = & \sum _{ii^{\prime} }{c}_{i}{({c}_{i^{\prime} })}^{\ast }(\sum _{jj^{\prime} }{d}_{j}{d}_{j^{\prime} }^{\ast })|i\rangle \langle i^{\prime} |\\ & = & {d}_{A}\sum _{ii^{\prime} }{c}_{i}{({c}_{i^{\prime} })}^{\ast }|i\rangle \langle i^{\prime} |\\ {\hat{\rho }}_{S} & = & \sum _{ii^{\prime} }(\sum _{k}{R}_{ik}{R}_{i^{\prime} k}^{\ast })|i\rangle \langle i^{\prime} |\\ & = & \sum _{ii^{\prime} }{c}_{i}{({c}_{i^{\prime} })}^{\ast }(\sum _{k}|{d}_{k}{|}^{2})|i\rangle \langle i^{\prime} |\\ & = & {d}_{A^{\prime} }\sum _{ii^{\prime} }{c}_{i}{({c}_{i^{\prime} })}^{\ast }|i\rangle \langle i^{\prime} |\end{array}$$
(48)

where \({d}_{A}\) and \({d}_{A^{\prime} }\) are two constants. \({\hat{\rho }}_{S}\) and \({\hat{\rho }}_{S}^{^{\prime} }\) only differ by a constant when H(R) = 0. Since \(Tr({\hat{\rho }}_{S}^{^{\prime} })=Tr({\hat{\rho }}_{S})=1\), we can simply choose \({d}_{A}={d}_{A^{\prime} }\) so that \({\hat{\rho }}_{S}={\hat{\rho }}_{S}^{^{\prime} }\).
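The statement that the two expressions in Eq. (48) coincide after normalization when Rij = cidj can be checked with a small example. This is a minimal sketch, with hypothetical function names and randomly chosen vectors c and d; it assumes \({\sum }_{j}\,{d}_{j}\ne 0\) so that the first expression can be normalized.

```python
import numpy as np

def rho_from_summed_amplitudes(R):
    """First form in Eq. (48): sum_{jj'} R_ij R*_{i'j'}, then unit trace."""
    psi = R.sum(axis=1)                      # psi_i = sum_j R_ij
    rho = np.outer(psi, psi.conj())
    return rho / np.trace(rho).real

def rho_from_partial_trace(R):
    """Second form in Eq. (48): sum_k R_ik R*_{i'k}, then unit trace."""
    rho = R @ R.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(3)
c = rng.normal(size=3) + 1j * rng.normal(size=3)
d = rng.normal(size=4) + 1j * rng.normal(size=4)

R_product = np.outer(c, d)                                   # H(R) = 0
R_generic = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))

same_for_product = np.allclose(rho_from_summed_amplitudes(R_product),
                               rho_from_partial_trace(R_product))
same_for_generic = np.allclose(rho_from_summed_amplitudes(R_generic),
                               rho_from_partial_trace(R_generic))
print(same_for_product, same_for_generic)
```

For the product matrix the two results agree, while for a generic entangled R they differ, since the first form is always rank one and the second is not.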

Proof of Theorem 3

Denote the initial state vector of the composite system as \(|{{\rm{\Psi }}}_{0}\rangle ={\sum }_{ij}\,{R}_{ij}|{s}_{i}\rangle |{a}_{j}\rangle \). Apply the composite operator \(\hat{Q}(t)\otimes \hat{O}(t)\) to the initial state,

$$\begin{array}{rcl}|{{\rm{\Psi }}}_{1}\rangle & = & (\hat{Q}\otimes \hat{O})\sum _{ij}{R}_{ij}|{s}_{i}\rangle \otimes |{a}_{j}\rangle \\ & = & \sum _{ij}{R}_{ij}\hat{Q}|{s}_{i}\rangle \otimes \hat{O}|{a}_{j}\rangle \\ & = & \sum _{ij}\sum _{mn}{R}_{ij}{Q}_{mi}{O}_{nj}|{s}_{m}\rangle \otimes |{a}_{n}\rangle \\ & = & \sum _{mn}(\sum _{ij}{Q}_{mi}{R}_{ij}{O}_{jn}^{T})|{s}_{m}\rangle \otimes |{a}_{n}\rangle \\ & = & \sum _{mn}{(QR{O}^{T})}_{mn}|{s}_{m}\rangle |{a}_{n}\rangle \end{array}$$
(49)

where T represents the transposition of a matrix. Comparing the above equation to Eq. (13) for the definition of |Ψ1〉, it is clear that the relational matrix is changed to R′ = QROT.
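The index manipulation in Eq. (49) can be verified numerically by comparing the action of the Kronecker product on the flattened state vector with the matrix product QROT. The sketch below assumes a row-major flattening (flat index = i·M + j), which matches the ordering used by `np.kron`; the helper names and the random unitaries are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Unitary matrix from the QR decomposition of a random complex matrix."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    U, _ = np.linalg.qr(A)
    return U

def evolve_state_vector(R, Q, O):
    """Apply Q (x) O to |Psi_0> = sum_ij R_ij |s_i>|a_j>, return as a matrix."""
    n, m = R.shape
    psi0 = R.reshape(n * m)                  # row-major flattening
    psi1 = np.kron(Q, O) @ psi0
    return psi1.reshape(n, m)

def evolve_relational_matrix(R, Q, O):
    """Transformation of the relational matrix stated in Theorem 3."""
    return Q @ R @ O.T

rng = np.random.default_rng(4)
R = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
Q = random_unitary(3, rng)                   # acts on S
O = random_unitary(4, rng)                   # acts on A

R_from_state = evolve_state_vector(R, Q, O)
R_from_matrix = evolve_relational_matrix(R, Q, O)
print(np.allclose(R_from_state, R_from_matrix))
```

Note that the transpose in QROT is a plain transpose, not a conjugate transpose, which is why `O.T` is used rather than `O.conj().T`.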

Decomposition of the Unitary Operator of a Bipartite System

Here we show that if there is no interaction between S and A, a global unitary operator for the composite system S + A is decomposed into the tensor product of two local unitary operators. Let {|si〉} be the orthogonal eigenbasis of \({\hat{H}}_{S}\), \({\hat{H}}_{S}|{s}_{i}\rangle ={E}_{i}^{S}|{s}_{i}\rangle \). Recall that the definition of a function of operator \(\hat{H}\) is

$$f(\hat{H})=\sum _{i}\,f({E}_{i})|{s}_{i}\rangle \langle {s}_{i}|$$
(50)

Based on this definition, \({\hat{U}}_{S}=exp\{\,-\,(i/\hslash ){\hat{H}}_{S}t\}={\sum }_{i}\,exp\{\,-\,(i/\hslash ){E}_{i}^{S}t\}|{s}_{i}\rangle \langle {s}_{i}|\). Similarly, let {|aj〉} be the orthogonal eigenbasis of \({\hat{H}}_{A}\), \({\hat{H}}_{A}|{a}_{j}\rangle ={E}_{j}^{A}|{a}_{j}\rangle \) and \({\hat{U}}_{A}={\sum }_{j}\,exp\{\,-\,(i/\hslash ){E}_{j}^{A}t\}|{a}_{j}\rangle \langle {a}_{j}|\). When there is no interaction between S and A, \({\hat{H}}_{SA}={\hat{H}}_{S}+{\hat{H}}_{A}\) where \({\hat{H}}_{S}\) and \({\hat{H}}_{A}\) are the Hamiltonian operators in their respective Hilbert spaces, thus \({\hat{U}}_{SA}=exp\{\,-\,(i/\hslash )({\hat{H}}_{S}+{\hat{H}}_{A})t\}\). According to Postulate 4, the set {|si〉|aj〉} forms the orthogonal eigenbasis for \({\hat{H}}_{SA}\), so that \({\hat{H}}_{SA}|{s}_{i}\rangle |{a}_{j}\rangle =({E}_{i}^{S}+{E}_{j}^{A})|{s}_{i}\rangle |{a}_{j}\rangle \) and \(exp\{\,-\,(i/\hslash )({\hat{H}}_{S}+{\hat{H}}_{A})t\}|{s}_{i}\rangle |{a}_{j}\rangle =exp\{\,-\,(i/\hslash )({E}_{i}^{S}+{E}_{j}^{A})t\}|{s}_{i}\rangle |{a}_{j}\rangle \). From the definition of the operator function,

$$\begin{array}{rcl}{\hat{U}}_{SA} & = & \sum _{ij}f({E}_{ij})|{s}_{i}\rangle |{a}_{j}\rangle \langle {s}_{i}|\langle {a}_{j}|\\ & = & \sum _{ij}exp\{\,-\,(i/\hslash )({E}_{i}^{S}+{E}_{j}^{A})t\}|{s}_{i}\rangle |{a}_{j}\rangle \langle {s}_{i}|\langle {a}_{j}|\\ & = & \sum _{i}exp\{\,-\,(i/\hslash ){E}_{i}^{S}t\}|{s}_{i}\rangle \langle {s}_{i}|\\ & \otimes & \sum _{j}exp\{\,-\,(i/\hslash ){E}_{j}^{A}t\}|{a}_{j}\rangle \langle {a}_{j}|\\ & = & {\hat{U}}_{S}\otimes {\hat{U}}_{A}\mathrm{.}\end{array}$$
(51)
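The factorization in Eq. (51) can be confirmed numerically for small matrices. The sketch below is an illustration only: it builds the exponential from the spectral decomposition, following the operator-function definition in Eq. (50), sets ħ = 1 for convenience, and uses hypothetical random Hermitian matrices for the two Hamiltonians.

```python
import numpy as np

def random_hermitian(n, rng):
    """Hypothetical Hamiltonian: a random n x n Hermitian matrix."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def unitary_from_hamiltonian(H, t, hbar=1.0):
    """exp(-i H t / hbar) via the spectral decomposition, as in Eq. (50)."""
    E, V = np.linalg.eigh(H)                          # H = V diag(E) V^dagger
    return (V * np.exp(-1j * E * t / hbar)) @ V.conj().T

rng = np.random.default_rng(5)
H_S = random_hermitian(3, rng)
H_A = random_hermitian(4, rng)
t = 0.7

# Non-interacting composite Hamiltonian on the joint Hilbert space.
H_SA = np.kron(H_S, np.eye(4)) + np.kron(np.eye(3), H_A)

U_S = unitary_from_hamiltonian(H_S, t)
U_A = unitary_from_hamiltonian(H_A, t)
U_SA = unitary_from_hamiltonian(H_SA, t)

print(np.allclose(U_SA, np.kron(U_S, U_A)))
```

Adding any interaction term that does not decompose as a sum of local operators to `H_SA` would in general break this equality, which is the case treated with the reduced density matrix later in this section.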

Path Integral Implementation

This section briefly describes how the relational probability amplitude can be calculated using the Path Integral formulation. Without loss of generality, the following discussion focuses on a quantum system in one-dimensional space-time. In the Path Integral formulation, the probability of finding a quantum system moving from a point xa at time ta to a point xb at time tb is the absolute square of a probability amplitude, i.e., \(P(b,a)={|K(b,a)|}^{2}\). The probability amplitude is postulated as the sum of the phase contributions from each path28:

$$K(b,a)=\frac{1}{N}\sum _{path}{e}^{(i/\hslash ){S}_{p}(x(t))}$$
(52)

where N is a normalization constant, and Sp(x(t)) is the action along a particular path from point xa to point xb. The action is defined as \({S}_{p}(x(t))={\int }_{{t}_{a}}^{{t}_{b}}\,L(\dot{x},x,t)dt\) where L is the Lagrangian of the system. Since there is an infinite number of possible paths from point xa to point xb, the summation in Eq. (52) should, more precisely, be replaced by an integral

$$K(b,a)={\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}(x(t))}{\mathscr{D}}x(t)$$
(53)

where \({\mathscr{D}}x(t)\) denotes the integral over all possible paths from point xa to point xb. The amplitude K(b, a) is the wave function for S moving from xa to xb28. The wave function of the particle at position xb is

$$\phi ({x}_{b},{t}_{b})={\int }_{-\infty }^{\infty }K({x}_{b},{t}_{b};{x}_{a},{t}_{a})\phi ({x}_{a},{t}_{a})d{x}_{a}$$
(54)

where φ(xa, ta) is the wave function of the particle at position xa. Eq. (54) is the integral form of the Schrödinger Eq. (37).

Now let’s consider how the relational matrix element can be formulated. At a particular time ta, we denote the matrix element as R(xa; ya). Here the coordinates xa and ya act as indices for the system S and the apparatus A, respectively. Suppose that from time ta to tb, S moves from xa to xb, and A moves from ya to yb. The relational matrix element is then written as R(xb, xa; yb, ya). Borrowing the ideas described in Eq. (53), we propose that

$$\begin{array}{rcl}R({x}_{b},{x}_{a};{y}_{b},{y}_{a}) & = & {\int }_{a}^{b}{\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{SA}(x(t),y(t))}\\ & & \times \,{\mathscr{D}}x(t){\mathscr{D}}y(t)\end{array}$$
(55)

where the action \({S}_{p}^{SA}(x(t),y(t))\) consists of three terms

$$\begin{array}{rcl}{S}_{p}^{SA}(x(t),y(t)) & = & {S}_{p}^{S}(x(t))+{S}_{p}^{A}(y(t))\\ & & +\,{S}_{{int}}^{SA}(x(t),y(t)).\end{array}$$
(56)

The last term is the action due to the interaction between S and A when each system moves along its particular path. Eq. (55) is considered an extension of Postulate 1. We can validate Eq. (55) by deriving a formulation that is consistent with the traditional path integral. Suppose there is no interaction between S and A. The third term in Eq. (56) then vanishes, and Eq. (55) decomposes into a product of two independent terms,

$$\begin{array}{rcl}R({x}_{b},{x}_{a};{y}_{b},{y}_{a}) & = & {\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{S}(x(t))}{\mathscr{D}}x(t)\\ & & \times \,{\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{A}(y(t))}{\mathscr{D}}y(t)\end{array}$$
(57)

Noting that the coordinates ya and yb are equivalent to the index j in Eq. (13), the wave function of S can be obtained by integrating Eq. (57) over ya and yb

$$\begin{array}{rcl}\phi ({x}_{b},{x}_{a}) & = & \int {\int }_{-\infty }^{\infty }R({x}_{b},{x}_{a};{y}_{b},{y}_{a})d{y}_{a}d{y}_{b}\\ & = & \{{\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{S}(x(t))}{\mathscr{D}}x(t)\}\\ & & \times \,\{\int {\int }_{-\infty }^{\infty }{\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{A}(y(t))}{\mathscr{D}}y(t)d{y}_{a}d{y}_{b}\}\\ & = & c{\int }_{a}^{b}{e}^{(i/\hslash ){S}_{p}^{S}(x(t))}{\mathscr{D}}x(t)\end{array}$$
(58)

where the constant c is the integration result of the second term in step two. The result is the same as Eq. (53) up to an unimportant constant.

Next, we consider the situation in which there is entanglement between S and A as a result of interaction. The third term in Eq. (56) does not vanish. We can no longer define a wave function for S. Instead, a reduced density matrix should be used to describe the state of the particle, \(\rho =R{R}^{\dagger }\). From Eq. (55), the element of the reduced density matrix is

$$\begin{array}{rcl}\rho ({x}_{b},{x^{\prime} }_{b};{x}_{a},{x^{\prime} }_{a}) & = & \sum _{{y}_{a},{y}_{b}}{\int }_{{x}_{a}}^{{x}_{b}}{\int }_{{x^{\prime} }_{a}}^{{x^{\prime} }_{b}}{\int }_{{y}_{a}}^{{y}_{b}}{\int }_{{y}_{a}}^{{y}_{b}}{e}^{(i/\hslash ){\rm{\Delta }}S}\\ & & \times \,{\mathscr{D}}x(t){\mathscr{D}}x^{\prime} (t){\mathscr{D}}y(t){\mathscr{D}}y^{\prime} (t)\\ {\rm{where}}\,{\rm{\Delta }}S & = & {S}_{p}^{S}(x(t))-{S}_{p}^{S}(x^{\prime} (t))\\ & & +\,{S}_{p}^{A}(y(t))-{S}_{p}^{A}(y^{\prime} (t))\\ & & +\,{S}_{int}^{SA}(x(t),y(t))\\ & & -\,{S}_{int}^{SA}(x^{\prime} (t),y^{\prime} (t\mathrm{)).}\end{array}$$
(59)

The path integral over \({\mathscr{D}}y^{\prime} (t)\) takes the same end points ya and yb as the path integral over \({\mathscr{D}}y(t)\). After the path integral, a summation over ya and yb is performed. Eq. (59) is equivalent to the J function introduced in ref.29. We can rewrite the expression of ρ using the influence functional, F(x(t), x′(t)),

$$\begin{array}{rcl}{\mathscr{\varrho }}({x}_{b},{x^{\prime} }_{b};{x}_{a},{x^{\prime} }_{a}) & = & \frac{1}{Z}{\int }_{{x}_{a}}^{{x}_{b}}{\int }_{{x^{\prime} }_{a}}^{{x^{\prime} }_{b}}{e}^{(i/\hslash )[{S}_{p}^{S}(x(t))-{S}_{p}^{S}(x^{\prime} (t))]}\\ & & \times \,F(x(t),x^{\prime} (t)){\mathscr{D}}x(t){\mathscr{D}}x^{\prime} (t)\\ F(x(t),x^{\prime} (t)) & = & \sum _{{y}_{a},{y}_{b}}{\int }_{{y}_{a}}^{{y}_{b}}{\int }_{{y}_{a}}^{{y}_{b}}{e}^{(i/\hslash ){\rm{\Delta }}S^{\prime} }\\ & & \times \,{\mathscr{D}}y(t){\mathscr{D}}y^{\prime} (t)\\ {\rm{where}}\,{\rm{\Delta }}S^{\prime} & = & {S}_{p}^{A}(y(t))-{S}_{p}^{A}(y^{\prime} (t))\\ & & +\,{S}_{int}^{SA}(x(t),y(t))\\ & & -\,{S}_{int}^{SA}(x^{\prime} (t),y^{\prime} (t\mathrm{)).}\end{array}$$
(60)

where Z = Tr(ρ) is a normalization factor that ensures the normalized matrix in Eq. (60) has unit trace. The reduced density matrix allows us to calculate the probability of the system changing from one state to another, for instance, the probability of the system initially in a state χ(xa) transitioning to another state ψ(xb). This is similar to calculating the probability of an ideal measurement that specifies the initial state as χ(xa) and the final state as ψ(xb). Defining a projection operator \(\hat{P}=|\chi ({x}_{a})\psi ({x}_{b})\rangle \langle \chi ({x}_{a})\psi ({x}_{b})|\), the probability is calculated, similar to Eq. (23), as

$$\begin{array}{rcl}p(\chi ,\psi ) & = & Tr(\rho \hat{P})\\ & = & \int \int \int \int {\psi }^{\ast }({x^{\prime} }_{b})\psi ({x}_{b})\rho ({x}_{b},{x^{\prime} }_{b};{x}_{a},{x^{\prime} }_{a})\\ & & \times \,\chi ({x}_{a}){\chi }^{\ast }({x^{\prime} }_{a})d{x}_{a}d{x}_{b}d{x^{\prime} }_{a}d{x^{\prime} }_{b}\end{array}$$
(61)

This is equivalent to the result in ref.30. To find the probability of the particle moving from a particular position \({\bar{x}}_{a}\) at time ta to another particular position \({\bar{x}}_{b}\) at time tb, we substitute \(\chi ({x}_{a})=\delta ({x}_{a}-{\bar{x}}_{a})\) and \(\psi ({x}_{b})=\delta ({x}_{b}-{\bar{x}}_{b})\) into Eq. (61),

$$\begin{array}{rcl}p({\bar{x}}_{b},{\bar{x}}_{a}) & = & \int \int \int \int \rho ({x}_{b},{x^{\prime} }_{b};{x}_{a},{x^{\prime} }_{a})\delta ({x}_{b}-{\bar{x}}_{b})\\ & & \times \,\delta ({x^{\prime} }_{b}-{\bar{x}}_{b})\delta ({x}_{a}-{\bar{x}}_{a})\delta ({x^{\prime} }_{a}-{\bar{x}}_{a})\\ & & \times \,d{x}_{b}d{x^{\prime} }_{b}d{x}_{a}d{x^{\prime} }_{a}\\ & = & \rho ({\bar{x}}_{b},{\bar{x}}_{b};{\bar{x}}_{a},{\bar{x}}_{a}).\end{array}$$
(62)

In summary, we show that the relational probability amplitude introduced in Postulate 1 can be explicitly calculated through Eq. (55). With this definition and the results in earlier sections, we obtain the formulations for the wave function in Eq. (58) and the probability in Eq. (61) that are consistent with those in the traditional path integral formulation. The reduced density expression in Eq. (59), although equivalent to the J function in ref.28, has richer physical meaning. For instance, we can calculate the entanglement measure from the reduced density matrix.