Introduction

Thermalization, or relaxation to equilibrium, in isolated quantum many-body systems is a ubiquitous yet profound phenomenon. The investigation of thermalization dates back to Boltzmann1 and von Neumann2, and many theoretical physicists have studied this problem. The problem originated in the field of nonequilibrium statistical mechanics. However, some techniques developed in quantum information theory have attracted attention for providing fresh insight into this old problem3. On the experimental side, the recent development of techniques to manipulate cold atoms has enabled us to observe thermalization of isolated quantum many-body systems in the laboratory4,5,6,7,8,9. Experimentalists have not only tested established theoretical results, but also revealed some unexpected behaviours9.

A central problem in this field is whether a given system thermalizes3,10. Although almost all natural quantum many-body systems are expected to thermalize, some systems, including integrable and localized systems, are known never to thermalize11,12,13,14,15. To address this problem, the eigenstate thermalization hypothesis (ETH) has been proposed as a clue to understanding thermalization phenomena. The ETH claims that all the energy eigenstates of a given Hamiltonian are thermal, that is, indistinguishable from the equilibrium state, as long as we observe macroscopic observables16,17,18,19,20,21. Numerical simulations suggest that most non-integrable thermalizing systems satisfy the ETH20,22,23,24. In contrast, recent theoretical studies and elaborate experiments have revealed that some non-integrable systems do not satisfy the ETH9,25,26,27,28,29,30,31. Numerous other theoretical ideas, including largeness of effective dimension10, typicality10,32,33,34, and quantum correlation35,36,37 have been proposed to elucidate thermalization phenomena; however, none of them provides a decisive answer.

We approach the problem of thermalization from the opposite side. We examine the difficulty of the problem from the viewpoint of theoretical computer science. This type of approach is employed in some problems in physics, including prediction of dynamical systems38, repeated quantum measurements39, and the spectral gap problem40. In this approach, these problems were unexpectedly shown to be undecidable, that is, there is no algorithm to determine, e.g., the presence or absence of a spectral gap in arbitrary systems in the case of the spectral gap problem.

Our main achievement in this paper is the finding that whether a given system thermalizes or not with respect to a given observable is undecidable in general. This result shows not merely the difficulty of this problem, but also the logical impossibility of solving it. Hence, the fate of thermalization in a general setup is independent of the basic axioms of mathematics, as implied by Gödel’s incompleteness theorem41. We prove this by demonstrating that the relaxation and thermalization phenomena in one-dimensional systems have the power of universal computation. Our result not only sets a limit on what we can know about quantum thermalization, but also elucidates a rich variety of thermalization phenomena, which can implement any computational task.

Results

Statement of main results

We first clarify the precise statements of our results, namely, the undecidability of relaxation and thermalization. Since the undecidability of thermalization can be obtained by modifying the result on relaxation, we shall mainly treat relaxation and briefly comment on how to extend this result to thermalization. Throughout this study, we consider a one-dimensional lattice system of size L with the periodic boundary condition (we finally take the L → ∞ limit), with a d-dimensional local Hilbert space \(\mathcal{H}\). Although we do not specify the necessary dimension, we roughly estimate that a local dimension d of about 120 suffices to obtain undecidability, which is minuscule compared to other results of undecidability in physics40. Let \(\left|\psi (t)\right\rangle\) be the state of the system at time t. The long-time average of an observable \(\mathcal{A}_{L}\) for a given initial state \(\left|\psi (0)\right\rangle =\left|\psi_{0}^{L}\right\rangle\) is given by \(\bar{\mathcal{A}}_{L}=\lim_{T\to \infty }\frac{1}{T}\int_{0}^{T}dt\,\langle \psi (t)| \mathcal{A}_{L}| \psi (t)\rangle\). We ask whether the thermodynamic limit of the long-time average \(\bar{\mathcal{A}}_{L}\), denoted by \(\bar{\mathcal{A}}:= \lim_{L\to \infty }\bar{\mathcal{A}}_{L}\), lies in the vicinity of a given target value A*. This question concerns the fate of a relaxation process with an initial state, an observable, and a Hamiltonian. If A* is equal to the equilibrium value \(\mathcal{A}^{\rm MC}:= \lim_{L\to \infty }{\rm Tr}[\mathcal{A}_{L}\rho_{L}^{\rm MC}]\) with the microcanonical state \(\rho_{L}^{\rm MC}\), this question asks whether thermalization with respect to \(\mathcal{A}\) takes place.
We remark that we take the long-time limit (T → ∞) first, and then take the thermodynamic limit (L → ∞). The symbol \(\mathcal{A}\) denotes the thermodynamic limit of \(\mathcal{A}_{L}\), and the limits are always taken in the aforementioned order.

We restrict the class of Hamiltonians, observables, and initial states to simple ones. The Hamiltonian of the system is restricted to be a shift-invariant nearest-neighbour interaction. Hence, the \(d^{2}\times d^{2}\) local Hamiltonian \(h_{i,i+1}\), which acts only on sites i and i + 1, fully determines the system Hamiltonian as \(H=\sum_{i}h_{i,i+1}\). We further restrict observables to a spatial average of a single-site operator: \(\mathcal{A}_{L}:= \frac{1}{L}\sum_{i = 1}^{L}A_{i}\), where \(A_{i}\) acts only on the site i. In addition, we restrict the initial state to the following form of a product state: \(\left|\psi_{0}^{L}\right\rangle =\left|\phi_{0}\right\rangle \otimes \left|\phi_{1}\right\rangle \otimes \left|\phi_{1}\right\rangle \otimes \cdots \otimes \left|\phi_{1}\right\rangle\), where \(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\) are single-site states orthogonal to each other: \(\langle \phi_{0}|\phi_{1}\rangle = 0\).

In our setup, both the observable (A) and the initial state (\(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\)) are given arbitrarily and fixed. Moreover, we impose the promise that either \(\left|\bar{\mathcal{A}}-A^{*}\right| < \varepsilon_{1}\) or \(\left|\bar{\mathcal{A}}-A^{*}\right| > \varepsilon_{2}\) holds with errors 0 < ε1 < ε2. Equivalently, we are allowed to answer incorrectly when \(\varepsilon_{1}\le \left|\bar{\mathcal{A}}-A^{*}\right|\le \varepsilon_{2}\). The ratio of errors M := ε2/ε1 can be set arbitrarily large. The input of this decision problem is the Hamiltonian. Even in this very simple setup, we show that it is undecidable whether the long-time average of \(\mathcal{A}\) from this initial state \(\left|\psi_{0}^{L}\right\rangle\) under a given Hamiltonian H relaxes to the vicinity of a given value A*.
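The promise can be made concrete with a short sketch. The function below is purely illustrative (its name and signature are assumptions, not part of the original work): it takes the long-time average as a given number and only encodes the three-way case distinction of the promise. The undecidability concerns obtaining that number from the Hamiltonian, which this sketch does not attempt.

```python
from typing import Optional


def decide_relaxation(a_bar: float, a_star: float,
                      eps1: float, eps2: float) -> Optional[bool]:
    """Encode the promise problem for a *given* long-time average a_bar.

    Returns True  if |a_bar - a_star| < eps1 (relaxes to the vicinity of a_star),
            False if |a_bar - a_star| > eps2 (stays away from a_star),
            None  if the promise is violated, in which case any answer is accepted.
    """
    if not 0 < eps1 < eps2:
        raise ValueError("the promise requires 0 < eps1 < eps2")
    gap = abs(a_bar - a_star)
    if gap < eps1:
        return True
    if gap > eps2:
        return False
    return None
```

For instance, with ε1 = 0.05 and ε2 = 0.5 (so M = 10), a long-time average of 0.2 around A* = 0 falls in the window where no answer is required.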

Theorem 1

Let two states \(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\) on a single site, orthogonal to each other, and a single-site operator A be given arbitrarily. We require that there exists a state \(\left|\phi_{2}\right\rangle\) orthogonal to \(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\) such that \(\langle \phi_{2}|A|\phi_{2}\rangle \ne \langle \phi_{1}|A|\phi_{1}\rangle\). The initial state and the observable are set as \(\left|\psi_{0}^{L}\right\rangle =\left|\phi_{0}\right\rangle \otimes \left|\phi_{1}\right\rangle \otimes \cdots \otimes \left|\phi_{1}\right\rangle\) and \(\mathcal{A}_{L}:= \frac{1}{L}\sum_{i = 1}^{L}A_{i}\). Here, the long-time average \(\bar{\mathcal{A}}\) is a function of the Hamiltonian H. We also fix M > 0 arbitrarily large.

Then, there exist ε1, ε2 with ε2 = Mε1, and A* which satisfy the following: we suppose the promise that either \(\left|\bar{\mathcal{A}}-A^{*}\right| < \varepsilon_{1}\) or \(\left|\bar{\mathcal{A}}-A^{*}\right| > \varepsilon_{2}\) holds (see Fig. 1). In this setting, deciding which is true for a given shift-invariant nearest-neighbour interaction Hamiltonian \(H=\sum_{i}h_{i,i+1}\) is undecidable.

Fig. 1: The problem of thermalization concerns the long-time average of the observable.
(a) We consider whether a nonequilibrium initial state relaxes to the equilibrium or not. (b) More precisely, we decide whether the long-time average of \(\left\langle \psi (t)| \mathcal{A}_{L}| \psi (t)\right\rangle\) converges to the value A* with precision ε1, or deviates from A* by at least ε2 > ε1, in the thermodynamic limit (if the long-time average settles between ε1 and ε2, we do not have to answer). This problem is shown to be undecidable.

If A* is equal to the equilibrium value \(\mathcal{A}^{\rm MC}\), our result reads as the undecidability of thermalization: Whether a given system with a fixed initial state thermalizes or not with respect to a fixed observable A is undecidable. By defining \(A_{\max }^{01}:= \max_{\left|\psi \right\rangle \in {\rm span}\{\left|\phi_{0}\right\rangle ,\left|\phi_{1}\right\rangle \}}\langle \psi | A| \psi \rangle\) and \(A_{\min }^{01}:= \min_{\left|\psi \right\rangle \in {\rm span}\{\left|\phi_{0}\right\rangle ,\left|\phi_{1}\right\rangle \}}\langle \psi | A| \psi \rangle\), the precise statement can be expressed as follows:

Theorem 2

Let two states \(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\) on a single site, orthogonal to each other, and a single-site operator A be given arbitrarily. We require that there exist states \(\left|\phi_{2}\right\rangle\) and \(\left|\phi_{3}\right\rangle\) orthogonal to \(\left|\phi_{0}\right\rangle\), \(\left|\phi_{1}\right\rangle\), \(A\left|\phi_{0}\right\rangle\), and \(A\left|\phi_{1}\right\rangle\) such that \(\langle \phi_{2}| A| \phi_{2}\rangle > A_{\max }^{01}\) and \(\langle \phi_{3}| A| \phi_{3}\rangle < A_{\min }^{01}\). The initial state and the observable are set as \(\left|\psi_{0}^{L}\right\rangle =\left|\phi_{0}\right\rangle \otimes \left|\phi_{1}\right\rangle \otimes \cdots \otimes \left|\phi_{1}\right\rangle\) and \(\mathcal{A}_{L}:= \frac{1}{L}\sum_{i = 1}^{L}A_{i}\). We also fix M > 0 arbitrarily large.

Then, there exist ε1, ε2 with ε2 = Mε1, which satisfy the following: We suppose the promise that either \(\left|\bar{\mathcal{A}}-\mathcal{A}^{\rm MC}\right| < \varepsilon_{1}\) or \(\left|\bar{\mathcal{A}}-\mathcal{A}^{\rm MC}\right| > \varepsilon_{2}\) holds. In this setting, deciding which is true for a given shift-invariant nearest-neighbour interaction Hamiltonian \(H=\sum_{i}h_{i,i+1}\) is undecidable.

The condition on the presence of \(\left|{\phi }_{2}\right\rangle\) and \(\left|{\phi }_{3}\right\rangle\) ensures that the initial state is not at the edge of the spectrum of A. We note that the equilibrium value \({{{{{{{{\mathcal{A}}}}}}}}}^{{{{{{{{\rm{MC}}}}}}}}}\) depends on the choice of the Hamiltonian, and thus the promise restricts the class of Hamiltonians.

Mapping classical Turing machines to a quantum system

We here sketch the main idea of the proof. A rigorous proof is presented in the Supplementary Note. We first introduce a key ingredient, the halting problem of a Turing machine (TM), which is a prominent example of undecidable problems. The halting problem of a TM asks whether the TM with a given input halts at some time or does not halt and runs forever. Turing proved in his celebrated paper that there exists no general procedure to solve the halting problem42.

Following various studies demonstrating undecidability43, we use a reduction from the halting problem. We shall construct a family of Hamiltonians for which the long-time average of an observable is tied to the halting or non-halting of a TM. Below, a universal reversible Turing machine (URTM) is arbitrarily given and fixed, and its possible input codes are denoted by u.

Lemma:

Let a complete orthonormal basis \(\{\left|e_{i}\right\rangle \}\) of the local Hilbert space and an observable A on a single site satisfying \(\langle e_{1}|A|e_{1}\rangle = 0\) and \(\langle e_{2}|A|e_{2}\rangle > 0\) be given arbitrarily. Then, for any η > 0, there exist a shift-invariant nearest-neighbour interaction Hamiltonian H and a set of unitary operators \(\{V_{\boldsymbol{u}}\}\) on the local Hilbert space \(\mathcal{H}\), one for each possible input u of the fixed URTM, such that they satisfy \(V_{\boldsymbol{u}}\left|e_{0}\right\rangle =\left|e_{0}\right\rangle\) for any u and the following property:

Set the initial state as

$$\left|{\psi }_{0}^{L}\right\rangle =({V}_{{{{{{{{\boldsymbol{u}}}}}}}}}\left|{e}_{0}\right\rangle )\otimes {\left({V}_{{{{{{{{\boldsymbol{u}}}}}}}}}\left|{e}_{1}\right\rangle \right)}^{\otimes L-1}.$$
(1)

If the URTM halts with the input u, then

$$\bar{{{{{{{{\mathcal{A}}}}}}}}}\ge \left(\frac{1}{4}-\eta \right)\langle {e}_{2}| A| {e}_{2}\rangle$$
(2)

holds, and if the URTM does not halt with the input u, then

$$\bar{{{{{{{{\mathcal{A}}}}}}}}}\le \eta$$
(3)

holds.

By setting the initial state, the observable, and the Hamiltonian in Theorem 1 as \(\left|e_{0}\right\rangle \otimes (\left|e_{1}\right\rangle )^{\otimes L-1}\), \(V_{\boldsymbol{u}}^{\dagger}AV_{\boldsymbol{u}}\), and \((V_{\boldsymbol{u}}^{\dagger})^{\otimes L}HV_{\boldsymbol{u}}^{\otimes L}\), respectively, the degree of freedom in the choice of unitary transformation is mapped onto that of the local Hamiltonian. Then, the setup of the Lemma can be mapped onto that of Theorem 1 by shifting the origin of A so that \(\langle \phi_{1}|A|\phi_{1}\rangle = 0\), and setting \(\left|\phi_{i}\right\rangle\) (i = 0, 1, 2) to \(\left|e_{i}\right\rangle\). Because the halting problem of the URTM is undecidable, the above lemma directly implies the undecidability of the long-time average in quantum many-body systems.

To prove the lemma, we first introduce an elaborate classical machine that simulates the given URTM and changes the value of A depending on whether the URTM halts. We then construct a quantum many-body system emulating the above classical machine. Since the dynamics of the quantum system is a superposition of classical machines with different inputs, we first compute the long-time average for computational basis initial states, each of which corresponds to a single input, and then treat the quantum superposition.

Classical machines

Here, we outline the construction of a classical TM, which simulates the halting problem of a given URTM and changes the long-time average of the observable A depending on the behaviour of the URTM. This machine consists of three TMs, TM1, TM2, and TM3, on two types of cells, M-cells and A-cells. Unlike conventional TMs, the finite control settles in the line of cells. TM2 simulates the URTM with the input code u, whose reversibility is induced by the unique direction property44. TM1 decodes the input code u from a sequence of two qubits. Two TMs, TM1 and TM2, work in M-cells. TM3 is a simple TM, which flips the state of A-cells if and only if TM2 halts. Through the above trick, the long-time average \(\bar{{{{{{{{\mathcal{A}}}}}}}}}\) in our system reflects the result of the halting problem of TM2.

An M-cell consists of three layers: The first layer simulates the URTM, and the second and the third layers, both of which consist of sequences of qubits, store the input code of TM2. The relative frequency of 1 in the second layer is set to β whose binary expansion is equal to the input code u. TM1 decodes a bit sequence u on the first layer from the second and the third layers by estimating the relative frequency of 1 (see the first part of Fig. 2), and then TM2 runs with this input u. Throughout this procedure, the machine passes all A-cells transparently.

Fig. 2: Roles of three layers in M-cells and schematic of dynamics of two Turing machines.
[Top]: In the first part, a Turing machine, TM1, decodes a bit sequence u on the first layer through the estimation of the number of \(\left|1\right\rangle\) in the second layer (step (i)). The relative frequency of 1 in the second layer is set to β, whose binary expansion is equal to u. The number of qubits TM1 should read is determined by the leftmost cell with 1 in the third layer. Here, we draw only M-cells and omit A-cells for visibility. [Bottom]: In the second part, a universal reversible Turing machine, TM2, runs with the input u (step (ii)). If TM2 halts, then TM3 starts to flip the state in the A-cells from a1 to a2 (step (iii)). If TM2 does not halt, the states in A-cells are not flipped. Note that we have not drawn the second and third layers of M-cells for visibility. In these figures, qj, qu, and r are examples of internal states of TM1, TM2, and TM3, respectively.

A-cells are responsible for changing the long-time average of \({{{{{{{{\mathcal{A}}}}}}}}}_{L}\). At the initial state, all A-cells are set to the state a1, whose expectation value of A is zero. If and only if TM2 halts, TM3 starts flipping states of A-cells from a1 to another state a2, whose expectation value of A is a nonzero value. To inflate the difference between the halting and non-halting cases, we set the initial state such that most of the cells are A-cells.

The procedure is summarized as follows:

(i) TM1 decodes the input code u on the first layer.

(ii) TM2, a URTM, runs with the input u in the first layer.

(iii) If and only if TM2 halts, then TM3 starts flipping the states in A-cells (see the second part of Fig. 2). This induces a visible difference between the long-time average of \(\mathcal{A}_{L}\) in the case of halting and non-halting.

(iv) In the case of halting, the head returns to the cell where TM2 halts due to the periodic boundary condition. By this time, all A-cells have already been flipped, and TM3 stops.

Because the halting problem of TM2 with an arbitrary input u is undecidable, the long-time average of \({{{{{{{{\mathcal{A}}}}}}}}}_{L}\) with an arbitrary local unitary transformation Vu is likewise undecidable.

Hamiltonian construction and its eigenstates

Our implementation of the classical TM in quantum systems stems from the construction of the Feynman-Kitaev Hamiltonian45,46, while we delete the clock part. Each site takes one of the states of the finite control or that of a single cell in the tape, or some additional symbols. If the site i is a state of the finite control, then the head reads the site i + 1 or i − 1 (see Fig. 3). In the initial state, we set the finite control at site 1, and set all other sites not to the states of the finite control. Because the dynamics conserve the number of sites of the finite control, only a single site takes the state of the finite control at all times.

Fig. 3: Evolution of the quantum state of the total system.
We draw a possible hopping between quantum states in the computational basis. Although here, we depict only the first and second layers for visibility, the quantum state actually consists of three layers. Similar to the Feynman-Kitaev Hamiltonian case, the total Hamiltonian induces the forward and backward one-step time evolution of the TM. We denote the state of the classical TM at the nth step by xn. The symbols a1, 0 and 1 represent the states of cells, and qu and q2 represent the internal states of the TM.

The dynamics of the TM are encoded in the local Hamiltonian as follows. Suppose, e.g., that the cell at the head is sa, the state of the finite control is qb, and the TM moves to the right while keeping the state of the finite control and the cell unchanged. Then, the local Hamiltonian \(h_{i,i+1}\) must contain the term

$$\left|{s}_{a},{q}_{b}\right\rangle \left\langle {q}_{b},{s}_{a}\right|+{{{{{{{\rm{c}}}}}}}}.{{{{{{{\rm{c}}}}}}}}.\ .$$
(4)

Similarly, we add all transition rules of the TMs (TM1, TM2, and TM3) to the local Hamiltonian in the form of (4). Owing to the deterministic property of TMs, each legal state of the total system has a unique descendant state.
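As a rough illustration of how a term of the form (4) acts on computational basis states, the following sketch applies the single rule "move right, keeping the control state and the cell" to a classical configuration. The function names, the tuple representation of a configuration, and the symbol strings are illustrative assumptions; the actual construction contains many more transition rules.

```python
def step_right(config, control_states):
    """Apply the rule |s_a, q_b><q_b, s_a| once.

    The finite control (a site whose symbol is in control_states) swaps places
    with the cell on its right, so the head moves one site to the right.
    """
    i = next(k for k, s in enumerate(config) if s in control_states)
    j = (i + 1) % len(config)  # periodic boundary condition
    new = list(config)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


def orbit(config, control_states, steps):
    """Return the sequence of configurations x^1, x^2, ... generated by the rule."""
    states = [tuple(config)]
    for _ in range(steps):
        states.append(step_right(states[-1], control_states))
    return states


# Toy usage: one control site q_b moving through three tape cells.
for x in orbit(("q_b", "0", "1", "0"), {"q_b"}, 3):
    print(x)
```

Each configuration has exactly one successor under this rule, and the number of control sites is conserved, which is the classical counterpart of the statements made above about legal states.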

Because the treatment of almost uniform initial states is slightly complicated, we first consider an analogous and easier setting. Our original setting is discussed in the next section. We set the initial state as a non-uniform computational basis state, such that the dynamics of the TMs is uniquely determined without quantum fluctuation. Let \(\left|\boldsymbol{x}^{1}\right\rangle\) denote the initial configuration of the total system, and \(\left|\boldsymbol{x}^{n}\right\rangle\) be the n-th state (i.e., after n − 1 steps from \(\left|\boldsymbol{x}^{1}\right\rangle\)). By restricting the Hilbert space to the subspace spanned by \(\{\left|\boldsymbol{x}^{n}\right\rangle \}\), the total Hamiltonian is expressed as (see also Fig. 3)

$$H^{\prime} =\mathop{\sum }\limits_{n=1}^{J-1}\left|{{{{{{{{\boldsymbol{x}}}}}}}}}^{n+1}\right\rangle \left\langle {{{{{{{{\boldsymbol{x}}}}}}}}}^{n}\right|+{{{{{{{\rm{c}}}}}}}}.{{{{{{{\rm{c}}}}}}}}.,$$
(5)

where the J-th state is the final state of this dynamics. This Hamiltonian takes the same form as that of a single particle hopping on a one-dimensional chain of J sites with open ends. Using the known result for tridiagonal matrices, the eigenenergies and energy eigenstates are calculated as

$${E}_{j}=2\cos \left(\frac{j\pi }{J+1}\right),$$
(6)
$$\left|E_{j}\right\rangle =\sqrt{\frac{2}{J+1}}\sum_{k=1}^{J}\sin \left(\frac{kj\pi }{J+1}\right)\left|\boldsymbol{x}^{k}\right\rangle ,$$
(7)

with j = 1, 2, …, J.
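The eigenpairs of the hopping Hamiltonian (5) can be checked numerically. The sketch below is a minimal check, not part of the original proof: it builds the vector with components \(\sqrt{2/(J+1)}\sin (kj\pi /(J+1))\), using the same denominator J + 1 as in Eq. (6), applies the open-chain hopping matrix, and reports the largest deviation from the eigenvalue equation. The function names are illustrative.

```python
import math


def chain_eigenpair(J, j):
    """Candidate eigenvalue and eigenvector (components k = 1..J) of the open chain."""
    theta = j * math.pi / (J + 1)
    energy = 2 * math.cos(theta)
    norm = math.sqrt(2 / (J + 1))
    vector = [norm * math.sin(k * theta) for k in range(1, J + 1)]
    return energy, vector


def apply_hopping(vector):
    """Apply H' = sum_n |x^{n+1}><x^n| + c.c. to a list of amplitudes."""
    J = len(vector)
    out = [0.0] * J
    for k in range(J):
        if k > 0:
            out[k] += vector[k - 1]
        if k < J - 1:
            out[k] += vector[k + 1]
    return out


def residual(J, j):
    """Largest component of H'|E_j> - E_j|E_j>; close to zero for a true eigenpair."""
    energy, v = chain_eigenpair(J, j)
    hv = apply_hopping(v)
    return max(abs(a - energy * b) for a, b in zip(hv, v))
```

The residual stays at the level of floating-point rounding for every j = 1, …, J, and each vector has unit norm.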

By expanding the initial state as \(\left|\boldsymbol{x}^{1}\right\rangle =\sum_{j = 1}^{J}c_{j}|E_{j}\rangle\), the long-time average of \(\mathcal{A}_{L}\) reads \(\bar{\mathcal{A}}_{L}=\sum_{j = 1}^{J}|c_{j}|^{2}\langle E_{j}| \mathcal{A}_{L}| E_{j}\rangle\), because the eigenenergies are non-degenerate and all the off-diagonal elements therefore vanish in the long-time average. Since the number of steps until TM2 halts is independent of the system size L, by setting L to be sufficiently large, we can make the flipping of A-cells start before J/2 steps. Under this condition, half of the A-cells have been flipped before 3J/4 steps, which ensures a nonzero value of \(\bar{\mathcal{A}}_{L}\) in the case of halting. In contrast, in the case of non-halting, the flipping by TM3 does not occur, and hence the long-time average \(\bar{\mathcal{A}}_{L}\) is kept close to zero.
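The halting and non-halting cases can be compared in a toy calculation. The sketch below evaluates the diagonal-ensemble formula above for the open hopping chain, under two simplifying assumptions that are not taken from the original text: the observable is diagonal in the configurations \(\left|\boldsymbol{x}^{k}\right\rangle\) with value a_k, and in the halting case a_k is zero up to step J/2 and then grows linearly to 1 (in units of \(\langle e_{2}|A|e_{2}\rangle\)). The function names are illustrative.

```python
import math


def long_time_average(a_values):
    """Diagonal-ensemble average sum_j |c_j|^2 <E_j|A|E_j> starting from |x^1>.

    a_values[k-1] is the (assumed diagonal) value of the observable in x^k.
    """
    J = len(a_values)
    total = 0.0
    for j in range(1, J + 1):
        theta = j * math.pi / (J + 1)
        # |<x^k|E_j>|^2 for k = 1..J
        weights = [2 / (J + 1) * math.sin(k * theta) ** 2 for k in range(1, J + 1)]
        c_sq = weights[0]  # |c_j|^2 = |<E_j|x^1>|^2
        total += c_sq * sum(w * a for w, a in zip(weights, a_values))
    return total


def halting_profile(J):
    """Toy profile: no flipped A-cells until step J/2, then a linear ramp up to 1."""
    half = J // 2
    return [0.0 if k <= half else (k - half) / (J - half) for k in range(1, J + 1)]


print(long_time_average(halting_profile(200)))  # halting: roughly one quarter
print(long_time_average([0.0] * 200))           # non-halting: zero
```

In this toy profile the halting case gives a value close to 1/4, while the non-halting case gives exactly zero, in line with the bounds (2) and (3).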

Uniform initial state

We now describe the decoding process from the second and third layers of M-cells in our original setting, almost uniform initial states. The sites in the second and third layers are set to \(\sqrt{\beta }\left|1\right\rangle +\sqrt{1-\beta }\left|0\right\rangle\) and \(\sqrt{\gamma }\left|1\right\rangle +\sqrt{1-\gamma }\left|0\right\rangle\), respectively. The state on m M-cells is a superposition of \(2^{m}\times 2^{m}\) computational basis states. TM1 runs on each computational basis state, and thus the dynamics of TMs is also a superposition of \(2^{m}\times 2^{m}\) branches.

The quantity β stores the input code in the form such that the binary expansion of β equals the input code u. TM1 calculates β by estimating the relative frequency of the state \(\left|1\right\rangle\) in the second layer. Due to the law of large numbers, the set of computational basis states such that the relative frequency of \(\left|1\right\rangle\) is not close to β has negligibly small probability amplitude. The quantity γ (more precisely, \(-1/{{{{{{\mathrm{ln}}}}}}}\,\gamma\)) characterizes the length of qubits that TM1 must read in. TM1 reads the qubits in the second layer until it first encounters \(\left|1\right\rangle\) in the third layer (the first part of Fig. 2). By setting γ to be sufficiently small, the probability of two unwanted cases, namely, (a) TM1 stops decoding before u is decoded to the last, and (b) TM1 can access only an insufficiently small number of qubits in the second layer and fails to estimate the correct β, becomes negligible.
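Both probabilities in this argument can be evaluated directly for small examples. The sketch below is a minimal illustration under the assumption that measuring each second-layer qubit yields 1 with probability β and each third-layer qubit yields 1 with probability γ, independently, as the product states above suggest. The function names and the tolerance δ are illustrative choices.

```python
import math


def prob_frequency_far(m, beta, delta):
    """Probability that the fraction of 1s among m qubits differs from beta by more than delta.

    Uses the binomial distribution, evaluated in log space to avoid overflow.
    """
    total = 0.0
    for k in range(m + 1):
        if abs(k / m - beta) > delta:
            log_p = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
                     + k * math.log(beta) + (m - k) * math.log(1 - beta))
            total += math.exp(log_p)
    return total


def prob_stops_within(n, gamma):
    """Probability that the first 1 in the third layer lies within the first n cells."""
    return 1.0 - (1.0 - gamma) ** n


# beta = 0.3125 has the binary expansion 0.0101
print(prob_frequency_far(20, 0.3125, 0.05))    # short read: estimate often far from beta
print(prob_frequency_far(2000, 0.3125, 0.05))  # long read: deviation is very unlikely
print(prob_stops_within(100, 0.001))           # small gamma: early stop is unlikely
```

Decreasing γ lowers the chance that TM1 stops after reading only a few qubits, and reading more qubits sharpens the estimate of β, which is how both unwanted cases are suppressed.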

From relaxation to thermalization

We shall sketch how Theorem 2, the undecidability of thermalization, is derived from the proof techniques of Theorem 1. A careful calculation with a slightly modified version of TM3 implies that the long-time average \(\bar{\mathcal{A}}\) when TM2 halts approaches \(\langle e_{2}|A|e_{2}\rangle\). Since the basis \(\{\left|e_{i}\right\rangle \}\) with i ≥ 2 can be set arbitrarily in the proof of Theorem 1, it suffices to show the presence of an orthonormal basis \(\{\left|e_{i}\right\rangle \}\) such that \(\langle e_{2}| A| e_{2}\rangle =\mathcal{A}^{\rm MC}\). We remark that the Hamiltonian depends on the basis \(\{\left|e_{i}\right\rangle \}\), and thus \(\mathcal{A}^{\rm MC}\) also depends on it through the Hamiltonian.

Since \(\left|\phi_{0}\right\rangle\) and \(\left|\phi_{1}\right\rangle\) are not at the edge of the spectrum of A, the equilibrium value \(\mathcal{A}^{\rm MC}\) always lies between the maximum and the minimum expectation values of A in the subspace orthogonal to \(\left|\phi_{0}\right\rangle\), \(\left|\phi_{1}\right\rangle\), \(A\left|\phi_{0}\right\rangle\), and \(A\left|\phi_{1}\right\rangle\). Let \(\left|\sigma_{\max }\right\rangle\) and \(\left|\sigma_{\min }\right\rangle\) be states in this subspace attaining the maximum and minimum expectation values of A. We set \(\left|e_{2}(p)\right\rangle := \sqrt{p}\left|\sigma_{\max }\right\rangle +\sqrt{1-p}\left|\sigma_{\min }\right\rangle\) and change p from p = 0 to p = 1. Recalling \(\langle e_{2}(0)| A| e_{2}(0)\rangle \le \mathcal{A}^{\rm MC}\le \langle e_{2}(1)| A| e_{2}(1)\rangle\) and the continuity of \(\langle e_{2}(p)|A|e_{2}(p)\rangle\), we find that there exists a proper p (a proper \(\left|e_{2}\right\rangle\)) which realizes \(\langle e_{2}(p)| A| e_{2}(p)\rangle =\mathcal{A}^{\rm MC}\). Using this Hamiltonian with this basis \(\{\left|e_{i}\right\rangle \}\), we arrive at the undecidability of thermalization by following the same argument as for relaxation.

We remark on two points. First, the tuning of \(\left|e_{2}\right\rangle\) can be accomplished through the choice of the local Hamiltonian, and both the observable and the initial state are kept as arbitrary fixed parameters. Second, since a finite error from the equilibrium value is allowed, we can compute a proper p (i.e., a proper local Hamiltonian) within this error in a finite number of steps.
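The finite-step search for p can be pictured as a bisection on the intermediate-value argument above. The sketch below is only a schematic illustration under two assumptions that are not from the original text: the cross term \(\langle \sigma_{\max }|A|\sigma_{\min }\rangle\) vanishes, so that \(\langle e_{2}(p)|A|e_{2}(p)\rangle = p\,a_{\max }+(1-p)\,a_{\min }\), and the equilibrium value is available as a continuous function `a_mc(p)`, which here is a hypothetical stand-in supplied by the caller.

```python
def tune_p(a_min, a_max, a_mc, tol=1e-6, max_iter=200):
    """Bisection for p in [0, 1] with p*a_max + (1-p)*a_min = a_mc(p) up to tol.

    a_mc is a hypothetical continuous function returning the equilibrium value
    for the Hamiltonian built with the basis state e_2(p).
    """
    def gap(p):
        return p * a_max + (1 - p) * a_min - a_mc(p)

    lo, hi = 0.0, 1.0
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("a_mc must lie between the extreme expectation values")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g = gap(mid)
        if abs(g) <= tol:
            return mid
        if g < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy usage with a made-up, weakly p-dependent equilibrium value.
p = tune_p(-1.0, 1.0, lambda q: 0.3 + 0.1 * q)
```

Because only agreement within a finite error is required, the loop terminates after finitely many halvings of the interval.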

No sufficiently large system size

Our result claims that we cannot solve the problem of thermalization by any elaborate method, even with unlimited computational resources. In order to elucidate the significance of the constructed systems, we compare them with near-integrable systems, H = Hint + εV, where Hint is an integrable Hamiltonian and ε is a small parameter. In near-integrable systems, the small parameter ε determines the necessary system size and time length to distinguish the true thermodynamic limit from prethermal plateaus, and by taking ε → 0 the necessary size diverges. If our computational resources are unlimited, by setting the system size and running time sufficiently large depending on ε as determined above, we safely obtain the true long-time behaviour in the thermodynamic limit within an arbitrarily small error.

In contrast to near-integrable systems, the constructed systems exhibiting undecidability have no such small parameters and no sufficiently large system size. This fact is clearly demonstrated by introducing the busy beaver function BB(n). The busy beaver function gives the largest number of steps which a halting TM with n internal states and an empty input can take. Since the number of internal states can be connected to the length of the input code to a TM with a fixed number of internal states, the busy beaver function also serves as the indicator of the necessary time steps with respect to the length of the input. In terms of thermalization, the busy beaver function provides the necessary system size and time length to observe the true thermodynamic limit. However, the busy beaver function is proven to be uncomputable. More surprisingly, if the Zermelo-Fraenkel set theory with the axiom of choice (ZFC), which is roughly equivalent to the whole of our mathematics, is consistent, then the value of BB(748) cannot be determined within ZFC47,48. Notice that all possible TMs with 748 internal states can be implemented by a (large but) finite set of Hamiltonians. These Hamiltonians obviously have no small parameters going to zero, because no quantity tends to zero in a finite set. In spite of this, we do not have a sufficiently large system size for these (finitely many) Hamiltonians.

Discussion

The presence or absence of thermalization in a given quantum many-body system, which has been a topic of debate among researchers in various fields, is proven to be undecidable. Hence, there exists no general systematic procedure to determine the long-time behaviour of quantum many-body systems. The undecidability remains valid for a class of simple systems: one-dimensional systems with a shift-invariant nearest-neighbour interaction. Our result places a fundamental limitation on reaching a general theory of thermalization.

Our proof also shows the computational universality of thermalization phenomena. Contrary to the apparent simplicity of thermalization phenomena, the above fact leads to an astonishing consequence that the variety of thermalization phenomena is no less than all possible tasks computers can manage. A striking example bridging physics and mathematics is a system that thermalizes if and only if the Riemann hypothesis is true. The above system reflects the existence of a TM which halts if and only if the Riemann hypothesis is false49.

From the viewpoint of physics, the extremely slow relaxation of our model in the case of halting is induced by quasi-conserved non-local quantities, which are close to conserved quantities but not conserved. Recently, some non-integrable systems (for example, the transverse Ising model with a magnetic field in the z direction) have been reported to relax very slowly, which is caused by quasi-conserved local quantities23,50,51. Numerical simulations with ordinary size and time length fail to address thermalization in these systems. Similar behaviour can also be seen in glassy systems, whose connection with computational hardness is also discussed intensively52. The extremely slow relaxation in our system might be understood from the aforementioned more general viewpoints, which is worth further investigation.

We remark that our definition of thermalization is conditional with respect to an observable. There exists another definition of thermalization in an unconditional form, where a system is said to thermalize if and only if the system thermalizes with respect to all macroscopic observables. In this article, we do not employ this alternative definition because no shift-invariant system is proven to thermalize in this sense. To prove undecidability, we should prepare infinitely many thermalizing and non-thermalizing systems with proof. Constructing a thermalizing system in this sense is considered to be a very hard problem, and therefore we give up adopting this definition.

We finally comment on the limitations of our result and conclude this study. First, our result does not exclude the possibility of proving the presence or absence of thermalization in specific systems. It only excludes the possibility of obtaining a general and ultimate criterion to judge the presence or absence of thermalization. We emphasize that our results do not diminish the value of numerical simulations of ordinary systems with finite size. Second, our undecidability is shown only for a highly artificial model with a particular form of Hamiltonian, which is another limitation of our result. One needs to proceed to a more natural model exhibiting undecidability, or to find a restricted class of physical Hamiltonians for which the fate of thermalization is decidable. These problems are left for future work.

Method

Decoding from the second and third layers

We discuss how to decode the input code u from the two qubit sequences in the second and third layers. The value of β, whose binary expansion is equal to u, is estimated from the relative frequency of 1’s in the second layer (see Fig. 2). We expand m copies of \(\sqrt{\beta }\left|1\right\rangle +\sqrt{1-\beta }\left|0\right\rangle\) as

$$\left(\sqrt{\beta}\left|1\right\rangle+\sqrt{1-\beta}\left|0\right\rangle\right)^{\otimes m}=\sum_{\boldsymbol{w}\in\{0,1\}^{\otimes m}}\sqrt{\beta}^{\,N_{1}(\boldsymbol{w})}\sqrt{1-\beta}^{\,m-N_{1}(\boldsymbol{w})}\left|\boldsymbol{w}\right\rangle,$$
(8)

where \(\boldsymbol{w}\) is a sequence of 0’s and 1’s with length m, and \(N_{1}(\boldsymbol{w})\) is the number of 1’s in the binary sequence \(\boldsymbol{w}\). The probability of a state \(\left|\boldsymbol{w}\right\rangle\) is \(\left|c_{\boldsymbol{w}}\right|^{2}=\beta^{N_{1}(\boldsymbol{w})}(1-\beta)^{m-N_{1}(\boldsymbol{w})}\), where \(c_{\boldsymbol{w}}\) is the probability amplitude of \(\left|\boldsymbol{w}\right\rangle\). By the law of large numbers, the total probability of the states whose relative frequency of 1’s is close to β converges to 1 in the large-m limit:

$$\lim_{m\to\infty}\sum_{\boldsymbol{w}:\frac{N_{1}(\boldsymbol{w})}{m}\simeq\beta}\left|c_{\boldsymbol{w}}\right|^{2}=1.$$
(9)

Here, the symbol \(\frac{N_{1}(\boldsymbol{w})}{m}\simeq \beta\) means that \(\frac{N_{1}(\boldsymbol{w})}{m}\) is close to β; its rigorous definition is given below in (11). Hence, if m is sufficiently large compared with the length of the input code, TM1 guesses β correctly from the frequency of 1’s.
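
As a rough numerical illustration of this decoding step (a minimal sketch, not the construction of TM1 itself), one can replace the measurement of the m qubits by m independent classical bits, each equal to 1 with probability β, and round the observed frequency to the nearest n-digit binary fraction. The helper names `encode`, `sample_frequency`, and `decode`, as well as the rounding rule, are assumptions made only for this illustration.

```python
import random


def encode(u: str) -> float:
    """Return beta in [0, 1) whose binary expansion is 0.u (hypothetical helper)."""
    return int(u, 2) / 2 ** len(u)


def sample_frequency(beta: float, m: int, rng: random.Random) -> float:
    """Relative frequency of 1's among m independent bits, each 1 with probability beta."""
    ones = sum(1 for _ in range(m) if rng.random() < beta)
    return ones / m


def decode(freq: float, n: int) -> str:
    """Round freq to the nearest n-digit binary fraction and return its n digits."""
    k = min(round(freq * 2 ** n), 2 ** n - 1)  # clamp so that n digits suffice
    return format(k, f"0{n}b")


rng = random.Random(0)  # fixed seed, for reproducibility only
u = "101"               # example input code, so beta = 0.625
freq = sample_frequency(encode(u), 20000, rng)
decoded = decode(freq, len(u))
```

With m much larger than \(4^{n}\), the statistical fluctuation of the frequency is small compared with the spacing \(2^{-n}\) between n-digit binary fractions, which is why the rounding typically recovers u.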

The length m is determined from another qubit sequence, consisting of copies of \(\sqrt{\gamma }\left|1\right\rangle +\sqrt{1-\gamma }\left|0\right\rangle\), in the third layer. Let 0 < ξ < 1 be a given accuracy. We encode the information of m into γ by requiring that γ satisfy

$${(1-\gamma )}^{m}\ge 1-\xi .$$
(10)

In other words, almost all qubits in this sequence are \(\left|0\right\rangle\), and with probability at least 1 − ξ the first \(\left|1\right\rangle\) appears only after the m-th digit. Accordingly, if \(\left|1\right\rangle\) appears for the first time at the \(m^{\prime}\)-th digit, we take this as a sign that \(m\le m^{\prime}\). Based on the observed value \(m^{\prime}\), the length of the output of TM1 (i.e., the presumed number of digits of β) is set to \(n^{\prime}=\lceil \frac{1}{4}\log_{2}m^{\prime}\rceil\), which ensures

$$\lim_{m^{\prime}\to\infty}\mathrm{Prob}\left[\left|\frac{N_{1}(\boldsymbol{w})}{m^{\prime}}-\beta\right|<\frac{1}{2^{n^{\prime}+1}}\right]=1.$$
(11)
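
One way to see why this choice of \(n^{\prime}\) suffices is a Chebyshev-type estimate. This is only a sketch, under the assumption that \(N_{1}(\boldsymbol{w})\) counts the 1’s among \(m^{\prime}\) independent digits, each equal to 1 with probability β, and it is not necessarily the argument used in the Supplementary Note. The variance of \(N_{1}(\boldsymbol{w})/m^{\prime}\) is \(\beta(1-\beta)/m^{\prime}\le 1/(4m^{\prime})\), so that

$$\mathrm{Prob}\left[\left|\frac{N_{1}(\boldsymbol{w})}{m^{\prime}}-\beta\right|\ge\frac{1}{2^{n^{\prime}+1}}\right]\le\frac{\beta(1-\beta)}{m^{\prime}}\,4^{n^{\prime}+1}\le\frac{4^{n^{\prime}}}{m^{\prime}}\le\frac{4}{\sqrt{m^{\prime}}},$$

where the last inequality uses \(n^{\prime}\le\frac{1}{4}\log_{2}m^{\prime}+1\), that is, \(4^{n^{\prime}}\le 4\sqrt{m^{\prime}}\). The right-hand side vanishes as \(m^{\prime}\to\infty\).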

With this choice of output length \(n^{\prime}\), guessing \(m^{\prime}\) larger than the true value m does not affect the correctness of the estimation of β.
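
The use of the third layer can be sketched classically as follows. This is an illustration under stated assumptions, not the authors' implementation: the qubit sequence is replaced by independent bits that equal 1 with probability γ, and the function names (`gamma_for`, `first_one_position`, `output_length`) are hypothetical.

```python
import math
import random


def gamma_for(m: int, xi: float) -> float:
    """A value of gamma with (1 - gamma)**m = 1 - xi, up to floating-point rounding."""
    return 1.0 - (1.0 - xi) ** (1.0 / m)


def first_one_position(gamma: float, rng: random.Random, limit: int = 10**7) -> int:
    """1-based position m' of the first 1 in a sequence of bits that are 1 with probability gamma."""
    for pos in range(1, limit + 1):
        if rng.random() < gamma:
            return pos
    raise RuntimeError("no 1 found within the search limit")


def output_length(m_prime: int) -> int:
    """Presumed number of digits n' = ceil(log2(m') / 4)."""
    return math.ceil(math.log2(m_prime) / 4)


gamma = gamma_for(1000, 0.1)                        # m = 1000, xi = 0.1 (example values)
m_prime = first_one_position(gamma, random.Random(1))
n_prime = output_length(m_prime)
```

Because \(n^{\prime}\) grows only logarithmically in \(m^{\prime}\), even a large overestimate of m changes the presumed number of digits only slightly.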

Modification of TM3 in case of thermalization

To show the undecidability of thermalization, we need to modify TM3 into another TM, named TM3+. TM3 flips all A-cells from \(a_{1}\) to \(a_{2}\) and then stops. Similarly, TM3+ first flips all A-cells from \(a_{1}\) to \(a_{2}\), but it keeps running after the flipping in order to spend a further \(O(L^{2})\) time steps. Note that TM1 and TM2 take O(1) steps, and TM3 takes O(L) steps. In TM3+, the steps before stopping are therefore dominated by those after the flipping. This additional trick changes the long-time average of \(\mathcal{A}\) with halting TM2 from \(\langle e_{2}|A|e_{2}\rangle/2\) (in the case of TM3) to \(\langle e_{2}|A|e_{2}\rangle\).

In the construction of TM3+, we introduce two new states of A-cells, \(b_{l}\) and \(b_{r}\), and impose the rule that the position of \(b_{l}\) is fixed while the position of \(b_{r}\) moves one cell to the right during each round trip of the finite control between \(b_{l}\) and \(b_{r}\). At the beginning, \(b_{r}\) sits to the right of \(b_{l}\), and we set TM3+ to stop when \(b_{r}\) hits \(b_{l}\) from the left. In this setting, it takes \(O(L^{2})\) steps until \(b_{r}\) hits \(b_{l}\) from the left, which indeed meets the requirement.
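
The quadratic scaling can be checked with a simplified counting model. The sketch below is not the actual transition table of TM3+. It assumes a ring of L cells with the fixed marker at position 0 and the moving marker starting at position 1, and it charges 2d steps for a round trip of the finite control when the markers are d cells apart. The function name and these modelling choices are assumptions for illustration.

```python
def extra_steps(L: int) -> int:
    """Count steps until the moving marker reaches the fixed marker from the left.

    Simplified model: fixed marker at cell 0 of a ring with L cells, moving marker
    starting at cell 1; each round trip over distance d costs 2 * d steps and then
    shifts the moving marker one cell to the right.
    """
    steps = 0
    b_r = 1  # position of the moving marker; the fixed marker sits at 0
    while b_r % L != 0:   # stop once the moving marker wraps around to cell 0
        steps += 2 * b_r  # one round trip between the two markers
        b_r += 1
    return steps
```

In this model the total is 2(1 + 2 + ... + (L − 1)) = L(L − 1), which grows as the square of L.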

Busy beaver function

The busy beaver function BB(n) is defined as follows. We consider all possible TMs with n internal states and run each of them on an empty input. Some TMs will halt, and others will not. We pay attention only to the former and record the maximum number of steps before halting, which is BB(n). Since there are only finitely many TMs with n internal states and we exclude the non-halting ones, BB(n) is finite for all n.

We remark that a TM with m internal states and an input u with length l can be emulated by another TM with m + l internal states and an empty input. The emulation is performed as follows: the emulating TM first writes the code u on the blank tape by using l internal states, and then works as the TM with m internal states. Conversely, a URTM can emulate any TM with any number of internal states, whose information is given in the input code for the URTM. Thus, the busy beaver function also characterizes the maximum number of steps in terms of the length of the input.

The uncomputability of BB(n) is a direct consequence of the undecidability of the halting problem. We show this by contradiction. Suppose that BB(n) is computable for any n. Then, for any TM with m internal states and any input code u with length l, we run this TM with this input for BB(m + l) steps and observe whether it halts. By the definition of BB, if the TM has not halted by this step, it never halts. This procedure solves the halting problem, which is a contradiction.
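
The bounded-simulation step of this argument can be sketched with a toy TM simulator. The conventions below are assumptions that may differ from those of the paper: two tape symbols, a blank tape of 0’s, and a separate halting state "H" that is not counted as an internal state. Under these conventions, the known 2-state champion halts after exactly 6 steps.

```python
from typing import Dict, Tuple

# (state, symbol read) -> (symbol to write, head move, next state)
Rules = Dict[Tuple[str, int], Tuple[int, int, str]]


def halts_within(rules: Rules, max_steps: int, start: str = "A", halt: str = "H") -> bool:
    """Run a TM on a blank tape for at most max_steps steps and report whether it halted."""
    tape: Dict[int, int] = {}
    pos, state = 0, start
    for _ in range(max_steps):
        if state == halt:
            break
        write, move, state = rules[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
    return state == halt


# Known 2-state, 2-symbol champion: halts after exactly 6 steps on a blank tape.
champion: Rules = {
    ("A", 0): (1, +1, "B"), ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"), ("B", 1): (1, +1, "H"),
}

# A machine that keeps moving right and never halts.
runner: Rules = {("A", 0): (0, +1, "A")}
```

If a valid bound such as BB(m + l) were available, a single call `halts_within(rules, bound)` would decide halting; the contradiction above shows that no algorithm can supply such a bound for all n.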

The uncomputability of BB(748) follows from the fact that there is a TM with 748 internal states that halts if and only if ZFC is inconsistent47,48. Gödel’s incompleteness theorem shows that ZFC cannot prove the consistency of ZFC itself if ZFC is consistent. By an argument similar to the above, ZFC cannot determine the value of BB(748) if ZFC is consistent.

Proof

All the results in this paper are rigorously proved in the Supplementary Note.