Introduction

The quantum speed limit (QSL) is a fundamental topic in quantum mechanics that concerns the minimum time for quantum states to fulfill a given target, such as rotating a state to an orthogonal state or through a certain angle quantified by some metric. In principle, the target can be chosen flexibly according to the problem of interest. In 1945, Mandelstam and Tamm provided the first lower bound for this minimum time based on the uncertainty relation1. In 1996, Braunstein et al. extended the lower bound to time-dependent Hamiltonians utilizing the generalized uncertainty relation2, where the time-averaged variance was applied. In 1998, Margolus and Levitin3 provided another bound based on the mean energy. After these pioneering works, the topic of QSL entered a period of rapid development over the next 20 years, especially in the 2010s (refs. 4–42).

Most existing tools in QSL are lower-bound-type (LBT) tools. Their advantage is that they are easy to compute, especially numerically. However, their disadvantages are also significant. On one hand, most LBT tools depend on the initial state. As a consequence, even when the initial state cannot actually fulfill the given target, the LBT tools still provide finite results, which is mathematically reasonable since any finite value is a legitimate lower bound of infinity. However, it also means that these tools cannot tell whether a state is capable of fulfilling the target. For example, consider a qubit Hamiltonian ωσz/2 with σz(x) the Pauli Z (X) matrix and ω the energy gap. For this Hamiltonian, the Mandelstam-Tamm and Margolus-Levitin bounds for the state with the density matrix \({\mathbb{1}}/2+{\sigma }_{x}/4+\sqrt{3}{\sigma }_{z}/4\) are 2π/ω and \(2\pi /(\sqrt{3}\omega )\). Here \({\mathbb{1}}\) is the identity matrix. However, this state cannot fulfill the target π/2 at all, since the maximum angle it can rotate through under the given Hamiltonian is only π/3 (ref. 21). Hence, without knowing whether the target can be fulfilled, conclusions based on LBT tools might be suboptimal, since the results are unphysical for states unable to reach the target.
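The π/3 figure quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming that the Hamiltonian ωσz/2 acts on the Bloch vector as a rigid precession about the z axis by the angle ωt; the function name and the value ω = 1 are illustrative choices, not taken from the paper.

```python
import math

def bloch_angle_under_z_precession(r, omega, t):
    """Angle between Bloch vector r and its image after precessing about z by omega*t."""
    rx, ry, rz = r
    norm_sq = rx * rx + ry * ry + rz * rz
    # A rotation about z keeps rz and the length of the xy component fixed.
    dot = (rx * rx + ry * ry) * math.cos(omega * t) + rz * rz
    return math.acos(max(-1.0, min(1.0, dot / norm_sq)))

omega = 1.0  # illustrative energy gap
# Bloch vector of rho = 1/2 + sigma_x/4 + sqrt(3)*sigma_z/4
r = (0.5, 0.0, math.sqrt(3) / 2)

period = 2 * math.pi / omega
steps = 20000
max_angle = max(
    bloch_angle_under_z_precession(r, omega, k * period / steps)
    for k in range(steps + 1)
)
# max_angle stays below the target pi/2, so this state never fulfills it,
# although both lower bounds quoted in the text are finite.
```

The maximum is reached at half a period, where the Bloch vector is reflected through the z axis.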

On the other hand, when the Hamiltonians are time-dependent, the LBT tools are usually functions of time9,10,11,12,13,14,15,16,17,18,19,20. As a matter of fact, these formally time-dependent lower bounds can hardly reveal either the true minimum time or the physics behind it. As clarified in ref. 21, in noncontrolled scenarios the true minimum time for a fixed state to fulfill a given target is a single fixed time point, and the results of LBT tools have to cross this time point due to their time dependence. Before this time point, the finite results of the LBT tools cannot reveal the fact that the state is actually incapable of reaching the target in this time regime. After this time point, the results of LBT tools can be no larger than it since they are its lower bounds, which indicates that the attainability of the LBT tools is poor in this time regime. These disadvantages of LBT tools could be further magnified with the growth of the system dimension or the complexity of the dynamics. Hence, locating the true minimum time to fulfill a given target in many-body systems and complex dynamics is still an important yet unsolved problem. Finding this minimum time, or at least providing efficient methodologies to search for it, is thus the major motivation of this paper.

Results and discussion

Operational definition of the quantum speed limit

The target in QSL can be quantified via different tools, such as the Bures metric or various types of fidelity8,9,10,11,12,13, relative purity14,15,16, Bloch angle16,17,21,22, gauge invariant distances18,19, and Wigner-Yanase information20. Different tools usually lead to different mathematical bounds or methods for the description of QSL, and a general and unified methodology that fits all tools is still lacking. Recently, an operational definition of the quantum speed limit (OQSL) was proposed21 based on the Bloch angle, which can be extended to a general tool because it is intrinsically a methodology rather than a concept. Denote ρ as the density matrix of a quantum state, Φ as any type of metric or tool to quantify the target, and Φtar as the corresponding target value. The reachable state set can then be defined as \({{{\mathcal{S}}}}:= \{\rho \,| \,\Phi (t,\rho )={\Phi }_{{{{\rm{tar}}}}},\exists t\}\), which is the set of states that can fulfill the target. Moreover, in some cases only the states in a subset \({{{\mathcal{Q}}}}\) of the state space are of concern. In this case, \({{{\mathcal{S}}}}\) can be further expressed by \({{{\mathcal{S}}}}:= \{\rho \,| \,\rho \in {{{\mathcal{Q}}}}\,\& \,\Phi (t,\rho )={\Phi }_{{{{\rm{tar}}}}},\exists t\}\). Utilizing \({{{\mathcal{S}}}}\), the OQSL (denoted by τ) can be defined by

$$\begin{array}{l}\tau \,:= \,\mathop{\min }\limits_{\rho \in {{{\mathcal{S}}}}}t\\ \qquad\,\,\,\,{{\mbox{subject to}}}\,\,\Phi (t,\rho )={\Phi }_{{{{\rm{tar}}}}}.\end{array}$$
(1)

The Bloch vector is one of the most famous geometric representations of a quantum state and has been widely applied in many fields of quantum physics, such as quantum computation43 and quantum control44. In the Bloch representation, the density matrix can be expressed by \(\rho =\frac{1}{N}\left({\mathbb{1}}+\sqrt{\frac{1}{2}N(N-1)}\overrightarrow{r}\cdot \overrightarrow{\lambda }\right)\), where N is the dimension of ρ, \(\overrightarrow{\lambda }\) is the vector of SU(N) generators, \(\overrightarrow{r}\) is the Bloch vector satisfying \(| \overrightarrow{r}| \le 1\), and \({\mathbb{1}}\) is the identity matrix. The Bloch angle θ between \(\overrightarrow{r}\) and its evolved vector \(\overrightarrow{r}(t)\) is \(\theta (t,\overrightarrow{r}):= \arccos \left(\frac{\overrightarrow{r}\cdot \overrightarrow{r}(t)}{| \overrightarrow{r}| | \overrightarrow{r}(t)| }\right)\in \left(0,\pi \right]\). Denote Θ as the fixed target. The reachable state set can then be rewritten as \({{{\mathcal{S}}}}=\{\overrightarrow{r}\,| \,\overrightarrow{r}\in {{{\mathcal{Q}}}}\,\& \,\theta (t,\overrightarrow{r})=\Theta ,\exists t\}\), and the OQSL reads \(\tau =\mathop{\min }\nolimits_{\overrightarrow{r}\in {{{\mathcal{S}}}}}\,t\), subject to the constraint \(\theta (t,\overrightarrow{r})=\Theta\).

From the perspective of OQSL, when two tools to quantify the target have a one-to-one correspondence, for example the angle of relative purity \(\arccos \left(\frac{{{{\rm{Tr}}}}(\rho \rho (t))}{{{{\rm{Tr}}}}({\rho }^{2})}\right)\) and the Bloch angle (calculation details are in the Supplementary Information), the reachable state sets for these tools are exactly the same, which means the results of OQSL are also equivalent. This equivalence reveals an important fact: a physical target can be mathematically quantified by different tools, yet the true minimum time to fulfill the physical target should not be affected by this quantification process, since the process itself is not physical.
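For unitary dynamics this correspondence can be checked directly from the Bloch representation. The following is only a sketch under two assumptions that are not stated explicitly in the text: the generators are traceless and normalized as \({{{\rm{Tr}}}}({\lambda }_{i}{\lambda }_{j})=2{\delta }_{ij}\), and the evolution is unitary so that \(| \overrightarrow{r}(t)| =| \overrightarrow{r}|\). The full calculation is the one in the Supplementary Information.

$$\begin{array}{l}{{{\rm{Tr}}}}(\rho \rho (t))=\frac{1}{N}+\frac{N-1}{N}\,\overrightarrow{r}\cdot \overrightarrow{r}(t),\qquad {{{\rm{Tr}}}}({\rho }^{2})=\frac{1}{N}+\frac{N-1}{N}| \overrightarrow{r}{| }^{2},\\ \frac{{{{\rm{Tr}}}}(\rho \rho (t))}{{{{\rm{Tr}}}}({\rho }^{2})}=\frac{1+(N-1)| \overrightarrow{r}{| }^{2}\cos \theta }{1+(N-1)| \overrightarrow{r}{| }^{2}}.\end{array}$$

For a fixed \(| \overrightarrow{r}| > 0\) the right-hand side decreases strictly with θ on [0, π], so under these assumptions each value of the relative-purity angle corresponds to exactly one Bloch angle.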

The OQSL is closely related to the quantum brachistochrone problem45,46, which seeks the minimum time for a given initial state to reach a fixed target state, or for the realization of a target gate. In the language of OQSL, instead of a given initial state, we can study the minimum time for a set of initial states, i.e., the aforementioned set \({{{\mathcal{Q}}}}\), to reach a target state ρtar under a given Hamiltonian. In this problem \({{{\mathcal{S}}}}\) can be expressed by \({{{\mathcal{S}}}}=\{\rho | \rho \in {{{\mathcal{Q}}}}\,\& \,{e}^{{{{\mathcal{L}}}}}(\rho )={\rho }_{{{{\rm{tar}}}}},\exists t\}\) where \({{{\mathcal{L}}}}\) is a superoperator satisfying \({\partial }_{t}{\rho }_{t}={{{\mathcal{L}}}}({\rho }_{t})\) with ρt the evolved state of ρ. Furthermore, the OQSL can be expressed by

$$\begin{array}{l}\tau \,:= \,\mathop{\min }\limits_{\rho \in {{{\mathcal{S}}}}}t\\ \qquad\,\,\, \,{{\mbox{subject to}}}\,\,{e}^{{{{\mathcal{L}}}}}(\rho )={\rho }_{{{{\rm{tar}}}}}.\end{array}$$
(2)

Notice that if \({\rho }_{{{{\rm{tar}}}}}\in {{{\mathcal{Q}}}}\), the optimal state in \({{{\mathcal{Q}}}}\) to reach ρtar must be ρtar itself for any Hamiltonian, and the corresponding time is zero, which means this is a trivial case. Therefore, \({\rho }_{{{{\rm{tar}}}}}\notin {{{\mathcal{Q}}}}\) should be satisfied to make sure the problem is nontrivial. Here we still take the qubit Hamiltonian ωσz/2 as a simple demonstration. The target state is assumed to be \((\left\vert 0\right\rangle -\left\vert 1\right\rangle )/\sqrt{2}\) with \(\left\vert 0\right\rangle\) (\(\left\vert 1\right\rangle\)) the eigenstate of σz corresponding to the eigenvalue 1 ( − 1), and the set of initial states is taken as \({{{\mathcal{Q}}}}=\{\rho | {{{\rm{Tr}}}}(\rho {\sigma }_{x})\ge 0\}\). Utilizing the spherical coordinates of the Bloch vector \(\overrightarrow{r}=\eta {(\sin \alpha \cos \varphi ,\sin \alpha \sin \varphi ,\cos \alpha )}^{{{{\rm{T}}}}}\), \({{{\mathcal{S}}}}\) in this example reads \(\left\{\overrightarrow{r}\,| \,\eta =1,\alpha =\pi /2,\varphi \in [0,\pi /2]\cup [3\pi /2,2\pi )\right\}\), and the OQSL is τ = π/(2ω). This minimum time can be attained by the state \((\left\vert 0\right\rangle +i\left\vert 1\right\rangle )/\sqrt{2}\). Calculation details can be found in the Supplementary Information.
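The value τ = π/(2ω) in this example can be reproduced by a direct scan over the reachable set. The sketch below assumes ħ = 1 and that, under the Hamiltonian ωσz/2, the azimuth of an equatorial pure state grows as φ + ωt; the grid size and the value ω = 2 are illustrative, and the function names are hypothetical.

```python
import math

def time_to_target(phi, omega, phi_target=math.pi):
    """Time for an equatorial pure state at azimuth phi to precess to phi_target."""
    # Assumption: the azimuth evolves as phi + omega*t (hbar = 1).
    return ((phi_target - phi) % (2 * math.pi)) / omega

def in_Q(phi):
    """Q = {rho | Tr(rho sigma_x) >= 0}; for equatorial pure states Tr(rho sigma_x) = cos(phi)."""
    return math.cos(phi) >= -1e-12  # small tolerance for floating-point rounding

omega = 2.0  # illustrative energy gap
# The target (|0> - |1>)/sqrt(2) sits at azimuth pi on the equator.
grid = [2 * math.pi * k / 3600 for k in range(3600)]
candidates = [(time_to_target(phi, omega), phi) for phi in grid if in_Q(phi)]
tau, phi_opt = min(candidates)
# tau is the smallest time over the scanned states and phi_opt the azimuth that attains it.
```

Only pure states on the equator are scanned, since these are the states in the reachable set given in the text. The minimizer sits at φ = π/2, the azimuth of the optimal state quoted there.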

Compared to lower-bound-type QSLs, the advantages of OQSL are that it reveals whether a state can fulfill the target, and that it is always attainable21. In the case of complex dynamics, these advantages come at the price of high computational complexity, which is due not only to the optimization in the definition, but also to the preliminary assumption that \({{{\mathcal{S}}}}\) is known. For example, in the analytical calculation of the OQSL, the search for \({{{\mathcal{S}}}}\) is the first step and is usually accomplished by finding the condition on ρ under which the equation \(\Phi \left(t,\rho \right)={\Phi }_{{{{\rm{tar}}}}}\) has a finite solution t. Then the evolution time to fulfill the target is calculated and optimized under this condition to obtain the OQSL. In this case, the calculation of \({{{\mathcal{S}}}}\) and the optimization of time are performed separately, and thus their contributions to the computational complexity are different. In the numerical evaluation of OQSL, the two processes are not separate when the brute-force search is applied, since the search for \({{{\mathcal{S}}}}\) in this method is based on the rigorous dynamics of each state. When \({{{\mathcal{S}}}}\) is obtained, the corresponding time to fulfill the target for each state is also obtained. Hence, the computational complexity in this case is basically contributed by the search for \({{{\mathcal{S}}}}\). However, the brute-force search is not always feasible in practice, especially when the dynamics is complex or the system size is large, which is a non-negligible scenario in the study of QSL23,24,25,26. Hence, finding methods for the evaluation of OQSL that are friendly to complex dynamics or large systems is critical, and is thus the major motivation of this paper.

The time-dependent Hamiltonians with time-independent eigenstates

In many cases, the complexity of dynamics comes from the time dependency of the Hamiltonian. The OQSL for a general time-dependent Hamiltonian is difficult to obtain analytically. However, for the time-dependent Hamiltonians with time-independent eigenstates, the OQSL can be obtained analytically when taking the Bloch angle as the quantification of target. In the energy space, these Hamiltonians can be expressed by \(H(t)={\sum }_{i}{E}_{i}(t)\left\vert {E}_{i}\right\rangle \left\langle {E}_{i}\right\vert\), where the eigenstate \(\left\vert {E}_{i}\right\rangle\) is time-independent for any i and the eigenvalue Ei(t) depends on time. Many well-known models in quantum mechanics fit this scenario, such as the one-dimensional Ising model with a time-varying longitudinal field, the resonant Jaynes-Cummings model with time-dependent coupling47,48,49, and the semiclassical qubit-field model in the strong coupling regime50. For such Hamiltonians, we present the following theorem.

Theorem

For an N-dimensional time-dependent Hamiltonian whose eigenstates are all time-independent, the OQSL τ satisfies the equation

$$\int\nolimits_{0}^{\tau }\left[{E}_{\max }(t)-{E}_{\min }(t)\right]{{{\rm{d}}}}t=\Theta ,$$
(3)

where \({E}_{\max }(t)\) and \({E}_{\min }(t)\) are the maximum and minimum energies of the Hamiltonian at time t. Further denoting the p-dimensional set \(\{\left\vert {E}_{\min }\right\rangle \}\) and q-dimensional set \(\{\left\vert {E}_{\max }\right\rangle \}\) as the sets of eigenstates with respect to \({E}_{\min }(t)\) and \({E}_{\max }(t)\), the optimal states to reach the OQSL are

$$\begin{array}{l}\mathop{\sum}\limits_{i}\frac{1}{N}\left\vert {E}_{i}\right\rangle \left\langle {E}_{i}\right\vert +\,\mathop{\sum}\limits_{\begin{array}{l}\left\vert {E}_{k}\right\rangle \in \{\left\vert {E}_{\min }\right\rangle \},\\ \left\vert {E}_{l}\right\rangle \in \{\left\vert {E}_{\max }\right\rangle \}\end{array}}\,{\xi }_{kl}\left\vert {E}_{k}\right\rangle \left\langle {E}_{l}\right\vert +{\xi }_{kl}^{* }\left\vert {E}_{l}\right\rangle \left\langle {E}_{k}\right\vert ,\end{array}$$

where the matrix ξ (with (k, l)th entry ξkl) satisfies \({N}^{2}{\xi }^{{\dagger} }\xi \le {{\mathbb{1}}}_{q}\) with \({{\mathbb{1}}}_{q}\) the q-dimensional identity matrix.

The proof is given in the Supplementary Information. As a matter of fact, this theorem covers Theorem 1 in Ref. 21 due to the fact that Eq. (3) reduces to \(\tau =\Theta /({E}_{\max }-{E}_{\min })\) when the eigenvalues are time-independent. As a simple demonstration, consider the Hamiltonian H(t) = f(t)σz with f(t) a time-dependent function. It is obvious that the eigenstates of this Hamiltonian are independent of time. Hence the corresponding OQSL is given in the theorem above. In the case that \(| \int\nolimits_{0}^{t}f({t}_{1}){{{\rm{d}}}}{t}_{1}|\) is upper bounded by cf, \({{{\mathcal{S}}}}\) is fully determined by the value of cf, which leads to the following corollary.

Corollary

For the Hamiltonian H(t) = f(t)σz where f(t) satisfies \(| \int\nolimits_{0}^{t}f({t}_{1}){{{\rm{d}}}}{t}_{1}| \le {c}_{f}\), no state can fulfill the target Θ if cf < Θ/2.

In the case that cf ≥ Θ/2, \({{{\mathcal{S}}}}\) is symmetric about the z axis in the Bloch sphere, similar to the case of the time-independent Hamiltonian ωσz/2 (ref. 21). This is because the dynamics of all states in the Bloch sphere are precessions about the z axis, and thus obey the rotational symmetry about the z axis. Therefore, \({{{\mathcal{S}}}}\) can be fully expressed by the angle between the Bloch vector and the z axis (denoted by α). More specifically, when cf ∈ [Θ/2, π/2], \({{{\mathcal{S}}}}=\{\overrightarrow{r}\,| \alpha \,\in \,[{\alpha }_{f},\pi -{\alpha }_{f}]\}\) with \({\alpha }_{f}=\arcsin \left(\frac{\sin (\Theta /2)}{\sin {c}_{f}}\right)\), and \({{{\mathcal{S}}}}=\{\overrightarrow{r}\,| \alpha \,\in \,[\Theta /2,\pi -\Theta /2]\}\) when cf > π/2. Furthermore, the OQSL satisfies \(\int\nolimits_{0}^{\tau }| f(t)| {{{\rm{d}}}}t=\Theta /2\). A physical example here is \(f(t)\,=\,-g{\mu }_{{{{\rm{B}}}}}B\cos (\omega t)/2\)51 with g the Landé factor, μB the electron magnetic moment and \(B\cos (\omega t)\) a periodic magnetic field. Due to the fact that \(| \int\nolimits_{0}^{t}f({t}_{1}){{{\rm{d}}}}{t}_{1}| \le g{\mu }_{{{{\rm{B}}}}}B/(2\omega )\), \({{{\mathcal{S}}}}\) is determined by the ratio between B and ω. The OQSL reads \(\tau =\arcsin (\frac{\omega \Theta }{g{\mu }_{{{{\rm{B}}}}}B})/\omega\), and the optimal states are the states in the xy plane. It is obvious that τ ≤ π/(2ω), as \(\arcsin (\cdot )\) is always less than or equal to π/2. This upper bound is nothing but the time when the first degenerate point occurs, which leads to an interesting phenomenon: every target that can be fulfilled is fulfilled by the states in the xy plane before the first degenerate point occurs. In the case that a bounded control u(t) (|u(t)| ≤ ub) is invoked, f(t) becomes \(u(t)\,-\,g{\mu }_{{{{\rm{B}}}}}B\cos (\omega t)/2\) and the upper bound of \(| \int\nolimits_{0}^{t}f({t}_{1}){{{\rm{d}}}}{t}_{1}|\) can always exceed π/2 at a long enough time. Hence, in this case \({{{\mathcal{S}}}}=\{\overrightarrow{r}\,| \alpha \,\in \,[\Theta /2,\pi -\Theta /2]\}\) and the OQSL satisfies \(\int\nolimits_{0}^{\tau }| g{\mu }_{{{{\rm{B}}}}}B\cos (\omega t)/2-u(t)| {{{\rm{d}}}}t=\Theta /2\). The minimum τ with respect to u(t) (denoted by \({\tau }_{\min }\)) satisfies the equation \(g{\mu }_{{{{\rm{B}}}}}B\sin (\omega {\tau }_{\min })/(2\omega )+{u}_{b}{\tau }_{\min }=\Theta /2\), and \({\tau }_{\min }\approx \Theta /(g{\mu }_{{{{\rm{B}}}}}B+2{u}_{b})\) for a small ω. The calculation details are in the Supplementary Information.
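The relations in this paragraph can be cross-checked numerically. The sketch below solves the condition that the integral of |f(t)| from 0 to τ equals Θ/2 by bisection and compares the result with the closed form for the cosine field. All function names and parameter values are hypothetical (a stands for gμB B/2), and the controlled case assumes the constant control u(t) = −ub, which is consistent with the equation for τmin quoted above as long as cos(ωt) ≥ 0.

```python
import math

def integrate_abs(f, tau, steps=2000):
    """Composite trapezoid rule for the integral of |f(t)| over [0, tau]."""
    h = tau / steps
    total = 0.5 * (abs(f(0.0)) + abs(f(tau)))
    total += sum(abs(f(k * h)) for k in range(1, steps))
    return total * h

def solve_oqsl(f, theta, t_max, tol=1e-10):
    """Smallest tau in [0, t_max] with int_0^tau |f| dt = theta/2, or None if not reached."""
    if integrate_abs(f, t_max) < theta / 2:
        return None
    lo, hi = 0.0, t_max
    # The integral of |f| is nondecreasing in tau, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate_abs(f, mid) < theta / 2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def reachable_alpha_interval(c_f, theta):
    """Range of the polar angle alpha of the reachable set, following the cases in the text."""
    if c_f < theta / 2:
        return None  # corollary: no state can fulfill the target
    if c_f <= math.pi / 2:
        alpha_f = math.asin(min(1.0, math.sin(theta / 2) / math.sin(c_f)))
        return (alpha_f, math.pi - alpha_f)
    return (theta / 2, math.pi - theta / 2)

# Illustrative values: a = g*mu_B*B/2, field frequency omega, target theta, control bound u_b.
a, omega, theta, u_b = 1.0, 0.5, math.pi / 2, 0.3
t_first_zero = math.pi / (2 * omega)  # first zero of cos(omega*t)

tau_free = solve_oqsl(lambda t: -a * math.cos(omega * t), theta, t_first_zero)
tau_closed = math.asin(omega * theta / (2 * a)) / omega  # arcsin formula from the text

tau_ctrl = solve_oqsl(lambda t: -u_b - a * math.cos(omega * t), theta, t_first_zero)
tau_small_omega = theta / (2 * a + 2 * u_b)  # approximation quoted for small omega
```

With these values the bound on the integrated field is a/ω = 2, which exceeds π/2, so `reachable_alpha_interval(a / omega, theta)` returns the interval [Θ/2, π − Θ/2].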

Another practical scenario in which Theorem 1 applies is the one-dimensional Ising model with a longitudinal field, where two boundary conditions (periodic and open) exist. Let us first consider the case of the periodic boundary condition, in which the Hamiltonian reads \(H/J=-\mathop{\sum }\nolimits_{j = 1}^{n}{\sigma }_{j}^{z}{\sigma }_{j+1}^{z}-\mathop{\sum }\nolimits_{j = 1}^{n}g(t){\sigma }_{j}^{z}\) with \({\sigma }_{n+1}^{z}={\sigma }_{1}^{z}\). Here J > 0 is the interaction strength of the nearest-neighbor coupling, g(t) is a global time-dependent longitudinal field, and \({\sigma }_{j}^{z}\) is the Pauli Z matrix for the jth spin. The spin number satisfies n ≥ 3. In this case, the minimum energy is − n[1 + g(t)], and the maximum energy is n − η[2 − g(t)] when g(t) < 2 and n[g(t) − 1] when g(t) ≥ 2. Here \(\eta =[1+{(-1)}^{n+1}]/2\). If g(t) ≥ 2 for all time t, the OQSL satisfies the equation \(\int\nolimits_{0}^{\tau }| g(t)| {{{\rm{d}}}}t=\Theta /(2n)\). Due to the fact that \(\int\nolimits_{0}^{\tau }| g(t)| {{{\rm{d}}}}t\ge \int\nolimits_{0}^{\tau }2{{{\rm{d}}}}t=2\tau\), one can immediately find that τ ≤ Θ/(4n). If g(t) < 2 all the time, Eq. (3) reduces to \(2\left(n-\eta \right)\tau +\left(n+\eta \right)\int\nolimits_{0}^{\tau }| g(t)| {{{\rm{d}}}}t=\Theta\). In this case \(\tau \in \left[\frac{\Theta }{4n},\frac{\Theta }{2n-2\eta }\right]\) since \(\int\nolimits_{0}^{\tau }| g(t)| {{{\rm{d}}}}t\in [0,2\tau ]\). For a g(t) that is not always bounded by 2, the integration in Eq. (3) needs to be calculated part by part, and the rigorous solution may not be easy to obtain in general. However, in some cases a good approximation can still be obtained since τ is usually small. Take \(g(t)=B\cos (\omega t)\) as an example, where B and ω are the amplitude and frequency. In this case, if ω is not very large, then τ ≈ Θ/[2(n − η) + B(n + η)] when B < 2 and τ ≈ Θ/(2Bn) when B ≥ 2, which are nothing but the OQSLs with respect to the constant field g(t) = B.
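Because this Hamiltonian is diagonal in the σz product basis, the extreme energies quoted above can be verified for small rings by enumerating the classical spin configurations. The following sketch does so for a constant field g ≥ 0 and evaluates the constant-field OQSL Θ/(Emax − Emin); the function names and the tested values of n and g are illustrative assumptions.

```python
import itertools
import math

def ring_energies(n, g):
    """All eigenvalues of H/J = -sum_j s_j s_{j+1} - g sum_j s_j on a ring, with s_j = +1 or -1."""
    energies = []
    for spins in itertools.product((1, -1), repeat=n):
        bond = sum(spins[j] * spins[(j + 1) % n] for j in range(n))
        energies.append(-bond - g * sum(spins))
    return energies

def ring_extremes_formula(n, g):
    """E_min and E_max as quoted in the text (a field g >= 0 is assumed here)."""
    eta = (1 + (-1) ** (n + 1)) // 2  # 1 for odd n, 0 for even n
    e_min = -n * (1 + g)
    e_max = n - eta * (2 - g) if g < 2 else n * (g - 1)
    return e_min, e_max

def oqsl_constant_field(n, g, theta):
    """OQSL for a constant field: theta divided by the largest energy gap."""
    e_min, e_max = ring_extremes_formula(n, g)
    return theta / (e_max - e_min)

# Compare the closed forms with brute-force enumeration for a few small rings.
formulas_agree = all(
    abs(min(ring_energies(n, g)) - ring_extremes_formula(n, g)[0]) < 1e-12
    and abs(max(ring_energies(n, g)) - ring_extremes_formula(n, g)[1]) < 1e-12
    for n in range(3, 8)
    for g in (0.0, 0.5, 1.5, 2.0, 3.0)
)

theta = math.pi / 2
tau_weak = oqsl_constant_field(4, 0.5, theta)    # B < 2 branch
tau_strong = oqsl_constant_field(4, 3.0, theta)  # B >= 2 branch
```

The enumeration scales as 2^n and is meant only as a consistency check for small n, not as a method for large systems.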

In the case of the open boundary condition, the Hamiltonian reads \(-\mathop{\sum }\nolimits_{j = 1}^{n-1}{\sigma }_{j}^{z}{\sigma }_{j+1}^{z}-\mathop{\sum }\nolimits_{j = 1}^{n}g(t){\sigma }_{j}^{z}\). The minimum energy is − n[1 + g(t)] + 1, and the maximum energy is n + ηg(t) − 1 when g(t) ≤ 1, n − (2 − η)[2 − g(t)] + 1 when g(t) ∈ (1, 2), and n[g(t) − 1] + 1 when g(t) ≥ 2. For \(g(t)\,=\,B\cos (\omega t)\) with a not very large ω, an interesting phenomenon occurs when B < 2 and n is even. The OQSL in this case is approximately Θ/[n(B + 2) − 2] when B ≤ 1, and Θ/[n(B + 2) + 2(B − 2)] when B ∈ (1, 2), which are different from the OQSL under the periodic boundary condition. These two OQSLs, as well as their difference, are quite robust to global and local dephasing. Therefore, the OQSL may be used to detect whether an even-numbered spin ring is ruptured, especially when the number of spins is not very large. More details are in the Supplementary Information.
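The difference between the two boundary conditions for even n can be illustrated with the same kind of enumeration. The sketch below computes the largest energy gap for the ring and for the open chain at a constant field and compares the resulting constant-field OQSLs with the approximate expressions above; n = 6 and the field values are illustrative assumptions, and dephasing is not included.

```python
import itertools
import math

def extreme_gap(n, g, periodic):
    """E_max - E_min of H/J = -sum_j s_j s_{j+1} - g sum_j s_j by enumeration (s_j = +1 or -1)."""
    bonds = n if periodic else n - 1  # the open chain lacks the bond between spins n and 1
    energies = []
    for spins in itertools.product((1, -1), repeat=n):
        bond = sum(spins[j] * spins[(j + 1) % n] for j in range(bonds))
        energies.append(-bond - g * sum(spins))
    return max(energies) - min(energies)

theta = math.pi / 2
n = 6  # even number of spins

# Weak field, B <= 1.
b_weak = 0.5
tau_ring_weak = theta / extreme_gap(n, b_weak, periodic=True)
tau_open_weak = theta / extreme_gap(n, b_weak, periodic=False)

# Intermediate field, B in (1, 2).
b_mid = 1.5
tau_ring_mid = theta / extreme_gap(n, b_mid, periodic=True)
tau_open_mid = theta / extreme_gap(n, b_mid, periodic=False)
```

In both field regimes the open chain gives a larger minimum time than the ring, which is the signature the text proposes for detecting a ruptured ring.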

CRC methodology

The brute-force search is the most common method for the numerical evaluation of OQSL and is easy to execute for simple dynamics. However, when the evaluation of the dynamics for one state is too time-consuming, the entire brute-force search becomes infeasible, as it usually requires thousands or even millions of rounds of dynamics. In recent years, machine learning has been successfully applied to quantum physics for the simulation of complex dynamics, such as the theoretical dynamics of many-body systems52,53,54 and realistic dynamics of experimental systems55,56. With the help of trained neural networks, the computing time to evaluate the dynamics is significantly reduced compared to the rigorous calculation. Therefore, such learning techniques could be powerful tools to evaluate the OQSL. Here we provide a three-step learning-based methodology (CRC methodology) to evaluate the OQSL for complex dynamics. The three steps are (1) classification; (2) regression; and (3) calibration, as illustrated in Fig. 1. Classification and regression are two terms from supervised learning: classification identifies the categories of objects, and regression predicts values related to the objects.

Fig. 1: CRC methodology to learn the OQSL for complex dynamics.

The three steps are classification (gray box), regression (orange box), and calibration (blue box).

The reachable state set \({{{\mathcal{S}}}}\) is crucial in the evaluation of OQSL. It is not only essential for the further calculation of OQSL, but also reveals whether a state is capable of fulfilling the target. Hence, the first step (classification) in the CRC methodology is to find \({{{\mathcal{S}}}}\). In this step, a reasonable number of quantum states and the corresponding binary labels (0 or 1) constitute the training set. Quantum states and binary labels are the input and output of the neural network. In our calculation, label 1 (0) indicates that the state is in (not in) \({{{\mathcal{S}}}}\). The performance of the trained network can be tested via a test set. After the training and performance verification, a large number of random states are input into the network to construct \({{{\mathcal{S}}}}\) according to the outputs. In the following, the reachable state set learned in this step is denoted by \({{{{\mathcal{S}}}}}_{{{{\rm{learn}}}}}\).

The second step is regression. In this step, a subset of \({{{{\mathcal{S}}}}}_{{{{\rm{learn}}}}}\) and the corresponding times to reach the target constitute the training set. The time to reach the target is extracted from the rigorous dynamics. Notice that some states in this subset may be unable to fulfill the target and need to be removed from the training set, since \({{{{\mathcal{S}}}}}_{{{{\rm{learn}}}}}\) could be slightly different from \({{{\mathcal{S}}}}\) in practice. After the training and performance verification, all states in \({{{{\mathcal{S}}}}}_{{{{\rm{learn}}}}}\) are input into the trained network, and the minimum output (τlearn) and the corresponding state (ρlearn) are extracted. The quality of τlearn relies on the performance of the trained neural network in this process. Enlarging the training set is usually a possible way to improve the performance of learning. However, in many cases the improvement does not grow steadily with the size of the training set. Choosing an appropriate neural network would also be helpful, yet whether a network is appropriate usually needs to be thoroughly tested case by case. Moreover, large-scale models or quantum machine learning are also possible candidates to further improve the quality of τlearn, and we will continue to investigate this problem in the future.

In principle τlearn could be treated as an approximation of the OQSL. However, if the methodology stopped here, the accuracy of the learned OQSL would be strongly affected by the residuals, namely, the differences between the true and predicted values. In addition, ρlearn may not be the actual optimal state in its neighborhood due to the existence of residuals. To further improve the methodology’s performance, we introduce the third step: calibration. In this step, a reasonable region around ρlearn in the state space is picked, and the dynamics of sufficiently many random states in this region are calculated rigorously. Then the minimum time to reach the target in this region (τopt) and the corresponding state (ρopt) are picked out. τopt is the final evaluated value of the OQSL in the methodology. Since the process of calibration is designed to reduce the influence of residuals, a general principle for a proper calibration region is that it should clearly show whether ρlearn is a local minimum point.
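The three steps can be illustrated on a toy problem where the answer is known. The sketch below is not the paper's implementation: a one-nearest-neighbour lookup stands in for the neural networks, the model is a single qubit under the Hamiltonian ωσz/2 with pure states labelled by their polar angle α, and all names, sample sizes and the seed are assumptions made for illustration. For this model the OQSL is Θ/ω, attained in the xy plane.

```python
import math
import random

OMEGA, THETA = 1.0, math.pi / 2

def exact_time(alpha):
    """Rigorous time for a pure state at polar angle alpha to reach Bloch angle THETA
    under H = OMEGA*sigma_z/2, or None if the target is never reached."""
    s2 = math.sin(alpha) ** 2
    if s2 < 1e-12:
        return None
    c = (math.cos(THETA) - math.cos(alpha) ** 2) / s2
    if c < -1.0:
        return None
    return math.acos(min(1.0, c)) / OMEGA

def nearest(train, alpha):
    """One-nearest-neighbour prediction; a simple stand-in for a trained network."""
    return min(train, key=lambda item: abs(item[0] - alpha))[1]

rng = random.Random(0)

def draw(k):
    return [rng.uniform(0.0, math.pi) for _ in range(k)]

# Step 1, classification: learn which states are reachable, then build S_learn.
clf_train = [(a, exact_time(a) is not None) for a in draw(400)]
s_learn = [a for a in draw(2000) if nearest(clf_train, a)]

# Step 2, regression: learn the time to reach the target on a subset of S_learn,
# dropping states that turn out to be unreachable, then minimize over S_learn.
reg_train = [(a, exact_time(a)) for a in s_learn[:400] if exact_time(a) is not None]
tau_learn, alpha_learn = min((nearest(reg_train, a), a) for a in s_learn)

# Step 3, calibration: rigorous dynamics for random states in a window around alpha_learn.
window = [alpha_learn + rng.uniform(-0.1, 0.1) for _ in range(2000)]
tau_opt, alpha_opt = min(
    (exact_time(a), a) for a in window if exact_time(a) is not None
)
```

In this toy setting the rigorous dynamics is cheap, so the surrogate brings no speed-up; the point is only to show how the outputs of one step feed the next and how the calibration window removes the residual left by the regression.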

To verify the validity of the CRC methodology, we apply it to the Landau-Zener model, where the reachable state set and OQSL have been thoroughly discussed via brute-force search among about one million states21, and thus the methodology’s performance is easy to test. The Hamiltonian for the Landau-Zener model is H = Δσx + vtσz with Δ and v two time-independent parameters. In the step of classification, three training sets with different numbers of data are used to train the network and about one million states are used as the test set. The scores (correctness of prediction) are no less than 99.59%, 97.83%, and 98.00% for all training sets in the cases of Δ = 0, 1, and 2, respectively. In the step of regression, the mean square errors of learning are on the scale of \(10^{-5}\) for Δ = 0, 2, and no larger than \(1.22\times 10^{-4}\) for Δ = 1. In the last step, the region for calibration is chosen as [αlearn − 0.1, αlearn + 0.1] and [ϕlearn − 0.1, ϕlearn + 0.1], where αlearn and ϕlearn are the spherical coordinates of ρlearn, i.e., \(\cos ({\alpha }_{{{{\rm{learn}}}}})={{{\rm{Tr}}}}({\rho }_{{{{\rm{learn}}}}}{\sigma }_{z})\) and \(\cos ({\phi }_{{{{\rm{learn}}}}})={{{\rm{Tr}}}}({\rho }_{{{{\rm{learn}}}}}{\sigma }_{x})/\sin ({\alpha }_{{{{\rm{learn}}}}})\). The results of calibration show that in this case ρlearn is just ρopt for all values of Δ, and the corresponding τopt coincides with the exact OQSL obtained from the brute-force search. The validity of the CRC methodology is thus verified.

One advantage of the CRC methodology is that it can deal with controlled dynamics, where the brute-force-search evaluation is usually difficult to realize due to the complexity of the twofold optimization. The CRC methodology can also deal with noisy scenarios, where the rigorous dynamics is usually more time-consuming than its unitary counterpart. Let us still consider the Landau-Zener model with the time-varying control Hamiltonian \(\overrightarrow{u}(t)\cdot \overrightarrow{\sigma }\). Here \(\overrightarrow{u}=({u}_{x}(t),{u}_{y}(t),{u}_{z}(t))\) is the vector of control amplitudes and \(\overrightarrow{\sigma }=({\sigma }_{x},{\sigma }_{y},{\sigma }_{z})\) is the vector of Pauli matrices. All control amplitudes are assumed to be in the range \([-\sqrt{v},\sqrt{v}]\). Both the noiseless and noisy scenarios are studied. In the noisy scenario, the dynamics is governed by the master equation ∂tρ = − i[H, ρ] + γ(σzρσz − ρ) with γ the decay rate, which is taken as \(0.5\sqrt{v}\) as a demonstration. In this example, the evaluation of the OQSL for Δ = 0 via brute-force search among one million states on a daily-use computer is estimated to cost more than 830 days, which reduces to 30 days when the CRC methodology is applied (the actual computing time is less than these estimates since parallel computing is applied). The result of the CRC methodology shows that all states in the state space can fulfill the target Θ = π/2 under control in both noisy and noiseless cases. Furthermore, the OQSL is very robust to dephasing in both noncontrolled and controlled cases, as shown in Fig. 2. The controls can significantly reduce the OQSL when Δ is not very large. However, this improvement becomes limited with the increase of Δ. An interesting phenomenon is that regardless of the existence of noise and controls, the OQSL always converges to Θ/(2Δ) as Δ increases, which is nothing but the OQSL for the Hamiltonian Δσx in the absence of noise21. This phenomenon is difficult to reveal with lower-bound-type QSLs, not only due to their dependence on both initial states and time, but also due to their poor attainability when controls are involved.
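The noisy dynamics used in this example can be sketched with a small fixed-step integrator. The code below integrates the master equation ∂tρ = − i[H, ρ] + γ(σzρσz − ρ) for the uncontrolled Landau-Zener Hamiltonian with a fourth-order Runge-Kutta scheme. It is an illustration only: the control term and the search for optimal controls are omitted, the integrator and step number are assumptions rather than the method used in the paper, and the helper names are hypothetical. As a sanity check, for H = 0 the off-diagonal element of ρ should decay as exp(−2γt).

```python
import math

SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combine(*terms):
    """Linear combination sum_k c_k * M_k of 2x2 matrices given as (c, M) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

def rhs(t, rho, delta, v, gamma):
    """-i[H, rho] + gamma*(sz rho sz - rho) with H = delta*sx + v*t*sz (no control term)."""
    h = combine((delta, SX), (v * t, SZ))
    comm = combine((1, matmul(h, rho)), (-1, matmul(rho, h)))
    deph = combine((1, matmul(matmul(SZ, rho), SZ)), (-1, rho))
    return combine((-1j, comm), (gamma, deph))

def evolve(rho, t_final, delta, v, gamma, steps=2000):
    """Fixed-step fourth-order Runge-Kutta integration from t = 0 to t_final."""
    dt = t_final / steps
    for n in range(steps):
        t = n * dt
        k1 = rhs(t, rho, delta, v, gamma)
        k2 = rhs(t + dt / 2, combine((1, rho), (dt / 2, k1)), delta, v, gamma)
        k3 = rhs(t + dt / 2, combine((1, rho), (dt / 2, k2)), delta, v, gamma)
        k4 = rhs(t + dt, combine((1, rho), (dt, k3)), delta, v, gamma)
        rho = combine((1, rho), (dt / 6, k1), (dt / 3, k2), (dt / 3, k3), (dt / 6, k4))
    return rho

gamma = 0.5  # corresponds to 0.5*sqrt(v) with v = 1
rho_plus = [[0.5, 0.5], [0.5, 0.5]]  # eigenstate of sigma_x with eigenvalue +1

rho_deph = evolve(rho_plus, 1.0, delta=0.0, v=0.0, gamma=gamma)  # pure dephasing check
rho_lz = evolve(rho_plus, 1.0, delta=1.0, v=1.0, gamma=gamma)    # noisy Landau-Zener run
```

The Bloch vector of the evolved state, and hence the Bloch angle, can be read off from the entries of the returned matrix.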

Fig. 2: OQSL as a function of Δ in different cases.

The solid black line, red circles, blue squares, and yellow triangles represent the values of OQSL for noiseless dynamics, noisy dynamics, controlled noiseless dynamics, and controlled noisy dynamics, respectively. The cyan dotted line represents Θ/(2Δ). The target Θ = π/2.

Another example we studied is the transverse Ising model with a periodic external field. The Hamiltonian is \(H/J\,=\,-\,\mathop{\sum }\nolimits_{j = 1}^{n}{\sigma }_{j}^{z}{\sigma }_{j+1}^{z}\,-\,\mathop{\sum }\nolimits_{j = 1}^{n}g(t){\sigma }_{j}^{x}\) with \(g(t)\,=\,B\cos (\omega t)\). In the demonstration, the amplitude B is taken as 0.5 and the frequency ω/J = 1. Because of the enormous dimension of the state space (\(2^{n}\)), it is difficult to construct a training set that is general enough for the CRC methodology, especially when n is large. To feasibly apply the CRC methodology, we need to analyze the state structure first and reduce the state space for the study. A simple way to categorize the states is based on the number of nonzero entries in a certain basis, such as the basis \({\{\left\vert \uparrow \right\rangle ,\left\vert \downarrow \right\rangle \}}^{\otimes n}\) considered in the following, where \(\left\vert \uparrow \right\rangle\) (\(\left\vert \downarrow \right\rangle\)) is the eigenstate of σz with respect to the eigenvalue 1 (−1). Moreover, here we only consider the noiseless dynamics and take \({{{\mathcal{Q}}}}\) to be the set of pure states. The ratios of reachable states for the target Θ = π/2 in the categories of 2 (red pentagrams), 3 (green crosses), and 10 nonzero entries (blue triangles) are given in Fig. 3. The ratio in each category is obtained from 2000 random states. It can be seen that basically all states in each category can fulfill the target when n is large, which is reasonable as more target directions exist when the dimension is high. Moreover, the ratio increases with the number of nonzero entries. More interestingly, the ratio in each category basically fits the function \(1/(1+a{n}^{b}{e}^{-c{n}^{d}})\), and the parameters a, b, c, and d can be found in the Supplementary Information. The general behaviors of the ratio and the physical mechanism behind it are still open questions that require further investigation. The minimum time to reach the target for all states in each category is also investigated and the specific results are given in the Supplementary Information, which indicate that in this example we only need to focus on the states with few nonzero entries for the study of OQSL.
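The categorization by the number of nonzero entries can be made concrete with a small sampler. The sketch below draws a random pure state with exactly k nonzero amplitudes in the σz product basis and also writes out the fitting function quoted above. The sampling distribution (a uniformly chosen support with complex Gaussian amplitudes), the bit-to-spin convention and all names are assumptions, since the text does not specify how the random states are generated; the fitted parameter values are in the Supplementary Information and are not reproduced here.

```python
import math
import random

def random_sparse_state(n, k, rng):
    """Random n-spin pure state with exactly k nonzero amplitudes in the sigma_z
    product basis, returned as a dictionary {basis_index: amplitude}."""
    support = rng.sample(range(2 ** n), k)  # which basis states carry weight
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(k)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return {idx: a / norm for idx, a in zip(support, amps)}

def basis_label(idx, n):
    """Spin configuration of a basis index; bit 0 -> 'u' (up), bit 1 -> 'd' (down) by assumption."""
    return format(idx, f"0{n}b").replace("0", "u").replace("1", "d")

def ratio_fit(n, a, b, c, d):
    """Fitting function 1/(1 + a*n^b*exp(-c*n^d)) for the ratio of reachable states."""
    return 1.0 / (1.0 + a * n ** b * math.exp(-c * n ** d))

rng = random.Random(1)
state = random_sparse_state(10, 2, rng)  # category with 2 nonzero entries, n = 10
labels = [basis_label(idx, 10) for idx in state]
```

Working with such a sparse dictionary avoids storing all 2^n amplitudes when only the support and the k amplitudes are needed to define a training sample.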

Fig. 3: Ratio of states that can fulfill the target Θ = π/2 in different categories.

The red pentagrams, green crosses, and blue triangles represent the ratios for the states with 2, 3, and 10 nonzero entries. The dash-dotted red, dotted green, and dashed blue lines represent the corresponding fitting functions.

Next we apply the CRC methodology in the case of n = 10. The methodology is applied to the categories of states with 2 to 5 nonzero entries. Here we present the result in the category of 2 nonzero entries, and the others are given in the Supplementary Information. 22500 and 7500 states and the corresponding labels are used as the training and test sets for the classification. The best score of the trained network we obtained is 94.55%. Then about one million states are input into this network, and the result shows that 7.71% of the states can fulfill the target, close to the result (5.15%) obtained from 2000 random states. In the regression process, 22500 and 7500 states constitute the training and test sets. The best mean square error is \(8.95\times 10^{-4}\) and the corresponding τlearn is 0.24, close to the true evolution time (0.19) of ρlearn. About 10000 states in the neighborhood of ρlearn are used in the calibration and the final result is 0.18. Combining the results of the other three categories, the final value of the OQSL obtained from the CRC methodology is 0.18, which can be realized by certain states with 2 nonzero entries.

Methods

Training sets and neural networks in the CRC methodology

For both the controlled Landau-Zener model and the transverse Ising model, 22500 and 7500 datasets are generated for training and testing, respectively, in the classification and regression processes. Each dataset is composed of an initial state and the corresponding time to reach the given target. In these datasets, the initial states are generated randomly and the time is solved via rigorous dynamics. In the case of the controlled Landau-Zener model, the optimal control is obtained via automatic differentiation. In the case of the transverse Ising model, the initial states are expressed as matrix product states, implemented with the Julia package ITensors57, and the time for reaching the given target is calculated with the time-evolving block decimation technique. In the process of calibration, 10000 datasets are generated in a reasonable neighborhood of \({\rho }_{{\rm{learn}}}\).

The Python package sklearn58 is used in this paper to build and train the neural networks for the classification and regression processes. In the cases of the noncontrolled and controlled Landau-Zener models, the neural network has 5 to 6 layers, and each layer contains about 250 neurons. The hyperbolic tangent function and the rectified linear unit function are chosen as the activation functions in the classification and regression, respectively. In the case of the transverse Ising model, the neural networks in the classification for the states with 2, 3, 4, and 5 nonzero entries all use the hyperbolic tangent function as the activation function. In the regression, the activation function of the neural networks is the rectified linear unit function for the states with 2 nonzero entries, the logistic function for those with 3 nonzero entries, and the identity function for those with 4 and 5 nonzero entries.
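A network of this kind can be set up in sklearn along the following lines. This is a minimal sketch under stated assumptions, not the configuration used in the paper: the data are synthetic placeholders with no physical meaning, the layer width is reduced to 32 so that the example runs quickly, the reported "5 to 6 layers" are interpreted as hidden layers, and the values of `alpha` and `max_iter` are illustrative. The 75%/25% split reproduces the proportion of the 22500/7500 training and test sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier, MLPRegressor

rng = np.random.default_rng(0)

# Placeholder data: 400 feature vectors standing in for encoded initial states.
X = rng.normal(size=(400, 4))
reachable = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic labels, not physical
times = np.abs(X[:, 2]) + 0.1                     # synthetic times, not physical

# 75% / 25% split, the same proportion as 22500 / 7500.
X_tr, X_te, y_tr, y_te, t_tr, t_te = train_test_split(
    X, reachable, times, test_size=0.25, random_state=0)

layers = (32,) * 5  # reduced width; the paper reports about 250 neurons per layer

# Classification: hyperbolic tangent activation, cross-entropy loss with L2 penalty.
clf = MLPClassifier(hidden_layer_sizes=layers, activation="tanh",
                    alpha=1e-4, max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

# Regression: rectified linear unit activation, squared-error loss with L2 penalty.
reg = MLPRegressor(hidden_layer_sizes=layers, activation="relu",
                   alpha=1e-4, max_iter=500, random_state=0)
reg.fit(X_tr, t_tr)

accuracy = clf.score(X_te, y_te)                 # mean accuracy on the test set
mse = np.mean((reg.predict(X_te) - t_te) ** 2)   # mean square error on the test set
```

The `activation` argument of both estimators accepts `"identity"`, `"logistic"`, `"tanh"`, and `"relu"`, which covers the four activation functions mentioned for the transverse Ising model.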

In the process of classification, the average cross-entropy loss function is used to train the neural networks, which is of the form

$$f(\hat{x},x,W)=-\frac{1}{m}\mathop{\sum }\limits_{i=0}^{m}\left[{x}_{i}\ln {\hat{x}}_{i}+(1-{x}_{i})\ln (1-{\hat{x}}_{i})\right]+\frac{\alpha }{2m}{\left\Vert W\right\Vert }_{2}^{2},$$
(4)

where x and \(\hat{x}\) represent the true results and the results predicted by the neural network, respectively, m is the number of datasets, W is the weight matrix of the neural network, and \(\alpha {\left\Vert W\right\Vert }_{2}^{2}=\alpha {\sum }_{ij}{W}_{ij}^{2}\) is the penalty term. In the regression, the loss function used in the training is the mean square error function,

$$f({t}_{{{{\rm{pre}}}}},{t}_{{{{\rm{ext}}}}},W)=\frac{1}{2m}\mathop{\sum }\limits_{i = 0}^{m}{\left[{t}_{{{{\rm{pre}}}}}^{(i)}-{t}_{{{{\rm{ext}}}}}^{(i)}\right]}^{2}+\frac{\alpha }{2m}{\left\Vert W\right\Vert }_{2}^{2},$$
(5)

where \({t}_{{{{\rm{pre}}}}}\) and \({t}_{{{{\rm{ext}}}}}\) are the time predicted by the regression neural network and the exact time obtained via rigorous dynamics, respectively. More details of the methods can be found in the Supplementary Information.
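The two loss functions in Eqs. (4) and (5) can be evaluated directly, as in the following minimal sketch. The function names are hypothetical, the penalty is assumed to sum the squared entries over a list of weight matrices (one per layer), and the clipping of the predictions is a numerical safeguard added here rather than part of Eq. (4).

```python
import numpy as np

def l2_penalty(weights, alpha, m):
    """Penalty term alpha / (2m) * sum of squared weights over all matrices."""
    return alpha / (2 * m) * sum(np.sum(np.asarray(W) ** 2) for W in weights)

def cross_entropy_loss(x_true, x_pred, weights, alpha):
    """Average cross-entropy with L2 penalty, following Eq. (4)."""
    x_true = np.asarray(x_true, dtype=float)
    # Clip predictions away from 0 and 1 to avoid log(0) (numerical safeguard).
    x_pred = np.clip(np.asarray(x_pred, dtype=float), 1e-12, 1 - 1e-12)
    m = x_true.size
    data_term = -np.mean(x_true * np.log(x_pred)
                         + (1 - x_true) * np.log(1 - x_pred))
    return data_term + l2_penalty(weights, alpha, m)

def mse_loss(t_pre, t_ext, weights, alpha):
    """Mean square error with L2 penalty, following Eq. (5)."""
    t_pre = np.asarray(t_pre, dtype=float)
    t_ext = np.asarray(t_ext, dtype=float)
    m = t_pre.size
    data_term = np.sum((t_pre - t_ext) ** 2) / (2 * m)
    return data_term + l2_penalty(weights, alpha, m)

# Two samples, one 1x2 weight matrix with squared norm 5.
W = [np.array([[1.0, 2.0]])]
ce = cross_entropy_loss([1, 0], [0.5, 0.5], weights=[], alpha=0.0)  # equals ln 2
mse = mse_loss([1.0, 3.0], [0.0, 1.0], weights=W, alpha=0.4)        # 1.25 + 0.5
```

In the usage example, the cross-entropy of two maximally uncertain predictions is ln 2, and the regression loss is the data term (1 + 4)/(2 × 2) = 1.25 plus the penalty 0.4 × 5/(2 × 2) = 0.5.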

When the Bloch angle is used to quantify the target, changing the value of Θ changes the reachable state set accordingly, which means that all the neural networks in the CRC methodology have to be retrained. How to train general neural networks that work for all target values is still a very challenging problem that requires further investigation.