Bounding the resources for thermalizing many-body localized systems

Understanding the conditions under which physical systems thermalize is one of the long-standing questions in many-body physics. While it has been observed that generic quantum systems do thermalize, rather little is known about the precise underlying mechanism. Furthermore, instances in which thermalization is hindered for many-body systems have been uncovered, now known as many-body localization, offering promising insights into the mechanisms that underlie thermalization. In this work, we derive upper and lower bounds on the size of a heat bath required to thermalize a many-body localized system, for a broad class of collision models. To obtain these bounds, we employ a recently developed tool from quantum information theory known as the convex split lemma. We apply our results to the disordered Heisenberg chain, which we study numerically, and we characterize the robustness of the MBL phase in this system for the family of thermalization processes considered, in terms of the required bath size. Part of the significance of this work stems from transferring tools from resource-theoretic quantum thermodynamics to the study of interacting quantum many-body systems.


I. INTRODUCTION
When pushed out of equilibrium, closed interacting quantum many-body systems generically relax to an equilibrium state that can be described using thermal ensembles that only depend on the energy of the initial state. While this behaviour is plausible from the perspective of quantum statistical mechanics [1][2][3][4][5], it is far from clear which local properties are responsible for the emergence of thermalization. The discovery of many-body localization (MBL) offers a fresh perspective into this question, as this effect occurs in interacting many-body systems preventing them from actually thermalizing [6,7]. Examples of systems that are non-thermalizing, like integrable systems, have been known before, but these have been fine-tuned such that small perturbations restore thermalization. Many-body localization is strikingly different in this respect, as its non-thermalizing behaviour appears to be robust to changes in the Hamiltonian [8]. A related open question, recently considered in several papers, is whether MBL is stable with respect to its own dynamics when small ergodic regions are present [9][10][11][12][13], or when the system is in contact with an actual external environment [14][15][16][17]. This is both a critical question for the experimental realization of systems exhibiting MBL properties, and for its fundamental implications on the process of thermalization in quantum systems.
The main focus in this work is the robustness of the MBL phase under instances of external dissipative processes. More specifically, we study how efficiently one may thermalize an MBL spin lattice system, by coupling it to a finite thermal bath. We allow a broad, yet physically realistic class of interaction models, that describe the interaction between system and bath in terms of energy-preserving stochastic collisions which take place between a lattice region and sub-regions of the bath. During these interactions, system and bath can be either weakly or strongly coupled. Our aim is then to thermalize an MBL system with the least amount of thermal resources possible, meaning with a bath of the smallest possible dimension.
It is key to the approach taken here -and one of the merits of this work -that in order to arrive at our results, we make use of tools from quantum information theory, tools that might at first seem somewhat alien to the problem at hand, but which turn out to provide a powerful machinery. We demonstrate this by using a technical result known as the convex split lemma [18,19], to derive quantitative bounds on the bath size required for a region of the spin lattice to thermalize. Given that MBL phases are challenging to study theoretically, and most known results are numerical in nature [20][21][22][23][24][25][26], our work provides a fresh approach in understanding such phases from an analytical perspective. The convex split lemma has originally been derived in the context of quantum Shannon theory, which is the study of compression and transmission rates of quantum information. Our main contribution is to connect this mathematical result to a class of thermodynamic models which can be used to describe thermalization processes in quantum systems. This gives rise to surprisingly stringent and strong results. Note, however, in the approach taken for thermalizing processes, it is assumed that systems thermalize close to exactly, a requirement that will be softened in future work.
As part of our results, we find that the max-relative entropy [27,28] and its smoothed version emerge as operationally significant measures, that quantify the robustness of the MBL phase in a spin lattice. The max-relative entropy is an element within a family of entropic measures that generalize the Rényi divergences [29] to the quantum setting. In order to illustrate the practical relevance of our results, we consider a specific system exhibiting the MBL phase, namely the disordered Heisenberg chain. Employing exact diagonalization, we numerically compute the value of the max-relative entropy as a function of the disorder and of the size of the lattice region that we are interested in thermalizing. Our findings suggest that the MBL phase is robust to thermalization despite being coupled to a finite external bath, under our collision models, indicating that such models allow for a conceptual understanding of the MBL phase stability. This extends the narrative of Refs. [14][15][16], which find that MBL is thermalized when coupled to an infinite sized bath. Moreover, our numerical simulations show that the max-relative entropy signals the transition from the ergodic to the MBL phase.

II. THERMALIZATION FRAMEWORK
We first set up some basic notation. Given some Hamiltonian H S of a system S with Hilbert space H S , we define the thermal state with respect to inverse temperature β = 1/(k B T ) as the quantum state τ β (H S ) := e −βH S Tr(e −βH S ) . ( In what follows, we model the process of thermalization of S with an external heat bath B. In particular, let H S and H B denote the Hamiltonian of S and B respectively. If S is initially in a state ρ, then, for fixed β and any > 0, we say that a global process E : where · 1 is the trace norm. Intuitively, this corresponds to the situation where the process E acts on the compound of the initial state of the system ρ and the bath state τ β (H B ), and brings the system state close to its thermal state τ β (H S ) while leaving the bath mostly invariant. We write if Eq.
(2) holds. It is worth noting that -thermalization requires the global system SB to be close to thermal after the channel E is applied. Monitoring the bath as well is necessary in order to avoid the possibility of "trivial" thermalization processes in which the non-thermal state ρ is simply swapped into the bath, which would merely move the excitation out of the considered region, rather than describing a physically realistic dissipation process. Thus, our notion of thermalization is different from previously considered ones, where for instance the sole system's evolution is considered. At the same time, and as mentioned before, it is a rather stringent measure, in that close to full global thermalization is required. Having introduced the basic notation and terminology, we now turn to the model used in this work. We consider a spin lattice V , where each site is described by a finite-dimensional Hilbert space H. The Hamiltonian of the system is composed of local operators, i. e., where x is labeling a specific subset of adjacent sites in the lattice and H x is the corresponding Hamiltonian operator whose support is limited to these sites. Within the lattice, we consider a local region R ⊆ V with Hilbert space H R . We are interested in the stability of the MBL phase with respect to stochastic collisions between the lattice region R and an external thermal bath B. In order to precisely re-cast this problem in terms of -thermalization, we first need to detail our choices for the initial state of the region R, the Hamiltonians H B and H R , and the class of maps E that model the thermalization process.
Throughout this work, we assume that the lattice has equilibrated (to a non-thermal state) before any coupling to B. In particular, if the initial state of the lattice is given by the state vector |ψ(0) , then the equilibrium state is given by the infinite time-average [30] ω := lim so that the state that describes region R at the time of its first interaction with the bath is ω R = Tr R c [ω], where we trace out the remaining of the lattice R c = V \R.
Regarding the choice of Hamiltonian H R , we initially disregard interactions between R and the rest of the lattice R c . As such, we consider the Hamiltonian H R = x:x⊆R H x , which includes only terms H x whose support is contained in R, and denote the corresponding thermal state τ β (H R ). We model the external thermal bath as a collection of n − 1 copies of the region R in thermal equilibrium. More formally, B is a system with Hilbert space H B = H ⊗n−1 R and Hamiltonian where the operator H (i) . . ⊗ I n−1 only acts non-trivially on the i-th subsystem of the bath. With this choice of Hamiltonian, the initial state on B is τ β (H B ) = τ β (H R ) ⊗n−1 . Such a choice for the bath is crucial to make our problem analytically tractable, but is also physically relevant for experimental setups, where it is possible to engineer one dimensional systems which are then coupled to a bath per site [31,32] or a mixture of a system and bath species that interact via contact interactions [33]. Moreover, we note that for the model of system-bath interactions that we introduce below, any state transition that can be realised on R with any heat bath Hamiltonian H B can also be realized, for some n, with a Hamiltonian of the form (6) [34].
We now turn to our model of the system-bath interactions, described via the following master equation, where ρ RB is the global state on R and B. This equation models a series of collisions, each described by a unitary operator U (k) acting non-trivially on the region R and a subset of bath components, occurring at a given rate τ −1 k > 0 in time according to a Poissonian distribution (see Appendix A for details). We consider elastic collisions, that is, we require that each unitary operator U [U (k) see Fig. 1 for a graphical illustration of the setup. Note that Eq. (7) is already in standard Lindblad form, and therefore describes a Markovian dynamical semi-group on RB. It is also important to note, however, that the process happening on the R < l a t e x i t s h a 1 _ b a s e 6 4 = " 1 y e U U s M V f k M W 2 D d x e 9 R o m 7 C g l h I = " > A A A C I X i c d V D L S s N A F J 3 4 r P X V 6 t J N s A i u S l I F X R b d u G z F P q A N Z T K 9 a Y f O J G H m R i i h X + B W v 8 C v c S f u x J 9 x 2 m Z h U z w w c D j n X u 6 Z 4 8 e C a 3 S c b 2 t j c 2 t 7 Z 7 e w V 9 w / O D w 6 L p V P 2 j p K F I M W i 0 S k u j 7 V I H g I L e Q o o B s r o N I X 0 P E n 9 3 O / 8 w x K 8 y h 8 w m k M n q S j k A e c U T R S 8 3 F Q q j h V Z w F 7 n b g Z q Z A M j U H Z K v a H E U s k h M g E 1 b r n O j F 6 K V X I m Y B Z s Z 9 o i C m b 0 B H 0 D A 2 p B O 2 l i 6 Q z + 8 I o Q z u I l H k h 2 g v 1 7 0 Z K p d Z T 6 Z t J S X G s 8 9 5 c / M / D s V y 5 n v q K T g B z m i 9 z E T G 4 9 V I e x g l C y J Y J g 0 T Y G N n z v u w h V 8 B Q T A 2 h T H H z S Z u N q a I M T a u m P T f f 1 T p p 1 6 r u V b X W v K 7 U 7 7 I e C + S M n J N L 4 p I b U i c P p E F a h B E g L + S V v F n v 1 o f 1 a X 0 t R z e s b O e U r M D 6 + Q W 3 g 6 P A < / l a t e x i t > V < l a t e x i t s h a 1 _ b a s e 6 4 = " D U M n 2 g + H / 8 l V g t e J d 9 G w k k J W B T Y = " > A A A C I X i c d V D L S s N A F J 3 U V 6 2 v V p d u g k V w V Z I q 6 L L o x m U L 9 g F t K J P p T T t 0 J g k z N 0 I J / Q K 3 + g V + j T t x J / 6 M 0 z Y L m + K B g c M 5 9 3 L P H D 8 W X K P j f F u F r e 2 d 3 b 3 i f u n g 8 O j 4 p F w 5 7 e g o U Q z a L B K R 6 v l U g + A h t J G j g F 6 s g E p f Q N e f P i z 8 7 j M o z a P w C W c x e J K O Q x 5 w R t F I r c 6 w X H V q z h L 2 J n E z U i U Z m s O K V R q M I p Z I C J E J q n X f d W L 0 U q q Q M w H z 0 i D R E F M 2 p W P o G x p S C d p L l 0 n n 9 q V R R n Y Q K f N C t J f q 3 4 2 U S q 1 n 0 j e T k u J E 5 7 2 F + J + H E 7 l 2 P f U V n Q L m N F / m I m J . Our framework is inspired by the resource theoretic framework of thermal operations [35,36], which have been used to investigate a wide variety of questions, such as the notion of work for microscopic systems [34,37], the quantum fluctuation theorems [38], the third law [38,39], and several other topics [40,41].
region R is in principle non-Markovian, since B is modelled here explicitly, and can be a very small heat bath, which retains memory of the system's initial state.
We are finally in a position to define our central measure of robustness to thermalization. Let E n denote the set of quantum channels [43] on RB that can be generated via the above collision process for a bath of size n. Each channel in this set is realized through a different choice of unitary operations {U (k) RB }, collisions rates {τ −1 k }, and final time t. Given the region initial state ω R , its Hamiltonian H R , and the bath Hamiltonian H B defined as in Eq. (6), we define n as the minimum integer n such that there exists an element in E n that -thermalizes R, The integer n quantifies the smallest size of a thermal environment required to thermalize region R under stochastic collisions, and hence provides a natural measure to quantify robustness of an MBL system against thermalization.

III. MAIN RESULTS
Our main results are upper and lower bounds on n that are essentially tight for a wide range of Hamiltonians H R . These bounds are stated using the (smooth) max-relative entropy, an entropic quantity that has received considerable attention in recent years in quantum information and communication theoretical research [34,44,45]. The max-relative entropy [27] between two quantum states ρ, σ ∈ S (H) such that supp(ρ) ⊆ supp(σ) is defined as while the smooth max-relative entropy between the same two quantum states, for some > 0, is defined as where B (ρ) is the ball of radius around the state ρ with respect to the distance induced by the trace norm.

A. Upper bound
We first present and discuss the upper bound, which can be easily stated in terms of the quantities just introduced.
Theorem 1 (Upper bound on the size of the bath). For a given Hamiltonian H R , inverse temperature β, and a constant > 0, we have that The above theorem provides a quantitative bound on the size of the thermal bath needed to -thermalize a lattice region R, when the coupling is mediated by stochastic collisions. For this specific dynamics, the region can be -thermalized if the size of the bath (the number of components) is proportional to the exponential of the max-relative entropy between the state of the region ω R and its thermal state τ β (H R ).
Theorem 1 is proven in Sec. II of the Supp. Mat. Here, we present a sketch of the proof in two steps. In the first step, we show that E n can be connected to so-called random unitary channels [46]. In the second step, we use this connection to find a particular channel in E n that achieves the upper bound of the above theorem. A central ingredient to the second step is a result known as convex split lemma [18,19].
Turning to the first step, recall that a random unitary channel is a map of the form where {p k } k is a probability distribution, and {U k } k is a set of unitary operators. For a given number n − 1 of bath subsystems, we define the class of energy-preserving random unitary channels R n as those random unitary channels on RB for which each unitary operator commutes with the Hamiltonian of the global system, i. e., In Sec. I of the Supp. Mat., we show that for any n ≥ 1, E n ⊆ R n , therefore allowing us to analyse any element of E n as a random unitary channel.
Turning to the second step, we design a stochastic collision model with a simple representation in terms of random unitary channels. Let us first recall that the thermal bath B is described by n − 1 copies of τ β (H R ), the Gibbs state of the Hamiltonian H R at inverse temperature β. The collisions occur either between the region R and one subsystem of the bath, or between two bath subsystems. The rate of collisions is uniform, and given by a τ −1 > 0. During a collision involving the i-th and j-th subsystems of RB, the states of the two colliding components are swapped, so that the interaction is described by the unitary operator U (i,j) swap [47]. For an initial global state ρ (in) where the a-thermality of the region has been "uniformly hidden" into the different components of the bath. The mapping from the initial state of region and bath to the steady state is achieved by the following random unitary channelĒ which uniformly swaps each of the bath subsystems with R.
Since all subsystems share the same Hamiltonian H R , it is easy to see that each one of the U (1,i) swap commutes with the joint Hamiltonian, so thatĒ n ∈ R n . Finally, we can invoke the convex split lemma [48], which allows us to show that, for any > 0, the channelĒ n can -thermalize the region R when the number of subsystems is n = 1 ) . The collision model presented here already encompasses a wide range of possible and physically realistic thermalization processes. However, before turning to a lower bound, we note that, due to the particularly simple nature of (15), one can use the above construction to upper bound the required size of a bath even for other thermalization models as well. For example, by noting that the channel (15) is permutationsymmetric (in the sense that it has permutation-invariant states as its fixed points), it follows that thermalization models with permutation-symmetric dynamics (i.e. those that allow for any permutation symmetric channel) are also subject to the above upper bound.

B. Lower bound and optimality results
We now turn to deriving a lower bound on n . This bound is obtained through a further assumption on the Hamiltonian H R , which we call the energy subspace condition (ESC).
Definition 2 (Energy subspace condition). Given a Hamiltonian H R , we say that it fulfills the ESC iff for any n ∈ N [49], given the set of energy levels {E k } d k=1 of the Hamiltonian H R , we have that for any vectors m, m ∈ N d with the same normalization factor, namely k m k = k m k = n, The above condition entails that energy levels cannot be exact integer multiples of one another. In generic interacting systems, non-degeneracy is enforced by level repulsion. Having exact integer multiple energy levels is a very fine-tuned situation that plausibly breaks as soon as randomness is introduced in the Hamiltonian [50]. It is also interesting to note that the ESC is always satisfied when R is a single qubit with a non-degenerate Hamiltonian. We can now state the following theorem, proved in Sec. III of the Supp. Mat., on the optimality of the channel associated with the convex split lemma.

Theorem 3 (Optimal thermalization processes).
If H R satisfies the ESC, then the channelĒ n in Eq. (15) provides the optimal thermalization process, that is, for any n ∈ N, Theorem 3 shows that, for Hamiltonians satisfying the ESC, the channelĒ n provides the optimal thermalization of R, that is, no other random energy-preserving channel acting on the same global system can achieve a smaller value of in Eq.
(2). This result additionally implies the following corollary, which provides our lower bound.
Corollary 4 (Lower bound to size of the bath). For a given β and > 0, and some Hamiltonian H R satisfying the ESC, we have Note that this lower bound is arising from the stringent model of thermalization of Eq. (2), and that less stringent models will potentially lead to smaller lower bounds. When H R does not satisfy the ESC, it is easy to find counterexamples to the optimality of the channelĒ, as we show in Sec. III of the Supp. Mat. The idea is that this channel is optimal only when it is able to produce a uniform distribution within each energy subspace of the global system, and this is possible if each subspace is fully characterized by a different frequency of single-system eigenvectors, which is exactly given by the ESC. Indeed, when the ESC is maximally violated, i. e., when the system Hamiltonian is completely degenerate, then no bath is required at all. See Sec. III for a discussion of this and its relation to known bounds from randomness extraction.

IV. THE DISORDERED HEISENBERG CHAIN
Our results from the previous section show that, for systems that satisfy the ESC, the max-relative entropy between the local state of a lattice region and its thermal state provides a natural measure for the robustness of that region to thermalization for a broad family of interactions. This includes manybody systems close to the transition between the ergodic and MBL phase, where both level repulsion and randomness effects favour a lack of exact degeneracies, so that it seems reasonable to expect for such systems to satisfy the ESC to sufficient order n. In this section, we use these results to study the robustness of the MBL phase to the thermal noise for a concrete system. Specifically, we consider the disordered Heisenberg chain, a one-dimensional spin-1 2 lattice system composed  For low values of disorder ∆ the max-relative entropy is almost constant as |R| increases, while for higher values of ∆ it scales linearly in |R|, hinting toward a robustness of the MBL phase with respect to the class of interaction models we are considering. Bottom. The slope of the max-relative entropy as a function of the disorder ∆ provides information on the phase transition. Indeed, we can see that this quantity abruptly increases in proximity of the expected phase transition from the ergodic to the MBL phase. The slope is obtained by a linear fit with error bars indicating least squares errors. The inset shows the derivative of the slope, with the grey lines indicate a possible transition region.
of L sites, governed by the Hamiltonian where σ x , σ y , σ z ∈ B C 2 are the Pauli operators, ∆ is the (dimensionless) disorder strength, and each parameter h i ∈ [−1, 1] is drawn uniformly at random. We employ periodic boundary conditions. It has been demonstrated both theoretically [51] and experimentally [7] that this system undergoes a localization transition above the critical disorder strength ∆ c ≈ 7. The transition manifests itself in a breakdown of conductance [7,51], and a slowdown of entanglement growth after a quench [23,24]. Moreover, a phenomenological model in terms of quasilocal constants of motions exists which provides an explanation for the non-thermal behavior of the system [52, 53].
In our simulation, we choose as initial state vector |Ψ(0) a variation of the Néel state with support on the total-magnetization sectors M = ±1, 0 [54]. For each random realization, we numerically compute the infinite time average of |Ψ as defined in Eq. (5), using exact diagonalization. We then trace out part of the lattice so as to obtain the state ω R , describing the infinite time averaged state reduced to the region R. Notice that in the ergodic phase, when the disorder strength ∆ < ∆ c , this state is expected to be close to thermal, with a temperature which depends on the energy of the initial state of the lattice. However, when the disorder strength ∆ passes its critical value, the state ω R is not thermal any more [7].
To numerically compute the max-relative entropy for this system, we use the Gibbs state of the reduced Hamiltonian H R , obtained from the Hamiltonian in Eq. (19) by only considering terms with full support on the region R. The inverse temperature β is obtained by constructing the global Gibbs state of the lattice and requiring its energy to equal to the one of the initial state vector |Ψ(0) . We compute D max (ω R τ β (H R )) for different disorder strengths ∆, and different sizes of the region R, see Fig 2. We find that in the ergodic phase the state is approximately thermal, and the max-relative entropy remains almost constant as |R| increases [55]. However, as ∆ approaches the critical value, we find that the max-relative entropy scales linearly in the region size |R|, with a linear coefficient which increases with the disorder strength, see the top panel of Fig. 2. As a result, the size of the external thermal bath n scales exponentially in the region size, due to the bounds we have obtained in the previous section. This exponential scaling in the size of the bath suggests a robustness of the MBL phase with respect to the dynamics given by Eq. (7), since the relative size of the bath n /|R| needs to diverge as |R| tends to infinity. In other words, for the MBL phase to be destroyed one needs, under the interaction models we consider, an exponentially vast amount of thermal noise. It is worth noting that our characterization of robustness of the MBL phase to thermal noise is distinctly different from others found in the literature [14][15][16][17]. Indeed, we couple the system with a finite-sized thermal bath, and we quantify the robustness in terms of its size. Furthermore, our notion of thermalization accounts for the evolution of both the system and bath, rather than focusing on the system only. Other works instead consider infinite thermal reservoirs and quantified the robustness as a function, for instance, of the coupling between system and environment. A promising experimental realization are recent optical lattice experiments [31][32][33]. However, to connect to our findings one would need full state tomography on both system and bath which so far is out of reach for these platforms.
We additionally study the first derivative of the max-relative entropy with respect to the region size |R|, as a function of the disorder strength, shown in the bottom panel of Fig. 2. We find that, during the ergodic phase, the derivative remains constant and small. As ∆ approaches the critical value, the derivative increases, and for ∆ ∆ c the derivative becomes constant again. Thus, we find that the derivative of the max-relative entropy with respect to the region size is an order parameter for the MBL phase transition. We then use this order parameter to estimate the critical value ∆ (L) c for the finite-length spin chain we are considering, obtaining a value of approximatively 4.5 for L = 15 sites. While the critical value for infinite-length spin chains is considered to be ∆ c ≈ 7, we find that our value, which we stress is obtained for a finite number of sites, seems to be in good accord with known results found in the literature using other measures [56, 57].

V. CONCLUSION AND OUTLOOK
We show that mathematical results originally developed to study quantum information processing may find their applications in many body physics as well, in particular for the study of MBL in this paper. We demonstrate this by applying the recently developed convex split lemma technique, to derive upper and lower bounds for the size of the external thermal bath required to thermalize an MBL system. The class of interaction models between the lattice and the thermal bath is described by the master equation (7), and consists in stochastic energy-preserving collisions between the system and bath components. The bounds we obtain depend on the max-relative entropy between the state we aim to thermalize and its thermal state.
We make use of these analytic results to study a specific and at the same time much ubiquitous system exhibiting MBL features, known as the disordered Heisenberg chain. We show that the MBL phase in this system is in fact robust with respect to the thermalization processes considered here, and that the derivative of the max-relative entropy with respect to region size serves as an approximate order parameter of the ergodic to MBL transition. We emphasize that this is not in contradiction with previous results, where a breakdown of localization was reported [14][15][16], as the size of the baths considered in these works was unbounded. Resource-theoretic frameworks offer another potentially useful approach for studying thermalization with infinite-dimensional baths; the framework of elementary thermal operations [58] which involves a single bosonic bath that is coupled only with two levels of the system of interest. One may then study the resources (the number of bosonic baths with different frequencies) required to achieve thermalization. Also, and more technically, it would be interesting to study the extent to which both the ESC condition and the requirement of exact commutation in our framework can be relaxed to only approximately hold true and how this in turn affects the lower bound of Corollary Lemma 18. These questions we leave to be studied in future work.
The success of our application implies that, potentially, other information theoretic tools could be employed to study the thermalization of MBL systems -and non-equilibrium dynamics of many-body systems in more generality, for that matter. For instance, results in randomness extraction [59] might be useful to provide new bounds. In randomness extraction, a weakly random source is converted into an approximately uniform distribution, with the use of a seed (a small, uniformly distributed auxiliary system). In analogy, thermalization requires a non-thermal state to be mapped into an almost thermal state, with the help of an external bath (the seed). Thus, it seems possible that results from randomness extraction might be modified to study this setting, and to obtain bounds on the thermal seed.
It has been shown that excited states of one-dimensional MBL systems are well-approximated by matrix product states (MPS) with a low bond dimension [60,61] if the system features an information mobility gap. These states have several interesting properties, and in particular they feature an area law for the entanglement entropy which is logarithmic in the bond dimension [62]. Since our result is based on a particular entropic quantity, it might be possible to use the properties of MPS to derive a fully analytical bound on the robustness of these systems with respect to thermal noise. It is the hope that our work stimulates further cross-fertilization between the fields of quantum thermodynamics and the study of quantum many-body systems out of equilibrium.

VI. ACKNOWLEDGMENTS
We would like to thank Alvaro Martin Alhambra, Johnnie Gray, Volkher Scholz, and Henrik Wilming for helpful discussions. CS is funded by EPSRC. NN is funded by the Alexander von Humboldt foundation. PB acknowledges funding from the Templeton Foundation. JE has been supported by the DFG (FOR 2724, CRC 183), the FQXi, and the Templeton Foundation. Below, we summarize the climate expenses of this work. We acknowledge the Freie Universität Berlin for covering the costs of the emissions offset. [55] For big enough sizes of the region, the max-relative entropy starts increasing even in the ergodic case. This is due to the finite size of the lattice in our simulation, and this effect can be mitigated by increasing the number of lattice sites (at the expenses of a higher computational cost). In this section, we consider a large class of dynamical processes involving n subsystems (for example, particles or molecules), where each of the subsystems has a given probability in time of interacting with another one (or with more than one at a time), and the interaction is fully general as long as it conserve the total energy. For simplicity, below we only consider two-body interactions, but extending the setting to k-body interactions is straightforward. Any such process can be described by the following master equation,

Numerical simulations
where ρ n (t) is the state of the n subsystems at time t, and λ i,j > 0 is the rate at which the interaction U i,j between the i-th and j-th subsystem occurs. Each interaction is energy preserving, that is, U i,j , H (n) = 0, where H (n) is the Hamiltonian of the global system. In the case in which only two-body interactions are present in the above equation, the total number of different unitaries is N = 1 2 n(n − 1). In the following, we re-label the two indices i, j with a single one k = i, j taking values between 1 and N .
At any given time t > 0, we show in Sec. A 1 that the solution of the above process has the form where the distribution is given by the product of N Poisson distributions, each with a different mean value λ k t. For a given m = (m 1 , m 2 , . . . , m N ) T such that m = N k=1 m k , the value of P t (m) is the probability that m interactions occurs during the time t, of which m k are described by the k-th unitary operator U k . On the other hand, the state ρ(m) is a uniform mixture of different states obtained from the initial state ρ n (t = 0), by considering all different sequences of events giving rise to the same m, where the unitary operator is and π is an element of the symmetric group S m which produces a different permutation of the m unitaries describing the two-body interactions. Furthermore, is the multinomial coefficient, which is precisely the number of different permutations. Notice that the unitary operator U m π commutes with the total Hamiltonian H (n) , since it is obtained from two-body interactions which commutes with H (n) . In the special case in which the two-body operators {U k } k commute with each other, the permutation π can be dropped, and in Eq. (A4) we can remove the weighted sum.
As a result, the evolution of the system at any finite time t ≥ 0 can be described by a convex mixture of energy-preserving unitary operations; specifically, for the dynamics given by the master equation A1, the channel which maps the initial state into ρ n (t) is given by We can formally define a class of channels which are associated with this dynamics. This very general set of maps is at the core of the thermalization results we derive in Sec. B.

Solution of a non-commuting Poisson process
Proposition 6 (Solution to non-commuting Poisson process). Consider a finite-dimensional Hilbert space H and the following master equation where ρ(t) ∈ S (H), and for all k the coefficient λ k > 0, and U k is a unitary operator acting on H. The solution of this master equation is where p (λ) t (m) is a Poisson distribution whose mean value isλ t, withλ = k λ k . The state ρ(m) is obtained from the initial state ρ(t = 0) by applying m times the channel Λ, i.e. ρ(m) = Λ (m) (ρ(t = 0)), where Proof. Let us first rearrange Eq. (A8) in such a way that, on the right-hand side, only one operator is acting on the state ρ(t), that is where the map Λ is defined in Eq. (A10). To show that Eq. (A9) is the solution of the above differential equation, first notice that the Poisson distribution p t (m) is the sole time-dependant object, since ρ(m) = Λ (m) (ρ 0 ) = Λ • . . . • Λ(ρ 0 ), and ρ 0 = ρ(t = 0). Thus, when taking the time derivative of ρ(t), we can exploit the fact that Then, the time derivative of the solution in Eq. (A9) is given by where in the third line we use the fact that Λ is linear and continuous.
A straightforward corollary of the above proposition concerns the relation between the maps in E n and the class of (energypreserving) random unitary channels [46], defined as follows.
Definition 7 (Energy preserving random unitary channels). For a given number of subsystems n ∈ N and a global Hamiltonian H (n) , we define the class of energy-preserving random unitary channels R n as composed by every maps of the form where {p k } k is a probability distribution, and each unitary operator U k preserves the energy, that is, U k , H (n) = 0.
The corollary of Prop. 6 is then the following.
Corollary 8 (E n as subsets of energy preserving random unitary channels). Given number of subsystems n ∈ N and a global Hamiltonian H (n) , the set of maps E n is a subset of the class of energy-preserving random unitary channels R n .

Solution of several independent Poisson processes
Notice that the solution of Eq. (A8) can be modified so that the single Poisson distribution in Eq. (A9) is replaced by a product of N Poisson distributions, each of them governing the number of a specific two-body interaction applied to the initial state at time t. To show this, let us first introduce some useful notation. For the i-th two-body process, denote λ i as the corresponding rate, and denote the tuple λ = (λ 1 , · · · , λ N ). Consider the state ρ(m) = Λ (m) (ρ 0 ), which is the state where a total of m such two-body interactions have happened. Let M m = {m ∈ N N | i m i = m} denote the set of N -dimensional tuples consisting of non-negative integers, such that the sum of all elements equals m. We can explicitly re-write ρ(m) as where N m := m m1,...,m N is the multinomial coefficient, is the product of the corresponding weights, and the state ρ(m) is a mixture over all possible combination of m unitaries, where each unitary U k appears m k times, where the unitary U m π has been defined in Eq. (A5). Let us now consider the corresponding Poisson distribution P (λ) t (m), with a mean valueλt. We want to see how this relates to the N independent Poisson processes with mean values λ. Noting that k m k = m and k λ k =λ, we can re-write this distribution as follows, by noting that each p (λ k ) t (m k ) is a Poisson distribution with mean value λ k t. If we now replace Eq. (A12) and Eq. (A15) into Eq. (A9), we see that the coefficients N m and C λ cancel out, and therefore, for a given time t > 0, In this section we prove the main results presented in the main text, namely the upper and lower bounds to the size of the external thermal bath required to thermalize a region of a many-body system. These bounds depend on the entropic quantity known as max-relative entropy [27], defined for two quantum states ρ, σ ∈ S (H) such that supp(ρ) ⊆ supp(σ) as The smooth max-relative entropy between the same two quantum states, for > 0, is defined as where B (ρ) is the ball of radius around the state ρ with respect to the distance induced by the trace norm. The main technical tool we use to derive our bounds is a result from quantum information theory known as the convex split lemma, first introduced and proved in Ref. [ Lemma 9 (Convex split lemma). Consider a finite-dimensional Hilbert space H and two states ρ, σ ∈ S (H) such that supp(ρ) ⊆ supp(σ). Then the state ρ (n) ∈ S (H ⊗n ) defined as is such that its trace distance to the n-copy i.i.d state σ ⊗n is upper-bounded as In the following, we consider a specific channel which, when acting on the n-subsystem state ρ ⊗ σ ⊗n−1 , is able to produce the state ρ (n) given in Eq. (B3) of the above lemma. This channel belongs to the set of random unitary channels acting on n subsystems, and is defined asĒ where U (i,j) swap denotes the unitary swap between the i-th and the j-th subsystems. We now specialise the setting to the one considered in the main text. We consider a region R described by the Hilbert space H R , with Hamiltonian H R and state ω R , and an external bath B composed by n − 1 subsystems at inverse temperature β. The Hilbert space of the bath is R only acts non-trivially on the i-th subsystem of the bath. The state of the bath is thermal, thus defined by τ β (H B ) = τ β (H R ) ⊗n−1 . We are interested in the process of thermalization of the region R by means of the collisional models described by the master equation of the form given in Eq. (A1). For this specific setting, we say that a channel E ∈ E n is able to -thermalize the region R if, that is, if the output of the channel is close, in trace distance, to the thermal state of region and bath. The quantity we seek to bound is n , that is, the minimum number of subsystems needed to -thermalize the region R, when the global dynamics is produced by a master equation of the form given in Eq. (A1), It is worth noting that, for the current setting, the channelĒ n defined in Eq. (B5) belongs to the class of maps E n . Indeed, this channel can be obtained from a master equation describing stochastic collisions occurring between the region R and each subsystem of the bath B, where the collision is described by a swap operator. Notice that, since the Hamiltonian of each subsystem is the same, the swap operator trivially conserves the energy.
We are now able, with the help of Lemma 9, to derive an upper bound on the quantity n . The upper bound is obtained by providing an explicit protocol able to -thermalize the system, and by computing the number of subsystems n needed for it.
Theorem 10 (Upper bound on n ). For a given Hamiltonian H R , inverse temperature β, and a constant > 0, we have that Proof. Let us consider the action of the channelĒ n , defined in Eq. (B5), on the initial state of the global system (region and bath) ω R ⊗ τ β (H R ) ⊗n−1 . It is easy to show that the final state of this channel is given bȳ which takes the same form of the state in the convex split lemma, see Eq. (B3). Then, it directly follows from Lemma 9 that For the above trace distance to be smaller than , that is, for the region to -thermalize, we need a number of subsystems which closes the proof.
Let us now derive a lower bound to the quantity n . To do so, we first need to introduce two lemmata; the first one is just a slight modification of Ref. [19,Fact 4 in Supp. Mat.], where we replace the quantum fidelity with the trace distance.
Proof. Under the hypotheses of this lemma, it was shown in Ref. [19,Fact 4 in Supp. Mat.] that where F (ρ, σ) = √ ρ √ σ 1 if the quantum fidelity between ρ and σ. It is known that the trace distance between two states is linked to the quantum fidelity by the following chain of inequalities, Therefore, we have that where the first inequality follows from the rhs of Eq. (B14), the equality from Eq. (B13), and the second inequality follows from the lhs of Eq. (B14).
We now recall and prove another result used in Ref. [19,Fact 6 in Supp. Mat.] that we use to lower bound the quantity n in the next theorem.
Proof. Since ρ AB is a classical-quantum state, there exists a probability distribution {p i } d i=1 , being d the dimension of the support of the classical part, and a set of states {ρ The reduced state on the quantum part of the system is ρ Then, it is easy to show that the operator is positive semi-definite, since it is composed by a positive mixture of states.
We are now in the position to derive a lower bound for the quantity n defined in Eq. (B7). Our proof is inspired by the one used in Ref. [19,Sec. 3.2 in Supp. Mat.] to derive a converse to the convex split lemma.
Theorem 13 (Lower bound on n ). For a given β and > 0, and a Hamiltonian H R satisfying the energy subspace condition (see Def. 16), we have Proof. For the sake of simplicity, in the following we refer to the initial state as ρ RB = ω R ⊗ τ β (H R ) ⊗n −1 , and to the target state as τ RB = τ β (H R ) ⊗n . For a fixed parameter > 0, letÊ ∈ E n be the (not necessarily unique) channel such that In Prop. 18 we show that, when the Hamiltonian H R satisfies the ESC, the channelĒ n minimizes the above trace distance over the set of channels E n , for any given n ∈ N. Thus, we have that for the specific number of subsystems n , the channelĒ n allows us to -thermalize the system, since We now make use of the above bound on the trace distance, and of the specific form of the channelĒ n to derive a lower bound on the quantity n . Let us first recall that the channel where the unitary operator U (1,i) swap ∈ B (H R ⊗ H B ) swaps the state of the first subsystem (the region R) with that of the i-th subsystem (belonging to the bath B). We can dilate this map by introducing the following unitary operation, which acts over the region, the bath, and an ancillary system A of dimension n . Then, for an ancillary system described by the state ρ A = n i=1 n −1 |i i| A , we can define the global stateρ RBA = V RBA (ρ RB ⊗ ρ A ) V † RBA . This is a classical-quantum state, and when the ancillary subsystem A is traced out it coincides withĒ n (ρ RB ).
From Lemma 11 it follows that there exists a classical-quantum extension of the target state τ RB , which we refer to as τ RBA (where A is the classical part of the state with dimension n ), such that ρ RBA − τ RBA 1 ≤ 2 √ . Furthermore, since τ RBA is classical-quantum, we have that the operator inequality τ RBA ≤ τ RB ⊗ I A holds, see Lemma 12. Using this operator inequality, the fact that ρ A = I A /n , and the definition of the max-relative entropy it follows that D max (τ RBA |τ RB ⊗ ρ A ) ≤ log n . By monotonicity of this measure with respect to CPTP maps, we have that Let us consider the first argument of the above max-relative entropy. By the monotonicity of the trace distance under partial trace, and its unitary invariance, we have where the last inequality follows from how we have defined τ RBA . Thus, the initial state of the region ω R is within a ball of radius 2 √ from the state in the first argument of the max-relative entropy in Eq. (B23). The state in the second argument instead can be explicitly computed, where the first equality follows from the definition of V RBA , while the second one from the fact that τ RB = τ β (H R ) ⊗n is invariant under permutation. As a result, we can replace Eq. (B23) with the following one, which closes the proof.
Let us briefly discuss on the two main assumptions made in Prop. 18, and the relevance of this result for MBL systems. First, notice that the two states we are comparing are ω R and τ β (H R ), both of them obtained through a partial trace over the whole spin chain with the exception of the region R. In particular, ω R is obtained by partial tracing the infinite time average of the spin chain, which is diagonal in the energy eigenbasis. As a result, no coherence on the energy basis is present in ω R . Likewise, the state τ β (H R ) is the Gibbs state of the Hamiltonian H R = Tr R c [H], which again is diagonal in the energy eigenbasis. Thus, the first assumption of the above theorem seems to be satisfied. The second assumption involves the energy gap of the region Hamiltonian. Due to the noise affecting the Hamiltonian of the spin chain, it seems a reasonable assumption to have non-degenerate energy gaps which could, at least up to a give n ∈ N, satisfy this condition. Thus, the theorem seems to apply to our setting, and therefore the optimal protocol for thermalizing a region of the spin chain using a bath of size n is given by the channelĒ n .
Remark 19 (Role of trivial Hamiltonians). For the case of trivial Hamiltonians, which clearly violates the ESC, the target thermal state is simply the maximally mixed state. Since we allow for the use of random unitary channels, thermalization in this case can already be achieved without further bath copies. Another way to put it is that the usage of bath copies and randomness coincide fully. The results in Ref. [66] imply that only an amount of log d of randomness (bits), are needed to perform this task, where d = dim H.

Limitation of the stochastic swapping collision model
We have seen that, while the channelĒ n can be proven to be optimal, necessary conditions on the Hamiltonian follow. When such conditions are dropped, we in fact know of cases where this channel is non-optimal (see Remark 19 for example), in the sense that there exists a much more efficient protocol using a smaller number of bath copies. The case in Remark 19 is somewhat less interesting since considering trivial Hamiltonians reduces the bath to a purely randomness resource, without any heat considerations whatsoever. In this section, we provide a counter example using non-trivial Hamiltonians.
One implication of the restriction given by the ESC of Def. 16 is that energy gaps of a single copy of the system cannot be degenerate. The following counter-example is constructed then as follows; let H (1) = contains two different types (as characterized by the tuple in Eq. (C5)). In particular, if we denote Π i,j = |i, j i, j| + |j, i j, i|, then the projector on energy subspace E t can be written as Π Et = Π 1,4 + Π 2,3 . Let τ (2) := τ ⊗2 denote the target state. We know that the total weight within this subspace is p 1 p 4 + p 2 p 3 + p 3 p 2 + p 4 p 1 = 4p 1 p 4 , where we know that since τ is a thermal state, Eq. (C9) implies that p 2 p 3 = p 1 p 4 . Therefore, within the E t energy subspace, we have which is a uniform distribution. What about the output state of the channelĒ 2 ? Assuming an initial state ρ = i q i |i i|, the stochastic swapping process produces an output state ρ (2) =Ē 2 (ρ) = 1 2 (ρ ⊗ τ + τ ⊗ ρ). Within the same energy subspace, the total weight is q 1 p 4 + q 2 p 3 + q 3 p 2 + q 4 p 1 , and the distribution is which is uniform across the subspace for each type. Fig. 4 shows the eigenvalues in this 4-dimensional subspace, compared to the distribution given by τ (2) . Note that as long as q 1 p 4 + q 4 p 1 2 > p 1 p 4 > q 2 p 3 + q 3 p 2 2 , or if q 1 p 4 + q 4 p 1 2 < p 1 p 4 < q 2 p 3 + q 3 p 2 2 , holds, one can always further decrease the trace distance by using another state ρ * that makes the entire subspace uniform. An example of this is when q 1 = 1, q 2 = q 3 = q 4 = 0, while assuming that p 1 ≤ 1 2 , so that p 1 p 4 ≤ p4 2 . Note that since p 1 is the prob (14) prob (41) prob (23) prob (32) FIG. 4. Comparison of eigenvalues of τ (2) and ρ (2) within the energy Et subspace. If each individual eigenvalues of ρ (2) are larger (or smaller) than τ (2) , then smoothening the subspace on ρ (2) does not affect the trace distance. However, if we have the situation in the diagram where some eigenvalues of ρ (2) are larger than that of τ (2) , while some are smaller, then one can further reduce the trace distance by smoothening out the subspace on ρ (2) .
Appendix D: Robustness of the numerical results with respect to the choice of thermal state In the main text, we study a localized region R in a spin lattice V whose Hamiltonian is H V = x H x , where x labels a specific subset of adjacent lattice sites, and the Hamiltonian operator H x has full support on those sites. When we consider the thermalization of the region R, we demand the state of the region to be close (in trace distance) to the Gibbs state of the reduced Hamiltonian H R = x:x⊆R H x , where only the operators H x with full support in R are considered. We refer to this state as τ β (H R ), where β is the inverse temperature. An alternative choice for the thermal state of the region is the reduced state obtained from the Gibbs state of the lattice by tracing out the whole lattice but R. We refer to this state as τ R = Tr R c [τ β (H V )], where τ β (H V ) is the Gibbs state of the lattice Hamiltonian. For high temperature states, it has been rigorously established that local reduced states of R converge in trace norm to the reduction of the infinite Gibbs state to R [64]; here we look at the same physical feature making use of a different quantity.
It is easy to see that, when no local operators in H V are partially supported on the region R, the two thermal states coincide. However, these states are different if there are operators {H x } whose support is shared between R and R c . For example, this is the case in the Hamiltonian of the disordered Heisenberg chain we consider in the main text. In this section, we investigate how the max-relative entropy changes when the final state of the thermalization process is τ R , for the disordered Heisenberg chain with L = 14 sites, see Fig. 5.
We find that, for any fixed size of the region |R| and any disorder strength ∆, the numerical value of D max (ω R τ R ) is smaller than the one of D max (ω R τ β (H R )). This fact should be expected, at least for low values of the disorder parameter ∆, since in such regime the system is ergodic and the reduced state ω R should approach the thermal state τ R . Critically, however, we find that the scaling of the two max-relative entropies as a function of the size of the region |R| is equivalent, thus hinting at a robustness of the MBL phases with respect to thermalization toward any of the two thermal states.
Finally, it is worth mentioning that our upper bound on the size of the thermal bath is valid for both choices of the final thermal state. Indeed, for the system to thermalize toward τ R , we only need to slightly modify the class of processes under consideration, so as to allow the energy of system and environment not to be conserved exactly. This is due to the fact that the specific process which provides our upper bound is composed by swap operations, which do not conserve the energy exactly when the bath is represented by copies of τ R . The situation is different for our lower bound, since to derive it we make use of the fact that the class of processes under consideration preserves the global energy exactly, see the proof of Thm. 13. It would be interesting to study whether the lower bound could be modified to accommodate for processes which only approximately conserve the energy of system and environment.  Fig. 2 in the main text, it is easy to see that the scaling of the two max-relative entropies is analogous, with an almost constant trend for low values of disorder, and a linear dependence in |R| for higher values of the disorder strength.