Introduction

Despite its phenomenological beginnings, thermodynamics has been inextricably linked throughout the past century with the abstract concept of information. Such connections have proven essential for solving paradoxes in a variety of thought experiments, notably including Maxwell’s demon1 and Loschmidt’s paradox2. This integration between classical thermodynamics and information is also one of the main motivating factors in extending the theory to the quantum realm, where information held by the observer plays a similarly fundamental role3.

This work is concerned with the transition from classical to quantum thermodynamics in the context of the Gibbs paradox4,5,6. This thought experiment considers two gases on either side of a box, separated by a partition and with equal volume and pressure on each side. If the gases are identical, then the box is already in thermal equilibrium, and nothing changes after removal of the partition. If the gases are distinct, then they mix and expand to fill the volume independently, approaching thermal equilibrium with a corresponding entropy increase. The (supposed) paradox can be summarised as follows: what if the gases differ in some unobservable or negligible way—should we ascribe an entropy increase to the mixing process or not? This question sits uncomfortably with the view that thermodynamical entropy is an objective physical quantity.

Various resolutions have been described, from phenomenological thermodynamics to statistical mechanics perspectives, and continue to be analysed6,7,8. A crucial insight by Jaynes9 assuages our discomfort at the observer-dependent nature of the entropy change. For an informed observer, who sees the difference between the gases, the entropy increase has physical significance in terms of the work extractable through the mixing process—in principle, they can build a device that couples to the two gases separately (for example, through a semi-permeable membrane) and thus let each gas do work on an external weight independently. An ignorant observer, who has no access to the distinguishing degree of freedom, has no device in their laboratory that can exploit the difference between the gases, and so cannot extract work. For Jaynes, there is no paradox as long as one considers the abilities of the experimenter—a viewpoint central to the present work.

A study of Gibbs mixing for identical quantum bosons or fermions is motivated by recognising that the laws of thermodynamics must be modified to account for quantum effects such as coherence10, which can lead to enhanced performance of thermal machines11,12,13. The thermodynamical implications of identical quantum particles have received renewed interest for applications such as Szilard engines14,15, thermodynamical cycles16,17 and energy transfer from boson bunching18. Moreover, the particular quantum properties of identical particles, including entanglement, can be valuable resources in quantum information processing tasks19,20,21.

In this work, we consider a toy model of an ideal gas with non-interacting quantum particles, distinguishing the two gases by a spin-like degree of freedom. We describe the mixing processes that can be performed by both informed and ignorant observers, taking into account their different levels of control, from which we can calculate the corresponding entropy changes and thus work extractable by each observer. For the informed observer, we recover the same results as obtained by classical statistical mechanics arguments. However, for the ignorant observer, there is a marked divergence from the classical case. Counter-intuitively, the ignorant observer can typically extract more work from distinguishable gases—even though they appear indistinguishable—than from truly identical gases. In the continuum and large particle number limit which classically recovers the ideal gas, this divergence is maximal: the ignorant observer can extract as much work from apparently indistinguishable gases as the informed observer. Our analysis hinges on the symmetry properties of quantum states under permutations of particles. For the ignorant observer, these properties lead to non-trivial restrictions on the possible work extraction processes. Viewed another way, the microstates of the system described by the ignorant observer are highly non-classical entangled states. This implies a fundamentally different way of counting microstates, and therefore computing entropies, from what is done classically or even in semi-classical treatments of quantum gases. Therefore we uncover a genuinely quantum thermodynamical effect in the Gibbs mixing scenario.

Results

Set-up

We consider a gas of N particles inside a box, such that each particle has a position degree of freedom, denoted x, and a second degree of freedom which distinguishes the gases. Since we only consider the case of two types of gases, this is a two-dimensional degree of freedom and we refer to this as the ‘spin’ s (although it need not be an actual angular momentum). Classically, the two spin labels are ↑, ↓, and their quantum analogues are orthogonal states \(\left|\uparrow \right\rangle ,\left|\downarrow \right\rangle\).

Following the traditional presentation of the Gibbs paradox, the protocol starts with two independent gases on different sides of a box: n on the left and m = N − n on the right (see Fig. 1). Each side is initially thermalised with an external heat bath B at temperature T.

Fig. 1: The Gibbs paradox.

Two distinct gases of n particles at the same temperature and pressure are separated by a partition. This partition is removed and the gases are allowed to mix and reach equilibrium. Two observers calculating the entropy increase during the process disagree depending on their ability to distinguish the particles. An informed observer, who can measure the difference between the gases, calculates \(2n\,{\mathrm{ln}}\,2\), while an observer ignorant of the difference records no entropy change. In this work, we ask how the situation changes when classical particles are replaced by identical quantum particles.

In our toy model, each side of the box consists of d/2 ‘cells’ (d is even) representing different states that can be occupied by each particle. These states are degenerate in energy, such that the Hamiltonian of the particles vanishes. This might seem like an unrealistic assumption; however, this model contains the purely combinatorial (or ‘state-counting’) statistical effects, first analysed by Boltzmann22, that are known to recover the entropy changes for a classical ideal gas8,23,24 using the principle of equal a priori probabilities. One could instead think of this setting as approximating a non-zero Hamiltonian in the high-temperature limit, such that each cell is equally likely to be occupied in a thermal state. Since the particle number is strictly fixed, we are working in the canonical ensemble (rather than the grand canonical ensemble).

Work extraction can be modelled in various ways in quantum thermodynamics. In the resource-theoretic approach based on thermal operations25,26, one keeps track of all resources by treating the system (here, the particles), heat bath and work reservoir (or battery) as interacting quantum systems. The work reservoir is an additional system with non-degenerate Hamiltonian whose energy changes are associated with work done by or on the system (generalising the classical idea of a weight being lifted and lowered).

The gases on either side of the box start in a state of local equilibrium and via mixing approach global equilibrium. We therefore consider the extractable work to be given by the difference in non-equilibrium free energy F (ref. 27) between initial and final states, where \(F(\rho )\,=\,{\langle E\rangle }_{\rho }\,-\,{k}_{B}T\,S(\rho )\), with \({\langle E\rangle }_{\rho }\,=\,{\rm{tr}}(\rho H)\) being the mean energy (zero in our case) and \(S(\rho )\,=\,-{\rm{tr}}(\rho\, {\mathrm{ln}}\,\rho )\) the von Neumann entropy in natural units. The extractable work in a process that takes ρ to \(\rho ^{\prime}\) is then

$$W\,\le\, F(\rho )\,-\,F(\rho ^{\prime} )\,=\,{k}_{B}T\left[S(\rho ^{\prime} )\,-\,S(\rho )\right].$$
( 1)

In a classical reversible process, the extractable work is equal to the change in free energies. This is generally an over-simplification for small systems, in which work can be defined in various ways28—e.g. required to be deterministic in the resource theory context25 or as a fluctuating random variable29,30, requiring consideration of other varieties of free energy. However, Eq. (1) will turn out to be sufficient for our purposes in the sense of mean extractable work. We find the inequality to be saturable using thermal operations and characterise fluctuations around the mean in the latter part of our results section.
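As a rough numerical illustration of Eq. (1), the following sketch evaluates the bound for states specified by their eigenvalues, assuming a vanishing Hamiltonian as in the toy model. The function names and the example states are illustrative choices, not part of the original analysis.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def von_neumann_entropy(eigenvalues):
    """Entropy S = -sum p ln p (natural units) from a list of eigenvalues."""
    return -sum(p * math.log(p) for p in eigenvalues if p > 0)

def work_bound(initial_eigs, final_eigs, temperature):
    """Right-hand side of Eq. (1), k_B T [S(rho') - S(rho)], in joules.

    Assumes zero mean energy in both states (degenerate Hamiltonian).
    """
    delta_s = von_neumann_entropy(final_eigs) - von_neumann_entropy(initial_eigs)
    return K_B * temperature * delta_s

# Hypothetical example: a uniform mixture over 4 states relaxing to a
# uniform mixture over 16 states at T = 300 K.
initial = [1 / 4] * 4
final = [1 / 16] * 16
print(work_bound(initial, final, temperature=300.0))
```

For uniformly mixed states of dimensions \(d_i\) and \(d_f\), the bound reduces to \(k_B T\,\mathrm{ln}(d_f/d_i)\).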

Our analysis compares the work extracted by two observers with different levels of knowledge: the informed observer, who can tell the difference between the two gases, and the ignorant observer, who cannot. The difference between these observers is that the former has access to the spin degree of freedom s, whereas the latter does not (summarised in Table 1).

Table 1 Summary of the observers’ abilities.

It is important to point out that, for the informed observer, the spin acts as a ‘passive’ degree of freedom, meaning that it can be measured but not actively changed. In other words, the two types of gases cannot be converted into each other. This assumption is always implicitly present in discussions of the Gibbs paradox—without it, the distinguishing degree of freedom would constitute another subsystem with its own entropy changes. One could also describe the spin as an information-bearing degree of freedom31. The question is whether the information encoded within the spin state has an impact upon the thermodynamics of mixing.

Classical case

Classically, the microstates described by the informed observer are specified by counting how many particles exist with each position x and spin s—since the particles are indistinguishable32. The ignorant observer has a different state space given by coarse-graining these states—the classical equivalent of ‘tracing out’ the spin degree of freedom. Thus the ignorant observer can extract only as much work from two different gases as from a single gas, recovering Jaynes’ original statement9. These intuitively obvious facts are shown by a formal construction of the state spaces in Supplementary Note 1. Paralleling our later quantum treatment, this establishes that the classical and quantum cases can be compared fairly.

The amount of extractable work in the classical case can be straightforwardly argued by state counting. Consider the gas initially on the left side—the number of ways of distributing n particles among d/2 cells is \(\left(\begin{array}{c}n\,+\,d/2\,-\,1\\ n\end{array}\right)\). In the thermal state, each configuration occurs with equal probability. Therefore the initial entropy, also including the gas on the right, is \({\mathrm{ln}}\,\left(\begin{array}{c}n\,+\,d/2\,-\,1\\ n\end{array}\right)\,+\,{\mathrm{ln}}\,\left(\begin{array}{c}m\,+\,d/2\,-\,1\\ m\end{array}\right)\). For distinguishable gases, each gas can deliver work independently, with an equal distribution over \(\left(\begin{array}{c}n\,+\,d\,-\,1\\ n\end{array}\right)\left(\begin{array}{c}m\,+\,d\,-\,1\\ m\end{array}\right)\) configurations. For indistinguishable gases, the final thermal state is described as an equal distribution over all ways of putting N = n + m particles into d cells, of which there are \(\left(\begin{array}{c}N\,+\,d\,-\,1\\ N\end{array}\right)\). Hence the entropy change in each case is

$${{\Delta }}S ={\mathrm{ln}}\,\left(\begin{array}{c}n\,+\,d\,-\,1\\ n\end{array}\right)\,+\,{\mathrm{ln}}\,\left(\begin{array}{c}m\,+\,d\,-\,1\\ m\end{array}\right)-{\mathrm{ln}}\,\left(\begin{array}{c}n\,+\,d/2\,-\,1\\ n\end{array}\right)\\ \quad-\,{\mathrm{ln}}\,\left(\begin{array}{c}m\,+\,d/2\,-\,1\\ m\end{array}\right)\quad (\,\text{distinguishable}\,),$$
(2)
$${{\Delta }}S ={\mathrm{ln}}\,\left(\begin{array}{c}N\,+\,d\,-\,1\\ N\end{array}\right)\,-\,{\mathrm{ln}}\,\left(\begin{array}{c}n\,+\,d/2\,-\,1\\ n\end{array}\right)\\ \quad \,-\,{\mathrm{ln}}\,\left(\begin{array}{c}m\,+\,d/2\,-\,1\\ m\end{array}\right)\quad (\,\text{indistinguishable}\,).\,\,\qquad \qquad$$
( 3)

Note that ΔS ≠ 0 even in the indistinguishable case, which may seem at odds intuitively with the result for an ideal gas. However, one can check that \({{\Delta }}S\,=\,O({\mathrm{ln}}\,N)\) in the limit of large d (whereby the box becomes a continuum) and large N. This is negligible compared with the ideal gas expression of \(N\,{\mathrm{ln}}\,2\) for distinguishable gases33 (See ref. 8, p. 43 for a more detailed discussion of this approximation). (Due to a subtle technicality with classical identical particles, formulas (2), (3) might be regarded as upper bounds to the true values—see Supplementary Note 1.) Note that a classical analogue of fermions can be made by importing the Pauli exclusion principle, so that two or more particles can never occupy the same cell. This has the effect of replacing the binomial coefficients of the form \(\left(\begin{array}{c}N\,+\,d\,-\,1\\ N\end{array}\right)\) in (2) and (3) by \(\left(\begin{array}{l}d\\ N\end{array}\right)\).
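The state-counting expressions (2) and (3) can be evaluated directly. The sketch below is a minimal illustration using bosonic-type counting; the function name and the sample parameters are assumptions made for this example.

```python
from math import comb, log

def delta_s_classical(n, m, d, distinguishable):
    """Entropy change on mixing from Eqs. (2) and (3), in natural units.

    n, m: particle numbers on the left and right; d: total number of cells.
    """
    if d % 2 or d < 2:
        raise ValueError("d must be a positive even integer")
    half = d // 2
    # Initial entropy: each gas confined to its own d/2 cells.
    initial = log(comb(n + half - 1, n)) + log(comb(m + half - 1, m))
    if distinguishable:
        # Each gas spreads independently over all d cells, Eq. (2).
        final = log(comb(n + d - 1, n)) + log(comb(m + d - 1, m))
    else:
        # All N = n + m particles counted together over d cells, Eq. (3).
        total = n + m
        final = log(comb(total + d - 1, total))
    return final - initial

# Compare the two cases for a dilute gas (sample values chosen for illustration).
n = m = 100
d = 10**6
print("distinguishable:  ", delta_s_classical(n, m, d, True))
print("indistinguishable:", delta_s_classical(n, m, d, False))
print("N ln 2:           ", (n + m) * log(2))
```

For n = m = 1 the distinguishable case gives exactly \(2\,\mathrm{ln}\,2\) for any even d, while the indistinguishable case gives \(\mathrm{ln}(1+1/d)+\mathrm{ln}\,2\).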

Quantum case

Compared with the classical case, we must be more explicit about the role of the spin s as a ‘passive’ degree of freedom for the informed observer. This observer may obtain information about the numbers of spin-↑ and spin-↓ particles. Thus they can engineer spin-dependent operations conditional on these numbers, but cannot change the number of each spin.

For identical gases, the result is of course the same as for the ignorant observer, and the classical case (3). For distinguishable gases, each gas behaves as an independent subsystem; thus, the entropy changes are the same as for classical distinguishable gases (2).

The remainder of this section is devoted to the ignorant observer, for which we find a departure from the classical case.

The peculiarities of the quantum case stem from a careful look at the Hilbert space structure. The Hilbert space of a single particle is a product \({{\mathcal{H}}}_{1}\,=\,{{\mathcal{H}}}_{x}\,\otimes\, {{\mathcal{H}}}_{s}\) of a part for the spatial degree of freedom x and a part for the spin s. Since there are d cell modes and two spin states, these parts have dimensions dim \({{\mathcal{H}}}_{x}\,=\,d\), dim \({{\mathcal{H}}}_{s}\,=\,2\). For N distinguishable particles, the state space would be \({{\mathcal{H}}}_{1}^{\otimes N}\). However, for bosons and fermions, which are quantum indistinguishable particles, states lie in the symmetric and antisymmetric subspaces, respectively (in first quantisation). This symmetry refers to the wavefunction under permutations of particles: for bosons, there is no change, whereas for fermions, each swap of a pair incurs a minus sign in the global phase. The physical Hilbert space of N particles can then be written as

$${{\mathcal{H}}}_{N}\,=\,{P}_{\pm }\left({{\mathcal{H}}}_{x}^{\otimes N}\,\otimes\, {{\mathcal{H}}}_{s}^{\otimes N}\right),$$
( 4)

where \({P}_{+(-)}\) is the projector onto the (anti-)symmetric subspace.

Since each particle carries a position and spin state, a permutation Π of particles is applied simultaneously to these two parts: Π acts on the above Hilbert space in the form \({{{\Pi }}}_{x}\,\otimes\, {{{\Pi }}}_{s}\). The requirement of an overall (anti-)symmetric wavefunction effectively couples these two degrees of freedom via their symmetries. For a familiar example, consider two particles. The spin state space can be broken down into the symmetric ‘triplet’ subspace spanned by \(\left|\uparrow \uparrow \right\rangle \,,\,\ \left|\downarrow \downarrow \right\rangle\) and \(\left|\uparrow \downarrow \right\rangle \,+\,\left|\downarrow \uparrow \right\rangle\), and the antisymmetric ‘singlet’ subspace consisting of \(\left|\uparrow \downarrow \right\rangle \,-\,\left|\downarrow \uparrow \right\rangle\). For bosons, overall symmetry requires that a triplet spin state be paired with a symmetric spatial wavefunction, and a singlet spin state with an antisymmetric spatial wavefunction. For fermions, opposite symmetries are paired.

With more particles, the description is more complex, but the main idea of paired symmetries remains the same. Following ref. 34, our main tool is Schur-Weyl duality35, which decomposes

$${{\mathcal{H}}}_{x}^{\otimes N} = \bigoplus _{\lambda }{{\mathcal{H}}}_{x}^{\lambda }\,\otimes\, {{\mathcal{K}}}_{x}^{\lambda },$$
( 5)

where λ runs over all Young diagrams of N boxes and no more than d rows (a Young diagram can be described simply by a non-increasing sequence of at most d positive integers summing to N). In technical terms, \({{\mathcal{H}}}_{x}^{\lambda }\) and \({{\mathcal{K}}}_{x}^{\lambda }\) carry irreducible representations of the unitary group U(d) and the permutation group \({S}_{N}\) of N particles, respectively. More concretely, a non-interacting unitary operation on the positions of all the particles, \({u}_{x}^{\otimes N}\), is represented in the decomposition (5) as an independent rotation within each of the \({{\mathcal{H}}}_{x}^{\lambda }\) spaces. The term ‘irreducible’ refers to the fact that each of these spaces may be fully explored by varying the unitary \({u}_{x}\). Similarly, a permutation of the particles in the spatial part of the wavefunction is represented by an action on each \({{\mathcal{K}}}_{x}^{\lambda }\) space. Thus each block labelled by λ in the decomposition (5) has a specific type of permutation symmetry.

The same decomposition works for the spin part \({{\mathcal{H}}}_{s}^{\otimes N}\). However, since this degree of freedom is two-dimensional, each λ is constrained to have no more than two rows. We can think of s as describing a total angular momentum formed of N spin-1/2 particles, and in fact λ can be replaced by a total angular momentum eigenvalue J varying over the range N/2, N/2 − 1,….

After putting the spatial and spin decompositions together, projecting onto the overall (anti-)symmetric subspace causes the symmetries of the two parts to be linked. For bosons, the λ label for x and s must be the same; for fermions, they are transposes of each other (i.e. related by interchanging rows and columns). This results in the form

$${{\mathcal{H}}}_{N} = \bigoplus _{\lambda }{{\mathcal{H}}}_{x}^{\lambda }\,\otimes\, {{\mathcal{H}}}_{s}^{\lambda }\quad \,\text{for} \;\text{bosons},$$
( 6)
$${{\mathcal{H}}}_{N} = \bigoplus _{\lambda }{{\mathcal{H}}}_{x}^{{\lambda }^{T}}\,\otimes\, {{\mathcal{H}}}_{s}^{\lambda }\quad \,\text{for} \;\text{fermions}.$$
( 7)

Instead of the label λ, from now on we use the angular momentum number J and generally write this decomposition as \({\bigoplus }_{J}{{\mathcal{H}}}_{x}^{J}\,\otimes\, {{\mathcal{H}}}_{s}^{J}\)—bearing in mind that \({{\mathcal{H}}}_{x}^{J}\) is different for bosons and fermions. In terms of the earlier N = 2 example, J = 1 corresponds to the spin triplet subspace, and J = 0 to the spin singlet.

Another way of describing the decomposition (6) is that it provides a convenient basis \({\left|J,q\right\rangle }_{x}{\left|J,M\right\rangle }_{s}{\left|{\phi }_{J}\right\rangle }_{xs}\), known as the Schur basis36. Here, \({\{{\left|J,q\right\rangle }_{x}\}}_{q}\) is a basis for \({{\mathcal{H}}}_{x}^{J}\) and \({\{{\left|J,M\right\rangle }_{s}\}}_{M}\) a basis for \({{\mathcal{H}}}_{s}^{J}\). M = −J, −J + 1, …, J can be interpreted as the total angular momentum quantum number along the z-axis. \({\left|{\phi }_{J}\right\rangle }_{xs}\,\in\, {{\mathcal{K}}}_{x}^{J}\,\otimes\, {{\mathcal{K}}}_{s}^{J}\) is a state shared between the x and s degrees of freedom.

We now consider how the state thermalises for the ignorant observer. Since the ignorant observer cannot interact with spin, their effective state space is described by tracing out the factor \({{\mathcal{H}}}_{s}\) for each particle. In terms of the decomposition (6) and corresponding basis described above, this means that an initial density matrix ρ, after tracing out s, is of the form

$${\rho }_{x}\,:=\, {{\rm{tr}}}_{s}\,\rho = \bigoplus _{J}{p}_{J}{\rho }_{x}^{J}\,\otimes\, {{\rm{tr}}}_{s}\,{\left|{\phi }_{J}\right\rangle \left\langle {\phi }_{J}\right|}_{xs},$$
( 8)

where \({\rho }_{x}^{J}\) is a density matrix on \({{\mathcal{H}}}_{x}^{J}\), occurring with probability \({p}_{J}\). Note that there is no coherence between different values of J, and that the components \({\rho }_{x}^{J}\) are mutually perfectly distinguishable by a measurement of their J.

Additionally, the allowed operations must preserve the bosonic or fermionic exchange symmetry. Any global unitary \({U}_{xBW}\), coupling the spatial degree of freedom of the particles to the heat bath and work reservoir, must therefore commute with permutations on the spatial part: \([{U}_{xBW},{{{\Pi }}}_{x}]\,=\,0\) for all Π. By Schur’s Lemma, such a unitary decomposes as \(U\,=\,{\bigoplus }_{J}{U}_{J}\,\otimes\, {I}_{J}\), where \({U}_{J}\) operates on the \({{\mathcal{H}}}_{x}^{J}\) component, with an identity \({I}_{J}\) on \({{\mathcal{K}}}_{x}^{J}\). Hence each J component is operated upon independently, the spin eigenvalue J being conserved.

In summary, therefore, the ignorant observer may engineer any thermal operation extracting work separately from each J component (depicted in Fig. 2). We can think of their operations being conditioned on the spatial symmetry type, and although J is observed to fluctuate randomly, a certain amount of work is extracted for each J (see the latter part of the results section for a more detailed analysis of this fluctuation). For each J, there exists a free operation within the thermal operations framework25 that performs deterministic work extraction saturating inequality (1). This is because the transformation is between (energy-degenerate) uniformly mixed states of differing dimension. Note that the work extraction process does not involve a measurement by the observer—only a coupling to the apparatus that depends on the value of J. Therefore there is no need to consider an additional entropic measurement cost, unlike the case of Maxwell’s demon1,37.

Fig. 2: Schematic of the quantum mixing process.

Two diagrams representing the mixing of indistinguishable (bosonic) quantum gases from the perspective of the informed (left) and ignorant (right) observers. Initially, n spin-↑ particles are found on the left and m spin-↓ on the right. The particles are then allowed to mix while coupling to an external heat bath and work reservoir. The informed observer describes microstates via the number of particles in each cell, and their respective spins. The ignorant observer cannot tell the spins’ states, but describes microstates (schematically depicted here by different colours) as superpositions of cell configurations, determined by the decomposition (6).

The question of optimal work extraction thus reduces to calculating the entropy of the initial state (8) and finding the maximum entropy final state. The fully thermalised final state seen by the ignorant observer is maximally mixed within each J block:

$${\rho }_{x}^{\prime} = \bigoplus _{J}{p}_{J}\frac{{I}_{x}^{J}}{{d}_{J}}\,\otimes\, {{\rm{tr}}}_{s}\left|{\phi }_{J}\right\rangle {\left\langle {\phi }_{J}\right|}_{xs},$$
( 9)

where \({I}_{x}^{J}\) is the identity on \({{\mathcal{H}}}_{x}^{J}\) and \({d}_{J}\) is the corresponding dimension.

The overall entropy change is the average over all J, found to be (with details in Supplementary Note 2):

$${{\Delta }}{S}_{\mathrm{igno}}=\sum _{J}{p}_{J}{{\Delta }}{S}_{{\mathrm{igno}}}^{J}$$
( 10)
$$\,=\,\sum _{J}{p}_{J}{\mathrm{ln}}\,{d}_{J}^{B}\,-\,{\mathrm{ln}}\,\left(\begin{array}{c}n\,+\,d/2\,-\,1\\ n\end{array}\right)\,-\,{\mathrm{ln}}\,\left(\begin{array}{c}m\,+\,d/2\,-\,1\\ m\end{array}\right)$$
( 11)

for bosons, and

$${{\Delta }}{S}_{\text{igno}}\,=\,\sum _{J}{p}_{J}{\mathrm{ln}}\,{d}_{J}^{F}\,-\,{\mathrm{ln}}\,\left(\begin{array}{c}d/2\\ n\end{array}\right)\,-\,{\mathrm{ln}}\,\left(\begin{array}{c}d/2\\ m\end{array}\right)$$
( 12)

for fermions. Expressions for the dimensions \({d}_{J}^{B,F}\) are found in Supplementary Note 4:

$${d}_{J}^{B}\,=\,\frac{(2J\,+\,1)\left(\frac{N}{2}\,-\,J\,+\,d\,-\,2\right)!\left(\frac{N}{2}\,+\,J\,+\,d\,-\,1\right)!}{\left(\frac{N}{2}\,-\,J\right)!\left(\frac{N}{2}\,+\,J\,+\,1\right)!(d\,-\,1)!(d\,-\,2)!},$$
( 13)
$${d}_{J}^{F}\,=\,\frac{(2J\,+\,1)d!(d\,+\,1)!}{\left(\frac{N}{2}\,+\,J\,+\,1\right)!\left(\frac{N}{2}\,-\,J\right)!\left(d\,-\,\frac{N}{2}\,+\,J\,+\,1\right)!\left(d\,-\,\frac{N}{2}\,-\,J\right)!}.$$
( 14)
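The dimension formulas (13) and (14) can be checked numerically. The sketch below parametrises J through the integer k = N/2 − J so that half-integer J is handled with integer arithmetic; this parametrisation and the function names are choices made for the example. It also assumes that \({d}_{J}^{F}\) is zero whenever a factorial argument in (14) would be negative. As a consistency check, weighting each block by the spin multiplicity 2J + 1 should reproduce the dimension of the full (anti-)symmetric space of N particles with 2d single-particle states.

```python
from math import comb, factorial

def dim_boson(N, d, k):
    """d_J^B from Eq. (13) with J = N/2 - k, for k = 0, ..., floor(N/2) and d >= 2."""
    num = (N - 2 * k + 1) * factorial(k + d - 2) * factorial(N - k + d - 1)
    den = (factorial(k) * factorial(N - k + 1)
           * factorial(d - 1) * factorial(d - 2))
    return num // den

def dim_fermion(N, d, k):
    """d_J^F from Eq. (14) with J = N/2 - k.

    Assumption: the block is empty (dimension 0) when d - N/2 - J < 0.
    """
    if d - N + k < 0:
        return 0
    num = (N - 2 * k + 1) * factorial(d) * factorial(d + 1)
    den = (factorial(N - k + 1) * factorial(k)
           * factorial(d - k + 1) * factorial(d - N + k))
    return num // den

def total_dimension(N, d, dim_fn):
    """Sum of (2J + 1) * d_J over all allowed J."""
    return sum((N - 2 * k + 1) * dim_fn(N, d, k) for k in range(N // 2 + 1))

N, d = 4, 6
print(total_dimension(N, d, dim_boson), comb(N + 2 * d - 1, N))  # symmetric space
print(total_dimension(N, d, dim_fermion), comb(2 * d, N))        # antisymmetric space
```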

The probabilities \({p}_{J}\) are found (see Supplementary Note 2) from the Clebsch-Gordan coefficients \(C({j}_{1},{m}_{1};{j}_{2},{m}_{2};J,M)\) describing the coupling of two spins with angular momentum quantum numbers \(({j}_{1},{m}_{1})\), \(({j}_{2},{m}_{2})\) into overall quantum numbers (J, M). Here, the two spins are the groups of particles on the left and right, respectively.

For identical gases, all particles have spins in the same direction, so the spin wavefunction is simply \({\left|\uparrow \right\rangle }^{\otimes N}\). This state lies fully in the subspace of maximal total spin eigenvalue, J = M = N/2, which is also fully symmetric with respect to permutations. Thus the spin part factorises out (i.e. there is no correlation between spin and spatial degrees of freedom). It is then clear that dimension counting reduces to the classical logic of counting ways to distribute particles between cells. Indeed, the dimension of the subspace \({{\mathcal{H}}}_{x}^{N/2}\) is \({d}_{N/2}^{B}\,=\,\left(\begin{array}{c}N\,+\,d\,-\,1\\ N\end{array}\right)\) for bosons and \({d}_{N/2}^{F}\,=\,\left(\begin{array}{c}d\\ N\end{array}\right)\) for fermions. It follows that we recover the same entropy change as in the classical case of indistinguishable particles (3).

For orthogonal spins, there are n spin-↑ and m spin-↓, leading to M = (n − m)/2 and a distribution over different values of J according to

$${p}_{J}\,=\,\frac{(2J\,+\,1)n!m!}{\left(\frac{N}{2}\,+\,J\,+\,1\right)!\left(\frac{N}{2}\,-\,J\right)!}.$$
( 15)
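The distribution (15) can be tabulated exactly with rational arithmetic. In this sketch, J runs from |n − m|/2 to N/2 in integer steps (the allowed range for M = (n − m)/2); the function name is a hypothetical choice for the example.

```python
from fractions import Fraction
from math import factorial

def spin_block_probabilities(n, m):
    """Return {J: p_J} from Eq. (15), with J and p_J as exact Fractions."""
    N = n + m
    probs = {}
    # k = N/2 - J runs from 0 (J = N/2) up to min(n, m) (J = |n - m|/2).
    for k in range(min(n, m) + 1):
        J = Fraction(N, 2) - k
        num = (N - 2 * k + 1) * factorial(n) * factorial(m)
        den = factorial(N - k + 1) * factorial(k)
        probs[J] = Fraction(num, den)
    return probs

for J, p in sorted(spin_block_probabilities(3, 2).items()):
    print(f"J = {J}: p_J = {p}")
print("normalisation:", sum(spin_block_probabilities(3, 2).values()))
```

The probabilities sum to one because \({p}_{J}\) can be written as a difference of neighbouring binomial coefficients divided by \(\left(\begin{array}{c}N\\ n\end{array}\right)\), so the sum telescopes.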

The resulting entropies and significant limits are discussed after an example.

Example

Taking n = m = 1 demonstrates the mechanism behind the state space decomposition. For two particles, there are only two values of J, corresponding to the familiar singlet and triplet subspaces:

$${{\mathcal{H}}}_{s}^{0}\, =\,{\rm{span}}\left\{\left|\uparrow \downarrow \right\rangle \,-\,\left|\downarrow \uparrow \right\rangle \right\},\\ {{\mathcal{H}}}_{s}^{1}\, =\,{\rm{span}}\left\{\left|\uparrow \uparrow \right\rangle \,,\,\left|\downarrow \downarrow \right\rangle \,,\,\left|\uparrow \downarrow \right\rangle \,+\,\left|\downarrow \uparrow \right\rangle \right\}.$$
( 16)

Consider a spatial configuration where a spin-↑ particle is on the left in cell i, and a spin-↓ is on the right in cell j. For bosons, the properly symmetrised wavefunction is

$$\left|{\psi }_{i,j}\right\rangle \,:=\, \frac{1}{\sqrt{2}}\left({\left|{i}_{L}{j}_{R}\right\rangle }_{x}{\left|\uparrow \downarrow \right\rangle }_{s}\,+\,{\left|{j}_{R}{i}_{L}\right\rangle }_{x}{\left|\downarrow \uparrow \right\rangle }_{s}\right)$$
( 17)
$$\, = \,\frac{1}{\sqrt{2}}\left[\frac{\left|{i}_{L}{j}_{R}\right\rangle \,-\,\left|{j}_{R}{i}_{L}\right\rangle }{\sqrt{2}}\cdot \frac{\left|\uparrow \downarrow \right\rangle \,-\,\left|\downarrow \uparrow \right\rangle }{\sqrt{2}}\right.\quad (J\,=\,0)\\ \quad\left.\,+\,\frac{\left|{i}_{L}{j}_{R}\right\rangle \,+\,\left|{j}_{R}{i}_{L}\right\rangle }{\sqrt{2}}\cdot \frac{\left|\uparrow \downarrow \right\rangle \,+\,\left|\downarrow \uparrow \right\rangle }{\sqrt{2}}\quad (J\,=\,1)\right].$$
( 18)

So \({p}_{0}\,=\,{p}_{1}\,=\,1/2\), and the spatial component of this state is conditionally pure for both J. The initial thermal state is a uniform mixture of all such \(|{\psi }_{i,\,j}\rangle\), with \({(d/2)}^{2}\) terms. Thus \(S({\rho }_{x}^{0})\,=\,S({\rho }_{x}^{1})\,=\,2({\mathrm{ln}}\,d\,-\,{\mathrm{ln}}\,2)\). For the final thermal state, we observe that

$${{\mathcal{H}}}_{x}^{0}\,=\,{\rm{span}}\left\{\left|ij\right\rangle \,-\,\left|ji\right\rangle | i\,<\,j\right\},$$
( 19)
$${{\mathcal{H}}}_{x}^{1}\,=\,{\rm{span}}\left\{\left|ij\right\rangle \,+\,\left|ji\right\rangle | i\,\le\, j\right\},$$
( 20)

where i, j now label cells either on the left or right. The corresponding dimensions are \({d}_{0}\,=\,d(d\,-\,1)/2\), \({d}_{1}\,=\,d(d\,+\,1)/2\). Within the J = 0 subspace, the entropy change is \({\mathrm{ln}}\,[d(d\,-\,1)/2]\,-\,2{\mathrm{ln}}\,d\,+\,2{\mathrm{ln}}\,2\,=\,{\mathrm{ln}}\,(1\,-\,1/d)\,+\,{\mathrm{ln}}\,2\), and for J = 1, it is \({\mathrm{ln}}\,[d(d\,+\,1)/2]\,-\,2{\mathrm{ln}}\,d\,+\,2{\mathrm{ln}}\,2\,=\,{\mathrm{ln}}\,(1\,+\,1/d)\,+\,{\mathrm{ln}}\,2\). Overall, therefore,

$${{\Delta }}{S}_{\text{igno}}\,=\,\frac{1}{2}{\mathrm{ln}}\,\left(1\,-\,\frac{1}{d}\right)\,+\,\frac{1}{2}{\mathrm{ln}}\,\left(1\,+\,\frac{1}{d}\right)\,+\,{\mathrm{ln}}\,2$$
( 21)
$$\,=\,\frac{1}{2}{\mathrm{ln}}\,\left(1\,-\,\frac{1}{{d}^{2}}\right)\,+\,{\mathrm{ln}}\,2.$$
( 22)

For the informed observer, we have \({{\Delta }}{S}_{\text{info}}\,=\,2{\mathrm{ln}}\,2\). For identical gases, we find \({{\Delta }}{S}_{\text{iden}}\,=\,{\mathrm{ln}}\,(1\,+\,1/d)\,+\,{\mathrm{ln}}\,2\), strictly greater than \({{\Delta }}{S}_{\text{igno}}\), but the two become equal in the limit d → ∞.

Repeating the same calculation with fermions, the symmetric and antisymmetric states now pair up oppositely. Then \({{\Delta }}{S}_{\text{igno}}\) is the same as for bosons. However, we have \({{\Delta }}{S}_{\text{iden}}\,=\,{\mathrm{ln}}\,(1\,-\,1/d)\,+\,{\mathrm{ln}}\,2\,<\,{{\Delta }}{S}_{\text{igno}}\). Unlike for bosons, two distinguishable fermions permit more extractable work by the ignorant observer than two identical fermions.
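The two-particle results can be cross-checked against the general expressions (11) to (15). The sketch below is a minimal illustration with hypothetical function names; it again writes J = N/2 − k and assumes d is even with enough cells on each side to hold the fermions.

```python
from math import comb, factorial, log

def dim_x(N, d, k, fermions):
    """Dimension of the spatial block with J = N/2 - k, Eqs. (13) and (14)."""
    if fermions:
        num = (N - 2 * k + 1) * factorial(d) * factorial(d + 1)
        den = (factorial(N - k + 1) * factorial(k)
               * factorial(d - k + 1) * factorial(d - N + k))
    else:
        num = (N - 2 * k + 1) * factorial(k + d - 2) * factorial(N - k + d - 1)
        den = (factorial(k) * factorial(N - k + 1)
               * factorial(d - 1) * factorial(d - 2))
    return num // den

def delta_s_igno(n, m, d, fermions=False):
    """Entropy change for the ignorant observer, Eq. (11) or Eq. (12)."""
    half = d // 2
    if d % 2 or d < 2:
        raise ValueError("d must be a positive even integer")
    if fermions and half < max(n, m):
        raise ValueError("each side needs at least as many cells as fermions")
    N = n + m
    final = 0.0
    for k in range(min(n, m) + 1):
        # p_J from Eq. (15).
        p = ((N - 2 * k + 1) * factorial(n) * factorial(m)
             / (factorial(N - k + 1) * factorial(k)))
        final += p * log(dim_x(N, d, k, fermions))
    if fermions:
        initial = log(comb(half, n)) + log(comb(half, m))
    else:
        initial = log(comb(n + half - 1, n)) + log(comb(m + half - 1, m))
    return final - initial

d = 10
closed_form = 0.5 * log(1 - 1 / d**2) + log(2)  # Eq. (22)
print(delta_s_igno(1, 1, d), delta_s_igno(1, 1, d, fermions=True), closed_form)
```

For n = m = 1 both particle types should agree with Eq. (22), lying below the identical-boson value \(\mathrm{ln}(1+1/d)+\mathrm{ln}\,2\) and above the identical-fermion value \(\mathrm{ln}(1-1/d)+\mathrm{ln}\,2\).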

Entropy changes and limits

In Fig. 3 we plot both \({{\Delta }}{S}_{\text{info}}\) and \({{\Delta }}{S}_{\text{igno}}\) as a function of dimension for bosons and fermions. Below we analyse the special cases and limits which emerge from these expressions, summarised in Table 2.

Fig. 3: Entropy changes as a function of dimension.

Series of plots showing \({{\Delta }}{S}_{\text{info}}\), \({{\Delta }}{S}_{\text{igno}}\) against the total cell number d of the system. a, b Bosonic systems of particle number n = 4 and n = 24 respectively. c, d The same for fermionic systems. Note that we have taken the initial number of particles on either side of the box to be equal, n = m in all cases. For comparison, all four figures also display the classical changes in entropy for an informed/ignorant observer. The behaviour of the deficit between ΔS for an informed/ignorant observer of quantum particles agrees with the low density limit in Eq. (23), where we can see \({{\Delta }}{S}_{\text{info}}\) tending to the classical limit \(2n\,\mathrm{ln}\,(2)\) with \({{\Delta }}{S}_{\text{igno}}\) trailing behind by a deficit of \({n}^{2}/2{d}^{2}\,+\,H({\bf{p}})\). Additionally, by comparing the different plots, we can see the low-dimensional fermionic advantage where the change in entropy is even greater than the classical \(2n\,\mathrm{ln}\,(2)\) value.

Table 2 Summary of results.

With bosons, there are two special cases in which it is easily proven that distinguishable gases are less useful than indistinguishable ones for the ignorant observer. The first case is the example above, with n = m = 1. In addition, for d = 2, we have \({d}_{J}^{B}\,=\,2J\,+\,1\), so the largest subspace is that with maximal J = N/2. The largest entropy change is then obtained when \({p}_{N/2}\,=\,1\), which is satisfied precisely for indistinguishable gases.

For fermions, we see from Fig. 3 that the greatest work—for both observers—is obtained for small d. An intuitive explanation is that the Pauli exclusion principle causes the initial state to be constrained and thus have low entropy. For example, with the minimal dimension d = 2n = 2m, we have \({{\Delta }}{S}_{\text{info}}\,=\,2{\mathrm{ln}}\,\left(\begin{array}{c}2n\\ n\end{array}\right)\,\approx\, 4n\,{\mathrm{ln}}\,2\) to leading order when n is large. The ignorant observer can do almost as well: the state is entirely contained in the J = 0 subspace, with \({d}_{0}^{F}\,=\,\frac{(2n)!(2n\,+\,1)!}{{(n!)}^{2}(n\,+\,1){!}^{2}}\,=\,\frac{2n\,+\,1}{{(n\,+\,1)}^{2}}{\left(\begin{array}{c}2n\\ n\end{array}\right)}^{2},\) giving \({{\Delta }}{S}_{\text{igno}}\,\approx\, 4n\,{\mathrm{ln}}\,2\) for large n. This is twice as much as for the classical ideal gas.

The most interesting conclusion is reached in the limit of large \(d\,\gg\, n\), which we term the low density limit. For simplicity, we take n = m. We find

$${{\Delta }}{S}_{\text{igno}}\,\approx\, {{\Delta }}{S}_{\text{info}}\,-\,H({\bf{p}}),$$
(23)

where \(H({\bf{p}})\,=\,-{\sum }_{J}{p}_{J}{\mathrm{ln}}\,{p}_{J}\) is the Shannon entropy of the distribution \({p}_{J}\), and the lowest order correction is \(-{n}^{2}/2{d}^{2}\). Thus, as \(d\,\to\, \infty\), the ignorant observer can extract as much work as the informed one, minus an amount H(p). This gap is evident from the graphs in Fig. 3.

Now consider the limit \(d\,\gg\, n\,\gg\, 1\), with both low density and large particle number. Classically, this limit recovers ideal gas behaviour: the large dimension limit can be thought of as letting the box become a continuum. In Supplementary Note 6, we show that H(p) (which depends only on n, not d) behaves as

$$H({\bf{p}})\,\approx\, \frac{1}{2}{\mathrm{ln}}\,n\,+\,0.595...,$$
( 24)

with a correction going to zero as \(n\,\to\, \infty\). Recall that the entropy change for the informed observer is approximately \(2n\,{\mathrm{ln}}\,2\) in this limit. Therefore the deficit H(p), which is logarithmic, becomes negligible compared with \(2n\,{\mathrm{ln}}\,2\). Thus the ignorant observer can extract essentially as much work as the informed observer: \({{\Delta }}{S}_{\text{igno}}\,\approx\, {{\Delta }}{S}_{\text{info}}\,\approx\, 2n\,{\mathrm{ln}}\,2\). This result is remarkable because it shows a macroscopic departure from the classical case in this limit.
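The logarithmic growth of H(p) in Eq. (24) can be illustrated numerically. The sketch below assumes that, for n = m and orthogonal spins, \({p}_{J}\) is the squared Clebsch-Gordan coefficient for coupling a spin-n/2 state with maximal projection to one with minimal projection, for which a standard closed form is \({p}_{J}\,=\,(2J+1){(n!)}^{2}/[(n-J)!\,(n+J+1)!]\). This closed form is an assumption here, not quoted from the main text; the derivation used by the authors is in the Supplementary Notes.

```python
from math import exp, lgamma, log

def p_J(n):
    """ASSUMED closed form for n = m (squared Clebsch-Gordan coefficient):
    p_J = (2J+1) (n!)^2 / ((n-J)! (n+J+1)!), for J = 0..n.
    Evaluated through log-gamma to avoid overflow for large n."""
    return [exp(log(2 * J + 1) + 2 * lgamma(n + 1)
                - lgamma(n - J + 1) - lgamma(n + J + 2))
            for J in range(n + 1)]

def shannon(p):
    """Shannon entropy in nats, skipping entries that underflow to zero."""
    return -sum(x * log(x) for x in p if x > 0.0)

# Compare the exact entropy with the asymptotic form of Eq. (24).
for n in (1, 10, 100, 1000):
    print(n, shannon(p_J(n)), 0.5 * log(n) + 0.595)
```

For n = 1 the distribution is uniform over J = 0, 1, so H(p) = ln 2, and the two columns move closer together as n grows.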

How can we understand this low density limit? An important feature of the low density limit is that the final entropy becomes as large as it could possibly be: \({\rho }_{x}^{\prime}\) becomes maximally mixed over its whole state space. This is true for any N, not just large numbers. We now give an explanation of this phenomenon, which proceeds by counting the number of mutually orthogonal states which can be accessed by the ignorant observer.

The important point about the low density limit is that particles almost never sit on top of each other. That is, almost all states are such that precisely N cells are occupied, each with a single particle. More formally, the number of ways of putting N bosonic particles into d cells is \(\left(\begin{array}{c}N\,+\,d\,-\,1\\ N\end{array}\right)\,\approx\, \left(\begin{array}{c}d\\ N\end{array}\right)\) when d is large, where the approximation means the ratio of the two sides is close to unity. Let us refer to each of these \(\left(\begin{array}{c}d\\ N\end{array}\right)\) choices of (singly) occupied cells as a cell configuration. For each cell configuration, there are \(\left(\begin{array}{c}N\\ n\end{array}\right)\) spin configurations, i.e. ways of distributing the n spin-↑ and m spin-↓ particles. In classical physics, the ignorant observer cannot distinguish any of the spin configurations corresponding to a single cell configuration. In quantum mechanics, remarkably, there are precisely \(\left(\begin{array}{c}N\\ n\end{array}\right)\) states which can be fully distinguished by the ignorant observer, each being a superposition of different spin configurations.
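The statement that the ratio of the two binomial coefficients tends to unity is easy to check directly. A short sketch, with illustrative function names:

```python
from math import comb

def boson_states(N, d):
    """Number of ways of placing N bosons into d cells."""
    return comb(N + d - 1, N)

def singly_occupied_states(N, d):
    """Number of cell configurations with exactly N singly occupied cells."""
    return comb(d, N)

# The ratio approaches 1 as d grows at fixed N (the low density limit).
N = 4
for d in (10, 100, 10_000, 1_000_000):
    print(d, boson_states(N, d) / singly_occupied_states(N, d))
```

At fixed N, the ratio is a product of N factors of the form (d + N - 1 - k)/(d - k), each of which tends to 1 as d grows.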

Let us choose a single cell configuration—without loss of generality, let cells 1,…,N be occupied. The state of a spin configuration is denoted as a permutation of

$${\left|\uparrow \right\rangle }_{1}\ldots {\left|\uparrow \right\rangle }_{n}{\left|\downarrow \right\rangle }_{n\,+\,1}\ldots {\left|\downarrow \right\rangle }_{N}\,\in\, {({{\mathbb{C}}}^{2})}^{\otimes N},$$
( 25)

where each cell is treated as a qubit with basis states \(\left|\uparrow \right\rangle ,\left|\downarrow \right\rangle\) according to which type of spin occupies it. (Note that the subsystems being labelled here are the occupied cells, not particles.)

Again using Schur-Weyl duality, the state space of N qubits can be decomposed as

$${({{\mathbb{C}}}^{2})}^{\otimes N} = \bigoplus _{J}{{\mathcal{H}}}^{J}\,\otimes\, {{\mathcal{K}}}^{J}.$$
( 26)

Due to this decomposition, there is a natural basis \(\left|J,\,M,\,p\right\rangle\), where SU(2) spin rotations \({u}_{s}^{\otimes N}\) act on the M label (denoting the eigenvalue of the total z-direction spin), and permutations Π of the N cells act on the p label.

How do we represent the effective state seen by the ignorant observer? In the representation used here, this corresponds to twirling over the spin states, i.e. performing a Haar measure average over all spin rotations \({u}_{s}^{\otimes N}\)38. In the basis \(\left|J,\,M,\,p\right\rangle\), however, this is a straightforward matter of tracing out the \({{\mathcal{H}}}^{J}\) subspaces, since only these are acted on by the twirling operation. Thus the ignorant observer has access to states labelled as \(\left|J,\,p\right\rangle\).

How much information has been lost by tracing out \({{\mathcal{H}}}^{J}\)? In fact, none—the label M = (n − m)/2 is fixed. Therefore the experimenter can perfectly distinguish all the basis states \(\left|J,\,p\right\rangle\) —and there are just as many of these as there are spin configurations, namely \(\left(\begin{array}{c}N\\ n\end{array}\right)\).
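The counting claim can be verified explicitly. The sketch below assumes the standard multiplicity of total spin J among N qubits (the number of values of the p label), namely \(\left(\begin{array}{c}N\\ k\end{array}\right)-\left(\begin{array}{c}N\\ k-1\end{array}\right)\) with k = N/2 − J. This formula is a textbook result that is not stated in the main text, and the function names are illustrative.

```python
from math import comb

def multiplicity(N, two_J):
    """Number of p labels for total spin J = two_J / 2 among N qubits.
    ASSUMED standard formula: C(N, k) - C(N, k - 1) with k = N/2 - J."""
    k = (N - two_J) // 2
    return comb(N, k) - (comb(N, k - 1) if k >= 1 else 0)

def distinguishable_states(n, m):
    """Count basis states |J, p> compatible with the fixed M = (n - m)/2,
    i.e. sum the multiplicities over J = |M|, |M| + 1, ..., N/2."""
    N = n + m
    two_M = abs(n - m)
    return sum(multiplicity(N, two_J) for two_J in range(two_M, N + 1, 2))

# The count should equal the number of spin configurations C(N, n).
for n, m in ((1, 1), (2, 1), (3, 3), (5, 2)):
    print(n, m, distinguishable_states(n, m), comb(n + m, n))
```

The sum telescopes, which is why the two counts agree for every n and m.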

For example, take n = m = 1: the two spin configurations are \(\left|\uparrow \downarrow \right\rangle \,,\,\left|\downarrow \uparrow \right\rangle\), and for some pair of occupied cells, the two distinguishable states are

$$\left|J\,=\,1,\ M\,=\,0,\ p\,=\,0\right\rangle \,=\,\frac{1}{\sqrt{2}}\left(\left|\uparrow \downarrow \right\rangle \,+\,\left|\downarrow \uparrow \right\rangle \right),$$
( 27)
$$\left|J\,=\,0,\ M\,=\,0,\ p\,=\,0\right\rangle \,=\,\frac{1}{\sqrt{2}}\left(\left|\uparrow \downarrow \right\rangle \,-\,\left|\downarrow \uparrow \right\rangle \right).$$
( 28)

Since these are respectively in the triplet and singlet subspaces, they remain orthogonal even after twirling. They can be distinguished by mixing the cells at a balanced beam splitter: it is easy to show that the symmetric state ends up with a superposition of both particles in cell 1 and both in cell 2, while the antisymmetric state ends up with one particle on each side. Therefore, after this beam splitter, the two states can be distinguished by counting the total particle number in each cell.
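The beam splitter argument can be made concrete with a small calculation. The sketch below assumes bosonic particles (commuting creation operators) and one particular convention for the balanced beam splitter, acting identically on both spins; the dictionary representation and all names are illustrative choices rather than anything specified in the text.

```python
from collections import defaultdict
from itertools import product
from math import sqrt

# ASSUMED convention for a balanced, spin-independent beam splitter:
# creation operator of input cell c -> sum over output cells o of BS[c][o] * a_o^dagger
BS = {1: {1: 1 / sqrt(2), 2: 1 / sqrt(2)},
      2: {1: 1 / sqrt(2), 2: -1 / sqrt(2)}}

def beam_split(state):
    """Apply the beam splitter to a two-particle state stored as
    {((cell, spin), (cell, spin)): amplitude}. Creation operators are treated
    as commuting (bosons), so each monomial is stored in sorted order."""
    out = defaultdict(float)
    for ((c1, s1), (c2, s2)), amp in state.items():
        for o1, o2 in product((1, 2), repeat=2):
            key = tuple(sorted([(o1, s1), (o2, s2)]))
            out[key] += amp * BS[c1][o1] * BS[c2][o2]
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def occupation_patterns(state):
    """Set of (particles in cell 1, particles in cell 2) with nonzero amplitude."""
    return {tuple(sum(1 for cell, _ in key if cell == c) for c in (1, 2))
            for key in state}

up, dn = "up", "dn"
triplet = {((1, up), (2, dn)): 1 / sqrt(2), ((1, dn), (2, up)): 1 / sqrt(2)}
singlet = {((1, up), (2, dn)): 1 / sqrt(2), ((1, dn), (2, up)): -1 / sqrt(2)}

print(occupation_patterns(beam_split(triplet)))  # both particles in the same cell
print(occupation_patterns(beam_split(singlet)))  # one particle in each cell
```

Only total occupation numbers are read out at the end, so the measurement never refers to the spin label and is available to the ignorant observer.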

A slightly more complex example is with n = 2, m = 1. Then the distinguishable basis states for three occupied cells are

$$\left|J\,=\,\frac{3}{2},\ M\,=\,\frac{1}{2},\ p\,=\,0\right\rangle \,=\,\frac{1}{\sqrt{3}}\left(\left|\uparrow \uparrow \downarrow \right\rangle \,+\,\left|\uparrow \downarrow \uparrow \right\rangle \,+\,\left|\downarrow \uparrow \uparrow \right\rangle \right),$$
( 29)
$$\left|J\,=\,\frac{1}{2},\ M\,=\,\frac{1}{2},\ p\,=\,0\right\rangle \,=\,\frac{1}{\sqrt{2}}\left(\left|\downarrow \uparrow \uparrow \right\rangle \,-\,\left|\uparrow \downarrow \uparrow \right\rangle \right),$$
( 30)
$$\left|J=\frac{1}{2},\ M=\frac{1}{2},\ p=1\right\rangle =\sqrt{\frac{2}{3}}\left|\uparrow \uparrow \downarrow \right\rangle \,-\,\frac{1}{\sqrt{6}}\left(\left|\uparrow \downarrow \uparrow \right\rangle +\left|\downarrow \uparrow \uparrow \right\rangle \right).$$
( 31)
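As a consistency check on the three-cell basis, one can verify orthonormality and the total spin of each state. The sketch below takes the J = 1/2, p = 0 state with a relative minus sign between its two terms, as orthogonality to the symmetric state of Eq. (29) requires. It also uses the standard identity \({J}^{2}\,=\,N(4-N)/4\,+\,{\sum }_{i\,<\,j}{P}_{ij}\) for N spin-1/2 systems, where \({P}_{ij}\) swaps cells i and j. This identity and the string encoding of configurations are assumptions introduced for illustration.

```python
from itertools import combinations
from math import isclose, sqrt

# Spin configurations of cells 1..3 (u = up, d = down), with n = 2, m = 1.
basis = ["uud", "udu", "duu"]

states = {
    ("3/2", 0): {"uud": 1 / sqrt(3), "udu": 1 / sqrt(3), "duu": 1 / sqrt(3)},
    ("1/2", 0): {"duu": 1 / sqrt(2), "udu": -1 / sqrt(2)},
    ("1/2", 1): {"uud": sqrt(2 / 3), "udu": -1 / sqrt(6), "duu": -1 / sqrt(6)},
}

def inner(a, b):
    """Inner product of two real state vectors stored as dictionaries."""
    return sum(a.get(k, 0.0) * b.get(k, 0.0) for k in basis)

def swap(config, i, j):
    """Exchange the spins sitting in cells i and j."""
    chars = list(config)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def total_spin_squared(state, N=3):
    """Apply J^2 = N(4-N)/4 + sum_{i<j} P_ij (ASSUMED identity for spin-1/2)."""
    out = {k: N * (4 - N) / 4 * v for k, v in state.items()}
    for i, j in combinations(range(N), 2):
        for k, v in state.items():
            key = swap(k, i, j)
            out[key] = out.get(key, 0.0) + v
    return out

for a in states:
    print(a, [round(inner(states[a], states[b]), 12) for b in states])
```

Each row of the printout should be the corresponding row of the identity matrix, and applying `total_spin_squared` should rescale each state by J(J + 1).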

Observe that the argument in this section does not depend in any way on the exchange statistics of the particles, explaining why we see the same limit for bosons and fermions.

Quantumness of the protocol

The above discussion of the low density limit clarifies the fundamental reason why the quantum ignorant observer performs better than the classical one. The distinguishable states comprising the final thermalised state are superpositions of different spin configurations. We might describe a classical observer within the quantum setting as one who is limited to operations diagonal in the basis of cell configurations—that is, they are only able to count the number of particles occupying each cell. For such an observer, these superposition states are indistinguishable.

A crucial question is then: how difficult is it to engineer the quantum protocol for the ignorant observer? We can imagine that the heat bath and work reservoir might naturally couple to the system in the cell occupation basis (if this is the basis that emerges in the classical case). The required coupling is instead in the Schur basis \({\left|J,\,q\right\rangle }_{x}\), whose elements are generally highly entangled between cells. A sense of their complexity is given by the unitary that rotates the Schur basis to the computational basis, known as the Schur transform. Efficient algorithms to implement this transform have been found39, with a quantum circuit whose size is polynomial in \(N,\,d,\,{\mathrm{ln}}\,(1/\epsilon )\), allowing for error ϵ. This circuit is related to the quantum Fourier transform, an important subroutine in many quantum algorithms. Thus, while the Schur transform can be implemented efficiently, it appears that engineering the required work extraction protocol (in the absence of fortuitous symmetries in the physical systems being used) may be as complex as universal quantum computation.

Work fluctuations

The work extraction protocol we have presented is not deterministic: for each value of J, a different amount of work is extracted with probability pJ. This is typically expected of thermodynamics of small systems; however, in classical macroscopic thermodynamics, such fluctuations are negligible. We can ask whether the same is true of the work extracted by the ignorant observer in the quantum case, especially in the low density and large particle number limits.

One informative way of quantifying the fluctuations is via the variance of entropy change. Let us denote the entropy change for each J by \({{\Delta }}{S}_{\text{igno}}(J)\). The mean is \({{\Delta }}{S}_{\text{igno}}\,=\,{\sum }_{J}{p}_{J}{{\Delta }}{S}_{\text{igno}}(J)\), and the variance is \(V({{\Delta }}{S}_{\text{igno}})\,=\,{\sum }_{J}{p}_{J}{{\Delta }}{S}_{\text{igno}}{(J)}^{2}\,-\,{{\Delta }}{S}_{\,\text{igno}\,}^{2}\). This can be computed straightforwardly from our expressions for \({p}_{J}\) and \({d}_{J}\), and approximated in various limits.

Consider first a high density BEC-limit case with d = 2 and \(N\,=\,2n\,\gg\, 1\) bosons. We have \({d}_{J}^{B}\,=\,2J\,+\,1\), and using the techniques of Supplementary Note 6, \({p}_{J}\,\approx\, \frac{2J}{n}{e}^{-{J}^{2}/n}\). Then \({{\Delta }}{S}_{\text{igno}}\,=\,{\sum }_{J}{p}_{J}{\mathrm{ln}}\,(2J+1)\,\approx\, \frac{1}{2}{\mathrm{ln}}\,n+{\mathrm{ln}}\,2\,-\,\frac{\gamma }{2}\,\approx\, \frac{1}{2}{\mathrm{ln}}\,n+0.405\), where γ is the Euler-Mascheroni constant. Similarly, we compute \(V({{\Delta }}{S}_{\text{igno}})\,=\,{\sum }_{J}{p}_{J}{[{\mathrm{ln}}\,(2J\,+\,1)]}^{2}\,-\,{{\Delta }}{S}_{\,\text{igno}\,}^{2}\,\approx\, \frac{{\pi }^{2}}{24}\,\approx\, 0.411\). Therefore the mean work dominates its fluctuations (logarithmic versus a constant).
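The mean and variance in this d = 2 bosonic case can be evaluated numerically and compared with the asymptotic values. The sketch below assumes the exact distribution for n = m is the squared Clebsch-Gordan coefficient \({p}_{J}\,=\,(2J+1){(n!)}^{2}/[(n-J)!\,(n+J+1)!]\), a standard closed form that is not quoted in the main text, and takes the entropy change for each J to be ln(2J + 1). Function names are illustrative.

```python
from math import exp, lgamma, log, pi

def p_J(n):
    """ASSUMED closed form (squared Clebsch-Gordan coefficient) for n = m:
    p_J = (2J+1) (n!)^2 / ((n-J)! (n+J+1)!), for J = 0..n."""
    return [exp(log(2 * J + 1) + 2 * lgamma(n + 1)
                - lgamma(n - J + 1) - lgamma(n + J + 2))
            for J in range(n + 1)]

def mean_and_variance(n):
    """Mean and variance of ln(2J+1) under p_J (the d = 2 bosonic case)."""
    p = p_J(n)
    s = [log(2 * J + 1) for J in range(n + 1)]
    mean = sum(pj * sj for pj, sj in zip(p, s))
    second_moment = sum(pj * sj * sj for pj, sj in zip(p, s))
    return mean, second_moment - mean ** 2

# Compare with (1/2) ln n + 0.405 for the mean and pi^2 / 24 for the variance.
for n in (10, 100, 1000):
    mean, var = mean_and_variance(n)
    print(n, mean, 0.5 * log(n) + 0.405, var, pi ** 2 / 24)
```

The mean keeps growing with n while the variance levels off, which is the sense in which the fluctuations become negligible.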

Next, consider the closest analogue for fermions: the case of minimal dimension d = 2n = 2m. Recall that \({{\Delta }}{S}_{\text{igno}}\,\approx\, {{\Delta }}{S}_{\text{info}}\,\approx\, 4n\,{\mathrm{ln}}\,2\) for large n. Since p0 = 1, work extraction is in fact completely deterministic in this case.

Finally, take the low density limit. As found before, for both bosons and fermions, \({{\Delta }}{S}_{\text{igno}}\,\approx\, 2n\,{\mathrm{ln}}\,2\) —linear in n—and yet we still find a constant \(V({{\Delta }}{S}_{\text{igno}})\,\approx\, \frac{{\pi }^{2}}{24}\).

In these macroscopic limits, therefore, work extraction is either fully deterministic or effectively deterministic in that the fluctuations are negligible compared with the mean.

Non-orthogonal spins

The results generalise to the case of partially distinguishable spins—that is, initially with n in spin state \(\left|\uparrow \right\rangle\) on the left and m in state \(\left|\nearrow \right\rangle\) on the right, where

$$\left|\nearrow \right\rangle \,=\,\cos (\theta /2)\left|\uparrow \right\rangle \,+\,\sin (\theta /2)\left|\downarrow \right\rangle .$$
( 32)

For this, we must be more explicit about the operations permitted to the informed observer. The most general global unitary that does not affect the number of each type of spin is of the form \(U{ \,=\, \bigoplus }_{M}{U}_{xsBW}^{(M)}\), where the block structure refers to subspaces with fixed M as defined by the Schur basis (recalling that the total number of particles is fixed). We find (see Supplementary Note 3 for details) that \({{\Delta }}{S}_{\text{info}}\) is an average of entropy changes for each value of M. For \({{\Delta }}{S}_{\text{igno}}\), all that changes is the probability \({p}_{J}\), now being obtained by an average over Clebsch-Gordan coefficients. Importantly, for both observers, the result is a function of θ only via the probability distribution \({q}_{M}\) for the spin value M. In Fig. 4, one observes the smooth transition from identical to orthogonal spin states as θ varies from 0 to π.
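To illustrate how θ enters through the distribution of M, the following sketch assumes that each of the m right-hand spins in the state of Eq. (32) is independently found in \(\left|\downarrow \right\rangle\) with probability \({\sin }^{2}(\theta /2)\), so that the number of down spins is binomially distributed and M = N/2 − k for k down spins. This explicit form is an assumption made for illustration; the precise definition used in the paper is given in Supplementary Note 3.

```python
from math import comb, pi, sin

def q_M(n, m, theta):
    """ASSUMED distribution of the total spin projection M:
    k of the m right-hand spins are 'down' with binomial probability,
    and M = (n + m)/2 - k. Returns {M: probability}."""
    p_down = sin(theta / 2) ** 2
    return {(n + m) / 2 - k: comb(m, k) * p_down ** k * (1 - p_down) ** (m - k)
            for k in range(m + 1)}

# theta = 0: identical spins, all weight at M = N/2.
# theta = pi: orthogonal spins, all weight at M = (n - m)/2.
for theta in (0.0, pi / 2, pi):
    dist = q_M(2, 2, theta)
    print(theta, {M: round(q, 4) for M, q in dist.items()})
```

Under this assumption the two endpoints reproduce the cases discussed earlier: identical gases at θ = 0 and the fixed label M = (n − m)/2 at θ = π.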

Fig. 4: Results for partially distinguishable spins.

Plots of \({{\Delta }}{S}_{\text{info}}\) and \({{\Delta }}{S}_{\text{igno}}\) as a function of orthogonality of the spin states as determined by θ in Eq. (32). The figure is for a bosonic system with initial numbers of particles on either side of the box n = m = 15, and d = 50 cells. For comparison, the figure also displays the classical change in entropy, \(2n\,\mathrm{ln}\,(2)\). Here, the greatest change in entropy occurs when the spin states are orthogonal at θ = π.

Discussion

In contrast to the classical Gibbs paradox setting, we have shown that quantum mechanics permits the extraction of work from apparently indistinguishable gases, without access to the degree of freedom that distinguishes them. It is notable that the lack of information about this ‘spin’ does not in principle impede an experimenter at all in a suitable macroscopic limit with large particle number and low density—the thermodynamical value of the two gases is as great as if they had been fully distinguishable.

The underlying mechanism is a generalisation of the famous Hong–Ou–Mandel (HOM) effect in quantum optics34,40,41. In this effect, polarisation may play the role of the spin. Then a non-polarising beam splitter plus photon detectors are able to detect whether a pair of incoming photons are similarly polarised. The whole apparatus is polarisation-independent and thus accessible to the ignorant observer. Given this context, it is therefore not necessarily surprising that quantum Gibbs mixing can give different results to the classical case. However, the result of the low density limit is not readily apparent. This limit is reminiscent of the result in quantum reference frame theory38 that the lack of a shared reference frame presents no obstacle to communication given sufficiently many transmitted copies42.

Two recent papers18,43 have studied Gibbs-type mixing in the context of optomechanics. A massive oscillator playing the role of a work reservoir interacts with the photons via their pressure. This oscillator simultaneously acts as a beam splitter between the two sides of the cavity. In ref. 18, the beam splitter is non-polarising and thus (together with the interaction with the oscillator) accessible to the ignorant observer. The main behaviour there is driven by the HOM effect, which enhances the energy transfer to the oscillator, albeit in the form of fluctuations. In ref. 43, which studies Gibbs mixing as a function of the relative polarisation rotations between left and right, bosonic statistics are therefore described as acting oppositely to Gibbs mixing effects—which is different from our conclusions. However, there is no contradiction: we have shown that an advantage is gained by optimising over all allowed dynamics. Moreover, the scheme in ref. 43 uses a polarisation-dependent beam splitter, which is only accessible to the informed observer. Therefore the effect described here cannot be seen in such a set-up. It is an interesting question whether such proposals can be modified to see an advantage of the type described here, even if not optimal.

It is important to determine how the thermodynamic enhancements predicted in this paper may have implications for physical systems. Such an investigation should make use of more practical proposals (such as refs. 16,18,43) to better understand possible realisations of mixing. For example, systems of ultra-cold atoms in optical lattices44 may provide a suitable platform to experimentally realise the thermodynamic effects predicted in this work. The question of the maximal enhancement in the macroscopic limit is particularly compelling given the rapid progress in the manipulation of large quantum systems45.

Methods

The Supplementary Information contains detailed proofs. Supplementary Note 1 describes the treatment of classical particles, starting from a description akin to first quantisation, and then coarse-graining the state space along with appropriate restrictions on the allowed dynamics. Supplementary Note 2 fills out the derivation for the quantum ignorant observer sketched in the main text. Supplementary Note 3 provides details for the general case of non-orthogonal spins. Supplementary Note 4 computes the dimensions of the spaces \({{\mathcal{H}}}_{x}^{\lambda }\) from representation theory formulas. Supplementary Notes 5, 6 show how to take the low density and large particle number limits, respectively.