Simulating quantum physics will likely be one of the first practical applications of quantum computers. In simulating the many body problem, most algorithmic progress so far has focused on systems with binary degrees of freedom, e.g., spin-\(\frac{1}{2}\) systems1,2 or fermionic systems3,4. The latter case is relevant for simulations of chemical electronic structure5,6, nuclear structure7, and condensed matter physics8. This focus on binary degrees of freedom seems to be a natural development, partly because qubit-based quantum computation is the most widespread model used in theory, experiment, and the nascent quantum industry.

However, for a large subset of quantum physics problems, important roles are played by components that are d-level particles (qudits) with d > 2, including bosonic fundamental particles9, vibrational modes10, spin-s particles11, or electronic energy levels in molecules12 and quantum dots13. Accordingly, several qubit-based quantum algorithms were recently developed for efficiently studying some such processes, including nuclear degrees of freedom in molecules14,15,16,17,18, the Holstein model19,20, and quantum optics21,22.

In principle, there are combinatorially many ways to map a quantum system to a set of qubits23,24. Mapping a d-level system to a set of qubits may be done by assigning an integer to each of the d levels and then performing an integer-to-bit mapping. Some consideration of d-level-to-qubit mappings has been published in the very recent literature, primarily for truncated bosonic degrees of freedom14,17,18,19,20,21,25, but this is still an unexplored area of theory especially in regards to determining which encodings are optimal for which problem instances. The purpose of this work is both to provide a complete yet flexible framework for the mappings, and to analyze several encodings (both newly proposed herein and previously proposed) for a widely used set of operations and Hamiltonians. This aids in determining which mappings are more efficient for particular operators and specific hardware, including near-term intermediate-scale quantum (NISQ) devices.

When choosing which encoding to use for a given problem, it is conceptually useful to think in terms of a hardware budget, as shown in Fig. 1. Similar considerations have been studied for fermionic mappings26. For near- and intermediate-term hardware, one will often have stringent resource constraints in terms of both qubit count and gate count. Imagine that one plans to perform Hamiltonian simulation for some N-particle system. Using some set of criteria for acceptable error and other parameters, one can in principle work backwards to determine how much of a quantum resource is available for each operation. This quantity would be different for each device. Perhaps one quantum computer would allow for more qubits but another allows for more operations, as in Fig. 1. Because different encodings yield differing resource requirements, considering multiple encodings may be essential for determining whether the available resources are sufficient.

Fig. 1: Especially for near-term noisy quantum hardware, gate counts, and qubit counts will be limited.
figure 1

In principle, these constraints can be used to approximate a hardware budget for a set of hardware and a particular Hamiltonian simulation problem. For example, if one wants to simulate a collection of N bosons on a small quantum computer, the decoherence time and gate errors will constrain the allowed number of gates, while the total number of qubits will constrain the qubit count per boson. In this schematic, we show two arbitrary hardware budgets for Trotterizing the exponential of \({\hat{q}}^{2}\) for one boson with truncation d = 5. In device A, both the Gray and standard binary encodings are satisfactory, but the unary code requires too many qubits. However, because device B allows for more qubits but fewer operations, the unary code is sufficient while the former two encodings require too many operations. This highlights the need for considering multiple encodings, as an encoding that is best for one type of hardware is not necessarily universally superior.

Here we briefly summarize our results for resource comparisons of real Hamiltonian problems, in order to highlight the utility of encoding analyses and to demonstrate the ultimate practical objective of this work. Figure 2 shows the relative two-qubit operation requirements for a set of five prominent physics and chemistry problems (defined in Supplementary Section 6). All comparisons are made within a given Hamiltonian. Our investigations revealed a somewhat rich interplay between qubit counts, operation counts, encodings, and conversions. The difficulty in a priori predicting the optimal encoding scheme suggests that sophisticated compilation procedures, for automatically choosing and converting between multiple encodings, will play a large role in future quantum simulation efforts for d-level systems.

Fig. 2: Using an arbitrary selection of parameters for common physics and chemistry Hamiltonians, we have plotted the comparative computational costs required for first-order Trotterization.
figure 2

Costs are reported in terms of number of two-qubit entangling gates, relative to the cost of standard binary (SB). The three encodings shown here—standard binary, Gray code, and unary—are defined in the text. The five Hamiltonians are the Bose–Hubbard model, one-dimensional quantum harmonic oscillator (QHO), Franck–Condon calculation, boson sampling, and spin-s Heisenberg model. The optimal encodings are sensitive both to the Hamiltonian class and the number of levels d (determined by bosonic truncation or by the spin value s). In some cases, it is best to stay in a particular encoding for the duration of the simulation. Other times, it is worth bearing the resource cost of converting between encodings, because it saves on total operations. Still other times, the decision to save operations by converting between encodings will depend on whether available hardware is gate count limited or qubit count limited. Four Scenarios, A through D, are discussed in Section Composite system.

We note that the optimal encoding schemes have differing characteristics, all of which are present in Fig. 2. The results for each Hamiltonian can be categorized as one of four scenarios. Scenario A: the optimal choice is either standard binary (SB) only or Gray-only, with no benefit from converting between encodings (Bose–Hubbard d = 4; 1D QHO d = 4). Scenario B: the optimal choice is to convert between SB and Gray, in order to perform different local operators in different encodings (Heisenberg \(s=\frac{7}{2}\); Franck–Condon d = 4). Scenarios A and B are notable because they require both the fewest operations and the fewest qubits, as there is no benefit to expanding into the qubit-hungry unary encoding. Scenario C: unary-only is the optimal choice, and saving memory by compacting the data back to Gray or SB is still cheaper than remaining in the latter encodings (Bose–Hubbard d = 10; boson sampling d = 10; Franck Condon d = 10). Scenario D: unary is the optimal encoding, but only if the data remains in high-qubit-count unary for the duration of the calculation (1D QHO d = 10; Heisenberg s = 2). The optimal encoding choice is highly sensitive to both the Hamiltonian class and the truncation value d. This suggests the need to perform an encoding-based analysis for any new digital quantum simulation of d-level particles.

Throughout this work, we study four encoding types: unary (also called one-hot), standard binary, the Gray code, and a new class of encodings we name block unary, all defined in Section Binary encoding. After outlining how to map d-level operators to qubit operators for any arbitrary encoding in Section Mapping d-level matrix operators to qubits, we consider resource counts for most standard local operators used in bosonic and spin-s Hamiltonians in Section Local operators. In Section Conversions between encodings we present circuits for converting between encodings and enumerate their resource counts. Finally, in Section Composite systems we obtain relative resource estimates for various commonly studied Hamiltonians in theoretical physics and chemistry. Conclusions are summarized in Section Discussion. This work highlights how the careful choice of the encoding scheme can greatly reduce the resource requirements when simulating a system of d-level particles or modes.


Binary encodings

We begin by giving definitions for various integer to binary encodings. In this work, we use the term binary encoding to refer to any code that represents an arbitrary integer as a set of ordered binary numbers, not just the familiar base-two numbering system. These encodings can be used to represent states regardless of the type of system or basis we are considering. For a given encoding enc, each integer l has some qubit (i.e., binary) representation, denoted \({{\mathcal{R}}}^{{\rm{enc}}}(l)\), which is a bit string on Nq bits, \({x}_{{N}_{q}-1}\cdots {x}_{1}{x}_{0}\). To specify the value (0 or 1) of a particular bit i in the encoding, we use the notation \({{\mathcal{R}}}^{{\rm{enc}}}(l;i)\).

Standard binary

We refer to the familiar base-two numbering system as the standard binary (SB) encoding, such that an integer l is represented by

$$l\,\,\mapsto \,\,{x}_{0}{2}^{0}+{x}_{1}{2}^{1}+{x}_{2}{2}^{2}+...$$

This straight-forward mapping has been used for qubit-based quantum simulation of bosons previously14,17,18. The SB mapping uses \({N}_{q}=\lceil {\mathrm{log}\,}_{2}d\rceil\) qubits when the range of integers under consideration is {0, 1, 2, …, d − 1}, where is the ceiling function.

Gray code

In principle there are combinatorically many one-to-one mappings between a set of integers and a set of Nq-bit strings. One mapping from classical information theory with particularly useful properties is called the Gray code or the reflected binary code27. Its defining feature is that the Hamming distance dH between nearest-integer bit strings is always 1, formally \({d}_{H}({{\mathcal{R}}}^{{\rm{Gray}}}(l),{{\mathcal{R}}}^{{\rm{Gray}}}(l+1))=1\). The Hamming distance counts the number of mismatched bits between two bit strings. In other words, moving between adjacent integers requires only one bit flip (see Table 1). As will be seen below, this encoding is especially favorable for tridiagonal operators with zero diagonals, since all non-zero elements \(\left|l\right\rangle \left\langle l^{\prime} \right|\) then have hamming distance one. As far as we are aware, the Gray code has not been previously proposed for use in the simulation of bosons and other d-level systems on a qubit-based quantum computer. This encoding inspires the possibility of having a specialized encoding for many possible matrix sparsity patterns (for example a code for which \({d}_{H}={d}_{H}({{\mathcal{R}}}^{{\rm{Gray}}}(l){{\mathcal{R}}}^{{\rm{Gray}}}(l+1))=2\) to be used for pentadiagonal matrices like \({\hat{q}}^{2}\)), but we do not consider this possibility here. Throughout this work we refer to the SB and Gray codes as compact encodings because they make use of the full Hilbert space of the qubits used.

Table 1 The standard binary (SB), Gray code, and unary encodings.

Unary encoding

Mappings that do not use all \({2}^{{N}_{q}}\) states of the Hilbert space are possible. In the unary encoding (also called one-hot28, single-excitation subspace29, or direct mapping17), the number of qubits required is Nq = d. Therefore, only an exponentially small subspace of the qubits’ Hilbert space is used. The relationship between the unencoded and encoded ordered sets is

$$\{{\mathtt{0}},{\mathtt{1}},{\mathtt{2}},{\mathtt{3}},...\}\,\mapsto\, \{...{\mathtt{000001}},...{\mathtt{000010}},...{\mathtt{000100}},...{\mathtt{001000}},...\}.$$

Previous proposals for bosonic simulation on a universal quantum computer have used this encoding17,18,28,29,30 under different names. The Unary encoding makes less efficient use of quantum memory, but it will become clear below that it usually allows for fewer quantum operations.

Block unary encoding

One can interpolate between the two extremes, using less than the full Hilbert space but more of the space than the unary code uses. In some limited instances this allows one to make tunable trade-offs between required qubit counts and required operation counts, which may be especially useful for mapping physical problems to the specific hardware budget of a particular near-term intermediate-scale (NISQ) device. In this work, we introduce a class of such encodings that we call block unary.

The block unary code is parameterized by choosing an arbitrary compact encoding (e.g., SB or Gray code) and a local parameter g that determines the size of each block. It can be viewed as an extension of the unary code, where each digit (block) ranges from 0 to g. Within each block, the local encoding is used to represent the local digit. The number of qubits required is \({N}_{q}=\lceil \frac{d}{g}\rceil \lceil {\mathrm{log}\,}_{2}(g+1)\rceil\). Examples of the block unary encoding are given in Table 2. We use the notation BU\({}_{g}^{{\rm{enc}}}\) to define block unary with a particular pair of parameters. For a transition within a particular block, the number of operations is similar to a compact code with d = g + 1. For elements that move between different blocks, the transition will be conditional on all qubits in both blocks.

Table 2 Examples of the block unary SB and block unary Gray encodings.

Bitmask subset

Because some encodings do not make use of the full Hilbert space, it will be useful to define the subset of bits that is necessary to determine each integer l. For a given encoding we call this subset the bitmask subset, denoted C(l) where Cl {0, 1, 2, . . . , Nq − 1}. The bitmask subset for the SB and Gray encodings is always CSB(l) = CGray(l) = {0, 1, 2, . . . , Nq − 1}, since all Nq bits must be known to determine the integer value. In the unary encoding, the bitmask subset is simply CUnary(l) = {l}, because if one knows that bit l is set to 1, then one knows the other bits are 0. In the block unary code, the bitmask subset simply contains the bits that represent the current block. For example, for the Block Unary Gray code with g = 3 (see Table 2), CBU[g=3](2) = {0, 1} and CBU[g=3](3) = {2, 3}.

Mapping d-level matrix operators to qubits

Any operator for a d-level system can be written as

$$\hat{A}={\sum \limits_{l,l^{\prime} = 0}^{d-1}}{a}_{l,l^{\prime} }\left|l\right\rangle \left\langle l^{\prime} \right|,$$

where l and \(l^{\prime}\) are integers labelling pairwise orthonormal quantum states. In this work we conceptualize the mappings primarily in terms of Fock-type encodings (or alternatively second quantization encodings) where each \(\left|l\right\rangle\) represents one level in the d-level system. However, the mapping procedure is identical to the one used for first quantization operators19,20 that we briefly discuss. When dealing with bosonic degrees of freedom, one must choose an arbitrary level d at which to truncate, since in principle a bosonic mode may have an unbounded particle number. Choosing this cutoff such that truncation error is below a given threshold is an essential step that has been previously studied18,31,32, although it is beyond the scope of the current work.

In performing a mapping of any d-by-d matrix operator to a sum of Pauli strings, the following approach may be used. For each term in the sum, one first assigns an integer to each level and then uses an arbitrary binary encoding \({\mathcal{R}}\) to encode each integer:

$$\left|l\right\rangle\, \mapsto \,\left|{\mathcal{R}}(l)\right\rangle =\left|{\mathcal{R}}(l;{N}_{q}-1)\right\rangle \cdots \left|{\mathcal{R}}(l;1)\right\rangle \left|{\mathcal{R}}(l;0)\right\rangle\, \mapsto \,\left|{x}_{{N}_{q}-1}\right\rangle \cdots \left|{x}_{0}\right\rangle,\ {x}_{i}\in \{0,1\}.$$

For codes using less than the full Hilbert space of the qubits (e.g., unary and block unary), some qubits can be safely ignored for a given element \(\left|l\right\rangle \left\langle l^{\prime} \right|\). This is because operations on these excluded qubits will not affect the manifold on which the problem is encoded. Therefore, for each element \(\left|l\right\rangle \left\langle l^{\prime} \right|\) a mapping needs to consider only the bitmask subsets of the two integers, ignoring other bits. The operator on the qubit space is then

$$\left|l\right\rangle \left\langle l^{\prime} \right|\,\mapsto \,\bigotimes _{i\in C(l)\cup C(l^{\prime} )}\left|{x}_{i}\right\rangle {\left\langle {x}_{i}^{\prime}\right|}_{i},$$

where \(C(l)\cup C(l^{\prime} )\) is the union of the bitmask subsets of the two integers and the subscripts i denote qubit number. One then converts each qubit-local term \(\left|{x}_{j}\right\rangle \left\langle x^{\prime} \right|\) to qubit operators using the following four expressions:

$$\left|0\right\rangle \left\langle 1\right|=\frac{1}{2}(\hat \sigma_x +i\hat \sigma_y )\equiv {\hat{\sigma }}^{-},$$
$$\left|1\right\rangle \left\langle 0\right|=\frac{1}{2}(\hat \sigma_x -i\hat \sigma_y )\equiv {\hat \sigma }^{+},$$
$$\left|0\right\rangle \left\langle 0\right|=\frac{1}{2}(I+\hat \sigma_z ),$$
$$\left|1\right\rangle \left\langle 1\right|=\frac{1}{2}(I-\hat \sigma_z ).$$

For a single term in Eq. (3), the result is a sum of Pauli strings,

$$\hat{A}\,\mapsto\, {\sum \limits_{k}^{P}}{c}_{k}\mathop{\bigotimes }\limits_{j}^{{N}_{q}}{\hat{\sigma }}_{kj},$$

where P is the number of Pauli strings in the sum, ck is a coefficient for each Pauli term, and every operator is either a Pauli matrix or the identity: \({\hat{\sigma }}_{kj}\in \ \{\hat \sigma_x,\ \hat \sigma_y,\ \hat \sigma_z \}\cup \{I\}\). Note that in this work the set of Pauli matrices is defined to exclude the identity.

Significance of the Hamming distance

It is useful to analyze encoding efficiency based on the Hamming distance between \({\mathcal{R}}(l)\) and \({\mathcal{R}}(l^{\prime} )\). The Hamming distance, which we denote \({d}_{H}^{{\mathcal{R}}}(l,l^{\prime} )\equiv {d}_{H}({\mathcal{R}}(l),{\mathcal{R}}(l^{\prime} ))\), is defined as the number of unequal bits between two bit strings of equal length. The important observation is that, for a given element \({a}_{l,l^{\prime} }\left|l\right\rangle \left\langle l^{\prime} \right|\) in Eq. (3), the average length of the Pauli strings increase as the Hamming distance increases, where length is defined as the number of Pauli operators (excluding identity) in the term. In this subsection, for simplicity we at first assume that the bitmask subset is C(l) = {0, 1, . . . , Nq − 1}, implying that we are using a compact code such as Gray or SB. But we note that these Hamming distance considerations are relevant to all encodings. In the case of a noncompact encoding, one would consider only the union of the bitmask subsets. To clarify this result, consider the following. For an arbitrary element \(\left|l\right\rangle \left\langle l^{\prime} \right|\) written in binary form \(\left|{\mathcal{R}}(l)\right\rangle \left\langle {\mathcal{R}}(l^{\prime} )\right|\), one performs the following mapping:

$$\left|{x}_{0}\right\rangle \left\langle x^{\prime} \right|\otimes \cdots \otimes \left|{x}_{N}\right\rangle \left\langle x^{\prime} \right|\,\,\mapsto \,\,\frac{1}{{2}^{N}}({\hat{a}}_{0}+{\hat{b}}_{0})({\hat{a}}_{1}+{\hat{b}}_{1})\cdots ({\hat{a}}_{N}+{\hat{b}}_{N})$$

where subscripts denote qubit number, \(\hat{a}\in \{\hat \sigma_x,I\}\), and \(\hat{b}\in \{\pm i\hat \sigma_y,\pm \hat \sigma_z \}\) (according to Eqs. (6) through (9)). Expanding the RHS of Eq. (11) leads to an equation of form (10). For any mismatched qubit j, i.e., any qubit for which \({x}_{j}\,\ne \,x^{\prime}\), the clause \(({\hat{a}}_{j}+{\hat{b}}_{j})\) contains two Pauli operators. For matched qubits (\({x}_{j}=x^{\prime}\)), the clause \(({\hat{a}}_{j}+{\hat{b}}_{j})\) instead has one identity and one Pauli operator. Hence more matched bits lead to more identity operators in expression (11), leading to fewer Pauli operators in the final sum of Pauli strings. It follows that the number of non-identity operators can be reduced by having a smaller Hamming distance. This is relevant because Hamiltonians with more Pauli operators require more quantum operations to implement.

Consider the illustrative example of mapping the Hermitian term \(\left|3\right\rangle \left\langle 4\right|+\left|4\right\rangle \left\langle 3\right|\) to a set of qubits. In the SB encoding, Eqs. (6)–(9) yield the following Pauli string representation:

$$\left|3\right\rangle \left\langle 4\right| +\left| 4\right\rangle \left\langle 3\right|\,{\mathop \longmapsto ^{{\rm{Std}}.{\rm{Binary}}}}\,\left| 011\right\rangle \left\langle 100\right|+\left|100\right\rangle \left\langle 011\right|=\frac{1}{4}(\hat\sigma_x^{(2)}\hat\sigma_x^{(1)}\hat\sigma_x^{(0)}+\hat\sigma_y^{(2)}\hat\sigma_y^{(1)}\hat\sigma_x^{(0)}+\hat\sigma_y^{(2)}\hat\sigma_x^{(1)}\hat\sigma_y^{(0)}-\hat\sigma_x^{(2)}\hat\sigma_y^{(1)}\hat\sigma_y^{(0)}).$$

Using the Gray code, the Pauli string instead takes the form

$$\left|3\right\rangle \langle 4| +| 4\rangle \langle 3|\,{\mathop\longmapsto^{{\rm{Gray}}}}\,| 010\rangle \left\langle 110\right|+\left|110\right\rangle \left\langle 010\right| =\frac{1}{4}(-\hat\sigma_x^{(2)}\hat\sigma_z^{(1)}\hat\sigma_z^{(0)}+\hat\sigma_x^{(2)}\hat\sigma_z^{(0)}-\hat\sigma_x^{(2)}\hat\sigma_z^{(1)}+\hat\sigma_x^{(2)}).$$

The Hamming distance between \({\mathcal{R}}(3)\) and \({\mathcal{R}}(4)\) is \({d}_{H}^{{\rm{SB}}}=3\) in the former case and \({d}_{H}^{{\rm{Gray}}}=1\) in the latter. The result is that the Gray code has fewer Pauli operators per Pauli string, meaning that it can be implemented with fewer operations.

Avoiding superfluous terms in noncompact codes

When implementing local products of operators using noncompact codes (unary or BU), one should multiply operators in the matrix representation before performing the encoding to qubits. If one instead first maps local operators to qubit operators, and then multiplies the operators, superfluous terms may result. For example, when implementing an arbitrary squared operator \({\hat{A}}^{2}\), one should begin with the matrix representation of \({\hat{A}}^{2}\) instead of squaring the qubit representation of \(\hat{A}\). To see this, consider the unary encoding of the square of a 3-level matrix operator \({\hat{n}}_{d = 3}={\rm{diag}}[0,1,2]\,\)\(\,\mapsto \,\frac{3}{2}I-\frac{1}{2}\hat\sigma_z^{(1)}-\hat\sigma_z^{(2)}\). If one begins with \({\hat{n}}_{d = 3}^{2}={\rm{diag}}[0,1,4]\), the encoded operator is

$${\hat{n}}_{d = 3}^{2}={\rm{diag}}{[0,1,4]}\,{\mathop\longmapsto ^{{\rm{Unary}}}}\,\frac{5}{2}I-\frac{1}{2}\hat\sigma_z^{(1)}-2\hat\sigma_z^{(2)}.$$

If one instead squares the already-encoded Pauli operator for \({\hat{n}}_{d = 3}\), this yields


Superscripts denote qubit number. Pauli operators (14) and (15) behave identically on the subspace of the unary encoding, although operator (14) is less costly to implement. One might attempt to eliminate superfluous terms after the mapping is complete, but this is likely a hard problem. In principle it may require combinatorial effort to determine which combinations of operators leave the encoding space unaffected. Hence the most prudent strategy is to always perform as much multiplication as possible in the matrix representation. These considerations are irrelevant when using one of the compact encodings.

Trotterization and gate count upper bounds

Hamiltonian simulation often consists of implementing the unitary operator

$$\hat{U(t)}=\exp (-i\hat{H}t)$$

for some user-defined time-independent Hamiltonian \(\hat{H}\), where t is the evolution time and we have set  = 1. Any Hamiltonian can be expressed as a sum of local Pauli strings such that

$$\hat{H}={\sum \limits_{k}}{c_{k}}{\mathop \otimes \limits_j}{\hat{\sigma }}_{kj}={\sum \limits_{k}}{\hat{h}}_{k},$$

which takes the same form as Eq. (10) with the Pauli strings and their coefficients compacted into terms \(\{{\hat{h}}_{k}\}\). In practice, Hamiltonian simulation can be performed using a Suzuki–Trotter decomposition

$$\hat U(t) =\exp \left(-it{\sum _{k}}{\hat{h}}_{k}\right)\approx {\left({\prod_{k}}\exp (-i{\hat{h}}_{k}t/\eta )\right)}^{\eta }=\tilde U(t),$$

where the expression is exact in the limit of large η or small t1,33. The numerical studies of this work consider the encoding-dependent resource counts for Eq. (18), for a subset of prominent physics problems. We focus on determining the resources required for simulating a single Trotter step.

There are several variations and extensions to the Hamiltonian simulation approach of Eq. (18), including higher-order Suzuki–Trotter methods33, the Taylor series algorithm34, quantum signal processing35, and schemes based on randomization36,37. Notably for the current work, recent results suggest that simple first-order Trotterization will have lower error for near- and medium-term hardware38,39, even if the other methods are asymptotically more efficient. Since \({\hat{h}}_{k}\) takes a different form depending on the chosen encoding, the resource counts as well as the error \(| \hat U(t) -\tilde U(t) |\) will be different. We leave the study of numerical error for future work, as the goal of the current work is to introduce these mappings and to understand some trends in their resource requirements.

Each term \(\exp (-it{\hat{h}}_{k})=\exp (-it{c{}_{k}\bigotimes }_{j}{\hat{\sigma }}_{kj})\) may be implemented using the well-known CNOT staircase quantum circuit shown in Fig. 3. If a qubit j is acted on by \(\hat\sigma_x\) or \(\hat\sigma_y\), additional single-qubit gates are placed on qubit j as shown in the figure. HiRx+z(π) denotes the Hadamard gate that changes between the Z- and X-basis, and iRx+y(π) converts between the Z- and Y-basis. To exponentiate a single Pauli string, the number of CNOT (CX) gates required is


where p is the number of Pauli operators (\(\{\hat\sigma_x,\hat\sigma_y,\hat\sigma_z\}\), excluding I) in the term to be exponentiated.

Fig. 3: Canonical quantum circuits used to exponentiate Pauli strings on a universal quantum computer.
figure 3

One needs 2(p − 1) two-qubit gates for such an operation, where p is the number of Pauli operators in the term. When a product of many exponentials is used, as in the Suzuk–Trotter procedure, there tends to be significant gate cancellation.

It is instructive to calculate upper bounds for entangling gate counts. Consider a simple Hermitian term \(\alpha \left|l\right\rangle \langle l^{\prime} | +{\alpha }^{* }| l^{\prime} \rangle \left\langle l\right|\) and a single diagonal element \(\left|l\right\rangle \left\langle l\right|\). Here, dH denotes \({d}_{H}({\mathcal{R}}(l),{\mathcal{R}}(l^{\prime} ))\), which will depend on the chosen encoding, and dH = 0 in the case of a diagonal element. We define K as \(| {C}^{{\rm{enc}}}(l)\cup {C}^{{\rm{enc}}}(l^{\prime} )|\), the number of qubits in the relevant bitmask subsets. As we are considering products of two-term sums (Eqs. (6)–(9)), the distribution of Pauli strings can be analyzed in terms of binomial coefficients. In the Supplementary Section 1 we show that

$${n}_{{\rm{cx,UB}}}[{d}_{H}(l,l^{\prime} ),K]=\frac{1}{2}{\sum \limits_{p=2}^{K}}{2}^{{d}_{H}}\left( {\begin{array}{*{20}{l}}{K - {d_H}}\\{p - {d_H}}\end{array}} \right)(2p-2),$$

where UB denotes upper bound and the \(\frac{1}{2}\) factor is not present for diagonal terms. Some resulting upper bounds for particular Hamming distances are

$$\begin{array}{r}{n}_{{\rm{cx,UB}}}[{d}_{H}=0,K]=({2}^{K}K-2({2}^{K})-2)\\ {n}_{{\rm{cx,UB}}}[{d}_{H}=1,K]=\frac{1}{2}({2}^{K}K-{2}^{K})\\ {n}_{{\rm{cx,UB}}}[{d}_{H}=2,K]=\frac{1}{2}({2}^{K}K).\end{array}$$

Again, the above expressions are for a single Hermitian element pair or a single diagonal term. In practice, because substantial gate cancellation is possible once the quantum circuit has been compiled, these upper bounds are not always directly applicable when choosing an encoding. However, the above expressions may find direct utility in limiting cases, and they demonstrate the basic relationship between Hamming distance, size of the bitmask union, and gate counts. Below, we study a common sparsity pattern, a tridiagonal real matrix operator \(\hat{B}\) with zeros on the diagonal, i.e., with matrix structure \(\langle i| \hat{B}| j\rangle ={\sum }_{k}{B}_{k}({\delta }_{i,k}{\delta }_{j,k+1}+{\delta }_{i,k+1}{\delta }_{j,k})\). This is the sparsity pattern of several commonly used d-level operators, such as the bosonic position operator \(\hat{q}\).

In Table 3 we show analytical upper bounds for three different levels of sparsity, derived in Supplementary Section 1. We consider a single Hermitian pair, the \(\hat{B}\) operator with O(d) non-zero entries, and a dense matrix operator with O(d2) non-zero entries. In Supplementary Section 1 we show that the upper bound of entangling gate counts for an arbitrary operator on K qubits is O(K4K). Because \(K=\lceil {\mathrm{log}\,}_{2}d\rceil\) in the compact codes, this means that the compact codes will never require more than \(O({d}^{2}\mathrm{log}\,d)\) entangling gates.

Table 3 Asymptotic upper bound complexity of entangling gate counts, for Trotterizing a matrix exponential of a d-level particle.

An important consequence is that, as the matrix density increases, the comparative advantage of the unary encoding decreases. The compact codes’ upper bound both for a \(\hat{B}\) and for a fully dense operator are both \(O({d}^{2}\mathrm{log}\,d)\), since this is the maximum upper bound. The unary encoding’s upper bound of O(d2) for fully dense operator is only slightly lower. And because actual gate counts for various matrix instances will be less than these upper bounds, it appears possible that compact codes might often be superior for dense matrices in both qubit count and gate count. However, because the most commonly used quantum operators tend to have O(d) density, it is likely that unary will most often be superior in gate counts, at least for Hamiltonian simulation. Many exceptions to these trends are shown in Section Local operators.

As a more concrete demonstration of typical operator scaling, we calculated numerical upper bounds for \(\hat{B}\) with increasing d. Qubit counts and upper bounds for \(\hat{B}\) are shown in Fig. 4, as a function of d for different encodings. We first encoded the entire operator \(\hat{B}\) into a sum of Pauli strings before collecting and cancelling terms, leading to some favorable cancellations. Then we applied Eq. (19). There is roughly an inverted relationship between the qubit counts and the operation counts, because sparser encodings like the unary and block unary have smaller bitmask subsets but require more total qubits.

Fig. 4: Numerical upper bounds for resource counts of implementing one Suzuki–Trotter step of a d-by-d real Hermitian matrix operator \(\hat{B}\), where \(\hat{B}\) is tridiagonal with zeros on the diagonal.
figure 4

Top: Qubit counts for mappings considered in this work. BU\({}_{g}^{{\mathrm{Gray}}}\) stands for block unary where g is the size of the block. Asymptotically, the number of qubits scales logarithmically for the SB and Gray encodings, and linearly for the unary and block unary encodings. Bottom: Upper bounds of CNOT operation counts for implementing one Suzuki–Trotter step of \(\hat{B}\). This is the sparsity pattern of canonical bosonic position and momentum operators as well as the Sx spin operators in spin-s systems. Upper bounds were calculated by mapping the full operator to a sum of weighted Pauli strings, combining terms, and then using Eq. (19). Notably, encodings with higher qubit counts tend to have lower upper bounds for gate counts, and vice-versa.

The differing gate count upper bounds between the SB and gray encodings (Fig. 4) are explained by Hamming distances. Because all non-zero terms in \(\hat{B}\) have unity Hamming distance, upper bounds for the Gray code are substantially lower. The other notable trend is that the unary code has lower upper bounds, asymptotically, than the other codes. This can be explained using Eq. (21) by noting that K = 2 for any matrix element, while \(K=\lceil {\mathrm{log}\,}_{2}d\rceil\) for Gray and SB. In other words, K stays constant in the unary encoding, whereas in the compact codes K increases with d. Upper bounds for BU\({}_{g = 3}^{{\mathrm{Gray}}}\) are between the compact codes and the unary code, as this encoding has an intermediate value of K. Below we will see that, although these trends generally persist, they are less pronounced and less predictable after cancelling of Pauli terms and circuit optimization.

Diagonal binary-decomposable operators

An important class of operators to consider is those which we call diagonal binary-decomposable (DBD). We define DBD operators as being diagonal matrix operators for which the diagonal entries of the operator (\(\hat{O}\)) may be expressed as

$${\hat{O}}_{l,l}={\sum \limits_{i = 1}^{\lceil {\mathrm{log}\,}_{2}d\rceil }}{k}_{i}{R}^{{\rm{SB}}}(l;i)$$

where RSB(l; i) {0, 1}. A common subclass of DBD is the set of diagonal operators containing evenly-spaced entries. We call these diagonal evenly-spaced (DES) operators. An example is the bosonic number operator

$$\hat{n}={\rm{diag}}[0,1,2,3,\cdots \ ]$$

and any linear combination \(a\hat{n}+bI\) where a and b are constants. If \({\mathrm{log}\,}_{2}d\) is an integer, then the Pauli operator is simply the base-two numbering system with ki = 2i,


The DBD class of operators is notable because, when \({\mathrm{log}\,}_{2}d\) is an integer, exactly implementing \(\exp (-i\theta \hat{n})\) requires only \({\mathrm{log}\,}_{2}d\) single-qubit rotations and no entangling gates.

An operator for which \({\mathrm{log}\,}_{2}d\) is a non-integer will not lead to this favorable only single-qubit decomposition. For example, the \({\hat{S}}_{z}\) operator for a spin-s system is DBD, but the advantage appears for the SB mapping only when d = 2s + 1 is a power of 2, namely s {3/2, 7/2, … }. However, for other operators one may simply increase d without changing the simulation result. For example, if one is required to exponentiate a truncated bosonic operator \(\hat{n}\) with at least d = 11, it is most efficient simply to implement the SB encodings of \(\hat{n}\) with d = 16 instead. A simple example illustrates this point. The standard operator for the bosonic number operator with truncation d = 3 is

$${\hat{n}}_{d = 3}={\rm{diag}}{[0,1,2]}\,{\mathop\longmapsto^{{\rm{Std}}.{\rm{Binary}}}}\,\frac{3}{4}I+\frac{1}{4}\hat\sigma_z^{(0)}-\frac{3}{4}\hat\sigma_z^{(0)}\hat\sigma_z^{(1)}-\frac{1}{4}\hat\sigma_z^{(1)}$$

while that for d = 4 is

$${\hat{n}}_{d = 4}={\rm{diag}}{[0,1,2,3]}\,{\mathop\longmapsto ^{{\rm{Std}}.{\rm{Binary}}}}\,\frac{3}{2}I+\frac{1}{2}\hat\sigma_z^{(0)}-\hat\sigma_z^{(1)}$$

The latter operator (d = 4) is composed only of single-qubit operators but the former (d = 3) is not. Operations counts for CNOT gates are shown in Fig. 5, where it is clear that the SB mapping is superior when d is a power of 2. The right panel gives gate counts for operators such as \({\hat{n}}^{2}\), where it is again advangatgeous for d to be a power of 2, although entangling gates are still required. As is also clear from the right panel, the square of a DES or DBD operator is in general not DBD.

Fig. 5: Entangling gates for a single Suzuki–Trotter step of an arbitrary diagonal evenly-spaces (DES) operator \(\hat{O}\) (left), and its square (right).
figure 5

Gate counts are for optimized quantum circuits. DES operators are a subset of the diagonal binary-decomposable (DBD) operator class. Because it is diagonal, the unary code always requires only single-qubit operations. When \({\mathrm{log}\,}_{2}d\) is an integer, the SB code requires no entangling gates and just \({\mathrm{log}\,}_{2}d\) single-qubit operations, making it the most efficient encoding (in terms of both qubits and operations). DBD operators are a common operator class, encompassing e.g., the bosonic number operator \(\hat{n}\) and the spin-s operator \({\hat{S}}_{z}\).

Local operators

Although local d-level operators can in principle contain arbitrary terms and even be entirely dense (i.e., a molecule’s electronic energy levels with non-zero transitions between each), in practice there is a small set of sparse bosonic and spin-s operators that are used most often. Here we summarize the set of d-level operators used in this study, where it is conceptually useful to explicitly write down some d-by-d matrix representations.

Bosonic operators can be constructed from the well-known ladder operators \(\hat{a}\) and \({\hat{a}}^{\dagger }\), where (importantly for encoding considerations) all non-zero terms \(\left|l\right\rangle \left\langle l^{\prime} \right|\) obey \(\left|l-l^{\prime} \right|=1\). The position operator \(\hat{q}=\frac{1}{\sqrt{2}}({\hat{a}}_{j}^{\dagger }+{\hat{a}}_{j})\) is tridiagonal with zeros on the diagonal:

$$\hat{q}=\frac{1}{\sqrt{2}}\left[\begin{array}{*{20}{l}}0&1&0&0&\ldots \\ 1&0&\sqrt{2}&0&\ldots \\ 0&\sqrt{2}&0&\sqrt{3}&\ldots \\ 0&0&\sqrt{3}&0&\ldots \\ \vdots &\vdots &\vdots &\vdots &\ddots \end{array}\right]$$

This means that the square of \(\hat{q}\), often used in vibrational and bosonic Hamiltonians, is pentadiagonal but with zeros for terms where \(| l-l^{\prime} | =1\):

$${\hat{q}}^{2}=\frac{1}{2}\left[\begin{array}{*{20}{l}}1&0&\sqrt{1\cdot 2}&0&0&\ldots \\ 0&3&0&\sqrt{2\cdot 3}&0&\ldots \\ \sqrt{1\cdot 2}&0&5&0&\sqrt{3\cdot 4}&\ldots \\ 0&\sqrt{2\cdot 3}&0&7&0&\ldots \\ 0&0&\sqrt{3\cdot 4}&0&9&\ldots \\ \vdots &\vdots &\vdots &\vdots &\vdots &\ddots \end{array}\right].$$

Notably, \(| l-l^{\prime} |\) for non-zero entries is either 0 or 2, making the Gray code less useful for this operator. The momentum operator \(\hat{p}=\frac{i}{\sqrt{2}}({\hat{a}}_{j}^{\dagger }-{\hat{a}}_{j})\) and its square \({\hat{p}}^{2}\) have the same sparsity patterns as \(\hat{q}\) and \({\hat{q}}^{2}\), respectively. The number operator \(\hat{n}={\hat{a}}^{\dagger }\hat{a}\) of Eq. (23) is diagonal and DBD, which leads to efficient SB mappings as discussed in Section Diagonal binary-decomposable operators. Finally, we study the two-site bosonic interaction operator

$${\hat{a}}_{i}^{\dagger }{\hat{a}}_{j}+{\hat{a}}_{i}{\hat{a}}_{j}^{\dagger }.$$

In order to consider spin Hamiltonians such as Heisenberg models11,40, we encode spin-s operators of arbitrary s, where the number of levels is d = 2s + 1. Matrix elements for transitions \(\left|l\right\rangle \left\langle l^{\prime} \right|\) are defined as follows41

$$\left\langle l\right|\hat S_z\left|l^{\prime} \right\rangle =\hslash (s+1-l){\delta }_{l,l^{\prime} }$$
$$\left\langle l\right|\hat S_x \left|l^{\prime} \right\rangle =\frac{\hslash }{2}({\delta }_{l,l^{\prime} +1}+{\delta }_{l+1,l^{\prime} })\sqrt{(s+1)(l+l^{\prime} -1)-ll^{\prime} }$$
$$\left\langle l\right|\hat S_y \left|l^{\prime} \right\rangle =\frac{i\hslash }{2}({\delta }_{l,l^{\prime} +1}-{\delta }_{l+1,l^{\prime} })\sqrt{(s+1)(l+l^{\prime} -1)-ll^{\prime} }$$

where δα,β is the Kronecker delta. The \(\hat S_z\) are DBD operators, while \(\hat S_x\) and \(\hat S_y\) are tridiagonal with zeros on the diagonal, the same sparsity pattern as bosonic \(\hat{p}\) and \(\hat{q}\) operators.

The local operators considered thus far are effectively second quantization operators—each ket tends to correspond to an eigenstate in an isolated d-level system. Also of note is a recently proposed approach19,20 which maps bosons to qubits using the first quantized representation of the quantum harmonic oscillator. The original proposal maps Hermite–Gauss functions, the eigenfunctions of the quantum harmonic oscillator, into a discretized position space. The approximate position operator is defined as

$${\tilde{X}}_{{\rm{FQ}}}={\sum \limits_{i = 0}^{{N}_{x}-1}}{x}_{i}\left|{x}_{i}\right\rangle \left\langle {x}_{i}\right|$$

and Nx is the number of discrete position points such that

$${x}_{i}=(i-{N}_{x}/2)\Delta,i\in [0,{N}_{x}-1]$$

where Δ is chosen such that the desired highest-order Hermite–Gauss function is contained within \(({x}_{0},{x}_{{N}_{x}-1})\). Advantages and disadvanteges are discussed in the Supplementary Section 5. We raise the possibility of using this approach partly to point out that a Nx-by-Nx matrix operator may be mapped to qubits using the exact same procedure as the other operators, with Nx replacing d. Note that \({\tilde{X}}_{{\rm{FQ}}}\) is a DBD operator.

Quantum circuits for approximating the exponential of each operator and for each d were compiled and then optimized using the procedure given in the Supplementary Section 2. The optimization consists of searching for and performing gate cancellations where possible. For instance, two adjacent CNOT gates or two adjacent Hadamard gates will cancel. Entangling gate counts for the optimized circuits of bosonic operators \(\hat{q}\), \({\hat{q}}^{2}\), and \({\hat{a}}_{i}^{\dagger }{\hat{a}}_{j}+{\hat{a}}_{i}{\hat{a}}_{j}^{\dagger }.\) are plotted in Fig. 6. We place significant focus on smaller d values because they tend to be more common in physics simulation, but we note that applications requiring larger d values do exist, for example in vibronic simulations where occupation numbers can approach d = 7018.

Fig. 6: CNOT gate counts for optimized circuits of the bosonic position operator (first row), position operator squared (second row), and two-site bosonic interaction operator (third row).
figure 6

Plots on the left correspond to enlargements of the dotted boxes in the plots on the right. There are several notable trends and anomalies: a Although unary usually requires the fewest operations as d increases, there are several operators for which the Gray or SB code is more efficient than the unary in both qubit and operation count. This occurs most pronouncedly at values such as d = 4, 7, and 8. b The Gray code is usually more efficient than SB even after circuit optimization, especially for operators composed of tridiagonal operators (top and bottom rows). c The Gray code's advantage is either less pronounced or disappears for \({\hat{q}}^{2}\), because \({\hat{q}}^{2}\) is a pentadiagonal operator, for which the unity Hamming distance of the Gray code is less useful. d The reduction in operation count for \({\hat{q}}^{2}\) at values such as d = 8, 24, or 32 occur because the diagonal of \({\hat{q}}^{2}\) is DBD.

Comparing the tridiagonal operator \(\hat{q}\) with the upper bounds given in Fig. 4 demonstrates that the circuit optimization greatly reduces gate count for the compact codes and block unary, often by a factor of 2–3. On the other hand, the unary encoding effectively sees no improvement from optimization, although it remains the code with fewest entangling gates for a large subset of d values.

As was the case in the upper bound calculations, operators built from tridiagonal matrices (\(\hat{q}\) and \({\hat{a}}_{i}^{\dagger }{\hat{a}}_{j}+{\hat{a}}_{i}{\hat{a}}_{j}^{\dagger }.\)) show the Gray encoding outperforming SB, although after optimization the advantage is less pronounced. In contrast, for the pentadiagonal \({\hat{q}}^{2}\), the Gray code outperforms SB asymptotically, while SB is better for lower d values (and lower d values are likely to be more common in relevant bosonic Hamiltonian simulations). The changed trend can be explained by noting that the unity Hamming distance of the Gray code is not as advantageous for the sparsity structure of \({\hat{q}}^{2}\), given in Eq. (28). Also notable is the apparent dip in operation count at d = 8, due to the fact that the diagonal of \({\hat{q}}^{2}\) is DBD.

Importantly, when mapping bosonic problems using compact encodings, it is sometimes the case that increasing the truncation value d is beneficial. For instance, suppose one knows one can safely truncate at d = 5 for a bosonic problem. When implementing \({\hat{q}}^{2}\), one would instead simply implement the operator for d = 8, as the number of gates decreases while the number of qubits remains the same. Note that this is not possible in spin-s particles, as it would cause leakage to unphysical states.

One of the more intriguing results is that the unary code is often inferior to the Gray or SB encodings. Pronounced examples of this inversion include d = 4, 7, 8 for \(\hat{q}\) and d = 4 and 8 for \({\hat{q}}^{2}\), among others. This is notable because, for these values of d, Trotterizing the operator requires both fewer qubit and fewer operations if Gray or SB is used. These results are in contrast to the naively expected trend that there would be a more consistent trade-off between qubit count and operation count. Results for single-particle spin-s operators \({\hat{S}}_{x}\) and \({\hat{S}}_{z}\), as well as interaction operator \(\hat S_z^{(i)} \hat S_z^{(j)}\) are plotted in Fig. 7. Unlike bosonic Hamiltonians, the d values are not simulation parameters but are determined by s in the system we wish to simulate. The trends in spin operators tend to be more unruly than those in the bosonic operators.

Fig. 7: CNOT gate counts of optimized Suzuki–Trotter circuits, for approximating the exponentials of the spin-s operators shown.
figure 7

There is no clean overall trend for \(\hat S_x\) and \(\hat S_z^{(i)} \hat S_z^{(j)}\) (except that Gray tends to out-perform SB), highlighting the need to study encodings thoroughly for each new use case. Notably, because \({\hat{S}}_{z}\) is diagonal binary-decomposable, for values of \(s=\frac{3}{2},\frac{7}{2}\) (which are 4- and 8-level system, respectively) the SB code requires both fewest operations and fewest qubits.

Analogous to the bosonic case, \({\hat{S}}_{z}\) is DBD and therefore SB requires only single-qubit gates when d = 4, 8 (s = \(\frac{3}{2}\),\(\frac{7}{2}\)) and no entangling gates. For these two values, SB uses both the fewest operations and the fewest qubits (fewer than unary). However, the Gray code is superior to SB for other values of s, the same behavior seen in the general DES matrix of Fig. 5. Because \({\hat{S}}_{z}\) is diagonal, the unary always requires just d single-qubit rotations and no entangling gates. The two-particle operator \(\hat S_z^{(i)} \hat S_z^{(j)}\) displays similar trends to \({\hat{S}}_{z}\).

As expected, the Gray code is usually superior to SB for the tridiagonal \({\hat{S}}_{x}\), because of the unity Hamming distance between nearest levels. Unary is inferior in both gate count and qubit count for most values, a result highlighted earlier in low-d bosonic operators. For both bosonic and spin-s operators, we have until now omitted discussion of the Gray-based block unary encoding with parameter g = 3. There is never a case where this BU mapping is the sole encoding with the lowest entangling gate count. However, at least in principle, there may be limited cases where a particular hardware budget (Fig. 1) dictates the need for a block unary encoding. A necessary condition for even considering the use of BU\({}_{g = 3}^{{\mathrm{Gray}}}\) is that its operation count is less than both compact codes, but more than unary. In such cases (\(\hat{q}\) and \({\hat{a}_i^\dagger} \hat{a}_{i+1}+{\rm{h}}.{\rm{c}}.\) for d = 9; \({\hat{n}}^{2}\)\(\sim {\hat{O}}^{2}\) in Fig. 5 for several values; \(\hat S_z^{(i)} \hat S_z^{(j)}\) and \({\hat{S}}_{z}\) for s = 2), there may be a particular hardware budget would require this encoding for its particular memory/operation trade-off. Such highly specific hardware budgets seem unlikely to often appear.

Note that the results herein should generally be considered constant-factor savings, because in most relevant systems d does not increase with system size, i.e., with the number of particles. For the simulation of scientifically relevant quantum systems, Hamiltonians are composed of more than one simple operator. For such situations, one may calculate the overall cost within a given encoding, as we do in Section Composite systems. As will be discussed in Section Conversions between encodings, it is often beneficial to Trotterize different parts of the Hamiltonian in different encodings, if the cost difference outweighs the overhead of conversion.

Conversions between encodings

It is often the case that different terms in a Hamiltonian are more efficiently simulated in different encodings. For example, in the Bose–Hubbard model, the number operator \(\hat{n}\) is usually more efficient in SB, while the hopping term \({\hat{b}}_{i}^{\dagger }{\hat{b}}_{i+1}+{\rm{h}}.{\rm{c}}.\) is usually more efficient in the Gray encoding (see Fig. 6). Here we show that the cost of converting from one encoding to another is often substantially less than the difference in resource efficiency between two encodings, which means that it can be advantageous to continually be compacting and uncompacting the data. For example, if unary is the most efficient for implementing an operator, one may wish to compact the data between operations to save memory resources, as shown in Fig. 8. In this section we give general quantum circuits and resource counts for converting between all encodings considered in this work.

Fig. 8: Schematic showing the utility of swapping between encodings.
figure 8

When extra memory resources are available and the unary code is the most efficient for implementing an operator, one may expand into the unary representation, perform the operation, and then compact the data back to SB or Gray. The example shown here is the bosonic interaction operator \({\hat{a}_i^\dagger} \hat{a}_{i+1}+{\rm{h}}.{\rm{c}}.\). This operator is present in bosonic Hamiltonians and for digital simulations of beamsplitters. For many values of d, implementing this operator in unary is much cheaper than implementing it in Gray or SB. When this strategy is worth the cost of conversion, the Hamiltonian simulation is in Scenario C discussed in Section Composite Systems. Hence each particle starts with \(\lceil {\mathrm{log}\,}_{2}d\rceil\) qubits, expands out to d qubits, and then compacts back. Whether this procedure leads to cost savings is heavily dependent on the problem and the parameters.

One can convert between the Gray and SB encodings by applying \((\lceil {\mathrm{log}\,}_{2}d\rceil -1)\) CNOTs in sequential order42 as shown in Fig. 9.

Fig. 9: Quantum circuit for SB to Gray conversion.
figure 9

The number of gates required is logarithmic in d.

Algorithm 1 An algorithm to build a quantum circuit for SB–unary conversion given an arbitrary truncation d. For multi-qubit gates, the first argument specifies the control qubit and the successive ones are targets.

The conversion between unary and SB is more complex. The conversion may be especially relevant in a future fault-tolerant quantum computing era, when extra quantum resources are available, because the unary encoding becomes more beneficial as d increases and because the conversion cost is significant. Inspired by previous work43, in Fig. 10 we show an example case for converting from SB to unary when d = 16. A state is initially encoded in SB using qubits on the left, and the memory space is enlarged to include the number of qubits needed for unary. No ancilla qubits are required. As quantum circuits are reversible, unary-to-SB conversion follows by inversion of the circuit. In Table 4, we provide the converter circuit resource count for a general d-level truncated quantum system, following Fig. 10. Resource counts assume a decomposition of CSWAP into Clifford+T gates44. A general algorithm to build the converter circuit can be seen in the Algorithm 1. and are, respectively, the ceiling and floor functions. The validity of the conversion procedure is most easily shown by tracing a single unary state through the reverse algorithm. When d is not a power of 2, modifications are needed. These modifications are already accounted for in Algorithm 1 and example circuits for d = 5 and 7 are given in the Supplementary Section 3.

Fig. 10: Circuit to convert the SB representation into the unary representation.
figure 10

Every CSWAP is accompanied by a CNOT. Modifications are required when d is not a power of two, as discussed in the main text and Supplementary Section 3.

Table 4 Resource counts for the SB to unary conversion, for arbitrary d.

For completeness, we constructed a circuit for converting between SB and block unary, for BU\({}_{g = 3}^{{\rm{SB}}}\), shown in the Supplementary Section 4. We do not further analyze BU conversions, as block unary is expected to have limited utility, and even then it will usually be the case that decoherence times are too low to allow for conversions (see Section Local operators).

Composite systems

Here we consider resource counts for simulating five physically and chemically relevant Hamiltonian systems. The Hamiltonians correspond to the shifted one-dimensional QHO, the Bose–Hubbard model9,45,46, multidimensional molecular Franck–Condon factors17,18,47, a spin-s transverse-field Heisenberg model11,40, and simulating Boson sampling48 on a digital quantum computer. The former four systems consist of an arbitrary number of d-level particles. For the Franck–Condon factors, the Duschinsky matrix is assumed to have a constant k = 4 non-zero entries per row. With the exception of the simple QHO, all of these problem classes would benefit from digital quantum simulation, because there are limits to the theoretical and practical questions that can be answered by classical computers. Supplementary Section 6 gives a more thorough overview of these problems. Assuming that d remains constant as the particle number increases, differences in resource counts between mappings are constant-factor savings that are independent of system size.

Using resource counts from the optimized circuits for Trotterizing individual operators and from the circuits for interencoding conversion, we calculated and compared the required two-qubit entangling gate counts for the selected composite Hamiltonians. We considered five encoding schemes: (i) SB-only, (ii) Gray-code-only, (iii) unary-only, (iv) allowing for conversion between SB and Gray, and (v) using all three while compacting to save memory. For (iv) and (v), the reported results include the cost of conversion. To the best of our knowledge, schemes (ii), (iv), and (v) are novel to digital quantum simulation. In Fig. 8, an example of encoding scheme (v) is shown. These encoding schemes do not directly correspond to the ‘scenarios’ discussed below; the scenarios denote the optimal encoding scheme under different hardware budgets.

The result for (iv), the encoding scheme that combines both the SB and Gray codes, is reported only when it represents an improvement over both SB-only and Gray-only. We give results for (v), which compacts and uncompacts the qubits for unary computations, only when the unary code was the most efficient of the first four encoding schemes. We give all results in terms of resource counts relative to the SB mapping, noting again that the relative resource requirements between encoding schemes are independent of system size (i.e., number of particles or modes).

For some local bosonic operators, the number of entangling gates is not a monotonically increasing function of d. Such operators include \(\hat{n}\), \({\hat{n}}^{2}\), and \({\hat{q}}^{2}\) (Figs. 5 and 6). Our numerics for the composite systems take this into account, increasing the cutoff d if it is beneficial. For instance, if d = 5 is a sufficient truncation and we implement \({\hat{q}}^{2}\), we use resource counts for d = 8, because this uses the same number of qubits but fewer operations (Fig. 6). This trick is not possible for the spin-s systems, where d is determined not by a sufficient truncation value but by the nature of the particle itself (its spin s).

A selection of resource comparisons is shown in Fig. 2. We show results from d = 4 and 10 because they highlight the variety of rankings that occur, and demonstrate that the best encoding scheme can be highly sensitive to d even within the same Hamiltonian class. Numerical results up to d = 16 are given in the Supplementary Section 7.

In terms of which encoding class should be used, the results can be categorized into four scenarios. Scenario A applies when using just one of the compact encodings (SB or Gray) is the best choice. Of the results shown in Fig. 2, the Bose–Hubbard and 1D QHO models for d = 4 fit this description. The optimal choice is to stay in one of the compact encodings for the entire calculation, while never using more than \({\mathrm{log}\,}_{2}d\) qubits per particle for the calculation.

Scenario B refers to Hamiltonians for which the optimal strategy is to use a compact amount of memory but to allow for conversion between Gray and SB. This includes the Heisenberg model for \(s=\frac{7}{2}\) and the Franck–Condon Hamiltonian for d = 4. Because the cost of conversion is very small, this scenario usually implies that at least one of the local operators in the Hamiltonian are optimal in Gray, at least one is optimal in SB, and none are optimal in unary. To take the Heisenberg model with \(s=\frac{7}{2}\) as an example, one can see this is the case by comparing \(\hat S_x\) and \(\hat S_z^{(i)} \hat S_z^{(j)}\) in Fig. 7.

Scenario C applies when unary is the superior encoding and it is still considered the best encoding even if one repeatedly unravels and compacts to preserve memory (as in Fig. 8). This latter trait is important because it means that, even including the substantial cost of SB-to-unary conversion, with a cost of ~9d entangling gates (Table 4), it is still better to convert back and forth between unary and SB/Gray. This is true even when memory constraints require that one stores the information compactly for most of the time. This occurs with d = 10 for the Bose–Hubbard, boson sampling, and Franck–Condon Hamiltonians, all of which are bosonic problems.

Scenario D refers to cases where unary is the superior encoding, assuming that the information is not compacted back to SB in order to save memory. This scenario implies that, if one has the qubit space to stay in unary form for the entire calculation, unary is optimal. If one does not have the memory resources for this, it is best to simply perform all operations in Gray and/or SB. The reason for this discrepancy is that the cost of converting binary to unary is substantial, as mentioned above. This scenario applies to the 1D QHO for d = 10 as well as the Heisenberg model for s = 2.

Results for d values up to 16 are given in the Supplementary Section 7. Note that the novel memory-efficient schemes (ii), (iv), and (v) did not lead to improvements in every case; in a minority of the problem instances we considered, the optimal encoding scheme was to use only SB or only unary. By memory-efficient we mean that the scheme stores the encoded subsystems in \(\lceil {\mathrm{log}\,}_{2}d\rceil\) qubits, with the possible exception of when they are being operated on. Comparing to the memory-inefficient unary-only scheme, our novel approaches reduced two-qubit entangling gate counts by up to 33%. Compared to the memory-efficient SB-only scheme, we observed gate count reductions of up to 49%. The latter case is more relevant when qubit count is a substantial constraint. These savings are especially important for running algorithms on near-term hardware, since by simply modifying the encoding procedure one can substantially decrease the effective circuit depth.


After introducing a general framework for encoding d-level systems to multi-qubit operators, we have analyzed the utility and trade-offs of several integer-to-bit encodings for qubit-based Hamiltonian simulation. The mappings may be used for Hamiltonians built from subsystems of bosons, spin-s particles, molecular electronic energy levels, molecular vibrational modes, or other d-level subsystems.

We analyzed the mappings primarily in terms of qubit counts and the number of entangling operations required to estimate the exponential of an operator.

Of the Gray and SB codes, we demonstrated that the Gray code tends to be more efficient for tridiagonal matrix operators, while SB tends to be superior for a common class of diagonal matrix operator. Importantly, we show that converting between encodings within a Suzuki–Trotter step often leads to savings. Notably, although the unary code tends to require more qubits but fewer operations, it is often the case that the SB or Gray code is more efficient both in terms of qubit counts and operation counts. To the best of our knowledge, the Gray code had not been previously used in Hamiltonian simulation.

We compared resource requirements between encodings for the following composite Hamiltonians: the Bose–Hubbard model, one-dimensional quantum harmonic oscillator, vibronic molecular Hamiltonian (i.e., Franck–Condon factors), spin-s Heisenberg model, and boson sampling. The optimal encoding, and whether it was beneficial to interconvert between encodings, was heavily dependent both on the Hamiltonian class and on the truncation level d for the particle. We placed optimal encoding strategies into four different "scenarios,” each of which points to a different optimal encoding and simulation strategy. The simulation scenario depends on which encodings require the fewest operations, on whether interconverting between mappings is worth the additional cost, and on qubit memory constraints. The many anomalies in our results highlight the need to perform an analysis of each new class of Hamiltonian simulation problem, determining numerically which simulation strategy is optimal before performing a simulation on real hardware.

There are several directions open for future research. First, there are ways to analyze resource requirements other than enumerating the entangling operations. For long-term error-corrected hardware, estimating T gate count may be most relevant49. Additionally, we assumed all-to-all connectivity in this work, which tends to be a feature of ion trap quantum computers50. But other quantum hardware types require one to consider the topology of the qubit connections and implementation of SWAP gates51, a consideration that would modify the resource counts and may modify some trends observed here.

We envision that the methodology and results of this work will be helpful for both theorists and experimentalists in designing resource efficient approaches to quantum simulation of a broader set of physically and chemically relevant Hamiltonians.