Abstract
The experimental realization of increasingly complex synthetic quantum systems calls for the development of general theoretical methods to validate and fully exploit quantum resources. Quantum state tomography (QST) aims to reconstruct the full quantum state from simple measurements, and therefore provides a key tool for obtaining reliable analytics^{1,2,3}. However, exact brute-force approaches to QST place a high demand on computational resources, making them unfeasible for anything except small systems^{4,5}. Here we show how machine learning techniques can be used to perform QST of highly entangled states with more than a hundred qubits, to a high degree of accuracy. We demonstrate that machine learning allows one to reconstruct traditionally challenging many-body quantities—such as the entanglement entropy—from simple, experimentally accessible measurements. This approach can benefit existing and future generations of devices, ranging from quantum computers to ultracold-atom quantum simulators^{6,7,8}.
Main
Machine learning methods have been demonstrated to be particularly powerful at compressing high-dimensional data into low-dimensional representations^{9,10}. Largely developed in the domain of data science, these techniques have recently been used to address fundamental questions in the physical sciences. Applications to quantum many-body systems have been put forward in the last year, for example, to classify phases of matter^{11,12,13} and to simulate quantum systems^{14}.
QST is itself a data-driven problem, in which we aim to obtain a complete quantum-mechanical description of a system on the basis of a limited set of experimentally accessible measurements. While compressed sensing approaches^{15} reduce the experimental burden of full QST, large systems can be studied only through techniques requiring a feasible number of measurements. For example, permutationally invariant tomography^{16} makes efficient use of the symmetries of prototypical quantum optics states, and can be amenable to a large number of qubits. However, the general case of many-body systems is challenging for QST. In this context, matrix product states are the state-of-the-art tool for QST of low-entangled states^{17,18}. For highly entangled quantum states, resulting either from deep quantum circuits or from high-dimensional physical systems, alternative representations are required for QST.
Here, we show how machine learning approaches can be used to find such representations. In particular, we argue that suitably trained artificial neural networks offer a natural and general way of performing QST driven by a limited amount of experimental data. Our approach is demonstrated on controlled artificial data sets, comprising measurements from several prototypical quantum states with a large number of degrees of freedom (qubits, spins and so on), which are thus hard for traditional QST approaches.
We consider here the goal of reconstructing a generic many-body target wavefunction \({\rm{\Psi}}({\bf{x}})\equiv \left\langle {\bf{x}} | {\rm{\Psi}}\right\rangle\), where x is some reference basis (for example, σ^{z} for spin-1/2). To act as the model, we use a representation of the many-body state in terms of artificial neural networks^{14}:
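Written out explicitly, and consistent with the amplitude and phase networks defined in the following paragraph, equation (1) reads:

```latex
\psi_{\lambda,\mu}(\mathbf{x}) = \sqrt{\frac{p_{\lambda}(\mathbf{x})}{Z_{\lambda}}}\; e^{\, i\phi_{\mu}(\mathbf{x})/2} \tag{1}
```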
where the networks p_{λ}(x) and ϕ_{µ}(x) represent, respectively, the amplitude and the phase of the state, and Z_{λ} is the normalization constant. The neural-network architecture we use in this work is based on the restricted Boltzmann machine (RBM). This architecture features a visible layer (describing the physical qubits) and a hidden layer of binary neurons, fully connected with weighted edges to the visible layer (see Methods). RBM states offer a compact variational representation of many-body quantum states, capable of sustaining non-trivial correlations, such as high entanglement or topological features^{19,20,21,22,23,24}. Specifically, we take p_{λ} to be an RBM with parameters λ, and a separate RBM network, p_{µ} with parameters µ, to model the phase, ϕ_{µ} = log p_{µ}(x). Our machine learning approach to QST is then carried out as follows. First, the RBM is trained on a data set consisting of a series of independent density measurements \({\left|{\rm{\Psi}}\left({\bf{x}}^{[b]}\right)\right|}^{2}\) realized in a collection of bases {x^{[b]}} of the N-body quantum system. During this stage, the network parameters (λ, µ) are optimized to maximize the likelihood of the data set, such that \({\left|{\psi}_{\lambda,\mu}\left({\bf{x}}^{[b]}\right)\right|}^{2}\simeq {\left|{\rm{\Psi}}\left({\bf{x}}^{[b]}\right)\right|}^{2}\) (see Methods). Once trained, ψ_{λ,μ}(x) approximates both the wavefunction’s amplitudes and phases, thus reconstructing the target state. The accuracy of the reconstruction can be systematically improved by increasing the number of hidden neurons M in the RBM for fixed N, or equivalently the density of hidden units α = M/N (refs ^{14,25}). One key feature of our QST approach is that it needs only raw data (that is, many experimental snapshots coming from single measurements), rather than estimates of expectation values of operators^{1,4,16,17,18}.
In particular, this setup circumvents the need to suppress the intrinsic Gaussian noise that affects estimates of the mean values of operators.
To demonstrate this approach, we start by considering QST of the W state, a paradigmatic N-qubit multipartite entangled wavefunction defined as
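In its standard form, an equal superposition of the N single-excitation basis states, equation (2) is:

```latex
\left|\Psi_{W}\right\rangle = \frac{1}{\sqrt{N}}\left( \left|100\ldots0\right\rangle + \left|010\ldots0\right\rangle + \cdots + \left|000\ldots1\right\rangle \right) \tag{2}
```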
To mimic experiments, we generate several data sets with an increasing number of synthetic density measurements obtained by sampling from the W state in the σ^{z} basis. These measurements are used to train an RBM model featuring only the set of parameters λ, since the target \(\left|{\rm{\Psi}}_{W}\right\rangle\) is real and positive in this basis. After the training, we sample from \({\left|{\psi}_{\lambda}\left({\mathbf{\sigma}}^{z}\right)\right|}^{2}\) and build a histogram of the frequency of the N components \(\left(\left|100\ldots\right\rangle, \left|010\ldots\right\rangle, \ldots\right)\). In Fig. 1a we show three histograms obtained with a different number of samples in the training data set for N = 20 and α = 1. From the histograms, we see that the N components converge to equal frequency in the limit of a large number of samples, as expected for the W state. To better quantify the quality of this reconstruction, we compute the overlap \({O}_{W}=\left|\left\langle {{\rm{\Psi}}}_{W} | {\psi}_{\lambda}\right\rangle\right|\) of the RBM wavefunction with the original W state. In Fig. 1b, O_{ W } is shown as a function of the number of samples in the training data sets for three different values of N. For a system size substantially larger than what is currently available in experiments^{26}, an overlap O_{ W } ~ 1 can be achieved with a moderate number of samples. As a comparison, for N = 8, a brute-force QST requires almost 10^{6} measurements^{4}. Our RBM achieves similar accuracy in reconstructing the wavefunction with only about 100 N-bit measurements, a number comparable to other state-of-the-art QST approaches^{15,16,17}. To examine a more challenging case for QST, one can augment the W state with a local phase shift \(\exp\left(i\theta\left({\sigma}_{k}^{z}\right)/2\right)\), with a random phase \(\theta\left({\sigma}_{k}^{z}\right)\) applied to each qubit.
QST must now be carried out using the full RBM wavefunction of equation (1), trained on 2(N − 1) additional bases. In Fig. 1 we plot a comparison between the exact phases (Fig. 1c) and the phases learned by the RBM (Fig. 1d) for N = 20 qubits, showing very good agreement (O_{ W } = 0.997). We expect our approach to perform equally well for other paradigmatic quantum optics states. In the Supplementary Information we provide more details, including an examination of the effects of varying α on QST of the W state, a discussion of overfitting, and a demonstration that RBMs can compactly encode (that is, with a polynomial number of hidden units) the Greenberger–Horne–Zeilinger and Dicke states.
We now turn to the case of more complex systems, and demonstrate QST for two interacting many-body problems that are directly relevant for quantum simulators based on ultracold ions and atoms. To mimic such experimental scenarios, we generate artificial data sets by sampling different quantum states of two lattice spin models: the transverse-field Ising model (TFIM), with Hamiltonian
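With J_{ij} the spin couplings and h the transverse field (sign conventions vary between references), equation (3) takes the form:

```latex
\mathcal{H}_{\mathrm{TFIM}} = -\sum_{ij} J_{ij}\, \sigma^{z}_{i}\sigma^{z}_{j} - h\sum_{i} \sigma^{x}_{i} \tag{3}
```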
and the XXZ spin-1/2 model, with Hamiltonian
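In one standard convention, with the anisotropy Δ along the z axis, equation (4) reads:

```latex
\mathcal{H}_{\mathrm{XXZ}} = \sum_{ij} J_{ij}\left( \sigma^{x}_{i}\sigma^{x}_{j} + \sigma^{y}_{i}\sigma^{y}_{j} + \Delta\, \sigma^{z}_{i}\sigma^{z}_{j} \right) \tag{4}
```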
where the σ_{ i } are Pauli spin operators.
First, we consider ground-state wavefunctions. Using quantum Monte Carlo (QMC) methods, we synthesize artificial data sets by sampling the exact ground states of equations (3) and (4) for different values of the coupling parameters h and Δ, and for nearest-neighbour interactions J_{ ij } = J, in both one and two spatial dimensions. The quality of the learned wavefunctions is tested by computing various observables using the RBM, and comparing them with the exact values known from the QMC simulations. For the two-dimensional (2D) TFIM, Fig. 2a illustrates how the RBMs can reproduce the average values of both diagonal and off-diagonal observables to high accuracy for N ≳ 100 spins. For the 2D XXZ model, Fig. 2b illustrates the expectation values of the diagonal \({\sigma }_{\rm{a}}^{z}{\sigma }_{\rm{b}}^{z}\) and off-diagonal \({\sigma }_{\rm{a}}^{x}{\sigma }_{\rm{b}}^{x}\) spin correlations, with a and b being neighbours along the lattice diagonal. In addition, we consider the full spin–spin \({\sigma }_{i}^{z}{\sigma }_{j}^{z}\) correlation function for the 1D TFIM, which involves non-local correlations. Figure 2d shows that the reconstructed RBM correlation function closely matches the exact result (obtained via QMC measurements in Fig. 2c). Here, deviations between the RBM and QMC are compatible with the statistical uncertainty due to the finiteness of the training set.
To go beyond the case of ground-state wavefunctions, we also consider states originating from dynamics under unitary evolution. We focus on a case of ‘quench’ dynamics that is realizable in experiments with ultracold ions^{27}. Specifically, we study 1D Ising spins initially prepared in the state \(\left|{{\rm{\Psi}}}_{0}\right\rangle =\left|\rightarrow\,\rightarrow\,\ldots\,\rightarrow\right\rangle\) (fully polarized in the σ^{x} basis), subject to unitary dynamics enforced by the Hamiltonian in equation (3) with long-range interactions \({J}_{ij}\propto 1/{\left|i-j\right|}^{\gamma}\) and the magnetic field set to zero (h = 0). For a given time t, we perform QST on the state \(\left|{\rm{\Psi}}(t)\right\rangle =\exp(-i{\mathscr{H}}t)\left|{{\rm{\Psi}}}_{0}\right\rangle\) by training the RBM on spin density measurements performed in 2N + 1 different bases. In Fig. 2e, we show the overlap between the RBM wavefunction ψ_{λ,μ}(σ) and the time-evolved state Ψ(σ; t) for different system sizes N, as a function of the number N_{S} of samples per basis. In the lower plots, we show for N = 12 the exact (Fig. 2f) and the reconstructed phases (Fig. 2g).
For both ground and dynamically evolved states, these results indicate that our neural-network QST is able to obtain high-quality results with a moderate number of measurements, which is important for ultracold atoms and similar systems where state preparation is costly.
Finally, we turn to the important and highly non-local quantity that is perhaps the most challenging for direct experimental observation^{28}: the entanglement entropy. Consider a bipartition of the physical system into a region A and its complement. The second Renyi entropy is defined as \({S}_{2}\left({\rho}_{{\rm{A}}}\right)=-{\rm{log}}\left({\rm{Tr}}\left({\rho}_{{\rm{A}}}^{2}\right)\right)\), with the reduced density matrix ρ_{A} describing the subsystem A. We estimate S_{2} by sampling the ‘swap’ operator^{29} using the wavefunction generated by the RBM. In Fig. 3 we show the entanglement entropy for the 1D TFIM at three values of the transverse field, and for the critical (Δ = 1) 1D XXZ model. In both instances, we consider a chain of N = 20 spins and plot the entanglement entropy as a function of the subsystem size \(\ell \in [1,N/2]\). From this, we see that the values generated from the RBM agree with the exact entanglement entropy to within statistical errors. Using our approach, an estimate of the entanglement entropy from experimental data can then be obtained using only simple density measurements, currently accessible with cold atoms^{30}.
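As an illustration of the swap-operator estimator, the following is a minimal sketch that uses the exact amplitudes of a 4-qubit W state in place of an RBM; the estimator only needs the ability to evaluate amplitudes and to sample configurations from |ψ|², and all sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, NA = 4, 2                        # 4 qubits; region A = 2 of them

# Exact sigma^z amplitudes of the 4-qubit W state (real and positive)
psi = np.zeros(2**N)
for k in range(N):
    psi[1 << k] = 0.5               # the N one-hot basis states

# Draw two independent 'replicas' of configurations from |psi|^2
p = psi**2
s1 = rng.choice(2**N, size=20000, p=p)
s2 = rng.choice(2**N, size=20000, p=p)

# Exchange the region-A bits between the two replicas
maskA = ((1 << NA) - 1) << (N - NA)
t1 = (s1 & ~maskA) | (s2 & maskA)
t2 = (s2 & ~maskA) | (s1 & maskA)

# <Swap_A> = E[ psi(t1) psi(t2) / (psi(s1) psi(s2)) ] = Tr(rho_A^2)
ratio = psi[t1] * psi[t2] / (psi[s1] * psi[s2])
S2 = -np.log(ratio.mean())          # second Renyi entropy of region A
```

For this bipartition the exact value is Tr ρ_A² = 1/2, so the estimate converges to S₂ = log 2.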
Owing to their power, flexibility and ease of use, unsupervised machine learning approaches such as those developed in this paper can readily be adapted to reconstruct complicated many-body quantum states from a limited number of experimental measurements. Our results suggest that RBM approaches will perform well on physically relevant many-body and quantum optics states, whereas poorer performance is expected for structureless, random states (as studied in the Supplementary Information). Feasible applications range from validating quantum computers and adiabatic simulators^{31} to reconstructing quantities that are challenging to observe directly in experiments. In particular, we predict that the use of our machine learning approach in bosonic ultracold-atom experiments will allow the determination of the entanglement entropy in systems substantially larger than those currently accessible with quantum interference techniques^{28}.
Methods
Experimental measurements and Kullback–Leibler divergences
We provide here a detailed description of the different steps required to perform quantum state tomography (QST) with neural networks for many-body quantum systems. We concentrate on the case of systems with two local degrees of freedom (spin-1/2, qubits and so on) and choose σ ≡ σ^{z} as the reference basis for the N-body wavefunction \({\rm{\Psi}}({\mathbf{\sigma}})\equiv \left\langle {\mathbf{\sigma}} | {\rm{\Psi}}\right\rangle\) we intend to reconstruct. This high-dimensional function can be approximated with an artificial neural network (NN). Given a set of input variables (for example σ = σ_{1}, σ_{2}, …, σ_{ N }), a NN is a highly non-linear function whose output is determined by some internal parameters κ. The architecture of the network consists of a collection of elementary units, called neurons, connected by weighted edges. The strength of these connections, specified by the parameters κ, encodes conditional dependences among neurons, in turn leading to complex correlations among the input variables. Increasing the number of auxiliary neurons systematically improves the expressive power of the NN function, which can then be used as a general-purpose approximator for the target wavefunction^{14}. The goal of our tomography scheme is to find the best NN approximation of the many-body wavefunction, ψ_{κ}(σ), using only numerical data obtained through some outside means (such as simulation or experiment).
Our scheme proceeds as follows. First, we assume that a set of experimental measurements in a collection of bases b = 0, 1, 2, …, N_{ B } is available. These measurements are distributed according to the probabilities \({P}_{b}\left({{\mathbf{\sigma}}}^{[b]}\right)\propto {\left|{\rm{\Psi}}\left({{\mathbf{\sigma}}}^{[b]}\right)\right|}^{2}\), and thus contain information about both the amplitudes and the phases of the wavefunction in the reference basis σ. The goal of the NN training is to find the optimal set of parameters κ such that ψ_{κ}(σ) mimics the data distribution in each basis as closely as possible; that is, \({\left|{\psi}_{\kappa}\left({{\mathbf{\sigma}}}^{[b]}\right)\right|}^{2}\simeq {P}_{b}\left({{\mathbf{\sigma}}}^{[b]}\right)\). This is achieved by searching for the NN parameters that minimize the total statistical divergence Ξ(κ) between the target distributions and the reconstructed ones. Several choices are possible for Ξ(κ). Here, we define it as the sum of the Kullback–Leibler (KL) divergences in each basis:
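Explicitly, summing the KL divergence between the measured distribution P_b and the model distribution |ψ_κ|² over all bases:

```latex
\Xi(\boldsymbol{\kappa}) = \sum_{b} \sum_{\boldsymbol{\sigma}^{[b]}} P_{b}\!\left(\boldsymbol{\sigma}^{[b]}\right) \log \frac{P_{b}\!\left(\boldsymbol{\sigma}^{[b]}\right)}{\left| \psi_{\boldsymbol{\kappa}}\!\left(\boldsymbol{\sigma}^{[b]}\right) \right|^{2}} \tag{5}
```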
The total divergence Ξ(κ) is positive definite, and attains its minimum value of 0 when the reconstruction is perfect in each basis: \({\left|{\psi}_{\kappa}\left({{\mathbf{\sigma}}}^{[b]}\right)\right|}^{2}={P}_{b}\left({{\mathbf{\sigma}}}^{[b]}\right)\). Depending on the target wavefunction, a sufficiently large set of measurement bases must be included in order to have enough information to estimate the phases in the reference basis. In practice, for most states of interest it is enough to include a number of bases that scales only polynomially with the system size.
Once the training is complete, the NN provides a compact representation ψ_{κ}(σ) of the target wavefunction Ψ(σ). In turn, this representation can be used to efficiently compute various observables of interest, overlaps with other known quantum states and other information not directly accessible in the experiment. In the next two subsections, we describe in detail the specific parametrization of the NN wavefunction adopted in this work and its optimization.
The RBM wavefunction
There are many possible architectures and NNs that can be employed to represent a quantum many-body state. Following ref. ^{14}, we employ a powerful stochastic NN called a restricted Boltzmann machine (RBM). The network architecture of an RBM features two layers of stochastic binary neurons: a visible layer σ describing the physical variables, and a hidden layer h. The expressive power of the model can be characterized by the ratio α = M/N between the number of hidden neurons M and visible neurons N. An RBM is also an energy-based model, sharing many properties with physical models in statistical mechanics. In particular, it associates with the graph structure a probability distribution given by the Boltzmann distribution
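With the energy function built from the weights and biases described in the following sentence, the (unnormalized) Boltzmann distribution is:

```latex
p_{\boldsymbol{\kappa}}(\boldsymbol{\sigma}, \mathbf{h}) = e^{-E_{\boldsymbol{\kappa}}(\boldsymbol{\sigma}, \mathbf{h})}, \qquad E_{\boldsymbol{\kappa}}(\boldsymbol{\sigma}, \mathbf{h}) = -\sum_{ij} W^{\boldsymbol{\kappa}}_{ij} h_{i} \sigma_{j} - \sum_{j} b^{\boldsymbol{\kappa}}_{j} \sigma_{j} - \sum_{i} c^{\boldsymbol{\kappa}}_{i} h_{i} \tag{6}
```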
where we omitted the normalization, and κ now consists of the weights W^{κ} connecting the two layers and the fields (biases) b^{κ} and c^{κ} coupled to each visible and hidden neuron, respectively. The distribution of interest, over the visible layer, is obtained by marginalization over the hidden degrees of freedom
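For ±1-valued hidden units the sum over h can be carried out analytically (for 0/1-valued units each factor 2cosh(·) becomes 1 + e^{(·)}):

```latex
p_{\boldsymbol{\kappa}}(\boldsymbol{\sigma}) = \sum_{\mathbf{h}} p_{\boldsymbol{\kappa}}(\boldsymbol{\sigma}, \mathbf{h}) = e^{\sum_{j} b^{\boldsymbol{\kappa}}_{j}\sigma_{j}} \prod_{i=1}^{M} 2\cosh\!\left( c^{\boldsymbol{\kappa}}_{i} + \sum_{j} W^{\boldsymbol{\kappa}}_{ij}\sigma_{j} \right) \tag{7}
```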
The RBM wavefunction is then defined as
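Consistent with the definitions given just below (amplitude network p_λ, phase ϕ_μ = log p_μ, normalization Z_λ), equation (8) is:

```latex
\psi_{\lambda,\mu}(\boldsymbol{\sigma}) = \sqrt{\frac{p_{\lambda}(\boldsymbol{\sigma})}{Z_{\lambda}}}\; e^{\, i\phi_{\mu}(\boldsymbol{\sigma})/2} \tag{8}
```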
where \({Z}_{\lambda}={\sum}_{{\mathbf{\sigma}}}{p}_{\lambda}({\mathbf{\sigma}})\) is the normalization constant, ϕ_{μ}(σ) = log p_{μ}(σ), and λ and μ are the two sets of parameters. Note that sampling configurations σ from \({\left|{\psi}_{\lambda,\mu}({\mathbf{\sigma}})\right|}^{2}\) involves only the amplitude distribution p_{λ}(σ)/Z_{λ}. This can be achieved, as is usual for RBMs, by performing block Gibbs sampling with the two conditional distributions \({p}_{\lambda}({\mathbf{\sigma}}\,|\,{\mathbf{h}})\) and \({p}_{\lambda}({\mathbf{h}}\,|\,{\mathbf{\sigma}})\), which can be computed exactly. This procedure is very efficient since each neuron in one layer of the RBM is connected only to neurons of the other layer, enabling us to sample all units in one layer simultaneously.
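As a concrete illustration, here is a minimal NumPy sketch of block Gibbs sampling for an RBM; the 0/1 unit convention and the random placeholder weights are illustrative assumptions (physical spins are often mapped to ±1, and a real run would use trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 6, 6                        # visible (qubit) and hidden neurons
W = rng.normal(0, 0.1, (M, N))     # placeholder weights (untrained)
b = np.zeros(N)                    # visible biases
c = np.zeros(M)                    # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def block_gibbs(sigma, steps=10):
    """Alternate between the exact conditionals p(h|sigma) and p(sigma|h).

    Each layer is sampled in a single vectorized step, which is possible
    because an RBM has no intra-layer connections.
    """
    for _ in range(steps):
        h = (rng.random(M) < sigmoid(c + W @ sigma)).astype(float)
        sigma = (rng.random(N) < sigmoid(b + W.T @ h)).astype(float)
    return sigma

sample = block_gibbs(rng.integers(0, 2, N).astype(float))
```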
Gradients of the total divergence
The first step in the training of the RBM is to build the data set of measurements. In general, different bases are needed to estimate both the amplitudes and the phases of the target state Ψ(σ). We define a series of data sets D_{ b } for each basis b = 1, …, N_{ B } − 1, with each data set \({D}_{b}={\left\{{{\mathbf{\sigma}}}_{i}^{[b]}\right\}}_{i=1}^{\left|{D}_{b}\right|}\) consisting of \(\left|{D}_{b}\right|\) density measurements with underlying distribution \({P}_{b}\left({{\mathbf{\sigma}}}^{[b]}\right)\propto {\left|{\rm{\Psi}}\left({{\mathbf{\sigma}}}^{[b]}\right)\right|}^{2}\), where \({{\mathbf{\sigma}}}^{[b]}=\left({\sigma}_{1}^{[b]},\ldots, {\sigma}_{N}^{[b]}\right)\) and σ^{[0]} = σ. The quantity to minimize, also called the negative log-likelihood, is then
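Replacing each KL term by its empirical estimate over the measured samples gives, up to an additive constant, equation (9):

```latex
\Xi(\boldsymbol{\lambda}, \boldsymbol{\mu}) \simeq -\sum_{b} \frac{1}{\left|D_{b}\right|} \sum_{\boldsymbol{\sigma}^{[b]} \in D_{b}} \log \left| \psi_{\lambda,\mu}\!\left(\boldsymbol{\sigma}^{[b]}\right) \right|^{2} \tag{9}
```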
where we omitted a constant term given by the sum of the cross-entropies of the data sets, \({\sum}_{b}{\mathbb{H}}({D}_{b})\). The NN wavefunction in the σ^{[b]} basis is simply obtained by
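Using the notation of the following sentence for the basis transformation U_b, equation (10) is:

```latex
\psi_{\lambda,\mu}\!\left(\boldsymbol{\sigma}^{[b]}\right) = \sum_{\boldsymbol{\sigma}} U_{b}\!\left(\boldsymbol{\sigma}, \boldsymbol{\sigma}^{[b]}\right) \psi_{\lambda,\mu}(\boldsymbol{\sigma}) \tag{10}
```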
with U_{ b }(σ, σ^{[b]}) being the basis transformation matrix. The rotated state, ψ_{λ,μ}(σ^{[b]}), can be computed efficiently, provided that U_{ b } acts non-trivially only on a limited number of qubits.
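As a toy illustration of such a local rotation (the Bell state and the Hadamard rotation here are illustrative, not from the paper's data sets), note that each rotated amplitude mixes only two reference amplitudes, because U acts non-trivially on a single qubit:

```python
import numpy as np

# Reference-basis (sigma^z) amplitudes of a two-qubit Bell state
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # Hadamard: z -> x basis
U = np.kron(H, np.eye(2))          # rotate only the first qubit

# Rotated amplitudes: each entry is a sum of just two reference amplitudes
psi_b = U @ psi
```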
We proceed now to give the expressions for the various gradients needed in the training. By plugging equation (8) into equation (9), we obtain
We now define the gradients \({{\mathscr{D}}}_{\kappa}({\mathbf{\sigma}})={{\nabla}}_{\kappa}\,{\rm{log}}\,{p}_{\kappa}({\mathbf{\sigma}})\), with κ = λ, μ, and the quasi-probability distribution
Then, the derivatives of the KL divergence with respect to the parameters λ and µ are
and
In the expressions above, we have defined the pseudo-averages:
which can be efficiently computed by summing directly over the samples in the data sets D_{ b }. On the other hand, the evaluation of the average
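The average in question is taken over the normalized RBM amplitude distribution:

```latex
\left\langle \mathcal{D}_{\lambda} \right\rangle_{p_{\lambda}} = \frac{1}{Z_{\lambda}} \sum_{\boldsymbol{\sigma}} p_{\lambda}(\boldsymbol{\sigma})\, \mathcal{D}_{\lambda}(\boldsymbol{\sigma})
```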
requires knowledge of the normalization constant Z_{λ}, which is not directly accessible. However, as in standard RBM training^{32}, one can approximate this average by
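namely, by an empirical average over N_S Monte Carlo samples:

```latex
\left\langle \mathcal{D}_{\lambda} \right\rangle_{p_{\lambda}} \approx \frac{1}{N_{S}} \sum_{k=1}^{N_{S}} \mathcal{D}_{\lambda}(\boldsymbol{\sigma}_{k})
```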
where σ_{ k } are samples generated using a Markov chain Monte Carlo simulation.
Finally, we point out that in our work we have adopted a slightly simplified training scheme, broken down into two steps. First, we learn the amplitudes only, by optimizing the parameters λ. In this case, it is sufficient to minimize the KL divergence over the reference basis alone (that is, σ). This part of the training is essentially a standard unsupervised learning procedure, involving the generation of samples from the RBM^{33}. Then, we fix the parameters λ, and use the measurements in the auxiliary bases to determine the optimal values of the phase parameters μ. This second part of the training uses the gradient in equation (14), and thus does not require Monte Carlo sampling from the NN.
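A minimal sketch of the first (amplitude) step, using one-step contrastive divergence (CD-1) on synthetic σ^z samples whose distribution matches that of a 3-qubit W state; the architecture, learning rate and 0/1 spin convention are illustrative assumptions rather than the paper's actual settings:

```python
import numpy as np

rng = np.random.default_rng(2)

N, M = 3, 6                         # visible and hidden units (illustrative)
# Synthetic sigma^z measurements: one-hot strings, i.e. samples of |psi_W|^2
data = np.eye(N)[rng.integers(0, N, 5000)]

W = rng.normal(0, 0.01, (M, N))     # weights, small random initialization
b = np.zeros(N)                     # visible biases
c = np.zeros(M)                     # hidden biases
sig = lambda x: 1.0 / (1.0 + np.exp(-x))

for step in range(5000):
    v0 = data[rng.choice(len(data), 64)]          # minibatch of measurements
    ph0 = sig(v0 @ W.T + c)                       # p(h=1 | v) on the data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    v1 = (rng.random(v0.shape) < sig(h0 @ W + b)).astype(float)  # one Gibbs step
    ph1 = sig(v1 @ W.T + c)
    # CD-1 updates: data-driven term minus reconstruction term
    lr = 0.05
    W += lr * (ph0.T @ v0 - ph1.T @ v1) / 64
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
```

After training, the learned marginal p_λ should concentrate most of its probability mass on the three one-hot configurations present in the data.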
Training the neural network
For a given set of parameters (that is, μ), the easiest way to numerically minimize the total divergence, equation (9), is to use simple stochastic gradient descent^{33}. Each parameter μ_{ j } is updated as
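The standard stochastic-gradient update rule is:

```latex
\mu_{j} \leftarrow \mu_{j} - \eta\, g_{j}
```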
where the gradient step η is called the learning rate and the gradient g_{ j } is averaged over a batch B (\(\left|B\right| \ll \left|D\right|\)) of samples drawn randomly from the full data set:
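That is, with Ξ(σ) denoting the per-sample contribution to the total divergence,

```latex
g_{j} = \left\langle \frac{\partial \Xi}{\partial \mu_{j}} \right\rangle_{B} = \frac{1}{\left|B\right|} \sum_{\boldsymbol{\sigma} \in B} \frac{\partial\, \Xi(\boldsymbol{\sigma})}{\partial \mu_{j}}
```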
Stochastic gradient descent is the optimization method used to learn the amplitudes of each physical system presented in the paper. However, for learning the phases, we instead implement the natural gradient descent method^{34}, which we find to be more effective, although at the cost of increased computational resources. In this case, we update the parameters as
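The natural-gradient update preconditions the gradient with the inverse of the batch-averaged Fisher information matrix:

```latex
\mu_{j} \leftarrow \mu_{j} - \eta \sum_{i} \left( \left\langle S \right\rangle_{B}^{-1} \right)_{ji} g_{i}
```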
where we have introduced the Fisher information matrix:
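In terms of the log-derivatives 𝒟 defined earlier, the Fisher information matrix is the covariance:

```latex
S_{ij} = \left\langle \mathcal{D}_{\mu_{i}} \mathcal{D}_{\mu_{j}} \right\rangle - \left\langle \mathcal{D}_{\mu_{i}} \right\rangle \left\langle \mathcal{D}_{\mu_{j}} \right\rangle
```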
The learning rate magnitude η is set to
with some initial learning rate η_{0}. The matrix \({\left\langle {S}_{ij}\right\rangle }_{B}\) takes into account the fact that, since the parametric dependence of the RBM function is non-linear, a small change in some parameters may correspond to a very large change in the distribution. In this way, one implicitly uses an adaptive learning rate for each parameter μ_{ j }, speeding up the optimization compared with the simplest gradient descent. We note that a very similar technique is successfully used in quantum Monte Carlo to optimize high-dimensional variational wavefunctions^{35,36}. There too, the gradients are noisy, as they come from the Monte Carlo statistical evaluation of the energy derivatives with respect to the parameters, while the matrix S is given by the covariance matrix of these forces. Since the matrix \({\left\langle {S}_{ij}\right\rangle }_{B}\) is affected by statistical noise, we regularize it by adding a small diagonal offset, thus improving the stability of the optimization.
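A minimal sketch of the regularized natural-gradient step; the sampled log-derivatives here are random stand-ins for the actual 𝒟_j, and the offset value is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

D = rng.normal(size=(500, 4))       # per-sample log-derivatives D_j(sigma) (stand-ins)
g = D.mean(axis=0)                  # noisy gradient estimate over the batch
S = np.cov(D, rowvar=False)         # Fisher matrix estimate (covariance of D)
S_reg = S + 1e-4 * np.eye(4)        # small diagonal offset for numerical stability
delta = np.linalg.solve(S_reg, g)   # natural-gradient direction S^{-1} g
```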
Data availability
The data that support the plots within this paper and other findings of this study are available from the corresponding author upon reasonable request.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Vogel, K. & Risken, H. Determination of quasiprobability distributions in terms of probability distributions for the rotated quadrature phase. Phys. Rev. A 40, 2847 (1989).
 2.
Leonhardt, U. Quantumstate tomography and discrete Wigner function. Phys. Rev. Lett. 74, 4101–4105 (1995).
 3.
White, A. G., James, D. F. V., Eberhard, P. H. & Kwiat, P. G. Nonmaximally entangled states: production, characterization, and utilization. Phys. Rev. Lett. 83, 3103–3107 (1999).
 4.
Häffner, H. et al. Scalable multiparticle entanglement of trapped ions. Nature 438, 643–646 (2005).
 5.
Lu, C.Y. et al. Experimental entanglement of six photons in graph states. Nat. Phys. 3, 91–95 (2007).
 6.
Bloch, I., Dalibard, J. & Zwerger, W. Manybody physics with ultracold gases. Rev. Mod. Phys. 80, 885–964 (2008).
 7.
Blatt, R. & Roos, C. F. Quantum simulations with trapped ions. Nat. Phys. 8, 277–284 (2012).
 8.
Shulman, M. D. et al. Demonstration of entanglement of electrostatically coupled singlettriplet qubits. Science 336, 202–205 (2012).
 9.
Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
 10.
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
 11.
Wang, L. Discovering phase transitions with unsupervised learning. Phys. Rev. B 94, 195105 (2016).
 12.
Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017).
 13.
van Nieuwenburg, E. P. L., Liu, Y.H. & Huber, S. D. Learning phase transitions by confusion. Nat. Phys. 13, 435–439 (2017).
 14.
Carleo, G. & Troyer, M. Solving the quantum manybody problem with artificial neural networks. Science 355, 602–606 (2017).
 15.
Gross, D., Liu, Y.K., Flammia, S. T., Becker, S. & Eisert, J. Quantum state tomography via compressed sensing. Phys. Rev. Lett. 105, 150401 (2010).
 16.
Tóth, G. et al. Permutationally invariant quantum tomography. Phys. Rev. Lett. 105, 250403 (2010).
 17.
Cramer, M. et al. Efficient quantum state tomography. Nat. Commun. 1, 149 (2010).
 18.
Lanyon, B. P. et al. Efficient tomography of a quantum manybody system. Nat. Phys. 13, 1158–1162 (2017).
 19.
Deng, D.L., Li, X. & Sarma, S. D. Machine learning topological states. Phys. Rev. B 96, 195145 (2017).
 20.
Torlai, G. & Melko, R. G. Neural decoder for topological codes. Phys. Rev. Lett. 119, 030501 (2017).
 21.
Deng, D.L., Li, X. & Das Sarma, S. Quantum entanglement in neural network states. Phys. Rev. X 7, 021021 (2017).
 22.
Gao, X. & Duan, L.M. Efficient representation of quantum manybody states with deep neural networks. Nat. Commun. 8, 662 (2017).
 23.
Chen, J., Cheng, S., Xie, H., Wang, L. & Xiang, T. On the equivalence of restricted Boltzmann machines and tensor network states. Preprint at http://arxiv.org/abs/1701.04831 (2017).
 24.
Huang, Y. & Moore, J. E. Neural network representation of tensor network and chiral states. Preprint at http://arxiv.org/abs/1701.06246 (2017).
 25.
Torlai, G. & Melko, R. G. Learning thermodynamics with Boltzmann machines. Phys. Rev. B 94, 165134 (2016).
 26.
Wang, X.L. et al. Experimental tenphoton entanglement. Phys. Rev. Lett. 117, 210502 (2016).
 27.
Richerme, P. et al. Nonlocal propagation of correlations in quantum systems with longrange interactions. Nature 511, 198–201 (2014).
 28.
Islam, R. et al. Measuring entanglement entropy in a quantum manybody system. Nature 528, 77–83 (2015).
 29.
Hastings, M. B., González, I., Kallin, A. B. & Melko, R. G. Measuring Renyi entanglement entropy in quantum Monte Carlo simulations. Phys. Rev. Lett. 104, 157201 (2010).
 30.
Bakr, W. S. et al. Probing the superfluidtoMott insulator transition at the singleatom level. Science 329, 547–550 (2010).
 31.
Johnson, M. W. et al. Quantum annealing with manufactured spins. Nature 473, 194–198 (2011).
 32.
Hinton, G. E. Training products of experts by minimizing contrastive divergence. Neural Comput. 14, 1771–1800 (2002).
 33.
Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, Cambridge, MA, 2016).
 34.
Amari, S.-I. Natural gradient works efficiently in learning. Neural Comput. 10, 251–276 (1998).
 35.
Sorella, S. Green function Monte Carlo with stochastic reconfiguration. Phys. Rev. Lett. 80, 4558 (1998).
 36.
Becca, F. & Sorella, S. Quantum Monte Carlo Approaches for Correlated Systems (Cambridge Univ. Press, Cambridge, 2017).
Acknowledgements
We thank L. Aolita, H. Carteret, G. Tóth and B. Kulchytskyy for useful discussions. G.T. thanks the Institute for Theoretical Physics, ETH Zurich, for hospitality during various stages of this work. G.T. and R.M. acknowledge support from NSERC, the Canada Research Chair programme, the Ontario Trillium Foundation and the Perimeter Institute for Theoretical Physics. Research at the Perimeter Institute is supported through Industry Canada and by the Province of Ontario through the Ministry of Research and Innovation. G.C., G.M. and M.T. acknowledge support from the European Research Council through ERC Advanced Grant SIMCOFE, and the Swiss National Science Foundation through NCCR QSIT and MARVEL. Simulations were performed on resources provided by SHARCNET, and by the Swiss National Supercomputing Centre CSCS.
Author information
Affiliations
Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada
 Giacomo Torlai
 & Roger Melko
Perimeter Institute of Theoretical Physics, Waterloo, Ontario, Canada
 Giacomo Torlai
 & Roger Melko
Theoretische Physik, ETH Zurich, Zurich, Switzerland
 Guglielmo Mazzola
 , Matthias Troyer
 & Giuseppe Carleo
Vector Institute, Toronto, Ontario, Canada
 Juan Carrasquilla
DWave Systems, Burnaby, British Columbia, Canada
 Juan Carrasquilla
Quantum Architectures and Computation Group, Station Q, Microsoft Research, Redmond, WA, USA
 Matthias Troyer
Center for Computational Quantum Physics, Flatiron Institute, New York, NY, USA
 Giuseppe Carleo
Authors
 Giacomo Torlai, Guglielmo Mazzola, Juan Carrasquilla, Matthias Troyer, Roger Melko and Giuseppe Carleo
Contributions
G.C. designed the research. G.T. devised the machine learning methods. G.T., G.M. and J.C. performed the machine learning numerical experiments. G.M. performed QMC simulations. All authors contributed to the analysis of the results and writing of the manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to Giuseppe Carleo.
Supplementary information
Supplementary Information
Supplementary Figures 1–5, Supplementary Notes, Supplementary References.