# Machine learning in electronic-quantum-matter imaging experiments

## Abstract

For centuries, the scientific discovery process has been based on systematic human observation and analysis of natural phenomena [1]. Today, however, automated instrumentation and large-scale data acquisition are generating datasets of such large volume and complexity as to defy conventional scientific methodology. Radically different scientific approaches are needed, and machine learning (ML) shows great promise for research fields such as materials science [2–5]. Given the success of ML in the analysis of synthetic data representing electronic quantum matter (EQM) [6–16], the next challenge is to apply this approach to experimental data, for example to the arrays of complex electronic-structure images [17] obtained from atomic-scale visualization of EQM. Here we report the development and training of a suite of artificial neural networks (ANNs) designed to recognize different types of order hidden in such EQM image arrays. These ANNs are used to analyse an archive of experimentally derived EQM image arrays from carrier-doped copper oxide Mott insulators. In these noisy and complex data, the ANNs discover the existence of a lattice-commensurate, four-unit-cell periodic, translational-symmetry-breaking EQM state. Further, the ANNs determine that this state is unidirectional, revealing a coincident nematic EQM state. Strong-coupling theories of electronic liquid crystals [18, 19] are consistent with these observations.

## Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request. The data include experimental datasets, their standardized input forms for ANN analysis, the ANN output statistics, as well as Mathematica notebook files for generating training sets, standardizing input images and defining colour scales. The data used for Extended Data Figs. 1, 4 are provided as Source Data.

## Code availability

The custom computer codes used to build and train the ANNs and to use the trained ANNs for data analysis are available from the corresponding author upon reasonable request.

## References

1. Bacon, F. The Advancement of Learning (1605; Paul Dry Books, 2001).
2. Ouyang, R. et al. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
3. Stanev, V. et al. Machine learning modeling of superconducting critical temperature. npj Comput. Mater. 4, 29 (2018).
4. Rosenbrock, C. W., Homer, E. R., Csányi, G. & Hart, G. L. W. Discovering the building blocks of atomic systems using machine learning: application to grain boundaries. npj Comput. Mater. 3, 29 (2017).
5. Kusne, A. G. et al. On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets. Sci. Rep. 4, 6367 (2014).
6. Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017).
7. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
8. Torlai, G. & Melko, R. G. Neural decoder for topological codes. Phys. Rev. Lett. 119, 030501 (2017).
9. van Nieuwenburg, E. P. L., Liu, Y.-H. & Huber, S. D. Learning phase transitions by confusion. Nat. Phys. 13, 435–439 (2017).
10. Broecker, P., Carrasquilla, J., Melko, R. G. & Trebst, S. Machine learning quantum phases of matter beyond the fermion sign problem. Sci. Rep. 7, 8823 (2017).
11. Ch’ng, K., Carrasquilla, J., Melko, R. G. & Khatami, E. Machine learning phases of strongly correlated fermions. Phys. Rev. X 7, 031038 (2017).
12. Zhang, Y. & Kim, E.-A. Quantum loop topography for machine learning. Phys. Rev. Lett. 118, 216401 (2017).
13. Deng, D.-L., Li, X. & Das Sarma, S. Quantum entanglement in neural network states. Phys. Rev. X 7, 021021 (2017).
14. Stoudenmire, E. M. & Schwab, D. J. Supervised learning with tensor networks. Adv. Neural Inf. Process. Syst. 29, 4799–4807 (2016).
15. Schindler, F., Regnault, N. & Neupert, T. Probing many-body localization with neural networks. Phys. Rev. B 95, 245134 (2017).
16. Torlai, G. et al. Neural-network quantum state tomography. Nat. Phys. 14, 447–450 (2018).
17. Fujita, K. et al. in Strongly Correlated Systems: Experimental Techniques (eds Avella, A. & Mancini, F.) 73–109 (Springer, 2015).
18. Kivelson, S. A., Fradkin, E. & Emery, V. J. Electronic liquid-crystal phases of a doped Mott insulator. Nature 393, 550–553 (1998).
19. Zaanen, J. Self-organized one dimensionality. Science 286, 251–252 (1999).
20. Keimer, B., Kivelson, S. A., Norman, M. R., Uchida, S. & Zaanen, J. From quantum matter to high-temperature superconductivity in copper oxides. Nature 518, 179–186 (2015).
21. Wang, F. & Lee, D.-H. The electron-pairing mechanism of iron-based superconductors. Science 332, 200–204 (2011).
22. Comin, R. & Damascelli, A. Resonant X-ray scattering studies of charge order in cuprates. Annu. Rev. Condens. Matter Phys. 7, 369–405 (2016).
23. Fradkin, E. et al. Nematic Fermi fluids in condensed matter physics. Annu. Rev. Condens. Matter Phys. 1, 153–178 (2010).
24. Fradkin, E., Kivelson, S. A. & Tranquada, J. M. Theory of intertwined orders in high temperature superconductors. Rev. Mod. Phys. 87, 457 (2015).
25. Hamidian, M. H. et al. Atomic-scale electronic structure of the cuprate d-symmetry form factor density wave state. Nat. Phys. 12, 150–156 (2016).
26. Robertson, J. A. et al. Distinguishing patterns of charge order: stripes or checkerboards. Phys. Rev. B 74, 134507 (2006).
27. Del Maestro, A., Rosenow, B. & Sachdev, S. From stripe to checkerboard ordering of charge-density waves on the square lattice in the presence of quenched disorder. Phys. Rev. B 74, 024520 (2006).
28. Mesaros, A. et al. Commensurate 4a0-period charge density modulations throughout the Bi2Sr2CaCu2O8+x pseudogap regime. Proc. Natl Acad. Sci. USA 113, 12661–12666 (2016).
29. Nielsen, M. A. Neural Networks and Deep Learning (Determination Press, 2015).
30. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
31. Cover, T. M. & Thomas, J. A. Elements of Information Theory 2nd edn (Wiley, 1991).
32. Nie, L. et al. Quenched disorder and vestigial nematicity in the pseudogap regime of the cuprates. Proc. Natl Acad. Sci. USA 111, 7980–7985 (2014).
33. Cybenko, G. Approximation by superposition of a sigmoidal function. Math. Contr. Signals Syst. 2, 303–314 (1989).

## Acknowledgements

We thank P. Ginsparg, J. Hoffman, S. Kivelson, R. Melko, A. Millis, M. Stoudenmire, K. Weinberger and J. Zaanen for discussions and communications. A.M. and Y.Z. acknowledge support from DOE DE-SC0010313; Y.Z. acknowledges support from DOE DE-SC0018946; E.-A.K. and J.C.S.D. acknowledge support by the Cornell Center for Materials Research with funding from the NSF MRSEC programme (DMR-1719875); E.K. and K.C. acknowledge support from the NSF through grant number DMR-1609560. E.-A.K. and E.K. acknowledge support from the Kavli Institute for Theoretical Physics (where initial discussions about the project took place), which is supported in part by the NSF under grant number PHY-1748958. S.U. and H.E. acknowledge support from a Grant-in-aid for Scientific Research from the Ministry of Science and Education of Japan and the Global Centers of Excellence Program of the Japan Society for the Promotion of Science. K.F. and J.C.S.D. acknowledge support from the US Department of Energy, Office of Basic Energy Sciences, under contract number DE-AC02-98CH10886; S.D.E., M.H.H. and J.C.S.D. acknowledge support from the Moore Foundation’s EPiQS Initiative through grant GBMF4544; J.C.S.D. acknowledges support from Science Foundation Ireland under award SFI 17/RP/5445 and from the European Research Council (ERC) under award number DLV-788932.

### Reviewer information

Nature thanks Giuseppe Carleo, Andrea Perali and the other anonymous reviewer(s) for their contribution to the peer review of this work.

## Author information

E.-A.K. and J.C.S.D. conceived the project. E.K., K.C., E.-A.K. and Y.Z. designed the ML strategy. Y.Z. implemented the ANN-based ML strategy. E.-A.K. and A.M. constructed the mathematical model for the training set and A.M. generated the training set. K.F., S.U. and H.E. synthesized and characterized the crystals studied. K.F., S.D.E. and M.H.H. carried out the experiments and image array data processing. E.-A.K. and J.C.S.D. supervised the investigation and wrote the paper with key contributions from K.F., Y.Z. and A.M. The manuscript reflects the contributions of all authors.

Correspondence to Eun-Ah Kim.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data figures and tables

### Extended Data Fig. 1 ANN detection of unidirectionality at different electron energies.

Output categorization by 81 ANNs of 16 nm × 16 nm Z(r, E) images of Bi2Sr2CaCu2O8 in the electron energy range E = 30–150 meV in steps of 6 meV for p = 0.08 (Tc = 45 K). Markers are larger than the statistical spread (one standard deviation) of the ANN outputs, as estimated from our ensemble of 81 ANN realizations (see Methods). a, The output for modulation orientation X is obtained by inputting the Z(r, E) image array to the ANNs. b, The output for modulation orientation Y is obtained by inputting the 90°-rotated versions of the Z(r, E) images used in a to the ANNs.
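The ensemble statistics and the rotation trick described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: `ensemble_outputs`, `orientation_outputs` and the toy "networks" built by `make_toy_ann` are hypothetical stand-ins for trained ANNs, and only the energy grid and the ensemble size of 81 are taken from the caption.

```python
import numpy as np

# Energy grid from the caption: 30-150 meV in steps of 6 meV.
energies_mev = np.arange(30, 151, 6)

def ensemble_outputs(anns, image):
    """Mean and one-standard-deviation spread of the outputs over an ensemble.

    `anns` is a hypothetical list of callables, each mapping a 2D image to a
    vector of category scores; it stands in for the trained networks.
    """
    outputs = np.stack([ann(image) for ann in anns])  # shape (n_anns, n_categories)
    return outputs.mean(axis=0), outputs.std(axis=0)

def orientation_outputs(anns, image):
    """Outputs for orientation X (image as is) and Y (image rotated by 90 degrees)."""
    return {
        "X": ensemble_outputs(anns, image),
        "Y": ensemble_outputs(anns, np.rot90(image)),
    }

# Toy stand-in for a trained ANN (assumption): it scores how strongly the
# image varies along the horizontal axis, plus a small per-network offset.
def make_toy_ann(offset):
    def ann(image):
        score = np.abs(np.diff(image, axis=1)).mean()
        return np.array([score + offset, 1.0 - score])
    return ann

toy_anns = [make_toy_ann(0.001 * k) for k in range(81)]
# Test image that varies along the horizontal axis only.
stripes = np.tile(np.cos(2 * np.pi * np.arange(96) / 24), (96, 1))
result = orientation_outputs(toy_anns, stripes)
```

Here `result["X"]` and `result["Y"]` each hold a (mean, spread) pair, so a marker and its one-standard-deviation error bar can be drawn per energy and per orientation.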

### Extended Data Fig. 2 Schematic of DWs arising in the CuO2 plane according to strong-coupling position-based theories.

a, The d-symmetry 4a0 charge DW. The charge density at the Ox site is modulated with four-unit-cell periodicity along the horizontal direction, and similarly for that at Oy, but out of phase by π (d symmetry). Cu locations are marked by small dots. b, The 8a0 pair DW state. The d-wave Cooper pair density is modulated with eight-unit-cell periodicity along the horizontal direction. Such modulation in the Cooper pair density can cause a 4a0-period modulation in the local density of states N(r).
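The period halving mentioned in b can be made explicit with a short worked equation. This sketch assumes that the pair amplitude oscillates in sign with period 8a0 and that N(r) responds at lowest order to its square; both are assumptions of the illustration rather than statements from the caption, and the symbols Δ0 and Q_P are introduced here only for this purpose.

$$\Delta \left(x\right)={\Delta }_{0}\cos \left({Q}_{P}x\right),\qquad {Q}_{P}=\frac{2{\rm{\pi }}}{8{a}_{0}}$$

$${\left|\Delta \left(x\right)\right|}^{2}=\frac{{\Delta }_{0}^{2}}{2}\left[1+\cos \left(2{Q}_{P}x\right)\right],\qquad 2{Q}_{P}=\frac{2{\rm{\pi }}}{4{a}_{0}}$$

Under these assumptions the induced modulation has wavevector 2Q_P and therefore period 4a0.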

### Extended Data Fig. 3 Local commensurate motifs in scanning tunnelling microscopy images.

a, Large-field-of-view, high-precision scanning tunnelling microscopy image of the electronic structure of Bi2Sr2CaCu2O8 with p ≈ 0.08, integrated to E = 100 meV. The larger inset shows the FT of the power spectral density, whereas the smaller inset shows the same data plotted along a line from 0.1 to 0.5 in units of 2π/a0. Clearly, the maximum intensity peak occurs at $$\left\langle Q\right\rangle =0.28$$. b, Within each of the eight 6.5-nm² regions marked in a there are many commensurate, unidirectional 4a0 electronic-structure motifs (inside the white rectangles). The Cu sites, independently determined from topographic imaging, are shown as fine dots.

### Extended Data Fig. 4 Categories defined by electronic orders.

a–d, Example images from the simulated training set, from category C = 1 (a), C = 2 (b), C = 3 (c) and C = 4 (d), defined by DFF unidirectional modulation with wavelengths λC = 4.348a0, 4a0, 3.704a0 and 3.448a0, respectively. The CuO2 unit-cell size a0 is 6 pixels diagonally.
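The four category wavelengths correspond to wavevectors of roughly 0.23, 0.25, 0.27 and 0.29 in units of 2π/a0, which the sketch below computes and turns into idealized stripe images. It is a deliberately simplified illustration: it assumes the lattice axes are aligned with the image axes with a0 = 6 pixels, uses a plain cosine instead of the d-form-factor structure, and omits the disorder and noise of the actual training set. The names `stripe_image` and `wavevector` are hypothetical.

```python
import numpy as np

PIXELS_PER_A0 = 6  # unit-cell size in pixels, taken from the caption
WAVELENGTHS_A0 = {1: 4.348, 2: 4.0, 3: 3.704, 4: 3.448}  # category -> wavelength in a0

def wavevector(category):
    """Modulation wavevector of a category in units of 2*pi/a0."""
    return 1.0 / WAVELENGTHS_A0[category]

def stripe_image(category, size=96, phase=0.0):
    """Idealized unidirectional cosine modulation along the horizontal axis.

    Simplification (assumption): no form factor, no disorder, no noise.
    """
    x = np.arange(size)
    period_px = WAVELENGTHS_A0[category] * PIXELS_PER_A0
    row = np.cos(2 * np.pi * x / period_px + phase)
    return np.tile(row, (size, 1))  # every row identical: no variation along y

wavevectors = {c: round(wavevector(c), 2) for c in WAVELENGTHS_A0}
commensurate = stripe_image(2)  # category 2 repeats every 4 unit cells = 24 pixels
```

Only category 2 is commensurate with the lattice; the other three differ from it by a few per cent in wavelength, which is the distinction the classifier has to resolve.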

### Extended Data Fig. 5 ANN training and testing.

a, Examples of the accuracy of the ANN outputs for the independent validation dataset and the cross-entropy cost function are compared for different neuron activation functions during the initial training processes. The inset illustrates the nonlinear activation functions, that is, the sigmoid function and the rectified linear unit (ReLU). b, Examples of the accuracy and the cross-entropy cost versus the number of neurons in a single hidden layer after 25 epochs of training.
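For reference, the two activation functions and the cross-entropy cost named in this caption can be written in a few lines. The single-hidden-layer forward pass below is a generic sketch, not the architecture used in the study: the softmax output layer, the function names and the weight shapes are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, z)

def softmax(z):
    """Normalized exponentials along the last axis (assumed output layer)."""
    shifted = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(np.sum(labels * np.log(probs + eps), axis=-1)))

def forward(x, w1, b1, w2, b2, activation=relu):
    """Forward pass through one hidden layer followed by a softmax output."""
    hidden = activation(x @ w1 + b1)
    return softmax(hidden @ w2 + b2)
```

With all weights set to zero the network returns equal probabilities for every category, and the cross-entropy for four categories is then ln 4 ≈ 1.386, a useful baseline when reading cost curves.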

### Extended Data Fig. 6 Experimental SISTM images.

a, Example Z(r, E) of underdoped Bi2Sr2CaCu2O8 with hole density p = 0.06 (Tc = 20 K). The inset is a zoom-in with marked atom positions determined from the topograph (Cu, red/light; O, purple/dark). b, A small region of a. c, Standardized version of b (see Methods).

### Extended Data Fig. 7 Experimental SISTM images used as input for categorization.

a, Z(r, E) of underdoped Bi2Sr2CaCu2O8 with p = 0.06 (Tc = 20 K) at energy E = Δ1 (see main text). f, The 516 × 516 pixel (2 × 86 × 86 CuO2 unit cells) input data from a (see Methods). b–e, g–j, As in a, f, but for underdoped Bi2Sr2CaCu2O8 with p = 0.08 (Tc = 45 K) (b, g), underdoped Bi2Sr2CaCu2O8 with p = 0.085 (Tc = 50 K) (c, h), underdoped Bi2Sr2CaCu2O8 with p = 0.14 (Tc = 74 K) (d, i) and overdoped Bi2Sr2CaCu2O8 with p = 0.20 (Tc = 82 K) (e, j). Too-small images are tiled, with unit cells intact at the tiling boundary, whereas too-large images are cropped.
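The tile-or-crop rule in the last sentence can be sketched as below. This is one possible reading, not the authors' standardization procedure: it assumes that the image origin sits on a unit-cell boundary and that a unit cell spans 6 pixels (516 = 86 × 6), so trimming to a whole number of cells before tiling keeps cells intact at the seam. The function names are hypothetical.

```python
import numpy as np

def _fit_axis(image, axis, target, cell):
    """Crop or tile `image` along one axis to exactly `target` pixels."""
    n = image.shape[axis]
    if n >= target:
        # Too large: keep the first `target` pixels.
        return np.take(image, np.arange(target), axis=axis)
    # Too small: trim to a whole number of unit cells so the seam falls
    # on a cell boundary, then repeat until the target size is covered.
    whole = (n // cell) * cell
    if whole == 0:
        raise ValueError("image is smaller than one unit cell along this axis")
    trimmed = np.take(image, np.arange(whole), axis=axis)
    reps = -(-target // whole)  # ceiling division
    tiled = np.concatenate([trimmed] * reps, axis=axis)
    return np.take(tiled, np.arange(target), axis=axis)

def fit_to_input(image, target=516, cell=6):
    """Return a target x target array obtained by cropping and/or tiling."""
    out = _fit_axis(image, 0, target, cell)
    return _fit_axis(out, 1, target, cell)

# Example: 100 rows (too small, tiled in units of 96 rows) and
# 700 columns (too large, cropped to 516).
small = np.arange(100 * 700, dtype=float).reshape(100, 700)
standard = fit_to_input(small)
```

Because 516 is a multiple of 6, the cropped edge also falls on a unit-cell boundary under the same alignment assumption.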

### Extended Data Fig. 8 Benchmarking categorization using experimental images.

a, Input data for the topograph of overdoped Bi2Sr2CaCu2O8 with p = 0.22 (Tc = 70 K). b, Output categorization of a by 81 ANNs, showing absence of a translation-breaking signal. Results for the modulation orientations x and y are obtained by inputting the image in a and its 90°-rotated version, respectively, to the ANNs (see Methods). c, The input Z(r, E) data for NCCOC with a doping of p = 0.12 at E = 150 meV. d, Output categorization of c by 81 ANNs, showing commensurate modulations (category 2).

### Extended Data Fig. 9 Categorization is robust to changes in training-set parameters.

a–d, The evolution of output categorizations upon increasing hole doping (as in Fig. 4k–m, o). e, Energy dependence of the output categorizations (as in Extended Data Fig. 1a). f–j, Categorizations of the same inputs as for a–e, respectively, obtained from the output of a single ANN trained using a different training set (see Methods).

### Extended Data Fig. 10 Weakness of FT analysis of EQM.

a, b, The DFF Fourier amplitude, $$\left|\widetilde{\Psi }\left({\boldsymbol{q}}\right)\right|$$, with the wavevector q restricted to a square area with its corner at the origin of Fourier space (black square) and its centre at $${{\boldsymbol{Q}}}_{X}=\frac{1}{4}{{\boldsymbol{G}}}_{X}$$ (in a) and $${{\boldsymbol{Q}}}_{Y}=\frac{1}{4}{{\boldsymbol{G}}}_{Y}$$ (in b), where GX and GY are the Bragg peaks. Data from a Bi2Sr2CaCu2O8 sample with a doping level of p = 0.10 (Tc = 65 K). Figure reproduced from ref. 28 with permission from PNAS. c, The modulation is the real part of the complex wave $$\psi \left(x\right)=A\left(x\right){{\rm{e}}}^{i\left[{Q}_{0}x+\phi \left(x\right)\right]}$$, which has commensurate domains with local wavevector $${Q}_{0}=\frac{1}{4}\times \frac{2{\rm{\pi }}}{a}$$ (period 4a). The amplitude A(x) ≥ 0 varies smoothly around 1. Phase slips are incorporated in φ(x) (see d). The average wavevector is $$\bar{Q}=0.3\times \frac{2{\rm{\pi }}}{a}$$. d, The local phase φ(x) of ψ(x) in c, constructed as a discommensuration array in the phase argument $$\Phi \left(x\right)={Q}_{0}x+\phi \left(x\right)$$. Phase slips of all discommensurations are set to +π. The distances between neighbouring discommensurations vary randomly around the average distance set by the value of incommensurability, $$\delta =\bar{Q}-{Q}_{0}=0.05\times \frac{2{\rm{\pi }}}{a}$$. e, Fourier amplitudes $$\left|\widetilde{\psi }\left(q\right)\right|$$ of the modulation ψ(x) in c (blue line) show a narrow peak at $$\bar{Q}=0.3\times \frac{2{\rm{\pi }}}{a}$$. The demodulation residue |Rq| (red dashed line) has its minimum exactly at the average $$\bar{Q}$$.
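The one-dimensional construction in c–e can be reproduced numerically. The sketch below builds a wave with local wavevector Q0 = 0.25 and +π phase slips spaced on average 10a apart (each slip adds π, and δ = 0.05 × 2π/a corresponds to π of extra phase per 10a), then locates the Fourier peak. Several details are assumptions of this illustration and not taken from the caption: the tanh profile and width of each slip, the uniform jitter of the slip positions, the form of A(x), the system length and the sampling step.

```python
import numpy as np

rng = np.random.default_rng(0)

Q0 = 0.25        # local (commensurate) wavevector, in units of 2*pi/a
delta = 0.05     # incommensurability, in units of 2*pi/a
length = 2000.0  # system size in units of a (assumption)
dx = 0.25        # sampling step in units of a (assumption)
x = np.arange(0.0, length, dx)

# Each discommensuration adds +pi, so the mean spacing is pi / (2*pi*delta).
spacing = 0.5 / delta  # 10 lattice constants
centres = np.arange(spacing / 2, length, spacing)
centres = centres + rng.uniform(-2.0, 2.0, size=centres.size)  # random jitter (assumption)

# Smooth +pi phase slips with a tanh profile of width 2a (assumption).
width = 2.0
phi = (0.5 * np.pi * (1.0 + np.tanh((x[:, None] - centres[None, :]) / width))).sum(axis=1)

# Amplitude varying smoothly around 1 (assumed form).
amplitude = 1.0 + 0.1 * np.sin(2 * np.pi * x / 137.0)
psi = amplitude * np.exp(1j * (2 * np.pi * Q0 * x + phi))

# Fourier amplitudes; fftfreq with spacing dx returns q in units of 2*pi/a.
spectrum = np.abs(np.fft.fft(psi)) / x.size
q = np.fft.fftfreq(x.size, d=dx)
q_peak = q[np.argmax(spectrum)]
amp_at_Q0 = spectrum[np.argmin(np.abs(q - Q0))]
```

In this construction the dominant Fourier peak sits near the average wavevector 0.3 rather than at the local value 0.25, even though every domain between slips is locally commensurate, which is the weakness of FT analysis that the figure title refers to.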

## DOI

https://doi.org/10.1038/s41586-019-1319-8
