Introduction

Machine learning (ML), and consequently data science as a whole, has seen rapid development over the last decade or so, due largely to considerable advances in implementations and hardware that have made computations more accessible. Conceptually, the ML approach can be regarded as a data modeling approach employing algorithms that eschew explicit instructions in favor of strategies based on pattern extraction and inference driven by statistical analysis. This presents a colossal opportunity for modern scientific investigations, particularly numerical studies, as they naturally involve large data sets and complex systems for which obvious explicit instructions for analysis can be elusive. Conventional approaches often neglect possible nuance in the structure of the data in favor of rather simple measurements that are often untenable for sufficiently complex problems. Some statistical inference methods, such as the maximum likelihood method and the maximum entropy method, have been routinely applied to certain physical problems1,2, but applications utilizing ML methods have only recently attracted attention in the physical sciences, particularly for the study of interacting systems on both classical and quantum scales3. There is a unique opportunity to take advantage of the advances in ML algorithms and implementations to provide interesting new approaches to understanding physical data and perhaps even improve upon existing numerical methods4. Outstanding problems involving the prediction of transition points and phase diagrams are also of great interest for treatment with ML methods.

In order to utilize ML approaches for studying phase transitions, one must assume that there is some pattern change in the measured data across the phase transition. Fortunately, this is in fact exactly what happens in most phase transitions. The widely adopted Lindemann parameter, for example, is essentially a measure of the deviations of atomic positions in the system from their equilibrium positions and is often used to characterize the melting of a crystal structure5. Similar forms of pattern change in the positions of the constituent atoms are often present in molecular systems in general. Perhaps more importantly, some sufficiently complex systems exhibit phase transitions with no obvious order parameters, often prohibiting the detection of such pattern changes using conventional methods. This is not a hypothetical situation; indeed, hidden orderings in some interesting materials, such as heavy fermion materials and cuprate superconductors, have long been proposed6,7,8.

Other systems may not even exhibit a true phase transition, but rather a crossover region in which there is no singularity separating the different phases, which can be difficult to characterize with conventional methods. A conventional phase transition can be identified in two ways: the first is a singularity in a derivative of the free energy, as proposed by Ehrenfest, and the second is a broken symmetry exhibited by an order parameter, as proposed by Landau. Unlike a conventional phase transition, a crossover is not identified by a singularity in the free energy. There is also no broken symmetry in such a situation and thus no order parameter associated with a crossover. The order parameter and the singularity in the free energy are presumably sharp and obvious features that can be rather easily identified; the absence of such features clearly presents a challenging situation for the prediction of a crossover region by ML. ML offers a new route for studying these systems by searching for hidden patterns in the measured data when readily applicable a priori information is in short supply.

A viable ML method for detecting a crossover will find use in many interesting systems related to the quantum phase transition9. While the quantum phase transition is a second order phase transition controlled by non-thermal parameters at zero temperature, all experiments and most numerical simulations are conducted at finite albeit low temperatures for practical reasons. As a consequence of these thermal conditions, quantum critical points at low temperatures behave as crossover phenomena. It is widely believed that many interesting materials, particularly high temperature cuprate superconductors, harbor a quantum critical point. An ML approach for detecting the crossover phenomenon can thus be an important tool for studying quantum critical points.

Work has been done on various problems to characterize phase transitions in physical systems using ML methods, including the Ising model in the vanishing field case3,10,11,12,13,14,15,16,17,18,19,20. This work uses a similar approach to those papers, but focuses on the crossover regions introduced in the non-vanishing field case of the 2-dimensional Ising model instead of seeking only the exactly known transition point in the vanishing field case21. This is a somewhat more difficult problem, as there is no explicit transition to be found, but it remains an interesting problem nonetheless and possibly carries much greater implications for crossover regions in more complicated problems.

The Ising model itself is a mathematical model of ferromagnetism that is often explored in statistical mechanics to describe magnetic phenomena22. Following the discovery of electron spin, the model was used to determine whether or not local interactions between magnetic spins could induce a large fraction of the electronic spins in a material to align and thereby produce a macroscopic net magnetic moment. It is expressed in the form of a multidimensional array of spins \(s_i\) that represent a discrete arrangement of magnetic dipole moments of atomic spins22. The spins are restricted to spin-up or spin-down alignments such that \(s_i \in \{-1, +1\}\). The spins interact with their nearest neighbors with an interaction strength given by \(J_{ij}\) for neighbors \(s_i\) and \(s_j\). The spins can additionally interact with an applied external magnetic field \(H_i\) (into which the magnetic dipole moment \(\mu \) has been absorbed). The full Hamiltonian describing the system is thus expressed as

$$\begin{aligned} \mathscr {H} = -\sum _{\langle i, j \rangle } J_{ij} s_i s_j - \sum _i H_i s_i \end{aligned}$$
(1)

where \(\langle i, j\rangle \) indicates a sum over adjacent spins. For \(J_{ij} > 0\), the interaction between the spins is ferromagnetic, for \(J_{ij} < 0\), the interaction between the spins is antiferromagnetic, and for \(J_{ij} = 0\), the spins are noninteracting. Furthermore, if \(H_i > 0\), the spin at site i tends to prefer spin-up alignment, if \(H_i < 0\), the spin at site i tends to prefer spin-down alignment, and if \(H_i = 0\), there is no external magnetic field influence on the spin at site i. The model has seen extensive use in investigating magnetic phenomena in condensed matter physics23,24,25,26,27,28,29. Additionally, the model can be equivalently expressed in the form of the lattice gas model, described by the following Hamiltonian

$$\begin{aligned} \mathscr {H} = -4J\sum _{\langle i,j\rangle }n_i n_j - \mu \sum _i n_i \end{aligned}$$
(2)

where the external field strength H is reinterpreted as the chemical potential \(\mu \), J retains its role as the interaction strength, and \(n_i \in \{0, 1\}\) represents the lattice site occupancy. The original Ising Hamiltonian can be recovered, up to a constant, using the relation \(s_i = 2n_i-1\). This model describes a multidimensional array of lattice sites, each of which can be either occupied or unoccupied by a hard shell atom, disallowing occupancies greater than one. The first term is then interpreted as a short-range attractive interaction, while the second describes the exchange of atoms between the system and a particle reservoir. This is a simple model of density fluctuations and liquid-gas transformations used primarily in chemistry, albeit often with modifications30,31. Additionally, modified versions of the lattice gas model have been applied to binding behavior in biology32,33,34.
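For concreteness, the following is a minimal NumPy sketch of the Ising energy of Eq. (1) on a square periodic lattice, together with the lattice gas mapping, assuming uniform couplings \(J_{ij} = J\) and field \(H_i = H\); the function name and lattice size are illustrative.

```python
import numpy as np

def ising_energy(spins, J=1.0, H=0.0):
    """Energy of a square Ising configuration under Eq. (1), with
    periodic boundaries and uniform J and H. `spins` holds entries
    in {-1, +1}."""
    # Summing the right and down neighbors counts each bond once.
    interaction = -J * np.sum(spins * (np.roll(spins, 1, axis=0)
                                       + np.roll(spins, 1, axis=1)))
    field = -H * np.sum(spins)
    return interaction + field

# Lattice gas mapping of Eq. (2): occupancies n_i in {0, 1} map to
# spins via s_i = 2 n_i - 1.
occupancies = np.random.randint(0, 2, size=(32, 32))
spins = 2 * occupancies - 1
print(ising_energy(spins))
```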

Typically, the model is studied in the case \(J_{ij} = J = 1\), and the vanishing field case \(H_i = H = 0\) is of particular interest for dimension \(d \ge 2\), since a phase transition is exhibited as the critical temperature is crossed. In two dimensions, the critical temperature can be identified by exploiting the Kramers-Wannier duality symmetry35,36,37. At low temperatures with a vanishing field, the physics of the Ising model is dominated by the nearest-neighbor interactions, which for a ferromagnetic model means that adjacent spins tend to align with one another. However, as the temperature is increased, the thermal fluctuations eventually overpower the interactions, destroying the magnetic ordering so that the orientations of the spins can be considered independent of one another. The system is then called a paramagnet.

In such a case, if an external magnetic field were applied, the paramagnet would respond to it and tend to align with it, though at high temperatures a sufficiently strong external field is required to overcome the thermal fluctuations. Since the magnetization smoothly decreases to zero with increasing temperature in the presence of an external magnetic field, there is no phase transition at which the magnetization abruptly vanishes. Instead, the region in which the system goes from an ordered to a disordered state is referred to as the crossover region. Generically, a crossover refers to a situation in which a system undergoes a change in phase without encountering a canonical phase transition characterized by a critical point, as there are no discontinuities in derivatives of the free energy (as determined by Ehrenfest classification) or symmetry-breaking mechanisms (as determined by Landau classification). A well known example is the BEC-BCS crossover in an ultracold Fermi gas, in which tuning the interaction strength (the s-wave scattering length) causes the system to cross over from a Bose-Einstein-condensate state to a Bardeen-Cooper-Schrieffer state38. Additionally, the Kondo effect is important in certain metallic compounds with dilute concentrations of magnetic impurities, which cross over from a weakly-coupled Fermi liquid phase to a local Fermi liquid phase upon reducing the temperature below some threshold39. Furthermore, examples of strong crossover phenomena have also been recently discovered in classical models of statistical mechanics such as the Blume-Capel model and the random-field Ising model40,41.

The organization of this work is as follows. The next section details the data science and ML methods explored in this work. Section 3 reports the results of the analysis of the 2-dimensional square Ising model. Section 4 concludes this work with a discussion of the interpretation, implications, and greater impacts of these findings.

Methods

The Ising configurations are generated using a standard Monte Carlo algorithm written in Python using the NumPy library42,43. The algorithm was parallelized using the Dask library, and select subroutines were compiled at run-time for efficiency using the JIT compiler provided by the Numba library44,45. The Monte Carlo moves used are spin-flips. A single spin-flip attempt consists of flipping the spin of a single lattice site, calculating the resulting change in energy \(\Delta E\), and then using that change in energy to define the Metropolis criterion \(\exp (-\frac{\Delta E}{T})\). If a uniformly distributed random number on \([0, 1)\) is smaller than the Metropolis criterion, the configuration resulting from the spin-flip is accepted as the new configuration. The data analyzed in this work consists of 1,024 square Ising configurations of side length 32 with periodic boundary conditions for each of 65 external field strengths and 65 temperatures uniformly taken from \([-2, 2]\) and \([1, 5]\), respectively. The interaction energies were set to unity such that \(J_{ij} = J = 1\). Each sample was equilibrated with 8,192 Monte Carlo updates before data collection began. Data was then collected at an interval of 8 Monte Carlo updates for each sample up to a sample count of 1,024. At the end of each data collection step, a replica exchange Markov chain Monte Carlo move was performed across the full temperature range for each set of Ising configurations sharing the same external field strength46,47,48. This allows for more robust sampling of the ensemble across the temperature range by making high-temperature states available at low temperatures and vice versa. Additionally, this helps to prevent samples on the vanishing field line from relaxing into either positive or negative magnetization states, since two states at nearby temperatures with opposite magnetizations would be very likely to swap.
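The following is a minimal sketch of a single spin-flip attempt as described above, assuming the uniform \(J\) and \(H\) used in this work; it is illustrative only and omits the parallelization, JIT compilation, and replica exchange of the production code.

```python
import numpy as np

rng = np.random.default_rng()

def spin_flip(spins, T, J=1.0, H=0.0):
    """A single Metropolis spin-flip attempt on a square periodic
    lattice, as described in the text. Illustrative only; the
    production code is parallelized with Dask, compiled with Numba,
    and adds replica exchange moves."""
    n = spins.shape[0]
    i, j = rng.integers(n, size=2)
    # Sum over the four nearest neighbors with periodic boundaries.
    nn = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
          + spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
    # Energy change of flipping the spin at (i, j) under Eq. (1).
    dE = 2.0 * spins[i, j] * (J * nn + H)
    # Accept the flip with the Metropolis criterion exp(-dE / T).
    if rng.random() < np.exp(-dE / T):
        spins[i, j] = -spins[i, j]
    return spins
```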

In this work, the Ising spins were rescaled such that spin-down atomic spins carry the value 0 and spin-up atomic spins carry the value 1, which is a standard setup for binary-valued features in data science. Physically, this corresponds to the lattice gas model described in the prior section.

The goal is to map the raw Ising configurations to a small set of descriptors that can discriminate between the samples using a structural criterion inferred by an ML algorithm. This application is referred to as representation learning and is often presented as dimensionality reduction. There are many methods in the field of unsupervised ML that seek to achieve such dimensionality reduction49,50; however, such methods do not respect the multidimensional structure of the input data, so a deep neural network is used instead to accomplish the dimensionality reduction, in the form of a self-supervised variational autoencoder (VAE)51. Such a neural network is composed of three main components: an encoder network, a decoder network, and a sampling function. The encoder and decoder networks are implemented as deep convolutional neural networks (CNN) in order to preserve the spatially dependent 2-dimensional structure of the Ising configurations52. The general idea of a VAE is to encode configurations into a latent variable space composed of the parameters of a chosen prior distribution; a multidimensional Gaussian distribution was used in this work. Random variables drawn from these distributions can then be decoded to recover the original input configurations. In this way, VAEs are both generative models and latent variable models. Assuming a model is sufficiently trained, new sample data can be generated by traversing the latent space input to the decoder network.

The purpose of using a VAE in this manner is to extract a low-dimensional representation of the Ising configurations, which are otherwise unwieldy to compare directly in a meaningful manner without a priori knowledge of the important derived measurements from statistical physics used to accomplish the same task. The motivation for using a VAE to encode and decode the Ising configurations thus lies in the desire to automate the parameterization of the Ising configurations without conventional methods from statistical physics, preferring instead to allow the neural network to learn and discover the important features itself directly from the structures of the configurations. The latent representations of the configurations are small sets of descriptors that can be used to discriminate between the configurations, relying on the assumption that proximity between latent representations of the Ising configurations in the latent space is a notion of structural similarity between the configurations in their original 2-dimensional lattice representations. In this way, the VAE is used as an alternative to conventional statistical mechanics algorithms to accomplish the same task of characterizing the structural features of input configurations.

The encoder CNN uses four convolutional layers following the input layer, with kernels of shape (3, 3), kernel strides of (2, 2), and filter counts increasing by a factor of 4. Zero-padding is used to ensure that the entire input is reached by the convolutions. Furthermore, each convolutional layer uses scaled exponential linear unit (SELU) activation functions and LeCun normal initializations53. The output of the final convolutional layer is then flattened, feeding into two dense layers of eight neurons with linear activations, respectively representing the latent variables that correspond to the means \(\mu _i\) and logarithmic variances \(\log \sigma _i^2\) of multivariate Gaussian distributions. A random variable \(z_i\) is drawn from the distribution such that \(z_i = \mu _i+\exp [\frac{1}{2}\log \sigma _i^2]\mathrm {N}_{0,1}\), where \(\mathrm {N}_{0,1}\) is a sample from the standard normal distribution. The logarithmic variance is used in favor of the standard deviation directly in the interest of maintaining numerical stability. The random variable \(z_i\) is then used as the input layer for the decoder CNN, where \(z_i\) is mapped to a dense layer that is then reshaped to match the structure of the output from the final convolutional layer in the encoder CNN. From there, the decoder CNN is simply the reverse of the encoder network in structure, albeit with convolutional transpose layers in favor of standard convolutional layers. The final output layer of the decoder network, using a sigmoid activation function, is thus a reproduction of the original input configurations to the encoder network. The structure of the VAE as a whole is shown in Fig. 1, and an example of the convolution operation that composes the bulk of the operations in the encoder and decoder networks is shown in Fig. 2.
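A minimal Keras sketch of an encoder of the kind described above follows, assuming 32 × 32 single-channel inputs and an 8-dimensional latent space; the starting filter count of 4 is an assumption, as the text specifies only the factor-of-4 growth.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 8  # two dense heads of eight neurons each

inputs = layers.Input(shape=(32, 32, 1))
x = inputs
# Four (3, 3) convolutions with (2, 2) strides, SELU activations,
# LeCun normal initializations, and zero ("same") padding. The
# starting filter count of 4 is an assumption.
for filters in (4, 16, 64, 256):
    x = layers.Conv2D(filters, (3, 3), strides=(2, 2), padding="same",
                      activation="selu",
                      kernel_initializer="lecun_normal")(x)
x = layers.Flatten()(x)
z_mean = layers.Dense(latent_dim)(x)     # linear activation
z_log_var = layers.Dense(latent_dim)(x)  # log variance for stability

def sample(args):
    """Reparameterization: z = mu + exp(log_var / 2) * N(0, 1)."""
    mu, log_var = args
    eps = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * eps

z = layers.Lambda(sample)([z_mean, z_log_var])
encoder = tf.keras.Model(inputs, [z_mean, z_log_var, z])
```

The decoder reverses this structure with Conv2DTranspose layers and a final sigmoid output, as described in the text.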

Figure 1
figure 1

A diagram depicting the structure of the VAE where X is the input Ising configuration, E(X) is the encoder network, \(\mu \) and \(\sigma \) are the latent means and standard deviations, z is the random Gaussian sample from the distribution described by \(\mu \) and \(\sigma \), D(z) is the decoder network, and \(\hat{X}\) is the reconstructed Ising configuration.

Figure 2
figure 2

A diagram depicting the convolution operation for a single kernel of shape (3, 3) with a stride of (2, 2) acting on an input of shape (4, 4) with zero-padding denoted by the striped input region to produce an output feature map of shape (2, 2). Each stride is color coded such that each entry in the output is the sum of the products of the kernel weights and input entries over the subvolume corresponding to the same color. Since the stride is less than the kernel size, the subvolumes overlap.

The loss consists of two separate components. The first is the standard reconstruction loss, implemented in this work as the binary crossentropy between the encoder input and decoder output. Other choices for the reconstruction loss, such as the mean squared error or mean absolute error, are still valid, however. The second loss term is a Kullback–Leibler divergence term which acts as a regularizer to ensure the latent variables \(\mu _i\) and \(\sigma _i\) faithfully represent multivariate Gaussian parameters. The combination of the reconstruction loss and the Kullback–Leibler divergence is called the tractable evidence lower bound, often referred to as the ELBO. In this work, the Kullback–Leibler term was decomposed in the manner of a \(\beta \)-total correlation VAE (\(\beta \)-TCVAE), separating it into three parts describing the index-code mutual information, the total correlation, and the dimension-wise Kullback–Leibler divergence54. Minibatch stratified sampling was also employed during training54. The specific parameters of the decomposition used were \(\alpha =\lambda =1\) and \(\beta =8\).
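As a rough illustration, the sketch below computes the two loss components in the plain \(\beta \)-VAE form, with the Kullback–Leibler term evaluated analytically for a diagonal Gaussian; the full \(\beta \)-TCVAE decomposition with minibatch stratified sampling is considerably more involved (see ref. 54).

```python
import tensorflow as tf

def beta_vae_loss(x, x_hat, z_mean, z_log_var, beta=8.0):
    """Reconstruction plus beta-weighted Kullback-Leibler loss in
    the plain beta-VAE form. The beta-TCVAE of ref. 54 instead
    decomposes the KL term and applies beta only to the total
    correlation component."""
    # Binary crossentropy summed over the lattice sites.
    rec = tf.reduce_sum(
        tf.keras.losses.binary_crossentropy(x, x_hat), axis=[1, 2])
    # Analytic KL divergence between N(mu, sigma^2) and N(0, 1).
    kl = -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var),
        axis=1)
    return tf.reduce_mean(rec + beta * kl)
```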

The Nesterov-accelerated Adaptive Moment Estimation (Nadam) optimizer was used to optimize the loss, though many other choices are available55. It was found that the adaptive nature of the Nadam optimizer minimized the loss during training of the \(\beta \)-TCVAE model more efficiently than other optimizers. The specific parameters used for the Nadam optimizer were \(\beta _1 = 0.9\), \(\beta _2 = 0.999\), a schedule decay of 0.4, and the default epsilon provided by the Keras library. A learning rate of 0.00001 was chosen. Training was performed over 16 epochs with a batch size of 845, and the samples were shuffled before training started. A callback was used to reduce the learning rate upon a loss plateau with a patience of 8 epochs.
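A sketch of this training setup in modern Keras follows; `vae` and `configs` are placeholders, and the schedule decay corresponds to the `schedule_decay` argument of the older standalone Keras Nadam, which is assumed here to be unavailable in current tf.keras.

```python
import tensorflow as tf

# Nadam with the stated moment parameters and learning rate.
optimizer = tf.keras.optimizers.Nadam(
    learning_rate=1e-5, beta_1=0.9, beta_2=0.999)

# Reduce the learning rate upon a loss plateau with a patience of 8.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="loss", patience=8)

# `vae` and `configs` are placeholders for the assembled beta-TCVAE
# model (with the loss attached via add_loss or a custom train_step)
# and the (n_samples, 32, 32, 1) array of Ising configurations.
vae.compile(optimizer=optimizer)
vae.fit(configs, epochs=16, batch_size=845, shuffle=True,
        callbacks=[reduce_lr])
```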

After fitting the \(\beta \)-TCVAE model, the latent encodings of the Ising configurations were extracted for further analysis. Principal component analysis (PCA), as implemented in the scikit-learn package, was used on the latent means and standard deviations independently to produce linear transformations of the Gaussian parameters that more clearly discriminate between the samples, in an attempt to further disentangle the representations provided by the \(\beta \)-TCVAE49. This is done by diagonalizing the covariance matrix of the original features to find a set of independent orthogonal projections that describe the most statistically varied linear combinations of the original feature space49. The PCA projections are then interpreted for the 2-dimensional Ising model. The motivation for using the principal components (PC) of the latent variables instead of the raw latent variables is to more effectively capture measurements that are both statistically independent, due to the orthogonality constraint, and explain the most variance possible in the latent space under said constraint. Given that the latent representations characterize the structure of the Ising configurations, the principal components of the latent representations allow for more effective discrimination between the different structural characteristics of the configurations than the raw latent variables do. The \(\beta \)-TCVAE model used in this work was implemented using the Keras ML library with TensorFlow as a backend56,57.
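This step reduces to a few scikit-learn calls; the sketch below uses placeholder arrays for the extracted encodings and anticipates the component names used in the Results section.

```python
from sklearn.decomposition import PCA

# `latent_means` and `latent_sigmas` are placeholders for the
# (n_samples, 8) arrays of encoded Gaussian parameters.
pca_mu = PCA()
nu = pca_mu.fit_transform(latent_means)       # nu[:, 0] is nu_0
pca_sigma = PCA()
tau = pca_sigma.fit_transform(latent_sigmas)  # tau[:, 0], tau[:, 1]

# Fraction of the latent variance explained by each component.
print(pca_mu.explained_variance_ratio_)
print(pca_sigma.explained_variance_ratio_)
```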

Results

All of the plots in this section were generated with the Matplotlib package using a perceptually uniform colormap58. In each plot, the coloration of a square sector on the diagram represents the average value of the measurement at that sector, with lightness corresponding to magnitude.

Only one PC of the latent means \(\mu _i\) carries noticeable statistical weight, explaining 77.1% of the total statistical variance between the \(\mu _i\) encodings of the Ising configurations, while each of the remaining PCs explains less than 4%. This PC will be denoted \(\nu _0\). These results are consistent with prior published work3.

Comparing \(\nu _0\), depicted in Fig. 3, to the calculated magnetizations m of the Ising configurations in Fig. 4, it is readily apparent that \(\nu _0\) rather faithfully represents the magnetizations of the Ising configurations. There are some inaccuracies in the intermediate magnetizations, produced by a sigmoid-like relationship between \(\nu _0\) and m, but a very clear discrimination between the ferromagnetic spin-up and ferromagnetic spin-down configurations is shown. Since the magnetization acts as the order parameter for the 2-dimensional Ising model, this shows that the extraction of a reasonable representation of the order parameter is possible with a VAE. It is important to note that since the magnetization is a linear feature of the Ising configurations, a much simpler linear model would be sufficient for extracting it.
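For reference, the conventional observables compared against in this section can be computed directly from the configurations; below is a brief sketch, assuming units with \(k_B = 1\) and using the standard fluctuation-dissipation relation for the specific heat shown later in Fig. 8.

```python
import numpy as np

def magnetization(spins):
    """Per-site magnetization m of one configuration."""
    return spins.mean()

def specific_heat(energies, T, n_sites):
    """Per-site specific heat from the energy fluctuations of the
    ensemble at temperature T: C = (<E^2> - <E>^2) / (N T^2),
    in units with k_B = 1."""
    return energies.var() / (n_sites * T**2)

# `energies` would hold the 1,024 sampled energies at one (H, T)
# point, e.g. computed with the ising_energy sketch given earlier.
```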

Figure 3
figure 3

The ensemble average \(\nu _0\) with respect to the external field strengths and temperatures.

Figure 4
figure 4

The ensemble average magnetization m with respect to the external field strengths and temperatures.

The latent standard deviations \(\sigma _i\) show much more interesting behavior, however. Two PCs of the \(\sigma _i\) encodings, denoted as \(\tau _0\) and \(\tau _1\), are investigated with respect to the external field strengths and the temperatures.

Comparing \(\tau _0\), depicted in Fig. 5, to the calculated energies E of the Ising configurations shown in Fig. 6, it is clear that \(\tau _0\) exhibits a strong discrimination between the low to intermediate energy regions and the highest energy region, which is characterized by a cone starting at the vanishing field critical point, estimated at \(T_C \approx 2.25\), that extends symmetrically to include more external field values with rising temperature. This is rather similar to the critical point predicted using a dense autoencoder15. This is in effect capturing the concretely paramagnetic samples, and the relative error in the estimate of the critical temperature is acceptable, underestimating the exact value of \(T_C = \frac{2}{\ln [1+\sqrt{2}]} \approx 2.27\)21 by roughly 0.85%. Given that the paramagnetic samples are essentially noise, due to entropic contributions from thermal fluctuations destroying any order that would otherwise be present, it makes sense that these would be easy to discriminate from the rest of the samples using a \(\beta \)-TCVAE model. This is because the samples with \(\nu _0\) values corresponding to nearly zero magnetizations and rather high values of \(\tau _0\) resemble Gaussian noise with no notable order preference, which is indeed reflected in the raw data. In this way, it seems that \(\nu _0\) is suitable for tracking the ferromagnetic ordering while \(\tau _0\) is suitable for characterizing the paramagnetic disorder.
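For reference, the exact value and the relative error of the estimate quoted above work out to

$$\begin{aligned} T_C = \frac{2}{\ln [1+\sqrt{2}]} \approx 2.2692, \qquad \frac{|2.25 - T_C|}{T_C} \approx 0.0085 \approx 0.85\%. \end{aligned}$$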

Figure 5
figure 5

The ensemble average \(\tau _0\) with respect to the external field strengths and temperatures.

Figure 6
figure 6

The ensemble average energy E with respect to the external field strengths and temperatures.

The behavior of \(\tau _1\), shown in Fig. 7, is even more interesting, as it is not simply discriminating samples with intermediate energies from the rest of the data set. If it were, then some samples at temperatures below the critical point at non-zero external field strengths would be included, as is readily apparent in the energies shown in Fig. 6. Rather, there is another cone shape as was seen with \(\tau _0\), albeit much wider and with the samples represented strongly by \(\tau _0\) omitted. In effect, it would appear as if \(\tau _1\) is capturing regions in the diagram with intermediate structural disorder, as opposed to the highly disordered structures captured by \(\tau _0\). Interestingly, \(\tau _1\) bears a rather strong resemblance to the specific heat capacity C depicted in Fig. 8. It is worth noting that there is a slight asymmetry between the spin-up and spin-down configurations in \(\tau _1\), but it has negligible effects on the relevant analysis.

Figure 7
figure 7

The ensemble average \(\tau _1\) with respect to the external field strengths and temperatures.

Figure 8
figure 8

The ensemble Ising specific heat C with respect to the external field strengths and temperatures.

The distribution of the error between the true and \(\beta \)-TCVAE predicted values for the Ising model spins is shown in Fig. 9. The distribution is sharply centered on zero and reasonably symmetric about it, showing suitable spin prediction accuracy without a considerable bias towards one spin over the other. The distribution of the absolute errors between the true and \(\beta \)-TCVAE predicted values is shown in Fig. 10, showing that the bulk of the predictions exhibit very little error. The distribution of the Kullback–Leibler divergences is depicted in Fig. 11 and is well-behaved with few outliers.

Figure 9
figure 9

The distribution of errors in the \(\beta -\)TCVAE Ising spin predictions.

Figure 10
figure 10

The distribution of absolute errors in the \(\beta -\)TCVAE Ising spin predictions.

Figure 11
figure 11

The distribution of Kullback–Leibler divergences of the \(\beta -\)TCVAE model latent encodings.

Discussion

In essence, using a VAE to extract structural information from raw Ising configurations exposes interesting derived descriptors of the configurations that can be used to not only identify a transition point, but also a crossover region amongst other regions of interest. The crux of this analysis is in the interpretation of the extracted feature space as represented by the latent variables. This is done by studying the behavior of the latent variable mappings of the Ising configurations with respect to the external magnetic fields and temperatures.

Considering that \(\nu _0\) reflects the magnetization of the 2-dimensional Ising model, it can be readily interpreted as an indicator of the ferromagnetic ordering exhibited by the configurations. By contrast, \(\tau _0\) and \(\tau _1\) can be interpreted as indicators of paramagnetic disorder that also provide a suitable estimate of the transition temperature. The region extracted from \(\tau _1\) can readily be interpreted as the crossover region, as these configurations exhibit order preferences alongside a significant amount of noise brought on by the entropic contributions from thermal fluctuations at higher temperatures. As would be expected of the crossover region, it shifts to higher temperatures with increasing external magnetic field strengths.

These results potentially carry broad implications for the path towards formulating a generalized order parameter, alongside a notion of a crossover region, with minimal a priori information through the use of ML methods, which would allow for the investigation of many interesting complex systems in condensed matter physics and materials science. The advantage of the present method is its capability of capturing crossovers. This opens a new avenue for the study of quantum critical points from data obtained at low but finite temperatures, where crossover regions appear instead. Examples include data from large scale numerical quantum Monte Carlo simulations of heavy fermion materials and high temperature superconducting cuprates, in which quantum critical points are believed to play crucial roles in their interesting properties59,60,61.

There are many opportunities beyond investigating more complex systems by introducing improvements to this method beyond the scope of this work. For instance, finite-size scaling is an important approach towards addressing the limitations presented by finite-sized systems when investigating critical phenomena62. Establishing correspondence between the VAE encodings of different system sizes is a challenging proposition, as a different VAE structure would need to be trained for each system size, which in turn may require different hyperparameters and training iteration counts to provide similar results. Consequently, numerical difficulties can arise when performing finite-size scaling analysis, as the variation of predicted properties with respect to system size may be difficult to isolate from the systematic variation due to different neural networks being used to extract said properties. Nevertheless, this would be a significant step towards improving VAE characterization of critical phenomena.