## Abstract

The hydrogen phase diagram has several unusual features which are well reproduced by density functional calculations. Unfortunately, these calculations do not provide good physical insights into why those features occur. Here, we present a fast interatomic potential, which reproduces the molecular hydrogen phases: orientationally disordered Phase I; broken-symmetry Phase II and reentrant melt curve. The H_{2} vibrational frequency drops at high pressure because of increased coupling between neighbouring molecules, not bond weakening. Liquid H_{2} is denser than coexisting close-packed solid at high pressure because the favored molecular orientation switches from quadrupole-energy-minimizing to steric-repulsion-minimizing. The latter allows molecules to get closer together, without the atoms getting closer, but cannot be achieved within in a close-packed layer due to frustration. A similar effect causes negative thermal expansion. At high pressure, rotation is hindered in Phase I, such that it cannot be regarded as a molecular rotor phase.

## Introduction

Since the discovery of solid molecular hydrogen in 1899, the nature of this phase has remained controversial^{1}. It is now believed that the solid “Phase I” comprises rotating hydrogen molecules on a hexagonal close-packed lattice^{2}. With increasing pressure the rotation becomes hindered^{3} by intermolecular interactions, both steric and electrostatic, leading ultimately to phase transformations to a low temperature Phase II^{4}, in which quadrupole–quadrupole interactions (EQQ) arrest the rotation^{5}, and a high-pressure Phase III^{6,7}, in which steric interactions dominate.

Experimental study of these phases has proved challenging. X-ray study showed the hcp structure, but could not resolve molecular orientation at low temperature^{8}, and the first room temperature only completed in 2019^{9}. Raman spectroscopy shows peaks corresponding to quantum rotors at low pressure, which gradually broaden and shift with pressure, and a distinctive sharp phonon mode which rules out cubic close packing as a structure^{10,11,12,13,14,15}. The melt line has a strongly positive Clapeyron slope at low pressures, with a turnover around 100 GPa^{16,17,18,19}. The negative slope means that even though the solid is hexagonal “close-packed”, the liquid must be even denser. The turnover also means the liquid has higher compressibility, but how this comes about remains unexplained. X-ray studies at low temperature traversing Phase I–II–III do not show any convincing structural changes, in part because it has proven impossible to get sufficient resolution to determine the molecular orientation^{8,9}.

Spectroscopy gives vibrational data, which are still insufficient to determine the structures of phases II, III, and IV. There have been many and varied attempts to identify the structures via simulations^{20,21,22,23,24,25,26,27,28,29,30}. However, a consensus has not yet been reached. Based on fully ab initio calculations, including density functional theory (DFT) or quantum Monte Carlo (QMC)^{31}, a number of candidate structures have been proposed for Phase II and III. Besides differing molecular orientation, they are all similar, consisting of primitive cells with lattice sites close to hcp^{22}. Among the structures, the *P*2_{1}/*c*-24, *C*2/*c*-24, and *P**c*-48 structures provide low-energy candidate structures for phases II, III, and IV.

The modern theory of the structure of these phases is based around electronic structure calculations. The early work involved calculating the ground state, assuming classical nuclei, then adding quantum-nuclear effects via the quasiharmonic approximation. This methodology, whether based on DFT or QMC, predicts hcp-like ground states for Phases I–III in agreement with X-ray data. However the spectroscopic signature of the Phase II—the appearance of many sharp, low-frequency, and peaks^{11,32}—is not well reproduced by the quasiharmonic calculations. As explained in the previous paragraph, the likely cause is a failure of the harmonic mode assumption for excited states, rather than the DFT itself.

To understand the high-temperature phases, one needs to examine non-harmonic behaviour, including rotation, which means going beyond a single unit cell, e.g. using molecular dynamics. Molecular dynamics requires forces on each atom based on the positions of all the atoms in the system, which requires a force model which is fast enough to allow large simulations. Here we use a machine-learning approach to derive a transferable force model based on an interatomic potential. There are several approaches to machine-learning interatomic forces^{33,34,35}, which balance speed, transferability and accuracy. We adopt an approach focusing on transferability.

Any machine-learned potential should conserve energy, and therefore be based on a Hamiltonian (the potential). Forces are guaranteed to be conservative if they depend on translational and rotational invariant quantities: the “fingerprint” of each atom. We are interested in molecular phases here, so our potential specifies which atoms are “bonded” and allows stretching but not breaking of bonds.

In this paper we apply the machine-learning approach so create an interatomic potential for molecular hydrogen. We show that the potential describes the three molecular solid phases, with free-rotor Phase I, and broken-symmetry Phase II and a high-pressure Phase III. Furthermore, the melt line has a maximum, such that at high pressure the liquid is denser than the hcp solid, a feature we attribute to short-range directional order giving lower quadrupole–quadrupole interaction in the solid. The potential is trained on energies and classical Hellmann–Feynman forces derived from standard DFT in the Born–Oppenheimer approximation adopted by all standard DFT codes, so the interatomic potential is the same for deuterium, hydrogen-deuteride (HD) and hydrogen. The potential accounts for binding due to electronic structure: contributions from quantum-nuclear effects can be incorporated using lattice dynamics or path integral methods.

## Results

### Fitting forces and the phase diagram

A particular challenge for hydrogen comes from the hierarchy of energies. The covalent bond is much stronger than the van der Waals attraction between molecules, which is turn is much stronger than the EQQ interactions which determine molecular orientations. To address this our potential combines a hierarchical fitting strategy alongside machine-learning (HMLP) described below.

For transferability testing, we used the standard approach of fitting to a subset of the data and testing against a different subset. Furthermore, we used an iterative fitting process: a trial potential was fitted, and applied in both Phase II annealing and melting line MD simulations. If novel configurations were found, they were used to generate more reference states for the DFT database, and the fitting process was repeated. This iterative process ensures that spurious structures are suppressed and the ground state structure is the same as found in DFT. Technical details of the forcefield parameterisation are given in the “Methods” section.

### Phase diagram

Figure 1 shows the very good agreement between the classical HMLP and the DFT phase diagrams. By eliminating finite size effects, the HMLP can capture the full long-range correlations, however, this does not appear to have a significant effect on the phase boundaries.

In Phase I, the H_{2} molecules exhibit free rotation at pressures below 40 GPa and temperatures below 900 K (orange hexagon symbols); at higher pressures the rotation is inhibited but there is no long-ranged orientational order. At low temperatures, Phase II becomes stable (red triangles), and the stable temperature region increases gradually with pressures. At high temperatures, the hcp lattice collapses to a liquid state. The calculated melting curve has a strong positive slope (*d**T*/*d**P* > 0) at low pressures, reaches a maximum at around 900 K and 90 GPa, and then drops. The HMLP predicted phase diagram agrees reasonably with experimental observations, as well as DFT (Fig. 1).

The HMLP and DFT predictions are good for the melt curve, but both overstabilise the broken-symmetry Phase II. This is due to the lack of quantum-nuclear effects, notably the zero-point energy, and can be addressed by including quantum-nuclear effects in the simulation. The discrepancy in the Phase I–II line does not indicate any inaccuracy of the HMLP itself. Our predicted melting curve is consistent with experiments: the value for the melting curve maximum is located between 80 and 100 GPa and 900 K, similar to the HMLP potential values. It also agrees with two-phase ab initio simulations, which proposed a gradual softening of the intermolecular repulsive interactions as its cause^{16}. The close agreement of the HMLP transition pressure with experimental data enables us to accurately simulate behaviours of temperature- or pressure-driven phase transition between Phase I and II. A “Phase III” is observed at higher pressures, corresponding to a different symmetry-breaking. However, by design the present HMLP model should start to fail to capture the properties of H_{2} at still higher pressures, where molecule dissociation needs to be considered.

### Nature of Phase I

Phase I can be easily recognised in MD by ordering of the molecular centres on the hcp lattice, and disorder of the orientations. Although frequently referred to as a free-rotor phase, we find this to be true only at low pressures. As pressure is increased the angular momentum auto-correlation becomes shorter than a single rotation, and then acquires a negative component, indicating that the molecule is librating.

Another characteristic of Phase I is the molecular vibration or “vibron”: in Raman scattering this corresponds to the in-phase vibration of all molecules. The vibron frequency first increases, then decreases with pressure. Two plausible reasons are given for this reduction: either increased intermolecular coupling or weakening of the covalent bond. In our model, the covalent bond is always described by the same Morse parameters, so changes in the vibron frequency can arise only from resonant interactions between molecules, not weakening of the bond. Thus reproducing the reentrant vibron behaviour is a test of both the physical basis and the parameterisation of the model.

Since the molecules are rotating in Phase I, lattice dynamics cannot be used, so Raman phonon frequency is numerically characterised by the in-phase mode-projected velocity auto-correlation function (VAF)^{36}. Trajectories and velocities were produced from 150 K HMLP-MD simulations within the micro-canonical ensemble (NVT) initiated in the *P*2_{1}/*c-*24 structure. A very fine time step of 0.05 fs was used and the trajectory and velocities were saved every ten time steps. By calculating the bond stretching velocities and Fourier transformation of the VAFs we calculate both the total vibron density of states^{26,37} from

and the signal from the most strongly Raman-active mode,

where *i**k* runs over all molecules (comprising atoms *i* and *k*). A similar projection method is used for the E2g phonon^{38}.

Figure 2 plots the calculated total vibron spectra of solid and liquid hydrogen as a function of pressure from MD simulations. Both show a signature of vibron turnover above a critical pressure (about 54 GPa), consistent with the experimental observations^{11}. This proves that bond weakening is not required for the turnover, since our potential has a fixed bond strength. Notably, the mean bond length in Phase I decreases monotonically with pressure (Fig. 2c), again at odds with ideas of bond weakening. What appears to be happening is a competition between two effects: at higher pressures the compression of the bond causes an increase in the frequency due to anharmonicity in the potential, whereas above 54 GPa the frequency is lowered due to coupling between the molecules.

The hcp structure has a Raman-active mode (E2g symmetry) corresponding to shearing motion of the basal plane. The frequency of this mode is experimentally well-determined and extremely pressure-dependent, from 36 cm^{−1} at zero pressure to 1100 cm^{−1} at 250 GPa^{39,40,41,42}. Figure 2d shows a comparison of our potential with experimental pressure dependencies *ν*(*P*) of the E2g optical. The red symbols in are our HMLP-MD predictions, consistent with the DFT data of this mode extracted from our calculations. Comparing the calculations with experiment shows that the HMLP predicted frequency curves agree better with experiment than existing isotropic empirical potentials^{43,44} (olive curve).

### Denser than close-packed liquid

Figure 1 shows that the potential correctly reproduces the turnover and negative Clapeyron slope. We investigated the possible explanation for this denser than close-packed liquid. Figure 3a shows the equation of state for both solid and liquid phases, with the crossover indicating where the liquid is denser than the solid. The HMLP predicts a negative thermal expansion, which is consistent with DFT^{45}. The normalised radial distribution function (Fig. 3b) shows that the liquid structure is essentially unchanged with pressure up to the pressures where bond breaking becomes a factor. We therefore deduce that the denser liquid is not related to the molecular–atomic transition.

Figure 4 compares the solid and liquid at the melting point. They are remarkably similar: close to the phase boundary the liquid shows five discernible neighbour peaks indicating short-ranged structure to 10 Å. The molecular bond length is longer in the liquid than the solid (shown more clearly in Fig. 2), but the separation between molecules is noticeably smaller in the liquid as evidenced by the first peak in the molecule–molecule RDF. This means that the molecules get closer together in the liquid, despite being longer.

Intermolecular interactions are dominated by quadrupole–quadrupole interactions and steric repulsion. Table 1 shows the implied contribution from quadrupole–quadrupole interactions calculated by electrostatics from HMLP simulations. Although the ML potential has no explicit electrostatic terms, there is a strong orientation correlation, which lowers the quadrupolar energy, not only in Phase II, but also in Phase I and to a lesser extent in the liquid.

We hypothesised that molecules can get closer together if their constituent atoms are further apart (i.e. in an “X” shape viewed down the intermolecular vector). By contrast, the solid has orientations which offer higher cohesive energy.

Figure 5b investigates this further using DFT, showing that the T configuration which optimises the quadrupole interaction becomes unfavourable with respect to the X configuration at a separation of 2.25 Å. As we saw in Fig. 4, the nearest neighbours are already this close by 20 GPa.

To quantify this, we looked across the various fingerprints to see which most strongly differentiate liquid from solid configurations on the melt line. This turns out to be *f**p*_*E*_{QQ}, the fingerprint with the same functional form as the quadrupole–quadrupole interaction: the MLP has learned that this is important. Figure 5a shows that the contribution from *f**p*_*E*_{QQ} becomes significantly higher in the solid above 70 GPa. We note that *f**p*_*E*_{QQ} is dimensionless, and its fitted contribution to the potential is different from the estimated quadrupole–quadrupole energy compared in Table 1.

These findings explain why the liquid is denser than the solid. At pressures above the melting point maximum, orientation-dependent interactions are strong enough that rotation is inhibited^{46,47,48}. However, these low-energy arrangements are typically associated with larger intermolecular distances (e.g. the T configuration). By contrast, the liquid favours other arrangements which allow the molecules to come closer.

The X configuration maximises the atom–atom distance for a given intermolecular separation. This and similar arrangements compensates for the smaller molecule–molecule distance in the liquid to give the same peak position in the atom–atom RDF in both liquid and solid.

### Nature of Phase II

We performed extensive HMLP-MD simulations around the Phase I–II boundary, with different starting configurations, to determine candidate structures and phase stability of H_{2} Phase II. At 150 K and 20 GPa, the orientations of the H_{2} molecular axes are almost randomly distributed along different directions, indicating Phase I with freely rotating molecules. Upon compression to the high pressure of 80 GPa and cooling to a low temperature of 50 K, the material transforms to an orientationally ordered phase in which the molecular rotations are restricted (Fig. 1). By carefully comparing it with candidate structures proposed by ab initio calculations, we find that the lattice and molecular ordering is close to *P*2_{1}/*c-*24, which has been one of the most thoroughly studied and strongest candidate for Phase II^{5,22,23,49}.

The HMLP does not include quadrupole interactions explicitly—it has “learnt” them. Table 1 shows what the quadrupole interactions would be, using electrostatic calculation based on the HMLP configurations. The large negative values indicate the prevalence of quadrupole-type ordering: strongest in Phase II. The differences, tens of meV, is of similar magnitude to the phase transformation temperature. The HMLP has learnt that Phase II is stabilised by quadrupole interactions.

The corresponding RDF indicates that the centre of each molecule remains close to the hcp lattice sites. Consequently, we define an order parameter *O* relating the structures of Phase II in our HMLP-MD simulations with the static-lattice DFT predictions of *P*2_{1}/*c-*24. The average value \(\langle O\rangle\) exhibits a sharp change as the system transitions from the structured Phase II to the rotationally symmetric Phase I with increasing temperature. Transition temperatures were taken at discontinuous jumps in \(\langle O\rangle\). This produces the phase diagram shown in Fig. 6b. Note that the transition temperatures obtained from analysis of \(\langle O\rangle\) agree with those obtained from peaks in the heat capacity calculated as \({\left(\frac{\partial H}{\partial T}\right)}_{P}\) from finite differences (see Supplementary Fig. 5). The phase boundary agrees well with the experiment^{32}, particularly for the more classically behaved deuterium. Similarity to experiments on deuterium rather than hydrogen is perhaps unsurprising, since nuclear quantum effects such as zero-point motion are significant at the low temperatures investigated here.

### Transition to Phase III

Above 160 GPa we find a high-pressure transformation to a broken-symmetry structure different from Phase II dominated by efficient packing rather than EQQ, this is at approximately the same pressure as Phase III.

Experimentally, Phase III is associated with a sharp drop in the vibron frequency and the appearance of a strong IR signal. This implies a non-centrosymmetric structure and a weakening of the molecules. In studies of HD a process of bond dissociation and recombination (“*DISREC*”, 2HD → H_{2} + D_{2}) has been observed^{50}. This bond breaking is seen in DFT to also occur in pure H_{2}^{26,30}. Our potential does not allow for bond breaking, so we have not studied the dynamics of this “Phase III” in detail.

## Discussion

In summary, we have introduced a heirarchical, iterative machine-learning based interatomic potential for atomistic simulations of H_{2} molecules, by directly learning from reference ab initio molecular dynamics simulations. The resultant HMLP-MD approach predicts angular energy dependence in the range of tens of meV/atom and demonstrates good transferability to various structural environments. Several applications have been presented for which our potential is particularly well suited. The fast, transferrable potential is also suitable for a wide range of further applications and extensions, including compounds, bond breaking, and path integral calculations.

The simulations reproduce the equilibrium temperature–pressure phase diagram for molecular phases (I, II, III, and melt, *P* < 160 GPa). The maximum in the melt curve in hydrogen is highly counterintuitive—it requires that the liquid is denser than the hexagonal close-packed solid. By detailed simulation we resolve this by showing that certain molecular orientations (e.g. X) allow the molecules to approach more closely, while others (e.g. T) have lower quadrupole energy. By monitoring an order parameter corresponding to the quadrupolar interactions find that, at high pressures, Phase I develops intermolecular correlations which lower the energy, while the liquid has correlation which lower the volume.

The simulations of the Phase I–II boundary show a transition from an ordered Phase II to a orientation-disordered Phase I. The molecules in Phase I are not freely rotation, and we find that rotation about the *c*-axis persists to higher-pressure/lower-temperature than rotation out of plane.

Our HMLP potential also has shown the capability of predicting the pressure dependence of the Raman-active E2g mode, consistent with experiment and previous DFT calculations. We explain the maximum frequency of the vibron as due to competition between molecular compression and stronger intermolecular coupling. Weakening of the covalent bond is not required.

## Methods

### Machine-learned interatomic potential

#### Learning dataset

Structures for reference atomic environments and benchmarks were accumulated from DFT-based ab initio MD runs. The DFT calculations were performed using the CASTEP package^{51} within the Perdew–Burke–Ernzerhof generalised gradient approximation (PBE)^{52} for the exchange–correlation function. A cutoff energy of 1000 eV for the plane-wave basis set and a *k*-point mesh of 1 × 1 × 1 were selected. To ensure the transferability of the potential to a wide variety of atomistic situations, H_{2} in different geometric arrangements was considered, including modest-sized bulk samples in Phase I, II, and liquid, composed of 144 H_{2} molecules. Moreover, unusual configurations found in HMLP-MD with preliminary versions of the potential were added to the DFT training set to improve performance and transferability. The final accumulated dataset includes up to 41,468 configurations, which are provided as a separated file in the Supplementary Information.

The PBE functional was chosen because it has become the de facto standard in studies of molecular systems, and high-pressure hydrogen. PBE has been criticised for overstabilising the metallic phases: this is due to its behaviour at high electron density gradients^{53}, and not vdW corrections. In this work we only consider molecular phases, so this is not a concern.

We emphasise that determining the suitable dataset is not straightforward. Numerous iterations of the potential were required to obtain a good fit to the phase diagram. A good fit to DFT energies of known phases is not evidence that other phases are unstable: we test each iteration of the potential by running MD simulations in NPT ensemble, cycling the pressure and temperature between the regions expected for Phases I, II, and III to ensure that all phase transitions were between the fitted phases. When a crystal structure different from the fitted one was found, this new phase was calculated using DFT. If this showed that the ML potential described the phase poorly, it was added to the training set and the training redone. The success of such strategy is supported by an example shown in Supplementary Note 2.

#### Covalent bond

The covalent bonding contribution to the force is approximated as \(({F}_{1}^{{\mu }_{12}}\,-\,{F}_{2}^{{\mu }_{12}})/2,\) where \({F}_{1}^{{\mu }_{12}}\) is the component of atomic force projected down the molecular axis. We examined various options for fitting the covalent bond: harmonic, Lennard-Jones and Morse potentials. Although the Morse form provides the best fit across our dataset using simple regression, we adopt the harmonic form in our potential to prevent artificial bond-breaking events at high pressure (>120 GPa). Supplementary Table 1 shows the fitted parameters for the harmonic and Morse form, respectively.

#### Non-bonded interactions

We describe the short-ranged Coulomb and van der Waals potentials using pairwise functions to create the fingerprint. These are built using Gaussians with a smooth cutoff in the form

combined with damped sinusoidal function form

where **r**_{ij} is the distance between atom *i* and *j* with *η*_{k} the range of the *k*th fingerprint, and *f*_{cut}(*r*_{ij}) is a damping function for atoms within the cutoff distance.

These fingerprints are mapped onto the corresponding residual energies, defined by the DFT energies less the contribution from the covalent bonding contribution.

This mapping is achieved using the kernel ridge regression (KRR) method which is capable of handling complex nonlinear relationships^{54}. The details of parameterisation and mapping algorithms are shown in Supplementary Note 3.

#### Orientation-dependent interactions

The orientation-dependent contribution is the last considered. These are fitted to the residuals once covalent and pairwise interactions are subtracted from the DFT energies. The corresponding fingerprints for the orientation-dependent interactions are listed in Supplementary Note 3 and Supplementary Table 2.

The fidelity of our HMLP is evaluated by the comparison of our ML prediction and the DFT calculations in Supplementary Fig. 6. The mean absolute error is of the order of the expected numerical and theoretical accuracy of the reference quantum mechanics-based calculations, indicating good performance of the present ML model.

### Machine-learned molecular dynamics

The simulations were performed using periodic boundary conditions and a time step of 0.5 fs. The Nose–Hoover thermostat and the Parrinello–Rahman barostat^{55} were used for controlling temperature and pressure, respectively. All simulations were carried out using the LAMMPS package and the atomic configurations were visualised with the AtomEye package. Typical models of the H_{2} system was created with *P*2_{1}/*c-*24 structure containing up to 72,576 molecules. To reproduce the entire temperature–pressure phase diagram, the NPT simulations of 1152-atom supercells of *P*2_{1}/*c-*24 structure were carried out at selected temperatures and pressures, from which we can identify the corresponding stable phases and melting point via the Z-method^{56}. Furthermore, the phase-coexistence method^{57} with co-existing 27,648 molecules of H_{2} solid and liquid was adopted to determine the properties of solid and liquid phases at the melting curve. To probe the Phase I–II boundary, 2304 atom supercells of the Phase II *P*2_{1}/*c-*24 structure were allowed to equilibrate for 250 ps at a series of pressures and temperatures.

### Analysis of molecular dynamics

To distinguish between the broken-symmetry structure of Phase II and the free rotors of Phase I we introduce the orientational order parameter *O*.

Here the summation is over unit cells *i*, *j*, each containing a set of basis molecules *b*. Unit vectors \({\hat{{\bf{r}}}}_{i,b}\) and \({\hat{{\bf{r}}}}_{j,b}\) are oriented along the the H–H bond of the *b*th molecule in the *i*th and *j*th unit cell, respectively, and **R**_{ij,b} is the distance between the centre of mass of these two molecules. The angled brackets denote a time average. This parameter probes the long-range order in the system relative to the chosen basis, which in our case is the *P*2_{1}/*c-*24 unit cell^{23}. A value of 1 means that the system has the *P*2_{1}/*c-*24 structure, and a value of 0 suggests that the system is disordered. Note that this order parameter only detects similarity to the given basis and thus a phase change to a structure with a different unit cell will yield an erroneously low value. The trajectories were therefore visually inspected in addition to the order parameter analysis. A 4 × 6 × 4 supercell of *P*2_{1}/*c-*24 (2304 atoms) was used for Phase II, and the unit cells and basis for this system are illustrated in Supplementary Fig. 7. MD trajectories were calculated for temperatures ranging from 30 to 150 K and pressures from 20 to 160 GPa. After a 50 ps equilibration period, the order parameter *O* was averaged over the remaining 200 ps of the trajectory. The results are shown in Fig. 6a.

The Raman-active phonons were extracted from the MD using the projection method which automatically includes anharmonic effects^{36,38}.

We tried numerous approaches to measure the orientation relationship between adjacent molecules, *i* and *j* with interatomic vectors \({\overrightarrow{\sigma }}_{i}\) and \({\overrightarrow{\sigma }}_{j}\) separated by \(\hat{\overrightarrow{R}}\). For Table 1 we used the explicit equation for linear quadrupoles:

where *Q* = 0.26 DÅ is the quadrupole moment of the H_{2} molecule and the orientational factor \(\Gamma (\overrightarrow{{\sigma }_{i}},\overrightarrow{{\sigma }_{j}},\hat{\overrightarrow{R}})\) is defined as:

## Data availability

Datasets used in generation of the H_{2} potential are available at the Edinburgh DataShare https://doi.org/10.7488/ds/2874.

## Code availability

Plugin code with which the interatomic potential can be implemented through LAMMPS is available as supplementary online material in Supplementary Dataset 1.

## References

- 1.
Dewar, J. Sur la solidification de l’hydrogene.

*Ann. Chim. Phys.***18**, 145–150 (1899). - 2.
Van Kranendonk, J.

*Solid Hydrogen*. (Plenum press, New York and London, 1983). - 3.
Lorenzana, H. E., Silvera, I. F. & Goettel, K. A. Orientational phase transitions in hydrogen at megabar pressures.

*Phys. Rev. Lett.***64**, 1939 (1990). - 4.
Silvera, I. F. & Wijngaarden, R. J. New low-temperature phase of molecular deuterium at ultrahigh pressure.

*Phys. Rev. Lett.***47**, 39 (1981). - 5.
van de Bund, S. & Ackland, G. J. Quadrupole arrangements and the ground state of solid hydrogen.

*Phys. Rev. B***101**, 014103 (2020). - 6.
Hemley, R. J. & Mao, H. K. Phase transition in solid molecular hydrogen at ultrahigh pressures.

*Phys. Rev. Lett.***61**, 857–860 (1988). - 7.
Lorenzana, H. E., Silvera, I. F. & Goettel, K. A. Evidence for a structural phase transition in solid hydrogen at megabar pressures.

*Phys. Rev. Lett.***63**, 2080–2083 (1989). - 8.
Akahama, Y. et al. Evidence from x-ray diffraction of orientational ordering in phase III of solid hydrogen at pressures up to 183 GPa.

*Phys. Rev. B***82**, 060101 (2010). - 9.
Ji, C. et al. Ultrahigh-pressure isostructural electronic transitions in hydrogen.

*Nature***573**, 558–562 (2019). - 10.
Goncharov, A. F., Hemley, R. J., Mao, H.-K. & Shu, J. New high-pressure excitations in parahydrogen.

*Phys. Rev. Lett.***80**, 101–104 (1998). - 11.
Goncharov, A. F., Gregoryanz, E., Hemley, R. J. & Mao, H.-K. Spectroscopic studies of the vibrational and electronic properties of solid hydrogen to 285 GPa.

*Proc. Natl Acad. Sci. USA***98**, 14234–14237 (2001). - 12.
Gregoryanz, E., Goncharov, A. F., Matsuishi, K., Mao, H.-K. & Hemley, R. J. Raman spectroscopy of hot dense hydrogen.

*Phys. Rev. Lett.***90**, 175701 (2003). - 13.
Akahama, Y., Kawamura, H., Hirao, N., Ohishi, Y. & Takemura, K. Raman scattering and x-ray diffraction experiments for phase III of solid hydrogen.

*J. Phys. Conf. Ser*.**215**, 012056 (2010). - 14.
Zha, C.-S., Cohen, R. E., Mao, H.-K. & Hemley, R. J. Raman measurements of phase transitions in dense solid hydrogen and deuterium to 325 GPa.

*Proc. Natl Acad. Sci. USA***111**, 4792–4797 (2014). - 15.
Howie, R. T., Dalladay-Simpson, P. & Gregoryanz, E. Raman spectroscopy of hot hydrogen above 200 GPa.

*Nat. Mater.***14**, 495–499 (2015). - 16.
Bonev, S. A., Schwegler, E., Ogitsu, T. & Galli, G. A quantum fluid of metallic hydrogen suggested by first-principles calculations.

*Nature***431**, 669–672 (2004). - 17.
Deemyad, S. & Silvera, I. F. Melting line of hydrogen at high pressures.

*Phys. Rev. Lett.***100**, 155701 (2008). - 18.
Eremets, M. & Trojan, I. Evidence of maximum in the melting curve of hydrogen at megabar pressures.

*JETP Lett.***89**, 174–179 (2009). - 19.
Geng, H. Y., Hoffmann, R. & Wu, Q. Lattice stability and high-pressure melting mechanism of dense hydrogen up to 1.5 TPa.

*Phys. Rev. B***92**, 104103 (2015). - 20.
Kohanoff, J., Scandolo, S., Chiarotti, G. L. & Tosatti, E. Solid molecular hydrogen: the broken symmetry phase.

*Phys. Rev. Lett.***78**, 2783 (1997). - 21.
Kohanoff, J., Scandolo, S., de Gironcoli, S. & Tosatti, E. Dipole-quadrupole interactions and the nature of phase iii of compressed hydrogen.

*Phys. Rev. Lett.***83**, 4097 (1999). - 22.
Pickard, C. J. & Needs, R. J. Structure of phase III of solid hydrogen.

*Nat. Phys.***3**, 473–476 (2007). - 23.
Pickard, C. J. & Needs, R. J. Structures at high pressure from random searching.

*Phys. Status Solidi B***246**, 536–540 (2009). - 24.
Martinez-Canales, M. et al. Novel structures and superconductivity of silane under pressure.

*Phys. Rev. Lett.***102**, 087005 (2009). - 25.
Liu, H., Zhu, L., Cui, W. & Ma, Y. Room-temperature structures of solid hydrogen at high pressures.

*J. Chem. Phys.***137**, 074501 (2012). - 26.
Magdău, I. B. & Ackland, G. J. Identification of high-pressure phases III and IV in hydrogen: simulating raman spectra using molecular dynamics.

*Phys. Rev. B***87**, 174110 (2013). - 27.
Monserrat, B., Needs, R. J., Gregoryanz, E. & Pickard, C. J. Hexagonal structure of phase III of solid hydrogen.

*Phys. Rev. B***94**, 134101 (2016). - 28.
Ackland, G. J. & Magdău, I. B. Appraisal of the realistic accuracy of molecular dynamics of high-pressure hydrogen,

*Cogent Phys*.**2**, 1049477 https://doi.org/10.1080/23311940.2015.1049477 (2015). - 29.
Magdău, I. B., Marques, M., Borgulya, B. & Ackland, G. J. Simple thermodynamic model for the hydrogen phase diagram.

*Phys. Rev. B***95**, 094107 (2017). - 30.
Magdău, I. B. & Ackland, G. J. Infrared peak splitting from phonon localization in solid hydrogen.

*Phys. Rev. Lett.***118**, 145701 (2017). - 31.
Drummond, N. D. et al. Quantum monte carlo study of the phase diagram of solid molecular hydrogen at extreme pressures.

*Nat. Commun.***6**, 7794 (2015). - 32.
Liu, X.-D., Howie, R. T., Zhang, H.-C., Chen, X.-J. & Gregoryanz, E. High-pressure behavior of hydrogen and deuterium at low temperatures.

*Phys. Rev. Lett.***119**, 065301 (2017). - 33.
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces.

*Phys. Rev. Lett.***98**, 146401 (2007). - 34.
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons.

*Phys. Rev. Lett.***104**, 136403 (2010). - 35.
Botu, V. & Ramprasad, R. Adaptive machine learning framework to accelerate ab initio molecular dynamics.

*Int. J. Quantum Chem.***115**, 1074–1083 (2015). - 36.
Ackland, G. J. & Magdău, I. B. Efficacious calculation of raman spectra in high pressure hydrogen.

*High. Press. Res.***34**, 198–204 (2014). - 37.
Magdău, I. & Ackland, G. J. High temperature raman analysis of hydrogen phase iv from molecular dynamics.

*J. Phys. Conf. Ser.***500**, 032012 (2014). - 38.
Pinsook, U. & Ackland, G. J. Calculation of anomalous phonons and the hcp-bcc phase transition in zirconium.

*Phys. Rev. B***59**, 13642 (1999). - 39.
Hanfland, M., Hemley, R. J. & Mao, H.-K. Novel infrared vibron absorption in solid hydrogen at megabar pressures.

*Phys. Rev. Lett.***70**, 3760 (1993). - 40.
Hemley, R., Mao, H. & Shu, J. Low-frequency vibrational dynamics and structure of hydrogen at megabar pressures.

*Phys. Rev. Lett.***65**, 2670 (1990). - 41.
Mao, H.-K. & Hemley, R. J. Ultrahigh-pressure transitions in solid hydrogen.

*Rev. Mod. Phys.***66**, 671 (1994). - 42.
Loubeyre, P., Occelli, F. & LeToullec, R. Optical studies of solid hydrogen to 320 gpa and evidence for black hydrogen.

*Nature***416**, 613–617 (2002). - 43.
Silvera, I. F. & Goldman, V. V. The isotropic intermolecular potential for H2 and D2 in the solid and gas phases.

*J. Chem. Phys.***69**, 4209–4213 (1978). - 44.
Wijngaarden, R. J., Goldman, V. V. & Silvera, I. F. Pressure dependence of the optical phonon in solid hydrogen and deuterium up to 230 kbar.

*Phys. Rev. B***27**, 5084 (1983). - 45.
Geng, H. Y., Wu, Q., Marqués, M. & Ackland, G. J. Thermodynamic anomalies and three distinct liquid-liquid transitions in warm dense liquid hydrogen.

*Phys. Rev. B***100**, 134109 (2019). - 46.
Cooke, P. I. et al. The Raman signal of a hindered rotor.

*Phys. Rev. B***102**, 064102 (2020). - 47.
Ackland, G. J. & Loveday, J. S. Structures of solid hydrogen at 300 K.

*Phys. Rev. B***101**, 094104 (2020). - 48.
Afonina, V. S. Optical studies of diatomic molecules at extreme conditions. PhD thesis, University of Edinburgh, (2018).

- 49.
Geng, H. Y., Song, H. X., Li, J. & Wu, Q. High-pressure behavior of dense hydrogen up to 3.5 TPa from density functional theory calculations.

*J. Appl. Phys.***111**, 063510 https://doi.org/10.1063/1.3694793 (2012). - 50.
Dias, R. P., Noked, O. & Silvera, I. F. New phases and dissociation-recombination of hydrogen deuteride to 3.4 mbar.

*Phys. Rev. Lett.***116**, 145501 (2016). - 51.
Clark, S. J. et al. First principles methods using castep.

*Z. Kristallogr.***220**, 567–570 (2005). - 52.
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple.

*Phys. Rev. Lett.***77**, 3865 (1996). - 53.
Azadi, S. & Ackland, G. J. The role of van der waals and exchange interactions in high-pressure solid hydrogen.

*Phys. Chem. Chem. Phys.***19**, 21829–21839 (2017). - 54.
Kira, K. & Rendell, L. A. A practical approach to feature selection. In

*Machine Learning Proceedings 1992*, 249–256 (Elsevier, 1992). - 55.
Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: a new molecular dynamics method.

*J. Appl. Phys.***52**, 7182 (1981). - 56.
Belonoshko, A. B., Skorodumova, N., Rosengren, A. & Johansson, B. Melting and critical superheating.

*Phys. Rev. B***73**, 012201 (2006). - 57.
Mendelev, M. et al. Development of new interatomic potentials appropriate for crystalline and liquid iron.

*Philos. Mag.***83**, 3977–3994 (2003). - 58.
Pena-Alvarez, M. et al. Quantitative rotational to librational transition in dense H2 and D2.

*J. Phys. Chem. Lett.***11**, 6626 (2020).

## Acknowledgements

The authors acknowledge the ERC project HECATE for funding. We are grateful for computational support from the UK national high performance computing service, ARCHER, and from the UK Materials and Molecular Modelling Hub, which is partially funded by EPSRC (EP/P020194), for both of which access was obtained via the UKCP consortium and funded by EPSRC grant ref. EP/P022561/1. We thank Miriam Pena-Alvarez and Eugene Gregoryanz for the experimental phonon and vibron data. We thank Hua Geng for sharing complete DFT data from ref. ^{45}.

## Author information

### Affiliations

### Contributions

G.J.A. conceived the study, H.Z. coded and performed the machine-learning procedure, H.Z. and G.J.A. generated ab initio data, H.Z. and H.W. performed molecular dynamic simulations, all authors analysed the simulation results, and contributed equally to writing the manuscript.

### Corresponding authors

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests.

## Additional information

**Peer review information** *Nature Communications* thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Zong, H., Wiebe, H. & Ackland, G.J. Understanding high pressure molecular hydrogen with a hierarchical machine-learned potential.
*Nat Commun* **11, **5014 (2020). https://doi.org/10.1038/s41467-020-18788-9

Received:

Accepted:

Published:

## Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.