Introduction

The microscopic nature of an excess proton interpenetrated within a three-dimensional hydrogen bonded network of water remains a long-standing elusive aspect of aqueous acids, mainly due to the inherent spectral complexity of bulk water1,2,3,4,5,6,7. Gas-phase aqueous clusters with an excess proton of precisely controlled compositions, H+(H2O)n, thus offer useful bottom-up model systems that enable one to focus on the evolution of vibrational spectral features associated with the excess proton surrounded by a well-defined number of water molecules1,8,9,10,11,12,13,14,15. In this study we introduce a new protocol based on the combination of high level electronic structure theory and the inclusion of anharmonicity, and demonstrate that it is able to obtain the nearly complete assignment of the infrared (IR) spectrum of the magic number H+(H2O)21 cluster in excellent agreement with experiment.

Although the properties of isolated clusters are much simpler than those of the bulk liquid, their spectroscopic features are still complex, and the unequivocal interpretation of the spectroscopic signature of an excess proton in water clusters hinges on synergetic experimental and theoretical works16,17,18,19,20. Theoretical calculations of the IR spectra of candidate local-minimum structures aid the assignment of the vibrational features observed experimentally. The small-sized protonated water clusters, containing no more than 11 water molecules, have been extensively studied through experimental and theoretical works in the past decades16,17,19,20,21,22,23. The proton in water clusters has conventionally been considered to adopt one of two accommodation motifs, the Eigen form21 (i.e., a hydrated hydronium cation, H3O+(H2O)3) and the Zundel form16 (i.e., a proton shared between two water molecules, H2O···H+···OH2), which induce dramatically different vibrational features near 2660 and 1000 cm−1, respectively, in the IR spectrum12. Agmon and co-workers have studied the IR spectra of protonated water clusters of different sizes by using ab initio molecular dynamics simulations24,25,26,27, which contributed to a comprehensive understanding of the protonated water structures.

New spectral signatures emerge as the small cluster grows into a complex three-dimensional network with a cage morphology as the number of water molecules increases13. A long-standing puzzle regarding the evolution of the vibrational features in the intermediate size regime is the emergence of a pronounced intensity anomaly at the “magic number” size, H+(H2O)21 (refs. 1,15,28,29,30,31). Many previous studies have suggested that the H+(H2O)21 cluster adopts a configuration in which an Eigen-state cation H3O+, whose O-H bonds are hydrogen-bonded to three H2O molecules, tends to reside on the surface of a dodecahedral cage containing one interior H2O molecule29,30,32,33,34,35, as shown in Fig. 1. However, the link between the experimental IR spectrum and the theoretically predicted structure was missing since the computed IR spectra did not match the experiment. The harmonic normal-mode analysis based on density functional theory (DFT) predicted strong IR bands of the symmetric and asymmetric O-H stretching modes of the cage-surface-bound H3O+ near 2600 cm−1, which were not evident at all in the experimental spectrum1. Later, Torrent-Sucarrat and Anglada30 showed that the anharmonic coupling plays a crucial role in the characterization of the IR spectrum of the H+(H2O)21 cluster. Their calculation by the second-order vibrational perturbation theory (VPT2) predicted a strong asymmetric O-H stretching band of the H3O+ around 2000 cm−1, outside the region measured in the experiment. Recently, Fournier et al.10 successfully extended the range of the measurement down to 600 cm−1, and found that the agreement between the experiment and theory was only qualitative. One of the reasons for the discrepancy was the low level of theory used to treat the anharmonicity in predicting the IR spectrum of the H+(H2O)21 cluster. Yu and Bowman36 have shown that the higher-level vibrational configuration interaction (VCI) method achieves a significant improvement over VPT2.
Nevertheless, the calculation was based on a potential energy surface (PES) of H+(H2O)21 represented as a sum of many-body potential energy functions (PEFs) of small clusters, H+(H2O)n (n = 1–4)37,38,39 and (H2O)n (n = 1–3)40, derived from ab initio electronic structure calculations. VCI calculations were also carried out for fragments of the cluster, H3O+(H2O)3 (15 dimensions) and each H2O (3 dimensions). Theoretical calculations that account for both the electronic and vibrational structures of the full H+(H2O)21 cluster remain a challenge.

Fig. 1: Representation of the H+(H2O)21 cluster.

A The optimized Eigen-state H+(H2O)21 structure using the fragment-based CCD/aug-cc-pVDZ level of theory. H3O+ is at the top of the structure. There are two types of DDA water molecules: three are hydrogen-bonded with H3O+ (blue, DDAh) and three are distant from H3O+ (turquoise, DDAd). Nine AAD-type water molecules are colored in pink. Four-coordinated AADD water molecules are colored in purple and yellow, respectively, corresponding to the one in the interior (AADDi) and four at the surface (AADDs). B, C represent two asymmetric O-H stretching modes of H3O+, denoted \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1}}\) and \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2}}\).

Herein, we report a new protocol for computing the IR spectra of complex molecular clusters that has the potential of establishing a transformative opportunity in the field. We apply this protocol to the “guinea-pig” of the field, namely the IR vibrational spectrum of the H+(H2O)21 cluster with an Eigen structure. We rely on the state-of-the-art, fragment-based Coupled Cluster (CC) theory41,42 and the Second-order Vibrational Quasi-Degenerate Perturbation Theory (VQDPT2)43,44. Our previously developed electrostatically embedded generalized molecular fractionation (EE-GMF) method41, whose accuracy and efficiency have been rigorously evaluated in a series of studies, is utilized to reduce the computational scaling of the full system CC calculations. EE-GMF shows an acceleration by factors of >40 over the conventional full system calculations, while the deviations of EE-GMF calculated energies of systems containing over 100 water molecules at diverse ab initio levels are mostly within 0.01 a.u. as compared to the full system calculations41,45. VQDPT2 has been tested to be as accurate as VCI for small molecules43,44, but is scalable to many-mode systems. Recently, the method has been further improved by utilizing local coordinates and applied to strongly hydrogen bonded network in biomolecules46. In this work, VQDPT2 has been carried out in 89 dimensions using coordinates localized to each molecule of the H+(H2O)21 cluster (See the Methods and Supplementary Methods sections for details). The combination of high-level quantum electronic and vibrational calculation yields accurate spectral features compared to experiment, thus resolving the physical picture of an excess proton accommodation in this complex water network.

Results

The structure of H+(H2O)21

The optimized H+(H2O)21 structure at the fragment-based Coupled-Cluster Doubles level with the aug-cc-pVDZ basis set (CCD/aug-cc-pVDZ) is color-coded in Fig. 1A for a better clarification of the assignments of the vibrational bands. The Eigen-type hydronium ion (red sphere), integrating three hydrogen-bonded DDA water molecules (blue, denoted DDAh, where A and D stand for hydrogen-bond acceptor and donor, respectively), is located on the surface of the cage. Each of the DDAh water molecules donates a hydrogen bond to an AADD (yellow) and an AAD (pink) water molecule, respectively. The upper three AAD water molecules also connect to three lower DDA water molecules, which are all distant from the H3O+ cation (turquoise, denoted DDAd). There are five four-coordinated AADD water molecules, four of them on the surface (denoted AADDs) and the remaining one in the interior (denoted AADDi).

Theoretical and experimental IR spectra

The harmonic and VQDPT2 spectra computed at the fragment-based EE-GMF CCD/aug-cc-pVDZ level are shown in Fig. 2, along with the previous theoretical spectra obtained by the VPT2 method based on DFT at the B3LYP/6-31+G(d) level30 and the VCI method based on the many-body PEFs36, and the experimentally measured one10 for comparison. We also compare with the harmonic spectrum computed at the full-system MP2/aug-cc-pVDZ level (shown in Supplementary Fig. 2). The difference between the CCD/harmonic and CCD/VQDPT2 spectra computed in this study is quite pronounced (see Fig. 2). The harmonic spectrum shows distinct blue shifts in the range of 3000–3800 cm−1 compared to the VQDPT2 spectrum. In addition, the strong vibrational bands in the range of 1800–2200 cm−1 in the VQDPT2 spectrum are totally absent in the harmonic spectrum. The anharmonic correction leads to a drastic change in the spectral shape. Consequently, the VQDPT2 spectrum is much closer to the experimental result than the one predicted by the harmonic approximation, exemplifying the crucial role of anharmonicity in the characterization of the IR spectra of protonated water clusters.

Fig. 2: IR spectra of the H+(H2O)21 cluster.

IR spectra of the H+(H2O)21 cluster obtained by the harmonic approximation and VQDPT2 method based on the fragment-based CCD/aug-cc-pVDZ approach, as compared to the experiment10 and the previous calculations by VPT2 based on B3LYP/6-31+G(d)30 and VCI based on the many-body PEFs36. The fundamental excitations of H3O+ are denoted \({\nu }_{{H}_{3}{O}^{+}}^{r}\) (frustrated rotation), \({\nu }_{{H}_{3}{O}^{+}}^{u}\) (umbrella vibration), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1}}\) and \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2}}\) (asymmetric O-H stretching), \({\nu }_{{H}_{3}{O}^{+}}^{s}\) (symmetric O-H stretching), and those of H2O are denoted \({\nu }_{{H}_{2}O}^{lib}\) (libration), \({\nu }_{{H}_{2}O}^{b}\) (bending), \({\nu }_{AAD}^{free}\) (dangling O-H stretching of AAD), and (a–i) (O-H stretching). The resonance states between the fundamental excitation of asymmetric O-H stretching and the combination tones (r+u and r+b) are denoted \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},{a}^{2},r+u}\), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},r+u}\), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2},r+u}\), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},r+b}\), and \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2},r+b}\), where r, u, and b represent the frustrated rotation, the umbrella vibration, and the HOH bending of H3O+, respectively. The harmonic and VQDPT2 spectra are broadened using Lorentz functions with the full-width at half-maximum (FWHM) of 5 cm−1. The raw stick spectrum is shown in Supplementary Fig. 1.

Compared with the previous VPT2 spectrum at the B3LYP/6-31+G(d) level, the present result shows significant improvement in predicting the experimental IR bands, confirming the importance of using a high-level theoretical treatment for both the electronic (CCD/aug-cc-pVDZ) and vibrational (VQDPT2) parts. The VPT2 spectrum deviates significantly from the experiment, showing blue-shifted bands in the low frequency range of 1200–2600 cm−1 and distinctly red-shifted bands in the high frequency range of 2800–3700 cm−1. These discrepancies hamper a definitive assignment of the observed bands. In stark contrast, the VQDPT2 spectrum based on fragment-based CCD/aug-cc-pVDZ agrees very well with experiment in both low and high frequency regions. Note that the correction of higher-level electronic correlation effects by employing CCSD/aug-cc-pVTZ supports our present predictions (see Supplementary Fig. 3 and Supplementary Note 3), which substantiates the effectiveness of the CCD/aug-cc-pVDZ level in computing the IR spectral features for quantitative assignments. We further emphasize that the O-H bond-stretching PESs of the H3O+ cation and the H2O molecule, as well as their intermolecular interaction PESs, calculated by B3LYP/6-31+G(d) deviate significantly from benchmark results obtained using a high-level wavefunction theory (CCSD(T)/aug-cc-pVQZ), while the PESs calculated by CCD/aug-cc-pVDZ are in good agreement with the benchmarks (shown in Supplementary Fig. 4). This further supports the view that the vibrational band assignments benefit from high-level electronic and vibrational structure theories.

The previous VCI spectrum matches well with the present result. Some notable differences are: (1) VCI gives no signal in a range of 600–900 cm−1 because the librational modes of H2O were excluded from the calculation. (2) The IR band shape in a range of 1700–2800 cm−1 appears different, where VCI exhibits a strong, broad band around 1950 cm−1 and diminishes beyond 2200 cm−1, whereas VQDPT2 yields sharp peaks up to ~2700 cm−1. Nevertheless, the overall agreement of the IR spectra obtained by two different theoretical approaches indicates the robustness of the calculated results.

Band assignment

The assignment of the spectral features associated with the proton defect is of fundamental importance. The proton-induced absorptions and the associated motions of the surface-bound H3O+ (see Supplementary Fig. 5) are characterized in detail below. One of the three frustrated rotations gives a strong peak at 943 cm−1 (\({\nu }_{{H}_{3}{O}^{+}}^{r}\)). The distinct and isolated peak calculated at 1267 cm−1 corresponds to the umbrella vibration (\({\nu }_{{H}_{3}{O}^{+}}^{u}\)), which agrees with the experimental band at 1220 cm−1. The calculated continuous absorption occurring over the range of 1720–2100 cm−1, corresponding to a broad band in the same region in the experiment, is due to the asymmetric O-H stretching modes of the surface-bound H3O+, which supports the assignment in the previous works10,36. Two asymmetric O-H stretching modes of H3O+ (\({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1}}\) and \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2}}\)), illustrated in Figs. 1B and 1C, are predicted at 1791 and 2035 cm−1, respectively. Moreover, the peak at 1949 cm−1 is attributed to a resonance state of the asymmetric O-H stretching modes of H3O+, combination tones of the frustrated rotation and umbrella vibration of H3O+, and the libration of DDAh water (\({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},{a}^{2},r+u}\)). Note that the H-O-H bending of H3O+ is calculated at ~1730 cm−1, but its intensity is too weak to be noticed. The origin of the strong band near 2220 cm−1 and the shoulder near 2400 cm−1 has been a matter of discussion. It was speculated that the band at 2220 cm−1 stemmed from a combination of the H3O+ bend and the frustrated rotation by comparing with a similar band in an Eigen cluster, H+(H2O)4 (refs. 10,47). The present calculation predicts two absorption bands at 2261 and 2409 cm−1, which correspond to the experimental signatures at ca. 2220 cm−1 and 2400 cm−1, respectively.
The former is assigned to a resonance state of the asymmetric O-H stretching of H3O+ and the combination tones of frustrated rotation and umbrella vibration of H3O+ (\({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},r+u}\), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2},r+u}\)), while the latter is attributed to the symmetric O-H stretch of H3O+ (\({\nu }_{{H}_{3}{O}^{+}}^{s}\)). The broad absorption around 2720 cm−1 was rarely explored due to the weak intensity in the experimental spectrum. The present calculation offers a clear-cut assignment of the 2720 cm−1 band to a resonance state of the asymmetric O-H stretching and a combination tone of the frustrated rotation and H3O+ bend (\({\nu }_{{H}_{3}{O}^{+}}^{{a}^{1},r+b}\), \({\nu }_{{H}_{3}{O}^{+}}^{{a}^{2},r+b}\)). Therefore, our work reveals the origin of bands associated with the motion of the surface-bound H3O+. The assignments are summarized in Supplementary Table 1.

Let us now focus on IR bands of neutral water molecules. The experimental spectrum gives broad features in a low-frequency range from 600 to 1000 cm−1, which was assigned to the librational motion of neutral water molecules10. The present calculation supports the assignment, yielding IR bands of the librational motion in the same range. The three predicted absorption peaks (696, 720, and 746 cm−1), occurring below 750 cm−1, arise from the libration of three neighboring DDAh-type neutral water molecules around the surface-bound H3O+ ion in the single hydrogen-bond acceptor configurations. The remaining bands in this region mainly correspond to the libration of AAD-type water molecules far from H3O+. The strongest peak calculated at 855 cm−1 agrees well with the experimental peak around 840 cm−1. The detailed assignments of the librational bands are summarized in Supplementary Table 2, and the vibrational motion is illustrated in Supplementary Fig. 6.

The bending vibrations of neutral water molecules are predicted to be around 1600 cm−1, in good agreement with a sharp band observed at 1620 cm−1 in the experiment. The broad envelope from 3000 to 3600 cm−1, with sequential peaks denoted by letters (a–i), is associated with the O-H stretching motions of the neutral water molecules. These features have attracted increased attention, because they serve as a useful marker to characterize the hydrogen bond network in various systems34. Yu and Bowman36 assigned some of these spectral signatures, tracing them to specific types of water molecules through the one-to-one correspondence with the VCI spectrum. We have performed a more comprehensive analysis of the VQDPT2 spectrum and reconfirmed most of their assignments. The assignments of neutral water O-H stretching bands are shown in Fig. 3 and the details are given in Supplementary Note 1 and Supplementary Table 3.

Fig. 3: Assignments of O-H stretching features of neutral water molecules.

(Upper panel) Enlarged view of the VQDPT2 spectrum in the range of O-H stretching of neutral water molecules. The assignments are indicated in colors and labels according to the type of corresponding water molecules. Notations: Fermi resonance (Fermi), combination band (Comb), mixed vibration with different type of water (+). (Lower panel) The O-H stretching motions of the neutral water molecules. (A) Free O-H stretch of the AAD-type water molecule. (B), (C) Two DDA-type water molecules in feature (a). (D), (E) The AADDs-type water molecules in feature (b). (F), (G), (H) The AAD, AADDs, and DDAh-type water molecules in feature (c). (I), (J) The DDAh and AAD-type water molecules in feature (d). (K) The AAD-type water molecule in feature (e). (L) The AADDi-type water molecule in feature (f).

Two types of AADD water

It is noticeable that the O-H stretching frequency of the AADDs-type water molecules is significantly higher than that of the AADDi-type water molecule (features b and f, respectively, in the upper panel of Fig. 3), which suggests two distinct structures of four-coordinated water molecules. The structural comparison of the AADDi and AADDs water molecules in the H+(H2O)21 cluster, with reference to the ice Ih48 and liquid water42, is summarized in Table 1. The major difference between the interior and surface four-coordinated water of the H+(H2O)21 cluster lies in the angle of the hydrogen bonds. The angles of the hydrogen bonds formed between the AADDi-type water molecule and its neighbors are all near 177°, as shown in the inset of the upper panel of Fig. 3, which is almost the same as the average hydrogen-bond angle (178°) in ice Ih. These strong hydrogen bonds of the AADDi-type water molecule in the ice-like structure account for the relatively low O-H stretching frequency. In contrast, the hydrogen bonds formed between the AADDs-type water molecules and their partners are distorted (see the inset of Fig. 3), exhibiting a structural property similar to liquid water. The average hydrogen-bond angle of the liquid-like AADDs water molecules is 162° in the H+(H2O)21 cluster, which is very close to the average angle of 158° in liquid water, indicating that the AADDs water molecules are less confined, and their hydrogen bonds with their partners are weaker than those of the AADDi water molecule. These imperfect, and therefore weaker, hydrogen bonds are responsible for the high O-H vibrational frequency of the more flexible AADDs water molecules in a less confined environment, as compared to the AADDi water molecule. The difference between the two types of AADD water molecule in the H+(H2O)21 cluster is also addressed in comparison with the experimentally observed bulk spectra of ice Ih49 and liquid water50, as shown in Supplementary Fig. 7.
The O-H stretching frequency of liquid water shows a clear blue shift by ~180 cm−1 with reference to ice Ih, which is in good agreement with the present calculation (shifted by ~165–185 cm−1) and further proves the existence of two types of AADD water molecule in the H+(H2O)21 cluster. Our study demonstrates that there are two kinds of four-coordinated water molecules in the H+(H2O)21 cluster based on direct comparison between the experimental IR spectrum and high-level wavefunction theory calculations. The internal and surface four-coordinated AADD water molecules correspond to the ice-like and liquid-like water, respectively, as indicated by their differences in local tetrahedral structure, hydrogen-bond strengths, and vibrational spectral signatures. This can provide a bottom-up framework for understanding the structural differences at the molecular level between fully coordinated, bulk-like water and interfacial water at the water/solid or water/vapor interfaces51.
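The hydrogen-bond angles discussed above can be evaluated directly from atomic coordinates. The following is a minimal sketch, assuming that the hydrogen-bond angle is the O-H···O angle measured at the hydrogen atom (the text does not state the definition explicitly); the function name `hbond_angle` and the coordinates are illustrative and are not taken from the cluster structures.

```python
import math

def hbond_angle(donor_o, hydrogen, acceptor_o):
    """Angle O-H...O in degrees, measured at the hydrogen atom (assumed definition)."""
    v1 = [d - h for d, h in zip(donor_o, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor_o, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Illustrative geometries in angstrom (hypothetical, not from the paper).
donor = (0.0, 0.0, 0.0)
h_atom = (0.98, 0.0, 0.0)
linear_acceptor = (2.75, 0.0, 0.0)
bend = math.radians(18.0)  # 18 degrees away from linearity
bent_acceptor = (0.98 + 1.77 * math.cos(bend), 1.77 * math.sin(bend), 0.0)

linear = hbond_angle(donor, h_atom, linear_acceptor)
bent = hbond_angle(donor, h_atom, bent_acceptor)
```

Averaging such angles over the four hydrogen bonds of each AADD molecule would give quantities comparable to those quoted for the AADDi and AADDs sites, provided the same angle definition is used.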

Table 1 The structural comparison of the simulated liquid water, ice Ih, and the H+(H2O)21 cluster (d in angstrom, angles in degrees). The results for the H+(H2O)21 cluster are average values over 4 cluster structures shown in Fig. 5.

Intermolecular couplings of water molecules

Given an increasing interest in the relaxation of O-H stretching vibration in caged water clusters52, let us address the effect of the water–water couplings on the calculated spectrum. The intermolecular coupling between the water molecules was taken into account in VQDPT2 calculations by including bi-linear coupling terms in the PES. Note that the water–water coupling was excluded in the previous work36. The VQDPT2 spectrum excluding the water–water couplings (except for DDAh-DDAh so as to retain the inter-molecular couplings of the H3O+(H2O)3 moiety) is shown in Fig. 4A for the O-H stretching region, and in Supplementary Fig. 8 for the lower frequency region. The two spectra with and without the water–water couplings give major peaks in similar positions, and thus the overall appearance looks similar. Nonetheless, the presence of the water–water coupling generally makes the spectrum broader and more spread out. For example, the O-H stretching band in the range of 3100–3400 cm−1 exhibits noticeable differences. In the fully coupled model, the peaks calculated at 3284 and 3166 cm−1 (denoted as g and i, respectively) manifest an intra-molecular Fermi resonance of an AAD-type water molecule between an overtone of the bending mode (No. 100) and a fundamental of the O-H stretching mode (No. 101). Interestingly, the lower frequency component is further resonant with an overtone of the bending mode (No. 127) of a neighboring AAD-type water molecule. The vibrational modes, the resonance diagram, and the components of the vibrational wavefunctions are shown in Fig. 4B. It is notable that the coupling constant between modes 100 and 127 (calculated as 12 cm−1) is not particularly large compared to others; for example, the coupling constants of mode 100 with bending modes of DDAd- and AADDs-type water molecules in the nearest neighbors are obtained as 14 and 16 cm−1, respectively.
Instead, the state mixing is induced by the match of frequencies, where the fundamental excitations of the two AAD-type water molecules are obtained as 1605 and 1609 cm−1, whereas those of DDAd- and AADDs-type water molecules are higher in frequency at 1640 and 1651 cm−1. The result implies a novel relaxation pathway of the O-H stretching excitation energy through bending overtone states mediated by the vibrational resonance53. These peaks are observed around 3230 and 3110 cm−1 in the experiment.
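The argument that the frequency match, rather than the coupling strength, drives the state mixing can be illustrated with a two-state model. The sketch below is a deliberate simplification and not the VQDPT2 treatment: it assumes that the quoted coupling constants act directly as the off-diagonal element between two states placed at the quoted bending fundamentals, and that the values 14 and 16 cm−1 pair with 1640 and 1651 cm−1, respectively. The function name `two_state_mixing` is hypothetical.

```python
import math

def two_state_mixing(freq_a, freq_b, coupling):
    """Return (lower, upper, weight) for a symmetric 2x2 model Hamiltonian.

    weight is the fraction of state b mixed into the eigenstate that is
    mostly state a (0 = no mixing, 0.5 = complete mixing).
    """
    mean = 0.5 * (freq_a + freq_b)
    half_gap = 0.5 * (freq_b - freq_a)
    split = math.hypot(half_gap, coupling)
    if split == 0.0:
        return freq_a, freq_a, 0.0
    # cos(2*theta) = |half_gap| / split and weight = sin^2(theta)
    weight = 0.5 * (1.0 - abs(half_gap) / split)
    return mean - split, mean + split, weight

# Frequencies and couplings (cm^-1) quoted in the text; the pairing of each
# coupling constant with each frequency is an assumption of this sketch.
pairs = {
    "AAD-AAD": (1605.0, 1609.0, 12.0),
    "AAD-DDAd": (1605.0, 1640.0, 14.0),
    "AAD-AADDs": (1605.0, 1651.0, 16.0),
}
weights = {name: two_state_mixing(*args)[2] for name, args in pairs.items()}
```

In this toy model the AAD-AAD pair mixes with a weight of roughly 0.42, whereas the two detuned pairs reach only about 0.11 and 0.09 despite their larger coupling constants, which mirrors the qualitative conclusion drawn above.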

Fig. 4: On the effect of water–water (W–W) interactions.

A Comparison of the IR spectra obtained by VQDPT2 with and without the harmonic coupling between water molecules in the PES. Note that the H3O+(H2O)3 moiety has all inter-molecular couplings included in both cases. B The vibrational modes, the resonance diagram, and the component of vibrational wavefunctions for the intra- and inter-molecular Fermi resonance of the AAD-type water molecules.

Theoretical IR spectra of other structures

As all the key bands in the H+(H2O)21 IR spectrum have been assigned to the particular network sites based on the calculation of the structure in its most stable form, it is necessary to compare the computed spectrum to those of other minimum energy structures. To this end, three reported alternative stable structures of H+(H2O)21 cluster33 were also computed in the present study. The Cartesian coordinates of these structures are provided in the Supplementary Note 4. For ease of comparison, the four minimum energy structures considered in this study are displayed in Fig. 5 (note that a1 is the structure discussed above) along with the corresponding VQDPT2/CCD predictions. The numbers of the water molecules in the same types from the four structures are equal, and their relative positions are almost identical. The H3O+ and three DDAh-type water molecules have the same hydrogen-bond network in the four configurations (see Fig. 5A). Therefore, the key bands associated with the proton defect ranging from 1200 to 2800 cm−1 in the calculated IR spectra show a very similar pattern for the four structures (see Fig. 5B). The major difference among the four structures lies in the different orientations of individual water molecules in AAD, DDAd, AADDs, and AADDi types (see Fig. 5A), leading to different hydrogen-bond partners for water molecules at the same positions among the four structures. The perturbation of hydrogen-bond network of individual water molecules results in a slight rearrangement of the spectrum in the O-H stretching region from 3100 to 3600 cm−1 (see Fig. 5C), which is sensitive to the hydrogen-bond structure. Although the shapes of the O-H stretch-induced absorptions are slightly different for the four structures, all of them have the characteristic features as those in the structure a1, and the same features derived from the specific type of water appear at very close positions. 
A few individual features unique to the structures a2, a3 and a4 are described in Supplementary Note 2. In essence, this comparative study shows that the assignments of the distinct bands hold across the different minimum energy structures considered, which supports their correctness.

Fig. 5: Different minimum energy structures and corresponding IR spectra.

A Overview of the four minimum energy structures of H+(H2O)21 reported in Hodges and Wales’s study33, and (B), (C) the corresponding VQDPT2 IR spectra computed at the fragment-based CCD/aug-cc-pVDZ level. The enlarged view of the O-H stretching region of the spectra in (C) is color-coded according to the type of the corresponding water molecules.

Discussion

We have developed a new protocol to compute the IR spectrum of molecular clusters based on the combination of EE-GMF and VQDPT2 for treating the electronic and vibrational problems, respectively. In the EE-GMF method, all monomers and dimers are calculated by the ab initio electronic structure calculations with the electrostatic embedding scheme. This scheme, in which the environmental effects are incorporated by surrounding atomic point charges, accounts for the electronic polarization and hydrogen-bond cooperativity effects, thereby making the truncation after the dimer terms far more accurate than a simple summation of bare monomer/dimer energies. VQDPT2 treats the strong interaction among quasi-degenerate states by VCI, and the weak interaction with many, non-degenerate states by the second-order perturbation theory. Unlike the regular perturbation theory, VQDPT2 is capable of describing resonance states without divergence while keeping the cost-efficiency and the scalability to many-mode systems. Furthermore, we employ vibrational coordinates localized to each molecule and represent the PES in terms of “intra”-molecular anharmonicity and “inter”-molecular harmonic coupling. The PES generated by EE-GMF is used for VQDPT2 calculations. The method is an ideal combination to compute the vibrational spectrum of molecular clusters, exploiting the locality of electronic and vibrational motions.
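The contrast between regular perturbation theory and a quasi-degenerate treatment can be made concrete with a three-state toy model. The sketch below uses a generic second-order quasi-degenerate (Van Vleck-type) effective Hamiltonian; it is not the VQDPT2 implementation used in this work, and the energies, couplings, and function names are hypothetical. It assumes a real symmetric coupling matrix with zero diagonal.

```python
import math

def pt2_energy(e, v, i):
    """Ordinary second-order perturbation energy of state i (v[i][i] assumed 0)."""
    return e[i] + sum(v[i][k] ** 2 / (e[i] - e[k])
                      for k in range(len(e)) if k != i)

def qdpt2_energies(e, v, a, b):
    """Second-order quasi-degenerate energies for the model space {a, b}."""
    outside = [k for k in range(len(e)) if k not in (a, b)]

    def h_eff(x, y):
        zeroth = e[x] if x == y else 0.0
        first = 0.0 if x == y else v[x][y]
        # States outside the model space enter only through second order.
        second = 0.5 * sum(v[x][k] * v[k][y]
                           * (1.0 / (e[x] - e[k]) + 1.0 / (e[y] - e[k]))
                           for k in outside)
        return zeroth + first + second

    haa, hbb, hab = h_eff(a, a), h_eff(b, b), h_eff(a, b)
    mean, half_gap = 0.5 * (haa + hbb), 0.5 * (hbb - haa)
    split = math.hypot(half_gap, hab)
    return mean - split, mean + split

# Hypothetical model (cm^-1): states 0 and 1 are nearly degenerate, state 2 is far away.
energies = [3200.0, 3204.0, 3600.0]
couplings = [[0.0, 20.0, 30.0],
             [20.0, 0.0, 30.0],
             [30.0, 30.0, 0.0]]

regular = pt2_energy(energies, couplings, 0)
lower, upper = qdpt2_energies(energies, couplings, 0, 1)
```

With these numbers the regular second-order energy of state 0 is shifted by about 100 cm−1 because of the small denominator, although the coupling is only 20 cm−1, while the quasi-degenerate treatment yields a pair of levels split by an amount comparable to twice the coupling.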

These methodological advances enable the nearly complete assignment of the IR spectral features of the H+(H2O)21 cluster, 17 years after it was first measured experimentally. The calculated spectrum not only reproduces the well-defined structures for the bands previously assigned, but also provides definitive structural proof for the clarification of the previously controversial and unclear band assignments of proton motions. We emphasize that the revelation of the IR band assignments has a profound impact on the understanding of molecular structures in various systems. The precise assignments of the proton defect band in a hydrogen bonded network carve a path for addressing several open questions related to the nature of proton speciation in water. The site-specific analysis of the water O-H stretching region reveals distinct structures for the internal ice-like and surface liquid-like four-coordinated water molecules that are the cornerstone of understanding the local structure of water in diverse environments; for example, the water/air or water/solid interface, water clusters and droplets in amorphous polymers, and so on.

The present calculation is complementary to the previous VCI calculation based on the many-body PEFs by Yu and Bowman. On one hand, the PEFs of H+(H2O)n (n = 1–4) and (H2O)n (n = 1–3) were derived at mixed high electronic structure levels of CC and MP2, but they were simply summed to construct the PES of H+(H2O)21, whereas the PES in our calculation is computed for H+(H2O)21 by the fragment-based CC in an electrostatic embedding scheme at the level of CCD/aug-cc-pVDZ. On the other hand, VCI was carried out for a H3O+(H2O)3 moiety in 15 dimensions incorporating ~140,000 VCI states and for the other water molecules in 3 dimensions each, whereas VQDPT2 was performed for H+(H2O)21 in 89 dimensions incorporating ~1000 quasi-degenerate states by VCI and hundreds of millions of non-degenerate states by perturbation. Each approach is thus more accurate in some respects and less so in others, and the two are complementary to each other. Nonetheless, the resulting IR spectra exhibit an overall agreement, which substantiates the robustness of the theory even for such a complex system as H+(H2O)21.

Although the present calculation predicts IR peak positions in good agreement with the experiment, the agreement in intensity and line shape is less satisfactory, in particular in the range of 1700–2700 cm−1. This is primarily because the calculated spectrum was constructed by simply broadening each peak position and intensity with Lorentz functions of constant FWHM (5 cm−1). The procedure is valid when the excited state has a long lifetime. However, the broad line shape observed in the experiment indicates fast dynamics of the proton defect and vibrational mode mixing after the excitation of the O-H stretching vibration of H3O+. We also found an indication of an inter-molecular energy relaxation pathway of the O-H stretching excitation of neutral water molecules via H-O-H bending overtones. With the advent of new experimental techniques (IR–IR hole burning, 2D-IR, etc.), revealing the fast dynamics of the proton defect and water molecules has become an intriguing prospect. Further theoretical improvement is needed to extend the framework to time-dependent quantum theory as well as to generate a more accurate PES for quantum dynamics, which will be the scope of future works.
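The constant-width broadening described above can be sketched as follows. This is a minimal illustration, assuming area-normalized Lorentzian line shapes; the function names are hypothetical, the stick positions are values quoted in the text, and the stick intensities are placeholders rather than computed results.

```python
import math

def lorentzian(x, center, fwhm):
    """Area-normalized Lorentzian line shape evaluated at x."""
    half_width = fwhm / 2.0
    return (half_width / math.pi) / ((x - center) ** 2 + half_width ** 2)

def broaden(sticks, grid, fwhm=5.0):
    """Sum one Lorentzian per (position, intensity) stick on the given grid."""
    return [sum(inten * lorentzian(x, pos, fwhm) for pos, inten in sticks)
            for x in grid]

# Hypothetical stick spectrum: positions in cm^-1 from the text,
# intensities are placeholders chosen only for illustration.
sticks = [(943.0, 1.0), (1267.0, 0.6), (2409.0, 0.8)]
grid = [600.0 + 0.5 * k for k in range(6401)]  # 600 to 3800 cm^-1
spectrum = broaden(sticks, grid)
```

Because every stick receives the same width, such a procedure cannot reproduce bands whose breadth reflects short excited-state lifetimes, which is consistent with the limitation noted for the 1700–2700 cm−1 region.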

Methods

The electrostatically embedded generalized molecular fractionation (EE-GMF) method

Fragment-based quantum chemical methods54,55, in which a large system is decomposed into small, tractable pieces that are more affordable for electronic structure calculations, have been proposed as an effective way to sidestep the non-linear scaling of the standard quantum mechanical (QM) computational cost with respect to the system size. The electrostatically embedded generalized molecular fractionation (EE-GMF) method was developed in our group to treat large molecular clusters41,42,45,56,57,58,59. The EE-GMF approach has been elaborated in a series of our recent publications41,42,56,57,59, and thus we give only a brief description here. The approach was developed specifically for molecular clusters, in which each molecule can be assigned as a single fragment without cutting any chemical bonds. Each fragment, with the remaining system represented by background charges, can then be treated at various ab initio levels. The interactions between two fragments that are spatially in close contact make important contributions to the energetic properties of the system and hence are also calculated by QM, while the long-range electrostatic interactions are approximated by classical Coulomb interactions for efficiency. Therefore, according to the EE-GMF scheme, the total energy (\(E\)) of the molecular cluster can be expressed as

$${E}_{molecular\,cluster}^{{\rm{EE}}-{\rm{GMF}}}=\sum_{i=1}^{N}{\tilde{E}}_{i}+\sum_{i=1}^{N-1}\sum_{{j=i+1}\atop {|{\bf{R}}_{ij}|\le \lambda }}^{N}({\tilde{E}}_{ij}-{\tilde{E}}_{i}-{\tilde{E}}_{j})-\sum_{i=1}^{N-1}\sum_{{j=i+1}\atop {|{\bf{R}}_{ij}| > \lambda }}^{N}\sum_{m\in i}\sum_{n\in j}\frac{{q}_{m(i)}{q}_{n(j)}}{{R}_{m(i)n(j)}}$$
(1)

where N is the number of fragments in the molecular cluster, \({\tilde{E}}_{i}\) denotes the self-energy of fragment i together with its interaction energy with the background charges of the remaining atoms in the system, and \({\tilde{E}}_{ij}\) is the corresponding energy of dimer ij (composed of fragments i and j). The second term in Eq. (1) gives the two-body QM interaction energies of all dimers for which the distance Rij between fragments i and j is less than or equal to a predefined distance threshold \(\lambda\). In the last term, \({q}_{m(i)}\) and \({q}_{n(j)}\) represent the atomic charges of the mth atom in the ith fragment and the nth atom in the jth fragment, respectively, and \({R}_{m(i)n(j)}\) is the distance between these two atoms. This term deducts the doubly counted electrostatic interactions between distant fragment pairs (outside the distance threshold \(\lambda\)), because those interactions are already taken into account in each fragment QM calculation through the electrostatic embedding scheme. In this study, each water molecule and the hydronium ion H3O+ were assigned as individual fragments, and the two-body interactions between all pairs of fragments were calculated by QM (i.e., \(\lambda\) is chosen to be sufficiently large to cover all the two-body QM interactions). The higher-order many-body interactions are implicitly incorporated through the electrostatic embedding scheme. In this case, Eq. (1) simplifies to \({E}_{molecular\,cluster}^{{\rm{EE}}-{\rm{GMF}}}=\sum_{i=1}^{N}{\tilde{E}}_{i}+\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}({\tilde{E}}_{ij}-{\tilde{E}}_{i}-{\tilde{E}}_{j})\), which becomes similar to the electrostatically embedded many-body expansion (EE-MB) method proposed by Dahlke and Truhlar60,61.
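The assembly of Eq. (1) from precomputed fragment energies can be sketched as follows. This is only an illustration under stated assumptions: the embedded monomer and dimer energies are supplied as plain numbers (in Hartree), coordinates are in Bohr, and the fragment-fragment distance is taken as the minimum interatomic distance, which is an assumption made here because the text does not define it. All function and argument names are hypothetical.

```python
import itertools
import math

def fragment_distance(atoms_i, atoms_j):
    """Assumed definition of R_ij: the minimum interatomic distance."""
    return min(math.dist(a, b) for a in atoms_i for b in atoms_j)

def ee_gmf_energy(e_mono, e_dimer, coords, charges, lam):
    """Assemble Eq. (1) from embedded monomer and dimer energies.

    e_mono[i]      : embedded energy of fragment i
    e_dimer[(i,j)] : embedded energy of dimer ij (i < j), needed only
                     for pairs with R_ij <= lam
    coords[i]      : list of (x, y, z) atomic positions of fragment i
    charges[i]     : list of atomic charges of fragment i
    lam            : distance threshold lambda
    """
    total = sum(e_mono)
    for i, j in itertools.combinations(range(len(e_mono)), 2):
        if fragment_distance(coords[i], coords[j]) <= lam:
            # Two-body QM correction for close pairs.
            total += e_dimer[(i, j)] - e_mono[i] - e_mono[j]
        else:
            # Remove the doubly counted point-charge Coulomb interaction.
            total -= sum(qm * qn / math.dist(rm, rn)
                         for qm, rm in zip(charges[i], coords[i])
                         for qn, rn in zip(charges[j], coords[j]))
    return total

# Toy input: three single-atom "fragments" on a line (made-up numbers).
coords = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
charges = [[0.5], [-0.5], [1.0]]
energy = ee_gmf_energy([-1.0, -2.0, -3.0], {(0, 1): -3.1}, coords, charges, lam=3.0)
```

When the threshold is large enough to include every pair, the Coulomb branch is never reached and the function reduces to the simplified two-body expression used in this study.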

With the EE-GMF fragmentation method, all of the monomers and dimers were explicitly treated through standard ab initio calculations with the electrostatic embedding scheme to account for the environmental effect, which ensures that electronic polarization and hydrogen-bond cooperativity are properly taken into consideration. Through the monomer and dimer calculations, one- and two-body electronic Coulomb, exchange, and correlation interactions are treated nearly exactly at the CCD level. By means of the electrostatic embedding approach, three-body and all higher-order many-body Coulomb interactions are also included implicitly. It is the electrostatic embedding scheme that renders the many-body expansion quickly convergent and the truncation after the dimer QM interactions sufficiently accurate. The atomic charges used for the embedding field were taken from the SPCFW62 water model for the water molecules and from electrostatic potential fitting at the HF/aug-cc-pVDZ level for the hydronium ion H3O+. The first and second derivatives of the total energy with respect to the nuclear coordinates, i.e., the atomic forces and the Hessian matrix, can be calculated analytically42,57,59,63 and were used for geometry optimization and normal mode analysis. A quasi-Newton algorithm was adopted for the geometry optimization of H+(H2O)21 from a given initial structure, with the BFGS procedure used to update the Hessian matrix during the optimization. The convergence criterion for the maximum atomic force was set to 0.001 Hartree/Bohr.
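For readers unfamiliar with the BFGS procedure, the standard Hessian update and a force-based convergence test can be sketched as below. This is the textbook BFGS formula, not code from the present work; interpreting the "maximum atomic force" as the largest per-atom force norm is an assumption, and the function names are hypothetical.

```python
import numpy as np

def bfgs_update(hessian, step, grad_change):
    """Standard BFGS update of an approximate Hessian B.

    B_new = B - (B s)(B s)^T / (s^T B s) + y y^T / (y^T s),
    with s the geometry step and y the change in gradient.
    """
    bs = hessian @ step
    return (hessian
            - np.outer(bs, bs) / (step @ bs)
            + np.outer(grad_change, grad_change) / (grad_change @ step))

def is_converged(forces, threshold=1.0e-3):
    """True if the largest per-atom force norm (Hartree/Bohr) is below threshold.

    forces has shape (n_atoms, 3); the per-atom norm is an assumed reading
    of the convergence criterion.
    """
    return np.linalg.norm(forces, axis=1).max() < threshold

# Toy example: the updated Hessian satisfies the secant condition B_new s = y.
b_new = bfgs_update(np.eye(2), np.array([1.0, 0.0]), np.array([2.0, 0.0]))
```

The update needs only gradients from successive geometries, so the analytic Hessian has to be evaluated only where it is required, for example for the final normal mode analysis.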

The dipole moment of the molecular cluster (\(\mu\)) can also be obtained based on the EE-GMF scheme as,

$${\mu }_{molecular\,cluster}^{{\rm{EE}}-{\rm{GMF}}}=\sum_{i=1}^{N}{\mu }_{i}+\sum_{i=1}^{N-1}\sum_{{j=i+1}\atop {|{\bf{R}}_{ij}|\le \lambda }}^{N}({\mu }_{ij}-{\mu }_{i}-{\mu }_{j})$$
(2)

where \({\mu }_{i}\) is the dipole moment of fragment i and \({\mu }_{ij}\) is that of dimer ij. In this work, the two-body corrections to the dipole moment of the entire molecular cluster were calculated by QM for all pairs of fragments (i.e., \(\lambda \to \infty\), and \({\mu }_{molecular\,cluster}^{{\rm{EE}}-{\rm{GMF}}}=\sum_{i=1}^{N}{\mu }_{i}+\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}({\mu }_{ij}-{\mu }_{i}-{\mu }_{j})\)). The derivative of the dipole moment with respect to the normal coordinates can also be derived to compute the IR intensity59,63.

The second-order vibrational quasi-degenerate perturbation theory (VQDPT2)

VQDPT243,44 is an efficient method for solving the vibrational Schrödinger equation (VSE). In terms of mass-weighted, rectilinear vibrational coordinates, \(\{{Q}_{i}\}\), the vibrational Hamiltonian reads

$${\hat{H}}_{\upsilon }=-\frac{1}{2}\mathop{\sum }\limits_{i=1}^{f}\frac{{\partial }^{2}}{\partial {{Q}_{i}}^{2}}+V({{{{{\boldsymbol{Q}}}}}})$$
(3)

where f is the number of vibrational degrees of freedom and V is the potential energy surface (PES) of a system. The vibrational self-consistent field (VSCF) wavefunction is the starting point of the calculation,

$$|{\Phi }_{{{{{{\bf{n}}}}}}}^{VSCF}\rangle =\mathop{\prod }\limits_{i=1}^{f}|{\phi }_{{{n}}_{i}}^{(i)}({Q}_{i})\rangle$$
(4)

where n denotes the set of vibrational quantum numbers of a target vibrational state. The one-mode functions are obtained by solving the VSCF equation,

$$\left[-\frac{1}{2}\frac{{\partial }^{2}}{\partial {{Q}_{i}}^{2}}+\left\langle \mathop{\prod}\limits_{i^{\prime} \ne i}{\phi }_{{{n}}_{i{\prime} }}^{(i{\prime} )}|V|\mathop{\prod}\limits_{i^{\prime} \ne i}{\phi }_{{{n}}_{i{\prime} }}^{(i{\prime} )}\right\rangle \right]{\phi }_{{{n}}_{i}}^{(i)}={\varepsilon }_{{{n}}_{i}}{\phi }_{{{n}}_{i}}^{(i)}$$
(5)

VQDPT2 improves the VSCF solution using second-order quasi-degenerate perturbation theory. We divide the Hilbert space into a P space spanned by VSCF configuration functions, \(\{{\Phi }_{{\bf{p}}}^{VSCF}\}\), whose components are energetically quasi-degenerate with the target states, and a complementary Q space, \(\{{\Phi }_{{\bf{q}}}^{VSCF}\}\). The effective Hamiltonian is written up to second order as,

$${\left({H}_{eff}^{(0+1)}\right)}_{{\bf{p}}{\bf{p}}^{\prime}}=\left\langle {\Phi }_{{\bf{p}}}^{VSCF}|{\hat{H}}_{\upsilon }|{\Phi }_{{\bf{p}}^{\prime}}^{VSCF}\right\rangle$$
(6)
$${\left({H}_{eff}^{(2)}\right)}_{{\bf{p}}{\bf{p}}^{\prime}}=\sum_{{\bf{q}}\ne {\bf{p}}}\frac{\left\langle {\Phi }_{{\bf{p}}}^{VSCF}|{\hat{H}}_{\upsilon }|{\Phi }_{{\bf{q}}}^{VSCF}\right\rangle \left\langle {\Phi }_{{\bf{q}}}^{VSCF}|{\hat{H}}_{\upsilon }|{\Phi }_{{\bf{p}}^{\prime}}^{VSCF}\right\rangle }{2}\left(\frac{1}{{E}_{{\bf{p}}}^{(0)}-{E}_{{\bf{q}}}^{(0)}}+\frac{1}{{E}_{{\bf{p}}^{\prime}}^{(0)}-{E}_{{\bf{q}}}^{(0)}}\right)$$
(7)

where \({E}_{{\bf{p}}}^{(0)}\) is the zeroth-order energy defined as,

$${E}_{{{{{{\bf{p}}}}}}}^{(0)}=\mathop{\sum}\limits_{i}{\varepsilon }_{{p}_{i}}$$
(8)
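As a small numerical illustration of Eqs. (6)–(8), the sketch below builds the effective Hamiltonian from a given real symmetric Hamiltonian matrix in a configuration basis and a vector of zeroth-order energies, and then diagonalizes it. The matrix, the energies, and the index lists are made-up toy inputs; the function name is hypothetical, and the sum is taken over the listed Q-space configurations.

```python
import numpy as np

def effective_hamiltonian(h, e0, p_idx, q_idx):
    """Second-order quasi-degenerate effective Hamiltonian in the P space.

    h     : real symmetric matrix of <Phi_a|H|Phi_b> over all configurations
    e0    : zeroth-order energies E^(0) of the configurations (Eq. 8)
    p_idx : indices of P-space configurations
    q_idx : indices of Q-space configurations
    """
    h = np.asarray(h, dtype=float)
    e0 = np.asarray(e0, dtype=float)
    h_pp = h[np.ix_(p_idx, p_idx)]               # Eq. (6)
    h_pq = h[np.ix_(p_idx, q_idx)]
    # inv[p, q] = 1 / (E_p^(0) - E_q^(0)); finite if Q is non-degenerate with P.
    inv = 1.0 / (e0[p_idx][:, None] - e0[q_idx][None, :])
    weighted = h_pq * inv
    h2 = 0.5 * (weighted @ h_pq.T + h_pq @ weighted.T)   # Eq. (7)
    return h_pp + h2

# Toy model: two quasi-degenerate P configurations and one distant Q configuration.
h = [[1.00, 0.10, 0.20],
     [0.10, 1.05, 0.30],
     [0.20, 0.30, 3.00]]
e0 = [1.00, 1.05, 3.00]
h_eff = effective_hamiltonian(h, e0, [0, 1], [2])
energies = np.linalg.eigvalsh(h_eff)
```

The symmetrized energy denominator in Eq. (7) keeps the effective Hamiltonian symmetric, so a standard symmetric eigensolver can be used.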

The diagonalization of the effective Hamiltonian yields the VQDPT2 energies and wavefunctions. The P and Q spaces are constructed using two control parameters, Ngen and λmax, for a target vibrational state of interest, n. VSCF configurations that are quasi-degenerate with n are searched for in a configuration space {s} defined by λmax as,

$${\lambda }_{{{{{{\bf{sn}}}}}}}=\mathop{\sum }\limits_{i=1}^{f}\left|{s}_{i}-{n}_{i}\right|\le \,{\lambda }_{{\max }}$$
(9)

The quasi-degenerate configurations found in the search are denoted \(\left\{{{{{{{\bf{p}}}}}}}^{(1)}\right\}\). Then, the same search is carried out for each configuration of \(\left\{{{{{{{\bf{p}}}}}}}^{(1)}\right\}\) to find the second generation of quasi-degenerate configurations, \(\left\{{{{{{{\bf{p}}}}}}}^{(2)}\right\}\). The process is repeated Ngen times to obtain the P space configurations,

$$P=\left\{{\bf{n}}\right\}+\left\{{\bf{p}}^{(1)}\right\}+\left\{{\bf{p}}^{(2)}\right\}+\cdots +\left\{{\bf{p}}^{({N}_{{\rm{gen}}})}\right\}=\left\{{\bf{p}}^{m}\,|\,m=1,2,\cdots ,{N}_{P}\right\}$$
(10)

The Q space is constructed by selecting the configurations, q, that satisfy the following condition,

$${\lambda }_{{{{{{\bf{q}}}}}}{{{{{{\bf{p}}}}}}}^{m}}=\mathop{\sum }\limits_{i=1}^{f}\left|{q}_{i}-{p}_{i}^{m}\right|\le {\lambda }_{{\max }}$$
(11)

Note that the Q space configurations are non-degenerate with any of the P space configurations. The energy differences in the denominator of Eq. (7) are finite, and thus VQDPT2 is free of a divergence problem. The relation of VQDPT2 with other vibrational methods and the vibrational calculations based on local coordinates are described in the Supplementary Methods.
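The generation-by-generation construction of the P and Q spaces in Eqs. (9)–(11) can be sketched for a toy system as follows. Several simplifications are assumptions made only for illustration: zeroth-order energies are approximated by harmonic sums of hypothetical frequencies rather than VSCF modal energies, quasi-degeneracy is decided by a simple energy window `tol` relative to the parent configuration, and configurations are enumerated by brute force, which is feasible only for a few modes.

```python
from itertools import product

def neighbours(config, lam_max):
    """Configurations s with s_i >= 0 and sum_i |s_i - config_i| <= lam_max."""
    ranges = [range(max(0, c - lam_max), c + lam_max + 1) for c in config]
    return [s for s in product(*ranges)
            if sum(abs(a - b) for a, b in zip(s, config)) <= lam_max]

def energy(config, omegas):
    """Placeholder zeroth-order energy: harmonic sum (an assumption)."""
    return sum(w * v for w, v in zip(omegas, config))

def build_p_space(target, omegas, n_gen, lam_max, tol):
    """Collect quasi-degenerate configurations over n_gen generations."""
    p_space = [tuple(target)]
    frontier = [tuple(target)]
    for _ in range(n_gen):
        new = []
        for parent in frontier:
            e_parent = energy(parent, omegas)
            for s in neighbours(parent, lam_max):
                close = abs(energy(s, omegas) - e_parent) <= tol
                if close and s not in p_space and s not in new:
                    new.append(s)
        p_space.extend(new)
        frontier = new
    return p_space

def build_q_space(p_space, lam_max):
    """Configurations within lam_max of any P configuration, excluding P."""
    p_set = set(p_space)
    return {s for p in p_space for s in neighbours(p, lam_max) if s not in p_set}

# Toy Fermi resonance: a fundamental at 3210 cm^-1 near an overtone at 3200 cm^-1.
omegas = (1600.0, 3210.0)
p_space = build_p_space((0, 1), omegas, n_gen=3, lam_max=3, tol=50.0)
q_space = build_q_space(p_space, lam_max=3)
```

In this toy case the overtone (2, 0) joins the fundamental (0, 1) in the P space, while all other configurations within reach fall into the Q space and are treated perturbatively.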

Computational details

The protonated H+(H2O)21 cluster with an Eigen-form hydrated hydronium cation sitting on the surface of the water cage was used in this study. The four most stable conformers of H+(H2O)21 were taken from the study of Hodges and Wales33 and used as the initial structures for geometry optimization. Coupled-cluster doubles (CCD) theory with the aug-cc-pVDZ basis set was applied for geometry optimization and harmonic vibrational calculations using the EE-GMF method. All QM calculations were carried out with the Gaussian16 program64.

The anharmonic vibrational calculations were carried out in terms of coordinates localized to each molecule (H3O+ and H2O). We employed the nine and four highest-frequency modes of H3O+ and H2O, respectively. Thus, 89 (= 9 + 20 × 4) out of 192 (= 3 × 64 atoms) coordinates were set to be active. The harmonic frequencies of the local coordinates (and their counterparts in normal coordinates) are listed in Supplementary Table 4. The intra-molecular PES in Eq. (8) of the Supplementary Methods was generated at the anharmonic level up to the three-mode representation (3MR) by the multi-resolution method65 combining the QFF66 and the grid PES67. The one-mode representation (1MR) PES was a grid PES with 11 grid points for all terms, while the two-mode representation (2MR) and 3MR coupling terms with a mode coupling strength (MCS)68 larger than 1.0 and 10.0 cm−1 were obtained as grid PESs with 7 and 5 grid points, respectively. Other, weaker terms were represented by the QFF. The generation of the QFF and the grid PES required 981 gradient points and 11,263 energy points, respectively, which were computed by the EE-GMF method at the CCD/aug-cc-pVDZ level. In addition, the 1MR grid PES was also calculated at the CCSD/aug-cc-pVTZ level for a more accurate description of the electron correlation effect. The inter-molecular harmonic coupling was obtained from the Hessian matrix. Finally, the VQDPT2 calculations43,44 were performed with Ngen = 3 and λmax = 4. The target vibrational states were set to the fundamental excitations. The largest P space incorporated 920 VSCF configurations, and the largest Q space incorporated 333 million. The IR intensities were computed using the dipole moment surfaces obtained at the same grid points as the grid PES. All anharmonic vibrational calculations were carried out using the SINDO program69.