Sulphur Kβ emission spectra reveal protonation states of aqueous sulfuric acid

In this paper we report an X-ray emission study of bulk aqueous sulfuric acid. Throughout the range of molarities from 1 M to 18 M the sulfur Kβ emission spectra from H2SO4 (aq) depend on the molar fractions and related deprotonation of H2SO4. We compare the experimental results with results from emission spectrum calculations based on atomic structures of single molecules and structures from ab initio molecular dynamics simulations. We show that the S Kβ emission spectrum is a sensitive probe of the protonation state of the acid molecules. Using non-negative matrix factorization we are able to extract the fractions of different protonation states in the spectra, and the results are in good agreement with the simulation for the higher part of the concentration range.


Results and Discussion
presents the experimental emission spectra of aqueous sulfuric acid. Alonso-Mori and coworkers studied the S Kβ emission spectrum of sulphate minerals and assigned the features in the spectrum 45 based on orbital structure calculations. We follow the usual nomenclature in our discussion and assign lines in the spectra of Fig. 1 as Kβ' and Kβ 1,3 . The latter line has shoulder-like structures Kβ x and Kβ". Moreover, as seen clearly in the 18.0 M system, the Kβ x and Kβ" features both have two components. With increasing concentration, the highest Kβ 1,3 line loses intensity, and the Kβ x and Kβ" lines gain spectral weight. We assign this behavior to two phenomena: the protonation state of the acid and the effect of solvation.
To gain a deeper understanding of the effects of the protonation state on the shape of the spectrum, we performed spectrum calculations for single molecules in vacuum, presented in Fig. 2(a), and calculations based on ab initio molecular dynamics (AIMD) simulations, presented in Fig. 2(b). For the in-vacuum single-molecule calculations the used structures are presented in Fig. 2(c-e). The calculation was performed for a geometry optimized H 2 SO 4 molecule and systems obtained by removal of one or two protons from the molecule and reoptimizing the geometry. Moreover we performed calculations removing the proton(s) but without re-optimization of geometry for − SO 4 2 and − HSO 4 . The results of the latter calculations are shown in light color in Fig. 2(a). The num-  ber of electrons and occupied orbitals remained constant in all three systems. The orbitals are sensitive to protonation, which is manifested in the individual transitions in the emission spectrum. For H 2 SO 4 , the Kβ' line consists of 3 transitions, which are strongly split in the deprotonated but non-reoptimized systems, and which are regrouped after geometry re-optimization. The Kβ x − Kβ 1,3 − Kβ" group has 5 transitions, and the spectral shape of this feature is not largely affected by the geometry re-optimization. Moreover, the splitting of the latter lines occurs in H 2 SO 4 , both the Kβ x and the Kβ" features being reproduced by the calculations.
The in-vacuum calculation reveals two-fold effects, when protons are removed from the acid. First, orbital eigenenergies and even their order in energy may be modified, and second, intensity changes due to symmetry and due to reshaping of the orbitals can occur. The latter is manifested in Fig. 2(a) for the lowest part of the Kβ x line. In the − SO 4 2 ion (Fig. 2(c)), the orbital from which the decay takes place is spherical around the S atom, making the transition to S1s dipole forbidden. As one or two protons are added ( Fig. 2(d,e)), breaking of the symmetry causes the orbital to change shape, which makes it allowed for decay by dipole transition. This analysis applies strictly only to the high-symmetry geometry of − SO 4 2 , and, due to ensemble averaging, the transition becomes also allowed in real systems, yet probably very weakly.
Secondly, as a more precise but also more challenging approach, we simulated ensemble-averaged S Kβ emission spectra by sampling the trajectories from previously published AIMD simulations 30 . The protonation distribution along the AIMD runs is presented in Table 1. The obtained AIMD-based vertical emission spectra are depicted in Fig. 2(b). The simulation shows again a single peak for the Kβ' line. Moreover, the simulation for the 1.5 M system reproduces the Kβ" satellite. For the 18.8 M system, the Kβ x − Kβ 1,3 − Kβ" line group resembles the in-vacuum H 2 SO 4 case owing to the fact that the system consists mostly of the fully protonated H 2 SO 4 . The 18.8 M case produces the further splittings of lines seen in Fig. 2(b). The origin of this is apparent from the spectrum calculation of the isolated H 2 SO 4 molecule in vacuum as shown in Fig. 2(a). The simulation of Fig. 2(b) shows qualitative agreement with the experiment, with some discrepancy in reproduction of the Kβ x and Kβ", lines. The origin of the mismatch is most likely due to spectrum simulation. For the 1.5 M case, in turn, the structural simulation gives erroneous results, as discussed later in our NNMF analysis.
To study the statistical process of formation of the ensemble-averaged spectrum from the spectra of individual configurations, we analyzed the intensities in regions I-III of Fig. 2(b) for each snapshot in the liquid simulation. By integrating the intensity in each of the regions, that were chosen from an averaged spectrum, we transform the complex problem of relating spectral features and changes thereof to structural properties into a more feasible one. This allows studying spectral structure -property relationships, naturally depending on how much correlation between the two exists. The data points of all the concentrations are depicted maximum-normalized in Fig. 3(a), and they lie close to a plane in I,II,III intensity-space. The grouping of the points follows the protonation state of the acid molecule, which is known from the simulation for each data point.
The distribution of intensity values in the different regions I-III for the three protonation forms is presented in Fig. 3 4 2 intensity is close to zero for most snapshots and in both region I and III the intensity grows with the number of protons in the acid molecule. For region II, that includes the Kβ 1,3 peak, the intensity decreases with increasing number of protons.
The number of protons in the emitting acid molecule is known from the simulation (see the Methods section). We calculated the averaged spectra for all three forms, from all snapshots of all concentrations, presented in Fig. 3(e). The analysis reveals that in the bulk case the Kβ x and Kβ" features appear in both − HSO 4 and H 2 SO 4 , but as shown in the histograms of Fig. 3(b-d) the intensities vary along the number of protons. This is in agreement to the in-vacuum case, with the notion that the Kβ" feature appears as a separated peak in H 2 SO 4 and as a tail of the main line in − HSO 4 . We extracted the component spectra of − SO 4 2 , − HSO 4 , and H 2 SO 4 and their weights (and therefore molecular percentages) from the experimental spectra using constrained NNMF, see ref. 46 and Methods section for details. The number of components is less than the number of raw spectra, which enables an approximative low-rank matrix factorization of the data set. As constraints we used the assumptions that 18.0 M aqueous sulfuric acid is 100 mol-% H 2 39 . The 2 unconstrained spectra are allowed to take any functional form in the procedure, and are "discovered" by the NNMF algorithm, guided only by best describing the data set of 6 spectra using only 3 (one of which is fixed) with varying weights. The NNMF procedure does not provide Apart from the fixed coefficients, the initial guesses of the coefficients were taken from the simulations. The shapes of the resulting component spectra however suggest that the result is reasonable. The NNMF-obtained component spectra in Fig. 4(a) manifest similar behaviour in terms of the ratios Kβ x /Kβ 1, 3 and Kβ x /Kβ″ as the simulation, which supports the interpretation. Figure 4(b) presents the coefficients of the different spectra with the values from AIMD simulations, and for the higher concentrations the agreement is found to be good. The most likely reason for the mismatch at lower concentration is the quality of the structural AIMD simulation, as we observe large discrepancies between different computational works 14,30 , both of which disagree with the experimental value 39 . We conclude that the lowest concentrations are difficult to simulate due to most likely long reaction times for the span of AIMD. The NNMF coefficients are not in complete agreement with previous experiments by Margarella and coworkers 39 . A possible reason for this may be the used NNMF procedure, or its inherent assumption that there are three component spectra.  In conclusion, the S Kβ emission spectra of aqueous sulfuric acid changes with the protonation state of the acid molecule. This is due to the changes in the valence orbitals, that also chemically bind the protons. The intensities of the Kβ x , Kβ 1,3 , and Kβ″ are a probe of the protonation state of the acid, and the lowest Kβ x emission line originates from acid molecules with one or two protons. The obtained molecular fractions from the experiment are in agreement with those from AIMD simulation, apart from the lowest studied concentrations. The overall trends in the spectral changes from protonation state to other are explained by the spectrum calculations.

Methods
The experiment was performed at the ID26 beamline of the European Synchrotron Radiation Facility (ESRF). The incident photon energy was tuned by means of a cryogenically cooled double Si (111) crystal monochromator with the resolution of 0.36 eV. Higher harmonics were suppressed by two Si mirrors operating in total reflection. We utilized a closed acid-resistant sample cell to confine liquid sulfuric acid inside an emission spectrometer operating in vacuum conditions. The photon beam with a 50 × 250 μm 2 cross section was directed on a closed acid-resistant PTFE sample cell with 1.0 μm thick 1 × 1 mm 2 Si 3 N 4 front window, which allowed for conditions very close to room/ambient conditions inside the cell. X-ray fluorescence was collected along the polarization direction of the incident photon beam and analyzed with a Johansson type in-vacuum x-ray emission spectrometer 48 . The spectrometer operated in the dispersive geometry that combines target positioning within the Rowland circle with a position sensitive detection of diffracted x-rays. In our experiment, the target cell was placed at a distance of 42 cm in front of the diffraction crystal. The first order reflection of the Si (111) crystal was used, and the diffracted photons were detected by a thermoelectrically cooled (− 40 °C) CCD camera consisting 770 × 1152 pixels with 22.5 × 22.5 μm 2 pixel size. The experimental resolution at the energy of the S Kβ emission line was 0.45 eV and the bandwidth collected at the fixed detector position was ≈ 40 eV which enabled us to collect whole Kβ emission spectrum simultaneously.
Sulphuric acid with 18.0 M was provided by Carlo Erba Reagents and solutions with lower concentrations were purchased from Carl Roth. We used commercially provided solutions without further processing, except for the 14.8 M (79-m%) solution, which we prepared by using 3 g of 11.6 M (62-m%) and 3 g of 18.0 M (96-m%) solutions. The obtained spectra were brought onto a releative energy scale centered at the Kβ 1,3 peak.
To gain physical insight to the molecular structures, we evaluated X-ray emission spectra for the studied systems. The emission energies and intensities were obtained by two separate calculations: (i) the single molecules 2 two calculations were made. The first one models the effect of removing proton(s) from H 2 SO 4 without additional geometry optimization, and the second one contains also affects from geometric relaxation. For generation of orbital plots, the code package ERKALE 51,52 was used. The orbital was taken from ground-state calculation and the orbital ordering number of this orbital was assumed to remain the same in XES, which is supported by an orbital calculation using the Z + 1 approximation. The ground-state eigenenergies for orbitals in CP2K 53 and Erkale match well. The orbital plots were drawn using the VMD package 54 All spectral calculations, except calculations for the orbital plots, were done using the CP2K code package 53  2 if its distance to any of the oxygens was less than 1.3 Å, a criterion also used by Choe et al. 14 . An oxygen atom in turn, was determined to belong to the − SO 4 2 if its distance to the sulphur atom was less than 2.0 Å. The delta-peak spectra from the calculation were finally convoluted with a Gaussian line shape with a full width at half maximum of 1.5 eV. For the set (i), this FWHM was used to account for statistical broadening expected in the system, incoming photon bandwidth, spectrometer resolution, and life-time broadening. For the set (ii) the statistical broadening is covered by the phase space sampling and we used a FWHM of 1.0 eV instead. For all bulk spectrum simulations, we used the aug-cc-pVTZ basis set for the excited S atom and the TZVP-MOLOPT-GTH basis set 56 with GTH pseudopotentials 57 for the other atoms. The Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional 58 was used in all calculations.
We performed constrained non-negative matrix factorization (CNNMF) of the experimental spectra to extract percentages of the different molecular species. Non-negative matrix factorization (NNMF) 46,47 is a method that can be used to represent a series of m spectra of n energy points with k < m non-negative components and their non-negative weights. Formally one obtains a matrix factorization where A n×m , F n×k , and C k×m are the non-negative matrices containing the raw data, the component spectra, and their weights, respectively. The factorization is not exact and all spectra are integral-normalized to unity. To achieve optimization with constraints, we made our own implementation of the steepest descent (SD) algorithm. We minimize the quadratic cost function Here the cost function was optimized using SD algorithm where γ = 0.01 scaling parameter was used. Finally in the end of the iteration cycle, all negative elements were set to 0, a procedure listed as one alternative by Berry and co-workers 46 . Constrained optimization was achieved by giving the values of constrainted elements as initial guesses and never updating these elements in the SD iterations. Thus the partial derivatives (3) 39 . The statistical errors for the coefficients C were obtained using the bootstrap algorithm with 100 resamplings. In the procedure random noise, the magnitude of which is extracted from the experimental spectra (as mean difference of intensity in subsequent energy channels), is added to spectra of A and CNNMF is run to obtain new F and C. The errorbars represent the mean of absolute deviation of the resampled set from the original coefficients C.