Background & Summary

Raman scattering is the inelastic scattering of light in which lattice vibrations (phonons) are created or absorbed1. Raman spectroscopy provides local structure fingerprints of materials by detecting changes in the frequency of incident monochromatic radiation due to the scattering2,3. Materials identification with Raman spectroscopy offers advantages including high chemical sensitivity, tolerance to long-range disorder, and high spatial resolution4,5. Because the measurement is non-invasive and non-destructive, Raman spectroscopy is applied in in-situ and operando characterization6. Compared to other phonon measurement techniques, such as neutron scattering, Raman spectroscopy offers better frequency resolution, easy sample preparation, relatively inexpensive equipment, modular lasers, and fast operation3,4,5. Thanks to these advantages, Raman spectroscopy is widely used in the characterization of biological and organic molecules, as well as inorganic materials, in various fields including energy storage and conversion6,7,8,9,10,11, catalysis12,13,14, low-dimensional materials15,16, biomedical applications17,18,19, and others. However, interpreting experimental Raman spectra can be complicated and may require significant time and effort. This motivates the need for accurate computational references to improve the speed and accuracy of the interpretation of Raman spectra.

First-principles calculations of lattice dynamics and phonons have gained considerable attention and are pushing forward the sub-field of computational Raman spectroscopy20,21,22. In 2015, Togo and Tanaka built the well-known first-principles phonon database that contains the full phonon dispersion of a large number of inorganic and ordered materials23. This database showcases the power of phonopy, a software package developed to predict the phonon properties of ordered crystals, in combination with the accuracy enabled by first-principles calculations24. More recently, Liang and co-workers reported a database of computed Raman spectra of approximately 55 compounds derived from density-functional perturbation theory (DFPT) calculations and finite-difference derivatives of the dielectric tensors25. Taghizadeh et al. applied time-dependent third-order perturbation theory to calculate Raman spectra of 733 two-dimensional (2D) monolayer materials and used the calculated spectra to identify materials from experimental Raman spectra26. Popov and co-workers proposed a new averaging method for computing Raman spectra of polar polycrystalline materials from DFPT27. Most recently, utilizing force constant matrices from the Togo phonon database, Bagheri et al. built a Raman database for which only the Raman tensors had to be calculated, by finite differences28.

This important progress represents the “bleeding edge” of computational Raman spectroscopy for the purpose of spectra interpretation and further data-driven development. However, one common shortcoming of all these databases is that the predictions are limited to rather low levels of theory, namely the generalized gradient approximation (GGA) exchange-correlation functionals29,30, while more accurate hybrid functionals (beyond “plain vanilla” GGA) have not been applied in calculating Raman databases. In particular, the 2D Raman library by Taghizadeh et al. is based on the Perdew–Burke–Ernzerhof (PBE) functional26, the database of Liang and co-workers was calculated using PBE + U25,31,32, and the database by Bagheri et al. is based on the revised PBE for solids (PBEsol)28,33. All these predictions are plagued, to different degrees, by the self-interaction error of the semi-local GGA functionals. Such low-level theories are used because of the intrinsic complexity and high computational cost of phonon calculations with higher-level theories, in particular as implemented in plane-wave and pseudo-potential codes. Therefore, an efficient and accurate hybrid-functional computational Raman database is in high demand.

In the present data records, we compute Raman spectra together with phonon properties at a consistently high level of theory, namely the PBE0 hybrid functional34, in a linear combination of atomic orbitals expanded in terms of triple-ζ valence with polarization (pob-TZVP-rev2) Gaussian basis sets35,36. These approximations are realized using the CRYSTAL code at a relatively low computational cost thanks to its exploitation of lattice symmetry during frequency calculations37,38,39. Here, the computation of Raman spectra is automated such that experimentally measured structures are “digested” into CRYSTAL inputs, followed by the calculation of Raman spectra and other phonon properties, which are subsequently stored in a MongoDB database. The database is interfaced with an interactive, user-friendly web application.

Methods

Theory

A small fraction of an incident light beam is scattered when it passes through a material. The inelastic part of the scattering is known as Raman scattering, in which the light changes its frequency by creating or absorbing phonons. According to group-theory analysis of the crystal symmetry, only particular phonon modes are allowed to take part in Raman scattering. The intensity of the scattering depends on the polarizability of the material because the scattering originates from the polarization fluctuations induced by the electric field of the incident electromagnetic radiation. Frequencies of the Raman-active phonon modes and the scattering intensities are calculated to obtain Raman spectra for inorganic compounds.

Our calculations of Raman spectra are performed using the ab initio computational program CRYSTAL37,38,39. The calculations with CRYSTAL have several features that facilitate efficient and accurate computation of lattice dynamics: (i) in contrast to the most commonly used plane-wave basis sets, the crystalline orbitals are expanded as linear combinations of atom-centered atomic orbitals, (ii) the hybrid functional PBE0, with 25% Hartree–Fock exchange34, is used rather than semi-local GGA functionals, such as the PBE functional30, and (iii) the point-group symmetry of the lattice is exploited at multiple levels (i.e., atomic positions, wave functions, etc.) during the calculation of vibration frequencies, which means the calculations are only performed on symmetrically irreducible atoms.

Vibration frequencies are determined in CRYSTAL by calculating and diagonalizing mass-weighted dynamical matrices (bottom box in Fig. 1)40,41. The dynamical matrix is a 3n × 3n matrix (n = number of atoms, 3n = number of vibrational normal modes) whose components are the second partial derivatives of the DFT total energy with respect to atomic displacements. These are calculated by numerical differentiation of the analytical gradient of the DFT total energy with respect to the atomic positions. Diagonalization of an appropriately symmetrized mass-weighted dynamical matrix then gives the normal modes (eigenvectors) and the vibration frequencies (square roots of the eigenvalues).
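The diagonalization step described above can be illustrated with a minimal NumPy sketch. It is not the CRYSTAL implementation: the function name `normal_modes`, the toy two-atom spring model, and the choice of units (eV/Å² for second derivatives, amu for masses) are assumptions made here for illustration only.

```python
import numpy as np

# Physical constants in SI units (speed of light in cm/s).
EV = 1.602176634e-19
AMU = 1.66053906660e-27
ANGSTROM = 1.0e-10
C_CM_PER_S = 2.99792458e10

# Converts sqrt(eV / (amu * Angstrom^2)), an angular frequency, to cm^-1.
TO_WAVENUMBER = np.sqrt(EV / (AMU * ANGSTROM**2)) / (2.0 * np.pi * C_CM_PER_S)


def normal_modes(hessian, masses):
    """Return (wavenumbers in cm^-1, eigenvectors) from a 3n x 3n Hessian.

    hessian: second derivatives of the energy in eV/Angstrom^2.
    masses:  n atomic masses in amu (each is repeated for x, y, z).
    """
    hessian = np.asarray(hessian, dtype=float)
    m = np.repeat(np.asarray(masses, dtype=float), 3)
    dynamical = hessian / np.sqrt(np.outer(m, m))  # mass weighting
    dynamical = 0.5 * (dynamical + dynamical.T)    # enforce symmetry
    eigenvalues, eigenvectors = np.linalg.eigh(dynamical)
    # Eigenvalues are squared angular frequencies; negative ones
    # (unstable modes) are reported as negative wavenumbers.
    wavenumbers = np.sign(eigenvalues) * np.sqrt(np.abs(eigenvalues)) * TO_WAVENUMBER
    return wavenumbers, eigenvectors


# Toy example (hypothetical values): two atoms coupled by one spring along x.
k, m1, m2 = 30.0, 12.0, 16.0
toy_hessian = np.zeros((6, 6))
toy_hessian[0, 0] = toy_hessian[3, 3] = k
toy_hessian[0, 3] = toy_hessian[3, 0] = -k
toy_wavenumbers, toy_modes = normal_modes(toy_hessian, [m1, m2])

# Analytic result for the single non-zero mode of this model.
expected_stretch = np.sqrt(k * (1.0 / m1 + 1.0 / m2)) * TO_WAVENUMBER
```

In this toy model five of the six eigenvalues vanish and the remaining one reproduces the analytic stretching frequency, which is a convenient sanity check for the mass weighting.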

Fig. 1

Computational workflow to build the Raman spectra database: structures of inorganic compounds are from the ICSD experimental database; the structures are automatically converted to CRYSTAL input files; CRYSTAL calculations include structure optimization, coupled-perturbed Kohn-Sham for Raman tensors, and numerical differentiation of analytical first derivative of energy versus displacement of each atom to get dynamical matrices; the resulting vibration frequencies and Raman intensities yield Raman spectra; the calculations can also produce phonon dispersion and density of states, as well as Infrared (IR) spectra; after CRYSTAL calculations, the processed outputs are stored in the MongoDB database and interfaced through the web application (https://raman-db.streamlit.app/).

In CRYSTAL, Raman intensities are calculated directly through the Raman polarizability tensors obtained via perturbation theory, i.e., the coupled-perturbed Kohn-Sham (CPKS) approach (middle box in Fig. 1)42,43,44,45,46. In this approach, a static electric field is applied along different directions to calculate the first-, second-, and third-order electric susceptibility tensors, which can be converted into the polarizability and dielectric tensors, the first hyperpolarizability tensor, and the second hyperpolarizability tensor, respectively. From the electric susceptibility tensors, a 3 × 3 Raman tensor is calculated for each of the 3n normal modes. The Raman intensities can then be calculated using the following equations:

$${I}_{i,j}(n)=VX{(n,i,j)}^{2},$$
(1)

where \({I}_{i,j}(n)\) is the directional Raman intensity for single crystals, X(n, i, j) is the i, j-th component of the Raman tensor expressed in the basis of normal mode n, and V is the cell volume. Placzek rotation invariants5 are used for averaging the mode Raman tensors of single crystals with Eqs. 2–4:

$${G}_{0}(n)=\frac{1}{3}{\left[\sum _{i}X(n,i,i)\right]}^{2},$$
(2)
$${G}_{1}(n)=\frac{1}{2}\sum _{i,j}{\left[X(n,i,j)+X(n,j,i)\right]}^{2},$$
(3)
$${G}_{2}(n)=\frac{1}{3}\sum _{i,j}{\left[X(n,i,i)-X(n,j,j)\right]}^{2}.$$
(4)

The parallel, perpendicular, and total Raman intensities of powders or polycrystalline materials are computed using Eqs. 5–7:

$${I}_{{\rm{par}}}(n)=V\left[15{G}_{0}(n)+6({G}_{1}(n)+{G}_{2}(n))\right],$$
(5)
$${I}_{{\rm{perp}}}(n)=\frac{9}{2}V\left[{G}_{1}(n)+{G}_{2}(n)\right],$$
(6)
$${I}_{{\rm{t}}{\rm{o}}{\rm{t}}}(n)={I}_{{\rm{p}}{\rm{a}}{\rm{r}}}(n)+{I}_{{\rm{p}}{\rm{e}}{\rm{r}}{\rm{p}}}(n).$$
(7)
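As an illustration, Eqs. 1–7 can be transcribed literally into a few lines of Python. This is a sketch rather than the code used to build the database: the function names are hypothetical, the sums run over all index pairs i, j exactly as the equations are written above, and the Raman tensor is passed as a plain 3 × 3 nested list.

```python
def directional_intensity(raman_tensor, volume, i, j):
    """Eq. 1: single-crystal intensity for tensor component (i, j)."""
    return volume * raman_tensor[i][j] ** 2


def powder_intensities(raman_tensor, volume):
    """Eqs. 2-7 for one normal mode, with sums over all i, j in {0, 1, 2}."""
    X = raman_tensor
    idx = range(3)
    g0 = sum(X[i][i] for i in idx) ** 2 / 3.0                          # Eq. 2
    g1 = 0.5 * sum((X[i][j] + X[j][i]) ** 2 for i in idx for j in idx)  # Eq. 3
    g2 = sum((X[i][i] - X[j][j]) ** 2 for i in idx for j in idx) / 3.0  # Eq. 4
    i_par = volume * (15.0 * g0 + 6.0 * (g1 + g2))                     # Eq. 5
    i_perp = 4.5 * volume * (g1 + g2)                                  # Eq. 6
    return {"G0": g0, "G1": g1, "G2": g2,
            "parallel": i_par, "perpendicular": i_perp,
            "total": i_par + i_perp}                                   # Eq. 7


# Toy input (not a computed tensor): an isotropic, identity-like mode tensor.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy = powder_intensities(identity, volume=1.0)
```

For the identity tensor the invariants evaluate to G0 = 3, G1 = 6, and G2 = 0, so the parallel, perpendicular, and total intensities are 81, 27, and 108 in units of V.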

Note that intensities can be conveniently recalculated from the available Raman tensors for different measurement conditions, including the temperature and the incident laser frequency of the Raman instrument. Intensities for infrared (IR) absorption are also calculated in the CPKS process and stored in our database. Symmetry and group theory are applied in the calculations to differentiate Raman- and IR-active vibrational modes.
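The text does not state which expression is used for this recalculation. The sketch below therefore assumes a commonly used Stokes prefactor, proportional to (ν_L − ν)⁴ [1 + n(ν)] / ν with n(ν) the Bose-Einstein occupation of the mode, and drops all constant factors. Function names and example values are hypothetical.

```python
import math

# Second radiation constant h*c/k_B in cm*K, used to evaluate
# h*c*wavenumber / (k_B * T) with wavenumbers in cm^-1.
HC_OVER_KB = 1.438776877


def bose_occupation(wavenumber, temperature):
    """Bose-Einstein occupation of a mode (wavenumber in cm^-1, T in K)."""
    return 1.0 / math.expm1(HC_OVER_KB * wavenumber / temperature)


def stokes_prefactor(wavenumber, temperature, laser_wavelength_nm):
    """Assumed experimental prefactor, in arbitrary units.

    Multiply a temperature- and laser-independent mode intensity by this
    factor to mimic given measurement conditions.
    """
    laser_wavenumber = 1.0e7 / laser_wavelength_nm  # nm -> cm^-1
    occupation = bose_occupation(wavenumber, temperature)
    return (1.0 + occupation) * (laser_wavenumber - wavenumber) ** 4 / wavenumber


# Example: a 100 cm^-1 mode measured with a 532 nm laser at two temperatures.
cold = stokes_prefactor(100.0, 100.0, 532.0)
warm = stokes_prefactor(100.0, 300.0, 532.0)
```

Under this assumed form, low-frequency modes are enhanced at higher temperature through the occupation factor, whereas the laser wavelength mainly rescales the whole spectrum.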

Workflow

The development of the present database consisted of three major phases: (I) collecting crystal structures and automatically generating input files for CRYSTAL, (II) performing first-principles CRYSTAL calculations, troubleshooting them to successful completion, and collecting the data, and (III) curating the data in a MongoDB database interfaced by a user-friendly website (Fig. 1).

In Phase I, we first select high-quality experimental structures characterized at low temperature (or room temperature) and ambient pressure from the Inorganic Crystal Structure Database (ICSD)47. Then, the experimental structures (i.e., lattice constants, atomic positions, and space groups) are automatically converted into CRYSTAL inputs by symmetry analysis. Specifically, the space group and irreducible-atom information needed by CRYSTAL is extracted using spglib as interfaced through the pymatgen library48,49. Exceptions are the monoclinic and orthorhombic structures, which may have alternative definitions of unique axes or origins and are therefore processed separately. Rhombohedral structures from ICSD are usually given in their hexagonal conventional cells, and they are all converted to standard rhombohedral primitive cells for consistent input50.

In Phase II of our workflow, CRYSTAL calculations are performed for Raman and IR spectra and other phonon-related properties (e.g., phonon dispersion and density of states). The calculation includes three parts, namely a full structure optimization (i.e., atomic positions, lattice parameters, and volume), the calculation of vibration frequencies (IR and Raman), and the CPKS calculation of Raman intensities. A quasi-Newton algorithm is used for the optimization of atomic positions, cell parameters, and volumes. Consistent and high-accuracy settings are used for all calculations. The DFT total energy was converged to 10⁻¹¹ Hartree/cell (~2.7 × 10⁻¹⁰ eV/cell) in the self-consistent field (SCF) procedure. The DFT total energy was integrated with the Pack-Monkhorst sampling scheme over large and symmetrized 8 × 8 × 8 k-point grids (SHRINKING = 8). Tolerances for the Coulomb and exchange integral series were set to 10⁻⁷ Hartree for both Coulomb overlap and penetration, 10⁻⁷ Hartree for exchange overlap, and 10⁻⁹ and 10⁻³⁰ Hartree for exchange penetration. Crystalline orbitals were expanded as a linear combination of atomic orbitals, which are described by Gaussian valence triple-ζ with polarization (pob-TZVP-rev2) basis sets35,36. The unknown exchange and correlation functional of DFT was approximated with the PBE0 hybrid functional, which provides excellent predictions of the experimental Raman spectra of Na3PS4 and quartz-SiO2 (Tables 1, 2)34.

Table 1 Computational Raman frequencies (in cm−1) for Na3PS4 (\(P\overline{4}{2}_{1}c\), ICSD No. 121566) calculated with different levels of theory compared to experimental Raman frequencies measured at 100 K.
Table 2 Computational Raman frequencies (in cm−1) for α-quartz SiO2 (P3221, ICSD No. 156197) calculated with different levels of theory and compared to experimental Raman frequencies.

At the Γ-point, 3n + 1 (with n = number of atoms) total energy and gradient calculations are required for computing the IR and Raman frequencies. The first derivatives of the DFT total energy with respect to atomic displacements, i.e., the total energy gradients, are computed with a single displacement (0.003 Å) of each coordinate from its equilibrium position. Born charges are also obtained during the frequency calculations. Based on the input generation in Phase I, all calculations are performed in an automatic and high-throughput manner.
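The counting of 3n + 1 gradient evaluations can be made concrete with a short sketch: one gradient at the equilibrium geometry plus one gradient per displaced coordinate, combined by a one-sided finite difference. The harmonic toy model, the function names, and the symmetrization step are assumptions for illustration; the sketch also ignores the symmetry reductions that CRYSTAL applies.

```python
def finite_difference_hessian(gradient, x0, step=0.003):
    """Build a Hessian from analytic gradients with one-sided differences.

    gradient: callable returning the list of first derivatives at x.
    x0:       equilibrium coordinates (length N = 3n for n atoms).
    step:     displacement applied to each coordinate (0.003 in the text).
    Uses N + 1 gradient evaluations in total.
    """
    n_coord = len(x0)
    g0 = gradient(list(x0))                       # reference calculation
    rows = []
    for i in range(n_coord):                      # one displacement per coordinate
        displaced = list(x0)
        displaced[i] += step
        gi = gradient(displaced)
        rows.append([(gi[j] - g0[j]) / step for j in range(n_coord)])
    # Average with the transpose so the result is exactly symmetric.
    return [[0.5 * (rows[i][j] + rows[j][i]) for j in range(n_coord)]
            for i in range(n_coord)]


# Toy model (hypothetical): two coordinates joined by a harmonic spring.
K_SPRING, R0 = 5.0, 1.5
calls = {"gradient": 0}


def toy_gradient(x):
    calls["gradient"] += 1
    stretch = x[1] - x[0] - R0
    return [-K_SPRING * stretch, K_SPRING * stretch]


toy_fd_hessian = finite_difference_hessian(toy_gradient, [0.0, R0])
```

For this two-coordinate model the routine calls the gradient three times and recovers the force-constant matrix [[k, −k], [−k, k]].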

In Phase III, the workflow handles the post-processing of CRYSTAL outputs and the subsequent data curation. Output files are parsed to extract relevant data, such as IR and Raman frequencies and their intensities. The post-processing also includes a convolution routine to generate Raman spectra based on the Voigt model with adjustable percentages of the Gaussian and the Lorentzian shapes. The computed properties are described in Table 3. All data are stored and organized using a MongoDB database currently hosted in the MongoDB Atlas cloud service. A user-friendly and publicly available web app is built for users to search for compounds and plot their Raman spectra interactively (https://raman-db.streamlit.app/).
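A broadening routine of this kind might look like the sketch below. It assumes that the "adjustable percentages" correspond to a pseudo-Voigt profile, i.e., a linear mix of a Lorentzian and a Gaussian of equal width, each scaled to unit height. The function names, the default width of 8 cm−1, and the normalization of the strongest peak to 1,000 are illustrative choices, not the database's actual routine.

```python
import math


def pseudo_voigt(x, center, fwhm, lorentz_fraction):
    """Height-normalized mix of a Lorentzian and a Gaussian with equal FWHM."""
    hwhm = 0.5 * fwhm
    lorentzian = hwhm ** 2 / ((x - center) ** 2 + hwhm ** 2)
    gaussian = math.exp(-math.log(2.0) * ((x - center) / hwhm) ** 2)
    return lorentz_fraction * lorentzian + (1.0 - lorentz_fraction) * gaussian


def broaden(frequencies, intensities, grid, fwhm=8.0,
            lorentz_fraction=0.5, max_intensity=1000.0):
    """Sum one profile per mode on the grid and rescale the maximum."""
    spectrum = [
        sum(inten * pseudo_voigt(x, freq, fwhm, lorentz_fraction)
            for freq, inten in zip(frequencies, intensities))
        for x in grid
    ]
    peak = max(spectrum)
    return [max_intensity * value / peak for value in spectrum]


# Toy stick spectrum (hypothetical): two modes at 200 and 400 cm^-1.
grid = [float(x) for x in range(0, 601)]
spectrum = broaden([200.0, 400.0], [1.0, 3.0], grid)
```

Setting `lorentz_fraction` to 1.0 or 0.0 gives pure Lorentzian or pure Gaussian line shapes, respectively.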

Table 3 Description of the name key, data type, and size for the computational properties as stored in the present database using a nested JSON structure.

Data Records

The computational data are organized in JSON format and stored in a MongoDB database, which is available in our public GitHub repository (https://github.com/caneparesearch/project Raman) and on Zenodo51. The computed IR and Raman frequencies, Raman intensities, and other phonon-related properties of any calculated compound can be accessed directly from the repository. The names and quantities of the computed properties are stored as key-value pairs using a nested JSON structure and are described in Table 3. For each computed compound, the database includes the structure from ICSD and the DFT-optimized structure, the DFT total energies, the vibrational entropy, the heat capacity, the electric susceptibilities up to the third order, the Raman and the IR tensors, the Raman intensities, the Born charges, the dynamical matrices, the vibration frequencies, and the mode symmetries. The convoluted Raman and IR spectra obtained from the computations, as well as the simulated measurement temperature and incident laser wavelength, are also included. While the current database includes only Γ-point phonons, it is designed to be compatible with calculations at q-points other than the Γ-point, phonon dispersions, phonon density of states (DOS), and inelastic neutron scattering (INS) spectra.
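To give a feel for working with such a nested record, the sketch below parses a made-up JSON document with the standard library and filters the Raman-active modes. Every key name and value in it is a placeholder invented for illustration; the actual keys are those listed in Table 3.

```python
import json

# Hypothetical record: key names and numbers are placeholders, not the
# real schema and not computed data.
RECORD_JSON = """
{
  "formula": "XY2",
  "icsd_id": 0,
  "phonons": {
    "gamma": {
      "modes": [
        {"frequency": 120.0, "irrep": "A1", "raman_active": true,
         "ir_active": false, "raman_intensity": 350.0},
        {"frequency": 240.0, "irrep": "B2", "raman_active": false,
         "ir_active": true, "raman_intensity": 0.0},
        {"frequency": 410.0, "irrep": "E", "raman_active": true,
         "ir_active": true, "raman_intensity": 1000.0}
      ]
    }
  }
}
"""


def raman_active_modes(record):
    """Return (frequency, irrep, intensity) tuples for Raman-active modes."""
    modes = record["phonons"]["gamma"]["modes"]
    return [(m["frequency"], m["irrep"], m["raman_intensity"])
            for m in modes if m["raman_active"]]


record = json.loads(RECORD_JSON)
active = raman_active_modes(record)
```

The same access pattern applies to any nested property once the placeholder keys are replaced by the ones documented in Table 3.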

The database currently contains 161 calculated inorganic compounds. Figure 2 shows the statistics of the database: there are 43 sulfides, 12 selenides and tellurides, 42 oxides and peroxides, 16 halides, 10 silicates, 9 carbonates, 4 sulfates, 10 phosphides, 8 phosphates and thiophosphates, and the remaining 7 compounds are nitrides, nitrates, borates, and carbides. Most compounds belong to the orthorhombic (50) and monoclinic (41) lattice systems; there are also 4 triclinic, 12 tetragonal, 13 rhombohedral, 25 hexagonal, and 16 cubic compounds. The 161 computed compounds are listed in Tables 4 and 5 with their ICSD codes, space group symbols, and a link to their detailed Raman information (computed outputs), including the calculated Raman- (and IR-) active vibrational modes and intensities.

Fig. 2

Database statistics. (a) Number of compounds for different chemistries; the selenide bar also includes telluride-based compounds, and the phosphate bar also includes thiophosphate-based compounds; “others” include nitride, nitrate, borate, and carbide. (b) Number of compounds for the seven lattice systems: triclinic (TRI), monoclinic (MCL), orthorhombic (ORC), tetragonal (TET), rhombohedral (RHL), hexagonal (HEX), and cubic (CUB).

Table 4 List of computed compounds (No. 1–90) in the database and their Raman information, such as Raman-active frequencies and Raman intensities.
Table 5 List of computed compounds (No. 91–161) in the database and their Raman information, such as Raman-active frequencies and Raman intensities.

Technical Validation

To validate our simulated Raman spectra obtained with the hybrid functional and local basis sets, we compare these predictions with experimental spectra directly available in the literature or in the RRUFF database, an integrated database of Raman spectra for minerals52.

Computational Raman spectra of two compounds (Na3PS4 and α-quartz SiO2) were benchmarked against experimental Raman spectra from the literature to demonstrate the high accuracy of the hybrid functional PBE0 compared to other functionals. Table 1 shows the frequencies of the Raman-active vibrational modes of Na3PS4 (\(P\overline{4}{2}_{1}c\), ICSD No. 121566) calculated with different approximations of the unknown exchange and correlation functional and compared with experimental frequencies10. The experimentally observed Raman-active modes are matched to the computed ones according to their mode symmetries. Errors of the predicted frequencies are calculated with respect to the matched experimental frequencies and are listed in the last three rows of Table 1. Among all the exchange and correlation functionals tested, the hybrid functional PBE0 used for this dataset shows the smallest mean absolute error (MAE), maximum absolute error (MaxAE), and mean absolute percentage error (MAPE), of only 3 cm−1, 12 cm−1, and 2.9%, respectively. The range-separated hybrid functional HSE06 by Heyd, Scuseria and Ernzerhof53,54 shows accuracy comparable to PBE0 in determining Raman frequencies, but calculations with HSE06 in CRYSTAL are roughly 2.6 times as expensive as those with PBE0 (e.g., the computation time for Na3PS4 on 128 cores, 2 × AMD EPYC 7742, is 31,492 s vs. 12,059 s). Notably, PBE0 and HSE06 perform especially well for the prediction of “high”-frequency modes (>100 cm−1). The GGA functionals PBE and PBEsol show the worst performance, with MAE, MaxAE, and MAPE of up to 17 cm−1, 40 cm−1, and 6%, respectively. These inaccuracies are not acceptable if computation serves to fingerprint Raman spectra of ordered materials. Meta-GGA functionals such as R2SCAN show relatively small absolute errors compared to GGA, but their errors with respect to experimental data are still approximately twice those of PBE0.
With the van der Waals correction (Grimme's D3)55, PBE0-D3 shows lower accuracy in frequencies than PBE0 alone, with an MAE of 11 cm−1. The lower accuracy when including D3 is partially attributed to the ionic character of the Na⁺–PS₄³⁻ bonds in Na3PS4. As shown by the error values in square brackets in Table 1, PBE0 and HSE06 show a decreased MAE for the high-frequency modes, while the high-frequency MAE of all the other functionals increases. The decrease in MaxAE for PBE0 from 12 cm−1 (all frequencies) to 9 cm−1 (high frequencies only) also indicates that errors ≥10 cm−1 only appear at low frequencies (<100 cm−1). This further encourages Raman calculations using PBE0 because high-frequency vibrational modes appear less noisy than soft modes in experimental measurements, and accurate computational references for the high-frequency modes are more important in facilitating the interpretation of experimental spectra.
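For reference, the three error measures quoted above can be computed from symmetry-matched frequency pairs as in the following sketch. The function name and the three toy frequency pairs are hypothetical and are not taken from Table 1.

```python
def frequency_errors(calculated, experimental):
    """Return MAE and MaxAE (same unit as the input) and MAPE (in percent).

    Both lists must have the same length and order, i.e., each calculated
    frequency is already matched to its experimental counterpart.
    """
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty lists of equal length")
    abs_errors = [abs(c - e) for c, e in zip(calculated, experimental)]
    mae = sum(abs_errors) / len(abs_errors)
    max_ae = max(abs_errors)
    mape = 100.0 * sum(err / abs(e)
                       for err, e in zip(abs_errors, experimental)) / len(abs_errors)
    return {"MAE": mae, "MaxAE": max_ae, "MAPE": mape}


# Toy data in cm^-1 (made up for illustration only).
toy_errors = frequency_errors([100.0, 205.0, 310.0], [102.0, 200.0, 300.0])
```

Restricting both lists to modes above 100 cm−1 before calling the function gives the high-frequency variants reported in square brackets in Table 1.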

The α-quartz SiO2 (P3221, ICSD No. 156197) is another widely studied compound with an experimental Raman spectrum56. Table 2 shows the calculated frequencies with different DFT exchange and correlation functionals and their errors with respect to the experimental reference. Interestingly, three of the reported experimental assignments do not show observable peaks in the Raman spectra from the same study56, and two of them are also absent in our calculation results. The three non-observable frequencies are marked with * in the last column of Table 2, and the frequency errors were calculated only for the modes identified in our calculations.

As in the case of Na3PS4, frequencies calculated with PBE0 are in good agreement with experimental observations. The PBE0 frequencies show the smallest MAE, MaxAE, and MAPE of 11 cm−1, 26 cm−1, and 3%, respectively. Except for the computationally more expensive HSE06, none of the other exchange and correlation functionals shows accuracy comparable to PBE0. Vibrational modes are assigned according to the selection rules as IR-active (change in dipole moment) and Raman-active (change in polarizability) modes.

Owing to the lack of high-quality Raman data from consistent measurements in the literature, we decided to further validate our computational approach by a systematic comparison of calculated spectra with the experimental Raman database RRUFF52. We applied our approach to calculate Raman spectra for 78 inorganic compounds available in the RRUFF database, and these spectra have also been added to our database. To generate input following our workflow, the RRUFF compounds were matched to ICSD structures according to their chemical formulae and mineral names. The matching was subsequently verified using space group information of Materials Project57 entries that were assigned to RRUFF compounds in Ref. 28.

Figure 3 shows the comparison of the calculated (purple) and RRUFF experimental (green) Raman spectra for six of the 78 inorganic compounds considered here: Li3PO4 (ICSD No. 77095), SiO2 (ICSD No. 90145), MgCO3 (ICSD No. 80870), Na2Si2O5 (ICSD No. 34688), Sb2PbO6 (ICSD No. 81387), and TiO2 (ICSD No. 9852). The maximum Raman intensities are normalized to 1,000 arbitrary units (a.u.) and plotted versus Raman frequencies in cm−1. The calculated spectra are convoluted with a Voigt line shape consisting of 50% Lorentzian and 50% Gaussian. Purple ticks at the bottom of each spectrum mark the calculated Raman frequencies. Insets show optimized crystal structures from our calculations, which are visualized with JS-ICE58 together with their computed space groups. The spectra comparison for all the 78 calculated RRUFF compounds can be found in Supplementary Figures S2–S15 of Supporting Information (SI).

Fig. 3

Raman spectra calculated using our workflow with the hybrid functional PBE0 (purple) in comparison with RRUFF experimental spectra (green) for six inorganic compounds: (a) Li3PO4 with identified ICSD No. 77095 and space group Pnma, (b) SiO2 (ICSD No. 90145 and P3121), (c) MgCO3 (ICSD No. 80870 and R−3c), (d) Na2Si2O5 (ICSD No. 34688 and P121/a1), (e) Sb2PbO6 (ICSD No. 81387 and P−31m), and (f) TiO2 (ICSD No. 9852 and I41/amd). Raman intensities are in arbitrary units (a.u.) and the maximum peak in each spectrum is normalized to 1,000. Raman-active vibration modes are marked as purple ticks at the bottom of each spectrum. All calculated spectra are plotted using a Voigt line shape with 50% Lorentzian and 50% Gaussian. Insets show optimized crystal structures from our calculations. A complete spectra comparison of all the 78 calculated RRUFF compounds can be found in Section S3 of Supporting Information.

The calculated and experimental Raman spectra of the six compounds in Fig. 3 show very good agreement. As shown in the plots, the experimental peaks are all correctly predicted with very small deviations in frequency. Specifically, 87 out of the total 93 peaks show wavenumber deviations within ±10 cm−1 and 91 peaks within ±15 cm−1. The relative deviations are within ±3% for 88 out of the total 93 peaks, and within ±5% for 92 peaks. Predictions are especially accurate for the most intense peak in each spectrum, with a maximum deviation of only ~−9.3 cm−1 in the case of MgCO3 (Fig. 3c). Note that large deviations in intensities are only observed for relatively weak experimental peaks (Fig. 3f). Such accurate prediction of the most intense peak of a compound is very useful when the prediction is used to fingerprint phases from experimental Raman spectra. Notably, the six spectra (Fig. 3) do not represent the best computation–experiment matches among all the 78 spectra (Supplementary Figures S2–S15 of SI).

Besides the individual spectra comparison for each compound, a statistical comparison was also made between calculated and experimental Raman-active frequencies to provide a more systematic evaluation of the overall computational accuracy. Among all the 78 RRUFF compounds we calculated, the spectra of 15 compounds show experimental artifacts that make it impossible to compare their frequencies with our calculated Raman-active frequencies (Supplementary Figures S13–S15 of SI), and the spectra of the remaining 63 compounds (Supplementary Figures S2–S12 of SI) were used for the statistical comparison. Given a pair of computational and RRUFF experimental spectra for one of the 63 compounds, the common frequency range of the spectra was first identified. Then, within the common range, all local maxima of the Raman intensities and their frequencies were located. Finally, each computational intensity maximum (i.e., a peak in the computational Raman spectrum) was uniquely matched to an experimental intensity maximum (i.e., a peak in the experimental Raman spectrum), and the computational frequency deviation from experiment was calculated for each pair of matched Raman peaks.
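The three steps just described (common range, local maxima, unique matching) can be sketched as follows. The text does not specify how peaks are paired, so this example assumes a greedy nearest-neighbor pairing with a maximum allowed gap; the function names, the `max_gap` parameter, and the triangular toy spectra are all hypothetical.

```python
def local_maxima(freqs, intens, lo, hi):
    """Frequencies of local intensity maxima inside the range [lo, hi]."""
    return [freqs[k] for k in range(1, len(freqs) - 1)
            if lo <= freqs[k] <= hi
            and intens[k] > intens[k - 1] and intens[k] >= intens[k + 1]]


def match_peaks(calc_peaks, exp_peaks, max_gap=30.0):
    """Greedy one-to-one pairing, closest pairs first (an assumed scheme)."""
    candidates = sorted((abs(c - e), i, j)
                        for i, c in enumerate(calc_peaks)
                        for j, e in enumerate(exp_peaks))
    used_calc, used_exp, pairs = set(), set(), []
    for gap, i, j in candidates:
        if gap > max_gap:
            break
        if i not in used_calc and j not in used_exp:
            used_calc.add(i)
            used_exp.add(j)
            pairs.append((calc_peaks[i], exp_peaks[j]))
    return sorted(pairs)


def compare_spectra(calc, exp, max_gap=30.0):
    """calc and exp are (frequencies, intensities) tuples on sorted grids.

    Returns a list of (wavenumber deviation, relative deviation in %).
    """
    lo = max(calc[0][0], exp[0][0])        # common frequency range
    hi = min(calc[0][-1], exp[0][-1])
    pairs = match_peaks(local_maxima(*calc, lo, hi),
                        local_maxima(*exp, lo, hi), max_gap)
    return [(c - e, 100.0 * (c - e) / e) for c, e in pairs]


def fraction_within(values, tolerance):
    """Fraction of values whose magnitude does not exceed the tolerance."""
    return sum(abs(v) <= tolerance for v in values) / len(values)


def triangles(grid, centers):
    """Toy spectrum: non-overlapping triangular peaks of height 10."""
    return [sum(max(0.0, 10.0 - abs(x - c)) for c in centers) for x in grid]


calc_grid = [float(x) for x in range(0, 501)]
exp_grid = [float(x) for x in range(100, 601)]
deviations = compare_spectra((calc_grid, triangles(calc_grid, [150.0, 300.0])),
                             (exp_grid, triangles(exp_grid, [155.0, 295.0])))
```

In the toy case the two computed peaks are paired with the experimental peaks at 155 and 295 cm−1, giving deviations of −5 and +5 cm−1; `fraction_within` then yields the kind of percentage statistics reported for the full dataset.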

Figure 4 shows the computational frequency deviations from the matched RRUFF Raman-active frequencies, plotted versus Raman shift. The wavenumber deviation is plotted using purple points, and the relative deviation is plotted using orange triangles. The grey shaded area indicates a wavenumber deviation range of ±10 cm−1 and a relative deviation range of ±5%. A total of 804 pairs of matched peaks from the 63 spectra are used for the comparison. Of the wavenumber deviations, 94.7% (88.9%) are within ±10 cm−1 (±8 cm−1), and of the relative deviations, 97.6% (92.9%) are within ±5% (±3%). The maximum absolute values of the wavenumber and relative deviations are 22.4 cm−1 and 10.9%, respectively. More data points are found at smaller Raman shifts because the Raman-active modes of the 63 compounds are overall more densely distributed at lower frequencies (Supplementary Figures S2–S12 of SI). Another trend shown in Fig. 4 is that the deviations are generally larger at lower frequencies. A plausible reason for this trend is that experimental measurements are usually noisier for softer vibrational modes.

Fig. 4

Wavenumber deviation (purple) and relative deviation (orange) of calculated Raman frequencies compared to the RRUFF experimental Raman frequencies. The grey shaded area indicates a wavenumber-deviation range of ±10 cm−1 and a relative-deviation range of ±5%. The frequency comparison is only conducted within the common frequency range between computational and experimental spectra. A total of 63 PBE0-RRUFF spectra are used to extract the deviations (Supplementary Figures S2–S12 of Supporting Information). The 15 RRUFF spectra found to have experimental artifacts are not included in this comparison to the computed spectra (Supplementary Figures S13–S15 of Supporting Information).

Usage Notes

The present database is accessible through our web application (https://raman-db.streamlit.app/). In this web application (Fig. 5), users can search for chemical formulae, select the desired compound according to its ICSD identification number, view the crystal structure, and interactively plot the Raman and IR spectra with different convolution schemes (i.e., Gaussian, Lorentzian, and Voigt shapes). The spectra plots can be downloaded as PNG files, and the spectra data (frequencies and intensities for the Raman- and IR-active vibrational modes, together with their irreducible representations) can be downloaded as CSV files. Relevant quantities, such as Raman tensors and Born charges, are also available in the application. A complete list of calculated compounds in the present database is constantly updated and shown on the web page.

Fig. 5

A demo of the Raman-database web application (https://raman-db.streamlit.app/). The database of computed Raman properties is interfaced by this web application, which supports searching for compounds, viewing crystal structures, interactively plotting Raman and IR spectra, and querying phonon properties (left and bottom right). A complete list of all available inorganic compounds in the database and their ICSD numbers is also shown (right). The list currently contains 161 compounds and is growing.