A library of ab initio Raman spectra for automated identification of 2D materials

Raman spectroscopy is frequently used to identify composition, structure and layer thickness of 2D materials. Here, we describe an efficient first-principles workflow for calculating resonant first-order Raman spectra of solids within third-order perturbation theory employing a localized atomic orbital basis set. The method is used to obtain the Raman spectra of 733 different monolayers selected from the Computational 2D Materials Database (C2DB). We benchmark the computational scheme against available experimental data for 15 known monolayers. Furthermore, we propose an automatic procedure for identifying a material based on an input experimental Raman spectrum and apply it to the cases of MoS2 (H-phase) and WTe2 (T′\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${}^{\prime}$$\end{document}-phase). The Raman spectra of all materials at different excitation frequencies and polarization configurations are freely available from the C2DB. Our comprehensive and easily accessible library of ab initio Raman spectra should be valuable for both theoreticians and experimentalists in the field of 2D materials. The authors develop a first-principles workflow for calculating Raman spectra of 733 monolayers from the computational 2D materials database. After benchmarking results against experimental data for 15 monolayers, an automatic procedure for identifying a material from its Raman spectrum is proposed.

F ollowing the discovery of graphene in 2004 1 , the field of two-dimensional (2D) materials has grown tremendously during the last decade. Today, more than 50 different monolayer compounds including metals 2,3 , semiconductors 4-6 , insulators 7 , ferromagnets 8 , and superconductors 9,10 , have been chemically grown or mechanically exfoliated from layered bulk crystals 11 . The enormous interest in 2D materials has mainly been driven by their unique and easily tunable properties (as compared to 3D bulk crystals), which make them attractive for both fundamental research and technological applications in areas such as energy conversion/storage, (opto)-electronics, and photonics 6,12,13 . Among the various experimental techniques used for characterizing 2D materials, Raman spectroscopy plays a pivotal role 14 thanks to its simplicity, non-destructive nature, and high sensitivity towards key materials properties such as chemical composition, layer thickness (number of layers), inter-layer coupling, strain, crystal symmetries and sample quality [15][16][17] .
Raman spectroscopy is a versatile technique for probing the vibrational modes of molecules and crystals from inelastically scattered light, and is widely used for identifying materials through their unique vibrational fingerprints 18 . There are various types of Raman spectroscopies that differ in the number of photons or phonons involved in the scattering process 18 . Here we focus on the first-order Raman processes in which only a single phonon is involved. Typically, this is the dominant scattering process in defect-free samples (which are considered here). Note that Raman processes involving defect states or several phonons may also play important roles in some 2D crystals such as graphene 19 . As shown schematically in Fig. 1(a), the light scattered from a crystal appears in three distinct frequency bands: A strong resonance at the incident frequency ω in due to Rayleigh (elastic) scattering, and weaker resonances due to Raman (inelastic) scattering at ω in − ω ν and ω in + ω ν forming Stokes and anti-Stokes bands, respectively. Here, ω ν is the frequency of a (Raman active) vibrational mode of the crystal, i.e. a phonon. Depending on the symmetry of the phonon modes and polarization of the electromagnetic fields, a phonon mode may be active or inactive in the Raman spectrum.
While semi-classical theories of Raman spectroscopy can provide some qualitative insight 18 , a full quantum mechanical treatment is necessary for a quantitatively accurate description. In particular, ab initio techniques have been employed successfully to calculate Raman spectra of both molecules 18,20 and solids 21,22 typically showing good agreement with experimental spectra. The parameter-free nature of such computational schemes endow them with a high degree of predictive power, although their computational cost can be significant, thus, in practice limiting them to relatively simple, i.e. crystalline, materials. In the realm of 2D materials, ab initio Raman studies have been limited to a handful of the most popular 2D crystals including graphene 19 , hBN 23 , WTe 2 24 , SnS, and SnSe 25 , as well as MoS 2 and WS 2 26 . In view of the significant experimental efforts currently being devoted to the synthesis and application of future 2D materials and the important role of Raman spectroscopy as a main characterization tool, it is clear that the compilation of a comprehensive library of Raman spectra of 2D materials across different crystal structures and chemical compositions is a critical and timely endeavor.
Recently, we have introduced the open Computational 2D Materials Database (C2DB) 11 , which contains various calculated properties for several thousands 2D crystals using state of the art ab initio methods. The properties currently provided in the C2DB include the relaxed crystal structures, thermodynamic phase diagrams (convex hull), electronic band structures and related quantities (effective masses, deformation potentials, etc.), elastic properties (stiffness tensors, phonon frequencies), and optical conductivity/absorbance spectra. We stress that the materials in the C2DB comprise both experimentally known as well as hypothetical materials, i.e. materials that may or may not be possible to synthesize in reality.
In this paper, we present an ab initio high-throughput computation of the resonant first-order Raman spectra of more than 700 monolayers selected as the most stable 2D crystals from the C2DB. The calculations are based on an efficient density functional theory (DFT) implementation of the first-order Raman process employing a localized atomic orbital (LCAO) basis set 27 . We describe the implementation and the automated workflow for computing the Raman spectra at three different excitation frequencies and nine polarization setups. All calculated Raman spectra are provided in Supplementary Figs. 2-734, and can be found at the C2DB website (http://c2db.fysik.dtu.dk). In addition, the applied computational routines are freely available online through the website. Our numerical results are benchmarked against available experimental data for selected 2D crystals (15 different monolayers) such as MoS 2 , MoSSe, and MoSe 2 . The calculated spectra show excellent agreement with experiments for the Raman peak positions and acceptable agreement for the relative peak intensities. Finally, we analyze the inverse problem of identifying a material based on an input (experimental) Raman spectrum as shown schematically in Fig. 1 procedure can be used to differentiate clearly the distinct structural phases of MoS 2 and WTe 2 . Incidentally, the library of calculated Raman spectra provides a useful dataset for training machine learning algorithms 28,29 . As such, our work is not only a valuable reference for experimentalists and theoreticians working in the field of 2D materials, but also represents a step in the direction of autonomous (in situ) characterization of materials.

Results
Theory of Raman Scattering. We first briefly review the theory of Raman scattering in the context of third-order perturbation theory. As discussed above, accurate modeling of Raman processes requires a quantum mechanical treatment to obtain the electronic properties. Regarding the electromagnetic field, it can be shown that a classical description of the field 30,31 yields the same results as the full quantum mechanical theory that quantizes the photon field 18,32 . The most common approach to Raman calculations is the Kramers-Heisenberg-Dirac approach 31 , in which the Raman tensor is obtained as a derivative of the electric polarizability with respect to the vibrational normal modes 21,26,30,31 . Nonetheless, here we employ a more direct and much less-explored approach based on time-dependent thirdorder perturbation theory to obtain the rate for coherent electronic processes involving creation/annihilation of two photons and one phonon. While the two approaches can be shown to be equivalent 18 , at least when local field effects can be ignored as is the case for 2D materials, the third-order perturbative approach can be readily extended to higher order Raman processes (e.g. scattering on multiple phonons), and provides a more transparent physical picture of the Raman processes in terms of individual scattering events 33 . Hence, our computational framework is prepared for future extensions to multi-phonon processes. Note that in terms of computational effort, the perturbative approach is comparable to the polarizability derivative method for typical crystals, for which the matrix element calculation dominates the computation time. In this case, both approaches scale as N ν N 2 b , where N ν and N b denote the number of phonon modes and electronic bands, respectively.
To derive an expression for the Raman intensity, both electron-light and electron-phonon Hamiltonians are treated as perturbations (the exact forms of these Hamiltonians are given in the method section). A general time-dependent perturbation can be written asĤ 0 ðtÞ P ω 1Ĥ 0 ðω 1 Þ expðÀiω 1 tÞ (ω 1 runs over positive and negative frequencies and can also be zero). Note that, in our study, there are three distinct frequency components in H 0 ðtÞ: input and output frequencies (ω in and ω out ) due to the electron-light interaction and zero frequency (i.e. time-independent) for electron-phonon coupling. Within third-order perturbation theory, the transition rate P ð3Þ i!f from an initial state Ψ i j i to a final state jΨ f i due to the perturbative HamiltonianĤ 0 ðtÞ, is given by 34 Here, a, b summations are performed over all eigenstates of the unperturbed system (here a set of electrons and phonons) and the sums over ω n with n = 1, 2, 3 are over all three involved frequencies in the perturbative HamiltonianĤ 0 ðtÞ. The notation (ω 1 ω 2 ω 3 ) indicates that, in performing the summation over ω n , the sum ω 1 + ω 2 + ω 3 = ω is to be held fixed. In addition, E α with α 2 i; f ; a; b f gdenote the energies associated with Ψ α j i and the Dirac delta ensures energy conservation. The light field is written as F ðtÞ ¼ F in u in exp ðÀiω in tÞ þ F out u out exp ðÀiω out tÞ þ complex conjugate, where F in=out and ω in/out are the amplitudes and frequencies of the input/output electromagnetic fields, respectively, see Fig. 1(a). In addition, u in=out ¼ P α u α in=out e α denote the corresponding polarization vectors, where e α is the unit vector along the α-direction with α ∈ {x, y, z}.
We now specialize to the case where the initial and final states of the system are given by Ψ i j i ¼ 0 j i n ν j i and jΨ f i ¼ j0i jn ν þ 1i, respectively 32 so that E f − E i = ℏω ν . Here, 0 j i denotes the ground state of the electronic system and n ν j i is a state with n ν phonons at frequency of ω ν . In this case, the intensity of the Stokes Raman process for a phonon mode is proportional to P ð3Þ i!f , in which the transition rate involves a photon absorption, followed by an emission of a single phonon and photon. For this type of processes, (ω 1 , ω 2 , ω 3 ) are any permutation of (ω in , −ω out , 0), e.g. ω 1 = ω in , ω 2 = −ω out , ω 3 = 0 and five similar terms (all six terms contribute to the response at frequency of ω = ω in − ω out ). The total Raman intensity I(ω) is then obtained by summing over all possible final states, i.e. phonon modes ν. Inserting the perturbative Hamiltonians [c.f. Eqs. (6)-(8) in method section] in Eq. (1), the expression for the Stokes Raman intensity involving scattering events by only one phonon can be written Here, I 0 is an unimportant constant (since Raman spectra are always reported normalized) that is proportional to the input intensity and depends on the input frequency, and n ν is given by the Bose-Einstein distribution, i.e. n ν ðexp½_ω ν =k B T À 1Þ À1 at temperature T. Due to momentum conservation, only phonons at the center of the Brillouin zone contribute to the one-phonon Raman processes 19 . Furthermore, R ν αβ denotes the Raman tensor for phonon mode ν, see method section. Eq. (2) is used for computing the Raman spectra in this work for a given excitation frequency and polarization setup. It may be noted that one can derive a similar expression for the anti-Stokes Raman intensity by replacing n ν + 1 by n ν in Eq. (2) and ω ν by −ω ν in Eq. (10) in method section. Note, also, that the Raman shift ℏω is expressed in cm −1 with 1 meV equivalent to 8.0655 cm −1 .
Computational workflow. An overview of the automated workflow for computing the Raman tensor of the materials in the C2DB is shown in Fig. 2. First, the relaxed structures are extracted from the database. In this work, we consider only compounds that are dynamically stable. Next, the electronic band energies and wavefunctions are obtained from a DFT calculation. In parallel, a zone-center phonon calculation is performed to obtain the optical vibrational modes. From the obtained electronic states and phonon modes, the momentum and electron-phonon matrix elements are evaluated and stored. In the final step, for a given excitation frequency and input/output polarization vectors, the Raman spectrum is calculated using Eq. (2). The key feature of the approach outlined here is that the calculation process can be automatized, allowing one to perform thousands of calculations in parallel without human intervention.
For simplicity, we have restricted the study to non-magnetic materials, but our routines can be readily extended to include magnetic materials. The Raman spectra presented in this paper are computed for in-plane polarization, where the incoming and outgoing photons are polarized along the x-or y-directions, i.e. u in/out are either [1, 0, 0] or [0, 1, 0]. The four possible combinations are referred to as xx, xy, yx, and yy polarization setups.
Raman spectra and comparison with experiments. Figure 3 compares the calculated Raman spectrum of three different monolayer transition metal dichalcogenides (TMDs), namely MoSe 2 , MoSSe and MoS 2 , with the experimental data extracted from ref. 35 . For all three monolayers, a good agreement is observed both for the peak positions and relative amplitudes of the main peaks. Additional peaks in the experimental spectra presumably originate from the substrate or defects in the samples. The differences between the Raman spectra of the three materials provide valuable information about the crystal structure. Symmetry and Raman activity of phonon modes are determined by the irreducible point group representations. MoS 2 and MoSe 2 are members of point group D 3h , whereas MoSSe lacking a horizontal mirror plane σ h belongs to the point group C 3v . In Mulliken notation, the irreducible representation of MoS 2 and MoSe 2 is 2A 00 whereas for MoSSe the lowered symmetry leads to 3A 1 + 3E. For MoS 2 and MoSe 2 , one member of both A 00 2 and E 0 is an acoustic mode, and the other A 00 2 mode is Raman inactive. For MoSSe, A 1 and E each contain an acoustic mode, and all other modes are Raman active. The relevant modes are shown schematically in Fig. 3(b). In general, a Raman active mode will only appear in certain polarization configurations. The tensorial Raman selection rules follow from the irreducible point group representations 36,37 as shown for point groups D 3h and C 3v in Supplementary Note 1.
Next, we focus on the case of MoS 2 , and investigate the dependency of the Raman spectrum on the excitation frequency and polarization, see Figs. 4(a) and 4(b), respectively. In Fig. 4(a), the Raman spectra are computed for three commonly used wavelengths of blue, green and red laser sources. In this case both in-and outgoing polarization vectors are along the y-direction (or x-direction). While the relative strength of the first Raman active peak in the spectrum is enhanced slightly for shorter wavelengths, the shape of the spectrum does not change significantly. Note that, in reality, the relative amplitudes of the E 0 and A 0 1 modes may change considerably if the excitation frequency coincides with an exciton resonance 38,39 . This is because excitons can selectively enhance specific Raman modes due to their symmetry 40,41 . Although this effect is not captured properly in our independent-electron model, it is in principle straightforward to include by using the many-body eigenstates obtained by diagonalizing the Bethe-Salpeter equation (BSE) when evaluating the matrix elements in Eq. (1) [41][42][43][44][45] . Moreover, the absolute magnitudes of the Raman peaks can vary substantially by changing the excitation wavelength due to the possible resonance with electronic states (resonance Raman spectroscopy) 40 . Nonetheless, the overall magnitude of the Raman spectra is usually of little practical importance compared to the spectral positions and spectra are typically normalized as done here. Changing the polarization of electromagnetic fields not only influences the relative amplitudes of Raman peaks, but may switch certain modes on and off as shown in Fig. 4(b). For instance, the MoS 2 E 0 mode becomes completely inactive for the perpendicular polarization setup (zz) due to symmetry 26 . This is easily confirmed using Supplementary Eq. (1) of Supplementary Note 1 predicting an inactive E 0 mode for zz-polarization. Note that, although the E″ mode is Raman active for xz-, yz-, zx-, and zypolarizations, the intensity is too small to be observed in Fig. 4(b).
We have assessed the quality of the Raman library for a wide range of material compositions and crystal structures. Fig. 5 compares experimental and calculated Raman spectra for 12 monolayers including graphene, hBN, several conventional TMDs in the H-or T 0 -phase as well as anisotropic crystals such as phosphorene and Pd 2 Se 4 . In general, the number of Raman active modes increases with the number of atoms in the unit cell, as expected. For instance, there are more than eight peaks in the Raman spectrum of Pd 2 Se 4 . Furthermore, as a rule of thumb, Raman modes of materials containing heavier atoms are at lower frequencies and vice versa, e.g. the Raman peaks for graphene and hBN appear at frequencies above 1000 cm −1 . The experimental data are obtained under various experimental conditions such as different excitation wavelengths and polarizations or diverse sample substrates. Note that if polarized Raman spectra were not available (or in the case of unspecified polarization), an average of all four in-plane polarization settings, i.e. xx, xy, yx, and yy, has been used for generating the theoretical spectra. In general, there is quite good agreement between our calculations and experimental results in all cases, particularly, for the peak positions. The deviations can be attributed to various factors such as substrate and excitonic effects, which are not captured in our calculations, as well as the quality of the experimental samples and other experimental uncertainties, all of which can influence the spectra considerably.
Identifying materials from their Raman spectra. At this point, we turn to a critical test of the ab initio Raman library: given an experimental Raman spectrum, is it possible to identify the underlying material by comparing the experimental spectrum to a library of calculated spectra? The answer to this question will depend on several factors including: (1) the quality of the experimental spectrum. (2) The quality of the calculated spectra, i.e. the ability of theory to reproduce a (high quality) experimental spectrum for a given material. (3) The size/density of the calculated Raman spectrum database. Obviously, a more densely populated database increases the chances that the experimental sample is, in fact, contained in the database. But, at the same time, this increases the risk of obtaining a false positive, i.e. matching the experimental spectrum by a calculated spectrum of a different material.
Putting the above idea into practice requires a quantitative measure for comparing Raman spectra. In the present work, we use the two lowest moments to fingerprint the Raman spectrum. In general, the Nth Raman moment of the spectrum is given by where I ν denotes the amplitude of mode ν, i.e.
Note that, for these calculations, we normalize the Raman spectrum such that its zeroth moment becomes one, i.e.
Therefore, the first Raman moment corresponds to the mean value of the spectrum. Rather than using the second moment, we use the standard deviation of the spectrum as the selected measure, given by δω ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi hω 2 i À hωi 2 q : ð4Þ Figure 6 shows a scatter plot of 〈ω〉 and δω/〈ω〉 for the 733 monolayers at an excitation wavelength of 532 nm and xxpolarization setup obtained at the room temperature. In this plot, crystals composed of lighter elements appear further to the right because their optical phonons generally have higher energies. Furthermore, crystals with fewer atoms in the unit cell and/or higher degree of symmetry, appear in the bottom of the plot because they have fewer (non-degenerate) phonons and thus fewer peaks in their Raman spectrum resulting in a reduced To test the feasibility of inverse Raman mapping, we evaluate the lowest Raman moment fingerprint for five experimental Raman spectra of MoS 2 (H-phase) and three spectra of WTe 2 (T 0phase) obtained from independent studies, see stars in Fig. 6. Similar analyses have been performed for the eleven additional crystals found in Fig. 5, and is provided in Supplementary Note 2.
Firstly, note that the fingerprint of MoS 2 in the T 0 -phase (WTe 2 in H-phase) is located relatively far from the H-phase (T 0 -phase) fingerprint in the plot, which suggests that the lowest Raman moments are indeed able to distinguish different structural phases of the same material. The insets highlight the regions surrounding the experimental data. The variation in the experimental fingerprints is due to small differences in the Raman spectra, originating from the variations in sample quality, substrate effects, measurement techniques/conditions, etc. Consequently, the precise peak positions and, in particular, their amplitudes can vary from one experiment to another. Clearly, the fingerprints of the calculated spectra for both MoS 2 and WTe 2 lie close to the experimental data. In a few cases, such as Pd 2 Se 4 , the experimental fingerprints lie further from the theoretical predictions, as illustrated in Supplementary Fig. 1. This may partly be due to insufficient sample quality for these less-explored 2D crystals. In fact, the deviation between theory and experiments is comparable to the variation between the different experiments. Importantly, only a few other materials show a similar agreement with the experimental data. This suggests that fingerprints including higher order moments could single out the correct material with even higher precision. For instance, the skewness (based on the third Raman moment) can be used to distinguish MoS 2 from CrS 2 . By manual inspection of the Raman spectra, one readily confirms that the calculated spectra of MoS 2 and WTe 2 are in fact the best match to the experimental spectra, e.g. other candidates have Raman peaks that are not observed in the experimental spectra or the relative amplitudes of the peaks are completely different from the experimental data. Nonetheless, the procedure of manual inspection can be replaced by a more rigorous and unbiased approach as discussed below.
To compare the experimental and calculated Raman spectra quantitatively, we focus on the experimental data of Tongay et al. 46 and Cao et al. 47 for MoS 2 and WTe 2 , respectively. The experimental spectra for MoS 2 are obtained without any polarizer at 77 K at an excitation wavelength of 488 nm. For WTe 2 in Cao et al. 47 , the experiment is performed at room temperature using a 532 nm laser linearly polarized in-plane. To account for the unspecified polarization, we take the average of Raman spectra for the xx and xy polarization setups in the case of WTe 2 , while for MoS 2 the average of all Raman spectra for transverse components (xx, xy, yx, and yy) is used as the theoretical spectrum. For quantitative comparison with the experimental data, one can use Euclidean distances between the experimental and theoretical spectra as a measure. For two Raman spectra I 1 (ω) and I 2 (ω), the Euclidean distance (or L 2 -norm) ||I 1 − I 2 || is defined as Note that the spectra are normalized such that the total area is unity. Figure 7 shows the computed Euclidean distances from the calculated Raman spectra to the experimental data for both MoS 2 and WTe 2 . We highlight the points corresponding to the materials in the insets of Fig. 6. In both cases, identifying the smallest Euclidean distance confirms that the Raman spectra closest to the experimental data are indeed the calculated spectra of MoS 2 and WTe 2 . This shows that the quality and accuracy of, respectively, the experimental and computed 2D materials Raman spectra, is sufficient for automatic structure identification.

Discussion
We have introduced a comprehensive library of ab initio computed Raman spectra for more than 700 2D materials spanning a variety of chemical compositions and crystal structures. The 2D materials comprise both experimentally known and hypothetical compounds, all dynamically stable and with low formation energies. Using an efficient first-principles implementation of third-order perturbation theory, the full resonant first-order Raman tensor was calculated including all nine possible combinations for polarization vectors of the input/output photons and three commonly used excitation wavelengths. All spectra are freely available as part of the C2DB and should comprise a valuable reference for both theoreticians and experimentalists in the field. The reliability of the computational approach was demonstrated by comparison with experimental spectra for 15 monolayers such as graphene, hBN, phosphorene and several TMDs in the H-, T-, and T 0 -phases. We carefully tested the feasibility of inverse Raman mapping, i.e. to what extent the library of computed Raman spectra can be used to identify the composition and crystal structure of an unknown material from its Raman spectrum. For the specific cases of MoS 2 in H-phase and WTe 2 in T 0 -phase, we showed that a simple fingerprint based on the lowest moments of the Raman spectrum is sufficient to identify the materials from their experimental Raman spectrum. This represents a significant step in the direction of autonomous identification/characterization of materials. In addition, apart from being a useful reference for 2D materials research, the Raman library can be used to train   Fig. 7 Euclidean distances between Raman spectra. Distances (see main text for details) are calculated between theoretical Raman spectra and the experimental data of Tongay et al. 46 and Cao et al. 47 for MoS 2 (top) and WTe 2 (bottom). For comparison purposes, we highlight the points corresponding to materials in the insets of Fig. 6 by yellow. NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16529-6 ARTICLE NATURE COMMUNICATIONS | (2020) 11:3011 | https://doi.org/10.1038/s41467-020-16529-6 | www.nature.com/naturecommunications machine learning algorithms to predict Raman spectra directly from the atomic structure similarly to recent work on prediction of linear optical spectra for molecules 48 . This is of particular importance in the currently attractive trend of employing machine learning algorithms in materials science 28,29 .
In the present work, we have focused on Raman processes involving only a single phonon, i.e. first-order Raman processes, since these are typically the dominant contributions to the Raman spectrum. Nonetheless, the presented methodology can be readily extended to include two-phonon scattering processes, although the computational cost will be significantly increased. Excitonic effects in the Raman spectrum have been neglected since most experimental Raman spectra are recorded off-resonance where excitons play a minor role. The inclusion of excitonic effects can be achieved within the presented methodology by employing the many-body eigenstates obtained from the BSE 42,49,50 instead of Slater determinantal electron-hole excitations. However, this will mainly affect the amplitude of the Raman peaks which is of secondary importance in practice. We only compute the Raman spectra of monolayers in the present work, but the library can be extended to multi-layer structures. For some 2D materials such as graphene or MoS 2 this can be done by employing existing exchange-correlation functionals capable of accurate modeling of van der Waals forces. But for other 2D systems such as phosphorene, further development of exchange-correlation functionals is required to describe the complex inter-layer couplings, particularly for low-frequency Raman modes 51 . The symmetry of phonons modes have previously been investigated for graphene 52 , the TMD family 37 and phosphorene 53 using group theory analysis. Based on the Raman library, such analysis could be performed for a much wider range of materials in future work. Finally, the current work has been restricted to non-magnetic materials, and the ab initio Raman response of magnetic materials is an interesting future research field.

Methods
Theory. In the independent-particle approximation, the Hamiltonian of a system of electrons interacting with phonons and electromagnetic fields takes the form H ¼Ĥ 0 þĤ eγ þĤ eν , whereĤ 0 is the unperturbed Hamiltonian of the electrons (e) and phonons (ν),Ĥ eγ describes the electron-light interaction (here written in the velocity or minimal coupling gauge 54,55 ), andĤ eν describes the electron-phonon coupling. In second quantization, they are given by 56 Here,ĉ y =ĉ andâ y =â are the creation/annihilation operators of electrons and phonons, respectively, A denotes the vector potential (F ¼ À∂A=∂t), ε nk is the energy of the single-particle electronic state nk j i, and ℏω νq denotes the phonon energy of normal mode ν and wavevector q. Furthermore, p nmk ¼ hnkjpjmki and g νq nmk ¼ hnk þ qj∂ νq V KS jmki are the momentum and electron-phonon matrix elements (to the first order in the atomic displacements 56 ), respectively, with the Kohn-Sham potential V KS . The summation over k implies an integral over the first Brillouin zone, i.e. (2π) D ∑ k → V D ∫ BZ d D k where V is the D-dimensional volume (D = 2 for 2D systems). Note that the A 2 term does not contribute to the linear Raman response and, hence, is absent here. Moreover, we neglect the Coulomb interaction between electrons and holes, i.e. excitonic effects. If the Raman spectroscopy is performed with an excitation frequency that matches the exciton energy 38 , the electron-hole interactions should be included, ideally within the GW and BSE framework [41][42][43] .
We now insert the Hamiltonians given in Eqs. (6)- (8) in the third-order perturbation rate, Eq. (1). As mentioned in the main text, for the Stokes processes involving one phonon, six permutations of (ω in , − ω out , 0) are used for (ω 1 , ω 2 , ω 3 ). Furthermore, the eigenstates of the unperturbed Hamiltonian Ψ a j i (or Ψ b j i) can be written as ψ e n ν 0 j i, where ψ e and n ν 0 j i are the many-body electronic and phononic states, respectively (the index e runs only over many-body electronic states). SinceĤ eν andĤ eγ are, respectively, linear in and independent of the phononic operator, the phonon state n ν 0 j icontributes to the Stokes response only if n ν 0 j i is either n ν j i or n ν ± 1 j i . Consequently, hΨ a jĤ eγ jΨ i i ¼ hψ e jĤ eγ j0iδ ν 0 ν δ n ν 0 n ν and hΨ a jĤ eν jΨ i i ¼ hψ e jĤ eν j0iδ ν 0 ν δ n ν 0 ðn ν ± 1Þ (δ ij denotes the Kronecker delta). The total Raman intensity I is obtained by summing over all final states, i.e. phonon modes, and given by I(ω) = I 0 ∑ ν (n ν + 1)|P ν | 2 δ(ω − ω ν )/ω ν , where P ν is defined as Here, E e=d denote the electronic energies (with respect to the electronic ground state), the summations over e and d include only electronic states, andP andĜ ν are many-body electronic operators given byP P nmk p nmkĉ y nkĉmk and G ν P nmk g ν0 nmkĉ y nkĉmk . Note that the momentum conservation implies that only phonons at q = 0 contribute to the response here 19 , i.e. ω ν ≡ ω ν0 . Since bothP and G ν are bi-linear in the electronic operator, for a non-vanishing matrix elements, jψ e=d i must include singly-excited states, i.e. terms in the formĉ y ckĉvk 0 j i (indices c and v imply conduction and valence bands, respectively) 50 . Excitonic effects can readily be introduced at this stage by incorporating the BSE solution 45,50 . However, we neglect the excitonic effects in the present work, and hence, each singly-excited state contributes individually to the response, i.e. jψ e=d i ¼ĉ y ckĉvk j0i with an energy of E e=d ¼ ε ck À ε vk . At finite temperature, the expression for jψ e=d i should be taken as jψ e=d i ¼ f i ð1 À f j Þĉ y jkĉik j0i, where f i ð1 þ exp½ðε ik À μÞ=k B TÞ À1 is the Fermi-Dirac distribution with chemical potential μ.
Rewriting P ν in terms of the single-particle variables and polarization vectors leads to Eq. (2) for the Raman intensity, where the Raman tensor component, R ν αβ , reads Here, ε ij ≡ ε ik − ε jk , p α ij hikjp α jjki, g ν ij hikj∂ ν0 V KS jjki, and (i, j, m, n)/ν are the electron/phonon band index. The line-shape broadening is accounted for by adding a small phenomenological imaginary part, iη, to the photon frequencies ω in/out → ω in/out + iη. We set the frequency broadening to η = 200 meV in our calculations.
First-principles calculations. All DFT calculations are performed with the projectoraugmented wave code, GPAW 57,58 , in combination with the atomic simulation environment (ASE) 59 . The Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional is used 60 and the Kohn-Sham orbitals are expanded using the double zeta polarized (dzp) basis set 27 . Despite its fairly small size, the dzp basis set provides sufficiently accurate phonon modes. This has been tested by benchmarking the phonon frequencies obtained from this basis set against the results using the commonly-employed plane waves for 700+ monolayers (more than 7000 phonon modes). We confirm that for approximately 80% of all phonons, the discrepancy between the two approaches is less than 5%. Also, the choice of exchange-correlation functional may slightly influence the Raman spectra. For instance, it is known that the PBE functional tends to overestimate the lattice parameters and underestimate the phonon frequencies in crystals 61 , whereas the opposite occurs for the local-density approximation (LDA) functionals. Nonetheless, this choice only slightly influences our calculated Raman spectra, and PBE usually provides sufficiently accurate phonon frequencies in the range of theoretical and experimental uncertainties 62 . The monolayers are placed between two vacuum regions with thicknesses of 15 Å. A convergence test of Raman spectra with respect to the wavevector density is performed for several materials, and a mesh with the density of 25 Å −1 for ground state calculations was chosen. The phonon modes are obtained using the standard approach based on calculating the dynamical matrices in the harmonic approximation 63 . The dynamical matrix is evaluated using the small-displacement method 64 , where the change of forces on a specific atom caused by varying the position of neighboring atoms is computed. Since only the zone-centered (Γ-point) phonons are required, the phonon modes can be computed based on the crystal unit cell. A k-mesh with a density of 12 Å −1 is used for phonon calculations, and the forces are converged within 10 −6 eVÅ −1 . Since the wavefunctions and Kohn-Sham potentials in GPAW are evaluated on a real-space grid 57 , a convergence test with respect to this grid spacing is performed and a real-space grid of 0.2 Å is chosen for calculations. The electron-phonon matrix elements are then obtained within the adiabatic approximation using a finite difference technique for evaluating the derivative of the Kohn-Sham potential 65 . Similarly, the momentum matrix elements are calculated using the finite difference technique and the correction terms due to projector-augmented waves 66 are added. The width of the Fermi-Dirac occupations is set to k B T = 50 meV for faster convergence of the DFT results. For generating the Raman spectra, a Gaussian [GðωÞ ¼ ðσ ffiffiffiffiffi 2π p Þ À1 expðÀω 2 =2σ 2 Þ] with a variance σ = 3 cm −1 is used to replace the Dirac delta function, which accounts for the inhomogeneous broadening of phonon modes. The temperature of the Bose-Einstein distributions is set to 300 K for all calculations except for the results in top panel of Fig. 7, where a temperature of 77 K is used. The calculations are submitted, managed, and received using the simple MyQueue workflow tool 67 , which is a Python front-end to job scheduler.
Experimental Raman spectra. The experimental Raman spectra are extracted from the figures in the corresponding references using a common plot digitizer. To remove the noise in the experimental data, they are filtered using a Savitzky-Golay filter 68 of order three with a filter window length of eleven. For a fair comparison with our theoretical spectra in Fig. 5, we have convolved the experimental spectra with a Gaussian function with variance of 10 cm −1 to reduce the effect of possible but unimportant small frequency shifts between the experimental and theoretical spectra. Furthermore, the Raman moments have been calculated over a frequency range where the main Raman peaks appear, from 350 to 450 cm −1 for MoS 2 and from 75 to 260 cm −1 for WTe 2 . For calculating the Euclidean distance, both the experimental and theoretical spectra are convolved with a Gaussian function with variance of 6 cm −1 .

Data availability
All calculated Raman spectra are freely available online through https://cmrdb.fysik.dtu. dk/c2db/. Other data is available from the corresponding author upon reasonable request.

Code availability
GPAW is an open-source DFT Python code based on the projector-augmented wave method and the ASE, which is available at https://wiki.fysik.dtu.dk/gpaw/. The Raman code used for generating Raman spectra in this work will be available in future releases of the code.