Ab initio determination of crystal stability of di-p-tolyl disulfide

With the rapid growth of energy demand and the depletion of existing energy resources, the new materials with superior performances, low costs and environmental friendliness for energy production and storage are explored. Di-p-tolyl disulfide (p-Tol2S2) is a typical lubricating material, which has been applied in the field of energy storage. The conformational properties and phase transformations of p-Tol2S2 have been studied by pioneers, but their polymorphs and the polymorphism induced crystal structure changes require further analysis. In this study, we perform the crystal structural screening, prediction and optimization of p-Tol2S2 crystal with quantum mechanical calculations, i.e., density functional theory (DFT) and second-order Møller–Plesset perturbation (MP2) methods. A series of crystal structures with different molecular arrangements are generated based on the crystal structure screening. As compared to long-established lattice energy calculation, we take an advantage of using more accurate technique, which is Gibbs free energy calculation. It considers the effects of entropy and temperature to predict the crystal structures and energy landscape. By comparing the Gibbs free energies between predicted and experimental structures, we found that phase α is the most stable structure for p-Tol2S2 crystal at ambient temperature and standard atmospheric pressure. Furthermore, we provide an efficient method to discriminate different polymorphs that are otherwise difficult to be identified based on the Raman/IR spectra. The proposed work enable us to evaluate the quality of various crystal polymorphs rapidly.


Methods
Crystal structure predictions. The credibility of computing methods for crystal structure prediction has been improved over the past years. Here we use the MOLPAK and USPEX tools to determine the crystal forms from the molecular structure. MOLPAK can create unit cells that have the smallest volumes per molecule by allowing surrounding molecules to collapse along the three axes of a Cartesian coordinate system until a predetermined repulsion threshold with the central molecule is reached 32,33 . While the USPEX program, adapting the evolutionary algorithm that can find the stable structures for systems with various rigid molecular shapes 26,34 . The initial structures are generally generated by selected space groups with the initial known structures under atmospheric pressure. Structure relaxation is implemented from low to high precision using the VASP program. All the structures are examined and their initial sorting was performed based on lattice energy or enthalpy after full relaxation. However selected potential structures were resorted according to root-mean-square deviations (RMSDs), given in Tables S1 and S2.

Structure optimization and energy calculation.
After generation of the initial crystal structures using MOLPAK and USPEX, for further calculation, 20 structures from MOLPAK and 18 structures from USPEX were selected based on lower lattice energies and enthalpies, respectively. To minimize the enthalpy, we adopt quasi-Newton algorithm 35 for the crystal structure optimization. Because of the large molecular size and the quantity of molecules in the supercell, the embedded-fragment method 30,31,36 with ωB97XD/6-31G* was used for the calculation of enthalpy and Gibbs free energy of the p-Tol 2 S 2 crystal. The Hessian matrix approximation was updated using the BFGS procedure 37 and the convergence criterion for the maximum gradient was set to 0.001 Hartree/Bohr. The embedded fragment method, as the name implies, is a process that breaks a large system into small fragments. The embedded fragment method divides the total energy per unit cell of the crystal into an appropriate combination of the energies of monomers and dimers, and therefore can treat the macromolecules effectively. The individual molecule is designated as a segment, and the interaction energies between two segments, in close contact with each other, are calculated by quantum mechanics (QM). The interaction energy between the two distal segments is calculated according to Coulomb interactions. The DFT calculations, accounting for the environmental influences, are embedded in an electrostatic field represented by the point charges of the remaining system.
The internal energy of a unit cell for the molecular crystal system is calculated by: www.nature.com/scientificreports/ where the three-integer index of unit cell is represented by a variable 'n'. The quantum mechanical (QM) energy of the ith molecules in the nth unit cell is represented by E i(n) . The QM energy of dimers is represented by E i(0)j(n) , where ith and jth represent number of molecules with respect to 0th central unit cell and nth unit cells 29,38-41 . The crystal system is represented by a 3 × 3 × 3 supercell. The 1st part of Eq. (1) calculates all the single molecular energies in the 0th unit cell, which is in the center. The 2nd part represents the two-body QM interaction that has a shorter distance than (where is a given cutoff distance which is set to 4 Å in this work). The 3rd part of Eq. (1) provides the interactions between one molecule in the central unit cell and the other in nth unit cell whose distance is shorter than . The QM is used to calculate short range interactions (the first three parts in Eq. (1)). It is calculated in electrostatic field of the rest, where ωB97XD/6-31G* level is used to fit electrostatic potential variations. The background charges are represented by the 11 × 11 × 11 supercell. The Coulomb's charge-charge interaction is employed to approximately treat long-range interactions between two molecules of dimers whose distance is larger than . The long-range electrostatic interactions are represented by E LR in a 41 × 41 × 41 supercell.
The enthalpy H e for each unit cell can be calculated by considering the effect of external pressure as follow: where the external pressure is represented by P and the unit cell volume is represented by V.
The harmonic approximation is used to calculate zero-point vibrational energy U v and the entropy S v , which are shown in Eqs. (3) and (4), respectively, where the frequency of phonon with lattice vector k is represented by ω nk . β = 1/k 0 T and k 0 is the Boltzmann constant. The capital K is the product of all k, which are evenly spaced grid points in the reciprocal unit cell. The k-grid of 21 × 21 × 21 has been used in this study, where K = 9261. The calculation of Gibbs free energy with effects of temperature and pressure in a unit cell is given by Eq. (5). As MP2/6-31G* is more accurate than DFT/ωB97XD/6-31G* but its computational time and cost are very high. Therefore, to be more efficient and effective, we optimized crystal structure as well as calculated enthalpy (H e ) over single point energy, zero-point vibrational energy ( U v ), and entropy ( S v ) at DFT/ωB97XD/6-31G* level. Moreover, to obtain better accuracy, MP2/6-31G* level was used to recalculate enthalpy (H e ) only, over single point energy based on crystal structure calculated by DFT/ ωB97XD/6-31G*.
Raman spectra simulation. For a periodic molecular system, the dynamical force constant matrix is calculated as follow: where the masses of atoms p and q are represented by m p and m q , respectively.
The k is a Brillouin zone point which is known. The 2nd order derivative of the total energy in a unit cell which correspond to atom p and q in the 0th and the nth cell, is represented by H(r p,0 , r q,n ) in Eq. (6) at geometry equilibrium. The number of nearby unit cells for QM treatment was truncated at R = 1, and the number of k-points was set to 21 in each of the three dimensions. The vibrational frequencies and the corresponding normal modes of the periodic molecular system can be obtained by solving the dynamical matrix D r p , r q , k . The zone-center (k = 0) vibrations presenting nonzero intensities results in the activation of IR-and Raman spectra. The forceconstant matrix D(0) was used to calculate the vibrational frequency. Therefore, we can obtain the IR intensity (I n0 ) and Raman intensity (R n0 ) for the fundamental transition of the mode in the nth phonon branch with wave vector k = 0, demonstrated by the following derivatives: www.nature.com/scientificreports/ The corresponding normal mode is represented by Q n0 . The dipole moment derivative ∂µ i /∂Q n0 and the polarizability derivative ∂α ij /∂Q n0 in the central unit cell can be derived with the embedded-fragment quantum mechanical method.
Crystal structure predictions. The crystal structure prediction was performed by the MOLPAK and USPEX programs. From MOLPAK, we obtained 31 standard clusters of crystal structures with common space groups of (P2 1 2 1 2 1 , P 1 , P2 1 /c, Cc, C2/c, P2 1, Pna2 1 and Pbca). Each cluster contains 300+ crystal structures. Therefore, MOLPAK generated total crystal structures more than 9000. From these 9000+ crystal structures, only 20 crystal structures were selected according to lower lattice energies, and these selected structures were further optimized. All the selected 20 structures from MOLPAK have same lattice energy of − 16.91 kcal/mol, given in Table S1. We obtained same lattice energy for 20 selected structures because of two main reasons: first, these structures are similar to each other with minor difference in their lattice parameters, and second, the MOL-PAK software does not calculate lattice energy very accurately. Therefore, to obtain accurate energy landscape, we optimized selected structures and calculated their Gibbs free energy at the QM level.
On the other hand, USPEX generated 20 generations with each generation generated about 20 crystal structures. Therefore, the total crystal structures generated by USPEX were about 400. From USPEX predictions, only 18 crystal structures were selected based on enthalpy sorting. After selection of potential candidates we sorted them again, according to RMSDs, given in Table S2. The lattice parameters of phase α, phase β and the 38 predicted new crystal structures by MOLPAK and USPEX are given in Tables S1 and S2 as Supporting Information.
The Gibbs free energy. The stability of 40 crystal structures are compared in Fig. 2, including phase α, phase β and 38 predicted new structures by MOLPAK and USPEX. Figure 2 shows the difference of Gibbs free energy per unit cell among 38 predicted structures, phase α and phase β of the p-Tol 2 S 2 as a function of unit cell RMSD. In Fig. 2, the purple and green circles represent the 38 predicted structures by MOLPAK and USPEX, whereas the green square and red diamond denote the experimental Gibbs free energies of phases α and β,  The values of RMSD vary from 1 to 3.5 Å, which suggest that the crystal structure prediction method is reliable. Moreover, the lowest value of RMSD is smaller than 2 Å demonstrating the accuracy of prediction. The correlation between RMSD and Gibbs free energy (as shown in Fig. 2) reveals that the free energy of phase α, locating at the bottom of the lattice energy landscape, is the most stable structure among all predicted and experimental structures. Furthermore, the selected 20 structures from MOLPAK were possessing same lattice energies, given in Table S1. However, in Fig. 2 Gibbs free energy varies from about 1.5 to 8 kcal/mol, which is evidence of accuracy of Gibbs free energy calculation.
To verify the rationality of predicted structures, we selected eight candidates from 38 predicted structures by USPEX and MOLPAK programs to visualize overlay only. Figures S1 and S2 in the Supporting Information present the similarity of crystal structures and the molecular conformations via the calculations of the structural RMSDs, phase α and the predicted structures based on the RDKIT tool 44 . We selected a subset of 8 structures from 38 structures based on difference in RMSDs. The green molecules represent the structures of p-Tol 2 S 2 phase α determined by X-ray diffraction and the red molecules denote the predicted structures based on USPEX and MOLPAK programs. The molecular overlay shows the reliability of the predicted structures.
Vibrational spectra. The crystal structure of a molecule can be identified by vibrational spectroscopy 45,46 .
The calculated full range of Raman spectra are provided in Figs. S3 and S4 of the Supporting Information. For the purpose of highlighting the differences of phase α and phase β, we select parts of the calculated Raman and IR spectra of p-Tol 2 S 2 crystals. The Raman and IR spectra are shown in Figs. 3 and 4. In Fig. 3, the Raman spectra are calculated from 700 to 1200 cm −1 , where phase α (green curve) and phase β (red curve) are predicted to display seven and eight discernible intense Raman bands, respectively. The band 3*, locating at ~ 874 cm -1 , in the red curve is the characteristic peak for phase β. Similarly, there are five and four IR bands for phase α (green curve) and phase β (red curve), respectively. The band 2*, locating at ~ 3220 cm −1 , in the green curve is the characteristic peak for phase α. Therefore, the band 3* in the Raman spectra and the band 2* in the IR spectra, are the characteristic bands to distinguish phases α and β of p-Tol 2 S 2 . Figures 3 and 4 show that although the structures of phases α and β are very similar (as shown in Fig. 1), they can be identified by the characteristic bands in the vibrational spectra. The vibrational spectra of different pressure about phase α and phase β are demonstrated in Figs. S5-S8 in the Supporting Information. We can observe the vibrational spectra as a function of different pressures.

Conclusions
It is highly desirable to control polymorph for all the manufacturing industries which manufacture and/or utilize crystalline products. This will enable us to enhance functionality of the products to next higher level. A high precision method to screen different polymorphs for designing of crystals and to find out the most stable structure that confers an optimal product would, therefore, be extremely valuable. In this paper, we carried out a theoretical approach to screen different polymorphs of p-Tol 2 S 2 , a lubricating and energy storage material, based on the DFT and MP2 methods with the embedded fragment approach. With the theoretical calculation, we pick out the most stable structure phase α from the predicted and experimental structures. Considering the small RMSDs, there will be hundreds of predicted structures between them and the conformation of existing phases www.nature.com/scientificreports/ α and β, the proposed method has successfully performed the predictions for sufficiently similar conformations along with the identification of the most stable structure. The predicted vibrational spectra provide an effective way to identify polymorphs with the characteristic peaks of IR and Raman spectra.