Abstract
Real-time time-dependent density-functional theory (RT-TDDFT) and linear response time-dependent density-functional theory (LR-TDDFT) are two important approaches to simulate electronic spectra. However, the basis sets used in such calculations are usually the ones designed mainly for electronic ground state calculations. In this work, we propose a systematic and robust scheme to truncate the atomic orbital (AO) basis set employed in TDDFT and TD Hartree–Fock (TDHF) calculations. The truncated bases are tested for both LR- and RT-TDDFT as well as RT-TDHF approaches, and provide an acceleration of up to an order of magnitude while the shifts of the excitation energies of interest are generally within 0.2 eV. The procedure only requires one extra RT calculation with 1% of the total propagation time and a simple modification of the basis set file, which allows an instant application in any quantum chemistry package supporting RT-/LR-TDDFT calculations. Aside from the reduced computational effort, this approach also offers valuable insight into the effect of different basis functions on computed electronic excitations and further ideas on the design of basis sets for special purposes.
Introduction
Electronically excited states and their properties are among the central topics of quantum chemistry research. The theoretical methods used for excited state calculations typically require equivalent or higher computational resources compared to analogous ground state calculations. Highly accurate multiconfigurational methods are computationally demanding and thus can only be applied to small systems.
Time-dependent density-functional theory (TDDFT), due to its good compromise between accuracy and efficiency, has been employed in a wide range of applications, especially for spectroscopy^{1,2,3}. Real-time propagation (RTP) has become an appealing technique for solving the time-dependent Kohn–Sham equations, namely, real-time time-dependent density-functional theory (RT-TDDFT), or general approximations to the time-dependent Schrödinger equation^{4,5,6,7,8,9}. It is based on the evolution of molecular orbitals under the influence of an external field, often with only a δ-pulse (field) applied at the beginning. In the weak field limit within the adiabatic approximation, the spectroscopy simulations using RT-TDDFT and linear response time-dependent density-functional theory (LR-TDDFT) should provide comparable results^{4}. During each time step, one needs to construct the Hamiltonian given by the new molecular orbital (MO) coefficients, which is the most time-consuming part of the RTP. Depending on the quantum chemistry method employed in RTP, the construction of the Hamiltonian may scale as \(\mathcal{O}(N^{4})\), e.g., for HF Coulomb and exchange matrices calculated with 2-electron integrals, where N refers to the number of AO basis functions. Therefore, reducing the number of basis functions or finding a proper smaller basis set can potentially save a large amount of computational time and memory.
Previous studies on the topic of basis set truncation/reduction have mainly followed three strategies: (1) decreasing the size of the virtual space for frozen natural orbital approximations used in perturbation-based methods^{10,11} (e.g., Møller–Plesset perturbation theory, coupled cluster single-double and perturbative triple, complete active space perturbation theory), (2) reducing the number of functions in correlation consistent basis sets^{12,13}, (3) reducing the number of basis functions of subsystems (which apply expensive wavefunction-based methods) for embedding calculations^{14,15}. However, these works focus on the electronic ground state. Multiple embedding techniques have been applied to accelerate RT-TDDFT calculations by treating subsystems with different levels of theory^{16,17,18,19,20}. The idea of decomposing the electric dipole moment into molecular orbital pairs was also proposed in recent works for the acceleration or analysis of spectra^{21,22,23}. This work explores the contribution of a fundamental ingredient, the basis functions, to the electronic spectra. The truncation of basis functions proposed in this work is designed to check every single component in the basis set (basis function). One can also apply a shell-level truncation for general applications. Basis set files can be easily modified for an accelerated simulation of the spectrum and to obtain better chemical insight into the electric dipole moment contributions. Moreover, a routine to construct a complete basis set (CBS) for TDDFT calculations is proposed. The calculations of electronic absorption and ECD spectra in this work take place in a linear response framework within the electric dipole approximation, assuming that the excited states of the system can be well described within the occupied-virtual space spanned by the ground state solution of the system.
As is standard, we assume the adiabatic approximation, discarding the dependence of the exchange-correlation functional on the history of the propagation. Decomposing the electric dipole moment into the contributions of individual AO basis functions and checking the variation of the molecular orbitals (in terms of their basis function components) during the RTP provides a quantitative evaluation of each AO basis function based on its importance for the electronic spectra under study. This further paves the way for a truncation process on the basis set for computational speedup and a way to generate a complete basis set for TDDFT calculations.
In this work, we propose a basis set truncation scheme for TDDFT calculations. The method is tested on systems ranging from small molecules to a highly conjugated system and a metal cluster, and achieves an acceleration of up to an order of magnitude in RT-TDDFT or LR-TDDFT calculations with negligible change in the region of interest (e.g., valence-shell transitions) of the computed spectra.
Results
Electric dipole moment
In the context of this work, the electronic part of the electric dipole moment \(\overrightarrow{d}\) is defined as the trace of the product of the density matrix and the integrals of the electric dipole moment operator \(e\overrightarrow{r}\) (\(\overrightarrow{r}=(x,\, y,\, z)\), e is the elementary charge) in the AO basis with basis functions \(\{\chi_{\mu}\}\). For calculating the time-dependent electric dipole moment \(\overrightarrow{d}(t)\), we use the AO basis representation for both the density matrix P^{AO}(t) and the electric dipole moment integrals \(\overrightarrow{\boldsymbol{D}}\) as shown in Eq. (1). In this way, only the density matrix P^{AO}(t) is time-dependent and \(\overrightarrow{\boldsymbol{D}}\) remains the same during the RTP for fixed nuclei. P^{AO}(t) can be further expressed in terms of the density matrix in the molecular orbital (MO) basis, P^{MO} (the MO density matrix after SCF, see Eq. (3), where f_{i} is the occupation number of the i-th MO), so that the time dependence is carried only by the MO coefficients C(t) and their conjugate transpose C^{†}(t) (see Eq. (2)). In this work, the AO basis functions are all Gaussian-type orbitals.
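The trace formula described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the implementation used in this work: the function names `matmul` and `dipole_component` and the 2×2 example matrices are hypothetical, one Cartesian component of the dipole integrals is treated at a time, and any charge or sign prefactor is left out.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dipole_component(p_ao, d_ao):
    """Return Tr[P^AO D] for one Cartesian component of the dipole integrals.

    p_ao: AO density matrix at a given time (entries may be complex).
    d_ao: AO dipole integrals for one component; constant for fixed nuclei.
    """
    product = matmul(p_ao, d_ao)
    return sum(product[mu][mu] for mu in range(len(product)))


# Hypothetical 2x2 values, chosen only to exercise the function.
P = [[1.0, 0.5], [0.5, 1.0]]
D_z = [[0.7, 0.1], [0.1, -0.7]]
d_z = dipole_component(P, D_z)
```

Because only P^{AO}(t) changes during the propagation, the dipole integrals can be computed once and reused at every time step.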
Realtime propagation
In our implementation, the MO coefficients C(t) are propagated for a small time step Δt (see Eq. (4)) using the “enforced time-reversal symmetry” (ETRS)^{24} scheme. U(t + Δt) represents the propagator at time t + Δt and is calculated with Eq. (5), where S is the overlap matrix in the AO basis and F(t) is the Fock matrix or Kohn–Sham (KS) matrix in the AO basis at time t. C(t + Δt), U(t + Δt), and F(t + Δt) are computed self-consistently^{24}.
F(t) needs to be constructed for each time step and usually contributes most to the computational time in RTP. For example, the elements of the HF exchange matrix K_{μν}(t) are given in Eq. (6) (\(\left\langle \mu \lambda | \sigma \nu \right\rangle\) are the two-electron repulsion integrals (ERIs) in the AO basis expressed in Eq. (7)), and the computation of the exchange matrix K(t), which is required for the construction of F(t), scales as \(\mathcal{O}(N_{\rm{AO}}^{4})\) (N_{AO} is the number of AO basis functions).
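To make the quartic scaling concrete, the following sketch contracts a density matrix with a four-index ERI array using four nested loops. It is an illustration only: the index order of `eri`, the absence of any prefactor, and the names `exchange_matrix` and `count_terms` are assumptions of this sketch, since Eqs. (6) and (7) fix the actual convention.

```python
def exchange_matrix(p_ao, eri):
    """Naive build of K[mu][nu] = sum_{lam,sig} P[lam][sig] * eri[mu][lam][sig][nu].

    The index order of `eri` is an assumption made for this sketch.
    Four nested loops over N_AO indices give the O(N_AO^4) cost.
    """
    n = len(p_ao)
    k = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            total = 0.0
            for lam in range(n):
                for sig in range(n):
                    total += p_ao[lam][sig] * eri[mu][lam][sig][nu]
            k[mu][nu] = total
    return k


def count_terms(n_ao):
    """Number of multiply-add terms in the naive build above."""
    return n_ao ** 4


# Hypothetical toy data: all integrals set to 1 so the result is easy to check.
n = 2
eri_ones = [[[[1.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
K = exchange_matrix([[1.0, 2.0], [3.0, 4.0]], eri_ones)
```

Halving the number of basis functions reduces `count_terms` by a factor of 16, which is the motivation for decreasing N_{AO}.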
AO basis truncation
In order to decrease N_{AO}, we first analyse Eq. (1) for the electric dipole contribution from each AO basis function. For the sake of simplicity, \(\overrightarrow{O}_{\mu}(t)\) is used to represent the μ-th diagonal element of \(\boldsymbol{P}^{\rm{AO}}(t)\overrightarrow{\boldsymbol{D}}\), and thus \(\overrightarrow{d}(t)\) can be rewritten as in Eq. (8). Taking a detailed look at the construction of \(\overrightarrow{O}_{\mu}(t)\) in Eq. (9), one finds that it decomposes the electric dipole moment into contributions of the individual basis functions. Therefore, we use \(\overrightarrow{O}_{\mu}(t)\) to represent the electric dipole contribution from the μ-th basis function.
However, \(\overrightarrow{O}_{\mu}(t)\) is not translationally invariant because the value of \(\overrightarrow{D}_{\mu\nu}\) (an element of \(\overrightarrow{\boldsymbol{D}}\)) depends on the choice of the reference point \(\overrightarrow{R}\) (see Eq. (10)). Note that \(\overrightarrow{r}\) and \(\overrightarrow{R}\) are referenced to the origin of the coordinate system. Though \(\overrightarrow{R}\) does not affect the full spectrum after Fourier transform (because \(\overrightarrow{d}\) is translationally invariant for neutral systems, as \(\overrightarrow{R}S_{\mu\nu}\) cancels with the nuclear electric dipole contribution), it can change the relative contribution of electric dipole moments from each AO basis function (\(\overrightarrow{O}_{\mu}(t)\)). We can further split \(\overrightarrow{D}_{\mu\nu}\) into a reference point (\(\overrightarrow{R}\))-independent term \(\langle \chi_{\mu} | \overrightarrow{r} | \chi_{\nu} \rangle\) and a reference point-dependent term \(\overrightarrow{R}S_{\mu\nu}\), where S_{μν} is the element of the overlap matrix in the AO basis.
In \(\langle \chi_{\mu} | \overrightarrow{r} | \chi_{\nu} \rangle\), the relative position of atoms can cause different values of elements in the matrix, which we would like to avoid. To explain the reason, consider a toy system consisting of only two hydrogen atoms with Cartesian coordinates H1 \(\overrightarrow{r}_{1}\) = (0, 0, a) and H2 \(\overrightarrow{r}_{2}\) = (0, 0, −a) where a ≠ 0. The diagonal matrix elements, for example, differ, \(\langle \chi_{{\rm{H1}}_{s}} | \overrightarrow{r} | \chi_{{\rm{H1}}_{s}} \rangle \,\ne\, \langle \chi_{{\rm{H2}}_{s}} | \overrightarrow{r} | \chi_{{\rm{H2}}_{s}} \rangle\) (see Eq. (11)), even though, by symmetry, we expect the same “contribution” of electric dipole from the two atoms. In Eq. (11), each H atom has a Slater-type 1s orbital Ae^{−ζr} where A is the normalization constant and r is the distance from the center of the atom, and we change the integration variable from \(\overrightarrow{r}\) to \(\overrightarrow{s}\) using \(\overrightarrow{s}=\overrightarrow{r}-\overrightarrow{r}_{1}\) and \(\overrightarrow{s}=\overrightarrow{r}-\overrightarrow{r}_{2}\).
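The asymmetry can be made explicit with a short worked step. The following is a sketch under the assumption that each 1s function is normalized and spherically symmetric about its own center \(\overrightarrow{r}_{k}\) (k = 1, 2); it illustrates the argument and is not a reproduction of Eq. (11).

```latex
\begin{aligned}
\langle \chi_{{\rm H}k_{s}} | \vec{r} | \chi_{{\rm H}k_{s}} \rangle
 &= \int A^{2} e^{-2\zeta |\vec{r}-\vec{r}_{k}|}\, \vec{r}\; \mathrm{d}^{3}r \\
 &= \int A^{2} e^{-2\zeta s}\, (\vec{s}+\vec{r}_{k})\; \mathrm{d}^{3}s \\
 &= \underbrace{\int A^{2} e^{-2\zeta s}\, \vec{s}\; \mathrm{d}^{3}s}_{=\,0\ \text{(odd integrand)}}
  + \vec{r}_{k}\, \underbrace{\int A^{2} e^{-2\zeta s}\; \mathrm{d}^{3}s}_{=\,1\ \text{(normalization)}}
  = \vec{r}_{k} .
\end{aligned}
```

Under these assumptions each diagonal element simply returns the position of its own atom, so two atoms at different positions give different values even when they are equivalent by symmetry.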
One way of minimizing the effect of \(\langle \chi_{\mu} | \overrightarrow{r} | \chi_{\nu} \rangle\) is to shift the molecular system far from (0, 0, 0), which is equivalent to setting a large \(\overrightarrow{R}\). It is worth noting that we do not need to formally “move” the molecule; this is just an assumption made in the derivation from Eq. (9) to Eq. (12). In this method, we only care about the relative value of \(\overrightarrow{O}_{\mu}(t)\) when determining the basis function(s) to be truncated, and \(\overrightarrow{R}\) provides the same factor to \(\overrightarrow{D}_{\mu\nu}\) and later \(\overrightarrow{O}_{\mu}(t)\). Therefore, it is safe to substitute \(\overrightarrow{D}_{\mu\nu}\) with S_{μν} in the expression of \(\overrightarrow{O}_{\mu}(t)\), and thus we have a scalar O_{μ}(t) as shown in Eq. (12). Note that O_{μ}(t) does not explicitly contain any electric dipole information, which makes sense because the electric dipole cannot be formally defined on a single atom-centered orbital. However, AOs do contribute to the electric dipole by forming MOs across different atomic centers, and the information on this contribution, which we call the density matrix contribution, is contained in P^{AO}(t). The expression in Eq. (12) is also known from Mulliken population analysis as the number of electrons associated with \(\chi_{\mu}\). The following computations are all based on the scalar form O_{μ}(t).
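As a small illustration of the scalar quantity described above, the sketch below evaluates the diagonal of P^{AO}S for a closed-shell H_{2} molecule in a minimal basis. The function name `mulliken_contributions`, the overlap value 0.6, and the single doubly occupied bonding orbital are assumptions made for this example only.

```python
import math


def mulliken_contributions(p_ao, s_ao):
    """Return O_mu = sum_nu P[mu][nu] * S[nu][mu], the diagonal of P^AO S."""
    n = len(p_ao)
    return [sum(p_ao[mu][nu] * s_ao[nu][mu] for nu in range(n))
            for mu in range(n)]


# Hypothetical closed-shell H2 in a minimal basis (one s function per atom).
s12 = 0.6                                   # assumed overlap between the two s functions
S = [[1.0, s12], [s12, 1.0]]
c = 1.0 / math.sqrt(2.0 * (1.0 + s12))      # normalized bonding MO coefficient
p = 2.0 * c * c                             # doubly occupied MO
P = [[p, p], [p, p]]

O = mulliken_contributions(P, S)            # one electron per basis function
```

The entries of `O` sum to the number of electrons, consistent with the Mulliken interpretation mentioned in the text.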
To quantify the contribution, we use the formula in Eq. (13). \(x_{\mu}^{\rm{DC}}\) is an indicator measuring the variation of the Density matrix Contribution (DC) of the μ-th AO basis function. S_{t}[O_{μ}(t)] computes the standard deviation of O_{μ}(t) over the total simulation time for each μ. The numerator of Eq. (13) indicates the variation (along the RTP) of the electric dipole moment contribution from each AO basis function. The dimensionless quantity \(x_{\mu}^{\rm{DC}}\) is then constructed by dividing the numerator by its mean value over all AO basis functions. A small \(x_{\mu}^{\rm{DC}}\) value means that the change of the electric dipole moment contributed by the μ-th basis function is comparatively small among all AO basis functions, and removing this basis function should not change the spectrum significantly (a constant value vanishes after Fourier transform for RT-TDHF/TDDFT). One might point out that \(x_{\mu}^{\rm{DC}}\) cannot distinguish pulses from different directions because O_{μ}(t) in Eq. (12) is no longer direction dependent as in Eq. (8). However, it is found that P^{AO}(t) still varies according to the direction of the pulse because its action is encoded in the MO coefficients.
It is worth noting that we have considered applying a basis transformation to \(\boldsymbol{P}^{\rm{AO}}(t)\overrightarrow{\boldsymbol{D}}\), namely using the eigenvectors of P^{AO}(t) (transforming to the natural orbital basis) or of \(\overrightarrow{\boldsymbol{D}}\). However, the former is time-dependent, so it is hard to choose a transformation matrix for all time steps, and the latter is reference point-dependent (see Eq. (10)), so one cannot obtain a consistent truncation choice under translation (note that nuclei do not move in this study as opposed to, e.g., Ehrenfest dynamics). Also, an AO basis is a common choice in most molecular simulations and some solid state simulations (e.g., the Gaussian and Plane Waves^{25} method in the CP2K package (CP2K version 7.0 (Development Version), the CP2K developers group. CP2K is freely available from https://www.cp2k.org/.)). Therefore, truncation of the AO basis has broad application prospects and can be easily applied by a simple modification of the basis set file.
During practical tests of basis truncation, we observed that using only \(x_{\mu}^{\rm{DC}}\) as an indicator is not enough for obtaining an accurate spectrum. Another indicator \(x_{\mu}^{\rm{IP}}\) is introduced, which measures the Importance of Propagation stability (IP) of the μ-th AO basis function (see Eq. (14)). C_{μj}(t) denotes an element of the transformation matrix (from AO to MO basis), S_{t} computes the standard deviation along time for each μ and j, and \(\sum_{j}^{N_{\rm{MO}}}\) sums over all standard deviations in MOs originating from the μ-th AO basis function. The numerator of Eq. (14) indicates the variation (along the RTP) of the contribution from each AO basis function to all MOs in the transformation matrix C(t). As for \(x_{\mu}^{\rm{DC}}\), the dimensionless quantity \(x_{\mu}^{\rm{IP}}\) is constructed by dividing the numerator by its mean value over all AO basis functions. A small \(x_{\mu}^{\rm{IP}}\) value means that the contributions to MOs from the μ-th basis function do not change much compared to the contributions of all AO basis functions, and removing this basis function should not significantly affect the propagation of the remaining part of the density matrix. Note that both O_{μ}(t) and C_{μj}(t) are usually complex numbers for RTP, and the standard deviation of a set of complex numbers is calculated as in Supplementary Eq. (1).
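The two indicators can be sketched as follows. This is a hedged illustration rather than the implementation used in this work: the function names are hypothetical, the input layout (one time series per basis function, or per basis function and MO) is chosen for convenience, and the standard deviation of complex numbers is assumed to be the population form sqrt(mean(|z − mean(z)|²)), which may differ in detail from Supplementary Eq. (1).

```python
import math


def complex_std(values):
    """Population standard deviation sqrt(mean(|z - mean(z)|^2)) (assumed definition)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum(abs(z - mean) ** 2 for z in values) / len(values))


def normalize_by_mean(raw):
    """Divide each entry by the mean over all AO basis functions.

    Assumes that at least one entry is non-zero.
    """
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


def indicator_dc(o_series):
    """o_series[mu] is the time series O_mu(t); returns x^DC for every mu."""
    return normalize_by_mean([complex_std(series) for series in o_series])


def indicator_ip(c_series):
    """c_series[mu][j] is the time series C_mu_j(t); returns x^IP for every mu."""
    raw = [sum(complex_std(series) for series in per_mo) for per_mo in c_series]
    return normalize_by_mean(raw)


# Hypothetical time series for two basis functions (and two MOs for x^IP).
x_dc = indicator_dc([[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]])
x_ip = indicator_ip([[[1j, -1j], [0.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]])
```

In this toy input the first basis function has a constant O_{μ}(t) and therefore x^{DC} = 0, while the second basis function has constant MO coefficients and therefore x^{IP} = 0.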
In practice, an empirical parameter x^{thr} is chosen as the threshold for both \(x_{\mu}^{\rm{DC}}\) and \(x_{\mu}^{\rm{IP}}\), where the AO basis functions with both indicators smaller than x^{thr} can be removed. The remaining basis set \(\{\chi_{\mu}\}_{\rm{trunc}}\) (truncated AO basis set) is then defined as in Eq. (15), given the original AO basis set {χ_{μ}} (of which the cardinality \(|\{\chi_{\mu}\}|\) is N_{AO}). Sometimes \(\{\chi_{\mu}\}_{\rm{trunc}}\) includes only part of a given shell, e.g., for a p-shell, only \(\chi^{p_{x}}\) and \(\chi^{p_{y}}\) are in \(\{\chi_{\mu}\}_{\rm{trunc}}\) and \(\chi^{p_{z}}\) is truncated. Such symmetry breaking is mainly due to the use of a polarized field (δ-pulse) in the RT-TDDFT calculations, and rotational invariance requires a shell-level truncation. So that the truncated basis set can be used in any computational chemistry package, we also recommend shell-level truncation for the general application of the basis set file. In most cases, the majority rule can be applied for a truncation at the shell level, namely, the shells containing more than half of their original basis functions remain in \(\{\chi_{\mu}\}_{\rm{trunc}}\) while the others are fully discarded. This has to be checked against the \(x_{\mu}^{\rm{DC}}\) indicators of the basis functions in the same shell to ensure that no strong contribution to electric dipole transitions arises from any of the discarded basis functions. In this study, the basis sets of the (S)-methyloxirane, (−)-α-pinene, ZnPc, and Ag_{20} systems are truncated at the shell level.
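The selection criterion and the majority rule can be sketched as below. The function names, the dictionary of shell labels, and the indicator values are hypothetical and serve only to illustrate the logic described in the text; the additional check against the x^{DC} values within a shell is left to the user.

```python
def select_kept(x_dc, x_ip, x_thr):
    """Indices of basis functions kept: a function is removed only if BOTH
    indicators are smaller than x_thr."""
    return [mu for mu in range(len(x_dc))
            if not (x_dc[mu] < x_thr and x_ip[mu] < x_thr)]


def shell_level_truncation(shells, kept):
    """Apply the majority rule at the shell level.

    shells: dict mapping a (hypothetical) shell label to the list of basis
            function indices belonging to that shell.
    kept:   indices kept by the function-level criterion.
    Returns the labels of shells in which more than half of the functions are kept.
    """
    kept_set = set(kept)
    result = []
    for label, members in shells.items():
        n_kept = sum(1 for mu in members if mu in kept_set)
        if 2 * n_kept > len(members):
            result.append(label)
    return result


# Hypothetical indicator values for one s function (index 0) and one p shell (1-3).
x_dc = [1.5, 0.05, 0.02, 0.3]
x_ip = [1.2, 0.08, 0.5, 0.01]
kept = select_kept(x_dc, x_ip, x_thr=0.1)            # only index 1 is removed
shells = {"s": [0], "p": [1, 2, 3]}
kept_shells = shell_level_truncation(shells, kept)   # p keeps 2 of 3 functions
```

Requiring both indicators to fall below the threshold is the conservative choice: a function that matters for either the dipole contribution or the propagation stability is retained.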
The schematic view of the truncation process is shown in Fig. 1.
Using \(\{\chi_{\mu}\}_{\rm{trunc}}\), namely, reducing the number of basis functions from N_{AO} to \(N_{\rm{trunc}}=|\{\chi_{\mu}\}_{\rm{trunc}}|\), can ideally decrease the total computational time to a fraction \((N_{\rm{trunc}}/N_{\rm{AO}})^{4}\) of the original for an RT-TDHF calculation or an RT-TDDFT calculation with a hybrid exchange-correlation functional. Also, this truncated basis set can be transferred to LR-TDHF/TDDFT calculations.
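The ideal cost ratio is simple arithmetic, sketched here for convenience. The helper name `ideal_cost_fraction` is hypothetical, and the numbers 91 and 148 are the basis set sizes quoted for the H_{2}O dimer example in this work; real timings also depend on parts of the calculation that do not scale quartically.

```python
def ideal_cost_fraction(n_trunc, n_ao, power=4):
    """Ideal fraction of the original cost, (N_trunc / N_AO) ** power."""
    return (n_trunc / n_ao) ** power


fraction = ideal_cost_fraction(91, 148)   # roughly 0.14
speedup = 1.0 / fraction                  # roughly a factor of 7
```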
The procedure for carrying out RT-TDHF/TDDFT calculations with a truncated AO basis set for the examples studied in this work is as follows:

1. Run 100 (400) steps (1% of the total simulation time) of an RT-TDHF/RT-TDDFT simulation with a time step of 0.2 (0.05) atomic units using a preliminarily chosen basis set \(\{\chi_{\mu}\}\), and collect S, P^{AO}(t), and C(t) at every step.

2. Calculate \(x_{\mu}^{\rm{DC}}\) and \(x_{\mu}^{\rm{IP}}\) via Eqs. (13) and (14), respectively, and select the truncated AO basis set \(\{\chi_{\mu}\}_{\rm{trunc}}\) based on the criteria in Eq. (15).

3. Run 10,000 (40,000) steps of the full RT-TDHF/TDDFT simulation with the same time step using \(\{\chi_{\mu}\}_{\rm{trunc}}\). Note that a ground state SCF calculation should be carried out with the truncated basis set before the RT-TDHF/RT-TDDFT simulation in order to apply the perturbation to a converged ground state.
It is worth noting that the truncation procedure by construction eliminates the transitions that are not (or very weakly) electric dipole allowed, and thus this approach focuses more on the overall spectrum rather than the types of transitions.
Complete basis set limit
In addition to the analyses of truncated AO basis functions, we introduce an algorithm to construct basis sets towards the CBS limit for RT(LR)-TDHF/TDDFT calculations.
The idea of the CBS limit employed here is to add diffuse functions (see the examples of (S)-methyloxirane and (−)-α-pinene for the reason) to all types of AO basis functions (s, p, d, f, g, ...) representing different orbital angular momenta l. These functions are added in an even-tempered manner^{26,27} by a geometric progression of the orbital exponents in the original basis set: \(\alpha_{l,k}=\alpha_{l}\beta_{l}^{k},\,\forall k\in \mathbb{N}\). α_{l,k} is the exponent of the l-shell obtained with the k-th power of β_{l}, and α_{l} and β_{l} are two parameters to be optimized for the basis set. Since most basis sets available (Pople^{28}, Dunning^{29}, Jensen^{30}, Ahlrichs^{31,32}, etc.) provide more than one exponent for each type of shell, we can directly extrapolate from these values to get the additional exponent \(\alpha_{l,k+1}=\alpha_{l,k}^{2}/\alpha_{l,k-1}\). One can increase the k value until significant linear dependencies are found in the basis set (sometimes also referred to as basis set overcompleteness^{33}).
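The extrapolation rule can be sketched as follows. The function names and the example exponents are hypothetical (they are not taken from any published basis set), and the list of exponents is assumed to be ordered from tight to diffuse so that the last two entries are the most diffuse ones.

```python
def next_diffuse_exponent(alpha_k, alpha_k_minus_1):
    """Extrapolate alpha_{k+1} = alpha_k**2 / alpha_{k-1}."""
    return alpha_k ** 2 / alpha_k_minus_1


def extend_shell(exponents, n_extra):
    """Append n_extra extrapolated exponents to a list ordered from tight to diffuse."""
    extended = list(exponents)
    for _ in range(n_extra):
        extended.append(next_diffuse_exponent(extended[-1], extended[-2]))
    return extended


# Hypothetical exponents of one angular momentum type.
s_exponents = extend_shell([0.4, 0.1], 2)   # ratio 1/4 is continued: 0.025, 0.00625
```

The rule keeps the ratio of the two most diffuse exponents constant, which is the geometric progression of an even-tempered series.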
This CBS scheme usually requires a quite large basis set for the calculation, and it is usually unclear which basis function(s) should be removed once overcompleteness is reached. Therefore, we combine it with AO truncation and propose an “Add-While-Truncate” algorithm (see Algo. 1) to construct the CBS specifically designed for RT(LR)-TDHF/TDDFT calculations. First, a preliminarily chosen basis set \(\{\chi_{\mu}\}\) is used for a short RT-TDHF/TDDFT calculation and \(\{\chi_{\mu}\}_{\rm{trunc}}\) is selected. An additional basis set containing diffuse functions (\(\{\chi_{\mu}\}_{\rm{diffuse}}\)) is constructed in an even-tempered manner. \(\{\chi_{\mu}\}_{\rm{diffuse}}\) may contain some basis functions truncated in previous steps (collected as \(\{\chi_{\mu}\}_{\rm{deleted}}\)), which should be removed. Then \(\{\chi_{\mu}\}_{\rm{diffuse}}\) is combined with \(\{\chi_{\mu}\}_{\rm{trunc}}\) to form a new basis set \(\{\chi_{\mu}\}\). In order to check the overcompleteness of the newly created AO basis set, we calculate the overlap matrix \(\boldsymbol{S}_{\{\chi_{\mu}\}}\) and solve for its eigenvalues λ. If the minimal absolute eigenvalue ∣λ∣_{min} is smaller than a user-defined small value ϵ or \(\{\chi_{\mu}\}\) remains the same as in the last cycle (namely, basis functions are neither truncated nor added), \(\{\chi_{\mu}\}\) is regarded as the CBS under such ϵ-condition (\(\{\chi_{\mu}\}_{{\rm{CBS}}-\epsilon}\)); otherwise the new \(\{\chi_{\mu}\}\) is used to repeat the previous steps until the final condition is fulfilled. In practice, one can also manually remove some newly added diffuse functions within \(\{\chi_{\mu}\}\) in the iteration to satisfy the given ϵ-condition.
In this case, in order to minimize the total number of basis functions, we first remove the diffuse functions corresponding to higher orbital angular momentum, which is the same idea as the one applied in calendar basis sets^{34}. For the sake of simplicity, we use the term “basis functions” for “AO basis functions” in the remaining part of this manuscript.
Algorithm 1
Add-While-Truncate CBS Algorithm
1: repeat
2:   \(\{\chi_{\mu}\}_{\rm{old}}\leftarrow \{\chi_{\mu}\}\)
3:   Run an RT-TDHF/TDDFT simulation with \(\{\chi_{\mu}\}\) for 100 (400) steps
4:   Construct \(\{\chi_{\mu}\}_{\rm{trunc}}\) by Eqs. (13)–(15)
5:   Construct additional even-tempered basis set \(\{\chi_{\mu}\}_{\rm{diffuse}}\) for \(\{\chi_{\mu}\}_{\rm{trunc}}\)
6:   \(\{\chi_{\mu}\}_{\rm{deleted}}\leftarrow \{\chi_{\mu}\}_{\rm{deleted}}\cup (\{\chi_{\mu}\}\setminus \{\chi_{\mu}\}_{\rm{trunc}})\)
7:   \(\{\chi_{\mu}\}_{\rm{diffuse}}\leftarrow \{\chi_{\mu}\}_{\rm{diffuse}}\setminus \{\chi_{\mu}\}_{\rm{deleted}}\)
8:   \(\{\chi_{\mu}\}\leftarrow \{\chi_{\mu}\}_{\rm{trunc}}\cup \{\chi_{\mu}\}_{\rm{diffuse}}\)
9:   Solve for eigenvalues λ of overlap matrix \(\boldsymbol{S}_{\{\chi_{\mu}\}}\)
10: until ∣λ∣_{min} < ϵ or \(\{\chi_{\mu}\}=\{\chi_{\mu}\}_{\rm{old}}\)
11: \(\{\chi_{\mu}\}_{{\rm{CBS}}-\epsilon}\leftarrow \{\chi_{\mu}\}_{\rm{trunc}}\)
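The loop of Algorithm 1 can be sketched in Python with user-supplied callables standing in for the expensive steps. Everything named here is hypothetical: `truncate` represents the short RT run plus Eqs. (13)–(15), `make_diffuse` the even-tempered extension, and `min_overlap_eigenvalue` the overlap eigenvalue check; the `max_cycles` cap is a safety addition of this sketch and not part of the algorithm. Both the final basis and the last truncated set are returned, leaving the choice of which one to use to the caller.

```python
def add_while_truncate(basis, truncate, make_diffuse, min_overlap_eigenvalue,
                       eps, max_cycles=50):
    """Sketch of the Add-While-Truncate loop.

    basis: set of hashable basis function labels.
    truncate(basis) -> subset kept after a short RT run (steps 3-4).
    make_diffuse(trunc) -> set of candidate even-tempered functions (step 5).
    min_overlap_eigenvalue(basis) -> smallest |eigenvalue| of the overlap matrix.
    Returns (basis, trunc) from the last cycle.
    """
    basis = set(basis)
    deleted = set()
    trunc = set(basis)
    for _ in range(max_cycles):
        old = set(basis)
        trunc = set(truncate(basis))
        diffuse = set(make_diffuse(trunc))
        deleted |= basis - trunc          # step 6: remember what was truncated
        diffuse -= deleted                # step 7: do not re-add deleted functions
        basis = trunc | diffuse           # step 8
        if min_overlap_eigenvalue(basis) < eps or basis == old:
            break
    return basis, trunc


# Toy callables; labels are (angular momentum, index) and purely illustrative.
def toy_truncate(b):
    return {f for f in b if f != ("p", 0)}


def toy_diffuse(t):
    return {("s", 2), ("p", 0)}           # ("p", 0) is filtered out once deleted


def toy_min_eig(b):
    return 1.0                            # pretend there is no linear dependence


final_basis, final_trunc = add_while_truncate(
    {("s", 0), ("s", 1), ("p", 0)}, toy_truncate, toy_diffuse, toy_min_eig, eps=1e-6)
```

In the toy run the loop stops in the second cycle because the basis no longer changes.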
Example: H_{2} dimer
The H_{2} dimer is used as the first test system, with the δ-pulse applied along the z direction (see the geometry in Fig. 2d; the z axis is parallel to the H–H bond, and the y axis is perpendicular to the plane formed by the four atoms). Four different basis sets, 6-31G, 6-31G**, 6-31++G, and 6-31++G**^{28,35,36}, are utilized for RT-TDHF calculations.
For each H atom, 6-31G contains two s-type basis functions (noted as 2s for convenience), 6-31G** contains 2s1p (1p as an extra polarization function; we use italic form to represent specific basis function(s), e.g., 1p means the first p-type basis function), 6-31++G contains 3s (3s as an extra diffuse function), and 6-31++G** contains 3s1p. Note that the abbreviations we use here refer to basis functions and not to specific electron shells. The same convention of basis/orbital notations is utilized for all examples in this study. For example, for a truncation from 5s4p to 5s3p (−3p), −3p means that the 3^{rd} p-type basis function is removed while the 1^{st}, 2^{nd}, and 4^{th} p-type basis functions and all s-type basis functions remain.
After 100 steps of RT-TDHF calculations, x^{DC} and x^{IP} of each basis function are computed and visualized in Fig. 2a–d as an x^{DC}-x^{IP} map. Basis functions are represented by colored squares with their value x^{DC} ⋅ x^{IP}. The electric dipole contribution and the importance of propagation stability of the basis functions are sorted along the two axes. The red dashed line and the gray dashed line represent x^{thr} = 0.1 and x^{thr} = 0.2, respectively. x^{thr} splits the x^{DC}-x^{IP} map into four quadrants: important for both electric dipole contribution and propagation stability (top right), important for only electric dipole contribution (top left), important for only propagation stability (bottom right), and important for neither one (bottom left). The basis functions located inside the bottom-left region (red borders for x^{thr} = 0.1) are the ones recommended to be deleted from the basis set.
No basis function is to be deleted in the case of the 6-31G (Fig. 2a) and 6-31++G (Fig. 2c) basis sets, and there are 8 basis functions to be deleted in the case of the 6-31G** (Fig. 2b) and 6-31++G** (Fig. 2d) basis sets with x^{thr} = 0.1. These 8 basis functions are the same for 6-31G** and 6-31++G**: 2p_{x} and 2p_{y} of each H atom, which belong to the polarization functions. If x^{thr} is set to 0.2 (gray dashed lines in Fig. 2b, d), four additional basis functions are to be deleted in both cases: 2p_{z} of each H atom. Therefore, a setting of x^{thr} = 0.2 essentially truncates 6-31G** to 6-31G and 6-31++G** to 6-31++G for the H_{2} dimer. Note that the pulse causes differences between atoms with different nuclear Cartesian coordinates, leading to different x^{DC}-x^{IP} values of the same basis functions in different H atoms. In Supplementary Fig. 1, we further provide a more intuitive view of O_{μ} from each basis function in the 6-31++G and 6-31++G** basis sets. One can easily distinguish the small contribution components from the large contribution ones.
Let us take a closer look at the x^{DC}-x^{IP} map in the case of the 6-31++G** basis set (see the basis functions shown in Fig. 2d; we focus on only one H atom here). We can explain qualitatively why 2p_{x} and 2p_{y} are the least important basis functions for the simulation of the electronic absorption spectrum. The electric dipole transitions from σ_{1s−1s} to \(\pi_{2p_{x}-2p_{x}}\), \(\pi_{2p_{x}-2p_{x}}^{*}\), \(\pi_{2p_{y}-2p_{y}}\), and \(\pi_{2p_{y}-2p_{y}}^{*}\) are almost (considering the effect of the other H_{2} molecule close by) forbidden for symmetry reasons. The x^{DC}-x^{IP} map shows that 2p_{x} is slightly more important than 2p_{y}, which may be explained by a stronger interaction along the x direction between the H_{2} molecules. The electric dipole transition \(\pi_{2p_{z}-2p_{z}}^{*}\leftarrow \sigma_{1s-1s}\) is allowed, and thus the 2p_{z} basis function is considered to be more important than 2p_{x} and 2p_{y} for the electronic absorption spectrum. The electric dipole transition \(\sigma_{2s-2s}^{*}\leftarrow \sigma_{1s-1s}\) is also allowed and \(\sigma_{2s-2s}^{*}\) has a lower energy than \(\pi_{2p_{z}-2p_{z}}^{*}\), which leads to a higher occupation probability. Therefore, 2s in the 6-31++G** basis set is one of the “dominant” basis functions in the RTP for H_{2} with the computational settings used.
In more complex systems, such an energetic analysis in terms of “static” wavefunctions (e.g., wavefunctions after SCF) is not enough to give a reasonable truncated basis set since the MO coefficients C(t) are time-dependent, which explains our choice of using the first 100 (400) steps for the analysis.
In addition, a Jaccard index^{37} (J(x^{thr})) is applied to analyze the similarity of the sets of deleted basis functions suggested by our two criteria x^{DC} and x^{IP} (see Eq. (16)). A high Jaccard index indicates more basis functions in common between the two sets and vice versa. This information provides an intuitive view of the truncation along x^{thr} for a given basis set.
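A minimal sketch of this similarity measure is given below, assuming that Eq. (16) is the standard Jaccard index |A ∩ B| / |A ∪ B| of the two sets of basis functions flagged by each indicator alone. The function name, the indicator values, and the convention of returning 0 when neither indicator flags anything are assumptions of this sketch.

```python
def jaccard_index(x_dc, x_ip, x_thr):
    """Jaccard index of the sets of basis functions flagged by each indicator alone."""
    flagged_dc = {mu for mu, x in enumerate(x_dc) if x < x_thr}
    flagged_ip = {mu for mu, x in enumerate(x_ip) if x < x_thr}
    union = flagged_dc | flagged_ip
    if not union:
        return 0.0  # assumed convention when neither indicator flags anything
    return len(flagged_dc & flagged_ip) / len(union)


# Hypothetical indicator values for four basis functions.
x_dc = [1.5, 0.05, 0.02, 0.3]
x_ip = [1.2, 0.08, 0.5, 0.01]
j = jaccard_index(x_dc, x_ip, x_thr=0.1)   # sets {1, 2} and {1, 3} share one element
```

Scanning `x_thr` over a range of values and plotting the result gives a curve of the kind discussed for the test systems.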
For the H_{2} dimer system, J(x^{thr}) of 6-31G** and 6-31++G** remains at a value of 1.0 from x^{thr} = 0.01 to x^{thr} = 1.0 (see Fig. 2e). This shows that the RTP has a clear “preference” for some basis functions within the given basis sets. In the case of the 6-31G and 6-31++G basis sets, on the other hand, J(x^{thr}) remains at a value of 0.0 up to x^{thr} = 0.9, which means that no redundant basis functions are found for such basis sets.
The spectra obtained with the different basis sets and the truncated 6-31++G** basis sets are shown in Fig. 2f. The spectra using the 6-31G and 6-31G** basis sets look very similar, which matches the truncation suggestion given in Fig. 2b. The same situation is found for the spectra using the 6-31++G and 6-31++G** basis sets. The truncated 6-31++G** basis set with x^{thr} = 0.2 (noted as 6-31++G** trunc 12, 12 basis functions left) is the same basis set as 6-31++G, with an error of ~0.1 eV in the corresponding excitation energies. With a tighter threshold x^{thr} = 0.1, the truncated 6-31++G** basis set (noted as 6-31++G** trunc 16, 16 basis functions left) achieves more accurate spectra compared to the 6-31++G basis set, with an error at the level of ~0.01 eV. These results are in accordance with the x^{DC}-x^{IP} map introduced before. For the sake of completeness, the cases of a δ-pulse along the x or y direction are included in Supplementary Fig. 2.
Example: H_{2}O dimer
Four different basis sets, def2-TZVP, def2-TZVPP, def2-TZVPD, and def2-TZVPPD^{32,38}, are utilized for RT-TDHF calculations of the H_{2}O dimer system (see Fig. 3g for the nuclear structure). Again, these four basis sets are chosen based on the addition of polarization functions and/or diffuse functions. The x^{DC}-x^{IP} map of this system after 100 steps of RT-TDHF calculations is shown in Fig. 3a–d. Compared to the H_{2} dimer, the H_{2}O dimer is a more complicated system and thus the x^{DC}-x^{IP} map is more involved. Nevertheless, it is clear that the distribution of basis functions in all plots shows a “dumbbell” shape (from bottom left to top right), namely, more dispersed in the low and high x^{DC}/x^{IP} regions compared to the middle range. This provides a rough idea of the range of truncation, and the cross point of x^{thr} = 0.1 is generally located at the neck of the “dumbbell”. The Jaccard indices for the four basis sets are shown in Fig. 3e. In Supplementary Figure 3, some visualizations of orbitals are shown in the x^{DC}-x^{IP} map of the def2-TZVP basis set.
A natural question in basis set truncation is whether one needs to repeat it until the basis set no longer changes (which we refer to as “recursive truncation”). Therefore, we test the recursive truncation of the def2TZVPD and def2TZVPPD basis sets, and provide the number of basis functions together with the Jaccard indices (see Fig. 3f). Onetime truncation decreases the number of basis functions from 116 to 92 for def2TZVPD and from 148 to 103 for def2TZVPPD, while recursive truncation only decreases the number further from 92 to 91 and from 103 to 98, respectively. It is worth mentioning that the two resulting basis sets are very similar, as one can see from the number of basis functions after the recursive truncation. The Jaccard indices give visual evidence that “def2TZVPD trunc 91” and “def2TZVPPD trunc 98” do not change after another truncation process since J(x) = 0 for x ∈ [0, 0.1]. They also show that onetime truncation is sufficient to significantly decrease the J(x^{thr}) value (see the two dashed lines). Considering the time and computational resources required by the recursive truncation process (which usually needs several rounds of RTTDHF/TDDFT calculations), we only focus on onetime truncation in the following.
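The idea of recursive truncation can be sketched as a fixed-point loop. This is a minimal illustration and not the implementation used in this work: the single `score` callable stands in for a short RT run that re-evaluates the indicators on the current basis, and `toy_score` with its numbers is entirely hypothetical.

```python
def truncate_once(basis, score, thr):
    """Keep only the basis functions whose score on the current basis is >= thr."""
    scores = score(basis)
    return [mu for mu in basis if scores[mu] >= thr]


def truncate_recursively(basis, score, thr, max_rounds=20):
    """Repeat the truncation until the basis no longer changes.

    Returns the list of basis sets visited, starting with the original one.
    """
    history = [list(basis)]
    for _ in range(max_rounds):
        new_basis = truncate_once(history[-1], score, thr)
        if new_basis == history[-1]:
            break  # "truncation consistent": another round deletes nothing
        history.append(new_basis)
    return history


def toy_score(basis):
    """Hypothetical stand-in for the indicators obtained from a short RT run."""
    values = {"a": 1.0, "b": 0.5, "c": 0.15, "d": 0.02}
    if "d" not in basis:
        values["c"] = 0.05  # assumed coupling: c only matters while d is present
    return {mu: values[mu] for mu in basis}


history = truncate_recursively(["a", "b", "c", "d"], toy_score, thr=0.1)
print([len(b) for b in history])
```

In this toy case the first round removes most of what can be removed and the second round removes one more function, which mirrors the qualitative behaviour described above.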
The spectra of the H_{2}O dimer using the def2TZVP, def2TZVPP, def2TZVPD, and def2TZVPPD basis sets lead to a similar conclusion as in the H_{2} dimer case, namely, that extra diffuse functions have a large impact on the absorption spectra while extra polarization functions have a limited impact (see Supplementary Figure 4). In addition, we provide the spectra after onetime and recursive truncation (see Fig. 3g). All spectra in this figure are very close to each other up to an excitation energy of 20 eV, indicating that def2TZVPPD includes many redundant basis functions for RTTDHF calculations in this case. Regarding computational resources, in the case of \({{{{{{{\mathcal{O}}}}}}}}({N}^{4})\) scaling (HF Coulomb and exchange matrices calculated with 2electron integrals, without realspace gridding or densityfitting), an RTTDHF run with the “def2TZVPD trunc 91” basis set only consumes (91/148)^{4} = 14 % of the time required with the original def2TZVPPD basis set. For this system, the effect of hydrogen bonds on the truncation process is of special interest. However, we do not find any dependence of the deleted basis functions on the distance (up to 10 Å) between the two water molecules, and the suggested truncated basis sets are very similar. This may indicate that the basis functions needed for the description of hydrogen bonds are also important for the electronic absorption spectrum of the water monomers themselves.
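The cost estimate quoted above is a simple power-law ratio and can be reproduced in a few lines. The sketch assumes ideal N^4 scaling with no prefactor changes; the helper name `relative_cost` is introduced here for illustration only.

```python
def relative_cost(n_truncated, n_original, exponent=4):
    """Cost of the truncated basis relative to the original one,
    assuming the run time scales as N**exponent."""
    return (n_truncated / n_original) ** exponent


ratio = relative_cost(91, 148)       # "def2TZVPD trunc 91" vs. def2TZVPPD
print(f"relative cost: {ratio:.1%}")  # relative cost: 14.3%
print(f"ideal speed-up: {1 / ratio:.1f}x")  # ideal speed-up: 7.0x
```

Real timings deviate from this idealized estimate, for example when integral screening, density fitting, or grid-based evaluation is used.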
Moreover, the CBS scheme is tested for the H_{2}O dimer system. Two basis sets, def2TZVPPD and def2QZVPPD, are used as the starting points for the CBS scheme, with ϵ = 10^{−6} (CBS10^{−6}). We directly modify the basis set file each time basis functions are truncated or added. The detailed steps of the CBS scheme for the def2TZVPPD basis set are shown in Table 1. The original def2TZVPPD basis sets of the H atom and the O atom are 3s3p1d and 6s4p3d1f, respectively, giving 148 basis functions for the H_{2}O dimer system. After the first RTTDHF run, the first dsubshell (1d) of H and the first dsubshell and fsubshell of O (1d1f) are truncated (shown in brackets). Diffuse functions are then added to the remaining subshells (shown in brackets), resulting in 4s4p for H and 7s5p3d for O. This is followed by a second RTTDHF run with truncation and addition of basis functions. With a basis set of 5s4p for H and 8s6p4d for O, we find that ∣λ∣_{min} is smaller than the threshold we set (10^{−6}); thus the newly added subshell with the highest angular momentum is removed, i.e., 4p in H and 4d in O. Finally, we obtain a basis set of 5s3p for H and 8s6p3d for O, with 138 basis functions and ∣λ∣_{min} = 3.7 × 10^{−6}. The analogous procedure for the def2QZVPPD basis set is given in Supplementary Table 1.
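The basis function counts quoted above follow directly from the subshell notation. The short sketch below reproduces them, assuming spherical (pure) functions with 2l + 1 components per subshell, which is the choice consistent with the stated totals. The helper names are introduced here for illustration.

```python
import re

# Number of spherical functions per subshell (2l + 1).
FUNCTIONS_PER_SUBSHELL = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9}


def count_functions(contraction):
    """Count basis functions for one atom from a string such as '3s3p1d'."""
    pairs = re.findall(r"(\d+)([spdfg])", contraction)
    return sum(int(n) * FUNCTIONS_PER_SUBSHELL[l] for n, l in pairs)


def count_system(atoms):
    """Sum over (number_of_atoms, contraction) pairs."""
    return sum(n_atoms * count_functions(c) for n_atoms, c in atoms)


# H2O dimer: 4 H atoms and 2 O atoms.
original = count_system([(4, "3s3p1d"), (2, "6s4p3d1f")])  # def2TZVPPD
final = count_system([(4, "5s3p"), (2, "8s6p3d")])         # CBS10^{-6} result
print(original, final)  # 148 138
```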
RTTDHF calculations are carried out with these two CBSs, and the resulting spectra are shown in Fig. 3g. Choosing either the def2TZVPPD or the def2QZVPPD basis set (blue and red solid lines) leads to differences in the absorption peaks at excitation energies above 12 eV, while their corresponding CBSs10^{−6} (blue and red dashed lines) match up to 15 eV. Also, the CBSs10^{−6} lead to a red shift of 0.2 eV compared to the two original basis sets. According to the observations in this work, such a shift usually indicates the behavior of a larger basis set, while in our case it is achieved with fewer basis functions. In addition, we utilize combined basis sets consisting of the original basis set and its CBS10^{−6} (denoted as CBSall) for the same calculation, and each resulting spectrum (dotted lines) also agrees with the corresponding CBS10^{−6} one.
Examples: (S)-methyloxirane and (−)-α-pinene
As further examples, the (S)-methyloxirane and (−)-α-pinene molecules are tested with truncated basis sets. The def2TZVPP^{32,38} basis set is adopted as the reference basis set and B3LYP is selected as the exchangecorrelation functional for these two systems. x^{thr} = 0.1 and x^{thr} = 0.2 are used as the truncation thresholds. Supplementary Tables 2 and 3 give information about the original and truncated basis sets for (S)-methyloxirane and (−)-α-pinene, respectively. The corresponding x^{DC}x^{IP} maps are in Supplementary Figs. 5 and 6.
The resulting absorption spectra are shown in Fig. 4a and Supplementary Fig. 8a. The truncated basis sets, both with x^{thr} = 0.1 and x^{thr} = 0.2, provide a good approximation to the absorption spectra obtained with the original def2TZVPP basis set, while using as few as half of the basis functions (in the case of x^{thr} = 0.2). Apart from the usage in RTTDDFT, the truncation process is found to be robust in LRTDDFT as well. In the LRTDDFT calculations, 500 and 2000 roots are solved for the (S)-methyloxirane and (−)-α-pinene systems, respectively. From these results and the results for the H_{2} dimer and H_{2}O dimer systems, one may find that most truncated basis functions are polarization functions, e.g., the p/dsubshells for H and the d/fsubshells for C/O, while diffuse functions are usually not removed. This explains why the CBS scheme we propose only considers additional diffuse functions.
Furthermore, we use the same basis sets for ECD spectra calculations, considering that the two quantities x^{DC} and x^{IP} do not explicitly depend on the electric dipole operator. The ECD spectra of (S)-methyloxirane and (−)-α-pinene are shown in Fig. 4b and Supplementary Fig. 8b, respectively.
Table 2 gives the benchmark of the basis sets used for (S)-methyloxirane and (−)-α-pinene. RTTDDFT calculations of these two systems are carried out using CP2K. Because the Coulomb and the exchange and correlation (XC) terms are evaluated on grids, we do not observe a significant time saving with the proposed truncated basis sets. Nevertheless, computational resources can be reduced by as much as one order of magnitude in LRTDDFT calculations (Gaussian09^{39}) using truncated basis sets. The corresponding memory usage for the (−)-α-pinene system is also shown in the table. The memory cost of the Coulomb and exchange matrices scales as \({{{{{{{\mathcal{O}}}}}}}}({N}^{4})\) (2electron integrals, without realspace gridding or densityfitting), and one may easily encounter a memory bottleneck with large basis sets (e.g., using def2TZVPP for (−)-α-pinene, with the maximal memory set to 200 GB), which, however, can be alleviated with truncated basis sets. In Supplementary Tables 4 and 5, we show the scaling information of (−)-α-pinene using HF/def2TZVPP and its truncated basis set, and the computational time to calculate the Coulomb and exchange matrices, respectively. To assess the contribution from the HF exchange term, we further show the difference between B3LYP/def2TZVPP and BLYP/def2TZVPP in the calculation of the RTTDDFT spectrum in Supplementary Fig. 7.
Example: ZnPc
ZnPc is a popular example for excitedstate calculations^{40,41,42,43,44,45}. This example is mainly utilized to demonstrate a stepbystep truncation from 631G(d,p) to 631G. Here we use the nuclear geometry of ZnPc from a previous study^{45} with the B3LYP functional and 631G(d,p)^{28,35,36} as the reference basis set. Figure 4c provides the information on the deleted basis functions and the Jaccard indices. The numbers close to the dashed lines are x^{thr} values, and the notations on the right side, in italic form, give the details of the deleted basis functions. Note that the number of deleted basis functions (\({\{{\chi }_{\mu }\}}_{{{{{{{{\rm{deleted}}}}}}}}}\)) can be higher than the number calculated from the subshell notations on the right. This is because individual basis functions might be deleted without the corresponding full subshell being deleted, e.g., Zn 1f at x^{thr}=0.04 corresponds to a \({\{{\chi }_{\mu }\}}_{{{{{{{{\rm{deleted}}}}}}}}}\) value larger than 7 because some other basis functions like H 1p_{z} (but not the full 1p) are deleted as well. In addition, this planar system, which we place in the x − y plane in the simulation, shows some preference for 1d_{xy} and \(1{d}_{{x}^{2}-{y}^{2}}\) of the C/N elements, and thus the 1d_{yz}, 1d_{xz}, and \(1{d}_{{z}^{2}}\) basis functions are the first to be deleted in the range of x^{thr} = 0.09 to 0.18. This indicates that the truncation scheme can provide information on the preferred orientation of basis functions (or AOs with different magnetic quantum numbers). It is worth mentioning that truncation with x^{thr} = 0.09 and x^{thr} = 0.18 leads essentially to the 631G(d) and 631G basis sets, respectively, except for the additional truncation on the Zn atom. Also, it is found that \({\{{\chi }_{\mu }\}}_{{{{{{{{\rm{deleted}}}}}}}}}\) and J(x^{thr}) show a very similar trend. After x^{thr} = 0.18, both lines reach a plateau where hardly any further basis functions can be removed, indicating that 631G is a good truncated basis set.
This can also be seen from the x^{DC}x^{IP} map of the same system (see Supplementary Fig. 9), in which the truncated and remaining basis functions almost form two blocks separated by x^{thr} = 0.18 (dashed blue line).
The corresponding RTTDDFT and LRTDDFT (1000 roots) spectra are given in Fig. 4d. It is clear that the 631G(d,p), 631G(d), and 631G basis sets all provide similar results, which matches our truncation suggestions. This shows a practical use of our truncation scheme for the selection of a basis set. In addition, the CBS10^{−6} with the 631G(d,p) reference is constructed (see Supplementary Table 6 for the CBS process). Nevertheless, it changes the RTTDDFT/LRTDDFT spectra only slightly compared to the original 631G(d,p) basis set.
Example: Ag_{20}
Ag_{20} is a metal cluster with a tetrahedral structure (T_{d} symmetry), which has been investigated with TDDFT calculations^{46,47,48}. Here we use the nuclear geometry of Ag_{20} from a previous study^{48} together with the PBE0^{49} functional and the GTH^{50,51} Gaussiantype pseudopotential basis sets^{52} GTHDZVP, GTHTZVP, and GTHTZV2P. GTHTZV2P is used as the reference basis set for the truncation process. The Ag atoms in the Ag_{20} cluster are categorized into 3 groups: vertex (v), edge (e), and face (f) (see Fig. 4e). The atoms in the same group are equivalent in space and should have the same contribution to the electronic absorption spectrum. Table 3 shows the truncated basis functions versus increasing x^{thr} values. As one can see from the table, atoms in different groups generally have different suggested basis set truncations. More basis functions are truncated for atoms at vertex positions, and fewer for atoms at face positions. This is reasonable because vertex Ag atoms are “bonded” with other atoms only within a limited solid angle, while face Ag atoms have half of their surrounding space occupied by 9 nearest neighbors, and complex surroundings often require more basis functions to describe the interactions. For comparison, the basis set information of GTHTZVP and GTHDZVP is also listed. The truncation scheme provides basis sets that are quite different from the GTHTZVP and GTHDZVP basis sets, e.g., GTHTZVP can be regarded as a −2f truncated basis set of GTHTZV2P, but the 2f basis functions are the last choice for truncation in our scheme (up to x^{thr} = 0.6). This means that the corresponding standard smaller basis sets, e.g., GTHDZVP or GTHTZVP, do not always contain the most important basis functions (for TDDFT calculations) of the larger ones, e.g., GTHTZV2P, which is different from what we have found for the ZnPc example system.
We select x^{thr} = 0.3 and x^{thr} = 0.5 truncated basis sets for LRTDDFT calculations, under the consideration that the numbers of basis functions are close to GTHTZVP and GTHDZVP basis sets, respectively.
The LRTDDFT spectra (2000 roots) of the Ag_{20} cluster using 5 different basis sets are shown in Fig. 4f. In general, “GTHTZV2P trunc 684” (x^{thr} = 0.3) gives better agreement with the reference GTHTZV2P basis set than the GTHTZVP basis set does. This can be seen as follows: (1) for the first several peaks (at ~2.6 eV, 2.9 eV, and 3.4 eV), the red dashed peaks are all located closer to the black peaks than the yellow dashed peaks; (2) for the peaks up to 7 eV, the red dashed line follows the black line more closely than the yellow dashed line. However, the difference between the spectra calculated using the “GTHTZV2P trunc 512” (x^{thr} = 0.5) and GTHDZVP basis sets is limited, which can be explained by their similar composition in terms of basis functions shown in Table 3. The Ag_{20} example demonstrates that the proposed truncation scheme is able to assign different basis sets to the atoms, according to their “interaction” with the full system.
Additionally, we provide some test calculations to demonstrate that: (1) 1% of the total propagation time is sufficient to show the contribution from each basis function, and the same truncation suggestion is obtained using 1%, 10%, and 100% of the RTP steps (see Supplementary Fig. 10); (2) the indicator x^{IP} is necessary in the truncation scheme, and truncation using only the indicator x^{DC} can lead to a different spectrum (see Supplementary Fig. 11); (3) Schwarz screening of the ERIs does not affect the truncation scheme, and the two can be used together to accelerate RTTDHF/TDDFT calculations (see Supplementary Figs. 12–13 and Supplementary Tables 7–9).
Discussion
We have introduced an AO basis set truncation scheme for TDDFT calculations, based on the analysis of a short period of real time propagation of MO coefficients. Two quantities – density matrix contribution and importance of propagation stability – are constructed as indicators for the truncation process. The truncated basis sets are found to reproduce the electronic absorption spectra obtained with the original basis sets well. In some cases, truncated basis sets can serve as intermediate basis sets between two levels of available basis sets, or are found to be very close to lower level basis sets available, in which the truncation process works as a means to help in basis set selection. Two intuitive graphs, x^{DC}x^{IP} map and Jaccard index, are introduced for the analysis of basis functions. These graphs also provide a guide for the choice of the truncation threshold x^{thr} (e.g., see diagrams in Fig. 5).
As opposed to basis sets constructed mainly for the purpose of energy minimization and geometry optimization, the proposed truncation scheme provides a task, system, and chemical environmentspecific basis set. It has a reduced number of basis functions and accelerates calculations that involve the iterative construction of Coulomb and/or exchange matrices in every propagation step, potentially with a scaling of \({{{{{{{\mathcal{O}}}}}}}}({N}^{4})\). ERIs usually benefit from evaluating all components of the same shell together. However, they are only computed once before the propagation. Because the truncation process is carried out on the original AO basis set without any rotation or reconstruction of basis functions, the truncated basis sets can be easily employed in any quantum chemistry package using a Gaussiantype (or Slatertype) basis, with a simple modification of the basis set file. Additionally, we have tested recursive truncation to show that the process is robust and results in a “truncation consistent” basis set for a given x^{thr}. Though the truncation is based on the analysis of realtime propagation, the basis sets produced can also be used for LRTDDFT calculations and provide equally good spectra. Nevertheless, the acceleration of LRTDDFT calculations depends on the systems and purposes of the research, e.g., a limited number of excitations in LRTDDFT for a small system may not be worth an additional RTTDDFT calculation to determine a truncated basis set, while a highly conjugated or a large system with excitations of higher energies should benefit from the truncation scheme. How the truncation scheme and the truncated basis sets might be transferred to more accurate yet more expensive methods like GW/BetheSalpeter equation^{53,54,55}, timedependent coupled cluster/configuration interaction^{56,57,58}, or other types of excitedstate calculations might be explored in the future.
Furthermore, an “AddWhileTruncate” algorithm has been proposed to construct basis sets towards the complete basis set limit. The additional basis functions are added as diffuse functions in an eventempered manner, and no extra polarization functions are added. The neglect of polarization functions is primarily based on the truncation experience gained in this study (e.g., for the H_{2}O dimer, (S)-methyloxirane, and (−)-α-pinene systems). The use of polarization and diffuse functions for electric dipole moment, polarizability, and TDDFT calculations has been discussed in previous works^{59,60,61,62,63}. Nevertheless, as shown in the test examples, the truncation process provides the possibility to select polarization and diffuse functions quantitatively. The proposed CBS scheme can construct basis sets to arbitrary accuracy, depending on a predefined parameter limited by the linear dependency between basis functions.
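To illustrate what adding diffuse functions in an eventempered manner can look like, the sketch below extends one subshell by continuing a geometric progression of Gaussian exponents. This is not the procedure of this work: taking the ratio from the two most diffuse existing exponents is an assumption made here, and both the function name and the exponents are hypothetical.

```python
def add_diffuse_exponents(exponents, n_new=1):
    """Append n_new more diffuse exponents to one subshell.

    Assumption for this sketch: the geometric ratio is taken from the two
    smallest existing exponents, so the new ones continue that progression.
    """
    if len(exponents) < 2:
        raise ValueError("need at least two exponents to estimate a ratio")
    ordered = sorted(exponents, reverse=True)  # tight -> diffuse
    ratio = ordered[-2] / ordered[-1]          # greater than 1 for distinct exponents
    smallest = ordered[-1]
    for _ in range(n_new):
        smallest = smallest / ratio            # next, more diffuse exponent
        ordered.append(smallest)
    return ordered


s_shell = [8.0, 2.0, 0.5]  # hypothetical s-type exponents, ratio 4
extended = add_diffuse_exponents(s_shell, n_new=2)
print(extended)  # [8.0, 2.0, 0.5, 0.125, 0.03125]
```

In practice the number of added functions is limited by near-linear dependency of the basis, which is what the threshold on ∣λ∣_{min} monitors in the CBS scheme.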
The truncation scheme might reveal some intrinsic knowledge for a better description of electronic excitations between the ground state and excited states, and offer ideas for the design of basis sets for TDDFT calculations. Future work may address both basis set construction and the migration to other excitedstate calculations or properties. In this work, all original basis sets employed are ground state energyoptimized. However, there is another group of completenessoptimized basis sets^{64,65,66}, with which one may also test the efficiency and validate the accuracy towards the CBSlimit^{65}. Auxiliary density matrix methods^{67} provide an alternative way to accelerate the HF exchange calculation via an auxiliary basis set, and have been found to yield highly accurate results for energies and response properties^{63}. Considering the computationally demanding HF exchange calculation employed in hybrid functionals, it is possible to further assess the truncation scheme for auxiliary basis sets. In addition, the idea of decomposing the electric dipole contributions into the contributions from individual basis functions can be migrated to other properties and produce different taskspecific basis sets. One may be interested in a truncated basis for dynamic calculations, which, however, might require further investigations on the consistency of the truncation for each nuclear configuration. Apart from basis set truncation, a direct basis set optimization algorithm (e.g., on the exponents of Gaussiantype basis functions) is also possible given a proper loss function based on the x^{DC} and x^{IP} parameters. While we have only tested the truncation process on neutral molecules in this study, charged systems could also be investigated. This would be an interesting topic since it may demonstrate the dependence of the necessary basis functions on the charge for excited state calculations (e.g., the effect of diffuse functions on anions, which is known for ground state cases).
In summary, our basis set truncation scheme provides a robust process for decreasing the number of basis functions and speeding up TDDFT calculations, while preserving the high accuracy of the spectra. The quantitative basis set analysis allows a profound understanding of the basis functions employed and opens up a broad area for potential research in excited state calculations.
Methods
The systems H_{2} dimer, H_{2}O dimer, (S)-methyloxirane, (−)-α-pinene, zinc phthalocyanine (ZnPc), and Ag_{20} have been investigated. Information about the applied computational methods, basis sets, and codes is listed in Table 4. For the H_{2} dimer and the H_{2}O dimer, we utilize an inhouse version of the PySCF^{68,69} RTTDHF module^{70} to test the truncation and CBS schemes. Calculations are carried out with the 631G series^{28,35,36} with and without additional polarization/diffuse functions, and the def2TZVP series^{32,38} with and without additional polarization/diffuse functions. No Schwarz screening is used for the RTTDHF calculations of the H_{2} dimer and H_{2}O dimer systems. For (S)-methyloxirane, (−)-α-pinene, and ZnPc, the CP2K (CP2K version 7.0 (Development Version), the CP2K developers group. CP2K is freely available from https://www.cp2k.org/.) package and the Gaussian09^{39} package are used for the RTTDDFT and LRTDDFT (B3LYP^{71}) calculations, respectively. For Ag_{20}, the GoedeckerTeterHutter (GTH) pseudopotential^{50,51} with the corresponding Gaussiantype pseudopotential basis sets^{52} GTHDZVP, GTHTZVP, and GTHTZV2P, and the PBE0^{49} hybrid functional are employed for timedependent density functional perturbation theory calculations (TDDFPT; since the perturbation is treated up to first order, we use the term LRTDDFT in this work) using the same CP2K package. A Schwarz screening threshold of 10^{−10} (default in CP2K) is used in the RTTDDFT calculations of the (S)-methyloxirane, (−)-α-pinene, ZnPc, and Ag_{20} systems. All basis set files used in this work are from the Basis Set Exchange^{72}, molecular structures and orbitals are visualized with the Avogadro^{73} software, and graphs are generated with Matplotlib^{74}.
A δpulse is chosen as the electric field perturbation to excite the molecules in RTTDHF/TDDFT calculations. The δpulse can be thought of as being applied instantaneously to the converged ground state MOs \(|{\phi }^{0}\rangle\) between the times t = 0^{−} and t = 0^{+}. It corresponds to an impulse^{75}
where ℏ is the reduced Planck constant. The vector \(\overrightarrow{\kappa }\) indicates the direction and amplitude of the perturbation. The propagation is then started from the perturbed MOs \(|{\psi }_{i}^{\delta }\rangle\).
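As a hedged illustration of how an impulse of this kind acts on the ground state MOs, the following short derivation assumes a length-gauge perturbation \(\hat{H}^{\prime}(t) = -\hat{\vec{\mu}}\cdot\vec{\kappa}\,\delta(t)\) with the electric dipole operator \(\hat{\vec{\mu}}\). The symbol \(\hat{\vec{\mu}}\), the orbital index on \(\phi_i^0\), and the sign convention are assumptions made here, and conventions in the literature differ. Integrating the timedependent Schrödinger equation across t = 0, where the δ term dominates the Hamiltonian, gives:

```latex
|\psi_i^{\delta}\rangle
  = \exp\!\left(-\frac{i}{\hbar}\int_{0^-}^{0^+}\hat{H}^{\prime}(t)\,\mathrm{d}t\right)|\phi_i^{0}\rangle
  = \exp\!\left(\frac{i}{\hbar}\,\vec{\kappa}\cdot\hat{\vec{\mu}}\right)|\phi_i^{0}\rangle
  \approx \left(1 + \frac{i}{\hbar}\,\vec{\kappa}\cdot\hat{\vec{\mu}}\right)|\phi_i^{0}\rangle
  \qquad \text{(weak-field limit)}
```

In this form the perturbation only changes the phase of each MO in a position-dependent way, so the density at t = 0^{+} is unchanged and the dipole response develops during the subsequent propagation.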
Data availability
The data generated in this study have been deposited at https://gitlab.uzh.ch/lubergroup/aotruncation. Source data are provided in this paper.
Code availability
Python codes for carrying out basis set truncation and RTTDSCF calculations are available at https://gitlab.uzh.ch/lubergroup/aotruncation.
References
Casida, M. & HuixRotllant, M. Progress in timedependent densityfunctional theory. Annu. Rev. Phys. Chem. 63, 287–323 (2012).
Laurent, A. D. & Jacquemin, D. TDDFT benchmarks: a review. Int. J. Quant. Chem. 113, 2019–2039 (2013).
Adamo, C. & Jacquemin, D. The calculations of excitedstate properties with timedependent density functional theory. Chem. Soc. Rev. 42, 845–856 (2013).
Provorse, M. R. & Isborn, C. M. Electron dynamics with realtime timedependent density functional theory. Int. J. Quant. Chem. 116, 739–749 (2016).
Goings, J. J., Lestrange, P. J. & Li, X. Realtime timedependent electronic structure theory. WIREs Comput Mol Sci. 8 (2017). https://doi.org/10.1002/wcms.1341.
Li, X., Govind, N., Isborn, C., DePrince, A. E. & Lopata, K. Realtime timedependent electronic structure theory. Chem. Rev. 120, 9951–9993 (2020).
Mattiat, J. & Luber, S. Efficient calculation of (resonance) Raman spectra and excitation profiles with realtime propagation. J. Chem. Phys. 149, 174108 (2018).
Mattiat, J. & Luber, S. Vibrational (resonance) Raman optical activity with real time time dependent density functional theory. J. Chem. Phys. 151, 234110 (2019).
Mattiat, J. & Luber, S. Time domain simulation of (resonance) Raman spectra of liquids in the short time approximation. J. Chem. Theory Comput. 17, 344–356 (2020).
Aquilante, F., Todorova, T. K., Gagliardi, L., Pedersen, T. B. & Roos, B. O. Systematic truncation of the virtual space in multiconfigurational perturbation theory. J. Chem. Phys. 131, 034113 (2009).
Nagy, P. R., GyeviNagy, L. & Kállay, M. Basis set truncation corrections for improved frozen natural orbital CCSD(t) energies. Mol. Phys. 119 (2021). https://doi.org/10.1080/00268976.2021.1963495.
Mintz, B. & Wilson, A. K. Truncation of the correlation consistent basis sets: extension to thirdrow (ga–kr) molecules. J. Chem. Phys. 122, 134106 (2005).
Feller, D. & Dixon, D. A. Density functional theory and the basis set truncation problem with correlation consistent basis sets: elephant in the room or mouse in the closet? J. Phys. Chem. A 122, 2598–2603 (2018).
Barnes, T. A., Goodpaster, J. D., Manby, F. R. & Miller, T. F. Accurate basis set truncation for wavefunction embedding. J. Chem. Phys. 139, 024103 (2013).
Claudino, D. & Mayhall, N. J. Simple and efficient truncation of virtual spaces in embedded wave functions via concentric localization. J. Chem. Theory Comput. 15, 6085–6096 (2019).
Ding, F., Manby, F. R. & Miller, T. F. Embedded meanfield theory with blockorthogonalized partitioning. J. Chem. Theory Comput. 13, 1605–1615 (2017).
Koh, K. J., NguyenBeck, T. S. & Parkhill, J. Accelerating realtime TDDFT with blockorthogonalized manby–miller embedding theory. J. Chem. Theory Comput. 13, 4173–4178 (2017).
Krishtal, A., Ceresoli, D. & Pavanello, M. Subsystem realtime time dependent density functional theory. J. Chem. Phys. 142, 154116 (2015).
Santis, M. D. et al. Environmental effects with frozendensity embedding in realtime timedependent density functional theory using localized basis functions. J. Chem. Theory Comput. 16, 5695–5711 (2020).
Sharma, M. & Sierka, M. Efficient implementation of density functional theory based embedding for molecular and periodic systems using gaussian basis functions. J. Chem. Theory Comput. 18, 6892–6904 (2022).
Repisky, M. et al. Excitation energies from realtime propagation of the fourcomponent dirac–kohn–sham equation. J. Chem. Theory Comput. 11, 980–991 (2015).
Bruner, A., LaMaster, D. & Lopata, K. Accelerated broadband spectra using transition dipole decomposition and padé approximants. J. Chem. Theory Comput. 12, 3741–3750 (2016).
Wibowo, M., Irons, T. J. P. & Teale, A. M. Modeling ultrafast electron dynamics in strong magnetic fields using realtime timedependent electronic structure methods. J. Chem. Theory Comput. 17, 2137–2165 (2021).
Castro, A., Marques, M. A. L. & Rubio, A. Propagators for the timedependent kohn–sham equations. J. Chem. Phys. 121, 3425–3433 (2004).
Lippert, G., Hutter, J. & Parrinello, M. A hybrid gaussian and plane wave density functional scheme. Mol. Phys. 92, 477–487 (1997).
Bardo, R. D. & Ruedenberg, K. Eventempered atomic orbitals. VI. optimal orbital exponents and optimal contractions of gaussian primitives for hydrogen, carbon, and oxygen in molecules. J. Chem. Phys. 60, 918–931 (1974).
Cherkes, I., Klaiman, S. & Moiseyev, N. Spanning the hilbert space with an even tempered gaussian basis set. Int. J. Quant. Chem. 109, 2996–3002 (2009).
Ditchfield, R., Hehre, W. J. & Pople, J. A. Selfconsistent molecularorbital methods. IX. an extended gaussiantype basis for molecularorbital studies of organic molecules. J. Chem. Phys. 54, 724–728 (1971).
Dunning, T. H. Gaussian basis sets for use in correlated molecular calculations. i. the atoms boron through neon and hydrogen. J. Chem. Phys. 90, 1007–1023 (1989).
Jensen, F. Polarization consistent basis sets: principles. J. Chem. Phys. 115, 9113–9125 (2001).
Weigend, F., Furche, F. & Ahlrichs, R. Gaussian basis sets of quadruple zeta valence quality for atoms h–kr. J. Chem. Phys. 119, 12753–12762 (2003).
Weigend, F. & Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for h to rn: design and assessment of accuracy. Phys. Chem. Chem. Phys. 7, 3297 (2005).
Lehtola, S. Curing basis set overcompleteness with pivoted cholesky decompositions. J. Chem. Phys. 151, 241102 (2019).
Papajak, E., Zheng, J., Xu, X., Leverentz, H. R. & Truhlar, D. G. Perspectives on basis sets beautiful: seasonal plantings of diffuse basis functions. J. Chem. Theory Comput. 7, 3027–3034 (2011).
Hariharan, P. C. & Pople, J. A. The influence of polarization functions on molecular orbital hydrogenation energies. Theor. Chem. Acc. 28, 213–222 (1973).
Clark, T., Chandrasekhar, J., Spitznagel, G. W. & Schleyer, P. V. R. Efficient diffuse functionaugmented basis sets for anion calculations. III. the 321+g basis set for firstrow elements, lif. J. Comput. Chem. 4, 294–301 (1983).
Jaccard, P. THE DISTRIBUTION OF THE FLORA IN THE ALPINE ZONE.1. New Phytol. 11, 37–50 (1912).
Rappoport, D. & Furche, F. Propertyoptimized gaussian basis sets for molecular response calculations. J. Chem. Phys. 133, 134105 (2010).
Frisch, M. J. et al. Gaussian 09 Revision D.01. Gaussian Inc. Wallingford CT (2009).
Theisen, R. F., Huang, L., Fleetham, T., Adams, J. B. & Li, J. Ground and excited states of zinc phthalocyanine, zinc tetrabenzoporphyrin, and azaporphyrin analogs using DFT and TDDFT with franckcondon analysis. J. Chem. Phys. 142, 094310 (2015).
Wang, C., Shao, J., Chen, F. & Sheng, X. Excitedstate absorption for zinc phthalocyanine from linearresponse timedependent density functional theory. RSC Adv. 10, 28066–28074 (2020).
Martynov, A. G. et al. Methodological survey of simplified TDDFT methods for fast and accurate interpretation of UV–vis–NIR spectra of phthalocyanines. ACS Omega 4, 7265–7284 (2019).
Zhang, L., Qi, D., Zhao, L., Bian, Y. & Li, W. Substituent effects on the structure–property relationship of unsymmetrical methyloxy and methoxycarbonyl phthalocyanines: DFT and TDDFT theoretical studies. J. Mol. Graph. Model. 35, 57–65 (2012).
Wallace, A. J., Williamson, B. E. & Crittenden, D. L. Coupled cluster calculations provide a onetoone mapping between calculated and observed transition energies in the electronic absorption spectrum of zinc phthalocyanine. Int. J. Quant. Chem. 117, e25350 (2017).
Tussupbayev, S., Govind, N., Lopata, K. & Cramer, C. J. Comparison of realtime and linearresponse timedependent density functional theories for molecular chromophores ranging from sparse to high densities of states. J. Chem. Theory Comput. 11, 1102–1109 (2015).
SánchezGonzález, Á., MuñozLosa, A., Vukovic, S., Corni, S. & Mennucci, B. Quantum mechanical approach to solvent effects on the optical properties of metal nanoparticles and their efficiency as excitation energy transfer acceptors. J. Phys. Chem. C 114, 1553–1561 (2010).
KudaSingappulige, G. U. & Aikens, C. M. Excitedstate absorption in silver nanoclusters. J. Phys. Chem. C 125, 24996–25006 (2021).
Chen, M., Dyer, J. E., Li, K. & Dixon, D. A. Prediction of structures and atomization energies of small silver clusters, (ag)n, n < 100. J. Phys. Chem. A 117, 8298–8313 (2013).
Adamo, C. & Barone, V. Toward reliable density functional methods without adjustable parameters: the PBE0 model. J. Chem. Phys. 110, 6158–6170 (1999).
Goedecker, S., Teter, M. & Hutter, J. Separable dualspace gaussian pseudopotentials. Phys. Rev. B 54, 1703–1710 (1996).
Krack, M. Pseudopotentials for h to kr optimized for gradientcorrected exchangecorrelation functionals. Theor. Chem. Acc. 114, 145–152 (2005).
VandeVondele, J. & Hutter, J. Gaussian basis sets for accurate calculations on molecular systems in gas and condensed phases. J. Chem. Phys. 127, 114105 (2007).
Rohlfing, M. & Louie, S. G. Electronhole excitations and optical spectra from first principles. Phys. Rev. B 62, 4927–4944 (2000).
Deslippe, J. et al. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle and optical properties of materials and nanostructures. Comput. Phys. Commun. 183, 1269–1289 (2012).
Bruneval, F. et al. MOLGW 1: Many-body perturbation theory software for atoms, molecules, and clusters. Comput. Phys. Commun. 208, 149–161 (2016).
Pedersen, T. B. & Kvaal, S. Symplectic integration and physical interpretation of time-dependent coupled-cluster theory. J. Chem. Phys. 150, 144106 (2019).
Koulias, L. N., Williams-Young, D. B., Nascimento, D. R., DePrince, A. E. & Li, X. Relativistic real-time time-dependent equation-of-motion coupled-cluster. J. Chem. Theory Comput. 15, 6617–6624 (2019).
Sonk, J. A., Caricato, M. & Schlegel, H. B. TD-CI simulation of the electronic optical response of molecules in intense fields: comparison of RPA, CIS, CIS(D), and EOM-CCSD. J. Phys. Chem. A 115, 4678–4690 (2011).
Darling, C. L. & Schlegel, H. B. Dipole moments, polarizabilities, and infrared intensities calculated with electric field dependent functions. J. Phys. Chem. 98, 5855–5861 (1994).
Elliott, P., Furche, F. & Burke, K. Excited states from time-dependent density functional theory. In Reviews in Computational Chemistry, 91–165 (John Wiley & Sons, Inc., 2009). https://doi.org/10.1002/9780470399545.ch3.
Pescitelli, G. & Bruhn, T. Good computational practice in the assignment of absolute configurations by TDDFT calculations of ECD spectra. Chirality 28, 466–474 (2016).
Barboza, C. A., Vazquez, P. A. M., Carey, D. M.-L. & Arratia-Perez, R. A TDDFT basis set and density functional assessment for the calculation of electronic excitation energies of fluorene. Int. J. Quant. Chem. 112, 3434–3438 (2012).
Kumar, C. et al. Accelerating Kohn–Sham response theory using density fitting and the auxiliary-density-matrix method. Int. J. Quant. Chem. 118, e25639 (2018).
Chong, D. P. Completeness profiles of one-electron basis sets. Can. J. Chem. 73, 79–83 (1995).
Manninen, P. & Vaara, J. Systematic Gaussian basis-set limit using completeness-optimized primitive sets. A case for magnetic properties. J. Comput. Chem. 27, 434–445 (2006).
Lehtola, S. Automatic algorithms for completeness-optimization of Gaussian basis sets. J. Comput. Chem. 36, 335–347 (2014).
Guidon, M., Hutter, J. & VandeVondele, J. Auxiliary density matrix methods for Hartree–Fock exchange calculations. J. Chem. Theory Comput. 6, 2348–2364 (2010).
Sun, Q. et al. PySCF: the Python-based simulations of chemistry framework. Wiley Interdiscip. Rev. Comput. Mol. Sci. 8 (2017). https://doi.org/10.1002/wcms.1340.
Sun, Q. et al. Recent developments in the PySCF program package. J. Chem. Phys. 153, 024109 (2020).
Nguyen, T. S. & Parkhill, J. Nonadiabatic dynamics for electrons at second-order: real-time TDDFT and OSCF2. J. Chem. Theory Comput. 11, 2918–2924 (2015).
Stephens, P. J., Devlin, F. J., Chabalowski, C. F. & Frisch, M. J. Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields. J. Chem. Phys. 98, 11623–11627 (1994).
Pritchard, B. P., Altarawy, D., Didier, B., Gibson, T. D. & Windus, T. L. New basis set exchange: an open, up-to-date resource for the molecular sciences community. J. Chem. Inf. Model. 59, 4814–4820 (2019).
Hanwell, M. D. et al. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J. Cheminform. 4 (2012). https://doi.org/10.1186/1758-2946-4-17.
Hunter, J. D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
Yabana, K., Nakatsukasa, T., Iwata, J.-I. & Bertsch, G. F. Real-time, real-space implementation of the linear response time-dependent density-functional theory. Phys. Status Solidi B 243, 1121–1138 (2006).
Acknowledgements
Funding by the University of Zurich and the Swiss National Science Foundation (grant no: PP00P2_170667) is gratefully acknowledged. We thank the Swiss National Supercomputing Center for computing resources (project ID: pr119, s1001, and s1036).
Author information
Contributions
R.H. conceived the method. R.H., J.M., and S.L. designed the method. R.H. and J.M. performed the research. S.L. provided the resources and supervised the research. R.H., J.M., and S.L. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Han, R., Mattiat, J. & Luber, S. Automatic purpose-driven basis set truncation for time-dependent Hartree–Fock and density-functional theory. Nat Commun 14, 106 (2023). https://doi.org/10.1038/s41467-022-35694-4