## Introduction

In designing new crystalline materials, crystal structure prediction (CSP) for a given chemical composition is the most fundamental task. A large number of physical properties can be predicted using first-principles calculations with a given atomic configuration. However, CSP is quite a difficult problem due to the exponential increase of the number of potential energy minima with respect to the system size.1 A great deal of effort has been devoted to overcoming this problem. To date, several searching algorithms in CSP have been successfully developed such as random search,2,3,4 simulated annealing,5,6 basin hopping,7 minima hopping,8,9 evolutionary algorithm (EA),10,11,12 particle-swarm optimization (PSO),13,14 and Bayesian optimization (BO).15 Ab initio random structure searching by Pickard and Needs is quite simple but still one of the most efficient approaches even today. EA and PSO are also popular and efficient algorithms as implemented in USPEX10,11,12 and CALYPSO.13,14

Even now, CSP remains time-consuming: to find the most stable structure, existing methods perform local optimizations of all the structures to evaluate their energies. First-principles density-functional-theory (DFT) codes such as VASP16 and QUANTUM ESPRESSO17 calculate the force on each atom for a given structure. Relaxation is then performed with these forces by using gradient methods such as the quasi-Newton or conjugate gradient methods. Figure 1a shows how optimized energies are obtained by relaxing all the generated structures until they converge to local minima in structure space. However, if the final relaxed energy of a structure can be predicted in the middle of its calculation, we can skip unpromising calculations by stopping their optimizations and reduce the total cost of CSP by giving preference to promising structures, as shown in Fig. 1b.

In this paper, we propose a method to roughly estimate the final energy during local optimization, and we show that the total computational cost can be reduced by controlling the local optimization steps based on that estimated energy. In general, it is hard to predict the energy after relaxation while local optimization is still in progress. We obtain a rough estimate of the final energy with a quadratic approximation from the current energy and force, exploiting the fact that DFT calculations provide not only the energy but also the force on each atom. In the proposed method, called Look Ahead based on Quadratic Approximation (LAQA), we first generate a large number of candidate structures, then select promising structures based on the energy estimate, and preferentially proceed with their calculation. When performing CSP, if the calculation performed so far is judged insufficient, we may not find a stable structure and need to perform additional calculations for newly generated candidate structures. To handle this case, we also propose a method that gradually increases the number of candidate structures, called sequential LAQA (sLAQA).

To show the effectiveness of the proposed methods, we conducted CSP simulations using seven typical systems: Si (8 and 16 atoms in the unit cell), NaCl (16 and 32 atoms), ferromagnetic Y2Co17 (19 atoms), Al2O3 (10 atoms), and GaAs (16 atoms). We randomly generated hundreds of candidate structures for these systems and calculated all the local optimization steps. We then investigated how much computational cost can be saved with our proposed methods compared with the widely used random searching approach, which repeatedly generates random structures and performs full local optimizations. The total number of local optimization steps using LAQA is reduced by a factor ranging from 2.02 to 21.4 compared with random searching.

## Results

### Local optimization controlled by Greedy, LAQA, and sLAQA

In CSP, we compare the total energies evaluated by first-principles DFT calculations among a number of generated structures, and finally predict the minimum-energy structure as the global minimum structure of the system. Based on the basic idea shown in Fig. 1, we propose approaches to perform CSP efficiently by preferentially relaxing structures that are predicted to have a low final energy, such as stable structures.

One of the simplest approaches to local optimization control is to advance relaxation in order of energy. In many cases, if a structure is close to its stable state, its energy is relatively low. Therefore, if local optimization is started from structures with low energy, a stable structure is expected to be obtained at a relatively early stage. We refer to the CSP method based on this simplest idea as the greedy method (Greedy), as such an approach is called a greedy algorithm in the field of reinforcement learning.18 In this method, we first generate a sufficient number of structures, N, that are expected to contain at least one structure that will relax to the most stable structure, and then calculate the first local optimization step for them. We then search for a stable structure by performing local optimization in order of their calculated energies.
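The Greedy control described above can be sketched as a replay over recorded relaxation trajectories. This is a minimal illustration, not the authors' implementation: the function name `greedy_steps_to_target`, the data layout (one list of per-step energies per structure, the last entry being the fully relaxed energy), and the choice to count the initialization steps in the total are all assumptions made for this example.

```python
from typing import Sequence, Set


def greedy_steps_to_target(energies: Sequence[Sequence[float]],
                           targets: Set[int]) -> int:
    """Replay recorded relaxations under Greedy control.

    energies[i][t] is the energy of structure i after t + 1 local
    optimization steps (hypothetical layout); the last entry is the
    fully relaxed energy. Returns the total number of steps spent
    until any structure in `targets` is fully relaxed.
    """
    n = len(energies)
    done = [1] * n          # initialization: first step of every structure
    total = n               # assumption: initialization steps are counted
    while True:
        # Stop as soon as one target structure has been fully relaxed.
        if any(done[i] == len(energies[i]) for i in targets):
            return total
        # Structures that still have recorded steps left.
        candidates = [i for i in range(n) if done[i] < len(energies[i])]
        if not candidates:
            raise ValueError("no target structure among the candidates")
        # Greedy rule: advance the structure with the lowest current energy.
        best = min(candidates, key=lambda i: energies[i][done[i] - 1])
        done[best] += 1
        total += 1


# Toy data: structure 2 starts high in energy but relaxes to the minimum.
toy = [[5.0, 4.0, 3.0], [1.0, 0.5], [2.0, 1.5, 0.0]]
print(greedy_steps_to_target(toy, {2}))
```

In the toy data, Greedy first finishes structure 1 because its energy is lowest at the start, and only then turns to structure 2, which illustrates why a low initial energy alone can be a misleading priority.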

However, the Greedy method does not always work well, because the initial energy of a structure that will relax to a stable structure is not necessarily low, as illustrated by the blue local optimization path in Fig. 1a. On the other hand, during such a local optimization from a high initial energy to a low final energy, a relatively large force is expected to act on each atom, because each atom moves considerably before the structure is fully relaxed. We therefore expect that structures with low final energies can be preferentially selected by using not only the energy, as in Greedy, but also other information such as the above-mentioned force and the stress applied to the cell.

In this paper, we propose a method (LAQA) to control local optimization based on the following score as the simplest and the most versatile method using energy and force. Figure 2a shows the flowchart of LAQA. First, we generate a sufficient number of structures, N, that are expected to contain at least one structure that will relax to the most stable structure. We calculate the first local optimization step for all the structures (Initialization). Next, we calculate rough estimates of the final energies to select a structure on which to proceed with local optimization (Scoring). For each structure i and local optimization step t, we denote by Ei,t the total energy of the structure divided by formula unit (f.u.), and by Fi,t the sum of forces on atoms divided by formula unit. Fi,t is calculated by averaging the norms of the force applied to each atom. We calculate the score Li,t of each structure i after T steps as follows:

$$L_{i,T} = \left\{ \begin{array}{ll} \min\limits_{1 \le t \le T} E_{i,t} - \dfrac{F_{i,T}^{2}}{2\Delta F_{i,T}} & ({\rm not\ optimized}) \\ \infty & ({\rm fully\ optimized}) \end{array} \right.$$
(1)

where ∆Fi,T = |Fi,T − Fi,T−1|. We fix ∆Fi,T = 1 for T = 1 and ∆Fi,T = 10⁻⁶ if the values of Fi,T and Fi,T−1 are the same. In our approach, we preferentially proceed with the local optimization of structures with low values of the score. As can be seen from the first term in Eq. (1), we give priority to structures with low energy. The minimum value of E over the steps is taken because spike-like, exceptionally large values are often obtained during the calculation. According to the second term in Eq. (1), the score decreases as the force applied to each atom increases. Additionally, if the force does not change significantly from one step to the next, that is, if a large structural change can be thought to be continuing, the score decreases. For a structure that has been fully optimized, infinity is assigned as its score to avoid additional calculations for it. This score can be regarded as a rough estimate of the fully optimized energy of the system within a quadratic approximation. The structure with the minimum score is selected (Selection), and one local optimization step is calculated for it (Calculation). Note that fully optimized structures, having a score of +∞, are not selected.
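The score of Eq. (1) and the selection rule can be written down in a few lines. The sketch below is illustrative only: the function names `laqa_score` and `select_structure` and the list-based data layout are assumptions, not the interface of any existing code. One possible reading of the second term, offered here as an interpretation rather than a statement from the text, is that for a harmonic well E(x) = E_min + kx²/2 the force magnitude is F = kx, so that E − E_min = F²/(2k), with ∆F playing the role of the curvature k.

```python
import math
from typing import List, Optional, Sequence, Tuple


def laqa_score(energies: Sequence[float],
               forces: Sequence[float],
               fully_optimized: bool = False) -> float:
    """Score of Eq. (1) for one structure.

    energies[t] and forces[t] hold E_{i,t} and F_{i,t} for the steps
    computed so far (hypothetical layout, index 0 is step 1).
    """
    if fully_optimized:
        return math.inf                 # never selected again
    if len(energies) == 1:
        delta_f = 1.0                   # fixed value for T = 1
    else:
        delta_f = abs(forces[-1] - forces[-2])
        if delta_f == 0.0:
            delta_f = 1e-6              # fixed value when the force is unchanged
    return min(energies) - forces[-1] ** 2 / (2.0 * delta_f)


History = Tuple[Sequence[float], Sequence[float], bool]


def select_structure(histories: List[History]) -> Optional[int]:
    """Return the index with the lowest score, or None if all are optimized."""
    if not histories:
        return None
    scores = [laqa_score(e, f, done) for e, f, done in histories]
    best = min(range(len(scores)), key=scores.__getitem__)
    return None if math.isinf(scores[best]) else best


pool = [([-5.0], [2.0], False),             # one step done, large force
        ([-5.0, -5.5], [2.0, 1.0], False),  # two steps done
        ([-9.0], [0.0], True)]              # already fully optimized
print(select_structure(pool))
```

In this toy pool the first structure is selected even though the second has the lower energy, because its larger force lowers the estimated final energy.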

Since holding a large number of candidate structures simultaneously when performing LAQA may be costly, we also propose an algorithm that gradually increases the number of structures to be optimized. We call this approach sequential LAQA (sLAQA). Figure 2b shows the flowchart of sLAQA. We first fix the pooling number Np and generate Np structures randomly. The structure to optimize locally is chosen from the pooled structures. The initialization, scoring, selection, and calculation steps are the same as in LAQA. After the calculation step, if the calculated structure is fully optimized, we exclude it from the pool. Without elimination of unpromising structures, the proportion of the pool that they occupy would keep increasing; we therefore simultaneously exclude the structure with the highest score, i.e., the most unpromising structure. Then, we generate two new structures randomly to keep the number of pooled structures at Np, and calculate the first local optimization step for the new structures as initialization. Finally, we add them to the structure pool. In LAQA, if the set of initial structures is the same, the sequence of structures selected for calculation based on the score L is the same. Therefore, the number of steps required to obtain a stable structure does not change for the same set. In sLAQA, on the other hand, even if the set of initial structures is the same, the structures added to the pool are newly chosen at random, so the number of steps required to obtain the stable structure changes from trial to trial.
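The pool management of sLAQA can be sketched as a replay over recorded trajectories, drawing "newly generated" structures at random from a prepared set. This is a simplified illustration under stated assumptions: the names `score` and `slaqa_steps`, the data layout, and the handling of corner cases (for example, only one unpromising structure is dropped even if several structures finish at once) are choices made for this example and are not taken from the text.

```python
import math
import random


def score(e, f, steps):
    """Eq. (1) score from the first `steps` recorded steps of one structure."""
    if steps == len(e):
        return math.inf                       # fully optimized
    if steps == 1:
        d = 1.0
    else:
        d = abs(f[steps - 1] - f[steps - 2]) or 1e-6
    return min(e[:steps]) - f[steps - 1] ** 2 / (2.0 * d)


def slaqa_steps(energies, forces, targets, n_pool, rng):
    """Total steps until a target structure is fully relaxed, else None."""
    unused = list(range(len(energies)))
    rng.shuffle(unused)                       # random order of "generation"
    steps = {}                                # pool: index -> steps done
    total = 0

    def add_new():
        nonlocal total
        steps[unused.pop()] = 1               # initialization step
        total += 1

    while len(steps) < n_pool and unused:
        add_new()
    while True:
        finished = [i for i, s in steps.items() if s == len(energies[i])]
        if any(i in targets for i in finished):
            return total
        if finished:
            for i in finished:                # exclude optimized structures
                del steps[i]
            if steps:                         # exclude the most unpromising one
                worst = max(steps, key=lambda i: score(energies[i], forces[i], steps[i]))
                del steps[worst]
            while len(steps) < n_pool and unused:
                add_new()                     # refill the pool
            continue
        if not steps:
            return None                       # candidates exhausted
        best = min(steps, key=lambda i: score(energies[i], forces[i], steps[i]))
        steps[best] += 1                      # one local optimization step
        total += 1


toy_e = [[5.0, 4.0, 3.0], [1.0, 0.5], [2.0, 1.5, 0.0]]
toy_f = [[0.1, 0.1, 0.0], [0.2, 0.1], [3.0, 1.0, 0.0]]
print(slaqa_steps(toy_e, toy_f, {2}, 3, random.Random(0)))
```

Because the pool is refilled at random, repeated calls with different random generators generally give different step counts when the prepared set is larger than the pool, which mirrors the trial-to-trial variation described above.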

For actual CSP of unknown systems, we need to stop the calculation at a certain stage. It is difficult to identify the most stable structure with certainty, but when multiple relaxed structures sharing the lowest energy are obtained, that structure is the most stable structure with high probability. Therefore, in LAQA, it is practical to stop a search when such structures are obtained. If such structures are not obtained and the final energies of the fully relaxed structures become high, it is effective to generate a new set of initial structures and repeat the calculation using LAQA until such structures are found. In sLAQA, it is likewise practical to stop the calculation once such structures are obtained.

### CSP using RS and BO

In order to examine the effectiveness of LAQA and sLAQA, we introduce other approaches of CSP: random search (RS)2 and Bayesian optimization based structure selection.15

In the RS approach, initial structures are randomly generated and fully relaxed. Here, we first generate a number of initial structure candidates, and perform RS-based CSP by randomly selecting a structure from them and fully relaxing it. While RS is a simple and widely used method, it may take a long time to find a stable structure.

BO is widely used as a global optimization method,19 and in recent years its usefulness has also been shown in the field of materials science.20 In CSP using BO,15 a stable structure is searched for by repeatedly selecting an initial structure from among the candidate initial structures and relaxing it. Unlike in LAQA, sLAQA, and Greedy, a selected structure is fully optimized. In structure selection, a structure that is expected to have a lower final energy is chosen by using the framework of BO, based on the structure data and relaxed energies calculated previously. To apply the framework of BO, the fingerprint of Oganov and Valle21 was adopted as the descriptor of structures in the previous work.15 A fingerprint is a vector representation of a structure. The vector is calculated by a fingerprint function that is invariant with respect to shifts of the coordinate system, rotations, and reflections. The fingerprint of Oganov and Valle21 was designed to map similar structures to similar vectors. Compared with RS, BO-based search is expected to speed up the calculation by searching efficiently for low-energy structures. In this paper, we repeated these CSP trials 400 times for each system, and evaluated the performance of BO as the average number of structures required to find a stable structure.

### Initial structure generation and local optimization

We randomly generated initial structures with specific space groups, using CrySPY,22 as described by Yamashita et al.15 Once the space group is specified, some lattice parameters are fixed by the symmetry. The remaining unfixed lattice parameters are taken at random. A combination of the Wyckoff positions corresponding to the space group is randomly selected. The atoms are arranged according to the selected Wyckoff positions under the constraint of the minimum interatomic distance.
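The minimum interatomic distance constraint mentioned above can be checked for a trial structure under periodic boundary conditions. The following is a generic sketch and does not reproduce how CrySPY implements the check: the function name `min_interatomic_distance` is hypothetical, and only the 27 nearest periodic images are examined, which is assumed to be sufficient for cells that are not strongly skewed.

```python
import itertools

import numpy as np


def min_interatomic_distance(lattice, frac_coords):
    """Smallest distance (in the units of `lattice`) between any two atoms.

    lattice: 3x3 array whose rows are the lattice vectors.
    frac_coords: (n, 3) array of fractional coordinates.
    Periodic images in the 27 neighboring cells are included, so the
    distance of an atom to its own images is also considered.
    """
    lattice = np.asarray(lattice, dtype=float)
    frac = np.asarray(frac_coords, dtype=float) % 1.0
    shifts = np.array(list(itertools.product((-1, 0, 1), repeat=3)), dtype=float)
    best = np.inf
    n = len(frac)
    for i in range(n):
        for j in range(i, n):
            d_frac = frac[j] - frac[i] + shifts          # (27, 3) image vectors
            d = np.linalg.norm(d_frac @ lattice, axis=1)
            if i == j:
                d = d[d > 1e-12]                         # drop the zero self-distance
            best = min(best, float(d.min()))
    return best


# A trial structure would be rejected if this value is below the threshold
# (e.g. the 1.8 or 2.0 Å limits quoted in the text for the tested systems).
cubic = 4.0 * np.eye(3)
print(min_interatomic_distance(cubic, [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]))
```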

Total energy calculations and structure optimizations were carried out using DFT with the projector augmented wave method23 as implemented in VASP code.16 The internal atomic coordinates as well as the cell parameters were fully optimized (see Methods section for the details of DFT calculation).

### Tested systems

We performed test simulations of CSP for seven systems built from five typical materials: Si (8 and 16 atoms in the unit cell), NaCl (16 and 32 atoms), Y2Co17 (19 atoms), Al2O3 (10 atoms), and GaAs (16 atoms). We denote these systems by Si8, Si16, Na8Cl8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8. The most stable structures of Si, NaCl, Y2Co17, Al2O3, and GaAs are the diamond, rocksalt, Th2Zn17-type, corundum (α-Al2O3), and zinc blende structures, respectively. Y2Co17 is a ferromagnetic intermetallic compound and its space group is $$R\bar 3m$$.24 For GaAs, the wurtzite structure, whose energy is quite close to that of the zinc blende structure, is also taken into consideration as a stable structure.

We randomly prepared 500, 700, 500, 500, 700, 1000, and 1000 candidate structures for Si8, Si16, Na8Cl8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8, respectively, as listed in Table 1, and evaluated their total energies. When generating the candidate structures, we imposed constraints on the minimum interatomic distance. We used a 1.8 Å distance for Si16, Na8Cl8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8 and 2.0 Å for Si8. We compared the generated initial structures by calculating the fingerprint.21 For each system, almost all of the structures are different from each other. Although two structures in Na8Cl8 are the same, we performed local optimization for them as different structures. Figure 3 shows the locally optimized energies for all the generated initial structures of the Si8 (a), Si16 (b), Na8Cl8 (c), Na16Cl16 (d), Y2Co17 (e), Al4O6 (f), and Ga8As8 (g) systems. For each system, the differences of the optimized energies (eV/f.u.) from the energy of the most stable structure are plotted. For Al4O6, we show an enlarged view (eV/atom) in Fig. 3h, which corresponds to the black square region in Fig. 3f. The basic information for these systems is listed in Table 1. Eventually, 7, 3, 18, 17, 1, and 2 structures were optimized to the stable structure for Si8, Si16, Na8Cl8, Na16Cl16, Y2Co17, and Al4O6, respectively. For Ga8As8, two stable (zinc blende) structures and two wurtzite structures were obtained. We also collected the total energy and the force on the atoms at each local optimization step. The average numbers of local optimization steps over all structures are also listed in Table 1.

### Simulation results

To show the effectiveness of the proposed method, we performed CSP simulations on the calculated dataset of the Si8, Si16, Na8Cl8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8 systems. In the CSP trials, local optimization is controlled by different algorithms: random search (baseline method), LAQA, sLAQA, Greedy, and BO. Instead of actually generating structures randomly in each CSP trial, we evaluated the performance of each algorithm by randomly choosing from the prepared structures. Figure 4 illustrates the control and reduction of local optimization steps by using LAQA for the Y2Co17 system. Each line shows the energy as a function of the local optimization step from an initial structure. Bold black lines show the local optimization steps needed to reach the stable structure. Figure 4a shows all the local optimization steps of the Y2Co17 system. The total number of steps is 91,423. LAQA controls the local optimization steps as shown in Fig. 4b for the same system. The stable structure is already fully optimized after only 3300 steps (~3.6% of the total), and most unpromising structures are not optimized. Figures 4c and 4d are enlarged views of Figs. 4a and 4b, respectively.

Figure 5a shows the average number of total local optimization steps required to find the stable structure with random search (blue), LAQA (green), sLAQA (orange, red, purple), Greedy (light blue), and BO (yellow). For Ga8As8, we evaluated the performance of these algorithms in finding the stable structure (Ga8As8 (stable)), a wurtzite structure (Ga8As8 (wurtzite)), and at least one of either the stable structures or the wurtzite structures (Ga8As8 (both)). The values of total steps for LAQA and Greedy are calculated using all the prepared structures as the generated ones. We calculated the average numbers for sLAQA while changing the pooling number Np. We estimated them by repeating CSP trials 2000 times and averaging their total local optimization steps. To compare the performance of RS and BO with that of the other methods, we converted the average number of structures required to find a stable structure or a wurtzite structure into local optimization steps, based on the average number of local optimization steps for each system in Table 1. We also show the frequency distribution of the number of trials required to find the most stable structure using random search, LAQA, and sLAQA in Fig. 5b. The result of random search obtained from the expected values in frequency theory is shown as blue circles. The total number of frequencies is normalized to 100. Green bars and orange histograms show the results of LAQA and sLAQA, respectively. The number of steps to find a stable structure in sLAQA is not constant. This is because the set of initial structures for a given pooling number changes in each trial, and each structure that newly enters the pool is randomly chosen. The distributions of the results using sLAQA are biased toward smaller numbers of trials, which shows that the number of local optimization steps can be reduced.
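The conversion from a number of structures to a number of local optimization steps for the random search baseline can be illustrated with a short calculation. The sketch assumes that random search draws structures without replacement from the prepared set, in which case the expected number of draws needed to hit the first of K stable structures among N candidates is (N + 1)/(K + 1). Whether the reported baseline was computed in exactly this way is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def expected_structures(n_candidates: int, n_stable: int) -> Fraction:
    """Expected number of fully relaxed structures until the first stable one.

    Assumes sampling without replacement from n_candidates structures,
    n_stable of which relax to the stable structure.
    """
    if n_stable < 1 or n_stable > n_candidates:
        raise ValueError("need 1 <= n_stable <= n_candidates")
    return Fraction(n_candidates + 1, n_stable + 1)


def expected_steps(n_candidates: int, n_stable: int, avg_steps: float) -> float:
    """Convert the expected number of structures into optimization steps."""
    return float(expected_structures(n_candidates, n_stable)) * avg_steps


# Si16 values quoted in the text: 700 candidates, 3 stable structures,
# and an average of 92.81 local optimization steps per structure.
print(expected_structures(700, 3))    # 701/4
print(expected_steps(700, 3, 92.81))
```

Using a single average step count per structure ignores the fact that stable and unstable structures may need different numbers of steps, so the result is only a rough baseline.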

The total steps required to find the stable structure using LAQA were reduced by factors of 4.23, 10.1, 3.79, 4.78, 14.0, 2.02, and 6.85 compared to random search for Si8, Si16, Na8Cl8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8, respectively. For Ga8As8 (wurtzite) and Ga8As8 (both), the total steps were reduced by factors of 21.4 and 9.60, respectively. Since the proposed methods preferentially calculate structures expected to have lower final energies, not only the stable structures but also the wurtzite structures of Ga8As8 were obtained at an early stage. The results using Greedy were almost the same as those using LAQA for Si16 and Na8Cl8, but were worse for Si8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8. These differences result from the acceleration of the local optimization of stable structures by LAQA. Figure 6 shows how the order in which structures are fully optimized changes between Greedy and LAQA. Each dot shows the order of full optimization for an initial structure, i.e., how many structures were fully optimized before it, under each algorithm. Dark blue and green dots are stable and wurtzite structures, respectively. A structure plotted below the diagonal is one whose full optimization is reached earlier with LAQA than with Greedy. The blue dots in the green squares indicate stable structures whose local optimizations were preferentially performed by LAQA. The structure in the orange circle for Si16 was not preferentially relaxed compared to Greedy. While the initial energy of this structure was relatively high, the force exerted on each atom was small, and as a result it was fully relaxed only gradually, over a relatively large number of steps (166 steps). This is larger than the average for Si16 (92.81 steps) and than that of the structure in the black circle (44 steps). Therefore, the LAQA score became relatively high for the structure in the orange circle, so its relaxation did not proceed preferentially.
From these results, the total steps to find the stable structure and the wurtzite structures for Si8, Na16Cl16, Y2Co17, Al4O6, and Ga8As8 can be reduced by LAQA compared to Greedy. For Si16 and Na8Cl8 in Fig. 6, there are structures that are optimized immediately by both Greedy and LAQA (structures in black circles). Since LAQA requires some exploratory calculations to find promising structures using the LAQA score, the results of Greedy were better than those of LAQA for these two systems. However, even for these systems, the total steps using LAQA were reduced by factors of 10.1 and 3.79 compared to those of random search. These results indicate the efficiency of the local optimization control and of the rough estimation of the fully optimized energy in LAQA.

## Discussion

For most of the systems, the average number of steps using sLAQA decreases as the pooling number Np increases, as shown in Fig. 5. This is because the larger Np is, the higher the probability that stable structures are included from the beginning of the optimization. Note that even if the pooling number is increased, the performance of sLAQA differs from that of LAQA because sLAQA adds new structures. The performance of sLAQA for Si16 is relatively poor. This result is considered to stem from the facts that there were only two stable structures and that the stable one in the orange circle in Fig. 6 was not preferentially optimized by the LAQA score. Although the performance depends on the system, these results suggest that with a sufficiently large Np, sLAQA can achieve performance far better than random search.

LAQA and sLAQA showed better performance than BO in most cases, while the reduction in calculation achieved by BO relative to RS was not very large. We attribute the relatively low performance of BO to the fact that an initial structure and the structure after relaxation are different. In general, BO-based search assumes that a point (structure) in the search space and the value corresponding to that point are in one-to-one correspondence. In CSP, however, there are two structures: the initial one and the relaxed one. Although a calculated energy corresponds to a relaxed structure, we need to choose the structure to calculate next from among the candidate initial structures. This structural difference may adversely affect the performance of BO. Our approach, on the other hand, is completely different from BO, and the above-mentioned problem does not arise.

In the present study, we have proposed LAQA and its variant sLAQA, a novel approach for CSP that controls local optimization steps. Results on seven crystalline systems have demonstrated that LAQA and sLAQA can significantly reduce the total number of local optimization steps compared to random search. In future work, we plan to combine our approach with effective structure generation or selection methods such as EA, PSO, or BO. We would also like to apply this approach to other tasks, such as the identification of low-energy conformers of molecules.

## Methods

### Details of DFT calculation

We employed the generalized gradient approximation by Perdew, Burke, and Ernzerhof25 for the exchange-correlation functional. The internal atomic coordinates as well as the cell parameters were fully optimized until the forces acting on every atom became less than 0.01 eV/Å. The k-point meshes were automatically generated using pymatgen.26 For Si, a cutoff energy of 307 eV for the plane-wave expansion of the wave function and a k-point mesh density of 80 Å⁻³ for reciprocal cells were used. For NaCl, a cutoff energy of 328 eV and a k-point mesh density of 80 Å⁻³ were used. Y2Co17 was treated as a ferromagnet, and 335 eV and 100 Å⁻³ were employed for the cutoff energy and k-point mesh density, respectively. For Al2O3, a cutoff energy of 500 eV and a k-point mesh density of 100 Å⁻³ were used. For GaAs, a cutoff energy of 261 eV and a k-point mesh density of 100 Å⁻³ were used.

### Data availability

Initial structures and calculated data are available at http://www.tsudalab.org/files/csp_dataset.zip. Our implementation is available on GitHub at http://github.com/Tomoki-YAMASHITA/CrySPY.