The chemical and structural properties of atomically precise nanoclusters are of great interest in numerous applications, but predicting the stable structures of clusters can be computationally expensive. In this work, we present a procedure for rapidly predicting low-energy structures of nanoclusters by combining a genetic algorithm with interatomic potentials actively learned on-the-fly. Applying this approach to aluminum clusters with 21 to 55 atoms, we have identified structures with lower energy than any reported in the literature for 25 out of the 35 sizes. Our benchmarks indicate that the active learning procedure accelerated the average search speed by about an order of magnitude relative to genetic algorithm searches using only density functional calculations. This work demonstrates a feasible way to systematically discover stable structures for large nanoclusters and provides insights into the transferability of machine-learned interatomic potentials for nanoclusters.
Nanoclusters have drawn much attention due to their special physical and chemical properties1,2 which are distinct from molecules or bulk crystal materials. These properties make them useful in diverse research fields including catalysis3,4,5,6, chemical sensing7, fluorescence8,9, and medicine10. The unique properties of nanoclusters are largely the consequence of distinct size-dependent atomic structures, quantum finite-size effects, and very large surface-to-volume ratios11,12. These properties are generally not smooth functions of cluster sizes and can fluctuate with the addition or removal of a single atom13.
Computational screening is a promising way to identify nanoclusters with desirable properties, but to predict the properties of a nanocluster from first principles it is necessary to first identify the low-energy atomic structures of the cluster. Many optimization methods have been proposed to perform global structure searches for nanoclusters, including the basin hopping method14,15, unbiased random sampling16, particle swarm optimization17,18, simulated annealing19,20 and genetic algorithms (GA)21,22,23,24,25. Each of these methods involves the evaluation of the energies of a large number of candidate structures, which makes it critically important to evaluate structure energies with a method that is both fast and sufficiently accurate to distinguish between competing structures. Density functional theory (DFT) provides a high level of accuracy, but its speed and scalability typically limit the search to nanoclusters of sizes up to only several dozen atoms26,27,28,29. Classical interatomic potentials, which typically have simple functional forms derived from fundamental physics, are several orders of magnitude faster than DFT and have been used to search for ground state structures with up to a few hundred atoms14,30,31. However, classical interatomic potentials often lack the accuracy required to resolve the energy differences between competing candidate structures, especially for the low-lying local minima on the potential energy surface (PES) that are often only tens of meVs apart21,32.
In recent years, an alternative type of interatomic potential has emerged in the form of machine-learned interatomic potentials (MLIPs)33,34,35,36,37, which are parameterized by fitting to a set of training data. Examples of MLIPs are neural network potentials38,39,40, Gaussian approximation potentials (GAP)41,42,43, spectral neighbor analysis potentials44,45, moment tensor potentials (MTP)46,47,48, the atomic cluster expansion (ACE)49, and potentials found through symbolic regression50. Although they may be slower than classical interatomic potentials by an order of magnitude or more, MLIPs are generally more accurate and are still orders of magnitude faster than ab initio calculations51.
One of the challenges in using MLIPs to search for ground state structures is that because the ground state is unknown, it is difficult to ensure that the potential is constructed in a way that will yield accurate ground state energies. To address this challenge we use active learning52,53,54, in which the potential is trained adaptively with new data generated during the search. Recently, similar strategies have also been successfully applied using MLIPs for global optimization of bulk crystalline materials55,56 and nanoclusters57,58. Kolsbjerg et al. used actively-learned neural network potentials to identify the structures for small (up to 13-atom) Pt-based clusters on an MgO (100) support57. Tong et al. identified low-energy structures for B36, B40, and B84 clusters using a GAP potential trained on the fly58. Here we demonstrate that MTP, which have been shown to have a good balance between accuracy and speed for bulk materials51, can be used to accurately and efficiently identify low-energy structures for metal clusters from 21 to 55 atoms. As the properties of small metal clusters can change in a non-smooth way with cluster size, we further investigate the transferability of these potentials to clusters of varying size.
We demonstrate our approach by searching for low-energy structures of aluminum clusters. Aluminum nanoclusters are actively studied for applications such as catalysts for nitrogen dissociation59 and optoelectronics60. They are also important model systems in theoretical chemistry for metallic aromaticity61, magnetism of nanoclusters62 and superatoms63. We have discovered new aluminum cluster structures for 25 out of the 35 sizes that are at least 1 meV/atom lower in DFT-calculated energy than the lowest-energy structures we have found in the literature29,30,64,65,66. New lowest-energy structures for an additional two sizes were discovered by the DFT-only GA used for benchmarking. Our approach, described in detail below, provides a template that can be used to significantly accelerate the computational design of atomic clusters, and paves the way for determining atomic structures of large nanoclusters.
Hyperparameter selection for moment tensor potentials
Existing benchmarks of MTP on bulk crystalline structures51,67 give generally good sets of parameters for training reliable MTP, but little information is available on good parameters for training on clusters. To identify a good set of parameters for our calculations we used Al clusters with 24 atoms as a model system and tested various combinations of hyperparameters, including potential complexity (defined by the parameter levmax)48, the amount of training data, and the weight for force components (the “force weight”) relative to the weight for energies. One quarter of the structures were randomly selected for validation, with the rest used to train the potential. Additional details about the construction of this dataset are provided in the “Methods” section and Supplementary Note 1.
For a fixed training set, both energy and force errors decrease steadily as increasingly complex potentials are used (Fig. 1b). However, such a gain is at the expense of an exponential growth in training costs (Fig. 1c). Additional analysis of other combinations of hyperparameters (Supplementary Figs. 4–7) shows similar trends as in Fig. 1. To balance accuracy and training costs we used levmax = 14 and a force weight that was 1/1000 that of the energy weight for all subsequent active learning genetic algorithm (GA_AL) runs. Using these parameters, we found that force and energy errors plateaued after the training set exceeded about 1000 structures (Fig. 1d).
Prediction of structures for Al clusters with 21–40 atoms
We evaluated our approach by predicting structures of aluminum nanoclusters with 21–40 atoms (Fig. 2). The performance of the GA_AL algorithm was evaluated by comparing it with a GA that used only DFT to calculate energies (GA_DFT), where both algorithms were run for the same amount of computing time. For our initial evaluation, GA_AL was initialized with untrained potentials and a new potential was trained at each cluster size. Details of how we performed the comparison are provided in the “Methods” section.
For eight cluster sizes (21, 23, 24, 26, 28, 33, 35 and 36), mostly among the smaller clusters, GA_AL and GA_DFT found essentially the same lowest-energy clusters (Fig. 2a), with similarity scores, a measure of geometrical differences, below 0.3 (see the “Methods” section). For 10 out of the 20 sizes (25, 27, 30–32, 34, 37–40), GA_AL found clusters that were lower in energy than those found by GA_DFT by at least 1 meV/atom, with an average energy difference of −5.06 meV/atom (or −169.64 meV/cluster). For clusters of 22 atoms, GA_AL identified a distinct cluster with a calculated energy within 0.1 meV/atom of the lowest-energy cluster identified by GA_DFT. The lowest-energy 33-atom cluster found by GA_AL is 1.47 meV/atom lower in energy than the one found by GA_DFT, but it is structurally similar based on both the similarity score and visual inspection (see Supplementary Fig. 16). Therefore it is not counted as a new lowest-energy cluster. There was only one cluster size (29 atoms) for which GA_DFT found a distinct cluster with lower energy than that found by GA_AL. For this size the cluster found by GA_DFT was lower in energy by 3.61 meV/atom (104.69 meV/cluster). On average, the energies of structures found by GA_AL are lower by 2.43 meV/atom (82.27 meV/cluster).
To quantify how much more quickly the GA_AL approach finds low-energy structures, we define the “acceleration ratio” as the ratio of the time it took GA_DFT to find its lowest-energy structure to the time it took GA_AL to find a structure with an energy at least as low. The time counted for GA_AL includes the time spent in the GA search, the time required to generate training data and evaluate low-energy clusters using DFT, and the time spent retraining the interatomic potential. In the case of size 29, GA_AL failed to discover better or equivalent configurations, so the ratio is set to 0. Among the remaining sizes, the acceleration ratio ranged from 0.19 to 9.12. The average acceleration ratio across all 20 sizes is 2.29 with a median of 1.80 (Fig. 2b). GA_DFT often did not find a cluster with energy as low as that found by GA_AL (Fig. 2b), suggesting that if the acceleration ratios were based on the time required to find the lowest-energy structure from both algorithms they would be larger. Energy evolution plots illustrating the acceleration of GA_AL relative to GA_DFT are provided in Supplementary Fig. 12.
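The acceleration ratio defined above can be expressed as a short sketch. The function names and the representation of a run as a list of (elapsed time, best energy so far) pairs are illustrative assumptions and are not taken from the code used in this work.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: each run is a time-ordered list of
# (elapsed_time, best_energy_found_so_far) pairs.
Trace = List[Tuple[float, float]]


def time_to_reach(trace: Trace, target_energy: float) -> Optional[float]:
    """Return the first elapsed time at which the run reached target_energy or lower."""
    for elapsed, best_energy in trace:
        if best_energy <= target_energy:
            return elapsed
    return None  # the run never reached the target energy


def acceleration_ratio(dft_trace: Trace, al_trace: Trace) -> float:
    """Time for GA_DFT to find its best structure divided by the time for
    GA_AL to find a structure at least as low in energy.

    The GA_AL times are assumed to already include GA search, DFT
    evaluations and retraining. Returns 0 if GA_AL never matches GA_DFT.
    """
    dft_best = min(energy for _, energy in dft_trace)
    t_dft = time_to_reach(dft_trace, dft_best)
    t_al = time_to_reach(al_trace, dft_best)
    if t_al is None:
        return 0.0
    return t_dft / t_al
```

With this convention, a ratio above 1 means GA_AL reached the GA_DFT result sooner, and a ratio of 0 marks sizes such as 29 where GA_AL did not reach it.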
Size-transferable interatomic potentials for nanoclusters
The results presented in the previous section were obtained by training a new potential at every cluster size, as there is a risk that a potential trained on one cluster size might not work well for clusters of another size because the properties of atomic clusters can change discontinuously with the number of atoms in the cluster. However, using a potential trained at one size to find structures of a different size could significantly speed up the structure search by reducing the total amount of training data that must be generated. In particular, using potentials trained with smaller clusters to predict the structures of larger clusters can have significant performance advantages, as the cost of generating training data using DFT typically scales as approximately the cube of the number of valence electrons in the cluster68.
We examined how accurately potentials trained on clusters of a range of small sizes are able to predict the energies of clusters with larger sizes. Training data were separated into a group of 3000 clusters with even numbers of atoms (22, 26, 30, 34, and 38) and another group of 3000 clusters with odd numbers of atoms (21, 25, 29, 33, and 37), as DFT calculations indicate that even-sized clusters and odd-sized clusters have distinct ground state magnetic moments (see Supplementary Note 3). A third training set, also of 3000 clusters, was selected from a set containing all sizes listed above, odd and even. The validation sets were composed of about 3000 clusters for each cluster size from 50 to 55 atoms. Details of the construction of the training and validation sets can be found in the “Methods” section.
All three mixed-size potentials predicted energies of the large clusters with validation errors (~10 meV/atom) comparable to training errors (Fig. 3a, b). The errors in the predicted forces (~165 meV/Å) were slightly worse than the fitting errors. These errors are similar to the training and validation errors achieved when all of the training and validation data consisted of clusters of 24 atoms (Fig. 1). The validation errors are below those achieved on silicon42 and boron58 clusters with GAP potentials, but some of this difference is likely due to the natures of the different elements. Mixing training data with different magnetic moments did not have a significant adverse effect on model predictions (Fig. 3). The potential trained on clusters with odd numbers of atoms has slightly smaller prediction errors for both force and energy than the potential trained on clusters with even numbers of atoms regardless of whether the validation set contains even-sized or odd-sized clusters. The potential trained with both even and odd clusters has larger energy training errors than both of the even and odd potentials, but energy validation errors between the validation errors for even and odd potentials. For forces, the potential trained with both even and odd clusters has the lowest training and validation errors in all cases.
For comparison, we evaluated the transferability of potentials trained on clusters of a single size. For each size, the training data consisted of 3000 dissimilar clusters, with the exception of clusters with 21 atoms (the smallest size), for which our training set only had 2136 clusters after removing structurally similar clusters (see “Methods”). For potentials trained on clusters of a single size, training and validation errors were similar for forces, and potentials trained on a single size may predict forces with significantly lower errors than the potentials trained on a mixed set of sizes. However, validation errors for energies are notably worse than the training errors, especially for potentials trained on small clusters (Fig. 3a, b). The accuracy strongly depends on the size of the clusters in the training set, with larger sizes having the lowest errors. This suggests that quantum finite-size effects may be particularly pronounced for clusters with fewer than about 30 atoms, limiting the extent to which potential models trained at these sizes can be transferred to larger sizes. The training algorithm may also have a difficult time determining how the undercoordination of surface atoms in a cluster affects its energy when all clusters in the training set have approximately the same surface area. In contrast, training sets with a mixture of cluster sizes provide more information on how the energy is affected by the cluster surface area, which may improve the prediction accuracy for clusters of varying sizes. Parity plots of energies and force components of both single-size and mixed-size potentials can be found in Supplementary Figs. 8 and 9.
Because of the particular importance of identifying the structures of low-energy clusters, we evaluated the potentials on the lowest-energy structures we found or collected from the literature with 50 to 55 atoms (see Supplementary Fig. 10). The diversity of the training sets with mixed cluster sizes proved beneficial for identifying low-energy clusters, as the MTP extrapolation grades (see Methods) for the lowest-energy clusters in the validation set are less than one, suggesting interpolation, with respect to the training sets of odd-sized and even-sized clusters. On the other hand, the low-energy structures had extrapolation grades above 1, suggesting extrapolation, with respect to the training sets of single-sized clusters. Accordingly, the mixed-size potentials had much lower energy errors than the single-size ones.
Prediction of structures for Al clusters with 41–55 atoms
To identify low-energy structures with 41–55 atoms, we used the mixed-size potentials trained on clusters with odd or even numbers of atoms to initialize GA_AL searches for clusters with an odd or even number of atoms respectively. Initializing the GA_AL algorithm with a pre-trained potential demonstrated significant performance advantages (Fig. 4). On average GA_AL explored 144 times as many clusters as GA_DFT during the calculation time of the GA_AL run, with an average acceleration ratio of 12.06 (and a median of 8.86) compared to GA_DFT (Table 1). To calculate the acceleration ratios, the time spent pre-training the potential for GA_AL was also included in the total GA_AL time. However, the one-time cost of generating the training data used for pre-training was not, as there is no incremental cost of generating this data for each new size. For sizes 45 and 54, the acceleration ratio was set to 0 since GA_AL failed to discover configurations better than or equivalent to those found by GA_DFT. The acceleration ratios for the remaining sizes ranged from 1.24 to 71.21, which is consistent with the spread of nearly two orders of magnitude we observed for the runs initialized on untrained potentials. The full set of times to solutions for both GA_DFT and GA_AL are plotted on a log scale in Supplementary Fig. 15.
The sizable increase of acceleration ratios for GA_AL with pre-trained potentials can be credited to a significant reduction in the number of times DFT is called for learning on the fly in the early stages of the search. For GA_AL runs initialized with untrained potentials, clusters in the early stages of the runs tend to have high energies (Fig. 4a), so training steps in the early stages are sampling a relatively high-energy region of configuration space. For GA_AL runs initialized with well-trained potentials, computational resources are more efficiently spent exploring the low-energy configurations.
We examined the quality of the ground state configurations discovered using GA_AL for clusters of 21–55 atoms by comparing them with the lowest-energy structures that have been previously reported for aluminum clusters. Here we only consider studies for which we were able to find the atomic coordinates of the discovered structures29,30,64,65,66. All structures collected from the literature were reoptimized by DFT using the same settings as those used in GA_AL. For 25 of the 35 sizes, GA_AL found structures at least 1 meV/atom lower in energy than the lowest-energy structure in the literature, and for another 7 sizes it identified the same lowest-energy structures as were available in the literature. For clusters of 22 atoms, GA_DFT found a structure that is structurally distinct from the lowest-energy structure reported in the literature29 (and was rediscovered by GA_AL), but has only slightly lower energy (by 0.15 meV/atom). The GA_DFT algorithm discovered structures lower in energy than those discovered by GA_AL or reported in the literature for 2 sizes (29 and 54). For clusters of 45 atoms, GA_DFT rediscovered the best-known structure from the literature but GA_AL did not. Detailed results are provided in Fig. 5 and Supplementary Table 6. A complete panel of lowest-energy clusters with 21 to 55 atoms can be found in Supplementary Fig. 17, and coordinates of these clusters have been published on the Novel Materials Discovery (NOMAD) repository69,70 and listed in Supplementary Dataset 1.
Structures found using GA_AL have lower energies than the lowest-energy literature structures by an average of 16.81 meV/atom, with a maximum of 51.81 meV/atom at size 36 (1.87 eV/cluster). For the five sizes for which GA_AL did not discover the best structures (Fig. 5a), the energies are no more than 4.0 meV/atom above those of the structures with the lowest known energies.
Cluster structure analysis
Low-energy aluminum clusters already start to show morphological regularity in the size range studied in this work (Supplementary Fig. 17). Favorable structures of low-lying clusters include layered close-packed structures (sizes 40–43) and tetrahedra with close-packed surfaces (sizes 35–37, 54–55). Tetrahedra are favored when they can form mostly closed shells of atoms. These structure types reflect the facts that aluminum has an FCC crystal structure in its bulk phase and (111) surfaces have the lowest free surface energies among low-index facets71. Analysis of cohesive energies shows a peak at size 36 (Supplementary Fig. 18), indicating the cluster with 36 atoms is highly stable relative to the clusters of neighboring sizes. It has a perfect tetrahedral shape and the highest degree of symmetry (D2d) among the lowest-energy clusters. Experimental studies showed a surprisingly high melting temperature, close to the bulk melting temperature, for Al cation clusters with around 37 atoms72, consistent with the cohesive energy peak we observed. More details can be found in Supplementary Note 6.
The GA_AL approach presented here has clear advantages, including about an order of magnitude acceleration compared to GA_DFT on average. The advantages of the active learning approach are apparent from a comparison with the work of Tuo et al.66, who used a neural-network potential trained on DFT calculations in a GA to search for low-energy aluminum clusters. However, the potential they used was not retrained on the fly, and the DFT-calculated energies of the structures they discovered are significantly higher than the ones discovered by our approach (Fig. 5) and higher than those discovered by a DFT search by Aguado and López29. The difference is likely due to the inability of the machine-learned potential to extrapolate accurately to structural motifs that were not present in the original training data.
Due to the stochastic nature of GA, the performance advantage varied by nearly two orders of magnitude across systems, and in a small number of cases GA_DFT outperformed GA_AL. The performance of the active learning approach depends on the particular implementation of the algorithm, and there are several potential areas for improvement for the GA_AL approach used here. One is the relatively high energy prediction errors of MTP for nanoclusters compared with bulk systems. Benchmarks by Zuo et al.51 showed that MTP has energy errors generally less than 5 meV/atom and sometimes even lower than 1 meV/atom for bulk elemental systems. However, for nanoclusters, as exhibited in Fig. 1, validation energy errors are on the order of 10 meV/atom for small clusters with 24 atoms. They can be lowered by using smaller force weights, but the improvement comes at the expense of driving up force errors, which increases the possibility of creating artificial local minima. The relatively large energy errors increase the chance that the energy of the lowest-energy cluster is overestimated and never enters the pool. To mitigate this risk we used a relatively large pool size (25 clusters) to expand the energy window of pool clusters and raise the chance of the lowest-energy cluster being captured in the pool. A large pool has also been shown to increase the success rate of identifying the lowest-energy isomer due to the structural diversity of the pool22, although at the expense of slowing convergence speed21,22. An alternative approach would be to use a machine learning framework that results in a more accurate interatomic potential. Very recently, Lysogorskiy et al. have demonstrated an efficient implementation of the ACE, which was shown to be faster and more accurate than MTP on bulk copper and silicon systems73. The ACE approach could also be promising for nanoclusters and is worth evaluating in future work.
We found that the energy window of pool clusters became narrower as the structure search continued, implying a high density of metastable states with energies close to the global minimum, especially for large clusters. This is not unexpected since the dimension of configuration space dramatically increases as system size grows. The relatively narrow energy window increases the chance of the pool missing the lowest-energy isomer, as the window size may be comparable to the error in MTP energy predictions. A possible workaround is to run GA_AL and then use the discovered low-energy clusters to seed a GA_DFT search. This would consume additional computational resources but decrease the uncertainty in the proposed lowest-energy structures.
Another area for improvement is the relationship between the extrapolation grades (used to identify structures that trigger retraining) and prediction errors. A high extrapolation grade normally implies an energy evaluation with high uncertainty, but a low grade does not necessarily guarantee an accurate prediction (see Supplementary Fig. 19). In practice, we addressed this challenge by starting DFT re-optimization and retraining whenever the majority of clusters in the pool had MTP-calculated energies but not DFT-calculated energies. An alternative approach would be to implement similarity-based measurements of uncertainty, which might more accurately identify structures for which the prediction errors are likely to be large.
The calculation of extrapolation grades was a significant portion of the overall computational cost of the GA_AL algorithm, taking on average about 10% of total wall-time (Supplementary Fig. 20). The routines for calculating the extrapolation grade in the MLIP package were not parallelized, so this portion of the algorithm would run on a single processor while the other processors reserved for the workflow sat idle. Having a parallelized and internal implementation of the grade calculation could considerably reduce the cost of this step and substantially increase the acceleration ratios of GA_AL against GA_DFT.
The choice of exchange-correlation functional (or other sources of inaccuracy in DFT calculations) could also affect the energy ranking of low-lying clusters. Galvão and Viegas showed that the lowest-energy clusters from more accurate functionals are generally already included in the set of 5–10 lowest-energy clusters found by less accurate functionals32. For this reason, we list the ten lowest-energy clusters of each size from 21 to 55 atoms in Supplementary Dataset 1 and have also published them on the NOMAD repository69,70.
Although we have demonstrated that potentials trained on small clusters can be used to predict the structures of clusters about twice as large, it is not clear how well these potentials will work on significantly larger particles. If transferability can be retained up to larger clusters, the methods we have presented could be used to efficiently create comprehensive datasets of cluster structures for small particles with structures that cannot be simply described as that of a truncated crystal.
We quantify geometric similarity between two cluster structures of the same size by a similarity score calculated using an approach based on the spectral decomposition of extended distance matrices74. The score is non-negative and a smaller value implies higher similarity. Identical clusters have a score of 0, and visually distinguishable clusters typically have a score above about 0.3. The similarity measure is used to prevent geometrically similar clusters from being simultaneously included in the pool, which can improve the efficiency of GA22, and to select diverse training data for the MTP56.
A GA is a global optimization method inspired by the principles of natural selection75. We developed our own code based on the pool-based Birmingham Parallel Genetic Algorithm24 with some variations. A pool of low-energy clusters of fixed size is maintained during the search. Initial clusters are generated by randomly distributing atoms in space. Once the pool is filled, genetic operations, namely, crossover and mutation, are applied to parent clusters selected from the pool to generate child clusters. Child clusters that are dissimilar to all pool clusters and have a lower energy than the pool cluster with the highest energy will replace the highest-energy pool cluster. Additional details of the GA can be found in the Supplementary Method.
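The pool replacement rule described above can be sketched as follows. This is an illustrative sketch only: the `Cluster` class, the `try_insert` function, the use of 0.3 as the dissimilarity cutoff for the pool, and the handling of the pool while it is still filling are assumptions made here, and the similarity function is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cluster:
    label: str      # hypothetical identifier for the structure
    energy: float   # total energy of the relaxed cluster


def try_insert(pool: List[Cluster],
               child: Cluster,
               similarity: Callable[[Cluster, Cluster], float],
               pool_size: int = 25,
               threshold: float = 0.3) -> bool:
    """Try to add a child cluster to the pool; return True if it was added.

    A child is rejected if it is similar (score below threshold) to any
    pool member. Otherwise it fills an empty slot or replaces the
    highest-energy pool member, provided its own energy is lower.
    """
    if any(similarity(child, member) < threshold for member in pool):
        return False  # near-duplicate of an existing pool cluster
    if len(pool) < pool_size:
        pool.append(child)  # assumption: pool still filling
        return True
    worst = max(range(len(pool)), key=lambda i: pool[i].energy)
    if child.energy < pool[worst].energy:
        pool[worst] = child
        return True
    return False
```

Checking similarity before energy keeps the pool structurally diverse even when a near-duplicate happens to have a slightly lower energy; whether the original code orders the checks this way is not stated in the text.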
Genetic algorithm with actively learned interatomic potentials
To accelerate the GA search for new stable nanoclusters, we use MLIPs (to improve speed) trained on-the-fly using active learning (to maintain accuracy). We refer to this combination of GA and active learning as “GA_AL”. The active-learning query strategy uses the generalized D-Optimality criterion implemented in the MLIP package47,48, which assigns unlabeled data an “extrapolation grade” based on a measure of the extent to which the unlabeled data is outside of the space spanned by the training data. An extrapolation grade above 1 implies extrapolation relative to the current training set and large errors should be expected, while a value below 1 indicates interpolation48.
The GA_AL runs batch retraining cycles and maintains a waitlist of structures to be included in the next cycle (Fig. 6). Two extrapolation grade thresholds are used when determining whether a newly-generated cluster should be added to the waitlist. The first threshold, γbreak, is used to screen clusters before relaxation using MTP. The trained potential may struggle to relax clusters with extrapolation grades above this threshold, so they are automatically added to the waitlist. Structures with extrapolation grades below γbreak are relaxed. If the extrapolation grade of the relaxed structure is greater than the second threshold, γselect, then it too is added to the waitlist. In this work γbreak was set to 10 for GA_AL initialized with an untrained potential, which is the default value recommended by the MTP code48. A looser value of 1000 was used for the pre-trained potential, as it is not as important to add training data to a potential that has already been trained. As-generated clusters with extrapolation grades about 1000 typically cannot be evaluated accurately by MTP potentials, but we found they could still be relaxed by MTP to reasonable configurations. DFT relaxations were started from configurations pre-relaxed using MTP to reduce computational costs. The parameter γselect was set to 1.01 for all searches. When the waitlist reaches a user-defined size (here set to 5), the GA is paused and a retraining cycle begins. All new clusters in the pool as well as clusters on the waitlist are relaxed using DFT and added to the training set. Before retraining the potential, a similarity screening is applied to select the most geometrically diverse set of configurations from all relaxation steps (discussed below), which maximizes structural diversity and reduces training cost.
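The two-threshold screening and the waitlist trigger can be summarized in a short sketch. The function names are hypothetical, and `grade` and `relax` are caller-supplied stand-ins for the extrapolation-grade calculation and the MTP relaxation; they are not calls into the MLIP package.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

S = TypeVar("S")  # any representation of a cluster structure


def screen_candidate(structure: S,
                     grade: Callable[[S], float],
                     relax: Callable[[S], S],
                     gamma_break: float = 10.0,
                     gamma_select: float = 1.01) -> Tuple[S, bool]:
    """Return (structure, waitlisted) for one newly generated cluster."""
    if grade(structure) > gamma_break:
        # Too far outside the training set to relax with the potential.
        return structure, True
    relaxed = relax(structure)
    # A relaxed structure that is still extrapolating is also waitlisted.
    return relaxed, grade(relaxed) > gamma_select


def run_until_retraining(candidates: Iterable[S],
                         grade: Callable[[S], float],
                         relax: Callable[[S], S],
                         waitlist_size: int = 5,
                         gamma_break: float = 10.0,
                         gamma_select: float = 1.01) -> Tuple[List[S], List[S]]:
    """Screen candidates until the waitlist is full, then stop.

    Returns (accepted, waitlist). A full waitlist is the point at which
    the GA would pause and a retraining cycle would begin.
    """
    accepted: List[S] = []
    waitlist: List[S] = []
    for structure in candidates:
        result, flagged = screen_candidate(
            structure, grade, relax, gamma_break, gamma_select)
        (waitlist if flagged else accepted).append(result)
        if len(waitlist) >= waitlist_size:
            break
    return accepted, waitlist
```

For a search started from a pre-trained potential, the same sketch applies with `gamma_break=1000.0`, matching the looser value quoted above.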
Because of the uncertainty in MTP-predicted energies, there is a risk that the pool over time becomes polluted with structures with erroneously low MTP-predicted energies. To mitigate this risk, a retraining cycle is also started whenever a majority of the clusters in the pool (>50%) have energies that were calculated using MTP and not DFT. GA_AL is considered to be converged when no cluster with an energy lower than the lowest-energy pool cluster has been found for 4000 new clusters.
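A minimal sketch of the pool-based retraining trigger and the convergence test follows. The function names and the encoding of each pool member's energy source as the string "mtp" or "dft" are assumptions made for illustration.

```python
from typing import Sequence


def pool_needs_retraining(energy_sources: Sequence[str]) -> bool:
    """True when more than half of the pool energies come from MTP.

    energy_sources holds one entry per pool cluster, either "mtp" or
    "dft" (hypothetical labels), naming the method behind its energy.
    """
    n_mtp = sum(1 for source in energy_sources if source == "mtp")
    return n_mtp > 0.5 * len(energy_sources)


def is_converged(clusters_since_improvement: int, patience: int = 4000) -> bool:
    """True when no new lowest-energy cluster has appeared for `patience` new clusters."""
    return clusters_since_improvement >= patience
```

The strict inequality mirrors the ">50%" condition in the text, so a pool split exactly in half does not trigger retraining in this sketch.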
When initializing GA_AL with pre-trained potentials, it is beneficial to switch off retraining at the beginning of the search. In this approach, extrapolating clusters are discarded and new ones are regenerated until they are interpolating, allowing the GA to more fully explore the PES of the pre-trained potential. We did this for the first 5000 clusters in GA_AL runs initialized with mixed-size potentials when generating clusters with 41–55 atoms. Additional discussion and justification for this approach are provided in the Supplementary Method.
Moment tensor potentials
We used the MLIP package48 to train MTP. The hyperparameters for training include potential complexity, energy weight, force weight and stress weight. Potential complexity is characterized by the maximum level of moments, levmax, of basis functions46,48. The energy weight was always set to 1, so the force weight can be seen as the weight of force components relative to the weight of the energy. The stress weight was set to 0 since it is irrelevant in the case of clusters due to the lack of a lattice. We generated potentials with levmax = 14 and a force weight of 1/1000 relative to the energy weight, to balance accuracy and training cost, as shown in the “Results” section and Supplementary Note 1. The inner and outer cutoff radii defining the local atomic neighborhood were set to the default values of 2 and 5 Å, and eight radial basis functions were used48. A maximum of 5000 training iterations was allowed for potential fitting. This limit was never reached, as the maximum number of training iterations in any GA_AL run was 1399.
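For orientation, the role of the three weights can be written schematically as a weighted least-squares objective. This is a sketch of the general form of such a fit, not a statement of the exact objective or normalization implemented in the MLIP package; the symbols K (number of training configurations), N_k (number of atoms in configuration k) and θ (potential parameters) are notation introduced here.

```latex
\min_{\theta} \sum_{k=1}^{K} \Big[
    w_e \big( E^{\mathrm{MTP}}_{k}(\theta) - E^{\mathrm{DFT}}_{k} \big)^{2}
  + w_f \sum_{i=1}^{N_k} \big\lVert \mathbf{f}^{\mathrm{MTP}}_{k,i}(\theta) - \mathbf{f}^{\mathrm{DFT}}_{k,i} \big\rVert^{2}
  + w_s \big\lVert \sigma^{\mathrm{MTP}}_{k}(\theta) - \sigma^{\mathrm{DFT}}_{k} \big\rVert^{2}
\Big],
\qquad w_e = 1, \quad w_f = 10^{-3}, \quad w_s = 0 .
```

With the stress weight set to zero the last term drops out, so the fit balances only energy and force residuals through the ratio of the force weight to the energy weight.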
Training data selection for pre-trained potentials
To select the training data for the pre-trained mixed-size potentials, we combined a diversity-based strategy, which selects structurally diverse structures from DFT relaxations, with an energy-based strategy, which improves accuracy in the low-lying regions of the PES. All DFT calculations were collected from GA_DFT and GA_AL runs on clusters with 21–40 atoms. For each of the constituent cluster sizes, a relaxation trajectory was kept only if its local ground state had similarity scores larger than 0.3 (that is, was sufficiently dissimilar) with respect to the local ground states of all other trajectories already included in the training set. Within each trajectory, only mutually dissimilar ionic steps were selected. We accomplished this by including the fully relaxed structure and iterating backwards through the relaxation until encountering a structure with a similarity score, relative to the most recently added structure, of at least 0.3. That structure was then added to the training set, and we repeated this procedure until all relaxation steps were exhausted. This diversity-based strategy was applied throughout this work in preparing both training and validation sets.
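The backward walk through a relaxation trajectory can be sketched as below. This is a minimal sketch under stated assumptions: `score` is a placeholder for the similarity score used in this work (assumed here to grow as two structures become less alike, consistent with the thresholds above), and the function name is hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

Structure = TypeVar("Structure")


def select_dissimilar_steps(
    trajectory: Sequence[Structure],
    score: Callable[[Structure, Structure], float],
    threshold: float = 0.3,
) -> List[Structure]:
    """Pick dissimilar ionic steps, walking backwards from the relaxed structure.

    `trajectory` is ordered from the first ionic step to the fully relaxed one.
    A step is kept when its score relative to the most recently kept structure
    is at least `threshold`.
    """
    if not trajectory:
        return []
    selected = [trajectory[-1]]           # always keep the fully relaxed structure
    for step in reversed(trajectory[:-1]):
        if score(step, selected[-1]) >= threshold:
            selected.append(step)         # this step becomes the new reference
    return selected
```

Because each step is compared only with the most recently kept structure, the cost is linear in the number of ionic steps.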
Following the similarity screening, we applied an energy-based selection strategy. First, all structures that passed the diversity screening were grouped into sets based on the number of atoms in the cluster. Of the training data selected from each set, 50% consisted of the structures with the lowest energies, 10% consisted of the structures with the highest energies, and the remaining 40% were randomly picked from the remaining ionic steps. The inclusion of high-energy training data ensures that relaxation by MTP does not lead to physically unrealistic high-energy configurations.
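The 50/10/40 split for a single cluster size can be sketched as follows. The function name, the rounding of the quotas and the fixed random seed are illustrative assumptions; only the three fractions come from the text.

```python
import random
from typing import List, Sequence


def energy_based_selection(
    energies: Sequence[float],
    n_select: int,
    low_fraction: float = 0.5,
    high_fraction: float = 0.1,
    seed: int = 0,
) -> List[int]:
    """Return indices of the structures selected for one cluster size.

    The lowest-energy and highest-energy quotas are filled first; the rest
    is drawn at random from the structures in between.
    """
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    n_select = min(n_select, len(order))
    n_low = int(round(low_fraction * n_select))
    n_high = int(round(high_fraction * n_select))
    n_random = n_select - n_low - n_high

    lowest = order[:n_low]
    highest = order[len(order) - n_high:] if n_high else []
    middle = order[n_low:len(order) - n_high]

    rng = random.Random(seed)             # seeded only to make the sketch reproducible
    randoms = rng.sample(middle, n_random)
    return lowest + highest + randoms
```

For a mixed-size training set, this function would be called once per cluster size with the per-size quota.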
A total of 3000 structures were selected for each of the training sets, with one exception noted below. For potentials trained with clusters of multiple sizes (the ones labeled by “odd”, “even” and “all” in Fig. 3), equal numbers of structures were chosen from each constituent size. For potentials trained with clusters of a single size, all 3000 structures were chosen from clusters of that size. The exception is the potential trained only on 21-atom clusters: only 2136 structures remained after the diversity filtering of DFT calculations, and all of them were included in the training set.
Validation data selection for pre-trained potentials
We collected validation data for mixed-size potentials from GA_DFT runs on clusters with 50–55 atoms. The diversity-based strategy discussed above was used to select a structurally diverse set of structures. The validation sets contain around 3000 structures for each size. The validation data cover a wide range of energies and forces to ensure a thorough validation of potentials in a variety of atomic environments. The average force components are about 0.08 eV/Å with maxima of about 5 eV/Å across validation cluster sizes. Details of the validation dataset can be found in Supplementary Tables 2 and 3.
Training data selection for on-the-fly retraining in GA_AL
We also used similarity filtering, as described above, to select distinct clusters from each relaxation trajectory for on-the-fly retraining during active learning. We used a tight similarity threshold of 0.3 for small to medium clusters (21–40 atoms) and a looser threshold of 0.15 for large clusters (41–55 atoms). The looser threshold is meant to increase the fraction of available data added to the training set at each retraining cycle, as relaxing large clusters is computationally more expensive. We did not check similarity between new training data and all existing data on-the-fly, as this is computationally costly.
Comparison of GA_AL against GA_DFT
Both GA_AL and GA_DFT were run using the same set of GA parameters (see Supplementary Method) on all 24 cores of Intel E5-2680 V3 processors. The GA_AL runs were performed until 4000 consecutive new clusters had been generated without identifying a new lowest-energy cluster. For a fair comparison, GA_DFT runs were performed for at least the same amount of time as the GA_AL runs. For large clusters with 50–55 atoms, GA_DFT searches were executed for a much longer time of 21 days to reach energy levels comparable to those of the GA_AL runs (see also Supplementary Method). The only difference between the GA_AL and GA_DFT algorithms was that GA_AL used both MTP retrained with active learning and DFT for relaxation, whereas GA_DFT used only DFT for relaxation. Low-energy structures identified in GA_AL were re-optimized using DFT at the retraining stage, and only DFT-evaluated energies were reported at the end, to ensure an ab initio level of accuracy.
To determine whether a new low-energy structure had been found, we considered both the DFT-calculated energy and similarity scores. A cluster that is lower in energy by at least 1 meV/atom and at the same time has a similarity score greater than 0.3 relative to the existing lowest-energy structure is identified as a new lowest-energy cluster. Clusters that have a total energy within 1 meV/atom of the existing lowest-energy cluster but are structurally dissimilar are considered energetically similar clusters and are not counted as new lowest-energy clusters. Borderline cases were inspected manually. More details of how we determined whether the algorithm had found a new lowest-energy structure can be found in Supplementary Note 5.
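The decision rule above can be written as a small classifier. This is a sketch under stated assumptions: the function name and the three return labels are hypothetical, the similarity score is assumed to be supplied by an external routine, and borderline cases (which were inspected manually in this work) are not treated specially.

```python
def classify_candidate(
    e_new: float,
    e_best: float,
    n_atoms: int,
    score_to_best: float,
    energy_tol: float = 1e-3,       # 1 meV/atom expressed in eV/atom
    score_threshold: float = 0.3,
) -> str:
    """Classify a DFT-evaluated candidate against the current lowest-energy cluster.

    `e_new` and `e_best` are total energies in eV for clusters of `n_atoms` atoms.
    `score_to_best` is the similarity score between the two structures.
    """
    de_per_atom = (e_new - e_best) / n_atoms
    dissimilar = score_to_best > score_threshold
    if de_per_atom <= -energy_tol and dissimilar:
        return "new_lowest"                 # lower by at least 1 meV/atom and distinct
    if abs(de_per_atom) < energy_tol and dissimilar:
        return "energetically_similar"      # distinct structure, nearly the same energy
    return "not_new"
```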
All DFT calculations were carried out using the Vienna ab initio simulation package (VASP)68,76,77,78 with the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional79,80,81. The projector augmented wave (PAW) dataset shipped with VASP with the title “PAW_PBE Al 04Jan2001” was used82,83. Reciprocal space was sampled by a single k-point at the Γ point, and the kinetic energy cutoff for the plane-wave basis was set to 240 eV. The electronic self-consistency loop was considered converged when subsequent steps had an energy difference below 10⁻⁵ eV, and the convergence criterion for ionic relaxation was set to a force difference below 0.01 eV/Å. Our dataset84 shows that ground state Al nanoclusters with even sizes above 18 and odd sizes above 7 have net spins of 0 μB and 1 μB, respectively (see Supplementary Note 3). Therefore, all ab initio calculations fixed the magnetic moment to 0 μB for even-sized clusters and to 1 μB for odd-sized clusters using the NUPDOWN parameter in VASP. Periodic images of clusters were separated by a vacuum of at least 10 Å to avoid spurious interactions (see convergence tests in Supplementary Note 9).
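As an illustration, the settings above map onto standard VASP INCAR tags roughly as follows. This is a hedged sketch, not the input used in this work: the helper names are hypothetical, `ISPIN = 2` and the negative sign of `EDIFFG` (a force-based criterion in VASP) are assumptions, and the Γ-only k-point sampling and vacuum spacing are set in separate files (KPOINTS and POSCAR) that are not shown.

```python
def vasp_incar_tags(n_atoms: int) -> dict:
    """Build INCAR tags for an Al cluster with `n_atoms` atoms."""
    return {
        "GGA": "PE",              # PBE exchange-correlation functional
        "ENCUT": 240,             # plane-wave cutoff in eV
        "EDIFF": 1e-5,            # electronic convergence in eV
        "EDIFFG": -0.01,          # assumed force criterion in eV/Angstrom
        "ISPIN": 2,               # assumed: spin-polarized, needed for a fixed moment
        "NUPDOWN": n_atoms % 2,   # 0 for even-sized clusters, 1 for odd-sized ones
    }


def format_incar(tags: dict) -> str:
    """Render the tags as INCAR text, one `KEY = value` pair per line."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items())
```

For example, `format_incar(vasp_incar_tags(21))` yields a block whose last line is `NUPDOWN = 1`.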
Structures of the ten lowest-energy clusters at each size from 21 to 55 atoms have been published in the NOMAD repository69,70 at https://doi.org/10.17172/NOMAD/2022.06.27-1. They are also listed in Supplementary Dataset 1 for the reader’s convenience. The mixed-size MTP potentials used in GA_AL calculations with pre-trained potentials are provided in Supplementary Dataset 2.
Our implementation of the GA_AL procedure is open-sourced under the Apache License 2.0 at https://gitlab.com/muellergroup/cluster-ga. The code also supports genetic algorithm searches using only DFT or interatomic potentials through interfaces with VASP and LAMMPS. Input templates and documentation of input parameters can be found in the “example” folder of the repository.
Kang, X., Li, Y., Zhu, M. & Jin, R. Atomically precise alloy nanoclusters: syntheses, structures, and properties. Chem. Soc. Rev. 49, 6443–6514 (2020).
Jena, P. & Castleman, A. W. Clusters: a bridge across the disciplines of physics and chemistry. Proc. Natl Acad. Sci. USA 103, 10560 (2006).
Gawande, M. B. et al. Cu and Cu-based nanoparticles: synthesis and applications in catalysis. Chem. Rev. 116, 3722–3811 (2016).
Liu, L. & Corma, A. Metal catalysts for heterogeneous catalysis: from single atoms to nanoclusters and nanoparticles. Chem. Rev. 118, 4981–5079 (2018).
Jin, R., Zeng, C., Zhou, M. & Chen, Y. Atomically precise colloidal metal nanoclusters and nanoparticles: fundamentals and opportunities. Chem. Rev. 116, 10346–10413 (2016).
Li, G. & Jin, R. Atomically precise gold nanoclusters as new model catalysts. Acc. Chem. Res. 46, 1749–1758 (2013).
Saha, K., Agasti, S. S., Kim, C., Li, X. & Rotello, V. M. Gold nanoparticles in chemical and biological sensing. Chem. Rev. 112, 2739–2779 (2012).
Kang, X. & Zhu, M. Tailoring the photoluminescence of atomically precise nanoclusters. Chem. Soc. Rev. 48, 2422–2457 (2019).
Jin, R. Atomically precise metal nanoclusters: stable sizes and optical properties. Nanoscale 7, 1549–1565 (2015).
White, R. J., Luque, R., Budarin, V. L., Clark, J. H. & Macquarrie, D. J. Supported metal nanoparticles on porous materials. Methods and applications. Chem. Soc. Rev. 38, 481–494 (2009).
Martin, T. P. Shells of atoms. Phys. Rep. 273, 199–241 (1996).
Ferrando, R., Jellinek, J. & Johnston, R. L. Nanoalloys: from theory to applications of alloy clusters and nanoparticles. Chem. Rev. 108, 845–910 (2008).
Aguado, A. & Jarrold, M. F. Melting and freezing of metal clusters. Annu. Rev. Phys. Chem. 62, 151–172 (2011).
Wales, D. J. & Doye, J. P. K. Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101, 5111–5116 (1997).
Wales, D. J. & Scheraga, H. A. Global optimization of clusters, crystals, and biomolecules. Science 285, 1368 (1999).
Pickard, C. J. & Needs, R. J. Ab initio random structure searching. J. Phys. Condens. Matter 23, 053201 (2011).
Call, S. T., Zubarev, D. Y. & Boldyrev, A. I. Global minimum structure searches via particle swarm optimization. J. Comput. Chem. 28, 1177–1186 (2007).
Wang, Y., Lv, J., Zhu, L. & Ma, Y. Crystal structure prediction via particle-swarm optimization. Phys. Rev. B 82, 094116 (2010).
Weigend, F. & Ahlrichs, R. Quantum chemical treatments of metal clusters. Philos. Trans. R. Soc. A 368, 1245–1263 (2010).
Ahlrichs, R. & Elliott, S. D. Clusters of aluminium, a density functional study. Phys. Chem. Chem. Phys. 1, 13–21 (1999).
Johnston, R. L. Evolving better nanoparticles: genetic algorithms for optimising cluster geometries. Dalton Trans. 4193–4207 (2003).
Vilhelmsen, L. B. & Hammer, B. A genetic algorithm for first principles global structure optimization of supported nano structures. J. Chem. Phys. 141, 044711 (2014).
Tipton, W. W. & Hennig, R. G. A grand canonical genetic algorithm for the prediction of multi-component phase diagrams and testing of empirical potentials. J. Phys. Condens. Matter 25, 495401 (2013).
Shayeghi, A., Götz, D., Davis, J. B. A., Schäfer, R. & Johnston, R. L. Pool-BCGA: a parallelised generation-free genetic algorithm for the ab initio global optimisation of nanoalloy clusters. Phys. Chem. Chem. Phys. 17, 2104–2112 (2015).
Deaven, D. M. & Ho, K. M. Molecular-geometry optimization with a genetic algorithm. Phys. Rev. Lett. 75, 288–291 (1995).
Vargas, J. A., Buendía, F. & Beltrán, M. R. New AuN (N = 27–30) lowest energy clusters obtained by means of an improved DFT–genetic algorithm methodology. J. Phys. Chem. C 121, 10982–10991 (2017).
Davis, J. B. A., Shayeghi, A., Horswell, S. L. & Johnston, R. L. The Birmingham parallel genetic algorithm and its application to the direct DFT global optimisation of IrN (N = 10–20) clusters. Nanoscale 7, 14032–14038 (2015).
Drebov, N. & Ahlrichs, R. Structures of Aln, its anions and cations up to n = 34: a theoretical investigation. J. Chem. Phys. 132, 164703 (2010).
Aguado, A. & López, J. M. Structures and stabilities of Aln+, Aln, and Aln− (n = 13–34) clusters. J. Chem. Phys. 130, 064704 (2009).
Doye, J. P. K. A model metal potential exhibiting polytetrahedral clusters. J. Chem. Phys. 119, 1136–1147 (2003).
Xiang, Y., Jiang, H., Cai, W. & Shao, X. An efficient method based on lattice construction and the genetic algorithm for optimization of large Lennard-Jones clusters. J. Phys. Chem. A 108, 3586–3592 (2004).
Galvão, B. R. L. & Viegas, L. P. What electronic structure method can be used in the global optimization of nanoclusters? J. Phys. Chem. A 123, 10454–10462 (2019).
Behler, J. Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
Mueller, T., Hernandez, A. & Wang, C. Machine learning for interatomic potential models. J. Chem. Phys. 152, 050902 (2020).
Schmidt, J., Marques, M. R. G., Botti, S. & Marques, M. A. L. Recent advances and applications of machine learning in solid-state materials science. npj Comput. Mater. 5, 83 (2019).
Deringer, V. L., Caro, M. A. & Csányi, G. Machine learning interatomic potentials as emerging tools for materials science. Adv. Mater. 31, 1902765 (2019).
Ramprasad, R., Batra, R., Pilania, G., Mannodi-Kanakkithodi, A. & Kim, C. Machine learning in materials informatics: recent applications and prospects. npj Comput. Mater. 3, 54 (2017).
Quaranta, V., Behler, J. & Hellström, M. Structure and dynamics of the liquid–water/zinc-oxide interface from machine learning potential simulations. J. Phys. Chem. C. 123, 1293–1304 (2019).
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for TiO2. Comput. Mater. Sci. 114, 135–150 (2016).
Chiriki, S. & Bulusu, S. S. Modeling of DFT quality neural network potential for sodium clusters: Application to melting of sodium clusters (Na20 to Na40). Chem. Phys. Lett. 652, 130–135 (2016).
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments. Phys. Rev. B 87, 184115 (2013).
Szlachta, W. J., Bartók, A. P. & Csányi, G. Accuracy and transferability of Gaussian approximation potential models for tungsten. Phys. Rev. B 90, 104108 (2014).
Thompson, A. P., Swiler, L. P., Trott, C. R., Foiles, S. M. & Tucker, G. J. Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 285, 316–330 (2015).
Wood, M. A. & Thompson, A. P. Extending the accuracy of the SNAP interatomic potential form. J. Chem. Phys. 148, 241721 (2018).
Shapeev, A. V. Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14, 1153–1173 (2016).
Podryabinkin, E. V. & Shapeev, A. V. Active learning of linearly parametrized interatomic potentials. Comput. Mater. Sci. 140, 171–180 (2017).
Novikov, I. S., Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. The MLIP package: moment tensor potentials with MPI and active learning. Mach. Learn. Sci. Technol. 2, 025002 (2021).
Drautz, R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 99, 014104 (2019).
Hernandez, A., Balasubramanian, A., Yuan, F., Mason, S. A. M. & Mueller, T. Fast, accurate, and transferable many-body interatomic potentials by symbolic regression. npj Comput. Mater. 5, 112 (2019).
Zuo, Y. X. et al. Performance and cost assessment of machine learning interatomic potentials. J. Phys. Chem. A 124, 731–745 (2020).
Settles, B. Active Learning Literature Survey (University of Wisconsin-Madison, 2009).
Artrith, N. & Behler, J. High-dimensional neural network potentials for metal surfaces: a prototype study for copper. Phys. Rev. B 85, 045439 (2012).
Jinnouchi, R., Lahnsteiner, J., Karsai, F., Kresse, G. & Bokdam, M. Phase transitions of hybrid perovskites simulated by machine-learning force fields trained on the fly with Bayesian inference. Phys. Rev. Lett. 122, 225701 (2019).
Deringer, V. L., Proserpio, D. M., Csányi, G. & Pickard, C. J. Data-driven learning and prediction of inorganic crystal structures. Faraday Discuss. 211, 45–59 (2018).
Bernstein, N., Csányi, G. & Deringer, V. L. De novo exploration and self-guided learning of potential-energy surfaces. npj Comput. Mater. 5, 99 (2019).
Kolsbjerg, E. L., Peterson, A. A. & Hammer, B. Neural-network-enhanced evolutionary algorithm applied to supported metal nanoparticles. Phys. Rev. B 97, 195424 (2018).
Tong, Q., Xue, L., Lv, J., Wang, Y. & Ma, Y. Accelerating CALYPSO structure prediction by data-driven learning of a potential energy surface. Faraday Discuss. 211, 31–43 (2018).
Cao, B. et al. Activation of dinitrogen by solid and liquid aluminum nanoclusters: a combined experimental and theoretical study. J. Am. Chem. Soc. 132, 12906–12918 (2010).
Chen, R. et al. Sub-3 nm aluminum nanocrystals exhibiting cluster-like optical properties. Small 17, 2002524 (2021).
Boldyrev, A. I. & Wang, L.-S. All-metal aromaticity and antiaromaticity. Chem. Rev. 105, 3716–3757 (2005).
Cox, D. M., Trevor, D. J., Whetten, R. L., Rohlfing, E. A. & Kaldor, A. Aluminum clusters: magnetic properties. J. Chem. Phys. 84, 4651–4656 (1986).
Jena, P. & Sun, Q. Super atomic clusters: design rules and potential for building blocks of materials. Chem. Rev. 118, 5755–5870 (2018).
Doye, J. P. K., Wales, D. J. & Berry, R. S. The effect of the range of the potential on the structures of clusters. J. Chem. Phys. 103, 4234–4249 (1995).
Piotrowski, M. J. et al. Theoretical study of the structural, energetic, and electronic properties of 55-atom metal nanoclusters: a DFT investigation within van der Waals corrections, spin–orbit coupling, and PBE+U of 42 metal systems. J. Phys. Chem. C. 120, 28844–28856 (2016).
Tuo, P., Ye, X. B. & Pan, B. C. A machine learning based deep potential for seeking the low-lying candidates of Al clusters. J. Chem. Phys. 152, 114105 (2020).
Nyshadham, C. et al. Machine-learned multi-system surrogate models for materials prediction. npj Comput. Mater. 5, 51 (2019).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
Ghiringhelli, L. M. et al. Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats. npj Comput. Mater. 3, 46 (2017).
Wang, Y. et al. Data for “Accelerated prediction of atomically precise cluster structures using on-the-fly machine learning”. NOMAD. https://doi.org/10.17172/NOMAD/2022.06.27-1 (2022).
Patra, A., Bates, J. E., Sun, J. & Perdew, J. P. Properties of real metallic surfaces: effects of density functional semilocality and van der Waals nonlocality. Proc. Natl Acad. Sci. USA 114, E9188 (2017).
Neal, C. M., Starace, A. K. & Jarrold, M. F. Melting transitions in aluminum clusters: the role of partially melted intermediates. Phys. Rev. B 76, 054113 (2007).
Lysogorskiy, Y. et al. Performant implementation of the atomic cluster expansion (PACE) and application to copper and silicon. npj Comput. Mater. 7, 97 (2021).
Li, X.-T., Yang, X.-B. & Zhao, Y.-J. Geometrical eigen-subspace framework based molecular conformation representation for efficient structure recognition and comparison. J. Chem. Phys. 146, 154108 (2017).
Holland, J. H. Adaptation in Natural and Artificial Systems (University of Michigan Press, 1975).
Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993).
Kresse, G. & Hafner, J. Ab-initio molecular-dynamics simulation of the liquid-metal amorphous-semiconductor transition in germanium. Phys. Rev. B 49, 14251–14269 (1994).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
Perdew, J. P., Burke, K. & Wang, Y. Generalized gradient approximation for the exchange-correlation hole of a many-electron system. Phys. Rev. B 54, 16533–16539 (1996).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Perdew, J. P. & Yue, W. Accurate and simple density functional for the electronic exchange energy: generalized gradient approximation. Phys. Rev. B 33, 8800–8802 (1986).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
Manna, S. et al. A database of low-energy atomically precise nanoclusters. Preprint at https://doi.org/10.26434/chemrxiv-2021-0fq3q (2021).
Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).
We thank Prof. Alexander V. Shapeev for providing helpful guidance in using the MLIP package. The work was supported by the Office of Naval Research under grant No. ONR MURI N00014-15-1-2681. Calculations were performed using computational resources from the Maryland Advanced Research Computing Cluster (MARCC), the Stampede2 supercomputer at the Texas Advanced Computing Center (TACC) and the Gordon supercomputer in the Department of Defense High Performance Computing Modernization Program. TACC resources were provided through the XSEDE program with NSF award DMR-140068. Images of the atomic structures of clusters were generated using VESTA85.
The authors declare no competing interests.
Wang, Y., Liu, S., Lile, P. et al. Accelerated prediction of atomically precise cluster structures using on-the-fly machine learning. npj Comput Mater 8, 173 (2022). https://doi.org/10.1038/s41524-022-00856-x