Introduction

Finding materials with optimal properties (the highest hardness, the lowest dielectric permittivity, etc.) or a combination of properties (e.g., the highest hardness and fracture toughness) is the central problem of materials science. Until recently, only experimental materials discovery was possible, with all limitations and expense of the trial-and-error approach, but the ongoing revolution in theoretical/computational materials science (see1,2) begins to change the situation. Using quantum-mechanical calculations, it is now routine to predict many properties when the crystal structure is known. In 2003, Curtarolo demonstrated the data mining method for materials discovery3 by screening crystal structure databases (which can include known or hypothetical structures) via ab initio calculations. At the same time, major progress in fully nonempirical crystal structure prediction took place. Metadynamics4 and evolutionary algorithms5,6,7 have convinced the community that crystal structures are predictable. Despite the success of these and other methods, a major problem remains unsolved: the prediction of a material with optimal properties among all possible compounds. Totally, 4950 binary systems, 161,700 ternary systems, 3,921,225 quaternary systems, and an exponentially growing number of higher-complexity systems can be created from 100 best-studied elements in the Periodic Table. In each system, a very large number of compounds and, technically, an infinite number of crystal structures can be constructed computationally, and an exhaustive screening of such a search space is impractical. Only about 72% of binary, 16% of ternary, 0.6% of quaternary, and less than 0.05% of more complex systems have ever been studied experimentally8, and even in those systems that have been studied, new compounds are being discovered continually9,10,11. Studying all these systems, one by one, using global optimization methods is unrealistic. Data mining is a more practical approach, but the statistics shows that the existing databases are significantly incomplete even for binary systems, and much more so for ternary and more complex systems. Besides, data mining cannot find fundamentally new crystal structures. When searching for materials optimal in more than one property, these limitations of both approaches become even greater. We present a new method implemented in our code, MendS (Mendelevian Search), and show its application to the discovery of (super)hard and magnetic materials.

Results

Mendelevian space

Global optimization methods are effective only when applied to property landscapes that have an overall organization, e.g., a landscape with a small number of funnels, where all or most of the good solutions (e.g., low-energy structures) are clustered. Discovering materials with optimal properties, i.e., performing a complex global optimization in the chemical and structural space, requires a rational organization of the chemical space that puts compounds with similar properties close to each other. If this space is created by ordering the elements by their atomic numbers, we observe a periodic patchy pattern (Fig. 1a), unsuitable for global optimization.

Fig. 1: Pettifor maps showing the distribution of hardness in binary systems, using different sequence of the elements.
figure 1

a Atomic numbers, b Villars’ Periodic number, c Pettifor’s MN, and d MN obtained in this work. Noble gases were excluded because of their almost complete inability to form stable compounds at normal conditions. Rare earths and elements heavier than Pu were excluded because of the problems of the DFT calculations. In total, we consider 74 elements that can be combined into 2775 possible binary systems. Each pixel is a binary system, the color encodes the highest hardness in each system.

In 1984, Pettifor suggested a new quantity, the so-called “chemical scale,” that arranges the elements in a sequence such that similar elements are placed near each other, and compounds of these elements also display similar properties12. This way, structure maps13 with well-defined regions of similar crystal structures or properties can be drawn. In a thus ordered chemical space, evolutionary algorithms should be extremely effective: they can zoom in on the promising regions at the expense of unpromising ones.

What is the nature of the chemical scale or the Mendeleev number (MN), which is an integer showing the position of an element in the sequence on the chemical scale? Pettifor derived these quantities empirically, while we redefined them using a more universal nonempirical way that clarifies their physical meaning (the method for computing MN is explained in the Supplementary Information). Goldschmidt’s law of crystal chemistry states that the crystal structure is determined by stoichiometry, atomic size, polarizability, and electronegativity of atoms/ions14,15, while polarizability and electronegativity are strongly correlated16. Villars et al.17 introduced another enumeration of the elements, emphasizing the role of valence electrons, which he called “Periodic number” (PN). He also showed that atomic size and electronegativity can be derived from AN and PN17. In redefining the chemical scale and MN, we used the most important chemical properties of the atom—size R and electronegativity χ (Pauling electronegativity)—the combination of which can be used as a single parameter succinctly characterizing the chemistry of the element. However, we need to emphasize that the chemical scale and MN are only used in this method for visualizing the results (the choice of MN for plotting such a Pettifor map is up to the user), while in our global coevolutionary algorithm, each atom is represented by both its size R and electronegativity χ to increase the accuracy. In this work, the atomic radius R is defined as half the shortest interatomic distance in the relaxed (for most elements hypothetical) simple cubic structure of an element—see the Supplementary Table 1.

Figure 2 shows the overall linear correlation between the MNs redefined in this work and those proposed by Pettifor. Carefully chosen MNs should lead to strong clustering in the chemical space, where neighboring systems have similar properties. The results of our searches for hard binary compounds using the PN, the MNs suggested by Pettifor and our redefined MNs are shown on Pettifor maps (Fig. 1b–d). Satisfyingly, our redefined MNs result in a better-organized chemical space with a clearer separation of regions containing binary systems with similar hardness. In fact, if our MNs (which are the sequences of projected elements on their regression line in the space of crudely correlated atomic radius and electronegativity) generate a good 2D map, with clear grouping of similar chemical systems (e.g., Na–Cl, K–Cl, Ca–Cl, Na–Br systems are located nearby), then a much better grouping is expected in the space of the initial two parameters R and χ, and it is in this space where variation operators of our method are defined (Fig. 3a, b). Also it worth mentioning that sizes and electronegativities of the atoms change under pressure—and using standard definitions of the MN (such as AN, PN, or Pettifor’s MN) will not work well. Our recipe, however, is universal and only requires atomic sizes and electronegativities at the pressure of interest. In this paper, we illustrate our method by binary systems, although more complex, at least ternary, systems are also tractable. In a nutshell (but see Methods section for details), our method performs evolution of a population of variable-composition chemical systems (each of which is tackled by an evolutionary optimization) - i.e. is an evolution over evolutions. Individual chemical systems are allowed to evolve and improve, then are compared and ranked, and the fittest ones get a chance to produce new chemical systems (which will partially inherit structural and chemical information from their parents). Evolving the population of such chemical systems, one efficiently finds the globally optimal solution and numerous high-quality suboptimal solutions as well.

Fig. 2: Correlation between the Mendeleev numbers defined in this work and those proposed by Pettifor.
figure 2

It is clear that these MNs have overall correlation, but for some elements (i.e. noble gases) there are big differences.

Fig. 3: MendS algorithm.
figure 3

a Scheme showing how the chemical heredity and b chemical mutation create new compositions. The probability, displayed in shades of gray, is given to each possible daughter system according to its distance from the fitter parent (dark green point). c Flowchart of the coevolutionary algorithm used in MendS (EA—evolutionary algorithm, MO—multi-objective).

Search for hard and superhard binary systems

Pareto optimization18 of hardness and stability was performed over all possible structures (with up to 12 atoms in the primitive cell) and compositions limited to the binary compounds of 74 elements (i.e., all elements excluding the noble gases, rare earth elements, and elements heavier than Pu). In this work, 600 systems have been computed in 20 MendS generations from a total of 2775 unary and binary systems that can be made of 74 elements, i.e., only about one fifth of all possible systems were sampled.

Figure 4 shows the efficiency of this method in finding optimal materials. In this fast calculation, numerous stable and metastable hard and superhard materials were detected in a single run. Carbon (diamond and other allotropes) and boron, known to be the only superhard elements, were both found. In addition, both new and numerous known hard and superhard binary systems, as well as potentially hard systems, were found in the same calculation, among them BxCy19, CxNy20,21, BxNy22,23, BxOy19,24,25, RexBy26,27, WxBy28, SixCy29,30,31,32, WxCy30,31,32, AlxOy30,31,32, TixCy32, SixNy32, TixNy32, BexOy32, RuxOy33,34, OsxOy35, RhxBy36, IrxBy36, OsxBy37,38,39, and RuxBy37,38,39. We reported some of the results of our search in a separate paper on the Cr–B, Cr–C, and Cr–N systems40, and our study of the W–B system41 was inspired by the present finding of promising properties in the Mo–B system (also published in42). The list of all systems studied during the calculation is available in Supplementary Information.

Fig. 4: Results of the simultaneous optimization of the hardness and stability in the space of all unary and binary compounds.
figure 4

a 1st MendS generation, b 10th MendS generation, c 20th MendS generation. The first five Pareto fronts are shown, green points representing all sampled structures. The instability of each compound is defined using Maxwell’s convex hull construction. Diamond, the hardest material, is indicated by a star.

Because of the huge compositional space (2775 systems, each with 102 possible compositions, each of which having a very large number of possible structures), it was necessary to shorten the time of calculations by reducing the number of generations and/or population size. Therefore, the structures and compositions found may be approximate and may need to be refined for the most interesting systems by a precise evolutionary calculation for each system. The results are shown in Table 1. Of these, some transition metal borides are predicted to be hard, some already reported as hard materials (e.g., MoxBy43,44 and MnxBy45) or discussed as potentially hard (e.g., TcxBy46, FexBy47, and VxBy48). Interestingly, a number of previously unknown hard structures more stable than those reported so far were predicted in these systems. Our calculations also revealed completely new hard systems, SxBy and BxPy, and, quite unexpectedly, the MnxHy system was discovered to contain very hard phases (Table 1).

Table 1 The predicted Vickers hardness (Hv), fracture toughness (K1C) and enthalpy above the convex hull of selected materials found using MendS.

For the MoxBy system, several simultaneously hard and low-energy structures were detected in our calculations. Of these, only the stable R\(\bar 3\)m structure of MoB2 was studied before, and the reported hardness for this structure (experimentally obtained 24.2 GPa49 and theoretically found 33.1 GPa44) is in close agreement with the value calculated in this work (28.5 GPa). MoB3 and MoB4 were studied widely before43,44, and a few low-energy and in some cases hard structures were reported for these systems (i.e., R\(\bar 3\)m-MoB3, 31.8 GPa;43 P63/mmc-MoB3, 37.3 GPa44; and much softer P63/mmc-MoB4, 8.2 GPa44). In this work, new low-energy structures with high hardness were discovered in these systems (Table 1).

For the MnxBy system, we propose several new compounds which are simultaneously hard and have low energy (Table 1). In the previous study50, P21/c-MnB4 was discussed to be stable and have a very high hardness (computed to be 40.1 GPa50, experimentally obtained 34.6–37.4 GPa51), while C2/m-MnB4 was claimed to be the second lowest-energy structure with the energy difference of 18 meV/atom. Our study confirms the stability of P21/c-MnB4. However, we discovered another MnB4 structure, with the Pnnm space group, the energy of which is lies between the energies of two aforementioned structures of MnB4 (Table 1). In this work, we found that the ferromagnetic phase of Pnnm-MnB4 is more stable than the nonmagnetic one, and the hardness of 40.7 GPa was computed for this magnetic structure.

Because of the radioactivity of technetium, the TcxBy system has not been studied experimentally, while computational studies of this system started recently46,52,53,54. In 2015, P\(\bar 3\)m1-TcB was predicted to be energetically more favorable than the previously discussed Cmcm and WC-type structures55. The reported hardness for this structure, 30.3 GPa55, is very close to the value predicted in this study (31 GPa). Because of the prediction of other stable compounds (e.g., Tc3B5) in our work, this structure became metastable (by 13 meV/atom). In this work, P\(\bar 6\)m2-TcB3 with the computed hardness of 27.2 GPa was predicted as a stable structure at zero pressure. Other works53,54, conducted in parallel to ours, also detected this structure and claimed that it is synthesizable at pressures above 4 GPa56. Another low-energy (3 meV above the convex hull) hard structure (33.1 GPa) with the P\(\bar 3\)m1 space group for TcB3 was also predicted in our study. P\(\bar 6\)m2-Tc3B5, a compound having a hardness of 30.6 GPa and stable at zero pressure, is predicted in our work for the first time. Several other simultaneously hard (in the range of 30 to 36 GPa) and low-energy metastable phases of TcxBy predicted in this work are shown in Table 1.

In recent years, many efforts were focused on searching for low-energy phases of VxBy and studying their electrical and mechanical properties. As a result, several low-energy hard and superhard phases were predicted48,55. Nevertheless, the experimental data exist only for the well-known hexagonal VB2 (AlB2-type) with the P6/mmm space group57. In addition to some previously studied structures58 (e.g., Cmcm-VB, Immm-V3B4, and P6/mmm-VB2), which were also found in our calculations, a few boron-rich phases possessing simultaneously low energy and very high hardness were discovered (Table 1). The calculated hardness for these boron-rich phases is very close to or above 40 GPa (VB7: 39.7 GPa, VB5: 40 GPa, and VB12: 44.5 GPa). A new extremely hard P\(\bar 4\)m2-V3B4 phase is predicted here, with the energy 6 meV lower than the previously proposed Immm structure.

Most of the studies of the FexBy system were dedicated to the FeB2 and FeB4 phases47,59,60. Several works studying different FexBy compounds61,62 reported Fe2B, FeB, and FeB2 as stable phases. In this work, we detected another stable phase, FeB3, with the P21/m space group and the hardness of 30.7 GPa. To the best of our knowledge, FeB3 was never reported, neither theoretically nor experimentally. The orthorhombic Pnnm-FeB4, with the energy of 2 meV above the convex hull (Table 1), was synthesized at pressures above 8 GPa, and its hardness was reported to be 62(5) GPa59, which encouraged many computational studies of this structure. However, none of them confirmed such a high value of hardness, while the Vickers hardness reported in several independent works varies in the range of 24–29 GPa47,60,62,63. We calculated its hardness to be 28.6 GPa.

In the BxPy system, the cubic boron phosphide BP with the zincblende structure is a well-known compound with the hardness reported to be roughly the same as that of SiC64. In our calculations, the hardness of SiC and BP was found to be 33 GPa and 37 GPa, respectively. Moreover, B6P was discovered as another stable compound in this system and predicted to be superhard, with the computed hardness exceeding 41 GPa. In the SiC system, in addition to the known diamond-type β-SiC, another similar structure (actually, a polytype of β-SiC) with the R3m space group and nearly the same hardness was found. The energy of this structure is just 1 meV/atom higher than that of β-SiC.

The MnxHy system is unexpected in the list of hard systems, but several very hard phases were indeed found in it (Table 1). All of these phases are nonmagnetic, highly symmetric, and energetically favorable (lying either on the convex hull or close to it), with the hardness of up to 30 GPa. In this system, two thermodynamically stable compounds (Mn2H and MnH) were predicted, with the space groups P\(\bar 3\)m1 and P63/mmc, and computed hardness of 21.5 and 29.5 GPa, respectively (in Table 1, only structures with the hardness above 26 GPa are shown for this system).

Generally, BxSy system is not hard, but metastable boron sulfides turn out to be potentially hard. We found a low-energy metastable phase of this system, Cmcm-B4S3, with the hardness unexpectedly exceeding 30 GPa. This can stimulate future studies of this system.

For a better insight, some of the prominent structures seen in our simulations are shown in Fig. 5a. More details on all phases presented in Table 1 are given in Supplementary Information.

Fig. 5: Results of our Mendelevian search for hard and superhard materials.
figure 5

a (1) F\(\bar 4\)3m-BN, (2) R\(\bar 3\)m-MoB2, (3) P\(\bar 3\)m1-MoB3, (4) P\(\bar 6\)m2-MoB5, (5) Cmcm-VB, (6) P6/mmm-VB2, (7) Immm-V3B4, (8) P\(\bar 4\)m2-V3B4, (9) P\(\bar 6\)m2-VB5, (10) I4/mmm-VB12, (11) Pnnm-MnB4, (12) Pm-MnB13, (13) Cmcm-B4S3, (14) P63/mmc-MnH, (15) P21/m-FeB3, and (16) R\(\bar 3\)m-B6P. b “Ashby plot” of the Vickers hardness vs. fracture toughness. Stable hard compounds from the previous works40,74 are shown as suns; stable and metastable compounds found in this work are represented by circles and triangles, respectively.

In our calculations, some boron hydrides were predicted to be superhard, but they had high energy and were not included in Table 1. However, it may be possible to stabilize these hard phases under pressure, or by chemical modification.

Figure 5b shows the studied materials in the space “hardness—fracture toughness.” Diamond, lonsdaleite and cubic BN possess the best properties, but are metastable at normal conditions. Among the stable phases, borides of transition metals (especially from groups VB, VIB, VIIB) stand out: we note VB2, V3B4, MoB2, CrB4, WB5, and MnB4 in particular. These and related materials (see65) present a high technological interest.

The fact that all known binary superhard systems were found in a short coevolutionary run demonstrates the power of the method, which is ready to be applied to the other types of materials.

Search for magnetic binary systems

In addition to the Mendelevian search for stable/metastable hard and superhard materials, we performed another Mendelevian search for materials with maximum magnetization and stability to examine the power and efficiency of the method in fast and accurate determination of materials with target properties. We performed this calculation over all possible structures (with up to 12 atoms in the primitive cell) and compositions limited to the binary compounds of 74 elements (i.e., all elements excluding the noble gases, rare earth elements, and elements heavier than Pu). In this calculation, well-known ferromagnets iron, cobalt, nickel, and several magnetic materials made from the combination of these elements with other elements were detected before the sixth generation. Here, for each structure we performed spin-polarized calculations using the GGA-PBE functional66 as implemented in the VASP code67,68. More details on structure relaxation and input parameters can be found in Supplementary Information. The chemical landscape of magnetization and evolution of its sampling in the Mendelevian search for magnetic materials are shown in Fig. 6d–f; this was formed after calculating 450 binary systems over 15 generations. In this plot, materials with high magnetization are clearly clustered together. Figure 6d, f shows how the (co)evolutionary optimization discovered all the promising regions at the expense of the unpromising ones. This calculation has found that among all substances, bcc-Fe has the highest magnetization at zero Kelvin.

Fig. 6: Sampling of the chemical space.
figure 6

Systems produced (a, d) randomly in the 1st generation, and using all variation operators in the (b, e) 5th and (c, f) 10th generations in searching for hard (ac) and magnetic (df) materials. Randomly generated systems are shown as violet circles.

Discussion

We have developed a method for predicting materials having one or more optimal target properties. The method, called Mendelevian search (MendS), based on the suitably defined chemical space, powerful coevolutionary algorithm, and multi-objective Pareto optimization technique, was applied to searching for low-energy hard and superhard materials. Note that due to the property of evolutionary and coevolutionary algorithms to enhance sampling of the most promising regions of the search space (where the optimal, as well as all or most of the high-quality suboptimal solutions are clustered together), each MendS search discovers a large number of materials with excellent properties at a low computational cost. Well-known superhard systems—diamond, boron allotropes, and the B–N system—were found in a single calculation together with other notable hard systems (Si–C, B–C, Cr–N, W–C, metal borides, etc.). The Mn–H system was discovered to be unexpectedly hard, and several new hard and superhard phases were revealed in the previously studied systems (V–B, Tc–B, Mn–B, etc.). The method successfully found almost all known hard systems in a single run, and a comprehensive chemical map of hard materials was produced. A similar chemical map was plotted for magnetic materials; well-known magnetic systems such as Ni, Co, Fe were found within just a few generations. These examples show the power and efficiency of our method, which can be used to search for optimal materials with any combination of properties at arbitrary conditions. As the first step in prediction of novel materials possessing desired properties, the method to a large extent solves, in a fully nonempirical way, the central problem of computational materials science.

Methods

The whole process can be described as a joint evolution (or coevolution) of evolutionary runs, each of which deals with an individual variable-composition system. Having defined the chemical space, we initialize the calculation by randomly selecting a small number of systems from the entire chemical space for the first MendS generation. These systems are then optimized by the evolutionary algorithm USPEX5,6,7 in its variable-composition mode69, searching for compounds and structures with optimal properties (e.g., here we simultaneously maximized hardness and stability), after which MendS jointly analyses the results from all these systems. Removing identical structures using the fingerprint method70, jointly evaluating all systems, refining and preparing the dataset, and discarding the structures that are unstable by more than 1.0 eV/atom, MendS ranks all systems of the current generation and selects the fittest (in present calculations, fittest 60% were selected) variable-composition systems as potential parents for new systems. Applying chemical variation operators, such as mutation and heredity, to these parent systems yields offspring systems for the next coevolutionary generation. In addition, some systems are generated randomly to preserve the chemical diversity of the population. This process is continued until the number of coevolutionary generations reaches the maximum predefined by the user (Fig. 3c). The underlying ab initio structure relaxations and energy calculations were performed using density functional theory with the projector augmented wave method (PAW) as implemented in the VASP code67,68. Further details on the input parameters of MendS, USPEX, and VASP are given in Supplementary Information.

Defining fitness: multi-objective (Pareto) optimization

Many scientific and engineering problems involve optimization of multiple conflicting objectives, for example, predicting novel materials that improve upon all critical properties of the known ones. The multi-objective evolutionary algorithm (MOEA) enables searching simultaneously for materials with multiple optimal properties, such as the enthalpy, hardness, density, dielectric permittivity, magnetization, etc. Here we performed searches optimizing simultaneously (1) stability, measured as the distance above the convex hull (chances of a compound to be synthesizable are higher if the compound is stable or low-energy metastable, i.e., is on the convex hull or close to it), and (2) hardness, computed using the Lyakhov–Oganov model71.

Hardness is a complicated property of materials which cannot be evaluated directly and rigorously from the crystal structure because it usually includes many nonlinear and mesoscopic effects. However, there are number of empirical models making it possible to estimate hardness from atomic-scale properties. The Chen–Niu empirical model72 is based on the relation between the elastic moduli and hardness. Although this model is reliable, calculating the elastic constants of materials on a large scale is computationally expensive. A similar model based on the elastic moduli was recently proposed by Mazhnik and Oganov73 and unlike Chen’s model, does not overestimate the hardness value of materials with low or negative Poisson’s ratio while for other materials gives similar results. The Lyakhov–Oganov model71, which computes the hardness from bond hardnesses, is more convenient for high-throughput searches: it is numerically stable, usually reliable, and can be used in calculations without significant cost, taking the crystal structure and chemical composition as input. For better understanding of the reliability of the mentioned models, a comparison of the computed values and experimental results for hardness of various materials is presented in ref. 65.

The result of the multi-objective optimization is, in general, not a single material, but a set of materials with a trade-off between their properties, and these optimal materials form the so-called first Pareto front. Similarly, 2nd, 3rd, … nth Pareto fronts can be defined (Fig. 4). In our method, the Pareto rank18 is used as a fitness.

Variation operators in the chemical space are of central importance for an efficient sampling of the chemical space using the previously sampled compositions and structures. These operators ensure that different populations not only compete, but also exchange information, i.e., learn from each other. An efficient algorithm could be constructed where the chemical space is defined by just one number for each element—the MN (or chemical scale); we use this for plotting the Pettifor maps, but within the algorithm itself, we resort to an even better option where each element is described by two numbers—electronegativity χ and atomic radius R, rescaled to be between 0 and 1—and it is this space where the variation operators act. There are three variation operators defined in the chemical space: chemical heredity, reactive heredity, and chemical mutation.

Chemical heredity replaces elements in parent systems with new elements such that their electronegativities and atomic radii lie in-between those of their parents (Fig. 3a). In doing so, we explore the regions of the chemical space between the parents

$$AB + CD \to XY,$$
(1)

where A, B, C, D, X, and Y are different elements, X is between A and C or A and D which is chosen randomly, and Y is between the other two elements (B and C or B and D).

Reactive heredity creates offspring by taking combinations of the elements from parents. For example, if the parents are A–B and C–D, their child is one of the A–C, A–D, B–C, and B–D systems.

Chemical mutation randomly chooses one of the elements of a parent and substitutes it with an element in its vicinity in the space of χ and R (Fig. 3b).

In both chemical mutation and chemical heredity, all elements are assigned the probability

$$P_i = \frac{{e^{ - \alpha x_i}}}{{{\sum} {e^{ - \alpha x_i}} }},i = 1,2,\,...,$$
(2)

to be selected, where x is the distance of element i from the parent element (in the case of chemical heredity, this formula is used to give a higher weight to the fitter parent, shown by a dark green point in Fig. 3a), and α is a constant (α = 1.5 is used here). The result of applying these chemical variation operators is shown in Fig. 6: the promising regions of the chemical space are sampled more thoroughly at the expense of the unpromising regions. When a new system is produced from parent system(s), it inherits from them a set of optimal crystal structures which are added to the first generation, greatly enhancing the learning power of the method.

After finishing the coevolutionary simulation, we took the most promising systems identified in it and performed longer evolutionary runs for each of them, calculating the final hardness using the Chen–Niu model72, and fracture toughness—using the Niu–Niu–Oganov model74.