COPEX: co-evolutionary crystal structure prediction algorithm for complex systems

Crystal structure prediction has been widely used to accelerate the discovery of new materials in recent years. Up to this day, it remains a challenge to predict the stable stoichiometries and structures of ternary or more complex systems due to the explosive increase of the size of the chemical and configurational space. Numerous novel materials with a series of unique characteristics are expected to be found in this virgin territory while new algorithms to predict crystal structures in complex systems are urgently called for. Inspired by co-evolution in biology, here we propose a co-evolutionary algorithm, which we name COPEX, and which is based on the well-known evolutionary algorithm USPEX. Within this proposed algorithm, a few USPEX calculations for ternary systems and multiple for energetically-favored pseudobinary or fixed-composition systems are carried out in parallel, and co-evolution is achieved by sharing structural information on the fittest individuals among different USPEX sub-processes during the joint evolution. We have applied the algorithm to W–Cr–B, Mg–Si–O, and Hf–Ta–C, three very different systems, and many ternary compounds have been identified. Our results clearly demonstrate that the COPEX algorithm combines efficiency and reliability even for complex systems.


INTRODUCTION
Exploring and designing novel materials is a major research avenue in materials science. High-throughput computation, where structure prediction methods act as an essential component, has proven its great utility in the area of materials discovery 1 . In the past decade, many important discoveries, e.g., transparent insulating phase of sodium 2 , superhard M-carbon 3,4 , partially ionic high-pressure form of elemental boron 5 , and superconducting sulfur hydride H 3 S with the record high T c 203 K 6,7 , etc., were triggered by crystal structure prediction methods.
Recently, 244 stable ternary nitrides have been discovered using a data mining structure prediction algorithm, doubling the list of 213 previously known stable ternary nitrides 8 . This discovery further demonstrated the remarkable power of computational materials discovery, indicating that our knowledge of ternary compounds is poor. One might have the impression that a major part of all possible binary, ternary, and even quaternary compounds have already been experimentally discovered. However, as shown in Fig. 1a, only 16% of ternary, and just 0.6% of quaternary systems have been experimentally studied at normal conditions, as demonstrated by P. Villars in 2013 9 . Hence, experimental exploration of multicomponent systems is still very far from complete. With reliable and efficient crystal structure prediction methods, one can imagine that the full exploration of possible ternary or more complex compounds is not beyond our reach anymore. In addition, much attention has been attracted to materials with ternary or more complex compositions due to many important discoveries in the past few years, such as the successful experimental synthesis of possibly the first roomtemperature superconductor which is a S-C-H compound formed at high pressure 10 .
Without having the pretense of being rigorous or exhaustive, crystal structure prediction algorithms can be classified into two categories, one is based on existing knowledge in the form of crystal structure databases, and the other is based on reliable and efficient exploratory algorithms which are capable of making predictions with little or no pre-existing knowledge. When it comes to the former category, data mining [11][12][13][14][15][16] acts as a representative approach, which predicts the structures of the target compounds when the structure prototypes are already present in the database. In the latter category, evolutionary algorithms [17][18][19][20] , basin hopping 21 , minima hopping 22 , metadynamics 23,24 , simulated annealing 25 , and random searching 26 are widely used. However, a big challenge remains when it comes to the exploration of ternary and more complex systems. As shown in Fig. 1b, the number of possible compositions in a ternary system increases dramatically with increasing number of atoms in the unit cell. For instance, with a given unit cell of 50 atoms, 19,600 different compositions are possible in a ternary system. According to our experience, one needs to sample at least hundreds or even thousands of different structures using evolutionary algorithm to locate the structure with global energy minimum of a given composition. Clearly, exploring the enormous chemical space of ternary and more complex systems by doing separate crystal structure prediction calculations for each composition is impossible.
The situation is greatly improved by variable-composition evolutionary searches, which take advantage of the exchange of information between different compositions in the same population, and are capable of predicting all stable compounds in a single calculation 27 . However, while this works well for binary systems, for ternaries it can miss a significant percentage of stable compounds. The cases where USPEX was used for predicting ternary 28 and even quaternary 29 compounds involved very expensive calculations, with careful cross-checks and adding 1 results from binary and pseudobinary calculations to the population, and an increased human factor.
Clearly, the key issue of crystal structure prediction for complex systems is the sampling of the high-dimensional chemical space in an effective way. In biology, co-evolution is defined as the evolution of two or more species which mutually affect each other, and it is one of the major forces that produce biodiversity 30 . Recently, co-evolution was used as the basis for an algorithm to predict the optimal material among all possible compounds of all elements 31 . Inspired by the concept of co-evolution, we present a co-evolutionary algorithm to address the problem of compound and crystal structure prediction for complex systems. Here we use USPEX (Universal Structure Predictor: Evolutionary Xtallgraphy) 17,18,32 for every single evolutionary calculation. Therefore we named our co-evolutionary algorithm as COPEX (Co-USPEXs). In our algorithm, each process of USPEX crystal structure prediction can be viewed as evolution of one biological species, and the co-evolution between different USPEX processes (species) is the key feature of our algorithm. We have applied this approach to three representative systems, Mg-Si-O, W-Cr-B, and Hf-Ta-C, where many stable ternary compounds have been identified. Our results show that this approach accelerates the exploration of the chemical space of complex systems.

Stability principles of ternary compounds
Stability of a ternary compound can be evaluated by the highest energy of formation in all the combinations of compounds and their own elemental forms. The energy of formation of E AxByC1ÀxÀy can be expressed as where E A , E B , and E C refer to the ground-state energy of elemental forms of A, B, and C. E f is a function of the composition, and its calculation requires the knowledge of E A , E B , E C , and E AxByC1ÀxÀy . Stable compounds shall have negative energy of formation E f , however, this is necessary but not sufficient. The sufficient condition is that the compound be thermodynamically more stable than all of the possible products of decomposition. As a result, all the thermodynamically stable compounds form a convex hull. As discussed in the introduction section and also demonstrated by Villars and Iwata 9 , there are so many possible stoichiometries that computationally exploring the whole chemical space is impossible. Thus, before discussing our coevolutionary algorithm, we would like to explain the strategies that we use to navigate the chemical space.
Traditionally, according to the chemical bonding features, molecular, ionic, ionic-covalent, and intermetallic ternary compounds can be divided into four groups, (I) ternary molecular (all non-metals), (II) acids, (III) ternary ionic or ionic-covalent (including a oxysalts), and (IV) intermetallics. Among the 45,000 ternary structures from the Crystallography Open Database (COD) 33 , we found (analyzing interatomic distances) that 45% are molecular compounds, 49% are ionic or ionic-covalent, and the remaining 6% are intermetallic compounds. Since our work is focused on crystal structure prediction of inorganic compounds, therefore, the 45% molecular compounds are disregarded in our analysis. In nature, the octet rule governs the formation of ionic or covalent compounds. For instance, by analyzing all the ionic, ionic-covalent compounds and intermetallics, one finds that about 46% of them have both the compositions (AB k ) x (AC l ) y (where AB k , and AC l are stable binary compounds in the corresponding systems) and A x (BC m ) y (in which A is a pure element, and BC m is a stable binary compound in the B-C system). Also, about 35% and 11% are found exclusively in (AB k ) x (AC l ) y or A x (BC m ) y category, respectively. It is reasonable that atoms always seek to adopt the most stable electronic configuration. Following this philosophy, one could take advantage of the above principle and explore the chemical space along the pseudobinary joints.
The situation is more complex for predicting ternary intermetallics. Since metallic bonding comes from free electrons, the individual properties of the constituent atoms have less influence on the formation of structures than in the case of ionic and covalent compounds. In general, intermetallics have a higher propensity to form solid solutions 34 , host significant concentrations of defects 35 , and form unusual stoichiometries. Many ternary stable compounds adopt the same structure prototypes as the binary compounds 9 . This situation holds in many cases, but should not be accepted as a universal axiom.
In conclusion, ternary compounds are usually formed by combining or atom-substituting stable binary compounds. Hence, we can summarize that the promising and efficient way to find the thermodynamically stable ternary compounds is focusing on the compositions along the pseudobinary joints or some fixedcomposition obtained by atomic substitution in stable binary compounds. However, one cannot rule out the possibility to find stable ternary compounds which do not follow the above rules. In such a case, a fully unconstrained variable-composition ternary structure search is still needed and different exploration strategies are encouraged.

Co-evolutionary algorithm
The evolutionary structure prediction algorithm mimics Darwinian evolution and employs a natural selection of the fittest in combination with variation operators, which has been proven to be very successful. In multicomponent systems, due to their vast compositional space, many structures/compositions have to be sampled (rendering the searches quite expensive), and still it is possible to skip stable compositions. Thus, in order to explore the configurational space effectively, one could focus on the compositions of the pseudobinary systems or some fixed compositions. Thus sampling task can be divided into several sub-processes, and each of them tackles one part of the compositional space. In biology, co-evolution proved a major force to create biodiversity. Here we take advantage of this idea to propose the COPEX algorithm to manipulate the multiple crystal structure prediction processes. The working procedure of COPEX is shown in Fig. 2 and the main steps are: 1. Gathering the structural information of all the known elemental, binary (including all the stable and low-energy metastable phases), and experimentally known ternary phases in this or similar systems. 2. Taking the above gathered structures as seeds and performing the COPEX calculation which will initialize two independent ternary variable-composition structure prediction processes at the very beginning. 3. Evaluating the results every ten generations and extracting low-energy structures to form a structure pool. 4. Sharing good structures between different USPEX runs, and applying variation operators to generate new compositions and structures. 5. Constructing the phase diagram of the system, and manipulating USPEX runs, i.e., launching or terminating pseudobinary or fixed-composition USPEX sub-processes. 6. Repeat steps 3-5 until pre-specified halting criteria are achieved.
USPEX has proven to be efficient and reliable for predicting stable structures in fixed-composition and in binary systems 27,[36][37][38] . Thus, the USPEX processes in COPEX can accelerate the process to locate the stable stoichiometry or structure, which in turn will benefit the evolutionary process and generate high-quality structures for the next generation. The phase diagram of the ternary system can be generated using the energies of formation of all structures. Since the phase diagram will change dynamically during the evolutionary process, we set to analyze the phase diagram after every 10 generations and manipulate the compositions and USPEX processes automatically as illustrated in step 4. The success of the evolutionary algorithm for crystal structure prediction relies largely on the various variation operators, i.e., heredity, softmutation, lattice mutation, permutation, and transmutation 37,39 . Among them, the chemical transmutation operator is designed for variable-composition structure search, which means swapping atoms at randomly selected positions while keeping positions of most atoms unchanged. After every generation of COPEX, we reassign the atomic positions of different elements with closer atomic radii, which turned out to be very helpful to the co-evolutionary process. In the following, we have applied the COPEX algorithm to W-Cr-B, Mg-Si-O, and Hf-Ta-C, three chemically very different systems, and found evidence the high efficiency and reliability of the method.

Application to the W-Cr-B system
The W-Cr-B system has attracted extensive attention due to the remarkable mechanical properties and chemical inertness of transition metal borides, such as CrB 4 40 and WB 5−x 41 . Besides, tungsten chromium borides, such as M 3 B 2 and M 5 B 3 (M refers to metal atoms) commonly exist in the Ni-superalloy single crystal turbine blades as the predominant precipitate phases. For a long time, the transition metal elements were believed to be distributed randomly over the sublattice M in M 3 B 2 and M 5 B 3 42-44 . However, in our previous work, we have found both from theory and experiment that the transition metal atoms are distributed in an ordered fashion in the M 3 B 2 and M 5 B 3 borides, and are stoichiometric, in fact, W 2 CrB 2 and W 4 CrB 3 , respectively 44,45 . Naturally, one may wonder if any other stable ternary compounds exist in the W-Cr-B system. Thus, we chose COPEX to investigate this system.
Before running COPEX, we need to thoroughly gather the structural information of all the stable phases in the corresponding unary and binary systems. If not, COPEX will need to spend unnecessary efforts to explore these systems. The W-B and Cr-B systems were widely investigated in literature 40,41,46,47 . In the Cr-B system, at ambient pressure five ground-state compounds (Cr 2 B, Cr 5 B 3 , CrB, CrB 2 , and CrB 4 40 ) and two metastable compounds (Cr 2 B 3 and Cr 3 B 4 ) are found 46 . In the W-B system, five ground-state compounds (W 2 B, W 8 B 7 , WB, WB 2 , and WB 3 ) are reported at ambient pressure 48 . In addition, WB 5−x is also taken into account in our calculation; this phase is stabilized by zero-point vibrational energy 41 . Furthermore, no stable binary compounds exist in the W-Cr system. Taking this prior information, we have performed two independent COPEX runs of the W-Cr-B system at ambient pressure. Besides W 2 CrB 2 and W 4 CrB 3 , we have also predicted three stable ternary compounds, namely, WCrB, WCrB 2 , and WCr 2 B. The crystal structure parameters of the predicted three ternary compounds are given in Supplementary Table 1. Their elastic constants are given in Supplementary Table 2, and we can see that all of them satisfy Born criteria of mechanical stability 49 . Phonon calculations show no imaginary frequencies in these three compounds, which indicates their dynamical stability. The phase diagram and energies of formation of all the sampled structures are given in Fig. 3.
Here we would like to discuss the structural origins and features of the predicted ternary compounds. The crystal structures of the five ternary phases and some related compounds are drawn in Fig.  4. Among the five ternary W-Cr-B compounds, W 2 CrB 2 and W 4 CrB 3 share the same structural features as other earlier reported phases. Namely, the prototypes of W 2 CrB 2 and W 4 CrB 3 are derivatives of V 3 B 2 and Cr 5 B 4 structure types, respectively, while the prototypes of WCrB 2 , WCrB, and WCr 2 B have never been reported before to the best of our knowledge. As shown in Fig. 4c, the structure of W 4 CrB 3 consists of two layers where one layer has the same structure as W 2 CrB 2 (see Fig. 4b) while the other is made Fig. 2 The COPEX algorithm. Flowchart of the co-evolutionary algorithm COPEX.
of W-centered square antiprisms, just like in the structure of W 2 B (see Fig. 4a). Therefore, W 4 CrB 3 is composed of an alternating array of trigonal prism layer and square antiprism layer. In other words, W 4 CrB 3 can be viewed as an intergrowth or a stacking of W 2 CrB 2 and W 2 B along the [001] direction, indicating that W 2 CrB 2 and W 4 CrB 3 are very likely intergrown in practice, which agrees with experimental observations 44,45 .
The structure of WCrB 2 has never been reported before, it shares some similarities with CrB and WB. As CrB and WB adopt the same ground-state structure, here we use WB as an example. The ground-state phase of WB crystallizes into the I4 1 /amd structure (also known as α-WB) 47 . The zigzag chains formed by B atoms in α-WB are perpendicular to each other along the a-axis as shown in Fig. 4d. One B atom is surrounded by 7 W atoms, which form a capped trigonal prism. Therefore, α-WB can also be viewed as being constructed by the perpendicular zigzag chains lined by capped trigonal prisms. What if the zigzag chains are parallel to each other instead of perpendicular? The resulting structure would be the well-known β-WB 48 , which is also constructed by the same capped trigonal prisms.
As shown in Fig. 4e, WCrB 2 is also constructed by the capped trigonal prism, in which one B atom is surrounded by 3 W and 4 Cr atoms. Different from αand β-WB, B atoms in WCrB 2 form parallel armchair chains. As shown in Fig. 4f, one B atom in WCrB is coordinated by 5 W and 4 Cr atoms, resulting in a slightly distorted capped square antiprism. The polyhedra are arranged in a zigzag way, with a 2 1 screw axis. WCr 2 B has Cmcm space group. Interestingly, the Cr atoms form a sublattice with the lonsdaleite topology. Boron atoms are in a tricapped trigonal prismatic coordination (with 3 W and 6 Cr neighbors). Furthermore, Cr and B atoms form a graphene-like flat layer, penetrated by the hexagonal framework of Cr atoms, and result in an interesting graphene-diamond hybrid structure.
Ternary tungsten chromium borides were thought to be nonexistent for a long time. Our work, however, has uncovered five stable tungsten chromium borides at ambient pressure. In these phases, we see strong hybridization between W-d, Cr-d, and B-p states ( Supplementary Fig. 3), indicating strong covalent W-B and Cr-B bonds.
As shown in the W-Cr-B phase diagram (Fig. 3), WCrB 2 is located on the WB-CrB pseudobinary joint. The stability of WCrB 2 can be further investigated by taking end-point compounds WB and CrB into consideration. Here we have constructed artificial oC16-WB and oC16-CrB by taking the same structure of WCrB 2 , and the energy of formation of WCrB 2 from oC16-WB and oC16-CrB is −0.2 eV/atom. When the atomic positions of Cr and W in WCrB 2 are interchanged, the calculated enthalpy of formation is higher than WCrB 2 by 0.194 eV/atom. Although W and Cr belong to the same group, they occupy distinct positions due to their large difference in atomic radius, which makes W-Cr disorder unfavorable. In summary, five ternary borides have been identified with the help of COPEX.
Application to the Mg-Si-O system Thanks to several decades of continuing works by geoscientists, reasonable mineralogical models of the Earth's interior [50][51][52] , especially the mantle, are available to some level of approximation. The structural, vibrational, and thermodynamic properties of the two main mantle-forming compounds, Mg 2 SiO 4 and MgSiO 3 , have been systematically investigated both in experiments and simulations 51,53,54 . According to our previous work 28 , extraordinary ternary compounds MgSiO 6 and Mg 3 SiO 12 have been predicted to be stable at terapascal pressures (relevant to giant planets and exoplanets) in the Mg-Si-O system. And beyond that, one may wonder whether the phase diagram of the Mg-Si-O system is fully understood at the Earth's mantle pressures. Therefore, we would like to use the COPEX algorithm to further explore the system at pressures relevant to the Earth's mantle (1-136 GPa). Such study  would further demonstrate the performance of the COPEX algorithm in an ionic-covalent ternary system.
By exploring the whole range of compositions in the Mg-Si-O at ambient pressure, we constructed the phase diagram of the Mg-Si-O system as shown in Fig. 5a. The vertices and edges of the convex hull refer to stable phases and phase boundaries, respectively. Any compound located on the convex polyhedron is thermodynamically stable. It can be seen that Mg 2 SiO 4 is the only thermodynamically stable ternary compound in the Mg-Si-O system, and the most likely decomposition pathway of Mg 2 SiO 4 is into MgO and SiO 2 .
Unexpectedly, our calculation shows no thermodynamically stable ternary compound in the Mg-Si-O system at 28 GPa. In order to further investigate this phenomenon, we then constructed the phase diagram of the pseudobinary MgO-SiO 2 system, including the zero-point energy. As shown in Fig. 5b, at zero Kelvin akimotoite (MgSiO 3 with ilmenite-type structure) will decompose into periclase (MgO) and stishovite (SiO 2 ) at 27.29 GPa, and then periclase and stishovite will recombine into bridgmanite (perovskite-type MgSiO 3 ) at 28.72 GPa. In other words, both akimotoite and bridgmanite are thermodynamically unstable in the pressure range 27.29-28.72 GPa and therefore a ternary compound instability zone exists, which was also observed in another theoretical work 51 .
By considering the effects of temperature, we have constructed the temperature-pressure phase diagram of Mg 2 SiO 4 , as shown in Fig. 5c. The zone of instability of ternary compounds narrows as temperature increases, and eventually disappears at 250 K and 27.62 GPa, and akimotoite will transform into bridgmanite directly without decomposition above 250 K. We can also conclude that the vibrational properties of akimotoite and bridgmanite play an essential role to stabilize them in the Earth's mantle. Even though the zone of instability of ternary compounds disappears above 250 K, we still need to pay attention to it, since the impurity (Fe, Al, et al.) effect 50 may extend it to higher temperatures. It is possible that this zone crosses the geotherm in some planets, and therefore might have significant influence on the internal structures and dynamics of these planets. Furthermore, ringwoodite decomposes into periclase and akimotoite at 18.63 GPa at zero Kelvin. A triple point between phases ringwoodite-bridgma-nite+periclase-akimotoite+periclase is found at 21.5 GPa and 2150 K, which is in reasonable agreement with the literature 51,55 . In other words, ringwoodite will decompose into periclase and bridgmanite above this point. This phase transition is the main cause of the discontinuity in the mantle at the depth of 670 km 51,55 . The phase diagram shown in Fig. 5c is essential for determining the temperature distribution in the deep interiors of the Earth 52 .
Application to the Hf-Ta-C system High-performance refractory materials are of crucial significance in some extreme applications, such as gas turbines, heat shields for hypersonic vehicles, and more. For instance, transition metal carbides (TMCs) HfC and TaC are the most refractory materials among binary compounds with the highest melting points of nearly 4000°C 56 . Hf and Ta are located next to each other in the periodic table and have similar chemical properties, and both HfC and TaC possess the B1 rocksalt crystal structure. Mixing the two binary carbides provides an approach to tune the electronic and mechanical properties. One of the questions immediately rising is whether ternary Hf-Ta-C carbides will be ordered stoichiometric compounds or solid solutions with Hf-Ta disorder. In order to answer this question and explore the configurational space of the Hf-Ta-C system, we performed a COPEX calculation.
Herein, structure searches using the COPEX algorithm with up to 40 atoms in the primitive cell were carried out at ambient pressure. As usual, the stable structures of the pure elements and binary compounds should be provided. We have found four (HfC, Hf 7 C 6 , Hf 4 C 3 , and Hf 3 C 2 ) and three (TaC, Ta 6 C 5 , and Ta 2 C) stable phases in the Hf-C and Ta-C systems, respectively, at ambient pressure, which is in agreement with the literature 57 . In addition, many low-energy metastable phases like Hf 8 C 7 , Hf 6 C 5 , Hf 2 C, and Ta 5 C 4 were also considered in our calculation 58,59 . Furthermore, no stable compound has been found in the Hf-Ta system. Taking this information into account, we performed a COPEX search and obtained the phase diagram of the Hf-Ta-C system at ambient pressure, as shown in Fig. 6. Eventually, our study identified 14 stable ternary compounds containing HfTa 8 C 9 (P1), HfTa 6 C 7 (R3), HfTa 5 C 6 (C2/m), HfTa 3 C 4 (C2/m), Hf 3 Ta 5 C 8 (R3m), HfTaC 2 (R3m), Hf 3 TaC 4 (C2/m), Hf 5 TaC 6 (C2/m), Hf 6 TaC 7 (R3), Hf 9 TaC 10 (P1), HfTa 2 C 2 (R3m), HfTa 5 C 5 (Cm), Hf 2 Ta 4 C 5 (C2/m), Hf 6 TaC 6 (R3). In all of these  60 . Structural information for these phases is given in Supplementary Table 3. To verify the dynamical stability of the predicted phases, we performed phonon calculations. There are no imaginary frequencies in the whole Brillouin zone, which indicates that the predicted phases are dynamically stable. We also calculated their elastic constants (Supplementary  Table 4), which all satisfy Born criteria of mechanical stability 49 . It is worth noting that 10 out of the 14 stable ternary compounds are located on the pseudobinary HfC-TaC joint, while the other four are not. Therefore, we shall discuss their structural features separately.
From the phase diagram (Fig. 6) we can see that the HfC-TaC pseudobinary phase boundary line is the richer area that contains more low-energy structures in the global Hf-Ta-C energy landscape. The 10 stable ternary phases along this line all adopt the B1 rocksalt structure in which the metal atoms occupy a facecentered cubic lattice and carbon atoms fill the octahedral interstices (see Fig. 7a). Different ratios of Hf to Ta and ordering schemes result in a number of distinct phases. One can easily analyze the arrangement of metal atoms on the (001) plane of the B1 structure. Given this perspective, as shown in Fig. 7b, HfTa 8 C 9 is obtained by substituting one tantalum atom of TaC with one hafnium atom at intervals of 8 tantalum atoms in each horizontal atom array. HfTa 6 C 7 , HfTa 5 C 6 , and Hf 9 TaC 10 can be constructed following the same principle. In contrast, HfTa 3 C 4 and HfTaC 2 follow another rule, in which the whole tantalum atom array is replaced by hafnium atoms. The former is arranged alternately by one row of hafnium atoms and three rows of tantalum atoms, while the latter is arranged alternating one row of hafnium atoms and one row of tantalum atoms. Interestingly, Hf 6 TaC 7 , Hf 5 TaC 6 , and Hf 3 TaC 4 can be constructed by swapping the hafnium and tantalum atoms of HfTa 6 C 7 , HfTa 5 C 6 , and HfTa 3 C 4 , respectively. Among the 10 ternary phases, Hf 3 Ta 5 C 8 is different from others. It is formed by intergrowth of one layer of HfTa 3 C 4 and one layer of HfTaC 2 , as highlighted in Fig. 7l.
The four stable ternary compounds, Hf 2 Ta 4 C 5 , HfTa 5 C 5 , Hf 6 TaC 6 , and HfTa 2 C 2 , which do not belong to the pseudobinary HfC-TaC system, are vacancy-ordered derivatives of the rocksalt structure, as illustrated in Fig. 7m-r. Hf 2 Ta 4 C 5 inherits the structure prototype of the stable binary carbide Ta 6 C 5 . By substituting the tantalum atoms at 4i-sites (Wyckoff position) in Ta 6 C 5 with hafnium atoms one gets the structure of Hf 2 Ta 4 C 5 . In addition, an alternating arrangement of vacancies of one in every three layers of carbon atoms can be observed. Furthermore, by substituting half of hafnium atoms alternately with tantalum atoms, one can further get the HfTa 5 C 5 structure. Similarly, Hf 6 TaC 6 can be obtained by replacing hafnium atoms at 3b-sites with tantalum atoms of the Hf 7 C 6 phase. Interestingly, HfTa 2 C 2 forms a sandwich quasi-2D structure that distinguishes itself from the other predicted structures. Besides, by removing all the carbon atoms in the vacancy layers of Hf 2 Ta 4 C 5 and slightly shifting the remaining atoms, one can get the structure of HfTa 2 C 2 . In total, all the ternary stable phases in the Hf-Ta-C system have one basic prototype, and structures with ordered distribution of metal atoms and vacancies have lower energies than disordered structures. In addition, the simplest stoichiometries of all the stable ternary compounds predicted here can be simplified into either Hf x Ta y C x+y or Hf x Ta y C x+y−z (x, y, and z are positive integers). Based on our findings, the highest concentration of vacancies in Hf x Ta y C x+y−z is 1/3.

Performance of the co-evolutionary algorithm and its verification
In order to verify the effectiveness of our co-evolutionary algorithm, we have compared its performance with the conventional USPEX algorithm. In such comparative runs, the number of sampled structures should be kept at the same level. Considering the complexity of the ternary system, a long evolutionary search is needed in every single run. For instance, about 20 thousand structures were searched in the W-Cr-B and Hf-Ta-C systems. The benchmark results are given in Table 1. In the W-Cr-B system, by sampling around 20k structures the two COPEX runs successfully predict all the five stable ternary compounds, while in contrast, no stable ternary compounds are found by conventional USPEX. We have continued one USPEX run until over 30k structures were sampled, and then one stable ternary compound, WCrB 2 , was found. In the Mg-Si-O and Hf-Ta-C systems, the results (see Table  1) also show that COPEX algorithm works much better than conventional USPEX algorithm.
In addition, in order to evaluate the net effort of co-evolution, we have performed several comparison runs with/without coevolution. The equivalent number of independent USPEX processes ran in parallel is the same as in COPEX, while the coevolution module remains inactive, we may call such runs as M-USPEXs. In the W-Cr-B system, the results show two of the five stable ternary compounds, WCr 2 B and W 2 CrB 2 were found in the M-USPEXs run. We can see that M-USPEXs already surmount the performance of the conventional USPEX, and importantly, the net benefit for co-evolution can be evidenced by the prediction of the remaining three compounds, W 4 CrB 3 , WCrB, and WCrB 2 . Similar comparative calculations are performed in the Mg-Si-O and Hf-Ta-C systems, the results (Table 1) demonstrate the net benefit of the co-evolutionary algorithm.
One may wonder how the co-evolutionary algorithm enhances the structure sampling in the evolutionary searches. Here we have constructed a low-dimensional representation of the complex high-dimensional atomic environments of the W-Cr-B system as an example. Principal component analysis (PCA) 61 of the local structure environment based on the smooth overlap of atomic positions (SOAP) 62,63 descriptor was performed. In such a way, the structures, especially the local atomic environments, sampled by different algorithms can be described and compared in a lowdimensional manner, amenable to visualization. By analyzing all the structures sampled by COPEX and USPEX, we have built the PCA maps shown in Fig. 8. One can see that both COPEX and UPSEX algorithms can generate structures diversely, which is the fundamental requirement for the success of a structure prediction algorithm 64,65 . Furthermore, the stable ternary compounds are all located in the hot sampled region of the PCA map from the COPEX run, indicating the co-evolutionary algorithm spends more effort in the area(s) of the energy landscape where stable structures exist. We have also compared the chemical composition sampling intensity maps obtained by COPEX and USPEX for the Hf-Ta-C system. The results show that COPEX has an enhanced sampling region as highlighted in Fig. 9, in which nine out of the ten predicted stable ternary compounds are located. USPEX shows similar but much lighter sampling intensity than COPEX. In other words, USPEX lays the foundation which ensures the sampling diversity and COPEX further enhances the sampling of the promising regions of the chemical space.

DISCUSSION
Predicting the crystal structure of ternary and more complex compounds is very challenging, as the chemical and configurational space expands dramatically with increasing number of atomic types and the total number of atoms. Here, we propose an algorithm to address this challenge, developing the algorithm called COPEX, coupled with the powerful evolutionary algorithm USPEX and based on the idea of co-evolution. Within COPEX, multiple USPEX processes are carried out in parallel, where two of them target the whole configurational space while the rest are designed to focus only on the most promising area of the compositional space. Co-evolution is achieved by sharing structural information on the fittest individuals among the different USPEX processes during the evolution. Through sharing highquality structures, COPEX can maximize the characteristic of the self-improving evolutionary algorithm and improve the success rate of structure prediction.
In order to demonstrate the performance of the COPEX algorithm, we have chosen three quite different systems, the intermetallic systems W-Cr-B and Hf-Ta-C, and the ionic-covalent system Mg-Si-O, to explore. The atomic radii of Cr and W are significantly different, while the atomic radii of Hf and Ta are almost the same. This explains the different behavior of these two systems. In the W-Cr-B system, besides W 2 CrB 2 and W 4 CrB 3 we have also found three stable ternary compounds, WCrB, WCrB 2 , and WCr 2 B, belonging to new structure types. In the Hf-Ta-C system, we have identified 14 stable ternary compounds, all of which are based on the B1 rocksalt structure type including four compounds with vacancy ordering. For the Mg-Si-O system, besides Mg 2 SiO 4 and MgSiO 3 , no other ternary compound has been found at pressures up to 40 GPa. Importantly, we have found Fig. 7 Crystal structures of the phases in the Hf-Ta-C system. a Crystal structure of TaC, and the two-dimensional projections (along the [001] direction of the B1 structure) of the structures b HfTa 8 C 9 , c HfTa 6 C 7 , d HfTa 5 C 6 , e HfTa 3 C 4 , f HfTaC 2 , g HfC, h Hf 9 TaC 10 , i Hf 6 TaC 7 , j Hf 5 TaC 6 , k Hf 3 TaC 4 . l The crystal structure of Hf 3 Ta 5 C 8 is formed by alternation of one layer of HfTaC 2 (denoted as P 1 ) and one layer of HfTa 3 C 4 (denoted as P 2 ). Crystal structures and their projections along the [001] direction of the B1 structure: m Ta 6 C 5 , n Hf 2 Ta 4 C 5 , o HfTa 5 C 5 , p Hf 7 C 6 , q Hf 6 TaC 6 , r HfTa 2 C 2 .
at low temperature a region of decomposition (27.29-28.72 GPa at zero Kelvin) where no stable ternary compounds exist.
In conclusion, we have applied COPEX to three representative systems W-Cr-B, Mg-Si-O, and Hf-Ta-C, and then evaluated its performance based on benchmark tests, and analyzed its sampling efficiency using a low-dimensional representation of the complex high-dimensional atomic environments and chemical composition space. The results demonstrate the reliability and efficiency of this methodology and illustrate that COPEX is a powerful tool for predicting new complex materials.

DFT calculations
Density functional theory (DFT) calculations within the Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA 66 ), implemented in the VASP 67 code, are used for structure relaxations and total energy calculations. The allelectron projector-augmented wave method is used, with a plane-wave kinetic energy cutoff of 600 eV and k-point mesh with the reciprocal-space resolution of 2π × 0.02 Å −1 . Phonon calculations were performed using Phonopy 68 code using the force constants calculated by VASP.

COPEX calculations
The COPEX calculations were performed by following the procedure shown in Fig. 2 and the sub-processes were carried out using the evolutionary algorithm USPEX. In the two USPEX runs that cover the whole ternary chemical space, 100 structures are produced in each generation, while in other USPEX runs, 60-80 structures are produced. In each USPEX run, structures of the first generation were produced randomly. For variable-composition searches, the fractions of structures produced by random structure generator, heredity, softmutation, lattice mutation, and transmutation were 35, 35, 10, 10, and 10% in the second generation. For fixed-composition searches, the fraction of structures produced by random structure generator, heredity, softmutation, lattice mutation, and permutation were 35, 35, 10, 10, and 10% in the second generation. In subsequent generations, the weights of variation operators were adjusted dynamically. Stability of a compound is determined using the thermodynamic convex hull construction.  X. Liu et al.
In step 3, 10% fittest structures (i.e., those closest to the convex hull) from each USPEX sub-process were extracted to construct a structure pool, in which identical structures are discarded.
In step 4, structures in the structure pool are manipulated in two ways. First, structures will be distributed to each USPEX sub-process as "seed" structures. Such structures are treated the same as the structures generated by USPEX. Structures which do not exist in the current USPEX sub-process will be selected as the high-quality structural gene carriers to generate child structures. Importantly, the distributed structures must follow the stoichiometry rule of each USPEX sub-process. For instance, the stoichiometry of seed A x B y C z must satisfy m × C 1 = x, n × C 1 + p × C 2 = y, and q × C 2 = z, in which A m B n and B p C q refer to the end-point compounds of the pseudobinary system, and all the variables are positive integers.
Second, we apply the variation operators to structures in the structure pool to generate structures and then remove duplicate structures using fingerprints 69 . The resulting structures will be added to different USPEX sub-processes according to the distribution principle discussed previously. The variation operators used here include heredity, softmutaion, lattice mutation, and MEtransmutation. Among them, heredity, softmutation, lattice mutation were already implemented in USPEX 32,37 . In addition to these operators, an operator named MEtransmutation (Multi-Equivalent positions transmutation) is introduced here. In the MEtransmutation operator, atoms of a certain type populating a randomly selected Wyckoff position are replaced by atoms of another type. The heredity and MEtransmutation operators are very effective in generating high-quality offspring as demonstrated in Supplementary Figs 7-9.
In step 5, information on all the sampled structures is gathered and the composition-enthalpy phase diagram is constructed. Importantly, edge lines of the convex hull on the phase diagram are the potential searching paths. Searching priority of these pseudobinary systems is ranked by the numbers of low-energy structures along the path inspired by the Bell-Evans-Polanyi principle 70,71 and paper 69 , which demonstrated that low-energy minima are clustered in compact regions of the configurational space. In addition, whenever a ternary compound appears on the current phase diagram, its composition will be added to the list of fixedcomposition sub-processes. One can set how many USPEX sub-processes are to be run in parallel, but all the promising pseudobinary and fixedcomposition systems are recommended to be searched eventually. COPEX will launch new USPEX sub-processes based on the above criteria. In our work, the top 10% structures based on the rank of fitness according to their enthalpy will be deemed as low-energy structures. For instance, COPEX can easily find out that MgO-SiO 2 , MgSiO 3 , and Mg 2 SiO 4 are the promising systems that shall be explored. The two ternary variable-composition USPEX sub-processes are kept alive during the whole COPEX search. The pseudobinary or fixed-composition subprocesses are evaluated every 10 generations, and run on subsystems that only produce structures that are far above the updated convex hull, will be terminated. Naturally, COPEX can reactivate such sub-processes as the search continues. One can also launch or terminate sub-processes manually.

DATA AVAILABILITY
The data supporting the findings of this study are available within the paper and Supplementary information. Structures of all stable and some low-energy metastable structures are available via Github at: https://github.com/Dustglaxy/Co-evolutionarycrystal-structure-prediction-algorithm-for-complex-systems. Fig. 9 Sampling intensity maps of the Hf-Ta-C system in chemical composition space. Data are generated from a COPEX and b USPEX. The map is colored according to the sampling intensity, divided into 23 × 20 bins, and the color bar refers to the number of sampled structures in each bin. The stable ternary compounds are marked by black circles.