Introduction

Inverse materials design, the prediction of a structure and composition exhibiting targeted properties, is used to accelerate materials discovery for light emission, sensing, lasing, energy harvesting, and energy storage. Recently, deep learning (DL) models have predicted the properties of molecular and inorganic crystals7,8,9. However, even with a deep learning speed-up of \(10^{5}\) in predicting material properties relative to a single DFT calculation (Supplementary Note 1), exhaustively exploring the compositional and structural space using existing models remains infeasible: there are \(\sim 10^{7}\) inorganic ternary10 and \(\sim 10^{10}\) quaternary compounds10,11, and even more variations for alloyed and multinary compositions12. Therefore, these property prediction models are usually combined with a search algorithm such as a genetic algorithm (GA) for materials space search13,14,15. This enables the prediction of materials with the optimal set of properties16,17. However, neither of these components directly enables interpretability and explainability of the behavior of materials and the properties they exhibit.

The development of methods that are adaptable to different applications is equally important. For example, an ML model trained to predict bandgaps of materials can be used to search for materials that emit across a wide range of wavelengths such as UV (<400 nm) or IR (>700 nm). An effective interpretability method should be able to extract design rules for all these applications. Tools such as GNNExplainer18 explain the origin of candidate material properties, but they are not efficient at extracting chemical rules and theories from the trained property prediction model. Furthermore, such approaches apply to only certain types of ML methods: for instance, GNNExplainer identifies the subgraph of the input graph structure to the GNN that is dominating the prediction by maximizing mutual information between various possible subgraphs and outcome prediction. GNNExplainer is effective at explaining outcomes for a well-trained Graph Neural Network but cannot be applied to neural networks other than GNNs such as generative models19.

To overcome the challenge of efficient search and interpretability, we sought to develop a machine-learned framework, one that we term DARWIN: Deep Adaptive Regressive Weighted Intelligent Network. There are three components to DARWIN: a surrogate model, a search algorithm, and the means to distill knowledge in a way that humans can understand. We combine property prediction models and search algorithms with a supervised learning component to extract scientific insights. As part of the approach, we first generate multiple candidates that meet the desired target properties such as stability and UV bandgap. We then use statistical techniques and supervised ML to generate and identify relevant statistically significant chemical rules (Fig. 1 for a summary of the approach). While the approach itself does not make any assumptions about the property prediction model or the search algorithm, we use GNNs as surrogate models and GA as a search algorithm for demonstration.

Fig. 1: Different steps of DARWIN.

Input crystals were generated using substitutions in prototype structures and spanned over 7 crystal systems and 220 space groups. DARWIN uses trained Graph Networks as surrogate models and mutations to find new candidates that meet the target specifications. Both the negative and positive pools of candidates generated are then characterized using different chemical featurization and subjected to supervised learning combined with statistical testing. The statistically significant rules derived in this way enable the discovery of new compounds and uncover new chemical trends which can be intuitively explained to experimentalists.

This paper is organized as follows: we discuss the three components of DARWIN: ML surrogate models that predict material properties using unrelaxed structures, integration of an evolutionary algorithm, and finally, methods for extracting interpretable chemical design rules. We demonstrate the practicality of DARWIN through two use cases: the design of stable UV light-emitting materials and direct bandgap materials.

Results and discussion

ML surrogate models

We focus on optoelectronic applications of materials and therefore, train ML models for three properties: energy above the hull, bandgap, and nature of bandgap. Data for energy above the hull and direct–indirect classification was obtained from the Materials Project database20,21. We train the bandgap regressor using a recently published HSE06 xc-functional based dataset22 (refer to “Methods” subsection on “Data generation for ML” for more details; please see Supplementary Fig. 1 for analysis of the data distribution across training, validation, and testing splits of the relevant properties and Supplementary Fig. 2 for distribution of the crystal structures).

We use graph convolutional networks (GCNs) for property prediction. GCNs generate a global representation of the crystal structure from the chemical features of the element at each node and from the edge features (Fig. 2a; see the “Methods” subsection on “General crystal graph network structure” for details). Several GCN architectures6,9 have been reported in the literature to predict properties based on DFT-relaxed structures, but very few12,23 predict properties based on unrelaxed structures, which is necessary to perform high-throughput screening without computationally expensive DFT geometry optimization.

Fig. 2: Surrogate models.

a Mapping crystals to graph representations through encoding, we train graph neural networks to predict the desired property: bandgaps, energies, and direct/indirect nature. b Performance of different GCNs in predicting energies above hull to determine material stability. c Performance of different GCNs in predicting HSE06 exchange-correlation functional calculated bandgaps from unrelaxed initial structures without DFT relaxation. d Performance of the classifier on the direct–indirect classification task using different ML methods (TL: transfer learning approach used in this study; ‘r-MEGNet’ refers to the MEGNet model trained using relaxed optimized geometries as a baseline; ‘r-CGCNN’ refers to the CGCNN model trained using relaxed optimized geometries as a baseline). For relaxed-structure predictions, we report only the best-performing model out of all those considered (CGCNN, MPNN, SchNet, MEGNet) to set up a strong baseline.

It is worth noting that, in addition to GCNs, recent progress in predicting material properties has been enabled by the use of generative models24,25 such as the invertible crystallographic representation19 and the diffusion-based graph generative model26. Generative models are better at predicting geometry-optimized structures accurately; it remains to be clarified whether they are superior to feed-forward ML models in predicting material properties. For instance, the most accurate model on the Matbench structure-based property prediction challenges is ALIGNN27, which is not a generative model. Herein we explore the use of graph neural networks as property prediction surrogate models. We suggest that these can potentially be replaced with generative models as the latter become more accurate, without any change to the interpretability framework.

To predict properties from unrelaxed structures, we adapt the MatDeepLearn framework to search, hyperoptimize, and benchmark several existing and new GCN architectures (refer to Supplementary Table 1 for all models considered)28. We also compare the performance of various GCNs against fine-tuning of pre-trained models. We found the most success in learning the map from unrelaxed initial structures to energy above hull and bandgaps by fine-tuning models pre-trained on formation energies obtained from the open quantum materials database (OQMD)22.

Observations of the training experiments for each target property (energy above the hull, bandgaps, and direct/indirect nature of bandgap) are summarized in Supplementary Table 2 and Fig. 2. Fine-tuned GCN models outperform other GCNs in predicting energy above the hull, bandgaps, and the nature of bandgaps (direct vs. indirect) from initial structures (Fig. 2). Using unrelaxed structures, the best GCN model predicts HSE06 bandgaps with a mean absolute error (MAE) of 0.35 eV on test data, and energies above hull with an MAE of 0.034 eV/atom (Fig. 2b, c). Our classifier predicts the direct–indirect nature of the material bandgaps with F1-scores of 0.76 and 0.84 using initial and relaxed geometries, respectively (Fig. 2d and Supplementary Fig. 3); this is close to the value (0.89) reported in a previous study on direct–indirect classification that was limited to the Kesterite family of compounds29.

Evolutionary algorithm for accelerated search in the chemical space

As the second step of DARWIN, we interface the trained ML models with a search algorithm30,31 (evolutionary algorithm/EA) to search through materials space. The fitness function of EA is set as the weighted sum of the mean squared errors of predicted bandgap, energy above the hull, and direct–indirect nature against their desired values. This fitness score is then used to score the candidates. The bottom half is discarded, and the top half is replicated but with each corresponding structure receiving a mutation, generating a new group of candidates to evaluate. We implement the mutation operation as a random elemental substitution of the same oxidation state. This ensures that the charge neutrality of the structure is maintained.
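The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the property names, target values, equal weights, and the two toy candidates are all assumptions made here, and the predicted values would in practice come from the trained surrogate models.

```python
def fitness(predicted, target, weights):
    """Weighted sum of squared errors between predicted and desired values."""
    return sum(weights[k] * (predicted[k] - target[k]) ** 2 for k in target)

# Hypothetical targets: 3.1 eV gap, 0 eV/atom above hull, direct gap (label 1).
target = {"bandgap": 3.1, "e_hull": 0.0, "direct": 1.0}
weights = {"bandgap": 1.0, "e_hull": 1.0, "direct": 1.0}  # equal weights

# Made-up surrogate predictions for two candidates.
candidates = {
    "A": {"bandgap": 3.0, "e_hull": 0.02, "direct": 0.9},
    "B": {"bandgap": 1.5, "e_hull": 0.30, "direct": 0.2},
}

scores = {name: fitness(p, target, weights) for name, p in candidates.items()}
ranked = sorted(scores, key=scores.get)  # lower score = closer to the target
```

Ranking by this score is what allows the bottom half of each generation to be discarded before mutation.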

EA relies on our models to predict the properties of interest and to evaluate the fitness of the set of candidates. Experiments show that mutations alone are enough to direct the search toward the optimal compositions in the large chemical space, allowing us to skip crossovers, as shown by the loss decreasing over successive generations (see Fig. 1 for a pictorial representation, the “Methods” subsection “Evolutionary algorithm” for more implementation details, and Supplementary Fig. 4 for loss as a function of generations).

Interpretability

Although GA combined with a surrogate model can efficiently search the chemical space and lead us to promising candidates, it does not on its own provide an intuitive understanding that can guide the experimental discovery of such materials. The last component of DARWIN solves this problem by identifying chemical features and rules that provide physical insights into the origin of properties and that can be used by chemists and materials scientists in the lab for the design of new materials. All the candidates generated by the GA during its run are collected and categorized into two groups: those that meet the desired target properties and those that do not. Materials in the two groups are featurized using several chemical features and operations on them (see Supplementary Note 2 for an exhaustive list of properties and operations), such as the electronegativity difference between the B and X sites of ternaries (AxByXz); the range, standard deviation, mean, and sorted difference of electronegativity and other elemental chemical properties; and the HOMO–LUMO corresponding to all the atoms and band centers.
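As a hedged illustration of this kind of featurization, the sketch below computes the mean, standard deviation, minimum, maximum, and range of one elemental property over a composition. The function name `composition_stats` and the choice to weight each element by its atom count are assumptions of this sketch; the electronegativities are standard Pauling values, which may differ from the scale used in the study.

```python
from statistics import mean, pstdev

# Pauling electronegativities (standard tabulated values).
PAULING_EN = {"K": 0.82, "Cu": 1.90, "Cl": 3.16}

def composition_stats(formula, table):
    """Summarize one elemental property over a composition.

    formula maps element symbol -> number of atoms per formula unit.
    Weighting each element by its atom count is an assumption of this sketch.
    """
    values = [table[el] for el, count in formula.items() for _ in range(count)]
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

# K2CuCl3 expressed as atom counts per formula unit.
features = composition_stats({"K": 2, "Cu": 1, "Cl": 3}, PAULING_EN)
```

Each entry of `features` would become one column of the table on which the classifier and the statistical tests operate.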

We train a simpler ML model such as Random Forest to learn the classification between the two groups using the generated features. Each feature then acts as a chemical rule and is characterized through two parameters: its relative importance and its statistical significance. We demonstrate two paradigms for acquiring importance: (1) Spearman’s coefficient; (2) permutation importance obtained using Random Forests. Post assignment of importance, we identify the statistical significance of each of these chemical rules using the Kruskal–Wallis H-test (Fig. 1 for the process summary).
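To make the two statistics concrete, here is a standard-library sketch of Spearman's coefficient and the Kruskal–Wallis H-test for the two-group case. It is an illustration under stated assumptions rather than the code used in the study: the feature values are invented, Random Forest permutation importance is not shown, and the p-value uses the chi-square approximation with one degree of freedom, which for two groups reduces to `erfc(sqrt(H/2))`.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def kruskal_two_groups(a, b):
    """Kruskal-Wallis H (tie-corrected) and p-value for two groups."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    ra, rb = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (ra ** 2 / len(a) + rb ** 2 / len(b)) - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    h /= 1 - ties / (n ** 3 - n)
    p = math.erfc(math.sqrt(h / 2))  # chi-square survival function, 1 d.o.f.
    return h, p

# Invented feature values for candidates that meet / miss the target.
positive = [1.0, 1.1, 1.2, 1.3, 1.4]
negative = [0.2, 0.3, 0.4, 0.5, 0.6]
rho = spearman(positive + negative, [1] * 5 + [0] * 5)
h_stat, p_value = kruskal_two_groups(positive, negative)
```

In this toy case the feature separates the two groups perfectly, so the correlation is strong and the p-value falls below 0.05; a feature with overlapping distributions would fail one or both checks.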

Both of these methods are used in tandem to derive scientific insights. In the following two subsections, we use DARWIN to solve two problems in material science: (a) design of direct–indirect bandgap materials and (b) design of stable UV light emitting direct bandgap materials.

Design of direct–indirect bandgap materials

The origin of the direct–indirect nature of the bandgap is of fundamental importance for a material’s use in optoelectronics32,33. While a recent study34 explored its origin, it focused only on binary III–V semiconducting materials.

Here, we extend the criterion and derive chemical rules that explain the origin of the direct–indirect nature of the bandgap across all stable p-block semiconductors using DARWIN (Fig. 3a). Our approach identifies that semiconductors composed of higher atomic mass p-block elements are more likely to exhibit a direct bandgap (elements that have a smaller melting temperature \({T}_{{\rm {m}}}\) and a larger covalent radius \({R}_{{{\rm {conv}}}}\) are favorable). Similarly, the more negative the LUMO energy among the individual atomic orbitals constituting the compound, and the larger the average number of p-valence electrons, the more likely the compound is to be stable and exhibit a direct bandgap. The former of these two rules has been reported in the literature34. We also observe that as the average electronegativity of the elements increases, the material tends to be a stable direct bandgap material. Thus, the chemical insights discovered by DARWIN not only reaffirm one of the previously reported rules but also provide us with additional statistically significant chemical rules.

Fig. 3: Chemical interpretability.

a Some relevant and statistically significant (p-value < 0.05) chemical rules generated by DARWIN for the design of stable direct bandgap p-block semiconductors (Ehull < 0.07 eV/atom). b Some of the relevant and statistically significant (p-value < 0.05) chemical rules generated by DARWIN for the design of direct bandgap UV halide-based semiconductors (bandgap range: 3.1 ± 0.3 eV, Ehull < 0.07 eV/atom). Here, μ() represents mean, \({\rm{min }}\) \((\cdot )\) represents the minimum, \({\rm{max }}\) \((\cdot )\) represents the maximum and σ() represents the standard deviation of the quantity enclosed. \({N}_{{\rm {v}}}^{{\rm {p}}}\) is the number of p-valence electrons, \({T}_{{\rm {m}}}\) is the melting point temperature, \(C\) is the column number in the periodic table, \({Z}_{{\rm {m}}}\) is the Mendeleev number, \(A\) is the atomic mass, \({R}_{{{\rm {conv}}}}\) is the covalent radius, \({N}_{{\rm {v}}}\) is the number of valence electrons, \(R\) is the row number in the periodic table and OEDW is the optimal electronegativity difference window.

Using these design rules, we modify some of the indirect bandgap materials that are widely used in semiconducting and catalytic applications. To test whether DARWIN-derived rules have wider application, we show both cation modification and mixed-anion compounds. We provide a reference that suggests that the synthesis of such compounds may now be feasible35. The results are shown in Table 1.

Table 1 Tuning of indirect bandgap materials to make them direct.

Design of stable UV light emitting direct bandgap materials

Next, we use DARWIN to solve a slightly more complicated multi-target materials discovery problem: the discovery of stable direct bandgap UV-light emitting materials (3–4 eV), a vast and relatively unexplored36 chemical space10,37. Findings from the interpretability analysis, when the search is limited to perovskite-like structures, reconfirm known predictive descriptors such as the role of A-site and B-site cations in the stability of typical perovskite-based crystals (Supplementary Figs. 5–7).

When the search is extended to all ternary halide-based compounds, we find several interesting relationships. The features from Fig. 3b are statistically significant (p-value < 0.05) and allow us to predict stable UV light-emitting candidates. We observe that \(\Delta {\rm{EN}}_{{\rm{X-B}}}\), the difference between the electronegativity (EN) of the X-site (most electronegative anion) and that of the B-site (the second most metallic element in the composition), ranks high and exhibits a small coefficient of variation (\(\sigma /\mu\) < 0.3). Further analysis revealed that \(\Delta {\rm{EN}}_{{\rm{X-B}}}\) lies within a narrow range (0.84, 1.5) whenever the material is a stable UV direct bandgap semiconductor. We denote this specific range as the optimal electronegativity difference window (OEDW).
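The OEDW criterion amounts to a simple range check, sketched below. Several choices here are assumptions: the helper name `in_oedw` is hypothetical, the electronegativities are standard Pauling values (the text does not state which scale was used), and the window endpoints are treated as inclusive.

```python
# Pauling electronegativities (standard tabulated values).
PAULING_EN = {"Cu": 1.90, "Cl": 3.16, "Br": 2.96, "F": 3.98}

# Window reported in the text; inclusive endpoints are an assumption.
OEDW = (0.84, 1.5)

def in_oedw(b_site, x_site, en=PAULING_EN, window=OEDW):
    """Return True if EN(X) - EN(B) falls inside the OEDW."""
    delta = en[x_site] - en[b_site]
    return window[0] <= delta <= window[1]

# Cu as the B-site with different halides on the X-site.
checks = {x: in_oedw("Cu", x) for x in ("Cl", "Br", "F")}
```

With these values, Cu–Cl (difference 1.26) and Cu–Br (1.06) fall inside the window, while Cu–F (2.08) falls outside it.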

The knowledge of OEDW was then conveyed to the experimental collaborator, who combined it with in-lab constraints and factors such as precursor availability, synthesis conditions, and equipment availability. These factors, together with the limited research on K/Cu-based systems at that time, led us to choose K2CuX3-based systems as the candidates to try experimentally. K2CuCl3 and K2CuBr3 were experimentally synthesized via spin-coating with an intermediate anti-solvent dripping step38,39. We found that K2CuCl3 meets the target specifications with emission below 400 nm (Fig. 4c and Supplementary Fig. 8 for experimental measurements). K2CuCl3 has also recently been synthesized independently40. Rb2CuCl3 satisfies the OEDW criterion, and a recent independent report on Rb2CuCl3 showed encouraging results41. It is worth emphasizing that instead of relying purely on search results, DARWIN aims to express the predictions in a chemical language that speaks to experimentalists. This enabled us to choose a chemical system that satisfied the chemical constraints and led to UV emission, and it enabled experimentalists to incorporate chemical knowledge, such as the solubility of precursors and temperature parameters, that is otherwise difficult to parameterize and model using ab-initio methods.

We also performed DFT simulations to verify the predicted optical properties of K2CuCl3. The initial structure was obtained by substitution in the prototype structure Eu2CuS3. The initial positions were then relaxed using the GGA xc-functional with an energy convergence criterion of 0.0001 eV and a maximum force convergence criterion of 0.01 eV/\({\text{\AA}}\). Simulated and experimental XRD peaks match, indicating that the structure obtained after optimization is close to the one obtained through experiments (Fig. 4d). Band structure calculations using the HSE06 exchange-correlation functional were performed on the relaxed geometry (Fig. 4a). The results from the E–k plot (Fig. 4b) indicate a direct bandgap at the \(\Gamma\)-point. Further analysis of the elemental contributions in the orbitally resolved projected density of states (PDOS) reveals that the halide species contribute significantly to the valence band maximum (VBM) of such materials and the B-cation dominates the conduction band minimum (CBM) (Fig. 4b), thus rationalizing the observation that \(\Delta {\rm{EN}}_{{\rm{X-B}}}\) is a good predictor of the bandgap. Specifically, in K2CuCl3, K+ does not contribute to the electronic structure, and the strong orbital interaction of the Cu and Cl species leads to the observed optical properties40,42.

Fig. 4: Experimental realization of K2CuX3 and computational studies.

a Simulated orthorhombic crystal structure (Pnma space group) of K2CuCl3 illustrating the 1D chains of [CuCl3]2− separated by K+. b Simulated band structure and Density of States of K2CuCl3 (Refer to Supplementary Information for similar analysis of K2CuBr3). c The absorption spectrum and PL profiles of K2CuBr3 and K2CuCl3. d Simulated21 and experimental (powder) X-ray diffraction measurements of K2CuBr3 and K2CuCl3.

We also used these rules to propose materials that have not been synthesized before, i.e., that are reported neither in academic papers nor in materials databases (OQMD, Materials Project, AFLOW). These materials are compiled in Table 2, with a more comprehensive list provided as Supplementary Table 1.

Table 2 List of promising materials with emissions close to 3.1 eV and not reported in the literature.

Design of stable IR light emitting direct bandgap perovskites

To test the broader application of the approach, we further apply DARWIN to search for stable direct bandgap IR halide perovskite materials. We focus on a target direct bandgap of 1.2 eV because this is of interest in tandem solar cells43,44. We initialize the search with halide-based compounds of general formula ABX3 (X = Cl, Br, and I; refer to the “Methods” subsection on “Parameters for genetic algorithm search and optimization of candidates”). Results of the interpretability analysis are shown in Fig. 5.

Fig. 5: Interpretation analysis for IR emitting perovskite materials.

The analysis shows that p-block elements (in addition to the halides themselves) are crucial. This is reflected in both the number of p-valence electrons (\({N}_{{\rm {v}}}^{{\rm {p}}}\)) and the preference for lower melting points (\({T}_{{\rm {m}}}\)) of the constituent elements. Higher electronegativities (\(\chi\)) also favor direct bandgap IR emission of halide perovskites, indicating a shift away from d-block transition elements. Finally, the orbital character (\(c\)~ s, p, d, f) indicates that materials whose HOMO–LUMO levels have contributions from s and p orbitals are more likely to exhibit IR emission.

Some of the chemical rules obtained via this analysis simply reconfirm prior literature. For instance, one of the prominent features is \(\frac{\sum {T}_{{\rm{m}}}}{\max (R)}\) (\({T}_{{\rm{m}}}\): melting temperature; \(R\): row number (1–7) in the periodic table), which shows a negative Spearman correlation and is statistically significant under the Kruskal–Wallis H-test45. This indicates that to achieve a bandgap with IR emission, the elements must be heavy, which is supported by existing literature: iodine-based perovskites such as MAPbI3 and CsPbI3 have small bandgaps. The melting point of the metals (\({T}_{{\rm{m}}}\)) and the number of p-valence electrons (\({N}_{{\rm{v}}}^{{\rm{p}}}\)) appear frequently and show statistically significant negative and positive correlations, respectively, with the ability to emit in the IR. Thus the transition series metals of the periodic table (rows 4–6 and groups 3–10) are not suitable for IR perovskites. Existing Sn-, Ge-, and Pb-based small bandgap materials agree with this picture46,47. The range of the p-valence electrons is also linked to the IR emission behavior in a statistically significant way. IR emission is further linked to the HOMO–LUMO orbital characteristics of the atomic orbitals constituting the material: if both originate from s or p orbitals, a direct bandgap in the IR regime is likely. A few known compounds with IR bandgaps, such as CsSnI3, fit the above criterion. These interpretability guidelines can also be used to modify compounds so that their emission moves closer to the IR range. Some of the stoichiometric and alloyed compounds designed following the DARWIN interpretation rules are listed in Table 3.

Table 3 List of promising materials with emissions close to 1.2 eV and not reported in the literature.

Ablation experiment

Since there are different components within DARWIN, it is important to ask whether all of them are essential for DARWIN’s success. We set up an ablation study in which we remove the surrogate model and search algorithm and instead select candidates from the Materials Project (MP). We then compare the performance of this modified setup with DARWIN in developing interpretable chemical rules for stable UV direct bandgap perovskite materials (\({{E}}_{{\rm{g}}}\in [2.8\,{\rm{eV}},3.4\,{\rm{eV}}]\)) using the correctness and completeness dimensions of interpretability as proposed by Oviedo et al.48.

Since MP bandgaps are calculated with the PBE xc-functional, they severely underestimate experimental bandgaps. We fit a linear scale between MP bandgaps and experimental bandgaps49 (refer to Supplementary Note 4 for the equation and Supplementary Fig. 9 for a comparison). We then search MP for materials with UV bandgaps using this linear transformation. We end up with 5 compounds that fit the criterion of having energy above hull less than 0.07 eV/atom and a direct UV bandgap. However, with a 10% margin of error and a 95% confidence interval, the recommended sample size for performing significance analysis (calculating p-values) is 97, which is 8 times larger than the number of materials that could be collected from the Materials Project (required sample sizes calculated using a power test50). Thus, removing the search component of DARWIN leads to an inability to calculate statistical significance due to a lack of promising candidates. Even if the significance values obtained were correct, the critical feature we obtained for UV design, namely OEDW, was found to be statistically insignificant (p-value > 0.05) with a very small Spearman correlation of 0.07. On the other hand, OEDW was found to be statistically significant using DARWIN for UV light-emitting perovskite design, with a Spearman correlation of 0.54 (Fig. 6a). Both observations indicate that the baseline interpretability approach fails on the completeness front.
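For orientation, the figure of 97 can be reproduced with a common sample-size formula for estimating a proportion, \(n = z^{2}p(1-p)/e^{2}\). Whether this is the exact calculation behind the cited power test is an assumption; the function name and the conservative choice \(p = 0.5\) are also introduced here for illustration.

```python
import math
from statistics import NormalDist

def required_sample_size(margin, confidence, p=0.5):
    """Sample size for estimating a proportion to within +/- margin.

    Uses n = z^2 * p * (1 - p) / margin^2 with p = 0.5 as the most
    conservative choice (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

n_required = required_sample_size(margin=0.10, confidence=0.95)
```

Here `z` is about 1.96, so the unrounded value is about 96.04, which rounds up to 97.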

Fig. 6: Comparison of baseline method against DARWIN.

a When we evaluate the OEDW interpretability insight on the smaller dataset that we could extract from MP, we find that it yields a very small Spearman correlation and is also statistically insignificant. b When we evaluate the top-ranked chemical rules of the baseline method (either from Spearman analysis or random forest permutation importance), we find that the majority of them turn out to be statistically insignificant (p-value > 0.05) on the larger pool of promising candidates generated by the search algorithm of DARWIN.

Finally, we train a random forest classifier on the data collected from MP. The top non-trivial features obtained are neither statistically significant nor highly predictive in the larger pool of candidates that were predicted by the search algorithm and surrogate models of DARWIN. Of the top 30 chemical rules derived from this small dataset, 63% turned out to be statistically insignificant (p-value > 0.05) on the larger set of material candidates generated by the evolutionary algorithm (Fig. 6b). This means that relying only on screening MP would, in this case, limit the exploration of candidates and lead to wrong chemical insights. This baseline approach, therefore, fails on both the correctness and completeness fronts of interpretability. This shows that it is the combination of an accurate surrogate model, large candidate pools generated using a search algorithm, and feature-based interpretability analysis that makes DARWIN effective for interpretable ML for materials discovery. For the application cases shown, the approach not only recovered the known rules for the design of materials but also discovered new chemical rules. These rules enabled the discovery of materials that met the design specifications with a human in the loop. The approach, therefore, adds interpretability to ML-powered materials discovery pipelines rather than providing predictions alone.

Methods

Data generation for ML

For predicting the stability and optoelectronic properties of the materials, we use DFT calculations to get energy above the hull and bandgaps coupled with the direct/indirect nature of the band structure. We trained GNNs on energy-above-hull and direct–indirect data obtained from the Materials Project on about 117,000 and 45,000 compounds, respectively. The total energy values obtained from the Materials Project are based on the Perdew–Burke–Ernzerhof exchange-correlation functional which has been shown to perform satisfactorily for predicting the stability of the compounds6,51. The list of mp-ids of all the materials data used for this study is attached as part of the SI. We remove all the entries from the dataset that have energy above hull >2 eV/atom since those represent highly unstable compounds indicating either a very unreasonable geometry or problems with DFT results. To train the bandgap regressor, we use the open-source dataset22 on HSE bandgaps. Furthermore, the direct–indirect classification dataset is unbalanced; therefore, we perform under-sampling and use the balanced subset to train the models. The initial structures, as referred to here, are procured using the MPRester API by specifying the property ‘initial_structure’.
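The two data-cleaning steps described above, dropping entries with energy above hull greater than 2 eV/atom and under-sampling the direct–indirect data to a balanced subset, can be sketched as follows. The record layout, field names, and random under-sampling strategy are assumptions made for illustration; the toy records are synthetic.

```python
import random

def clean_and_balance(records, max_e_hull=2.0, seed=0):
    """Filter unstable entries, then under-sample the majority class.

    records: list of dicts with 'e_hull' (eV/atom) and 'is_direct' (bool).
    Returns (kept, balanced); field names are hypothetical.
    """
    kept = [r for r in records if r["e_hull"] <= max_e_hull]
    direct = [r for r in kept if r["is_direct"]]
    indirect = [r for r in kept if not r["is_direct"]]
    n = min(len(direct), len(indirect))
    rng = random.Random(seed)
    balanced = rng.sample(direct, n) + rng.sample(indirect, n)
    rng.shuffle(balanced)
    return kept, balanced

# Synthetic records: 3 direct and 6 indirect, plus one highly unstable entry.
records = [{"e_hull": 0.01 * i, "is_direct": i % 3 == 0} for i in range(9)]
records.append({"e_hull": 2.5, "is_direct": True})

kept, balanced = clean_and_balance(records)
```

Fixing the seed keeps the balanced subset reproducible across runs, which matters when comparing models trained on it.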

General crystal graph network structure

We used the MatDeepLearn28 package combined with the PyTorch framework and the PyTorch-Geometric module to build and test the crystal graphs and implement the GCN models. The method to encode the crystal structures as graphs has previously been reported in the literature8,9,52. Crystal structures are represented as G := {V, E}, where V is the set of atoms, represented as nodes, and E is the set of edges, each connecting two atoms and carrying spatial information. This enables one to represent the 3D geometrical and stoichiometric information of the crystals as graphs. Please refer to Supplementary Note 3 for the exhaustive list of hyperparameters used for hyperparameter optimization.

In general, the process can be summarized as follows:

Crystal graphs are fed to the network in batches. We first apply graph convolution operations to them. Convolution operations on a node \(i\) can be represented as \({\rm{Conv}}({u}_{i},{u}_{j}^{j\in N\left(i\right)},{e}_{{ij}}^{j\in N(i)})\). Several convolution operations have been proposed in literature9,52,53. This quantity is then used to update the node representation for node \(i\) as

$${u}_{i}\to f\left({u}_{i},{\rm{Conv}}\left({u}_{i},{u}_{j}^{j\in N\left(i\right)},{e}_{{ij}}^{j\in N\left(i\right)}\right)\right)$$
(1)

This convolution operation is repeated depending on the chosen hyperparameter. Post the convolutions, we perform global pooling of all node features per graph to obtain a fixed-length vector representation of the crystal geometries under inspection (max pooling, min pooling and mean pooling are a few examples). This is represented as

$$U={\rm{Pool}}_{i\in V}\left({u}_{i}\right)$$
(2)

In a generalized framework, this vectorial representation is further operated upon using one or more dense layers

$$U\to {\rm{act}}\left(A* U+B\right)$$
(3)

where A represents the weights, B represents the biases and \({\rm {{act}}}\) represents the activation function of a dense layer. Finally, the output layer has a single node to enable the prediction of desired properties such as the bandgap or energy above the hull in our case.
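Equations (1)–(3) can be traced on a toy graph with plain Python lists. This is a sketch under stated assumptions, not a reproduction of CGCNN or MEGNet: the convolution is an edge-weighted mean of neighbor features, the update \(f\) is a simple sum, the pooling is a mean, and all weights are fixed by hand rather than learned.

```python
def conv(u, neighbors, edges):
    """Toy Conv: edge-weighted mean of neighbor features for each node."""
    messages = []
    for i, ui in enumerate(u):
        total = sum(edges[(i, j)] for j in neighbors[i])
        msg = [0.0] * len(ui)
        for j in neighbors[i]:
            for k in range(len(ui)):
                msg[k] += edges[(i, j)] * u[j][k]
        messages.append([m / total if total else 0.0 for m in msg])
    return messages

def update(u, messages):
    """f in Eq. (1), taken here as u_i + Conv(...) (an assumption)."""
    return [[a + b for a, b in zip(ui, mi)] for ui, mi in zip(u, messages)]

def mean_pool(u):
    """Eq. (2): fixed-length graph vector from all node features."""
    return [sum(column) / len(u) for column in zip(*u)]

def dense(x, A, B, act):
    """Eq. (3): act(A * x + B) for one dense layer."""
    return [act(sum(a * xi for a, xi in zip(row, x)) + b) for row, b in zip(A, B)]

def relu(v):
    return max(0.0, v)

# Two-atom graph with two features per node and symmetric edge weights.
u = [[1.0, 0.0], [0.0, 1.0]]
neighbors = {0: [1], 1: [0]}
edges = {(0, 1): 0.5, (1, 0): 0.5}

for _ in range(2):  # number of convolutions is a hyperparameter
    u = update(u, conv(u, neighbors, edges))

U = mean_pool(u)
hidden = dense(U, A=[[1.0, -1.0], [0.5, 0.5]], B=[0.0, 0.1], act=relu)
prediction = dense(hidden, A=[[1.0, 1.0]], B=[0.0], act=lambda v: v)[0]
```

The final layer has a single output node, mirroring the scalar property prediction described above.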

In an untrained GCN, the predicted values differ significantly from the ground truth. The error is then backpropagated using a gradient-based optimizer, and all the weights and biases are updated with every epoch until the prediction error falls to an acceptable level. The error is usually reduced further by changing hyperparameters such as the number of dense layers, the number of graph convolution layers, the pooling type, and the optimizer parameters.

Transfer learning

Both architectures (CGCNN and MEGNet) were first fully trained on a dataset of 500k formation energies obtained from OQMD. After hyperparameter optimization, all the convolution and pooling layers were frozen. The number of frozen dense layers, the learning rate, and the batch size were treated as hyperparameters for transfer learning.

Evolutionary algorithm

The EA operates on a surrogate model composed of the three predictive ML models built for the various prediction and classification tasks. A selection criterion is designed for target material properties such as the bandgap value and stability. In general, the multi-step iterative process by which the evolutionary search is implemented is as follows: (1) initialization of primary candidates denoted as the initial generation; (2) prediction of material properties using the ML models; (3) evaluation of the current generation; (4) selection of the fittest candidates; and (5) mutations in the selected individuals, and developing a new generation of candidates.

Over successive iterations, the evolutionary algorithm converges and outputs a set of candidates that are optimal given the current set of selection parameters.

Initialization: In the initialization step, we select a set of elements and generate an initial set of candidates based on the 200 crystal structure types and 7 families. We select the bandgap and energy above the hull values that we would like to optimize for and set these as the search criteria.

Prediction: Crystal graphs are generated via the aforementioned process and fed as inputs into the three pre-trained ML models to obtain prediction values for the bandgap, energy above the hull, and direct–indirect classification.

Evaluation: We evaluate each individual in the current generation using the loss metric in Eq. (4), which is a weighted sum of the squared losses between each predicted property and its target selection value, where \({\lambda }_{i}\) are normalizing factors for each loss component. For the selection procedures, we set all the weights to be equal. We initialize with a population of 20 randomly chosen and substituted prototype structures, originally obtained from AFLOWlib54,55, and set the generation limit threshold at 200.

Selection: After evaluating the loss, we rank all individuals in the current generation by their loss, discard the bottom half, and retain the remaining population.

Mutation: We then mutate each top-ranked individual in the population, where a mutation is defined as a single elemental substitution in the crystal structure by an element with the equivalent oxidation state, so that structural charge neutrality is retained. The new candidates are added to the current top-ranked individuals to form a new population, and the process is repeated starting at the evaluation step. Once the loss has plateaued, the EA proposes a set of candidate solutions that ideally match the initial selection criteria. The proposed crystal structures are then aggregated to form the candidate solutions for the given target conditions. This process is repeated (100 times in our experiments) for various selection criteria to span the bandgap range and to design a set of candidate solutions for further analysis and experimental realization.
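The evaluation, selection, and mutation steps described above can be sketched in a few lines. In this sketch a candidate is reduced to a tuple of element symbols, the substitution groups are a small illustrative subset of elements sharing a common oxidation state, and `toy_loss` is an arbitrary stand-in for the ML surrogate with no physical meaning. All names and values are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[str, ...]

# Elements grouped by a shared common oxidation state (illustrative subset).
SUBSTITUTION_GROUPS: List[List[str]] = [
    ["Na", "K", "Rb", "Cs"],  # +1 alkali metals
    ["Cu", "Ag"],             # +1 coinage metals
    ["Cl", "Br", "I"],        # -1 halides
]
GROUP_OF: Dict[str, List[str]] = {el: g for g in SUBSTITUTION_GROUPS for el in g}


def mutate(candidate: Candidate, rng: random.Random) -> Candidate:
    """Swap one site for a different element with the same oxidation state."""
    site = rng.randrange(len(candidate))
    options = [el for el in GROUP_OF[candidate[site]] if el != candidate[site]]
    mutated = list(candidate)
    mutated[site] = rng.choice(options)
    return tuple(mutated)


def evolve(population: Sequence[Candidate], loss: Callable[[Candidate], float],
           generations: int, rng: random.Random) -> List[Candidate]:
    """Rank by loss, keep the top half, and refill with mutated survivors."""
    current = list(population)
    for _ in range(generations):
        ranked = sorted(current, key=loss)       # lowest loss first
        survivors = ranked[: len(ranked) // 2]   # discard the bottom half
        children = [mutate(ind, rng) for ind in survivors]
        current = survivors + children
    return sorted(current, key=loss)


# Arbitrary per-element scores standing in for the ML predictions (no physical meaning).
TOY_SCORE = {"Na": 0.0, "K": 1.0, "Rb": 2.0, "Cs": 3.0, "Cu": 0.0,
             "Ag": 1.0, "Cl": 2.0, "Br": 1.0, "I": 0.0}


def toy_loss(candidate: Candidate, target: float = 4.0) -> float:
    return (sum(TOY_SCORE[el] for el in candidate) - target) ** 2


rng = random.Random(0)
initial = [("Na", "Cu", "I"), ("K", "Cu", "I"), ("Na", "Ag", "I"), ("Na", "Cu", "Br")]
final = evolve(initial, toy_loss, generations=30, rng=rng)
```

Because the top half of each generation is carried over unchanged, the best loss in the population can never increase from one generation to the next, which is consistent with the plateau behaviour described above.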

$${\mathcal{L}}={\lambda }_{1}{\left({\hat{E}}_{{\rm{gap}}}-{E}_{{\rm{gap}}}^{{\rm{target}}}\right)}^{2}+{\lambda }_{2}{\left({\hat{E}}_{{\rm{hull}}} \,<\, {E}_{{\rm{hull}}}^{{\rm{target}}}\right)}^{2}+{\lambda }_{3}{\left({\hat{E}}_{{\rm{direct}}}-1\right)}^{2}$$
(4)
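One possible reading of Eq. (4) in code is sketched below. The bandgap and direct-gap terms are plain squared differences. The hull term is written with an inequality in Eq. (4); this sketch assumes it acts as a one-sided penalty that vanishes when the predicted energy above the hull is below the target, which is an interpretation and not a statement of the implementation used in this work. The function name `ea_loss` and the example values are hypothetical.

```python
from typing import Tuple


def ea_loss(gap_pred: float, hull_pred: float, direct_pred: float,
            gap_target: float, hull_target: float,
            weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of squared deviations from the selection targets."""
    lam1, lam2, lam3 = weights
    gap_term = (gap_pred - gap_target) ** 2
    # Assumption: only predictions above the hull-energy target are penalized.
    hull_term = max(0.0, hull_pred - hull_target) ** 2
    # The direct-gap classifier output is pushed toward 1 (direct).
    direct_term = (direct_pred - 1.0) ** 2
    return lam1 * gap_term + lam2 * hull_term + lam3 * direct_term


# A candidate that meets every target has zero loss.
perfect = ea_loss(2.0, 0.0, 1.0, gap_target=2.0, hull_target=0.1)
# A candidate that misses all three targets accumulates all three terms.
imperfect = ea_loss(2.5, 0.3, 0.5, gap_target=2.0, hull_target=0.1)
```

With equal weights, as used for the selection procedures, each term contributes on the same footing; unequal weights would let one property dominate the ranking.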

Parameters for genetic algorithm search and optimization of candidates

All the case studies shown here were performed with the following set of parameters. The evolutionary algorithm search was conducted for 50 generations with a population size of 50. These evolutionary searches were performed 100 times. The best candidates that meet the threshold requirements were taken as class 1 whereas the worst-performing candidates were labeled as class 0. Performance was measured using the loss function as defined in the “Methods” subsection “Evolutionary algorithm”.

Experimental synthesis—film fabrication

Potassium halide (KX, X = I, Br, Cl), copper halide (CuX, X = I, Br, Cl), dimethylsulfoxide (DMSO), and dimethylformamide (DMF) were purchased from Sigma-Aldrich. Chloroform was purchased from DriSolv. All chemicals were used as received. The precursor solution was prepared by dissolving stoichiometric quantities of KX and CuX in a DMSO/DMF (25/75% v/v) solution (0.5 M) under continuous stirring for 1 h at room temperature. The concentration of the chloride-based precursor solution (in DMSO/DMF 75/25% v/v) was limited to 0.2 M due to the low solubility of the precursors. Glass substrates were O2 plasma-treated to improve adhesion. The precursor solution was spin-coated onto the substrates via a two-step process: 1000 rpm for 10 s and 3000 rpm for 60 s. During the second spin step, 0.5 mL of chloroform was poured onto the substrate. The films were then annealed at 110 °C for 10 min. All the samples were prepared in a glove box with an N2 atmosphere to control the atmospheric conditions.

Material characterization

X-ray diffractograms were recorded using a Rigaku MiniFlex 600 powder X-ray diffractometer equipped with a NaI scintillation counter and using monochromatized Cu Kα radiation (λ = 1.5406 Å). UV−Vis absorption was measured using a Perkin Elmer LAMBDA 950 UV/Vis/NIR spectrometer. PL measurements were collected using a UV–Vis USB 2000+ spectrometer (Ocean Optics). The samples were optically excited using a 355 nm frequency-tripled Nd:YAG laser with a pulse width of 2 ns and a repetition rate of 100 Hz.
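For readers comparing measured and simulated diffractograms, a peak position in 2θ can be converted to an interplanar spacing with Bragg's law, nλ = 2d sin θ, using the Cu Kα wavelength quoted above. The sketch below assumes first-order diffraction by default; the function name `d_spacing` and the example angle are illustrative and do not correspond to a measured peak from this work.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5406  # Cu K-alpha wavelength used for the diffractograms


def d_spacing(two_theta_deg: float, wavelength: float = CU_K_ALPHA_ANGSTROM,
              order: int = 1) -> float:
    """Interplanar spacing d (in angstrom) from Bragg's law: n * lambda = 2 * d * sin(theta)."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength / (2.0 * math.sin(theta))


# Illustrative peak position of 30 degrees in 2-theta (not a measured value).
example_d = d_spacing(30.0)
```

Larger 2θ values map to smaller spacings, so peaks at high angles correspond to closely spaced lattice planes.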

There are some missing peaks in the comparison between simulated and powder XRD patterns. We attribute the additional peaks observed in XRD to other remaining lattice planes that exist in the perfect crystal. As with all experimental synthesis, it is possible that certain planes were far more favorable in the thin-film fabrication process given the current set of precursor ratios and reaction conditions. This helps explain the mismatch in certain small, low-intensity peaks. At the same time, it is important to note that neither structure resembles that of the original precursors used to fabricate it. The simulated data assume that the orientation of the crystals is random; however, powder or thin-film samples would always have preferred orientations. A case in point is the PXRD pattern of K2CuCl3 in Figs. 1a and S6 of the study published by Creason et al.56. The latter only showed a few peaks compared to the former.

The bandgap and PL of K2CuX3 in this paper differ from those in other published works (Chem. Mater. 32, 6197−6205 (2020); Org. Electron. 86, 105903 (2020)). The crystal structures reported in those works are almost identical to ours: K2CuX3 is composed of 1D [CuX3]2− chains separated by K+. However, their optical characterization is based on single crystals, which may have significantly different optical properties from our solution-processed thin films. Overall, we attribute this difference to the material preparation method, which would also cause a discrepancy with the bandgap prediction, and so we consider the current deviation acceptable.