Lithium-ion batteries (LIBs) have shown great success as the major power source for transportation and as an energy storage solution for grid applications1,2,3. However, their relatively low energy density and the scarcity of Li raw materials are the main obstacles to large-scale application4,5. These issues call for higher-energy-density, cheaper, and more sustainable alternatives to present LIB technologies. Multivalent metal-ion batteries (MIBs), including Mg2+, Ca2+, Zn2+, and Al3+ batteries, have the potential to meet this need because of the relatively high abundance of these elements in the Earth’s crust and their high energy density6.

Within the arena of big data, deep learning (DL) has recently emerged as a game-changing technique, enabling numerous scientific applications in chemistry7, mathematics8, physics9, and biology10. In materials science, several DL models11,12,13 have also been developed, such as AtomSets14, SchNet15, the Material Optimal Descriptor Network (MODNet)16, the Compositionally Restricted Attention-Based Network (CrabNet)17,18, and the Crystal Graph Convolutional Neural Network (CGCNN)19,20,21,22,23,24. They have achieved great success in applications such as identifying degradation patterns of LIBs25, designing solid-state LIBs26, learning properties from multi-fidelity data27, discovering stable lead-free hybrid organic–inorganic perovskites28, screening 2D ferromagnets29, discovering and designing electrocatalysts30,31, mapping the crystal-structure phase32, and designing material microstructures33. It has been widely demonstrated that DL has much lower model errors than conventional machine-learning (ML) models, such as support vector regression (SVR) and kernel ridge regression (KRR), when dealing with big data34. For example, the reported mean absolute error (MAE) of the deep-neural network (DNN) model is lower than that of conventional ML in predicting the volume change and voltage of LIB electrodes35,36. However, DL still cannot solve many problems in the field of batteries, especially for MIBs beyond lithium, because of insufficient data. For example, Joshi et al. predicted the voltages of Na-ion battery electrodes with a DNN model, but the MAE is much higher than that for LIBs35. Moreover, because of its high complexity, a DNN takes a long time to train and is not explainable, so it may not outperform shallow learning on some battery problems, especially battery design, in which explanations are in great demand. Therefore, a high-performance and interpretable DL model is highly needed for predicting a variety of properties of multivalent MIBs from small data and then designing high-performing multivalent MIBs.

In this work, we take the voltage of multivalent MIBs as an example to demonstrate how an interpretable deep transfer-learning (TL) model (Fig. 1) can be used for the exploration and design of electrode materials for battery applications, addressing both the issues of conventional ML (low prediction accuracy and heavy dependence on feature engineering) and the limitations of DL (low interpretability and high demand for big data). It is worth noting that high-voltage electrode materials can raise the voltage platform of batteries, which is key to high-energy-density MIBs, and that the voltage is generally used in the performance prediction of battery electrode materials35,36,37. We first train our DL models with the relatively large dataset of LIB electrode voltages (more than 2000 entries) from the Materials Project (MP)10,38. The MAE for LIBs is only 0.32 V. Nevertheless, the MAE for multivalent MIBs is significantly higher (up to 2.14 V) using the same method because of the small datasets of multivalent MIBs (as few as 149 entries). We thus integrate the TL technique39, which is widely used to address data scarcity40,41, into our DL models. It greatly reduces the MAEs for Zn-, Ca-, Mg-, and Al-ion batteries, for example, from 2.14 V down to only 0.47 V for the Zn-ion battery. To interpret our DL models, we visualize the similarity between the elements and local environments in different layers of the deep-neural network. The DL models can automatically and hierarchically extract key features and explain the different contributions of element groups in the periodic table to the corresponding electrode voltages (Supplementary Fig. 1). Our results show that the highly accurate and interpretable deep model could accelerate the discovery and design of electrode materials for multivalent MIBs and the development of the large-scale battery industry.

Fig. 1: Illustration of the interpretable crystal graph convolutional neural network.

A crystal is converted and characterized by atomic and bonding vectors. The elemental and local environment representations for visualization of the model are constructed from the embedding and the convolutional layers, respectively. The weights in the hidden layer and output layer are trained during transfer learning.

Results and discussion

Deep learning of Li-ion batteries

In all, 2190 samples of LIB electrode materials are used for model training and assessment after excluding outliers and inconsistent entries. The data (target data) are first randomly split into training, validation, and test sets with a ratio of 0.8:0.1:0.1. The three sets show similar distributions (Fig. 2a, top), indicating a reasonable split. Then, we train and supervise our DL model (Fig. 1) with the training and validation data, respectively. The mean squared error (MSE) is chosen as the loss function and the MAE as the evaluation metric. After full training, the MAE of our DL model for predicting the voltages of LIBs is only 0.32 V, which is lower than the conventional ML result (0.40 V). Our results are quite similar to previous reports for LIBs35,36. The predicted voltages (Fig. 2a, right) have distributions similar to those of the target voltages (Fig. 2a, top) for the training, validation, and test sets. The predicted voltages plotted against the target voltages lie around the straight line y = x (Fig. 2a, dashed line), which also indicates the high accuracy of our model. However, our DL model developed from LIBs (named Li-model) cannot be directly used for multivalent MIBs (note the quite large MAEs in Fig. 2b–e) because of the different properties of LIBs and multivalent MIBs. It is also inadvisable to train DL models from the small datasets of multivalent MIBs. Furthermore, the MAEs of conventional ML are similar to those of the Li-model (Fig. 2b–e). Thus, neither DL nor conventional ML alone is suitable for multivalent MIBs. This critical problem is addressed in a later part of this paper.
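The split and the two error measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `split_dataset`, `mse`, and `mae`, the fixed seed, and the use of integer placeholders in place of real electrode entries are all hypothetical.

```python
import random

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the samples and split them into train/validation/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(ratios[0] * len(shuffled))
    n_val = round(ratios[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # the remainder forms the test set
    return train, val, test

def mse(pred, target):
    """Mean squared error, the quantity used as the training loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def mae(pred, target):
    """Mean absolute error, the quantity used as the evaluation metric."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

# 2190 placeholder indices stand in for the LIB electrode entries.
train, val, test = split_dataset(range(2190))
print(len(train), len(val), len(test))
```

Fixing the seed makes the split reproducible; the text does not state whether a fixed seed was used.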

Fig. 2: Plots of predicted voltage and target voltage for the metal-ion batteries.

a The data points are predicted with the deep-learning model, and those in b–e are predicted with the transfer-learning model. The model errors for the Li-, Ca-, Mg-, Zn-, and Al-ion batteries are shown in the lower-right corner. The histograms at the top and right show the distributions of predicted and target voltages, respectively. The black dashed line is the identity line (y = x) for reference. TL, CML, and Li-model denote transfer learning, conventional machine learning, and the model trained on the LIB database only, respectively.

Interpretability and visualization of the DL model

Although state-of-the-art DL models outperform conventional ML models in the large-dataset regime, they are generally viewed as black boxes because of their high complexity, and their accuracy often comes at the cost of interpretability. This is a major drawback of DL models for applications in which the interpretability of decisions is a critical prerequisite, such as new materials discovery and design20,42,43. Here, we visualize the embedding and convolutional layers of the deep-neural network separately20 (see Fig. 1) to interpret the contribution of the underlying features to the electrode-voltage prediction. Taking the LIB as a proof of concept, we first project the 64-dimensional elemental representation vectors onto a two-dimensional (2D) plane spanned by the first two principal components, dimensions 1 and 2, using principal component analysis (PCA)44,45. Figure 3a shows the visualized element features automatically extracted from the embedding layer (Fig. 1). As can be seen, the elements are clearly clustered into three groups according to their locations in the periodic table. The elements of the first two groups, the alkali and alkaline-earth elements, are located at the upper right corner of the plot; the early transition metal (TM) elements are located on the left side; and the remaining elements are mainly distributed on the lower right side. Such a distribution indicates that the voltage of a crystal correlates strongly with the group number of its constituent elements. Interestingly, we observe a linear relationship between the covalent radius of the elements and the second principal component (dimension 2) (Fig. 3a inset). Thus, the covalent radius of the constituent elements of an electrode should be one of the important features for crystal voltage prediction. A conventional ML-based random forest regression (RFR) model with 273 materials features46,47 is also trained to analyze the Gini importance of the features. The covalent radius is again found to be the most important feature, with an importance ratio as high as 20% (Supplementary Fig. 2). This result validates the visualization analysis of our interpretable DL model.

Fig. 3: Visualization of the DL model.

a, b The visualization of the two principal dimensions with PCA for the element representations and the local oxygen-coordination-environment representations, respectively. Dimensions 1 and 2 in the plot span the plane in the corresponding vector space that best approximates the representations. The points in a are colored according to their elemental groups. c The median value of the learned local voltages for each element in the oxygen-coordination environments, shown on the periodic table. The elements in b and c are labeled according to the type of center atom in the oxygen-coordination environments, and the colors encode the learned local voltages. The gray arrow in b indicates the change of local voltages.

Besides the visualization of the element representations in the neural network, the interpretable model can also give a clear picture of how the local environment affects the electrode-voltage prediction. We next visualize the local environment (see more details in the Methods section), taking the oxygen coordination, in which at least two oxygen atoms surround the central element, as the local environment representation because of the large number of oxides among battery electrodes. There are 23,022 local oxygen-coordination environments in total in our dataset. We first project all the local oxygen environments onto a 2D plane using PCA (Fig. 3b), and the points are colored according to their local voltage, which is the voltage contribution of each atom in the corresponding crystal and is calculated using Equation (2) in Methods. The local voltage color changes along the diagonal direction (gray arrow) in the dimension 1–dimension 2 plot. This indicates that both dimension 1 and dimension 2 have linear relationships with the local voltage, implying that the local voltage information has been learned by our model. To examine this relationship further, we visualize the oxygen-coordination environments with the t-distributed stochastic neighbor embedding (t-SNE) algorithm48 (Supplementary Fig. 3). We find that the oxygen environments can be approximately divided into three large groups according to the t-SNE distribution: (i) the IA group with low voltages, (ii) the main group with high voltages, and (iii) the TM group with both low and high voltages. To validate the voltage behavior extracted from the neural network, we calculate the median local voltage (Vi) for each element (in the oxygen-coordination environment) and project it onto the periodic table, as shown in Fig. 3c. The result is quite similar to the t-SNE analysis, but with more detail. The main-group elements have high median local voltages, except for the alkali metals, which have low voltages. Among the TMs, the early TMs mainly have relatively low median local voltages, while the late TMs show the opposite trend. Such a relationship between local voltages and elements may provide a promising strategy for designing high-voltage battery electrodes through elemental replacement. For example, replacing an early transition metal, Ti, with a late transition metal, Cu, significantly increases the cathode voltage from 1.32 V for NaTiPO4F to 5.52 V for NaCuPO4F in Na-ion batteries49. However, a high-voltage cathode often shows poor reversibility in experiments, which should be taken into consideration in the practical selection of cathode materials.

Interestingly, the PCA of the elemental features (Fig. 3a) shows element-voltage relations similar to those from the local-voltage analysis of the oxygen-coordination environments. For example, the elements with low local voltages occupy the left and right parts (the two regions shaded in blue in Fig. 3a), while the middle area shaded in orange is mostly occupied by the elements with high local voltages. This means that the voltage information has already been learned in the early layer of our neural network, rather than only in the later convolutional layers (Fig. 1). The oxygen-coordination environment and the covalent radius show a similar trend with respect to the element (Supplementary Fig. 4), in line with the preceding element representation analysis. Moreover, the local voltage of each element (Fig. 3c) is not only connected with elemental properties, such as the electronegativity, but also affected by its oxygen-coordination environment, information that is not available from conventional ML.

The p-orbital elements and late transition metal elements tend to accept electrons, which favors the reduction reaction. These characteristics give crystals containing these elements large chemical potentials and thus high voltages. The alkali metals and the early transition metals readily lose their electrons and are oxidized. Thus, electrode materials containing these elements in the oxygen environment would have small chemical potentials and low voltages. The alkaline-earth metal ions tend to attract electrons because of the electrostatic force between the cation and the electrons; thus, the presence of these bivalent cations would give the cathodes large chemical potentials and high voltages.

Transfer learning of multivalent metal-ion batteries

In this section, we address the aforementioned problem that it is hard to obtain high-performance DL models for multivalent MIBs because of data scarcity; for example, there are only 149 Al-ion samples. Multivalent MIBs have a mechanism similar to that of LIBs, which allows us to reuse the DL model pre-trained on LIBs for multivalent MIBs. Here, we integrate the layer-freezing TL method45, which fine-tunes only the last two fully connected layers, into our deep-neural network (see Fig. 1) to predict the voltages of Mg-, Ca-, Zn-, and Al-ion battery electrodes. For comparison, two conventional ML models and one DL model trained on LIBs (the Li-model) are also applied to multivalent MIBs. Thus, the DL model (CGCNN) and the two conventional ML models are used for both the Li-ion battery voltage prediction and the multivalent MIBs, while the TL model is used only for the multivalent MIBs. An overview of the architectures of these models is shown in Supplementary Fig. 1. Figure 2b–e shows the model error (MAE) for voltage prediction using the conventional ML model, the Li-model, and the TL model. As can be seen, the Li-model and the conventional ML models have very similar MAEs, which are much larger than those of the TL model. This demonstrates that the prediction accuracy from small data is greatly improved by the TL technique, especially for the Zn- and Al-ion battery electrodes, whose MAEs are reduced by 1.67 V and 1.03 V, respectively. Moreover, Fig. 2b–e shows that the predicted and the corresponding target voltages have similar normal distributions, indicating that the datasets are reasonable and the predictions are accurate. It is worth noting that the Li-model can provide a relatively good prediction for Na- and K-ion batteries (Supplementary Table 2) because Li, Na, and K ions are all monovalent and behave similarly in electrodes35. However, our TL model can still further improve the performance in predicting monovalent MIBs (Supplementary Table 2). MODNet16 and CrabNet17,18 are two other neural network models commonly used for small datasets. We train these two models on the electrode voltages of the Al- and Mg-ion battery datasets for comparison. Our results show that MODNet has higher MAEs than the CGCNN TL model, with MAEs of 0.63 and 0.47 V for Al- and Mg-ion batteries, respectively. CrabNet is found to overfit strongly when predicting the voltages of electrode materials: its MAE on the training set for Mg-ion batteries is only 0.13 V, but 0.36 V on the validation set. In addition, the CrabNet model is time-consuming to train compared with TL because it is an attention-based model.

We validate our model by comparing it with the machine learning and available experimental results as shown in Fig. 4. As can be seen, our model outperforms the conventional ML model in predicting the electrode voltage for multivalent Mg-, Ca-, Zn-, and Al-ion batteries. Therefore, the voltages predicted by the TL model are more reliable to guide experiments in finding appropriate electrode materials for improving the performance of future multivalent MIBs. To help the battery community, we thus implemented our model into a publicly available online tool kit that can be used for quickly pre-checking and screening the voltage of any electrode with the crystal structure only.

Fig. 4: Comparing the predicted voltages with the TL and conventional ML models with the available experimental values.

a Ref. 58. b Ref. 59. c Ref. 60. d Ref. 61. e Ref. 62. f Ref. 63. g Ref. 64. h Ref. 50. i Ref. 65. j Ref. 66.

Finally, we predict the voltages of 1637 materials for Al-ion (613), Ca-ion (278), Mg-ion (369), and Zn-ion (377) battery electrodes, which are not available in the Materials Project database (Supplementary Tables 3–6). Among the 1637 materials, 113/613 of the Al-ion, 128/278 of the Ca-ion, 107/369 of the Mg-ion, and 4/377 of the Zn-ion battery electrode materials have an average voltage higher than 3.0 V and are potential candidates. It is worth noting that only around 1% of the cathode materials for Zn-ion batteries have an average voltage of 3.0 V or above, which means that finding a high-voltage electrode material for Zn-ion batteries would be a challenge. As a general design rule, our results for the local voltages of each element (Fig. 3a) indicate the relationship between the elements in a crystal and the corresponding electrode voltage. Such an element-voltage relationship may provide a promising strategy for designing high-voltage electrodes for Zn-ion batteries through elemental replacement.

Online artificial intelligence tool kit for voltage prediction

A web tool for voltage prediction of both multivalent and monovalent MIB electrodes is publicly available online. The voltage of a crystal can be obtained within a few seconds, with the crystal structure or the material ID in the MP and the type of metal ion as inputs. The voltages for LIBs are predicted by the interpretable DL model, and the others are predicted by the deep TL model.

In summary, we develop an interpretable TL model to accurately predict the electrode voltages of MIBs (especially multivalent batteries with very small datasets) and to explain the underlying physical picture, identifying the important features for voltage prediction by visualizing the vectors in the layers of the neural network. The predictions of our model are much more accurate than those of conventional ML models and are validated by reported experimental results. Combined with the rapid growth of battery materials data, such high-performing and interpretable DL models would greatly benefit the battery community by enabling AI-driven high-speed materials screening and rational design of MIB electrodes, for example by substituting elements with high local voltages (Fig. 3c) into a crystal to improve its voltage and energy density.


Training datasets

The data are extracted from the MP database38,50, accessed through the application programming interface (API) implemented in pymatgen51. The database contains a total of 4401 intercalation-based battery electrode materials, of which 2291 entries are electrode materials for LIBs, and 393, 484, 385, 149, 328, and 125 entries are for Mg-, Ca-, Zn-, Al-, Na-, and K-ion batteries, respectively (Supplementary Table 1). Before training, outliers and inconsistent data are removed, and 2190, 387, 471, 378, 149, 287, and 125 entries are kept to train the models for Li-, Mg-, Ca-, Zn-, Al-, Na-, and K-ion battery electrode materials, respectively.

Machine-learning model

The conventional support vector regression (SVR) and kernel ridge regression (KRR) models are used in this work to predict the electrode voltages. They produce quite similar voltages for both multivalent and monovalent MIBs (Supplementary Fig. 5).

Deep-learning model

Our DL models are mainly based on the CGCNN4. The CGCNN model represents a periodic crystal structure as a multigraph G. Each atom in a structure is represented by a node i in G, which carries the atomic feature vector vi. The vector is then transformed into a 64-dimensional vector in the embedding layer. The nearest 12 neighbors of each atom are also considered in the CGCNN model, and the chemical bond between neighboring atoms i and j is expressed as an edge (i, j)k in G, which is represented by the vector u(i, j)k. The subscript k indicates the kth edge between nodes i and j, since multiple edges can arise from the periodicity of the crystal19. The atom and bond features are then passed to the convolutional layers, which use the convolution function defined in the following equation.

$${{{\boldsymbol{v}}}}_i^{(t)} = {{{\boldsymbol{v}}}}_i^{(t - 1)} + \mathop {\sum }\limits_{j,k} \left[\sigma \left({{{\boldsymbol{z}}}}_{(i,j)_k}^{(t - 1)}{{{\boldsymbol{W}}}}_f^{(t - 1)} + {{{\boldsymbol{b}}}}_f^{(t - 1)}\right) \odot g\left({{{\boldsymbol{z}}}}_{(i,j)_k}^{(t - 1)}{{{\boldsymbol{W}}}}_s^{(t - 1)} + {{{\boldsymbol{b}}}}_s^{(t - 1)}\right)\right]$$

where \({{{\boldsymbol{z}}}}_{(i,j)_k}^{(t)} = {{{\boldsymbol{v}}}}_i^{(t)} \oplus {{{\boldsymbol{v}}}}_j^{(t)} \oplus {{{\boldsymbol{u}}}}_{\left( {i,j} \right)_k}\) is a neighbor feature vector, and \(\oplus\) denotes concatenation of the atomic feature vectors of atom i and its neighbor j with the feature vector of the bond between them. The symbol \(\odot\) denotes element-wise multiplication, σ is a sigmoid function, and g is a non-linear activation function. W and b denote the weights and biases of the neural network, respectively. After three convolutional layers, the output vectors are fed into a pooling layer to create an overall feature vector for the crystal. We then pass the resulting vector through a fully connected hidden layer, which is followed by a linear transformation to a scalar value.
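A single convolution step for one atom, as written in the equation above, can be sketched in plain Python. This is an illustration only, not the CGCNN implementation: the function names, the toy feature sizes, and the weight values are hypothetical, and softplus is assumed for the activation g, which the text describes only as a non-linear activation function.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    # Assumed choice for g; the text only says "non-linear activation".
    return math.log1p(math.exp(x))

def affine(z, W, b):
    """Row vector z times matrix W (len(z) x len(b)) plus bias b."""
    return [sum(z_k * W[k][c] for k, z_k in enumerate(z)) + b[c]
            for c in range(len(b))]

def conv_update(v_i, neighbors, W_f, b_f, W_s, b_s):
    """One gated convolution step for atom i.

    neighbors is a list of (v_j, u_ij) pairs, one per edge (i, j)_k.
    """
    out = list(v_i)                       # residual term v_i^(t-1)
    for v_j, u_ij in neighbors:
        z = v_i + v_j + u_ij              # list concatenation plays the role of the ⊕ operator
        gate = [sigmoid(x) for x in affine(z, W_f, b_f)]
        core = [softplus(x) for x in affine(z, W_s, b_s)]
        out = [o + g * c for o, g, c in zip(out, gate, core)]
    return out

# Toy sizes (hypothetical): 2 atom features and 1 bond feature, so z has length 5.
v_i = [0.5, -0.2]
neighbors = [([0.1, 0.3], [1.0]), ([-0.4, 0.2], [0.5])]
W_f = [[0.1 * (k + 1), -0.1 * (k + 1)] for k in range(5)]
W_s = [[0.05 * (k + 1), 0.02 * (k + 1)] for k in range(5)]
b_f = [0.0, 0.1]
b_s = [0.1, 0.0]
v_next = conv_update(v_i, neighbors, W_f, b_f, W_s, b_s)
print(v_next)
```

The sigmoid factor acts as a gate in (0, 1) that weights how much each neighbor contributes to the updated atom vector; the output keeps the same length as the input, so the step can be repeated for each convolutional layer.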

In the CGCNN model, only elemental and structural features of the electrode crystals are used, because our training datasets retrieved from the Materials Project database are based on DFT calculations. The average voltage in the Materials Project is calculated by Supplementary Equation (3). As an approximation, only the internal energy difference between the crystals before and after discharge is used to obtain the average voltage; the details can be found in the Supplementary Information. It has been reported that the voltages calculated with this approximation are comparable to experimental results. For example, the calculated average voltages of olivine LiFePO4 and LiNiPO4 are about 3.2 and 4.9 V, respectively, consistent with the experimental results of 3.4 and 5.1 V, respectively52,53,54,55. The calculated voltage curves for RuO2, SnO2, and SnS2 follow trends similar to those measured in experiments56.

Interpretability and visualization of deep-neural network

Ante-hoc and post-hoc are two commonly used interpretation approaches. An ante-hoc approach opens the black box to gauge how a given neural network arrives at its predictions, whereas a post-hoc approach only probes the model from the outside57. In the CGCNN network4, the output vectors of each layer are accessible, so the ante-hoc approach can be used to probe the whole model. In this work, the element and local environment representations, which are extracted from the embedding and the convolutional layers of the model, respectively (Fig. 1), are mapped onto two dimensions to visualize the feature vectors in these layers. The local voltages are also explored to evaluate the contribution of each atom to the crystal voltage.

Element representations depend entirely on the types of elements in the crystal. In the CGCNN model, the ith atom is a node i in the multigraph representation G. The node stores a 92-dimensional vector vi, which includes the elemental information, for instance, the covalent radius of the atom and the number of valence electrons in each atomic orbital. The vector vi is then fed into the embedding layer and converted into a 64-dimensional vector \({{{\boldsymbol{v}}}}_i^{(0)}\). To visualize the features in the embedding layer, the elemental representation vectors \({{{\boldsymbol{v}}}}_i^{(0)}\) are extracted from the model. However, they cannot be visualized directly because of their high dimensionality. Thus, we adopt PCA to project the elemental representations onto a two-dimensional plane.
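The projection onto the first two principal components can be illustrated with a small, dependency-free sketch. It is not the implementation used in this work: the function name `pca_2d` and the 4-dimensional toy vectors (standing in for the 64-dimensional embeddings) are hypothetical, and power iteration with deflation is used here only as a simple substitute for a library PCA routine.

```python
import math
import random

def pca_2d(vectors, n_iter=500, seed=0):
    """Project row vectors onto their first two principal components."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(d)]
    X = [[v[k] - mean[k] for k in range(d)] for v in vectors]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Remove the directions already found (deflation).
            for c in components:
                dot = sum(p * q for p, q in zip(w, c))
                w = [p - dot * q for p, q in zip(w, c)]
            # Power-iteration step: multiply by the covariance, then normalize.
            w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(p * p for p in w))
            w = [p / norm for p in w]
        components.append(w)
    coords = [[sum(p * q for p, q in zip(x, c)) for c in components] for x in X]
    return coords, components

# Toy 4-dimensional "element vectors" (hypothetical values).
vectors = [
    [4.0, 1.0, 0.1, 0.0],
    [-4.0, 1.0, 0.0, 0.1],
    [2.0, -1.0, -0.1, 0.0],
    [-2.0, -1.0, 0.0, -0.1],
    [3.0, 0.5, 0.1, 0.1],
    [-3.0, -0.5, -0.1, -0.1],
]
coords, components = pca_2d(vectors)
for point in coords:
    print(point)   # (dimension 1, dimension 2) coordinates of each vector
```

The sign of each component is arbitrary, so a plot produced this way may be mirrored relative to one produced by another PCA routine. The sketch also assumes the data span at least two directions; otherwise the normalization step would divide by zero.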

Local environment representations depend on both the element type and its neighbors in the crystal. After the three convolutional layers in the CGCNN model, a 64-dimensional vector \({{{\boldsymbol{v}}}}_i^{(3)}\) is obtained, which represents the local environment of atom i and contains information about both the atom and its neighbors. In our work, only the vectors of local oxygen-coordination environments, in which the central atom must have at least two oxygen atoms as neighbors, are extracted for visualization. This is because a comparison within the same type of environment is more meaningful, and the oxygen-coordination environment is the most common one in the electrode materials dataset of MIBs. The local oxygen environment vector \({{{\boldsymbol{v}}}}_i^{(3)}\) is then reduced to two dimensions with the PCA and t-SNE methods to visualize clearly what information has been learned in the convolutional layers.

Local voltage representation is derived from the local oxygen-coordination environment vectors and represents the contributions of each atom to the voltages of the crystal. A linear transformation is performed to map \({{{\boldsymbol{v}}}}_i^{(3)}\) to the local voltage Vi,

$$V_i = {{{\boldsymbol{v}}}}_i^{(3)}{{{\boldsymbol{W}}}}_l + b_l$$

where Wl and bl denote the weights and bias, respectively. These parameters are trained solely during the model visualization process. The voltage of the crystal is predicted as the average of the local voltages, \(V = \frac{1}{n}\mathop {\sum }\limits_i V_i\), where n is the number of atoms in the crystal. A large local voltage of an atom means that electrode materials containing such an atom may have a larger voltage. The local voltage of an element in the oxygen-coordination environment is the local voltage of an atom that has at least two nearest-neighbor oxygen atoms in the crystal. The oxygen-coordination environment is given special consideration in this work because the electrode materials include a large number of oxides, which may make our models more explainable in terms of this local environment.
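The two relations above, the linear map from an environment vector to a local voltage and the average over atoms, can be written out as a short Python sketch. The function names and the toy two-dimensional vectors and parameters are hypothetical; in the model the environment vectors are 64-dimensional and Wl and bl are learned.

```python
def local_voltage(v_i, W_l, b_l):
    """V_i = v_i · W_l + b_l for the environment vector of one atom."""
    return sum(x * w for x, w in zip(v_i, W_l)) + b_l

def crystal_voltage(env_vectors, W_l, b_l):
    """Return the average of the per-atom local voltages and the list itself."""
    local = [local_voltage(v, W_l, b_l) for v in env_vectors]
    return sum(local) / len(local), local

# Toy crystal with three atoms and made-up parameters.
env_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_l = [1.0, 2.0]
b_l = 0.5
V, local = crystal_voltage(env_vectors, W_l, b_l)
print(local, V)   # local voltages 1.5, 2.5, 3.5 and their mean 2.5
```

Because the crystal voltage is a plain average, each atom's local voltage can be read directly as its share of the prediction, which is what makes the per-element medians in Fig. 3c interpretable.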

Transfer learning

TL is an effective DL technique for prediction with insufficient data and is expected to speed up training and improve the performance of the convolutional neural network. There are two major TL scenarios for loading parameters from a previously trained neural network. One is to fine-tune the convolutional network: the parameters of the target network are initialized from a pre-trained network, and all the parameters are then optimized as usual. The other is to use the convolutional network as a fixed feature extractor: the weights of the earlier layers are frozen and only the last few fully connected layers are trained. In this study, the second TL method is used for the voltage predictions of multivalent MIBs and of monovalent Na- and K-ion batteries. Only the parameters in the last two fully connected layers of the CNN structure, namely the hidden layer and the output layer (Fig. 1, orange box), are optimized in the TL model.
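The fixed-feature-extractor scenario can be illustrated with a deliberately small sketch. It is not the model used in this work: a one-dimensional input and a frozen tanh feature layer stand in for the pre-trained embedding and convolutional layers, a single trainable linear head stands in for the two fully connected layers that are fine-tuned here, and all names and numbers are hypothetical.

```python
import math

def features(x, frozen):
    """Frozen feature extractor: stands in for the pre-trained early layers."""
    return [math.tanh(w * x + b) for w, b in frozen]

def predict(x, frozen, head):
    weights, bias = head
    return sum(w * f for w, f in zip(weights, features(x, frozen))) + bias

def dataset_mse(data, frozen, head):
    return sum((predict(x, frozen, head) - y) ** 2 for x, y in data) / len(data)

def fine_tune(data, frozen, head, lr=0.1, epochs=200):
    """Gradient descent on the head only; `frozen` is read but never updated."""
    weights, bias = list(head[0]), head[1]
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in data:
            f = features(x, frozen)
            err = sum(w * fk for w, fk in zip(weights, f)) + bias - y
            for k, fk in enumerate(f):
                grad_w[k] += 2.0 * err * fk / len(data)
            grad_b += 2.0 * err / len(data)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

# Hypothetical "pre-trained" parameters and a small target dataset.
frozen = ((1.0, 0.0), (0.5, -0.5), (-1.5, 0.3))
head = ([0.2, -0.1, 0.4], 0.0)
target_data = [(-1.0, 0.5), (-0.5, 0.9), (0.0, 1.4), (0.5, 2.0), (1.0, 2.4)]

loss_before = dataset_mse(target_data, frozen, head)
new_head = fine_tune(target_data, frozen, head)
loss_after = dataset_mse(target_data, frozen, new_head)
print(loss_before, loss_after)
```

Only the head parameters receive gradient updates, so the number of trainable parameters stays small relative to the size of the target dataset, which is the motivation for freezing the earlier layers when only a few hundred samples are available.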