Introduction

Materials informatics is the study of the data-oriented understanding of materials data, represented by structures, properties, mechanisms, and protocols1. Artificial intelligence has been used in the field for automated material design, massive data analyses, and accelerated experiments with robots to advance the discovery of materials for energy- and environment-related applications1,2,3,4,5.

A long-term challenge in materials informatics and materials science is lossless data sharing by the scientific community6. Although materials and devices are sensitive to their preparation processes, material databases and scientific documents generally do not provide sufficient information1,7,8. Most databases focus on structure–property relations, and ignore or shorten the preparation protocols1,4,6,8. Experimental methods are available in scientific journals, but only specialists can appropriately extract the structure–property–process relationships from the text, and automated text parsing by artificial intelligence is not yet practical7,9. Furthermore, detailed information, including non-representative experimental protocols, lot numbers of reagents, and raw measurement data, is often omitted from articles, which leaves major uncertainties about material data. Researchers may need to improve their communication style to achieve lossless material data sharing.

We propose a data platform that can explicitly describe the relations among the structures, properties, and processes of materials (Fig. 1). Based on the concepts of knowledge graphs or flowcharts7,10, all experimental events are connected as nodes in graphs. Most experimental information can be described losslessly as graphs, the format of which is also compatible with data science7. We demonstrated the system by using it in our superionic organic conductor research, which revealed the factors for achieving a remarkable room temperature conductivity of 10⁻⁴–10⁻³ S/cm and a Li⁺ transference number of 0.8, practically the highest values among organic solid-state conductors without plasticizers11,12,13,14,15. All experimental data, including everyday experimental operations and measurements (over 500 records), were recorded in the database, and are available from a public repository. This work demonstrates the everything-open research style in experimental materials science, a style that should become the standard for scientific communication to accelerate the integration of materials knowledge.

Fig. 1: Graph-shaped material data storage system.

All experimental results were recorded as graph-shaped data, and automatically converted into a table database for analysis (see Supplementary Fig. 1 for a representative case). Missing values were imputed by machine learning.

Results

Recording daily experiments as graph-shaped data

As the essential components of next-generation secondary batteries12,13,14,16,17,18, solid-state organic lithium-ion conductors were prepared by mixing aromatic polymers, electron-accepting molecules, and lithium salts (Fig. 2a). Several candidates were identified by virtual screening in our previous machine learning study, using a model trained with literature data (>10,000 experimental records)4. The model predicted a high room temperature conductivity of over 0.1 mS cm⁻¹, and we experimentally confirmed some predictions4. However, the model could not take process information as input, even though the properties and hierarchical structures of composite materials change drastically with the preparation protocol1,7,8. The literature does not provide comprehensive experimental information for each electrolyte, mainly because of the limited space for methodology sections. This is not a problem specific to ionic conductors but has been a general limitation in materials informatics.

Fig. 2: Electrolyte structures and conductivity.

a Search space of chemical structures and major operations to prepare electrolytes. b Nyquist plot for a representative electrolyte, PPO/chloranil = 6/4 (mol/mol) with 30 wt % LiFTFSI. Inset: Photograph of the electrolyte layer. c Experimental ionic room temperature conductivities of the electrolytes. The samples were named using the format ‘XYZMM-NNαβ’, which indicates an electrolyte containing MM mol % donor (X = S: PMPS; O: PPO) versus acceptor (Y = L: chloranil; Q: benzoquinone; D: 2,3-dichloro-5,6-dicyano-p-benzoquinone) with NN wt % salt (Z = D: LiTFSI; M: LiFTFSI; N: LiFSI; B: LiBF4). Symbols α and β indicate operational conditions (α = H: thermal annealing before measurement; L: room temperature, and β = G: cells were kept in a glove box until measurement; O: kept outside). Box-plot elements are defined as follows. Center line: median, box limits: upper and lower quartiles, whiskers: 1.5x interquartile range, and points: outliers. Supplementary Discussion g details the effects of the factors for conductivity.

During electrolyte exploration, we used a graph database as an electronic laboratory notebook in which we recorded the daily experiments (Figs. 1, 2b, c). Electronic laboratory notebooks are commercially available, but they are not specially designed for data science, and are only available in a closed system (e.g., they require paid software)19. In contrast, our management system uses open-format graphs (XML data) and an open-source processing system (Supplementary Fig. 1). One graph was designed to contain almost all the information for one experiment, including experiment date, environment, experimenter, protocols, chemical formula, and a link to analytical data.

Although the electrolytes were prepared by simply mixing the components, over 40 small steps and at least 100 variable parameters could be recorded for the conductivity measurements (e.g., heating temperature, duration, and timing; Supplementary Fig. 1). For each experiment, experimental protocols were changed slightly to optimize the conditions. Such large numbers of steps are common in materials science, but recording them using conventional frameworks is unmanageable. The protocols are too complex for standard process informatics tools such as experimental design and Bayesian optimization, which typically focus on fewer than ten variables1,2,6. Only a representative protocol is usually described in the methodology section of scientific articles. In contrast, no data loss occurs in this system because every experimental result is available as graph data on the public repository.

Bridging electronic laboratory notebooks and data science

All experimental results in the project, exceeding 500 records, were recorded in the database. Unsuccessful conductors, which were synthesized properly but performed poorly because of unoptimized experimental procedures or compositions, were also recorded to improve machine learning models. We emphasize that such results are often omitted from conventional scientific articles and permanently lost to the community.

For data analysis, the raw experimental (graph) data were automatically converted into table data, which were then used to train a conventional tree-based ensemble model (Supplementary Fig. 2). First, the graphs were converted into a numerical array by our open-source Python module (Fig. 3a). We used a fingerprint algorithm to describe the characteristics of the graphs. Fingerprint algorithms were originally developed to characterize molecules by representing the presence of specific chemical moieties20. In the current algorithm, the presence of specific steps in a protocol was checked instead (Fig. 3b, see Methods section for details). Similar operations were automatically grouped by natural language processing (BERT)21 and unsupervised learning (k-nearest neighbour, kNN). The grouping improved the generality of the fingerprint by addressing orthographical variants (Supplementary Fig. 3, Supplementary Table 1). Individual algorithms were designed to parse chemical and measurement data to extract their characteristic features, such as molecular weight, conductivity, crystallinity, and peak position (Supplementary Fig. 4).

Fig. 3: Automated feature analysis.

a Conversion of process and measurement data into a numerical array. A full machine-learning scheme is shown in Supplementary Fig. 2. b Fingerprint generation from flowcharts. A binary array expresses specific experimental steps (e.g., in the figure, ‘Protocol A’ has steps a, b, c, and not c’ or d: this yields a fingerprint of 11100). ‘Cool’ steps in Protocol A and B are distinguished by c or c’ because only the latter is connected to ‘Stir’. Then, BERT and kNN automatically group similar steps (e.g., ‘Heat’ and ‘Hot plate’ can be categorized in the same group). c Prediction of ionic conductivity (σion) using LightGBM regressor with statistically essential electrolyte parameters extracted by Boruta. d SHAP values during the prediction (explanations of parameters are shown in Supplementary Table 2). e Causal relations estimated by unsupervised machine learning.
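The fingerprint generation described for panel b can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the FlowMater2 implementation: the function names, the step list of Protocol B, and the hand-written synonym table (standing in for the BERT and kNN grouping) are all hypothetical.

```python
from typing import Dict, Iterable, List

# Hand-written synonym table standing in for the BERT + kNN grouping step.
# The entries are illustrative assumptions, not the authors' learned groups.
STEP_GROUPS: Dict[str, str] = {"Hot plate": "Heat"}


def canonical(step: str) -> str:
    """Map a step label to its group label (identity if ungrouped)."""
    return STEP_GROUPS.get(step, step)


def fingerprint(steps: Iterable[str], vocabulary: List[str]) -> str:
    """Return one bit per vocabulary entry: '1' if the step is present."""
    present = {canonical(s) for s in steps}
    return "".join("1" if v in present else "0" for v in vocabulary)


# Vocabulary of distinguishable steps from panel b: a, b, c, c', d.
VOCABULARY = ["a", "b", "c", "c'", "d"]

# Protocol A has steps a, b, c (as stated in the caption).
fp_a = fingerprint(["a", "b", "c"], VOCABULARY)
# Protocol B's step list is an assumption made for illustration.
fp_b = fingerprint(["a", "c'", "d"], VOCABULARY)

# Grouping merges spelling variants before the bits are set.
fp_heat = fingerprint(["Hot plate"], ["Heat", "Stir"])

print(fp_a, fp_b, fp_heat)
```

Treating c and c’ as separate vocabulary entries is what lets the same operation name yield different bits when its graph neighbourhood differs.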

Over 50 descriptors characterizing the features of processes, structures, and analytical data, were automatically generated as a numerical array by parsing the database (Supplementary Fig. 5 and Supplementary Fig. 6, Supplementary Table 2, Supplementary Data). Conventional materials informatics usually requires the manual preparation of table databases from experimental results, which is time-consuming and has been a practical bottleneck in material informatics1,7. In contrast, our system automatically converts laboratory notebooks into machine-learnable databases.

Generally, limited research resources do not allow experiments to be conducted with all-inclusive conditions, thereby leading to sparse experimental databases1,6,22. Missing values in the current database were filled by data imputation (Supplementary Fig. 6)7,22. In other words, the unmeasured data were generated from existing results using a LightGBM regressor, which is a standard decision tree-based ensemble model22,23.

During electrolyte preparation, we milled the electrolytes into microparticles. The diameter measurements were conducted only on a few samples, and the values for the other conductors were estimated by imputation (Supplementary Fig. 7). The predicted diameters decreased as the milling time increased, in the same way as for the measured data, indicating successful data imputation. Although the technique is not always accurate22, it can help researchers with objective data analysis and causal exploration.

Data-oriented analysis of electrolytes

Experimentally, various conductors were examined using the polymers poly(p-phenylene oxide) (PPO) or poly(2,5-dimethyl-1,4-phenylenesulfide) (PMPS)24; the electron acceptors chloranil or benzoquinone; the lithium salts lithium bis(trifluoromethanesulfonyl)imide (LiTFSI), lithium (fluorosulfonyl)(trifluoromethanesulfonyl)imide (LiFTFSI), lithium bis(fluorosulfonyl)imide (LiFSI), or lithium tetrafluoroborate (LiBF4); and different experimental protocols such as mixing and heating conditions (Fig. 2a). The conditions were selected based on our previous virtual screening4 and on-time data analysis by the current system.

We emphasize that the aromatic molecules introduced here, which define the scope of the database and the informatics system, differ from regular aliphatic polymer electrolytes (e.g., poly(ethylene oxide) and poly(ionic liquids))12,15,25. The introduced aromatic polymers and electron acceptors can form charge-transfer complexes4,26. Their polarized structures could induce electrostatic interactions with lithium salts, generating potentially superionic phases for reasons that remain unclear4,26. On the other hand, the glassy electrolytes often suffer from insufficient grain contacts: room temperature conductivities of the current electrolytes varied from almost insulating to superionic (10⁻¹¹–10⁻³ S cm⁻¹, Fig. 2b, c). We sought to clarify the experimental factors affecting the conductivity and its large variance.

Critical parameters for ionic conductivity (σion) were extracted by supervised machine learning. Important descriptors were selected from over 50 descriptors by using the LightGBM regressor and Boruta package27, which can choose statistically valid parameters based on hypothesis testing (Fig. 3c, d). High R² scores for σion with the randomly split training (>0.9) and testing datasets (>0.6) indicated that the essential factors for conduction were selected adequately (Fig. 3c). About 20 descriptors remained after the filtration, the contributions of which were then quantified by SHapley Additive exPlanations (SHAP) values (Fig. 3d, Supplementary Fig. 8; the scientific significance of the parameters is discussed in Supplementary Discussion a)28.

We recognized the relations among the composition, conductivity, crystallinity, and NMR peak width (full width at half maximum; FWHM) of the electrolytes from the SHAP analysis (Fig. 3d). The detailed causal relationships were analyzed by unsupervised machine learning (Fig. 3e)29. The automated causal exploration indicated that adding polymer and acceptor molecules to salts simultaneously reduced the crystallinity, sharpened NMR peaks, and increased σion (Supplementary Discussion b). Just by recording the daily exploratory experiments, essential and objective material insights could be extracted by the system.

Revealing superionic conduction

We rationalized the electrolytes’ high conductivity based on materials science. Traditionally, solid-state organic ionic conductors have been designed to contain polar and flexible aliphatic chains, represented by poly(ethylene oxide), for solvation and transportation of ions (Fig. 4a)12,25. Solidification and overly strong cation-solvent interactions reduce the mobility of molecules, leading to a typical conductivity of 10⁻⁶ S cm⁻¹ and a lithium-ion transference number of 0.3, much smaller than those of liquids or recent inorganic solid conductors, which have conductivities of >10⁻³ S/cm12,17,25.

Fig. 4: Characterization of superionic conductors.

a Comparison of molecular structures for ionic conductors. b Estimated scheme of ion conduction through the amorphous electrolyte. c DSC curves for the PPO/chloranil = 60/40 (mol/mol)+LiFTFSI electrolytes. d 19F MAS NMR spectra of the samples. e Temperature dependence of σion for the 20 wt % LiFTFSI sample. Inset: DSC for the pristine LiFTFSI salt. f Charge/discharge curves for a solid-state lithium-ion battery containing PPO/benzoquinone = 8/2 with ca. 52 wt % LiFTFSI electrolyte, operated at 50 °C.

An essential feature of our electrolytes is that they consist of only rigid components, namely, aromatic polymers, acceptors, and lithium salts. The highly conductive yet chemically, mechanically, and thermally robust properties of our electrolytes4 may not be achieved by other organic or inorganic conductors, which typically suffer from thermal melting, hydrolysis, or large interface resistances12,17. The superionic conduction in the organic glassy media must be understood from both scientific and technological perspectives to advance the practical applications of these materials (e.g., electric vehicles).

The automated causal analysis highlighted the emergence of the amorphous phase after the electrolyte preparation (Fig. 4b). The polymers (electron donors) and acceptor molecules form charge-transfer complexes4, inducing electrostatic interactions with lithium salts (Fig. 4b, molecular calculations shown in Supplementary Fig. 9 and Supplementary Fig. 10, and Supplementary Discussion c). As a representative case, electrolytes consisting of PPO/chloranil with different amounts of LiFTFSI were examined in detail (maximum conductivity of around 10−3 S cm−1; Fig. 2a). Scanning electron microscopy (SEM) and element mapping showed the uniform distribution of the polymer, acceptor, and salt as microparticles (Supplementary Fig. 11). Peak shifts of the aromatic rings of the donor polymer in 13C NMR indicated molecular-level interactions between the complexes and salt (e.g., cation-π interactions30; Supplementary Fig. 12).

X-ray diffraction (XRD) and differential scanning calorimetry (DSC) also indicated the amorphous phase formation (Fig. 4c and Supplementary Figs. 13–16). After mixing with charge-transfer complexes, crystalline XRD peaks of lithium salts became smaller due to the amorphous phase transformation (Supplementary Fig. 14). The transition was consistent with the fading of melting peaks after the interaction with charge-transfer complexes.

The addition of the charge-transfer complexes to salts induced unexpected sharpening of solid-state 7Li and 19F NMR peaks (Fig. 4d and Supplementary Figs. 17–20). Pristine LiFTFSI displayed a broad 19F magic angle spinning (MAS) NMR peak at −80 ppm and a smaller sharp peak at around −79.5 ppm, attributed to CF3 groups. After the electrolyte formation, only smaller peaks were visible, and σion increased (Fig. 4d). Because peak sharpening indicates greater atomic motion, the high σion was explained by the mobile ions in the amorphous phases (Figs. 3e, 4b).

The kinetics of atoms in electrolytes can be quantified by different NMR techniques, such as pulse-field gradient (PFG) NMR and relaxation time estimation12,31,32. PFG-NMR was unsuccessful for the electrolytes, probably due to the interference by electron spins in charge-transfer complexes33. On the other hand, the fast kinetics of 7Li and 19F was successfully confirmed by relaxation time T131 during static NMR measurements (Supplementary Fig. 21, Supplementary Discussion d). In contrast, carbon peaks showed no sharpening; the polymers had high glass transition temperatures (Tg) over 100 °C 4,24 (Supplementary Fig. 12). The NMR results suggested that the mobile medium was the salts themselves, not the polymers or acceptors (Fig. 4b).

Although pristine lithium salts have been believed to be almost insulating12, our experimental results supported the salt-based conduction model. The temperature dependence of σion was fitted by the Vogel-Fulcher-Tammann (VFT) equation, which describes the kinetics of amorphous structures (Fig. 4e, Supplementary Fig. 22)34,35. The fitting yielded a Kauzmann temperature (TK) of −48 °C (Supplementary Fig. 13b, Supplementary Discussion e). The value satisfied the empirical relation of amorphous phases, Tg ≈ TK + 50 (ref. 35), where Tg ≈ 0–10 °C was the experimental glass transition temperature of the salt (Fig. 4e)36; here, TK + 50 gives 2 °C, which falls within the measured range. The agreement demonstrated that the dominant carriers were amorphous salts. Molecular dynamics simulations also showed the advantages of the amorphous phase. Ions were firmly trapped in the crystalline phase, whereas structural vacancies were available in the amorphous phase (Supplementary Fig. 23, Supplementary Discussion f).

The electrolytes look dissimilar to conventional ones, but they could be regarded as a type of polymer-in-salt system, in which excess salt is added to polymers (typically over 50 wt % to an aliphatic polymer)13,37. In the system, amorphous salt clusters, which emerge after polymer-salt interactions, function as carriers, enabling a conductivity of up to 10⁻⁴ S/cm13. Rubbery polymers are often used, but even high-Tg polymers (e.g., polyacrylonitrile, Tg of ca. 100 °C) can also become conductors by unexplained mechanisms13,38.

In the current electrolytes, electrostatic interactions of polar charge-transfer complexes and salts should have generated the amorphous clusters, similarly to conventional polymer-in-salt systems. The hypothesis is supported by the high dielectric constants of charge-transfer complexes (εr = 10⁰–10⁴), achieved by the partially mobile (or sometimes fully delocalized) electrons in charge-transfer complexes39. The species could be potentially functional as polarized groups for ion dissociation and yield higher bulk permittivity than normal solid polymer electrolytes4.

Kinetically, thermal (mainly rotational) motions of anions could transport cations even in the absence of flexible polymer matrices, which is unusual in normal polymeric conductors, but widely observed in ionic liquids, soft crystals, and some polymer-in-salt systems13,40. In our ongoing research, we intend to reveal more details of the structures and kinetics, including the possible contributions of charge-transfer complexes to salt structures, anion-acceptor exchange reactions41, and other chemical factors (Supplementary Fig. 24 and Supplementary Fig. 25, Supplementary Discussion g).

Next, we analyzed the best electrolyte conditions, focusing on the reproducibility of conductivity. A modified evaluation score, which was the median minus standard deviation of σion for electrolytes with the same composition, was predicted instead of each conductivity (Supplementary Fig. 26). The score was low for poorly reproducible conductors with large conductivity variance. The analysis showed the importance of the following factors: a thermally stable PPO polymer; a highly dissociative and asymmetrical LiFTFSI salt; an appropriate donor/acceptor/salt ratio (e.g., donor/acceptor = 6/4 (mol/mol)+30 wt% salt); sufficient heating during electrolyte formation and premeasurement; and particle milling (Supplementary Discussion h). After the optimization, high room temperature conductivities were obtained with slight variance (e.g., 10−4 to 10−3 S cm−1 for electrolyte OLM60-41HG in Fig. 2). Thus, the materials informatics tool was suitable for technological optimization tasks.
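The reproducibility score defined above (median minus standard deviation of σion within one composition) can be sketched as follows. The conductivity values and composition labels are hypothetical, and the text does not state whether the score was computed on a linear or logarithmic scale; this sketch uses the raw values and the sample standard deviation.

```python
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def reproducibility_scores(
    records: Iterable[Tuple[str, float]]
) -> Dict[str, float]:
    """Median minus standard deviation of conductivity per composition."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for composition, sigma in records:
        grouped[composition].append(sigma)
    scores = {}
    for composition, values in grouped.items():
        if len(values) < 2:
            continue  # a standard deviation needs at least two cells
        scores[composition] = statistics.median(values) - statistics.stdev(values)
    return scores


# Hypothetical repeated measurements in S/cm (not taken from the database).
records = [
    ("stable", 4e-4), ("stable", 5e-4), ("stable", 6e-4),
    ("scattered", 1e-6), ("scattered", 5e-4), ("scattered", 1e-3),
]
scores = reproducibility_scores(records)
for name, value in scores.items():
    print(name, value)
```

Both hypothetical compositions share the same median, so the score separates them only through the spread, which is the behaviour the paragraph describes for poorly reproducible conductors.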

A solid-state lithium-ion battery was fabricated using a lithium iron phosphate cathode and a lithium titanate anode. The electrolyte consisted of PPO/benzoquinone = 8/2 with 52 wt % LiFTFSI (maximum σion of ca. 10⁻³ S cm⁻¹, OQM78-52 in Fig. 2). Benzoquinone was selected because of its potentially higher chemical stability than chloranil (e.g., containing no halogens). The cell operated reversibly at 1.9 V, corresponding to the standard voltage (Fig. 4f). Although charge-transfer complexes could become electrically conducting, the composition of electrically insulating salts did not cause short circuits or current leakage (around 10⁻⁸ S cm⁻¹)4,26. Even at a large current density of 1.5 mA cm⁻² (corresponding to 1 C), the charge/discharge reactions proceeded. The impressive performance can be explained by the electrolyte’s high Li⁺ transference number (ca. 0.8, estimated from the actual battery responses; Supplementary Fig. 27, Supplementary Discussion i). This high value was also an essential characteristic of the polymer-in-salt system13. The origin of the exceptionally high transference number must be revealed in future research.

Discussion

We developed a materials science platform for easy data analysis and sharing. Everyday experiments were recorded as graph data in an electronic laboratory notebook, clarifying the relations among structures, properties, and processes. Automated parsing of the graphs and data analysis helped researchers to understand data with broad experimental conditions objectively. The research style of publishing all raw experimental results with appropriate metadata will pave the way for lossless material data sharing, which is also compatible with big data mining and deep learning7. The platform is principally applicable to the versatile research of organic materials, and is sharable among different laboratories. Accelerated data and model sharing will help researchers explore materials more efficiently.

Scientific and technological aspects of glassy organic superionic conductors were examined using the platform. Semi-automated data analysis helped us reveal the causal relationships in the electrolyte system statistically. The amorphous phases of salts achieved superionic conduction and a high transference number (ca. 0.8) after interacting with polymer charge-transfer complexes. We are now addressing the cyclability problem of the prototype batteries (currently about 30 cycles; Supplementary Fig. 27) with our informatics tool, which is highly compatible with composite materials and device development.

Methods

Materials

PPO and chloranil were purchased from Sigma-Aldrich Co. The molecular weight of PPO was estimated by gel permeation chromatography using chloroform as eluent and polystyrene standards (Mn = 1.4 × 10⁴, Mw/Mn = 1.9). LiFTFSI was purchased from Fujifilm Wako Chemicals Co. LiTFSI was obtained from Kanto Chemical Co. LiFSI, LiBF4, and benzoquinone were purchased from Tokyo Chemical Industry Co. Other chemicals were obtained from the above companies. All compounds were used as received, and their reagent containers were kept and opened only in a glove box. Detailed information, such as production numbers, is available on the public repository. PMPS was synthesized according to our previous report mentioned in the main manuscript (Mw: 18,000, Mn: 7000).

Electrolyte preparation

General information

A typical electrolyte preparation procedure for PPO/chloranil = 60/40 (mol/mol) with 40 wt% LiFTFSI electrolyte (also shown as a flowchart in Supplementary Fig. 1) is described here. Methods for the other electrolytes are given in the repository as graph data. All procedures were conducted in an argon-filled glove box (oxygen concentration of less than 10 ppm and dew point of −80 °C). Scrupulously dry conditions were maintained because lithium salts are highly deliquescent.

Electrolyte preparation

PPO (211 mg) and chloranil (282 mg) were mixed roughly in a petri dish. The powder was ball-milled (Mini-Mill PULVERISETTE 23, FRITSCH; stainless-steel balls, 10 mm diameter) for 30 min at 40 Hz. The mixture was put in a petri dish and heated at 200 °C for 65 min to form the charge-transfer complex. The plate was covered with aluminium foil to catch sublimated chloranil, which was collected and returned to the mixture. After mixing the powder with a spatula, the complex was heated at 280 °C for 30 min. The product was obtained as a black powder (325 mg). The yield was less than the weight of raw materials (211 + 282 = 493 mg) because the unreacted acceptor had sublimated from the mixture.

LiFTFSI (208 mg) was added to the charge-transfer complex (304 mg) and mixed roughly. The powder was ball-milled at 40 Hz for 60 min, and heated at 150 °C for 10 min to melt the salt. After cooling, the mixture was ball-milled at 40 Hz for 75 min, yielding the final product.
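The masses quoted in this procedure can be checked against the nominal composition (PPO/chloranil = 60/40 mol/mol with 40 wt% salt). The sketch below is only a consistency check: the molar masses are assumptions calculated from the formulas C8H8O (PPO repeat unit) and C6Cl4O2 (chloranil), and the function names are hypothetical.

```python
# Molar masses in g/mol, calculated from the molecular formulas (assumed here).
PPO_REPEAT_UNIT_G_PER_MOL = 120.15  # C8H8O, 2,6-dimethyl-1,4-phenylene oxide unit
CHLORANIL_G_PER_MOL = 245.88        # C6Cl4O2


def donor_mol_percent(donor_mg: float, acceptor_mg: float) -> float:
    """Formal mole percent of donor repeat units versus acceptor in the feed."""
    donor_mmol = donor_mg / PPO_REPEAT_UNIT_G_PER_MOL
    acceptor_mmol = acceptor_mg / CHLORANIL_G_PER_MOL
    return 100.0 * donor_mmol / (donor_mmol + acceptor_mmol)


def salt_weight_percent(salt_mg: float, complex_mg: float) -> float:
    """Weight percent of salt in the final electrolyte."""
    return 100.0 * salt_mg / (salt_mg + complex_mg)


# Masses taken from the procedure above.
donor_pct = donor_mol_percent(donor_mg=211.0, acceptor_mg=282.0)
salt_pct = salt_weight_percent(salt_mg=208.0, complex_mg=304.0)

print(f"donor: {donor_pct:.1f} mol %, salt: {salt_pct:.1f} wt %")
```

The donor ratio refers to the feed before any chloranil sublimes, which is why it is a formal value rather than the composition of the isolated complex.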

Impedance measurements

The ionic conductivities of electrolytes were evaluated using solid-state cell equipment (KP-SolidCell, Hohsen; equipped with Macor insulator and PTFE O-ring). Stainless-steel disks (1 cm diameter) were used to sandwich electrolytes. The equipment was dried at 100 °C for more than 1 h before it was placed in the glove box. The cell was filled with the electrolyte and sealed with a torque of over 20 Nm. The cell was annealed on a hot plate at 80 °C for 4 h and then kept at room temperature for at least 16 h. Impedance measurements were conducted just after removing the cells from the glove box. Several cells were prepared under the same conditions to check the variation in conductivity. Electrolyte thicknesses were measured with a conventional gauge after cell disassembly. Some cells, instead of being left in the glove box, were sealed in a conventional plastic vacuum bag, kept outside the glove box, and annealed in an incubator (AMF-20N, Asahi Rika Seisakusho Co. Ltd.; the bag was opened just before the measurements).

Fabrication of lithium-ion battery

A solid-state lithium-ion cell was prepared in a similar way to the normal cells for impedance measurements. A precoated LiFePO4 cathode and Li4Ti5O12 anode (Piotrek Co.; areal capacity of about 1.5 mAh cm−2) were used instead of stainless disks.

Electrolyte abbreviation

The samples were named using the format ‘XYZMM-NNαβ’, and the variables had the following meanings.

X: Donor polymers (S: PMPS, O: PPO)

Y: Acceptor molecules (L: chloranil, Q: benzoquinone, D: DDQ)

Z: Salt molecules (D: LiTFSI, M: LiFTFSI, N: LiFSI, B: LiBF4)

MM: Formal mole percent of donor with respect to the acceptor

NN: Weight percent of salt in an electrolyte

α: Whether an electrolyte was thermally annealed before electrochemical impedance spectroscopy (EIS) (H: high-temperature thermal annealing (e.g., 80 °C), L: left at room temperature only)

β: Whether annealing was conducted in a glove box or outside (G: glove box, O: outside).
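The naming scheme above is regular enough to be decoded mechanically. The following is a minimal sketch, not part of the published tooling; the function name and dictionary keys are hypothetical, and treating the α and β suffixes as optional is an assumption made because some names in the text (e.g., OQM78-52) appear without them.

```python
import re
from typing import Dict, Optional, Union

DONORS = {"S": "PMPS", "O": "PPO"}
ACCEPTORS = {"L": "chloranil", "Q": "benzoquinone", "D": "DDQ"}
SALTS = {"D": "LiTFSI", "M": "LiFTFSI", "N": "LiFSI", "B": "LiBF4"}
ANNEALING = {"H": "thermal annealing", "L": "room temperature"}
STORAGE = {"G": "glove box", "O": "outside"}

# X, Y, Z, MM, '-', NN, then optional alpha and beta (optionality is assumed).
PATTERN = re.compile(r"^([SO])([LQD])([DMNB])(\d{2})-(\d{2})([HL])?([GO])?$")


def parse_sample_name(name: str) -> Dict[str, Union[str, int, None]]:
    """Decode an 'XYZMM-NN' sample name, with optional condition suffixes."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid sample name: {name!r}")
    x, y, z, mm, nn, alpha, beta = match.groups()
    annealing: Optional[str] = ANNEALING.get(alpha)
    storage: Optional[str] = STORAGE.get(beta)
    return {
        "donor": DONORS[x],
        "acceptor": ACCEPTORS[y],
        "salt": SALTS[z],
        "donor_mol_percent": int(mm),
        "salt_wt_percent": int(nn),
        "annealing": annealing,
        "storage": storage,
    }


parsed = parse_sample_name("OLM60-41HG")
print(parsed)
```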

Measurements and data analyses

EIS

A photoelectrochemical system (ZAHNER CIMPS, Zahner) was used as a spectrometer for EIS (Supplementary Fig. 4a). Samples were measured under open-circuit conditions at frequencies of 10⁶ to 1 Hz. The spectra were fitted manually with an RC parallel equivalent circuit. A constant phase element was introduced for a depressed semicircle, and the value of the intercept on the x-axis in the Nyquist plot was extracted for highly conducting samples. Ionic conductivity σion was estimated from the measured resistance R = d/(σion A), where d is the electrolyte thickness and A is the electrode area. Capacitance C was around 10⁻¹⁰ F, corresponding to the relative permittivity of 1–100 of the electrolytes. If a capacitance that was too large or small was obtained, the corresponding semicircle was not attributed to ionic conduction.
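The conversion from the fitted resistance and capacitance to conductivity and relative permittivity follows from the parallel-plate geometry of the cell. The sketch below assumes the 1 cm diameter disks described under Impedance measurements; the resistance, capacitance, and thickness values are hypothetical, and the function names are illustrative.

```python
import math

VACUUM_PERMITTIVITY_F_PER_CM = 8.8541878128e-14  # epsilon_0 expressed in F/cm


def electrode_area_cm2(diameter_cm: float = 1.0) -> float:
    """Area of a circular electrode disk."""
    return math.pi * (diameter_cm / 2.0) ** 2


def ionic_conductivity(resistance_ohm: float, thickness_cm: float,
                       area_cm2: float) -> float:
    """sigma = d / (R * A), in S/cm."""
    return thickness_cm / (resistance_ohm * area_cm2)


def relative_permittivity(capacitance_f: float, thickness_cm: float,
                          area_cm2: float) -> float:
    """eps_r = C * d / (eps_0 * A) for a parallel-plate capacitor."""
    return capacitance_f * thickness_cm / (VACUUM_PERMITTIVITY_F_PER_CM * area_cm2)


# Hypothetical fit results for one cell (not taken from the database).
area = electrode_area_cm2(1.0)
sigma = ionic_conductivity(resistance_ohm=100.0, thickness_cm=0.05, area_cm2=area)
eps_r = relative_permittivity(capacitance_f=1e-10, thickness_cm=0.05, area_cm2=area)

print(f"sigma = {sigma:.2e} S/cm, eps_r = {eps_r:.0f}")
```

A permittivity far outside the 1–100 window would flag a semicircle that should not be assigned to bulk ionic conduction, which mirrors the screening rule stated above.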

DSC

Samples were sealed in standard DSC pans in a glove box to avoid air exposure. DSC curves were measured with a differential scanning calorimeter (Q200, TA Instruments) with a typical scan rate of 20 °C min−1 (Supplementary Fig. 4b). The melting temperature and enthalpy of salt components were manually detected and recorded in the database.

XRD

XRD measurements were performed with an X-ray diffractometer (SmartLab, Rigaku). In a glove box, samples were sealed in airtight sample holders for the intensive beam method or glass capillaries for the parallel beam method to avoid air exposure (Supplementary Fig. 4c). After automatically separating signal peaks from the background, the peak areas were calculated (Supplementary Fig. 16).

Solid-state NMR

Solid-state NMR measurements were performed with an NMR spectrometer (JNM-ECA400, JEOL). The NMR samples were sealed in airtight sample holders in a glove box to avoid air exposure. During MAS NMR, samples were rotated at 8 kHz. Rotation was stopped during static NMR. Peak position and FWHM of 7Li and 19F peaks were detected automatically and recorded in the database. The CF3 fluorine peaks were analyzed for LiFTFSI and LiTFSI. In Supplementary Fig. 21, an NMR spectrometer with a different configuration and lot number was employed. Because of the shorter dead time, background peaks (i.e., broad components) were more successfully detected in comparison to Supplementary Fig. 18 and Supplementary Fig. 20. For a clearer comparison, the peak positions and FWHM in Supplementary Fig. 21 were not used as descriptors for machine learning.

Temperature dependence of conductivity

The conductivities of the solid-state cells were measured in a thermostat chamber while the temperature (T) was increased (Supplementary Fig. 4e). The temperature dependence was automatically fitted by the VFT equation, σion = σ0 exp[−B/(T − TK)], where σ0 is the prefactor, B is the activation energy, and TK is the Kauzmann temperature.
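One simple way to fit the VFT equation is to note that, for a fixed TK, ln σion is linear in 1/(T − TK), so TK can be scanned over a grid while σ0 and B come from ordinary least squares. The sketch below illustrates this under stated assumptions: it is not the authors' fitting routine, the data are synthetic, and B is expressed in kelvin because of the form of the equation.

```python
import math
from typing import List, Sequence, Tuple


def vft(temperature_k: float, sigma0: float, b: float, tk: float) -> float:
    """VFT conductivity: sigma0 * exp(-B / (T - TK))."""
    return sigma0 * math.exp(-b / (temperature_k - tk))


def _line_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line; returns slope, intercept, and squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_vft(temps_k: Sequence[float], sigmas: Sequence[float],
            tk_grid: Sequence[float]) -> Tuple[float, float, float]:
    """Scan TK; for each value fit ln(sigma) against 1 / (T - TK)."""
    ys = [math.log(s) for s in sigmas]
    best = None
    for tk in tk_grid:
        if tk >= min(temps_k):
            continue  # TK must lie below every measured temperature
        xs = [1.0 / (t - tk) for t in temps_k]
        slope, intercept, sse = _line_fit(xs, ys)
        if best is None or sse < best[0]:
            best = (sse, math.exp(intercept), -slope, tk)
    if best is None:
        raise ValueError("no TK candidate lies below the measured temperatures")
    return best[1], best[2], best[3]


# Synthetic data generated with assumed parameters (TK = -48 degC = 225.15 K).
temps: List[float] = [298.15 + 10.0 * i for i in range(6)]
sigmas = [vft(t, sigma0=1.0, b=800.0, tk=225.15) for t in temps]

tk_grid = [200.15 + 0.5 * i for i in range(101)]
sigma0_fit, b_fit, tk_fit = fit_vft(temps, sigmas, tk_grid)

# Empirical estimate of the glass transition, Tg ~ TK + 50, in degrees Celsius.
tg_celsius = tk_fit + 50.0 - 273.15
print(sigma0_fit, b_fit, tk_fit, tg_celsius)
```

With noisy experimental data, a nonlinear least-squares routine started from the grid result would usually be preferable to the grid value itself.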

Particle size analysis

Electrolytes or charge-transfer complex particles were observed using a conventional optical microscope under atmospheric conditions (Supplementary Fig. 4f and Supplementary Fig. 7). The particle diameters were automatically calculated from the images, yielding d50 and standard deviation. The salts were washed from the electrolytes with water before the measurements.

Computational simulations

Workstation

A commercial computer (CPU: AMD Ryzen Threadripper3 3990X with 64 cores; memory: 256 GB; storage: 2 TB SSD and 8 TB HDD) with the Ubuntu 18.04 operating system was used for computational simulations and data processing for materials informatics.

Density functional theory calculations

Density functional theory (DFT) calculations were conducted using Gaussian09 (Supplementary Fig. 9). Initial structures were preliminarily optimized by the PM6 semi-empirical method. The geometry optimization and main calculations were performed at the B3LYP/6-31+G(d,p) level in vacuo. Molecular interactions were quantified by calculating stabilization (solvation) energy. The value was defined as the energy difference between the normal structures and the control cases, where salt and surrounding molecules were far apart.

Molecular dynamics simulation

Molecular dynamics simulations of pristine LiTFSI in the crystal or amorphous phase were performed with LAMMPS (29 Oct 2020 version) (Supplementary Fig. 23). For the crystal phase, public crystallography data were used to place 480 ordered molecules. For the amorphous phase, 128 molecules were placed randomly. The calculation settings were generated with Winmostar V8. The force field and atomic charges were assigned by the Dreiding and Gasteiger methods, respectively. Simulations were conducted in the NPT ensemble at 300 K and 1 atm (‘pair_style’ option of ‘lj/cut/coul/long 10. 10’ and ‘pair_modify’ option of ‘mix arithmetic’). Periodic boundary conditions were applied in all dimensions, and the time step was 2 fs.

Open database and materials informatics

General information

All raw experimental data in this project are available on GitHub (https://github.com/KanHatakeyama/GlassyIonicConductorDatabase). The measurement data were linked by graph structures as meta information. The data management system is developed as an open-source Python project, FlowMater2 (https://github.com/KanHatakeyama/FlowMater2).

Preparation of graph-shaped databases

One experiment typically consisted of experimental protocols, material information, and measurement data. These relationships were expressed explicitly as a directed graph. Each diagram was recorded as a GraphML file, an XML-based graph format that can be edited with free software such as the yEd graph editor (https://www.yworks.com/products/yed).

Experimental procedures were recorded in a graph in a similar way to flowcharts (Supplementary Fig. 1a). For one experimental step, one node was introduced, which consisted of a node name and its characteristic features.

For instance, the ‘Set environment’ node in Supplementary Fig. 1a had the following features.

type: Indicates that the node was related to an experimental operation.

person: Experimenter.

O2: Oxygen concentration in the glove box.

Dew: Dewpoint in the glove box.

Day: Date of the experiment.

Env: Indicates whether the experiment was conducted in a glove box.

ID: Indication for the data processing system that this node is related to the glove box environment for electrolyte preparation.

In another example, the ‘Add donor’ node had the following features.

type: Indicates that the node was related to chemical samples.

keyword: A macro command to load further information about PPO, such as molecular weight and production data, from a JSON file where additional data were recorded.

weight: Amount of the sample.

note: User note to improve understanding.

ID: Indication for the data processing system that this node is related to the addition of donor molecules.

In another example, the ‘Ball mill’ node had the following features.

type: Indicates that this node is related to an experimental operation.

frequency: Frequency of ball milling.

time: Milling duration.

The electrolyte information was named ‘U-1130’ for the processing system at the end of the graph. Supplementary Fig. 1b shows another graph for impedance measurement. The data processing system automatically loaded and integrated the electrolyte preparation (Supplementary Fig. 1a) and cell fabrication (Supplementary Fig. 1c) procedures, making a full experimental graph. Conductivity data measured with different cells were recorded in a JSON file and loaded by the system automatically. Pathways to other measurement data, such as NMR and DSC data, were noted on graph nodes, clarifying the relationships between experimental methods and measurements.
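To illustrate how a protocol graph of this kind can be read programmatically, the following sketch parses a small GraphML document with the Python standard library. It assumes that node features are stored as plain GraphML data elements; the actual files edited with yEd and processed by FlowMater2 may store them differently. The node names follow the examples above, while the feature value ‘20 Hz’ and the function name load_graph are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical three-step protocol; the feature values are placeholders.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="name" for="node" attr.name="name" attr.type="string"/>
  <key id="type" for="node" attr.name="type" attr.type="string"/>
  <key id="frequency" for="node" attr.name="frequency" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="name">Set environment</data><data key="type">operation</data></node>
    <node id="n1"><data key="name">Add donor</data><data key="type">chemical</data></node>
    <node id="n2"><data key="name">Ball mill</data><data key="type">operation</data><data key="frequency">20 Hz</data></node>
    <edge source="n0" target="n1"/>
    <edge source="n1" target="n2"/>
  </graph>
</graphml>"""

def load_graph(text):
    """Return ({node_id: {feature: value}}, [(source_id, target_id), ...])."""
    root = ET.fromstring(text)
    graph = root.find("g:graph", NS)
    nodes = {
        node.get("id"): {d.get("key"): d.text for d in node.findall("g:data", NS)}
        for node in graph.findall("g:node", NS)
    }
    edges = [(e.get("source"), e.get("target")) for e in graph.findall("g:edge", NS)]
    return nodes, edges

nodes, edges = load_graph(GRAPHML)
```

The resulting dictionary of node features and list of directed edges are sufficient input for the later parsing steps.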

Parsing graph data

The graph characteristics were described as fingerprints, which indicated the presence of specific nodes (Fig. 3b). For instance, ‘fingerprint a’ in the figure shows the appearance of the ‘Add’ operation in a graph. The corresponding bit is 1 or 0 for the presence or absence of the operation, respectively. The fingerprint algorithm also checks the states of the closest neighbouring nodes. For example, the ‘Cool’ operation neighbouring ‘Heat’ (corresponding to fingerprint c) in Protocol A was distinguished from that in Protocol B because it neighboured an additional ‘Hot plate’ node. Over 250 checksums were generated from the experimental graph data in the project.
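A minimal sketch of such a fingerprint is given below, using the convention that 1 marks presence. Each node contributes one key for its own name and one key combining its name with the names of its nearest neighbours. The key format, function names, and the two toy protocols (loosely modelled on the Protocol A and B comparison) are assumptions for illustration, not the project's implementation.

```python
def node_keys(nodes, edges):
    """Fingerprint keys for one protocol graph.

    nodes: {node_id: name}; edges: list of (source_id, target_id).
    """
    neighbours = {n: set() for n in nodes}
    for s, t in edges:
        neighbours[s].add(nodes[t])
        neighbours[t].add(nodes[s])
    keys = set()
    for n, name in nodes.items():
        keys.add(name)  # the node on its own
        keys.add(name + "|" + ",".join(sorted(neighbours[n])))  # node plus nearest neighbours
    return keys

def fingerprints(protocols):
    """Return the shared key vocabulary and one bit list (1 = present) per protocol."""
    per_graph = [node_keys(n, e) for n, e in protocols]
    vocabulary = sorted(set().union(*per_graph))
    bits = [[1 if k in keys else 0 for k in vocabulary] for keys in per_graph]
    return vocabulary, bits

# Toy protocols: in B, 'Cool' has an extra 'Hot plate' neighbour.
protocol_a = ({"a": "Add", "h": "Heat", "c": "Cool"},
              [("a", "h"), ("h", "c")])
protocol_b = ({"a": "Add", "h": "Heat", "p": "Hot plate", "c": "Cool"},
              [("a", "h"), ("h", "c"), ("p", "c")])

vocabulary, bits = fingerprints([protocol_a, protocol_b])
```

Both protocols set the bit for ‘Add’, but only Protocol A sets the bit for ‘Cool|Heat’, because the neighbourhood of ‘Cool’ differs in Protocol B.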

The fingerprint was compressed into a 30-bit array by automatically grouping similar nodes (e.g., ‘Heat’ and ‘Hot plate’ in Fig. 3b). First, the text in each node was converted into a 768-dimensional numeric vector by a pretrained BERT model (bert-base-uncased model available at https://github.com/google-research/bert, Supplementary Fig. 3). Then, the vectors were automatically categorized into 30 groups by a kNN model. The words used frequently in the original nodes were extracted to name the 30 new fingerprint checksums.
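The compression step can be pictured as follows, assuming that a group index has already been assigned to every original fingerprint entry (here written by hand rather than derived from BERT vectors) and that a group bit is set when any of its members is set. The logical-OR rule, the function name, and the toy grouping into three groups are assumptions; the text does not specify how bits are merged.

```python
def compress(bits, names, group_of, n_groups):
    """Collapse a fingerprint onto groups: a group bit is 1 if any member bit is 1."""
    out = [0] * n_groups
    for bit, name in zip(bits, names):
        if bit:
            out[group_of[name]] = 1
    return out

# Toy vocabulary and hand-assigned groups ('Heat' and 'Hot plate' share a group).
names = ["Add", "Cool", "Heat", "Hot plate"]
group_of = {"Add": 0, "Cool": 1, "Heat": 2, "Hot plate": 2}

compressed = compress([0, 0, 0, 1], names, group_of, n_groups=3)
```

A protocol that uses only ‘Hot plate’ and one that uses only ‘Heat’ then map to the same compressed bit, which is the intended effect of grouping similar nodes.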

Characteristic features of measurement data were manually or automatically extracted using different algorithms, as explained in the Measurements and data analyses section. The molecular weight of a compound (or unit structure in a polymer) was selected as a descriptor to express molecular information. This single descriptor was considered sufficient for this project, which involved only several chemical structures. Numeric values in some nodes with the ‘ID’ keyword (examples shown in Supplementary Fig. 1) were also extracted as experimental descriptors. Several new parameters, such as the donor-to-acceptor ratio, were calculated from the information. Major features are summarised in Supplementary Table 2.

Conductivity prediction

Graph data were converted into a numerical table after the parsing, allowing conventional data analytics. The LightGBM model was selected for regression. Before regression, missing experimental values were filled using a machine learning-based imputer (https://github.com/KanHatakeyama/gen_model). Conductivity-related variables, such as σion and VFT parameters, were removed during the imputation to avoid data leakage.
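The leakage safeguard can be sketched as below: the conductivity-related columns are excluded before any statistics are computed for imputation and are left untouched in the output. A column median is used here purely as a stand-in for the machine learning-based imputer, and the column names and values are hypothetical.

```python
import statistics

TARGET_COLUMNS = {"log_sigma", "vft_B"}  # hypothetical conductivity-related columns

def impute_without_targets(rows, target_columns):
    """Fill missing (None) descriptor values; target columns are never read or filled."""
    feature_names = sorted({k for r in rows for k in r} - set(target_columns))
    medians = {
        name: statistics.median([r[name] for r in rows if r.get(name) is not None])
        for name in feature_names
    }
    filled = []
    for r in rows:
        new = dict(r)  # copy, so that target columns are carried over unchanged
        for name in feature_names:
            if new.get(name) is None:
                new[name] = medians[name]
        filled.append(new)
    return filled

# Hypothetical table with missing entries.
rows = [
    {"donor_ratio": 0.2, "mill_time": 30.0, "log_sigma": -4.1},
    {"donor_ratio": None, "mill_time": 60.0, "log_sigma": -3.5},
    {"donor_ratio": 0.4, "mill_time": None, "log_sigma": None},
]
filled = impute_without_targets(rows, TARGET_COLUMNS)
```

With a model-based imputer the exclusion matters more than with a median, because such an imputer could otherwise learn to reconstruct descriptors from the very quantity that is later predicted.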

The logarithmic experimental ionic conductivity was predicted from the experimental descriptors generated by the system (Fig. 3b). Statistically essential descriptors were automatically selected using the Boruta package (https://github.com/scikit-learn-contrib/boruta_py). Then, prediction accuracy was examined with randomly split training (80%) and testing (20%) datasets, using the coefficient of determination, R2. Strictly speaking, descriptor selection should be conducted only with the training dataset. However, the whole database was used because the aim of the study was fundamental parameter analysis, not prediction. SHAP values were analyzed for a regression model trained with the entire database. Causal relationships were explored using the CausalNex library (https://github.com/quantumblacklabs/causalnex).
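For reference, the evaluation scheme (a random 80/20 split scored with R2) can be written without any machine learning library as in the sketch below. The function names and the fixed random seed are assumptions; the regression model itself is omitted.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle record indices and return (train_indices, test_indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

train_idx, test_idx = split_indices(10)
```

A model that always predicts the mean of the test targets scores R2 = 0, and a perfect model scores R2 = 1, which gives a scale for reading the reported values.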

In addition to log σion, a newly defined score, (median of σion) – (standard deviation of σion), was predicted for electrolytes prepared under the same experimental conditions (Supplementary Fig. 26). The value is an indicator of conductivity and reproducibility.
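The score is straightforward to compute for a group of replicate measurements, as sketched below. The use of the sample standard deviation, the function name, and the conductivity values are assumptions for illustration.

```python
import statistics

def reproducibility_score(sigmas):
    """(median of sigma_ion) - (standard deviation of sigma_ion) for replicate cells."""
    return statistics.median(sigmas) - statistics.stdev(sigmas)

# Hypothetical replicate conductivities (S/cm) with the same median.
scattered = reproducibility_score([1.0e-4, 2.0e-4, 3.0e-4])
tight = reproducibility_score([1.9e-4, 2.0e-4, 2.1e-4])
```

Two conditions with the same median conductivity are thus ranked by reproducibility: the tighter set of replicates receives the higher score.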