Data science assisted investigation of catalytically active copper hydrate in zeolites for direct oxidation of methane to methanol using H2O2

Dozens of Cu zeolites with MOR, FAU, BEA, FER, CHA and MFI frameworks are tested for direct oxidation of CH4 to CH3OH using H2O2 as oxidant. To investigate the active structures of the Cu zeolites, 15 structural variables, which describe the features of the zeolite framework and reflect the composition, the surface area and the local structure of the Cu zeolite active site, are collected from the Database of Zeolite Structures of the International Zeolite Association (IZA). Also analytical studies based on inductively coupled plasma-optical emission spectrometry (ICP-OES), X-ray fluorescence (XRF), N2 adsorption specific surface area measurement and X-ray absorption fine structure (XAFS) spectral measurement are performed. The relationships between catalytic activity and the structural variables are subsequently revealed by data science techniques, specifically, classification using unsupervised and supervised machine learning and data visualization using pairwise correlation. Based on the unveiled relationships and a detailed analysis of the XAFS spectra, the local structures of the Cu zeolites with high activity are proposed.

www.nature.com/scientificreports/
although the reaction using H2O2 gives higher productivity and selectivity of CH3OH than the reactions using the other oxidants. In the case of CH4 oxidation using O2, Cu zeolites hardly act as catalysts, but activated Cu zeolites, prepared by heat treatment under O2, form methoxy species on active sites by stoichiometric reaction with CH4. The methoxy species are then treated with water vapor to extract CH3OH. Thus, a chemical looping process involving Cu zeolite activation, CH4 oxidation and CH3OH extraction has been proposed for direct conversion of CH4 to CH3OH 25,26. A recent economic analysis has suggested that the problems of the chemical looping process lie in the production efficiency of CH3OH and the durability of Cu zeolites 15. On the other hand, when NOx is used as an oxidant, CH3OH is formed over Cu zeolites continuously in gas flow reactors; however, the CH3OH selectivity is much lower than in the former two processes 18,27. Thus, methods for direct oxidation of CH4 to CH3OH need to be improved further for practical application.
The local structure around the Cu sites in zeolites may be the key to catalytic activity for CH4 oxidation, because the local structure strongly influences the adsorption of reactants, intermediates and products, and consequently affects reaction results such as reactant conversion and product selectivity 28,29. In addition, zeolite-framework-derived diffusion and adsorption of molecules can influence the reaction results 30,31. Thus, it is necessary to gain an improved understanding of catalytic reactions by investigating structure-activity relationships based on various structural data that describe each catalyst accurately. Furthermore, if the key structural descriptors and their effects are revealed, new catalysts can be developed based on fundamental structural design considerations.
Here, data analysis techniques developed in data science, such as machine learning and data visualization, are considered useful for revealing the key descriptors in "hidden" relationships in complex multidimensional data 6,7. Recently, such data analysis techniques have been applied in the field of catalysis chemistry 7,32-41. Meanwhile, the construction and publication of databases related to catalyst materials are also progressing. As for zeolites, the Structure Commission of the International Zeolite Association (IZA-SC) has provided and upgraded structural data for all zeolite framework types since 1996 42. Consequently, data for various materials can be explored as catalyst descriptors.
Measured structural data are often more important for describing the catalysts actually used than the general data obtained from published databases. In the case of Cu zeolites, UV-Vis spectra, X-ray diffraction and X-ray absorption fine structure (XAFS) spectra are analyzed because they reflect the local structures of Cu active sites 14,29,43. XAFS spectra provide sensitive and accurate information on the valence, symmetry and coordination structure of Cu active sites. Thus, Cu K-edge XAFS has been used to reveal the structure of Cu active sites for CH4 oxidation as well as for NOx purification 14,29,43. It should also be noted that advances in synchrotron radiation and optical techniques in recent decades have permitted collection of XAFS spectra in a relatively short period of time 44. Therefore, the actual structural data of active sites can be effectively obtained by XAFS measurement.
Here, dozens of Cu zeolites are prepared and CH4 oxidation is performed in a batch-type reactor using H2O2 as an oxidant. Information on the zeolite framework structure is collected from the database. In addition, Cu K-edge XAFS spectra and specific surface areas are obtained for the prepared catalysts. Based on the dataset consisting of reaction and structural data, the active site structures in the Cu zeolite catalysts for CH4 oxidation are investigated with the aid of data science techniques.

Methods
Catalyst preparation. CHA type zeolites and JRC-Z90 are obtained from JGC Catalysts and Chemicals Ltd. and the Catalysis Society of Japan, respectively. The other zeolites are provided by Tosoh Corporation. Copper-exchanged zeolites are prepared by adding 1-2 g of zeolite powder to aqueous solutions of Cu(CH3COO)2·H2O. After stirring at 80 °C for 3 h, the suspensions are filtered, washed with water, and dried at 110 °C overnight. The samples are calcined at 700 °C for 1 h. The zeolites are designated M(X)-TYP-Y, where M is the exchanged metal species, (X) is the loading amount of M, TYP is the 3-letter code indicating the framework type of the zeolite and Y is the Si/Al2 ratio. Fe-MFI(37) and Mn-MFI(39) are also prepared by the same ion exchange method using Fe(CH3COO)2
The Cu loadings of the Cu zeolites are determined using inductively coupled plasma-optical emission spectroscopy (ICP-OES, Thermo iCAP7400) and X-ray fluorescence (XRF, Rigaku EDXL 300). The aqueous solutions for ICP analysis are prepared by a fusion method as reported elsewhere 45. A mixture of 20 mg of Cu zeolite and 0.5 g of sodium peroxide is heated in a Zr crucible at 500 °C. The resulting molten samples are dissolved by adding 20 mL of 2 M HCl. The Cu loading is calculated from the Cu/Al ratio evaluated by the ICP-OES measurement and the Si/Al ratio specified in the manufacturer's catalogue. The Cu loadings are also evaluated from XRF analysis of the Cu zeolite powders under He flow or under evacuation. Figure S1 shows the relationship between the relative XRF intensities for Cu/(Cu + Al + Si) and the Cu loadings determined by ICP-OES, where the data for 26 samples are plotted. Using the linear relationship and the relative XRF intensities, the Cu loadings for 35
Data analysis using data science techniques.
Scikit-learn (version 0.17) and pandas are used for supervised and unsupervised machine learning as well as for calculation of the pairwise correlations of the variables representing the structure and reactivity of the Cu zeolites 48. A Gaussian mixture model, an unsupervised machine learning method, is used to classify the data, with the covariance type set to full. Random forest classification, a supervised machine learning method, is used to evaluate the importance of descriptors 49. The number of trees in the random forest is set to 100, and the random state giving the highest score is chosen. Cross-validation is used to evaluate the accuracy of each machine learning algorithm, where the data are split into test data (20% of the data) and training data (80% of the data). The average score of ten random tests is evaluated. The pairwise correlations are calculated to evaluate the correlations between the variables.
1H NMR spectra are obtained using a water suppression pulse program. The reaction products, i.e., CH3OH, CH3OOH and formic acid, are quantified from the peak areas at δ = 3.36, 3.87 and 8.26 ppm, respectively, by the external standard method using maleic acid as the standard. The gas phase in the autoclave is analysed by a gas chromatograph (GC-2014, Shimadzu) equipped with a thermal conductivity detector.

Results and discussion
Thirty-five Cu zeolites and twenty H zeolites are tested for the CH4-H2O2 reaction in the batch reaction system, and the results are presented in Fig. S2a,b and Table S1. CH3OH, CH3OOH and/or HCOOH are observed as products, and the main product varies with the catalyst (Figs. S2a, S3, Table S2). The Cu zeolites give CH3OH and CH3OOH as the oxygenated products (Figs. S2a, S3c,d). When H-MFI zeolites are used, the overoxidized product, HCOOH, is formed in addition to CH3OH and CH3OOH (Figs. S2b, S3a). Thus, the Cu2+ in Cu-MFI suppresses overoxidation of CH4, which is consistent with previous studies 16,23,24. The catalytic activity of H-MFI is attributed to Fe contamination, which is active for non-selective oxidation of CH4 via the Fenton reaction (Table S2) 24. Interestingly, the other H-zeolites (MOR, FER, FAU and CHA) give much smaller amounts of oxidized products than H-MFI and do not produce HCOOH (Figs. S2b, S3b). This might be due to differences in the structure of the Fe species in the H-zeolites (Fig. S4). With regard to H2O2 utilization, H2O2 concentration measurements after the CH4-H2O2 reaction over several zeolites indicate that the different Fe and Cu species do not cause a significant change (Fig. S5). The CH4-H2O2 reaction is also performed using other metal-exchanged zeolites, i.e., Fe, Co, Ni, Rh and Ag-MFI. As presented in Table S3, the M-MFI catalysts other than Cu-MFI produce HCOOH. According to the literature, CH3OH is formed by the decomposition of CH3OOH, while HCOOH is formed by overoxidation of CH3OH or by non-selective oxidation 24. Therefore, Cu species are considered to be effective for selective oxidation of CH4 to CH3OH and CH3OOH. This result is in good agreement with previous studies 16,23,24.
To investigate the catalytic performance of Cu species in the zeolites, the increments of the products due to Cu exchange are evaluated from the differences in the total yields of all products before and after Cu exchange. The product increments for all Cu zeolites are shown in Fig. S2c, where all Cu-MFI samples show negative values. It is reasonable to consider that Cu in the MFI zeolites traps the oxygen radical species that arise from the Fe contamination in the MFI zeolites. More importantly, the other Cu zeolites show positive values (Fig. S2c), which are attributable to catalysis by Cu species. On the other hand, the catalysis of Cu species in MFI cannot be evaluated because the Fe species in MFI influence the catalytic performance too strongly. Therefore, the active structures of Cu zeolites are investigated on the basis of the catalytic activity of the Cu zeolites other than Cu-MFI. It should also be noted that neither CO2 nor CO (< 6 ppm) is detected by a gas chromatograph with a thermal conductivity detector in the gas phase after the CH4-H2O2 reaction using several catalysts, including H-MOR(18.8), H-MOR(29.4) and Cu(2.02)-MOR(18.8). Thus, the catalytic activity is evaluated from the total amount of CH3OH and CH3OOH produced. Figure 1 shows the specific activity, determined by dividing the product increments by the amount of Cu in the Cu zeolites. The specific activity differs among the Cu zeolites, suggesting that it depends on the Cu zeolite structure.
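The specific activity defined above (product increment divided by the amount of Cu) can be illustrated with a minimal sketch. The function name, the argument names and the units (yields in µmol, catalyst mass in g, Cu loading in wt%) are assumptions made for illustration only; the source does not state the units used, and the numbers in the example call are invented.

```python
CU_MOLAR_MASS = 63.546  # g/mol, standard atomic weight of Cu


def specific_activity(yield_cu_umol, yield_h_umol, catalyst_mass_g, cu_wt_percent):
    """Return the product increment per amount of Cu (mol product / mol Cu).

    yield_cu_umol   : total product yield over the Cu zeolite (hypothetical unit: umol)
    yield_h_umol    : total product yield over the parent H zeolite (umol)
    catalyst_mass_g : mass of Cu zeolite used in the reaction (g)
    cu_wt_percent   : Cu loading of the Cu zeolite (wt%)
    """
    # Increment of products attributed to Cu exchange
    increment_umol = yield_cu_umol - yield_h_umol
    # Amount of Cu in the reactor, converted from g to umol
    cu_umol = catalyst_mass_g * cu_wt_percent / 100.0 / CU_MOLAR_MASS * 1e6
    return increment_umol / cu_umol


# Invented example values, not taken from the paper
example = specific_activity(12.0, 4.0, 0.05, 2.02)
# A Cu zeolite that yields less than its parent H zeolite gives a negative value
negative_example = specific_activity(3.0, 4.0, 0.05, 2.02)
```

A negative result corresponds to the situation described for Cu-MFI, where the yield decreases after Cu exchange.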
The catalyst structural data are collected in order to explore the highly active structures. Table 1 lists the catalyst structural data collected in this study. The structural data related to the zeolite framework type are taken from the Database of Zeolite Structures of the IZA 42. More specifically, the following eight variables are collected: framework density (FD); topological density (TD10); channel dimensionality (CD); maximum diameter of a sphere that can be included (DI); maximum diameters of spheres that can diffuse along the three unit vectors (Da, Db, Dc); and accessible volume (AV). The surface area (SA), which also belongs to the zeolite structural data, is evaluated by N2 adsorption measurements. Further, the zeolite compositional data include the Si/Al2 ratio of the zeolite, the Cu loading and the ion exchange rate (Cu/Al2 ratio, denoted as IE), which are determined by the ICP/XRF measurements.
The data describing the Cu active site features are obtained by Cu K-edge XAFS spectral analysis, where the electronic state and local structure of the Cu species are evaluated. The Cu K-edge XANES spectra of all Cu zeolites listed in Fig. 1 are presented in Figs. 2 and S6 together with those of Cu2O, Cu(OH)2 and CuO powders as references for Cu+ and Cu2+. The XANES spectrum of Cu(NO3)2 aq is also taken as a reference for hydrated Cu2+. All Cu zeolites exhibit an X-ray absorption edge at an energy similar to that of Cu(OH)2, CuO and Cu(NO3)2 aq, but higher than that of Cu2O, indicating that all the Cu species in the zeolites are Cu2+. Furthermore, the Cu2+ species in the zeolites are attributable to hydrated Cu2+ because the spectra are similar to that of Cu(NO3)2 aq. However, the Cu zeolites exhibit slightly different X-ray absorption edge profiles from each other, suggesting different electronic states or coordination numbers/symmetries for the Cu species 29,51.
Thus, the edge energies at 0.5 of the normalized absorption are evaluated as a means to describe the structure of the Cu species. It should also be noted that the XAFS spectrum of a Cu zeolite is not changed by immersion in H2O (Fig. S7). The result suggests that the XAFS spectra of Cu zeolites (Figs. 2, 3) reflect the sample state under the liquid phase reaction conditions.
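One way to read an edge energy at 0.5 of the normalized absorption from a spectrum is to find the first crossing of that level and interpolate linearly between the two neighbouring data points. The sketch below assumes this procedure; the source does not specify how the value is extracted, and the function name and the synthetic arctangent-shaped spectrum are illustrative only.

```python
import numpy as np


def edge_energy_at_level(energy_eV, mu_norm, level=0.5):
    """Energy at which a normalized spectrum first reaches `level`.

    Uses linear interpolation between the last point below the level
    and the first point at or above it.
    """
    energy = np.asarray(energy_eV, dtype=float)
    mu = np.asarray(mu_norm, dtype=float)
    above = np.nonzero(mu >= level)[0]
    if above.size == 0 or above[0] == 0:
        raise ValueError("spectrum does not cross the requested level")
    i = above[0]
    # Fractional position of the crossing between points i-1 and i
    frac = (level - mu[i - 1]) / (mu[i] - mu[i - 1])
    return energy[i - 1] + frac * (energy[i] - energy[i - 1])


# Synthetic, idealized edge step centred at 8990 eV (not measured data)
energy = np.linspace(8960.0, 9020.0, 601)
mu = 0.5 + np.arctan((energy - 8990.0) / 2.0) / np.pi
e_half = edge_energy_at_level(energy, mu)
```

For real Cu K-edge data, pre-edge shoulders can approach the chosen level, so the first-crossing rule used here is a simplification.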
The local structure of the Cu species is evaluated from the Fourier transform (FT) of the EXAFS spectra. The Cu K-edge FT EXAFS spectra for all the Cu zeolites in Fig. 1 are presented in Figs. 3a and S8. The peak at ca. 1.5 Å is assignable to Cu-O scattering, and its intensity differs among the Cu zeolites. In addition, the peak intensity at ca. 2.1 Å also differs significantly among the Cu zeolites. The spectral differences suggest a difference in the local structure around the Cu species. In fact, previous studies on an aqueous solution of Cu2+ revealed that hydrated Cu2+ can have various local structures in dynamic equilibrium, including a distorted octahedron with six coordinated H2O (dOh); a distorted square pyramid (dSPy), a square pyramid (SPy) and a regular trigonal bipyramid (TBPy), each with five coordinated H2O; and a square planar (SPl) structure with four coordinated H2O 51. Accordingly, the FT EXAFS spectra of the various hydrated Cu2+ Table S1. 47,51. The structural parameters for each model structure are shown in Table S4. The simulated FT EXAFS spectra are presented in Fig. 3b and show that the peak intensities at ca. 1.5 and 2.1 Å vary with the local structure. Therefore, the peak intensities at ca. 1.5 and 2.1 Å are extracted as descriptors of the local structure of Cu.
Table S5 together with the specific activity. To explore the important descriptors for the specific activity, the random forest classification method is applied to the data of the twenty-eight Cu zeolites. As a data pretreatment, the specific activity is classified into three groups, i.e., low, medium and high, using a Gaussian mixture model, an unsupervised machine learning method, so that random forest classification can be performed. The classified specific activity is listed in Table S5.
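The pretreatment step described above, in which a continuous specific activity is grouped into low, medium and high classes with a Gaussian mixture model, can be sketched as follows. This is a minimal illustration and not the authors' script: it uses the `GaussianMixture` class of current scikit-learn releases (the 0.17 release cited in the Methods exposed a differently named class), the activity values are synthetic, and the relabelling by component mean is an added convenience.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def classify_activity(activity, n_groups=3, seed=0):
    """Assign each sample to one of n_groups classes ordered by mean activity.

    Returns integer labels where 0 is the lowest-activity group.
    """
    x = np.asarray(activity, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(
        n_components=n_groups,
        covariance_type="full",  # covariance type stated in the Methods
        random_state=seed,
    ).fit(x)
    raw = gmm.predict(x)
    # Component indices are arbitrary, so rank the components by their means
    order = np.argsort(gmm.means_.ravel())
    rank = np.empty(n_groups, dtype=int)
    rank[order] = np.arange(n_groups)
    return rank[raw]


# Synthetic specific activities for 28 hypothetical samples (not the paper's data)
rng = np.random.default_rng(0)
activity = np.concatenate([
    rng.normal(1.0, 0.1, 10),   # low
    rng.normal(5.0, 0.3, 10),   # medium
    rng.normal(12.0, 0.5, 8),   # high
])
labels = classify_activity(activity)
```

The resulting integer labels can then serve as the objective variable for a classifier.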
Here, the explanatory variables are set to the fifteen descriptors of the Cu zeolite catalysts, while the objective variable is set to the classified specific activity. The trained random forest classifier with the 15 descriptors is then evaluated by cross-validation, which returns an average score of 68%. The importance of each descriptor is evaluated and the results are presented in Fig. 4. Relatively high importance is assigned to seven variables, namely Si/Al2, Cu wt, IE, SA, E at abs 0.5, Int at 1.5 Å and Int at 2.1 Å, which are the structural parameters or compositions of the Cu zeolites. In contrast, the descriptors of zeolite types and pores, namely FD, TD10, DI, Da-c, AV and CD, have less impact on the specific activity. This suggests that the Cu zeolite structure and/or composition are the key descriptors of catalytic activity.
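The evaluation described above (100 trees, an 80/20 split of training and test data, and the average score of ten random tests) can be sketched with scikit-learn as follows. The descriptor matrix and class labels are synthetic stand-ins, the helper name `repeated_holdout_score` is hypothetical, and the selection of the best random state mentioned in the Methods is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def repeated_holdout_score(X, y, n_repeats=10, test_size=0.2, n_trees=100, seed=0):
    """Average test accuracy over repeated random 80/20 splits."""
    scores = []
    for i in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed + i
        )
        clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        clf.fit(X_tr, y_tr)
        scores.append(clf.score(X_te, y_te))
    return float(np.mean(scores))


# Synthetic data: 28 samples x 15 descriptors (not the paper's dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(28, 15))
# Hypothetical three-class target that depends on descriptor 0 only
y = np.digitize(X[:, 0], bins=[-0.5, 0.5])

mean_score = repeated_holdout_score(X, y)

# Descriptor importances from a forest trained on all samples
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_
```

The `feature_importances_` values are normalized to sum to one, so they are best read as relative rankings of the descriptors rather than absolute effect sizes.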
Pairwise correlations of the 16 variables, including both the explanatory variables and the objective variable, are evaluated in terms of the Pearson correlation coefficient, and the results are presented in Fig. 5. Dark red and dark blue indicate high positive and high negative correlation coefficients, respectively, which suggest a strong correlation between the paired variables. Accordingly, the correlations between the specific activity and the seven variables from FD to CD representing the zeolite framework structure are weak (see the green square in Fig. 5). However, there are relatively strong correlations between the specific activity and the other seven variables from Si/Al2 to Int at 2.1 Å (the purple square in Fig. 5). Note that the strength of the pairwise correlation is consistent with the importance obtained from the random forest classification. Thus, both data analyses suggest that the seven variables relating to Cu zeolite structure and composition are important descriptors of the specific activity. Interestingly, relatively strong correlations are observed between the variables of the zeolite framework structure (from FD to CD) and those of the Cu structure (E at abs 0.5, Int at 1.5 and 2.1 Å) (the yellow square in Fig. 5), suggesting that the zeolite framework structure affects the structure of the Cu active site. In addition, significant correlations are also found between the variables from Si/Al2 to SA and those from E at abs 0.5 to Int at 2.1 Å (the pink square in Fig. 5). Thus, one can consider that the zeolite framework structure, composition and surface area determine the structure of the Cu active site, which strongly affects the catalytic activity.
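A pairwise Pearson correlation matrix of the kind discussed above can be computed with pandas in a few lines. The sketch below uses three synthetic columns with hypothetical names in place of the 16 variables; the negative dependence of the activity on the first column is built in for illustration and is not a result from the paper.

```python
import numpy as np
import pandas as pd

# Synthetic table with hypothetical column names (28 rows, not the paper's data)
rng = np.random.default_rng(0)
n = 28
df = pd.DataFrame({
    "int_1p5A": rng.normal(size=n),  # stand-in for "Int at 1.5 A"
    "fd": rng.normal(size=n),        # stand-in for a framework descriptor
})
# Activity constructed to decrease with int_1p5A, plus a little noise
df["activity"] = -2.0 * df["int_1p5A"] + rng.normal(scale=0.1, size=n)

# Pearson correlation coefficient for every pair of columns
corr = df.corr(method="pearson")

# Descriptors ranked by the absolute correlation with the activity
ranking = corr["activity"].drop("activity").abs().sort_values(ascending=False)
```

The matrix `corr` is symmetric with ones on the diagonal, which is the layout typically drawn as a red/blue heat map.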
In order to specify a highly active Cu structure, the specific activity is plotted against the intensities of FT-EXAFS at 1.5 and 2.1 Å, as shown in Fig. 6. In both cases, the specific activity increases as the FT EXAFS intensity decreases. Note that the SPy and TBPy structures have five Cu-O bonds of length 1.96 Å and show the highest intensities at 1.5 and 2.1 Å among the simulated spectra (Fig. 6). Thus, the SPy and TBPy structures are not highly active species. In other words, one or more of the other structures (dOh, dSPy and SPl) have high specific activity. Given that the Cu species in zeolites are considered to be mixtures of various structures, the main structures are difficult to determine from the FT EXAFS intensities alone. In addition, the Int at 2.1 Å might be affected significantly by noise in the EXAFS spectra (Fig. S9). Thus, the Cu K-edge XANES spectral features of the active Cu zeolites are also examined for further specification of the active structure, because the XANES features are sensitive to the local structure and are less affected by noise than the EXAFS. Figure 7 displays the XANES spectra of Cu(0.40)MOR(220), Cu(0.64)FAU(110) and Cu(1.11)FAU(14.9), which have high specific activity. The spectra of the two CuFAU catalysts have a shoulder at 8986 eV and a relatively low white line intensity at 8995 eV, features which are similar to those of the SPl structure reported in the literature 29. In particular, the shoulder at 8986 eV is assignable to the 1s to 4p electronic transition of the Cu2+ species with the SPl structure. Therefore, the SPl structure is considered to be the highly active structure in FAU. In the case of Cu(0.40)MOR(220), the XANES spectrum does not show such a shoulder at 8986 eV but shows a relatively high white line intensity. Such a spectral feature is seen in dOh and dSPy.
In addition, the FT-EXAFS of Cu(0.40)MOR(220) does not exhibit a peak shift at 1.5 Å, as is also the case for the two CuFAU catalysts with SPl structures. This suggests that the dSPy structure is formed in Cu(0.40)MOR(220), because dOh should show the peak shift, as simulated in Fig. 3b. Therefore, the dSPy structure is proposed as the highly active structure in MOR.

Conclusions
Various metal zeolites are prepared and tested for direct oxidation of CH4 to CH3OH using H2O2 as an oxidant. Given that Cu is effective for the selective oxidation to CH3OH and CH3OOH without producing HCOOH, the catalytic performance of 35 Cu zeolites and 20 H zeolites having MOR, FAU, BEA, FER, CHA and MFI frameworks is evaluated, and the Cu zeolites except for Cu-MFI are confirmed to show catalytic activity for the CH4-H2O2 reaction. In addition, the CuMOR and CuFAU zeolites contain the most active Cu species among the Cu zeolites tested. The twenty-eight catalytically active Cu zeolites are described in terms of the structural variables of the zeolite framework obtained from the database of the IZA and the experimentally evaluated zeolite features based on composition, surface area and local structure of the Cu active site. The relationships between the specific activity of the Cu zeolites and the structural variables are analyzed by classification methods using unsupervised and supervised machine learning and by pairwise correlation, suggesting that the local structure of the Cu active species, represented by the intensities of FT-EXAFS at 1.5 and 2.1 Å, is an important descriptor of the specific activity. By comparing the experimental XAFS spectra with the simulated or reported ones, the highly active Cu species in FAU and MOR are considered to have SPl and dSPy structures, respectively.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.