A meta-analysis of catalytic literature data reveals property-performance correlations for the OCM reaction

Decades of catalysis research have created vast amounts of experimental data. Within these data, new insights into property-performance correlations are hidden. However, the incomplete nature and undefined structure of the data has so far prevented comprehensive knowledge extraction. We propose a meta-analysis method that identifies correlations between a catalyst’s physico-chemical properties and its performance in a particular reaction. The method unites literature data with textbook knowledge and statistical tools. Starting from a researcher’s chemical intuition, a hypothesis is formulated and tested against the data for statistical significance. Iterative hypothesis refinement yields simple, robust and interpretable chemical models. The derived insights can guide new fundamental research and the discovery of improved catalysts. We demonstrate and validate the method for the oxidative coupling of methane (OCM). The final model indicates that only well-performing catalysts provide under reaction conditions two independent functionalities, i.e. a thermodynamically stable carbonate and a thermally stable oxide support.


Supplementary Note 1 OCM data and statistical analysis by Zavyalova et al.
In a landmark publication, Zavyalova et al. [1] collected a large set of historical OCM data from literature published between 1982 and 2009. In total, 1866 observations from 421 referenced publications (papers, patents, PhD theses) were collected. The dataset was made publicly available at http://www.fhi-berlin.mpg.de/acnew/department/pages/ocmdata.html.
In the dataset, the catalyst composition is described in terms of the presence and the molar fractions of elements assigned to the categories cation, anion and support, as well as the presence of "promotors" (without molar fraction). The reaction conditions are provided in terms of the partial pressure of methane and oxygen in the reaction feed, total pressure, temperature and contact time. Catalytic performance is described via conversion of methane and oxygen, C2 selectivity and C2 yield. Only catalysts that provided all the listed information were included in the dataset. Moreover, only catalysts studied in a fixed bed reactor with continuous methane/oxygen co-feed were considered. Supplementary Figure 1 illustrates the dataset as well as the statistical evaluation by Zavyalova et al.
Supplementary Figure 1 Structure of the OCM dataset collected by Zavyalova et al. and implemented data categories. [1] Variables considered in the statistical analysis of Zavyalova et al. are marked by white background color. The arrow illustrates the tested correlations between elemental catalyst composition and indicators of OCM performance.
Zavyalova et al. analysed the dataset by statistical methods in order to identify correlations between elemental catalyst composition and OCM performance. Arithmetic mean values of C2 selectivity and yield were used to compare the OCM performance of single unsupported oxides that appear at least three times within the dataset. The alkaline earth elements Ca, Sr and Ba as well as the lanthanides La, Sm, Nd, Eu, Gd and Yb were found to exhibit relatively high values for C2 selectivity and yield. However, the number of employed observations per element varies, and the variance of performance values for the same element is often high.

Multiway analysis of variance (ANOVA) [2] was employed to evaluate the effect of the presence or absence of specific elements on catalytic performance. In this case, the method analyses whether a considered element can be left out without significantly changing the variance of the dependent variables C2 selectivity or yield. To achieve this, a null hypothesis is formulated stating that the variance of the dependent variable does not change if one element is left out. The ANOVA output is a significance level for each element, giving the probability of erroneously rejecting the null hypothesis. Thus, a low significance level indicates a strong effect of the considered element on catalytic performance. Zavyalova et al. applied the analysis to either the complete dataset or a subset containing only well-performing catalysts with C2 yield ≥ 15% and C2 selectivity ≥ 50%. For example, for the alkali elements Li and Na, as well as for the alkaline earth elements Sr and Ba, high significance was found with respect to C2 selectivity and yield. However, the authors themselves question how representative the ANOVA results for the group of well-performing catalysts are, owing to the low number of observations for some of the elements. Furthermore, ANOVA does not indicate whether the effect of an element is positive or negative, which is needed in the context of catalysis.

To establish correlations between the considered variables molar fraction of elements and C2 selectivity or yield, Zavyalova et al. applied Bravais-Pearson's and Spearman's rank correlation [3]. The Bravais-Pearson correlation evaluates whether a positive or negative linear dependency exists between two variables, and assigns a correlation coefficient. It is defined as the standardized covariance of the two variables and takes values between -1 and 1. The Spearman rank correlation assigns ranks to all appearing values of the two variables. The order of ranks (i.e. lowest to highest appearing value) is then compared for both variables. A correlation coefficient is assigned that evaluates whether a systematic order exists between the ranks of both variables. The coefficient takes values between -1 and 1. In contrast to the Bravais-Pearson coefficient, a linear dependency between both variables is not assumed. Zavyalova et al. show positive correlation coefficients for both C2 selectivity and yield for alkali metals (Li, Na, Cs), for alkaline earth metals (Mg, Ca, Sr, Ba), and various other metals (Bi, Mo, Ga, Nd, Re, Yb). However, the authors point out that the obtained correlation coefficients have values below 0.2, indicating that the covariance is small with respect to the variances of the variables. This indicates weak correlations. Additionally, correlation coefficients are generally not suitable for inferring causality [4].

Zavyalova et al. also studied the effect of the presence or absence of specific elements using regression trees. In this method, an algorithm was used to split the data subset of well-performing catalysts into two groups such that the homogeneity of C2 selectivity or yield within each of the two groups is maximized. To achieve this, the variance of the performance variable within each group is minimized. The criterion to split the dataset was the presence or absence of a considered element. The obtained groups were again subdivided using the same algorithm, until one of the formed groups would contain fewer than 5% of the observations. However, the regression trees are unstable, as their stepwise development depends strongly on the first split. Starting a regression tree with a different first split would potentially also change the relevant elements for further splitting in the next steps [5] and their effect on the performance variables. In the regression trees, the same element can have opposite effects on performance depending on the preceding splits, which indicates the instability of the obtained statements. Furthermore, the difference in homogeneity of the performance variable between the two groups formed when splitting the dataset is not documented. Therefore, it cannot be evaluated whether the element selected for splitting is significantly more relevant than any other element. Also, only the mean value of the performance variable is shown for each subgroup, with no information on the variance within the subgroup.
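The splitting step of such a regression tree can be sketched as follows. This is a minimal illustration of a single presence/absence split that minimizes the pooled within-group sum of squares, not the implementation used by Zavyalova et al.; the function names, the toy catalysts and their yields are hypothetical.

```python
from statistics import fmean


def within_group_sse(values):
    """Sum of squared deviations from the group mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_presence_split(catalysts, min_fraction=0.05):
    """Return (element, pooled_sse) for the presence/absence split that
    minimizes the pooled within-group sum of squares, or None if no split
    leaves at least min_fraction of the observations in both groups."""
    elements = sorted({e for elems, _ in catalysts for e in elems})
    min_size = min_fraction * len(catalysts)
    best = None
    for element in elements:
        present = [y for elems, y in catalysts if element in elems]
        absent = [y for elems, y in catalysts if element not in elems]
        if len(present) < min_size or len(absent) < min_size:
            continue
        sse = within_group_sse(present) + within_group_sse(absent)
        if best is None or sse < best[1]:
            best = (element, sse)
    return best


# Hypothetical toy data: (set of elements, C2 yield in %).
toy = [
    ({"Li", "Mg"}, 18.0),
    ({"Li", "Mg"}, 20.0),
    ({"Na", "Mg"}, 9.0),
    ({"Na", "Ca"}, 11.0),
    ({"Li", "Ca"}, 19.0),
    ({"Na", "W"}, 10.0),
    ({"Li", "Na"}, 17.0),
]
best = best_presence_split(toy)
```

Because each subsequent split is computed only within the groups produced by the previous one, a different first split changes all later candidates, which is the source of the instability discussed above.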
As a result of the applied statistical analyses, Zavyalova et al. identified a group of 18 "key elements": Sr, Ba, Mg, Ca; La, Nd, Sm; Ga, Bi, Mo, W, Mn, Re; Li, Na, Cs; F, Cl. For binary combinations of these elements, ANOVA was applied to find those combinations contributing significantly to C2 selectivity or yield. The extent of their contribution was evaluated by comparing mean values of C2 selectivity or yield in the presence or absence of the respective element combinations, within the group of well-performing catalysts. A list of element combinations is given that contribute positively towards C2 selectivity or yield. Additionally, Zavyalova et al. performed a regression tree analysis using molar fractions of the "key elements" as criteria for group splitting. However, the differences in mean C2 selectivity or yield between the subgroups formed in this regression tree analysis are rather small, and again no information on the statistical significance of the differentiation is given. Some "key elements" were selected multiple times with different molar fractions used as split points. However, no consistent trend between molar fraction and subgroup performance is obtained.
Throughout the statistical analysis performed by Zavyalova et al., the reaction conditions applied to obtain catalytic data are disregarded. This is a severe simplification, as reaction conditions (temperature, methane / oxygen ratio) have a strong influence on OCM performance. Furthermore, the assignment of elements as either support or active components is inconsistent. For example, in the dataset elements such as Mg, Ca, La appear as both cation components and catalyst support. This is because no objective definition for support components was applied, rather the subjective assignment of the various authors referenced in the dataset was adopted.
The results of the statistical evaluation by Zavyalova et al. are correlations between elemental catalyst composition and OCM performance. From this, a group of "key elements" that positively influence the catalyst performance is derived. However, no direct correlations between catalyst properties and performance are obtained. The catalyst properties discussed for high-performance OCM catalysts are not based on the results of the statistical analysis. It is merely pointed out that some of the "key elements" exhibit strong basicity as a property. One of the main conclusions resulting from the statistical evaluation by Zavyalova et al. is that in order to obtain advanced knowledge, it is required to consider catalyst properties rather than only elemental compositions.

Supplementary Note 2 OCM performance prediction and experimental validation based on statistical analysis by Kondratenko et al.
A work by Kondratenko et al. [6] attempted to validate the identified correlations between elemental catalyst composition and OCM performance reported by Zavyalova et al. and to evaluate the potential of applying statistical models based on the OCM dataset to predict the OCM performance of multicomponent materials. To achieve this, the authors compared predicted C2 yields derived from neural networks with experimentally measured C2 yields.
The employed clustered radial basis function networks were trained by minimizing various error models. The data used for training and prediction was a subset of the OCM dataset that consisted of catalysts containing only components from a pool of nine elements (Li, Na, Cs, Mg, Sr, Ba, La, Mn, W). The independent variables were the molar fractions of the elements and the dependent variable was the C2 yield. Reaction conditions (temperature, methane / oxygen ratio) were not regarded in the model. To predict C2 yields, weighted averages over neural networks that differed in their model details were employed. The C2 yields of catalysts with 42 different ternary element combinations were predicted. For each ternary combination, the element molar fractions were chosen such that the predicted C2 yield was maximized. For the experimental study, 42 catalysts with different ternary element combinations were synthesized using incipient wetness impregnation. The OCM performance of the catalysts was tested in a parallel reactor.
The predicted and experimental C2 yields show noticeable discrepancies. This is also indicated by a significance test comparing the distribution of predicted and experimental C2 yields. For some catalyst systems, the null hypothesis of equal yield distribution can be rejected with high significance. For other catalyst systems, the null hypothesis cannot be rejected at a typical significance level of 5%. However, this does not necessarily indicate equal yield distribution of predicted and experimental C2 yields, as the applied significance test can generally only reject, but not confirm, the null hypothesis. The authors discuss that the discrepancies can be caused either by inappropriate models or by strong deviations in the data used for model training. By comparing the extent of variation with the model error, the authors show that the employed models are appropriate for most of the catalyst systems. This means that the main reason for the discrepancy between predicted and experimental C2 yields is deviations within the dataset. This is plausible, as the performance data in the OCM dataset were obtained under strongly varying reaction conditions. The reaction conditions have a strong influence on the catalyst structure and properties (phases present, catalyst surface, adsorbed gas species, reaction kinetics), as well as on overall catalytic performance. These effects were ignored, as only elemental catalyst composition was used in the modelling.

The following tables list all the corrections and changes that we applied to the original OCM dataset. The columns "Publication number" and "ID number" identify the respective catalyst entry in the naming convention of the original dataset. The "Publication number" is a number that identifies the respective publication; ID number n addresses the nth included observation from the respective publication. Figure 4 illustrates the corrected dataset. Variables with white background colour were employed in our current statistical evaluation.
The complete corrected dataset (Microsoft Excel) is provided in the file "20160720_corrected-dataset.xls".

The original work of Zavyalova et al. assigned each element contained in a catalyst to a fixed category of either "Cation", "Anion", "Promotor" or "Support" based on the authors' intuition and experience. We kept these categories in the corrected dataset structure in order to remain consistent with the original dataset. However, these pre-assigned categories were not used in our statistical evaluation. Instead, the respective catalyst functions were assigned via objective and quantitative descriptor and sorting rules (see Figure 1 in the main paper).

The complete table of element properties (Microsoft Excel) is provided as "20160811_element-properties.xls". The element properties are employed in the descriptor rules to assign descriptors for more complex properties to catalyst compounds.

Supplementary Figure 5 Structure and content of the compiled element property table. The contained element properties can be used via descriptor rules in order to calculate and assign descriptors to a catalyst that reflect its chemical or physical properties based on its elemental composition.
Columns 1 to 10 of the table are self-explanatory and reflect the element's name, atomic number and position in the periodic table of elements.
The ability to form a carbonate was assigned based on information available from literature. For example, Si does not form a carbonate under normal conditions [7] .
For each cation, the oxidation number and mass per cation is provided for the most thermally stable oxide known for the respective element.
The thermal stability of the most stable oxide with respect to sintering and the associated loss of surface area was evaluated via the Tammann temperature [8]. The Tammann temperature can be calculated from the oxide's melting point (in K) multiplied by a prefactor. The property table therefore lists the melting temperature of the respective stable oxide and provides the corresponding literature reference. Tammann temperatures were calculated in our analysis via descriptor rules employing different prefactors (see e.g. Supplementary Note 14).
One out of several possible indicators for the basicity of an oxide is the thermal stability of the corresponding carbonate. The thermal stability of the carbonate can be derived from its decomposition temperature, which can be measured experimentally via the mass loss in a thermogravimetric analysis (TGA). We measured the respective temperature at which the carbonates of the alkali elements (Li, Na, K, Rb, Cs) decompose into the corresponding oxide via TGA (10 K/min, Ar flow). Where possible, the intersection of tangent lines through the inflection points of the mass loss curve was used to derive the decomposition temperature. In the case of multiple decomposition steps (i.e. lanthanide carbonates), the last decomposition step was evaluated. All other carbonate decomposition temperatures were taken from literature as indicated for each element directly inside the element-property table.

The descriptor rules are employed to assign more complex physicochemical properties to possible catalyst compounds based on catalyst composition and element properties. This results in a table with property descriptors for each catalyst compound. Supplementary Figure 6 illustrates the structure of the descriptor table for a small subsection. The complete descriptor table (Microsoft Excel) is provided in "20160729b_corrected_preprocessed_dataset_descriptors.xlsx".
Supplementary Figure 6 Illustration of the structure of the descriptor table computed via descriptor rules. The shown columns indicate for each catalyst (columns 1…3) contained in the dataset and for each contained element e.g. if a present cation could be considered to act as a support ("is_support"), is able to form a carbonate ("is_carbonate") and if the carbonate is thermally stable at the respective studied OCM temperature ("is_stabcarb").

The role of a catalyst support is to provide sufficient surface area for interaction between the catalyst and the gas phase. To reflect this functionality, two criteria were defined that a component had to fulfil in order to be considered as a support: 1) sufficient weight fraction in the catalyst and 2) sufficient thermal stability under reaction conditions.
For criterion 1), the support component must be present in sufficiently high quantity within the catalyst composition in order to provide enough material to disperse the other catalyst components. The weight fraction of an oxide component proved to be a suitable representation of the material amount. Equation (1) defines the weight fraction criterion for a component to fulfil the support function.
Setting the required weight fraction to 50% in a typical analysis ensured that the support function could be assigned to only one component of the catalyst. The robustness tests reported in Supplementary Note 14 also show the results for a variation of this parameter.
For criterion 2), a support should be thermally stable to keep its surface area under reaction conditions. The Tammann temperature [8] describes the onset of sintering and loss of surface area. The Tammann temperature is often calculated by multiplying a compound's melting point by a prefactor. However, there is no general agreement on the exact value of this Tammann factor. Typical values provided in literature range from 0.37 [9][10][11] to 0.75 [12][13]. We used a value of 0.60 as criterion for all oxides in the final model, which is close to the value for typical support oxides such as Al2O3 and SiO2 [14]. Moreover, the Tammann factor was also subject to a parameter variation (see robustness tests in Supplementary Note 14 for details).
The respective descriptor rule compared the Tammann temperature assigned to the oxide of a catalyst component to the temperature of the corresponding OCM measurement as reported in the dataset. A component was therefore considered to provide a thermally stable support if the respective OCM temperature was lower than a threshold value derived from the oxide's melting point. However, the same component may not fulfil the stability criterion if a catalyst was measured at a higher OCM temperature. Formula (2) defines the thermal stability criterion for a component to fulfil the support function.
A catalyst component was credited with a possible support function if at least one component of the catalyst fulfilled both the weight fraction criterion (1) and the thermal stability criterion (2).
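The two support criteria can be sketched as a small descriptor rule. This is a minimal sketch under stated assumptions: the function names are hypothetical, the comparison operators (at least 50% weight fraction, OCM temperature strictly below the Tammann temperature) are one reading of the text, and the melting points in the usage lines are rounded illustrative numbers rather than entries of the authors' property table.

```python
def tammann_temperature(melting_point_k, factor=0.60):
    """Tammann temperature in K: oxide melting point (K) times a prefactor."""
    return factor * melting_point_k


def is_support(weight_fraction, melting_point_k, t_ocm_k,
               min_weight_fraction=0.50, factor=0.60):
    """True if a component meets both support criteria:
    1) its oxide weight fraction is at least min_weight_fraction (assumed >=),
    2) the OCM temperature lies below the oxide's Tammann temperature."""
    enough_material = weight_fraction >= min_weight_fraction
    thermally_stable = t_ocm_k < tammann_temperature(melting_point_k, factor)
    return enough_material and thermally_stable


# Illustrative values only (hypothetical oxides, temperatures in K).
refractory_major = is_support(0.90, 3100.0, 1023.0)        # high-melting, 90 wt%
refractory_minor = is_support(0.10, 3100.0, 1023.0)        # too little material
low_melting_major = is_support(0.90, 1700.0, 1073.0)       # sinters at 1073 K
low_melting_high_factor = is_support(0.90, 1700.0, 1073.0, factor=0.75)
```

The last two lines show why the prefactor matters: the same oxide at the same OCM temperature passes or fails the stability criterion depending on the chosen Tammann factor, which is what the parameter variation in Supplementary Note 14 probes.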
The ability of a catalyst component to form a thermodynamically stable carbonate also at the typically rather high temperatures of the OCM reaction is one possible indicator for strong basicity of a catalyst component. Moreover, if a carbonate is potentially present under OCM reaction conditions, it has the possibility to directly contribute to the overall reactivity of the catalyst.
To estimate whether a carbonate can be stable under the respective OCM conditions, the decomposition temperatures of the carbonates that can be formed from the elements contained in a catalyst were individually compared to the respective temperature of the OCM measurement for a given catalyst. This means that the same carbonate may be considered thermodynamically stable or not, depending on the temperature employed for the reported OCM experiment.
However, the OCM reaction can form a significant amount of CO2 as a reaction byproduct. The CO2 enters a chemical equilibrium between oxide and carbonate, thus stabilizing the carbonate against decomposition. Moreover, the carbonate's decomposition temperature in TGA analysis describes only the onset of mass loss, and complete decomposition may only be obtained at higher temperatures. To account for these effects, an offset was introduced when comparing carbonate decomposition and measurement temperature, as shown in formula (3). The value was set to 100 K for the final model.
However, other values were tested systematically (see robustness tests in Supplementary Note 14).
If the component of a catalyst with the highest carbonate decomposition temperature fulfils the criterion given by formula (3), that component is assigned the property of the thermodynamically stable carbonate.
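The carbonate stability rule can be sketched in the same way. Formula (3) is not reproduced in this text, so the exact form of the comparison is an assumption: here the carbonate is treated as stable if the OCM temperature lies below the decomposition temperature plus the offset. The function name and the temperatures are hypothetical.

```python
def has_stable_carbonate(decomposition_temps_k, t_ocm_k, offset_k=100.0):
    """Check the component with the highest carbonate decomposition temperature.

    decomposition_temps_k maps element -> decomposition temperature in K,
    or None if the element does not form a carbonate.
    Assumed criterion: T_OCM < T_decomposition + offset.
    """
    candidates = [t for t in decomposition_temps_k.values() if t is not None]
    if not candidates:
        return False  # no component can form a carbonate at all
    return t_ocm_k < max(candidates) + offset_k


# Hypothetical two-component catalyst, measured at two OCM temperatures.
catalyst = {"A": 1000.0, "B": 600.0}
stable_at_1073 = has_stable_carbonate(catalyst, 1073.0)   # 1073 < 1100
stable_at_1123 = has_stable_carbonate(catalyst, 1123.0)   # 1123 >= 1100
no_carbonate = has_stable_carbonate({"C": None}, 1073.0)
```

The same catalyst changes its assignment between the two measurement temperatures, which mirrors the statement that carbonate stability is evaluated per observation and not per material.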

Supplementary Note 7 Sorting rules for catalyst property groups
Sorting rules were applied in order to assign catalysts of similar properties into so-called property groups based on the specific formulation of a chemical hypothesis. The sorting results in a table with assigned property groups for each catalyst. Supplementary Figure 7 illustrates the resulting table. The complete table of property groups (Microsoft Excel) is provided for the final model in "20160729b_corrected_preprocessed_dataset_property_groups.xlsx".

For hypothesis 1, if a catalyst contains at least one element that can form a carbonate, it is located in the group "Can form a carbonate". Otherwise it is assigned to the group "Cannot form a carbonate".
In hypothesis 2, only catalysts from the "Can form a carbonate" group from the first hypothesis were further evaluated. If assigned to the "Unsupported" group, none of the catalyst components fulfilled the support function criteria. In the "Supported" group, one component fulfilled the support function and, additionally, another component besides the support was able to form a carbonate. The "Self-Supported" group was defined as catalysts that contain only one component, with this component also fulfilling the support function criteria. Catalysts in the "Not assigned" group did not fit in any of the three previously described groups. This is e.g. the case whenever there is one support component, but none of the other components has the ability to form a carbonate.
In hypothesis 3, catalysts from the "Supported" group were further analyzed. If the support component had the ability to form a carbonate, the catalyst was placed in the group "Carbonate supported". Otherwise the support component was assumed not to be able to form a carbonate, and the catalyst was assigned to the group "Oxide supported".
Hypothesis 4 was applied to all catalysts contained in the property groups deduced by hypotheses 2 and 3. In hypothesis 4, catalysts were further assigned to subgroups based on the stability of the carbonate with the highest decomposition temperature among all components of the catalyst. If the formation of a carbonate is thermodynamically possible for this component, the catalyst is located in the respective subgroup "At least one thermodynamically stable carbonate". Otherwise, the catalyst was assigned to the subgroup "No thermodynamically stable carbonate".
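The sorting rules for hypotheses 1 to 3 can be sketched as nested decisions on per-component descriptor flags. This is a minimal sketch, not the authors' code: the dictionary keys and function names are hypothetical, the group labels are taken from the text, and it assumes that at most one component carries the support flag (as ensured by the 50% weight fraction criterion).

```python
def hypothesis_1(components):
    """Group by whether any component can form a carbonate."""
    if any(c["forms_carbonate"] for c in components):
        return "Can form a carbonate"
    return "Cannot form a carbonate"


def hypothesis_2(components):
    """Support-related group for catalysts that can form a carbonate."""
    supports = [c for c in components if c["is_support"]]
    if not supports:
        return "Unsupported"
    if len(components) == 1:
        return "Self-Supported"
    support = supports[0]  # assumption: only one support component exists
    others = [c for c in components if c is not support]
    if any(c["forms_carbonate"] for c in others):
        return "Supported"
    return "Not assigned"


def hypothesis_3(components):
    """Distinguish supported catalysts by the nature of the support."""
    support = next(c for c in components if c["is_support"])
    return "Carbonate supported" if support["forms_carbonate"] else "Oxide supported"


def sort_catalyst(components):
    """Apply hypotheses 1 to 3 in sequence and collect the group labels."""
    groups = {"h1": hypothesis_1(components)}
    if groups["h1"] == "Can form a carbonate":
        groups["h2"] = hypothesis_2(components)
        if groups["h2"] == "Supported":
            groups["h3"] = hypothesis_3(components)
    return groups


# Hypothetical descriptor flags for two two-component catalysts.
dopant = {"forms_carbonate": True, "is_support": False}
carbonate_support = {"forms_carbonate": True, "is_support": True}
oxide_support = {"forms_carbonate": False, "is_support": True}

groups_a = sort_catalyst([dopant, carbonate_support])
groups_b = sort_catalyst([dopant, oxide_support])
```

Hypothesis 4 would add one further label per catalyst, derived from the carbonate stability descriptor at the respective OCM temperature.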

Supplementary Note 8 Applied regression model and testing for statistical significance
The presented approach divides the catalysts contained in the dataset into groups based on sorting rules that use catalyst properties as a criterion for the group assignment. Then, two resulting groups are selected and compared to each other with respect to differences in OCM performance between the two groups. The comparison employs a multiple regression analysis that can also account, to some extent, for the scatter in individual experimental conditions (temperature, $p_{CH4}/p_{O2}$) reported for each catalyst.
The choice of a target variable that adequately represents "OCM performance" for a given catalyst is not trivial. The final model employs the C2 yield as performance indicator. The dataset provided for each catalyst the conversion of methane and oxygen, the selectivity to COx, ethane, ethylene, and the sum of ethane and ethylene, as well as the yield of C2 products. The available parameters do not allow the calculation of a (kinetic) rate of C2 formation for several reasons (data not measured under isothermal conditions / kinetic regime / differential reactor, etc.). Using methane conversion would not allow distinguishing between the desired formation of C2 products and undesired methane combustion. Selectivity is also a poor measure because methane conversion and the ratio of CH4/O2 partial pressure in the feed vary widely between different catalysts. Using $Y_{C2}$ as performance indicator represents a necessary compromise in light of the availability of data. However, the literature also typically employs $Y_{C2}$ for a comparison, e.g. as criterion for the economic viability of the OCM process [15,16].
The analysis was always performed by comparing two catalyst groups to each other. The comparison employs a simple equation (eq. 1) that describes $Y_{C2}$ as a function of the group assignment, reaction temperature and the feed pressure ratio $p_{CH4}/p_{O2}$. A dummy variable $D_{group,i}$ reflects if a catalyst belongs to one ("0") or the other group ("1"). The comparison is complicated by the fact that the catalytic tests reported in the dataset were not conducted under the same reaction conditions. However, the employed reaction conditions influence the obtained C2 yield. To compensate for this effect, the individual reaction conditions temperature and CH4/O2 partial pressure ratio also enter the applied regression as variables. Due to a lack of better correlations that describe the complex reaction kinetics, and supported by the distribution of experimental results, a linear dependence on these variables is assumed:

$$Y_{C2,i} = \beta_0 + \beta_{group} D_{group,i} + \beta_T \Delta T_i + \beta_{CH4/O2} \Delta p_{CH4/O2,i} + \varepsilon_i \qquad (1)$$

In equation (1), each catalyst $i$ in the dataset has the following variables:
- $Y_{C2,i}$ is the C2 yield of the catalyst as reported in the dataset and described by three independent variables $D_{group,i}$, $\Delta T_i$ and $\Delta p_{CH4/O2,i}$.
- $D_{group,i}$ is a dummy variable with value either 0 or 1 describing the position of catalyst $i$ in one of the two catalyst groups that are compared.
- $\Delta T_i$ is the difference between the measurement temperature applied to catalyst $i$ and the mean measurement temperature of all catalysts in the dataset.
- $p_{CH4/O2}$ is the CH4/O2 partial pressure ratio. Accordingly, $\Delta p_{CH4/O2,i}$ is the difference between the measurement ratio applied to catalyst $i$ and the mean measurement ratio of all catalysts in the dataset.
The obtained $\beta$ regression coefficients provide valuable quantitative information on the effect of each independent variable on the C2 yield. $\beta_{group}$ quantifies the effect of the tested catalyst properties (i.e. group assignment) on OCM performance in terms of the difference in C2 yield between the two groups. However, it should be kept in mind that $\beta_{group}$ is not strictly identical to the difference of the reported mean C2 yields of the two groups, because the regression equation includes a compensation for the different reaction parameters of each catalyst (see e.g. Supplementary Note 14 for a robustness test that compares the regression results with and without correction for differences in T and $p_{CH4}/p_{O2}$).
Moreover, the regression coefficients $\beta_T$ and $\beta_{CH4/O2}$ provide a rough estimate of the extent to which the reaction parameters temperature and CH4/O2 ratio contribute to the observed differences in C2 yield. The typically obtained positive values of $\beta_T$ and negative values of $\beta_{CH4/O2}$ reflect the general observation that high temperatures and high oxygen partial pressures often lead to increased C2 yields (see e.g. Supplementary Note 9).
The statistical significance of the information derived from each $\beta$ regression coefficient was evaluated by a t-test. The t-test relates the derived value of the respective $\beta$ regression coefficient to its standard error estimated from the observations contained in the evaluated subsection of the dataset (i.e. the compared groups). From this, the p-value for each $\beta$ regression coefficient is calculated. The p-value describes the significance of the observed effect (e.g. a difference in $Y_{C2}$ between the two compared groups) by the probability that an estimated $\beta$ value is concluded to be different from zero although the true value is zero. A low p-value indicates a low probability of erroneously assigning an effect, and therefore corresponds to a high statistical significance of the observed effect. Usually a confidence of at least 95% ($p \le 0.050$) is chosen as requirement for sufficient statistical significance.
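The regression and the t-statistics described above can be sketched with ordinary least squares. This is a minimal, standard-library-only sketch on synthetic data, not the authors' implementation: it assumes a linear model with an intercept, a 0/1 group dummy and mean-centred temperature and CH4/O2 ratio, and all names and numbers are hypothetical. It stops at the t-values; turning them into p-values requires the Student t distribution, which the Python standard library does not provide.

```python
import math
import random
from statistics import fmean


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_group_regression(yields, groups, temps, ratios):
    """Fit Y = b0 + b_group*D + b_T*dT + b_ratio*dRatio by least squares.

    Returns (beta, std_err, t_values), each ordered as
    [intercept, group, temperature, ratio].
    """
    t_mean, r_mean = fmean(temps), fmean(ratios)
    x = [[1.0, float(d), t - t_mean, r - r_mean]
         for d, t, r in zip(groups, temps, ratios)]
    k = 4
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(x, yields)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    residuals = [y - sum(b * v for b, v in zip(beta, row))
                 for row, y in zip(x, yields)]
    sigma2 = sum(e * e for e in residuals) / (len(yields) - k)
    std_err = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        # i-th diagonal element of the inverse of X^T X
        std_err.append(math.sqrt(sigma2 * solve_linear(xtx, unit)[i]))
    t_values = [b / se for b, se in zip(beta, std_err)]
    return beta, std_err, t_values


# Synthetic observations with a known group effect of +5 yield points.
rng = random.Random(0)
n_obs = 200
groups = [i % 2 for i in range(n_obs)]
temps = [rng.uniform(900.0, 1200.0) for _ in range(n_obs)]
ratios = [rng.uniform(1.0, 10.0) for _ in range(n_obs)]
yields = [10.0 + 5.0 * d + 0.012 * (t - 1050.0) - 0.4 * (r - 5.5) + rng.gauss(0.0, 1.0)
          for d, t, r in zip(groups, temps, ratios)]
beta, std_err, t_values = fit_group_regression(yields, groups, temps, ratios)
```

Centring the temperature and the ratio on their dataset means leaves the slope coefficients unchanged and only shifts the intercept, so the group coefficient can be read as the yield difference between the two groups at average reaction conditions.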

Supplementary Note 9 Typical regression parameters and results
The component-plus-residual (CPR) plots of the regression according to equation (1) are discussed here by way of example for hypothesis 1 in order to illustrate the general regression procedure and outcome. In hypothesis 1 all catalysts were divided into two property groups, depending on whether they contain at least one element that can form a carbonate (group 1b) or not (group 1a). Supplementary Figure 9a/b/c shows CPR plots for all three independent variables, i.e. property group, temperature and CH4/O2 ratio, corresponding to hypothesis 1.
The value of the property-group regression coefficient, β = +7.62, indicates that catalysts in the dataset that are able to form a carbonate show on average a C 2 yield that is 7.62 percentage points higher than catalysts that do not contain an element that can form a carbonate. The value p = 0.000 indicates a high significance of the correlation. An interpretation of this value in the context of OCM is given in the discussion of the first hypothesis.
The regression coefficient obtained for the measurement temperature amounts to β T = +0.012. The respective p-value amounts to p = 0.000, which indicates a high statistical significance. Hence, C 2 yields increase on average by 0.012 percentage points per Kelvin, or by 1.2 percentage points when the measurement temperature is increased by 100 K. This positive correlation relates to the fact that methane conversion typically increases with temperature in OCM and that a high temperature and complete methane conversion also often contribute to a higher C 2 selectivity [18][19][20][21][22]. However, the positive trend by no means indicates that an increased temperature induces an increased C 2 yield for each individual catalyst: it represents only a general trend observed over the many different catalysts contained in the studied dataset.
The regression coefficient β obtained for the influence of the pressure ratio CH 4 / O 2 in the feed amounts to −0.431. The respective p-value amounts to p = 0.000, which indicates a high statistical significance. Hence, a ratio that is increased by 1, e.g. by changing p CH4 :p O2 from 2:2 to 2:1, results in a C 2 yield that is on average 0.431 percentage points lower when averaged over the studied dataset. This negative correlation can be explained by a low oxygen concentration limiting the conversion of methane. While such a change can increase the C 2 selectivity, the lower methane conversion results in a lower overall C 2 yield. Studies that varied the CH 4 / O 2 ratio for various catalysts report consistent observations for individual catalysts [20][21][22].
The same data as in Supplementary Figure 9 a/b/c can also be presented as a residual analysis without the components, i.e. with the residual values plotted against the independent variables, as shown in Supplementary Figure 9 d/e/f. The slopes of the linear fits become zero and the residuals follow a reasonably random distribution. For catalysts exhibiting noticeably high positive residuals, the compositions are shown. Judging by the compositions, these appear to be typical OCM catalysts. Therefore, it can be assumed that the deviation is largely caused by variance in the performance results and not by special catalyst properties.
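The connection between the two plot types can be illustrated with a small numerical sketch. The Python example below is a simplification under stated assumptions: it uses a single independent variable (temperature) instead of the multivariate regression, and the data points and the helper name fit_line are hypothetical. It shows that a linear fit through the component-plus-residual values recovers the regression coefficient, whereas a fit through the residuals alone has zero slope.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: temperature in K and C2 yield in %.
temperature = [973.0, 1023.0, 1073.0, 1123.0]
c2_yield = [8.0, 9.5, 9.0, 11.5]

intercept, beta_t = fit_line(temperature, c2_yield)
residuals = [y - (intercept + beta_t * t) for t, y in zip(temperature, c2_yield)]
# Component plus residual: contribution of the variable plus the residual.
cpr = [beta_t * t + r for t, r in zip(temperature, residuals)]

_, cpr_slope = fit_line(temperature, cpr)            # equals beta_t
_, residual_slope = fit_line(temperature, residuals)  # equals zero
```

In this toy dataset the fitted coefficient is 0.02 percentage points per Kelvin, i.e. 2 percentage points per 100 K, which mirrors the way the β T value is interpreted in the text.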

Supplementary Note 10 Number of observations of cation combinations within property groups
The number of literature reports, and therefore also the number of corresponding dataset entries, varies significantly between very popular catalyst systems (e.g. Li/MgO, 72 observations) and less frequently studied systems (e.g. Li/CaO, 27 observations). The number of observations can be deduced by counting the unique cation combinations contained in the dataset. The applied procedure produces cation combinations labelled e.g. "Li_Mg" for a catalyst containing Li and Mg, or "Ca_Pb" for a catalyst containing Ca and Pb, irrespective of the chemical state, phase composition, element concentration etc. of the catalyst. Supplementary Figure 10 illustrates the most frequent cation combinations in the different groups produced by the final model. Note that the cation combinations are unique and do not overlap with each other, i.e. all catalysts containing only Na and Ca in combination are assigned to "Ca_Na" and are not counted among e.g. the catalysts that contain only Ca ("Ca"). The complete table of cation combinations within all property groups (Microsoft Excel) is provided in the supplementary file "20160713_list_cat-combination_property-groups.xlsx".
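The counting procedure can be sketched in a few lines of Python. This is an illustrative assumption about how such labels may be generated, not the script used for the analysis: the function name and the example entries are hypothetical, and alphabetical ordering of the element symbols is assumed because it reproduces the labels quoted above ("Li_Mg", "Ca_Pb", "Ca_Na").

```python
from collections import Counter


def cation_label(cations):
    """Build an order-independent label such as "Li_Mg" from a list of cations."""
    return "_".join(sorted(set(cations)))


# Hypothetical dataset entries: each entry lists the cations of one catalyst.
entries = [["Mg", "Li"], ["Li", "Mg"], ["Ca"], ["Na", "Ca"], ["Ca", "Pb"]]

# Each catalyst contributes to exactly one label, so the labels do not overlap.
counts = Counter(cation_label(entry) for entry in entries)
```

Because the label is built from the complete set of cations, a catalyst containing Na and Ca is counted under "Ca_Na" only and never under "Ca".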
The largest significant differences are observed when the group "Cannot form a carbonate" (column 1a) is compared to group 1b ("Can form a carbonate", β = 7.51), group 2b ("Supported", β = 8.43), group 3a ("Carbonate supported", β = 9.31) and group 4d ("At least one carbonate thermodynamically stable", β = 9.68), where β denotes the property-group regression coefficient. It is clear from the results that the β values increase from hypothesis level 1 to 4, i.e. with increasing complexity and specificity of the hypothesis expression. The table can serve as a guideline for experiments that aim to validate or investigate the property-performance relationships indicated by the statistical analysis. One could e.g. select catalysts from two property groups that show a large difference in Y C2 and a low p-value, and study these selected catalysts e.g. spectroscopically with respect to the physico-chemical property that distinguishes the two corresponding groups. Such a study (preferably performed in situ under OCM conditions) could confirm the presence or absence of the respective property.
If, for example, one chose to study the role of the thermodynamic stability of a carbonate spectroscopically under OCM conditions, selecting oxide supported catalysts from groups 4e and 4f (β = +5.11) would be recommended from a statistical point of view over carbonate supported (4c/4d, +2.36), unsupported (4a/4b, +2.04) and self-supported (4g/4h, -0.05) catalysts (see β values in Fig. 4 and Supplementary Table 11a). The influence of the support's ability to form a carbonate could be evaluated experimentally by comparing groups 4c and 4e (+4.89) instead of 3a and 3b (+3.30).

Supplementary Note 11b Final model: Matrix of β T regression coefficients and their p-values
Each multivariate regression analysis with temperature compensation also produces a matrix of β T values and the corresponding p-values. The matrix resulting for the final model is provided below.
The large majority of regression tests indicate a positive and statistically significant influence of temperature on C 2 yield.
Each multivariate regression analysis with p CH4 /p O2 compensation also produces a matrix of β values for the p CH4 /p O2 ratio and the corresponding p-values. The matrix resulting for the final model is provided below. The large majority of regression tests indicate a statistically significant decrease in C 2 yield with increasing p CH4 /p O2 ratio.

Supplementary Table 11c
Matrix of β regression coefficients for the p CH4 /p O2 ratio. Each cell gives the β regression coefficient and p-value between a pair of property groups.

Supplementary Note 12 Generation of density plots of Y C2 via Epanechnikov kernel density function
The proposed method applies decision rules in order to sort all reported catalysts into groups of similar physico-chemical properties. For each catalyst i, the dataset also reports an indicator of OCM performance, i.e. Y C2,i . The OCM performance of the resulting property groups can be evaluated based on the distribution of Y C2,i values within each group as well as on the deduced group-averaged Ȳ C2 . So-called density plots of C 2 yield were generated via the Epanechnikov kernel density function in order to visualize the Y C2 distribution for each group (see e.g. the Y C2 density plots in Fig. 4 of the main article).
A kernel density function is a nonparametric method that smooths random data variability while still illustrating systematic variability [23]. The basis of kernel densities is the principle of local averaging. To calculate a local average at a certain value of estimation x, not only observations featuring exactly the value x are considered in the calculation, but also the surrounding observations with values between x − h and x + h, where h is called the bandwidth. Observations x i close to the value of estimation x typically receive larger weights in the calculation of a weighted average [23]. The variable u denotes the distance between a considered observation x i and the value of estimation x, divided by the bandwidth h.
K(u) = 0.75 (1 − u²) for |u| ≤ 1, and K(u) = 0 otherwise (2)
Equation (2) shows how the kernel weight K(u) is calculated using the Epanechnikov kernel density function [24]. K(u) indicates whether and to what extent an observation x i is considered in the calculation of the density at the value of estimation x.
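The local averaging described above can be sketched as follows. The Python example is a minimal illustration rather than the plotting routine used for the figures: the function names, the yield values and the bandwidth of 5 percentage points are assumptions chosen for readability, and the kernel is the standard Epanechnikov form with a maximum weight of 0.75 at u = 0 and a cutoff at |u| = 1.

```python
def epanechnikov(u):
    """Kernel weight K(u) = 0.75 * (1 - u**2) for |u| <= 1, and 0 otherwise."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, observations, h):
    """Density at x: mean kernel weight of all observations, scaled by 1/h."""
    weights = (epanechnikov((x_i - x) / h) for x_i in observations)
    return sum(weights) / (len(observations) * h)


# Hypothetical C2 yields in % and an arbitrarily chosen bandwidth.
yields = [2.0, 4.0, 5.0, 12.0, 18.0]
density_at_5 = kernel_density(5.0, yields, h=5.0)
```

In this example only the observations at 2, 4 and 5% lie within the bandwidth around x = 5% and contribute weights of 0.48, 0.72 and 0.75, which gives a density of 0.078 per percentage point; the observations at 12 and 18% are cut off.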
Observations located exactly at the point of estimation x receive the largest weight of 0.75. Supplementary Figure 12 illustrates the shape of the Epanechnikov kernel density. The Epanechnikov kernel density thus applies both a weight and a cutoff limit.
Group 4e ("Oxide supported → No thermodynamically stable carbonate") typically contains cation compositions that are binary combinations of Pb or transition metals (i.e. Mn, Co) supported on Al or Si. In contrast, group 4f ("Oxide supported → At least one carbonate thermodynamically stable") is dominated by variations of the Na-Mn-W-Si system, in which Na is able to form a carbonate under OCM conditions. Additionally, binary combinations of alkali metals or alkaline earth metals with Al, Si, or Ti are located in this group.
The subgroups 4g and 4h of the "Self-supported" group represent catalysts that contain only one cation. The regression results indicate that the catalyst property "Carbonate thermodynamically stable" has a clearly beneficial influence on OCM performance. For the support property group 2a ("Unsupported", subgroups 4a vs. 4b), the property-group regression coefficient amounts to β = +2.04 with sufficient statistical significance. For the groups 3a ("Carbonate supported", 4c vs. 4d) and 3b ("Oxide supported", 4e vs. 4f), the respective values of β = +2.36 and β = +5.11 are also high and statistically significant. Hence, the positive effect is confirmed for three different types of catalyst groups. Only for group 2c ("Self-supported", 4g vs. 4h) was no statistically significant result obtained. This might be related to the fact that all catalysts in group 2c contain only one cation and can therefore not provide the "support function" and the "carbonate function" via two different cations.

Supplementary Note 14 Robustness test: Variation of descriptor rule parameters for definition of property groups
An essential criterion in the evaluation of model quality is the robustness of its findings against modifications of model parameters and regression variables. The robustness of the final model (Figure 3, Figure 4) was therefore assessed via systematic variation of the parameter values employed by the sorting rules (4) (Figure 2). The varied parameters include (i) the minimum weight fraction required for an oxide to be considered as support (level 2), (ii) the factor used to calculate the Tammann temperature from the melting point of the most stable oxide (level 2), and (iii) the offset between the reported OCM temperature and the decomposition temperature of the most stable carbonate formed by a catalyst (level 4). Moreover, the regression was performed (iv) without correction for temperature and
Values that correspond to the final model (main article, Fig. 4) are marked in bold. The changes in property group compositions resulting from the parameter variation are discussed qualitatively below. The subsequently presented Supplementary Figures 14a-k provide the corresponding detailed regression results.
The vast majority of the main effects on Y C2 (approximate effect size, sign) identified by the final model (Fig. 4) were preserved throughout the parameter variations (Supplementary Figure 14a-k). Only in a few cases did some of the identified effects become statistically insignificant (e.g. 4a/4b in Supplementary Figure 14a, b, h, i). Moreover, most catalysts remained in their respective property group throughout the parameter variations. Hence, the final model proves to be very robust towards variation of the tested descriptor parameters.
A variation of the weight fraction parameter in the support descriptor rule (equation (1) in Supplementary Note 6) mainly shifts catalysts between the groups "Unsupported" and "Supported" in hypothesis 2. Increasing the threshold means that fewer catalyst components fulfil the criterion, so that the respective catalysts move into the group "Unsupported" (2a), and vice versa. This effect is more pronounced when the component that is assumed to provide the support function possesses a low molar mass (e.g. Al, Mg) than for components with higher molar mass (e.g. Pb, Bi). Values below 50% weight fraction allow more than one component of a catalyst to fulfil the support function, which can cause ambiguity in the third hypothesis.
A variation of the Tammann temperature factor (equation (2) in Supplementary Note 6) in the support descriptor rule mostly affected catalysts containing SiO 2 as a support. SiO 2 -based catalysts measured at 800 °C fulfil the support criterion only when the Tammann factor exceeds 0.54. However, increasing the factor further above 0.6 had very little influence on the assignment of the support function.
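The threshold of 0.54 quoted for SiO 2 can be checked with a short calculation. The Python sketch below rests on two assumptions that are not stated in this note: the support criterion is taken to require that the Tammann temperature (factor times the melting point in Kelvin) is at least as high as the reaction temperature, and the melting point of SiO 2 is taken as about 1986 K (1713 °C). The function names are hypothetical.

```python
def tammann_temperature(melting_point_k, factor):
    """Tammann temperature as a fixed fraction of the melting point (in K)."""
    return factor * melting_point_k


def is_thermally_stable(melting_point_k, reaction_temperature_k, factor):
    """Assumed criterion: Tammann temperature not below the reaction temperature."""
    return tammann_temperature(melting_point_k, factor) >= reaction_temperature_k


SIO2_MELTING_POINT_K = 1986.0  # assumed literature value, about 1713 °C
REACTION_TEMPERATURE_K = 800.0 + 273.15

# Smallest factor for which SiO2 still passes the assumed criterion at 800 °C.
threshold_factor = REACTION_TEMPERATURE_K / SIO2_MELTING_POINT_K

stable_with_0_5 = is_thermally_stable(SIO2_MELTING_POINT_K, REACTION_TEMPERATURE_K, 0.5)
stable_with_0_6 = is_thermally_stable(SIO2_MELTING_POINT_K, REACTION_TEMPERATURE_K, 0.6)
```

Under these assumptions the threshold factor evaluates to 0.54, in agreement with the value given above: a factor of 0.5 fails the criterion for SiO 2 at 800 °C, whereas a factor of 0.6 fulfils it.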
A variation of the temperature offset that evaluates the thermodynamic carbonate stability in hypothesis 4 (equation (3) in Supplementary Note 6) influences the group assignment of catalysts that contain specific elements forming their most stable carbonates. For example, decreasing the offset below 100 K causes Li 2 CO 3 to be considered as thermodynamically unstable for many catalysts. However, experimental studies indicate that Li 2 CO 3 is often present on such catalysts after OCM testing [25][26][27]. Increasing the offset value to 150 K causes mainly CaCO 3 and La 2 O 2 CO 3 to be considered as thermodynamically stable. However, at some point the subgroups "No thermodynamically stable carbonate" contain only a few remaining catalysts, which reduces the statistical value of the respective group comparison.

Supplementary Figure 14a
Variation of the wt% descriptor parameter to 30% and resulting catalyst property tree with hypothesis results. Reference numbers assigned to each property group are shown in blue. The number of catalysts (N) assigned to each property group is provided in the bottom right corner of each box and, for hypotheses 1, 2 and 3, is also indicated by the corresponding box width. Red arrows indicate the direction of each hypothesis that compares two property groups via regression analysis and are labeled with the obtained property-group regression coefficient β (~ ∆Ȳ C2 ) and p-value. The path towards the group with the highest OCM performance is marked by boxes with bold border lines.
However, a robust model should also provide similar results when each cation combination has a similar impact on the regression analysis. Hence, the regression was also performed with weights (the inverse of the number of catalysts with the same cation combination in the respective property group) attached to the squared residual of each Y C2 observation. If, for example, the cation combination "Li-Mg" appeared 72 times in the group "Can form a carbonate", a factor of 1/72 was attached to each observation of the cation combination "Li-Mg". The obtained regression results are presented in Supplementary Figure 16. The results indicate only minor differences when compared to the final model (Fig. 4), which did not include the respective weight factors. All main effects are preserved and remain statistically significant. The strongest change in the effect value ∆Ȳ C2 is observed for group 1a/1b (1.2 percentage points). Moreover, the Ȳ C2 computed for each terminal group also changes by less than 1.2 percentage points when compared to the final model (Fig. 4), which confirms the excellent robustness of the developed model.
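The weighting scheme described above can be sketched for a single property group. The Python example is a simplified illustration with hypothetical function names and data: instead of the full regression it uses a weighted mean, which is the value that minimizes the weighted sum of squared residuals for a model containing only a constant.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each observation by 1 / (number of observations with the same label)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def weighted_mean(values, weights):
    """Minimizer of the weighted sum of squared residuals for a constant model."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical property group: three "Li_Mg" observations and one "Ca_Na" observation.
labels = ["Li_Mg", "Li_Mg", "Li_Mg", "Ca_Na"]
yields = [12.0, 14.0, 10.0, 4.0]

weights = inverse_frequency_weights(labels)
balanced_mean = weighted_mean(yields, weights)
plain_mean = sum(yields) / len(yields)
```

In this toy group the unweighted mean of 10% is dominated by the frequently reported "Li_Mg" combination, whereas the weighted mean of 8% gives both cation combinations the same total influence.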

Supplementary Note 17 Robustness test: Regression towards ln(Y C2 ) as dependent variable
Throughout the statistical analysis, the significance of the obtained property-group regression coefficients β is expressed by the corresponding p-value. The p-value is derived from a t-test, for which a normal distribution is assumed. However, the yield distribution of the catalyst property groups deviates from a normal distribution (see yield density plots in Fig. 4 of the main article). The assumption of a normal distribution is better fulfilled when the data are transformed using a logarithm.
Therefore, the regression analysis was performed using ln(Y C2 ) instead of Y C2 as the dependent variable, according to equation (1). In case a catalyst showed Y C2 = 0.0%, the yield was arbitrarily set to Y C2 = 0.00001%, so that the observation could be included when applying the logarithm. The β regression coefficients relate to differences in ln(Y C2 ) between property groups and therefore have lower absolute values.
The obtained regression results are presented in Supplementary Figure 17. The results indicate that the main trends of the final model (Fig. 4) are retained. All hypotheses that are statistically significant (p ≤ 0.050) in the final model remain significant.
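The transformation of the yield values, including the treatment of zero yields, can be written down in a few lines. The Python sketch below uses a hypothetical function name; the replacement value of 0.00001% is the one stated in the text.

```python
import math

ZERO_YIELD_REPLACEMENT = 0.00001  # C2 yield in %, value stated in the text


def log_yield(y_c2_percent):
    """Return ln(Y C2), replacing a zero yield by a small positive value."""
    if y_c2_percent == 0.0:
        y_c2_percent = ZERO_YIELD_REPLACEMENT
    return math.log(y_c2_percent)


# Hypothetical yields in %, including one catalyst without any C2 formation.
yields = [0.0, 5.0, 12.6, 30.0]
log_yields = [log_yield(y) for y in yields]
```

Note that the replaced zero yields map to ln(0.00001) ≈ −11.5, far below the logarithm of typical yields, so the choice of the replacement value can influence the regression when many zero yields are present.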

Supplementary Note 18 Robustness test: Robust regression
A linear regression that minimizes the squared residuals is typically affected strongly by outliers. This can be avoided by (partially) excluding outliers from the regression. A so-called robust regression realizes the partial exclusion of outliers through M-estimators [28]. The general equations [29] are given by (1) and (2). The loss function takes as input u the residuals divided by their standard deviation. Residuals that are large (positive or negative) compared to their standard deviation cause large inputs u. If u exceeds k, the loss function returns its largest possible output of one. These points are rejected by equation (1) and are therefore not considered in the regression.
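A loss function with the behaviour described above can be sketched as follows. The specific form is an assumption: the text does not name the loss function, and the example uses Tukey's biweight, a common choice for M-estimators that saturates at one for inputs beyond k. The function names and the tuning constant k = 4.685, a conventional default for this loss, are likewise not taken from this work.

```python
def biweight_loss(u, k=4.685):
    """Tukey biweight loss, scaled so that it equals one for |u| >= k."""
    if abs(u) >= k:
        return 1.0
    return 1.0 - (1.0 - (u / k) ** 2) ** 3


def biweight_weight(u, k=4.685):
    """Regression weight implied by the loss; zero for |u| >= k (point rejected)."""
    if abs(u) >= k:
        return 0.0
    return (1.0 - (u / k) ** 2) ** 2


# Hypothetical scaled residuals u = residual / standard deviation.
scaled_residuals = [0.0, 1.0, 3.0, 6.0]
losses = [biweight_loss(u) for u in scaled_residuals]
weights = [biweight_weight(u) for u in scaled_residuals]
```

Observations with small scaled residuals keep a weight close to one, whereas the observation with u = 6 exceeds k, reaches the maximum loss of one and receives a weight of zero, i.e. it no longer influences the fit.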
Regression results using robust regression with compensation for reaction conditions are shown in Supplementary Figure 18. The property-group regression coefficients β obtained by robust regression show only small deviations compared to the final model (Fig. 4), which uses ordinary least squares regression. The largest change (0.54 percentage points) is obtained for groups 3a/3b. The regression 4a/4b is statistically insignificant using robust regression. However, the main trends are retained, which shows the high robustness of the final model towards the implementation of the linear regression.

Supplementary Note 19 Preparation and testing of Al 2 O 3 supported catalysts
Alumina supported catalysts were prepared by wet impregnation of an Al 2 O 3 support with aqueous solutions (alkali carbonates, alkaline earth acetates), followed by drying. The procedure was adapted from Kusche et al. [30]. The employed γ-Al 2 O 3 support was supplied by Südchemie/Clariant as "Al 2 O 3 -100" pellets, which were ground and sieved to a size fraction of 200 - 500 µm. Thereafter, the Al 2 O 3 was calcined for 12 h at 800 °C in air. The obtained Al 2 O 3 had a surface area of 90 m²/g, as evaluated from N 2 physisorption by the BET method.
Catalyst loading was chosen as 23.1 wt% of the employed salt in its anhydrous state. Only carbonates of alkali elements are sufficiently soluble for aqueous impregnation. Hence, catalysts containing alkaline earth elements were prepared using acetates. For wet impregnation, typically 1.0 g of Al 2 O 3 was stirred in a round flask with 20 ml of aqueous salt solution for 5 min. The concentration of the solutions was c = 15 g/l with respect to the anhydrous salt. After impregnation, the catalysts were dried for 30 min at 60 °C and 60 mbar using a rotary evaporator, followed by a further drying step for 4 h at 150 °C and 37 mbar using a vacuum oven.
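The stated loading can be cross-checked against the impregnation parameters. The short Python calculation below assumes that the complete amount of dissolved salt is deposited on the support during evaporation and that the loading is defined as salt mass divided by the total mass of salt and support; both are assumptions that are consistent with the numbers given above, and the variable names are illustrative.

```python
support_mass_g = 1.0                 # Al2O3 per impregnation
solution_volume_l = 0.020            # 20 ml of aqueous salt solution
salt_concentration_g_per_l = 15.0    # with respect to the anhydrous salt

# Mass of anhydrous salt contained in the impregnation solution.
salt_mass_g = solution_volume_l * salt_concentration_g_per_l

# Loading as weight fraction of the salt in the dried catalyst (assumed definition).
loading_wt_percent = 100.0 * salt_mass_g / (support_mass_g + salt_mass_g)
```

With 0.3 g of salt on 1.0 g of support, the calculation yields 23.1 wt%, which matches the chosen catalyst loading.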
The catalysts were tested in a parallel fixed bed reactor setup at the Leibniz Institute for Catalysis (LIKAT) Rostock. The setup employs 48 quartz tube reactors with 4 mm inner diameter. The fixed bed is held by a glass wool plug at equal height in each reactor. The catalyst powder (50 mg) is placed on the plug. Above the catalyst powder, pre-calcined SiC granules (300 mg) are placed to ensure that the temperature of the feed equals the heating temperature.
For catalytic testing, the reactors were heated (5 K / min ramp) in OCM feed. Catalytic measurements were performed at holding temperatures in 25 K intervals (450 °C, 5 K / min ramp, 475 °C, …) up to 850 °C. At each holding temperature, the catalysts were measured sequentially by analysing the gas phase using a gas chromatograph (Agilent 7890) equipped with PLOT/Q (for CO 2 ), AL/S (for hydrocarbons) and Molsieve 5 (for H 2 , O 2 , N 2 , and CO) columns [6]. Conversions and selectivities were calculated on feed basis using the peak areas of a flame ionisation detector, FID (CH 4 , C 2 H 6 , C 2 H 4 ), and a thermal conductivity detector, TCD (O 2 , N 2 , CO 2 , CO).
OCM testing was performed in two different feed compositions with similar CH 4 / O 2 ratio. In the "N 2 OCM feed", N 2 was used as diluent. In the "CO 2 OCM feed", N 2 was almost completely replaced by CO 2 . A low concentration of N 2 was necessary as a reference for the calculation. Each feed was tested in independent test runs, starting each time from unused catalysts. The feed compositions are given in Supplementary Table 19. The flow in each reactor channel amounted to 14.7 Nml/min, resulting in a contact time of 0.26 s.
The experimental values contained in the studied dataset feature a broad distribution of reported values even for simple catalyst compositions. Supplementary Figure 20 displays the C 2 yield density distributions for two of the most popular catalyst systems, i.e. Li/MgO and variations of the Mn-Na-W/SiO 2 system. The values are shown without any modeling or correction. C 2 yields reported e.g. for Li/MgO range up to ca. 30%, with a mean value of about 12.6%. The broad yield distributions illustrate that a substantial variety in reported yields exists, independent of the definition of the property groups. Thus, this variation is always a major contribution to the yield distribution within any property group.