High-throughput predictions of metal–organic framework electronic properties: theoretical challenges, graph neural networks, and data exploration

With the goal of accelerating the design and discovery of metal–organic frameworks (MOFs) for electronic, optoelectronic, and energy storage applications, we present a dataset of predicted electronic structure properties for thousands of MOFs carried out using multiple density functional approximations. Compared to more accurate hybrid functionals, we find that the widely used PBE generalized gradient approximation (GGA) functional severely underpredicts MOF band gaps in a largely systematic manner for semi-conductors and insulators without magnetic character. However, an even larger and less predictable disparity in the band gap prediction is present for MOFs with open-shell 3d transition metal cations. With regards to partial atomic charges, we find that different density functional approximations predict similar charges overall, although hybrid functionals tend to shift electron density away from the metal centers and onto the ligand environments compared to the GGA point of reference. Much more significant differences in partial atomic charges are observed when comparing different charge partitioning schemes. We conclude by using the dataset of computed MOF properties to train machine-learning models that can rapidly predict MOF band gaps for all four density functional approximations considered in this work, paving the way for future high-throughput screening studies. To encourage exploration and reuse of the theoretical calculations presented in this work, the curated data is made publicly available via an interactive and user-friendly web application on the Materials Project.


INTRODUCTION
Metal-organic frameworks (MOFs) have been extensively studied over the last two decades due to their high degree of synthetic tunability, which makes it possible to tailor their physical and chemical properties for a given application 1,2 . While much attention has been focused on the use of MOFs for industrial gas storage and separations 3,4 , the design of MOFs with targeted electronic properties has become a topic of recent interest as well [5][6][7][8] . Through a judicious selection of inorganic nodes and organic linkers, MOFs have been proposed for novel electronic and optoelectronic devices, electrocatalysts, photocatalysts, sensors, and energy storage devices, among many other applications 6,[9][10][11] . However, with tens of thousands of MOFs that have been experimentally synthesized 12 and virtually unlimited more that can be proposed 13 , it is often difficult to identify promising MOF candidates with the optimal set of electronic properties.
The advent of machine learning (ML) and related big data approaches has made it possible to more efficiently search through MOF chemical space, and high-throughput computational screening can often provide insight into previously unknown structure-function relationships [14][15][16][17][18][19][20][21][22] . With this goal in mind, a high-throughput density functional theory (DFT) workflow 23 was recently used to construct a publicly accessible dataset of quantum-chemical properties for thousands of MOFs (and coordination polymers), known as the Quantum MOF (QMOF) Database 24 . Like many databases of material properties generated from high-throughput periodic DFT calculations 25,26 , the electronic structure properties within the QMOF Database were computed with the relatively inexpensive Perdew-Burke-Ernzerhof (PBE) 27 exchange-correlation functional. While PBE is useful for generating large quantities of material property data that are often needed for ML, the electron selfinteraction error 28 of generalized gradient approximation (GGA) functionals like PBE can greatly influence the predicted electronic properties 28,29 . Perhaps most notably, PBE is known to severely underpredict band gaps [30][31][32] , but the degree to which there may be qualitative (as opposed to merely quantitative) errors is not wellestablished. This inherently limits the practical utility of data-driven, computational screening approaches based on such a functional.
For inorganic solids, several approaches have been taken to increase the accuracy of ML-predicted band gaps trained on highthroughput DFT calculations in a computationally tractable manner. The most straightforward option is to train ML models on experimental band gap data 33 or an ensemble of both theoretical and experimental band gap data 34 . Unfortunately, this approach is challenging to apply to MOFs because there are relatively few reports of experimentally measured MOF band gaps 8 . Furthermore, the reported band gaps of MOFs can vary by several tenths of an eV depending on the synthesis conditions and crystallinity of the material 6 . Another approach is to carry out higher-accuracy DFT calculations on a subset of materials and use them to train an ML model that can make more reliable predictions. Recently, large datasets of band gaps computed with meta-GGA and hybrid functionals have been published for inorganic solids [35][36][37] , although no such resource currently exists for MOFs.
In the present work, we complement the existing dataset of PBE electronic structure properties in the QMOF Database with analogous data computed using three other functionals: HLE17 38 (a high local-exchange meta-GGA), HSE06 39,40 (a screened-exchange hybrid GGA), and a functional we refer to here as HSE06 * in which the amount of screened Hartree-Fock (HF) exchange of HSE06 has been changed from 25% at short interelectronic distances to 10%. By analyzing the electronic structure properties calculated at these levels of theory, we uncover severe theoretical limitations associated with the more computationally efficient (meta-)GGA density functionals that prevent them from achieving quantitatively-and sometimes qualitatively-accurate band gap predictions for MOFs and coordination polymers with respect to hybrid functionals. Since it is known that different density functional approximations (DFAs) can alter the underlying charge density, we also investigated trends related to the computed partial atomic charges. In general, we find that the different levels of theory predict similar partial atomic charges; however, as compared to PBE, the meta-GGA and screened hybrids tend to shift electron density away from the metal centers and onto the ligand environments.
We conclude by using the electronic structure data to train multi-task and multi-fidelity convolutional neural network models that can predict PBE, HLE17, HSE06, and HSE06 * band gaps given a graph-based representation of a MOF crystal structure. We anticipate that the computational data, trends, and subsequent deep learning models presented in this work will make it possible to achieve both rapid and accurate predictions of MOF band gaps that can greatly accelerate the materials design and discovery process. To help realize this vision, all the data underlying the QMOF Database is now also made available as a dedicated, interactive application on the widely used Materials Project 41 .

Band gap comparison
To develop ML models that can directly guide future experimental efforts, it is essential to first understand the behavior and potential limitations of various levels of theory when predicting MOF electronic structure properties. As such, we begin by comparing the DFT-predicted band gaps for 10,720 structures in the QMOF Database with the PBE (GGA: 0% HF exchange), HLE17 (meta-GGA: 0% HF exchange), HSE06 * (screened hybrid: 10% HF exchange at small interelectronic distances decreasing to zero at large distance), and HSE06 (screened hybrid: 25% HF exchange at small interelectronic distances decreasing to zero at large distance) functionals.
As shown in Fig. 1, we observe pronounced differences amongst the predictions of the various DFAs. Starting with the box plots, we find that of the four functionals tested in this work, PBE generally predicts the lowest band gaps. Including HF exchange-as with HSE06 * and HSE06-tends to increase the predicted band gap values (as expected 42 ), with the relative increase depending on the fraction of HF exchange in the selected functional. Qualitatively, the HSE06 * and HSE06 results are more reflective of prior experimental studies 6 , which suggest that the majority of MOFs are electronically insulating and that comparatively few exhibit semi-conducting or metallic character. Switching focus to the HLE17 meta-GGA, we find that the median band gap value is within 0.09 eV of the HSE06 * calculations, suggesting that the parameterization of this functional can partially improve upon the band gap underprediction problem of PBE despite not incorporating HF exchange.
When comparing the violin plots in Fig. 1, it is immediately clear that the shape of the band gap distribution can vary significantly depending on the DFA. The PBE-computed band gap data exhibits two distinct distributions with peaks around 0.90 eV and 2.93 eV (Fig. 1), which is observed for the full QMOF Database of 20,000 structures as well ( Supplementary Fig. 6). A qualitatively similar distribution of band gaps is obtained when using the HLE17 functional, which has peaks around 0.86 eV and 3.21 eV. However, the two distributions in the band gap data exhibit much more significant overlap for the HSE06 * functional, and for the HSE06 functional there is almost complete overlap such that the overall distribution is virtually unimodal.
The two underlying distributions in the band gap data can be better understood by separating the computed values based on whether the material has closed-shell or open-shell character, the latter of which is associated with lower band gaps on average (Fig.  2a). When including 10% HF exchange with HSE06 * , the degree of overlap between the closed-shell and open-shell band gap distributions is partway between that of PBE and HSE06 (Fig.  2a), which illustrates the strong dependence of the trends on the fraction of HF exchange. Taking the hybrid-quality calculations as the more accurate reference point 43 , these findings suggest that the PBE functional exhibits severe quantitative and qualitative shortcomings when applied to a wide range of MOF structures and that these shortcomings go beyond a simple underprediction of the band gap. Although HLE17 increases the median band gap of the dataset compared to PBE and decreases the number of structures with a predicted band gap in the low-energy subset, it retains the bimodal nature of the band gap distribution. Nonetheless, HLE17 does significantly increase the band gaps of the closed-shell frameworks, and the distribution of band gaps for the closed-shell MOFs is similar to that of HSE06 * .
By directly comparing the predicted band gaps for the PBE, HSE06 * , and HSE06 calculations, we find that there is a correlation between the median band gap and the fraction of HF exchange (Fig. 2b), at least within the range of 0-25% HF exchange considered in this work. Assuming linear behavior in this region, it can be concluded that the median band gap across the dataset changes by~0.05 eV per percent of HF exchange for the closedshell frameworks and~0.10 eV per percent of HF exchange for the open-shell frameworks, although we emphasize that these statistics are specific to the QMOF Database and may differ for other datasets of MOFs. Collectively, these results have significant Raincloud plots (i.e., combined violin plot, box plot, and strip plot) for the DFT-computed band gaps, E g , of 10,720 structures in the QMOF Database at the PBE, HLE17, HSE06 * , and HSE06 levels of theory. The strip plots show all the data at that level of theory (jittered horizontally for ease-of-visualization). The box plots show the extrema (whisker tails), interquartile range (box boundaries), and median (horizontal line). The violin plots show the probability density of the data.
implications for computational screening studies of MOFs and coordination polymers, as the use of GGA functionals like PBE may lead to incorrect qualitative comparisons between the band gaps of different materials if some have closed-shell character and others have open-shell character.
While Figs. 1 and 2 show how the entire dataset changes with different density functionals, it is also important to investigate the degree of correlation between the various functionals. As shown in Fig. 3, nearly every MOF has a larger predicted band gap with the HSE06 * (Fig. 3b) and HSE06 (Fig. 3c) functionals than with PBE. This is also the case for most of the closed-shell MOFs with the HLE17 functional, especially when E g,PBE is above~1.5 eV (Fig. 3a). For the closed-shell frameworks ( Supplementary Fig. 7), there is a linear correlation between the computationally inexpensive PBEquality band gaps and those calculated with the more accurate HSE06 * and HSE06 functionals as well as the HLE17 functional. As  shown in Supplementary Fig. 7c, a simple linear equation of the form 1.09E g,PBE + 1.04 eV can predict HSE06 band gaps with an R 2 value of 0.92, provided the frameworks are closed-shell systems and have HSE06 band gaps above~1.0 eV. Similar linear equations can be obtained for HLE17 and HSE06 * for the closed-shell structures (Supplementary Fig. 7a and Supplementary Fig. 7b). The correlation between PBE and the hybrid functionals is weaker for MOFs with open-shell character, hence the larger degree of scatter in the low E g,PBE range of Fig. 3b and c.
As might be anticipated based on trends in crystal-field splitting parameters and spin-pairing energies 44 , most openshell materials in the QMOF Database contain 3d transition metal cations (particularly Cu, Co, Mn, Ni, Fe, V, and Cr in decreasing frequency of occurrence) ( Supplementary Fig. 8). Previous theoretical work on transition metal complexes and gas-phase molecules containing transition metal cations has implicated large self-interaction errors (a consequence of each electron interacting with the total electron density, including its own 28 ) as a major source of errors in systems with 3d transition metal cations that have open-shell character 45,46 . More generally, selfinteraction error is usually considered to be responsible for many of the deficiencies of DFT across virtually all properties and material classes, often due to the associated delocalization error 47,48 . Since self-interaction error is partially decreased by the inclusion of HF exchange, this is a major reason that the hybrid functionals give different results than the local functionals for the band gap predictions in this work.

Partial charge comparison
Beyond band gaps, it is well-established that different DFAs can change how the charge density is distributed in a given material [49][50][51][52][53] . Furthermore, partial atomic charges (which can be computed directly from the underlying charge density) are commonly used in molecular simulations of MOFs and can be used to interpret trends when modeling redox processes and chemical reactions 54,55 . One such method to compute partial atomic charges, the sixth-generation Density Derived Electrostatic and Chemical (DDEC6) partitioning scheme [56][57][58] , has found widespread use in molecular simulations of MOFs 54 (e.g., for gas storage and separations) and has performed well in tests of reproducing the electrostatic potential 59 . To explore the sensitivity of partial atomic charges to different DFAs, we compared over 900,000 partial charges calculated from the DDEC6 method using charge densities at the PBE, HLE17, HSE06 * , and HSE06 levels of theory.
As shown in Fig. 4a, the DDEC6 partial atomic charges calculated by PBE and HLE17 are highly correlated across the entire dataset, with most points falling within 0.04 charge units from the y = x line. When investigating the computed partial charges by HSE06 * , we find that the HSE06 * partial charges are even closer to the PBE reference than the HLE17 partial charges are (Fig. 4b), indicating that 10% HF exchange at small interelectronic distances does not substantially change the first moment of the charge density. However, when increasing the HF exchange at small interelectronic distances to 25% with HSE06, a slightly larger difference can be observed (Fig. 4c). By focusing solely on the metal elements and the ligand atoms within their first coordination spheres (as determined using the CrystalNN near-neighbor finding algorithm 60,61 ), we find thatcompared to the PBE reference-there is often a loss of electron density (i.e., increased partial atomic charge) at the metal and corresponding gain of electron density (i.e., decreased partial atomic charge) on the surrounding ligands when using the HSE06 functional (Fig. 4d). These trends are consistent with previous partial charge analyses carried out on transition metal complexes and open-framework solids 46,52,62 . Given the large partial charge dataset in the present work, we can conclude that this shifting of electron density occurs for an enormously diverse range of metal-ligand environments and can be taken as a rule-of-thumb in most cases. While there are differences in the partial atomic charges between the various levels of theory, they are generally relatively minor. The overall strong agreement suggests that the less expensive PBE-quality charges, which are available for thousands of MOFs 24,54 , are likely suitable when carrying out high-throughput computational screening studies.
Since no single charge partitioning scheme is expected to be ideal for all applications, we also compared the effect of different charge partitioning schemes for a given DFA. As shown in Fig. 5, the differences between Bader 63,64 , DDEC6 56,57,65 , and Charge Model 5 (CM5) 66 partial atomic charges (as computed with the PBE functional) tend to be far larger than any differences observed when changing the DFA, similar to what has been observed for several inorganic solids 67 . This is especially the case when directly comparing the Bader and DDEC6 methods. As one example of many, large deviations are often observed for the S and P atoms of SO 4 2− and PO 4 2− groups, which have partial atomic charges upwards of~2.4 charge units higher with the Bader method than the DDEC6 method. In addition, there can be qualitative differences between Bader and DDEC6 charges, such as atoms that have a partial positive charge with the Bader method but a partial negative charge with the DDEC6 method. While there are also clear differences between the DDEC6 and CM5 methods (Fig.  5b), the agreement between these two charge partitioning approaches is generally greater than that between DDEC6 and Bader. For applications involving systems quite different from those in available benchmarks 55,56,66 , it might be advisable to compare multiple partial charge schemes and further investigate any substantial differences 68 .

Machine learning
With the goal of reducing the number of DFT calculations needed in future high-throughput computational screening studies, we have evaluated the performance of several ML models that can predict MOF band gaps from graph representations of their threedimensional structures (for the prediction of partial atomic charges, we refer the reader to several ML models 69-71 that have been shown to accurately predict PBE-quality DDEC6 and CM5 charges for MOFs). Using MatDeepLearn 72 , we first trained individual graph neural networks for each DFA and found that they performed well at predicting DFT-computed band gaps compared to a baseline model that simply predicts the mean of the dataset for each entry (Table 1). Prior work 24,72 on the QMOF Database showed that a crystal graph convolutional neural network model 73 could predict PBE band gaps with a comparable accuracy, and it is reassuring that relatively low testing-set MAEs on the order of 0.24-0.29 eV can be obtained for the more accurate DFAs (i.e., HLE17, HSE06 * , HSE06). Overall, the graph neural network trained on PBE band gap data performs better than the graph neural networks trained on the HLE17, HSE06 * , or HSE06 datasets, which can likely be attributed to the greater number of data points available for training with PBE. Despite similar training set sizes for the HLE17, HSE06 * , and HSE06 levels of theory, the model based on HSE06 data has the largest testing set MAE of 0.29 eV, which may be attributed in part to a wider range of possible band gap values and a greater overlap in the band gap distributions for the closed-and open-shell frameworks.
Next, we considered various approaches that could make more efficient use of the available band gap data obtained with different functionals. Starting with a multi-task learning approach that predicts band gaps for all four DFAs simultaneously using a single model architecture, perceptible but minor improvements to the model performance are obtained (Table 1). While more convenient to use than multiple individual models if multiple band gap estimates are desired, an inherent drawback of the multi-task learning method is that the training process requires structures that have band gaps computed for all DFAs of interest, which limits the amount of data that can be used.
An alternate way to efficiently leverage data at multiple levels of theory is to construct a multi-fidelity model, which treats each level of theory as a unique sample 74,75 . With a substantially expanded dataset size of up to 52,806 samples, we find that the multi-fidelity MEGNet model architecture of Chen et al 75 . achieves significantly lower MAEs than the individual and multi-task models for the 3-fi (i.e., PBE, HLE17, and HSE06 * ) and 4-fi (i.e., PBE, HLE17, HSE06 * , and HSE06) models (Table 1). These results demonstrate that data at multiple levels of theory can be used to improve the overall model performance, which is especially important for the prediction of band gaps from hybrid functionals that are more computationally demanding to calculate. However, we note that the 2-fi model (i.e., PBE + HSE06) does not outperform the multitask model. In future studies, it may be worthwhile to consider additional approaches (e.g., Δ-learning) 76 if only two fidelities are available, especially given the correlation between the PBE and HSE06 functionals (Fig. 4c). The testing set parity plots for each Fig. 5 Correlation between partial atomic charges with different charge partitioning schemes. a Comparison of the partial atomic charges, q, for 1,429,082 atoms computed using the Bader and DDEC6 charge partitioning schemes at the PBE level of theory. b Comparison of the partial atomic charges, q, for 2,321,435 atoms computed using the CM5 and DDEC6 charge partitioning schemes at the PBE level of theory. Given the large dataset size, the data is shown as 2D histograms with the logarithmic color bar reflecting the frequency of points in each bin. The y = x line is shown as for reference.
A.S. Rosen et al. model are presented in Supplementary Figs. S12-S16, which show that the predictive accuracy generally holds over the range of band gaps, albeit with an increase in scatter toward the low band gap region (e.g., E g,DFT < 0.5 eV). The increased error in the low band gap region can likely be traced back to several factors, such as a smaller number of MOFs to train on in this range and a higher fraction of open-shell MOFs whose properties are likely more difficult to predict with ML models. Collectively, we anticipate that the multi-task and multi-fidelity ML models will be a valuable resource for future high-throughput screening studies by minimizing the need to carry out computationally demanding hybrid DFT calculations, particularly if low-fidelity PBE band gap data is readily available (as is the case with the QMOF Database). Given the promising nature of the multi-fidelity ML models, incorporating experimentally determined band gaps 6,8 during the training process would likely be worth pursuing in future work.

QMOF database on the materials project
With DFT-computed properties at multiple levels of theory, we aimed to make the QMOF Database align with the findable, accessible, interoperable, and reusable (FAIR) guiding principles 77,78 . Therefore, we conclude by showcasing an interactive web application hosted on the Materials Project 41,79 , which can be accessed at the following webpage: https://materialsproject.org/ mofs. Known as the Materials Project MOF Explorer, the web application makes it possible to investigate the computed properties in the QMOF Database through a user-friendly, search-based interface. The data driving the MOF Explorer is made available to the public through the Material Project's contribution platform MPContribs 80,81 . The MPContribs application programming interface and its accompanying Python client 82 provide a unified mechanism for contributors to submit a dataset and for the community at large to programmatically retrieve, download, and query the contributed materials data. Here, contributions containing materials data are linked to a given MOF via a dedicated, unique identifier (QMOF ID) and are organized in components of queryable dictionary data, Pymatgen 83 structure objects, and binary data files.
As shown in Fig. 6, the Materials Project-hosted MOF Explorer allows users to sort and filter materials in the QMOF Database by numerous geometric, compositional, textural, topological, magnetic, and electronic properties. Selecting a single material on the MOF Explorer leads to a detailed calculation summary page, which lists various tabulated properties for that material and an interactive visualization of the DFT-optimized crystal structure. In addition to DFT-computed properties, each material has an associated MOFid/MOFkey 84 (where computable) to support substructure searches as well as cross-referencing with other MOF databases. As the QMOF Database continues to evolve, we plan to incorporate additional computed properties and visualizations on the Materials Project to enable further data exploration.

DISCUSSION
With a generated dataset of electronic structure properties for a subset of~10,700 MOFs (and coordination polymers) in the QMOF Database 24 , we compare the performance of different DFAs for the prediction of band gaps and partial atomic charges. When comparing DFT-computed band gaps with the commonly used PBE functional against those that incorporate some fraction of HF exchange, we observe that PBE almost universally results in a lower band gap prediction, as might be expected from prior work. Notably, this difference is largely systematic for MOFs with closed-shell electronic configurations and can be empirically corrected through a simple linear relationship for structures that are semi-conductors or insulators. For MOFs with open-shell electronic configurations (in particular, those containing 3d transition metals), an even larger-and less predictable-disparity between band gap predictions is observed as a function of the fraction of HF exchange. As compared to the PBE results, the meta-GGA HLE17 is found to increase the computed band gaps for the closed-shell MOFs such that they are similar to values predicted using the HSE06 screened hybrid functional with 10% HF exchange at small interelectronic distances (denoted here as HSE06 * ). Individual, multi-task, and multi-fidelity model performance. The individual models represent four separate models that are each trained on band gaps at a single level of theory. The multi-task model is a single model that is trained on and predicts band gaps at all four levels of theory simultaneously. The multifidelity models combine data from different levels of theory without all samples needing to have band gaps at each level of theory. The 2-fi, 3-fi, and 4-fi models are trained/tested on PBE + HSE06, PBE + HLE17 + HSE06, and PBE + HLE17 + HSE06 a + HSE06 data, respectively. A baseline model that simply predicts the mean value of the dataset is shown for reference. The dataset sizes refer to the entire available dataset, which is split 80:5:15 train:validation:test. The mean absolute errors (MAEs) are shown for the testing set. However, compared to the hybrid functionals, HLE17 does not as significantly increase the band gaps of the open-shell MOFs. When investigating partial atomic charges, which are reflective of the underlying charge density for a given density functional approximation, we find that there are slight systematic differences amongst the predictions of the different functionals. For both the HLE17 meta-GGA and the screened hybrid functionals, electron density localized on the metals is lower than with PBE, and the opposite is true for the coordinating ligand atoms. Nonetheless, these changes in the partial atomic charges are relatively minor compared to the differences that arise from using different charge partitioning schemes.
Finally, we used the electronic structure data generated in this work to train multiple ML models that can predict MOF band gaps at various levels of theory from graphs of the underlying crystal structures. We find that individual graph neural network models can predict PBE, HLE17, HSE06 * or HSE06 band gaps from the QMOF Database with a testing-set MAE of 0.23-0.29 eV. A multitask graph neural network model capable of simultaneously predicting MOF band gaps for all four functionals performs slightly better than the individual models, but with three or more functionals to train on, a multi-fidelity model achieves the best performance of the models tested in this work.
High-throughput computational screening approaches have historically been devoted to the discovery of MOFs tailored for gas storage and separations. With the dataset and ML models presented in this work-coupled with an increased understanding of the behavior of common DFAs for predicting electronic properties-we anticipate that a computational materials design perspective can be brought to countless application areas for MOFs. Now hosted on the widely used Materials Project platform (https://materialsproject.org/mofs), theorists and experimentalists alike can leverage the data from tens of thousands of quantummechanical calculations to accelerate the discovery of promising MOFs for electronic and optoelectronic applications.

Density functional theory calculations
Plane-wave, periodic DFT calculations were carried out using the Vienna ab initio Simulation Package (VASP) 85,86 version 5.4.4 and the Atomic Simulation Environment (ASE) 87 version 3.20.0b1. All structures were adopted from the QMOF Database 24 . We consider properties calculated with four exchange-correlation functionals: PBE-D3(BJ) 27,88,89 , HLE17 38 , HSE06 39,40 , and HSE06 * (i.e., HSE06 with reduced HF exchange). The PBE-D3 (BJ) calculations were obtained from the QMOF Database, as previously reported 24 . The HLE17, HSE06, and HSE06 * calculations are carried out in this work using structures from the QMOF Database 24 that were previously optimized with the PBE-D3(BJ) exchange-correlation functional. In commonly accepted notation, these levels of theory would generally be referred to as PBE-D3(BJ), HLE17//PBE-D3(BJ), HSE06//PBE-D3(BJ), and HSE06 * //PBE-D3(BJ), indicating that the functional to the left of the doubleslash is a single-point (i.e., static) calculation carried out on the geometry obtained using the functional to the right of the double-slash. For brevity, we will simply refer to these levels of theory as PBE, HLE17, HSE06, and HSE06 * , respectively. Of the 20,000+ structures in the QMOF Database with properties computed using PBE,~10,700 have computed properties at the HLE17, HSE06, and HSE06 * levels of theory based on the calculations in this work.
The HSE06 functional is a screened-exchange functional built upon PBE and replaces a portion of PBE's local exchange with 25% HF exchange at small interelectronic distances, decreasing continuously to zero at large interelectronic distances 39,40 . HSE06 was selected in this work because it is currently the most widely used functional for predicting the band gaps of solid-state materials when high accuracy is required, including for MOFs 43,90 . Other functionals may have comparable or slightly better performance for certain systems 37,91-93 but are less widely used and tested. In addition to HSE06, we considered the hybrid functional defined here as HSE06 * , which has 10% HF exchange at small interelectronic distances and decreases to zero at large interelectronic distances. HSE06 * was considered because the standard HSE06 functional can overcorrect the band gap underprediction problem of PBE for some materials 94 , as is the case with MOF-5 95,96 . Considering a functional with an intermediate fraction of HF exchange between that of PBE and HSE06 also makes it easier to discern the impact of HF exchange. The HSE06 and HSE06 * calculations are considerably more expensive than the PBE calculations because of the nonzero fraction of HF exchange. With this in mind, we included the HLE17 meta-GGA functional as well because prior benchmarking studies 38,43 suggest that it can greatly improve the prediction of semiconductor band gaps without the need for computationally expensive HF exchange. While one could also consider the GGA+U approach 97 , relatively little is currently known about selecting empirically ideal U values for MOFs 90,98,99 despite its widespread use in correcting the predicted energetic and electronic properties of inorganic solids in high-throughput DFT databases [100][101][102] .
For materials that are closed-shell (i.e., without magnetic character), the band gap is defined as the energy difference between the conduction band minimum (CBM) and valence band maximum (VBM). For materials with open-shell character, there can be more than one way to characterize the band gap 103 . Except where otherwise stated, we define the band gap for spin-polarized systems as min CBM " ; CBM # À Á À max VBM " ; VBM # À Á , where ↑ and ↓ refer to the spin-up and spin-down spin-orbital manifolds, respectively. Nonetheless, we note that this definition can occasionally result in a band gap that is associated with a formally spin-forbidden electronic excitation, as depicted in Supplementary Fig. 4. Using the band gap instead defined as min CBM " À VBM " ; CBM # À VBM # À Á does not involve a spin-flip. Regardless of which band gap definition is employed, the trends and conclusions reported throughout this work remain unchanged ( Supplementary Fig. 5). We also note that the computed band gaps refer to electronic band gaps and are not directly comparable to experimentally measured optical gaps (e.g., via UV-Vis spectroscopy) 104,105 , particularly when the exciton binding energies are non-negligible, as has been observed for some MOFs 106 .

Machine learning
Graph neural network architectures, which take graphs representing the crystal structures as inputs, were used for the ML models. The graph representations contain atoms as nodes and interatomic distances as edges. Here, the atoms are represented with a one-hot encoding of the element with a vector length of 100 within the node attributes. The edge attributes contain interatomic distances within a cutoff of 8 Å and up to 12 neighbors per node, where the distances were then expanded by a Gaussian basis 114 to a length of 50. In this work, an additional state attribute is included, representing the level of theory used (i.e., fidelity) as an integer. The graph neural network itself adopts the MatErials Graph Network (MEGNet) architecture 115 where the node, edge, and state attributes are propagated sequentially in the stated order during the graph convolutional steps. The overall model contains one pre-processing layer, four graph convolutional layers, one pooling layer using the Set2Set function, and finally two post-processing layers. The pre-processing, post-processing, and graph convolutional update functions are all fully-connected layers with Rectified Linear Unit activation functions and with dimensions of 128, 128 and (128, 128), respectively. The models were trained with the AdamW optimizer 116,117 using an initial learning rate of 0.0005 and a batch size of 128 for a total of 250 epochs. The model state with the lowest validation MAE is saved and used for testing. The training:validation:testing ratio used is 80:5:15, and the samples were randomly split across the training, validation, and testing sets. For all cases in this work, the same hyperparameters were used in the models. For the individual models, the models were trained separately. In multi-task learning, the output dimension was expanded to four, and the predictions were performed simultaneously with a single model for all fidelities (i.e., levels of theory). For multi-fidelity learning, we adopt the approach used by Chen et al 75 . where each fidelity is considered a unique data sample and structures with different fidelities can appear in both training and testing data splits. The model training and testing was set up and performed using the MatDeepLearn framework 72 , which is implemented using the PyTorch 118 A.S. Rosen et al. and PyTorch geometric 119 libraries. The training and evaluation were conducted on four NVIDIA Tesla V100 ('Volta') graphics processing units.

DATA AVAILABILITY
With the release of the Materials Project-hosted MOF Explorer interface to the QMOF Database, all data in this work can be accessed at the following webpage: https://materialsproject.org/mofs. Each version of the QMOF Database made available on the Materials Project is permanently archived on Figshare at the following DOI: 10.6084/m9.figshare.13147324. The VASP input and output files are made available via the Novel Materials Discovery (NOMAD) platform 120