Retrofitting metal-organic frameworks

The post-synthetic installation of linker molecules between open-metal sites (OMSs) and undercoordinated metal-nodes in a metal-organic framework (MOF) — retrofitting — has recently been discovered as a powerful tool to manipulate macroscopic properties such as the mechanical robustness and the thermal expansion behavior. So far, the choice of cross linkers (CLs) that are used in retrofitting experiments is based on qualitative considerations. Here, we present a low-cost computational framework that provides experimentalists with a tool for evaluating various CLs for retrofitting a given MOF system with OMSs. After applying our approach to the prototypical system CL@Cu3BTC2 (BTC = 1,3,5-benzentricarboxylate) the methodology was expanded to NOTT-100 and NOTT-101 MOFs, identifying several promising CLs for future CL@NOTT-100 and CL@NOTT-101 retrofitting experiments. The developed model is easily adaptable to other MOFs with OMSs and is set-up to be used by experimentalists, providing a guideline for the synthesis of new retrofitted MOFs with modified physicochemical properties.


Supplementary Note 1: Definition of the host-guest system
The user of RetroFit has to identify the metal-OMS vector of the MOF as well as the vector of the donor groups of the cross linker (CL) to determine the angles γ and a as described in the manuscript. In the following, this is described for the example of a Cu paddlewheel MOF and a nitrile CL. However, this approach can universally be applied to other MOFs and CLs.

Structure of the MOF system
The relative position of two open metal sites in a Cu paddlewheel MOF can be described by the distance between the two Cu atoms (Cu2 and Cu4) RCuCu that are available for coordination and the angle of the CuCu vectors of the two paddle wheels γ or by the Cu1-Cu2-Cu4 angle (see Supplementary Figure 1

Structure of the cross linker
The CL can be described by the distance of the N atoms of the two coordinating nitrile groups RNN and the angle of the two CN vectors α (see Supplementary Figure 2).

Supplementary Figure 2:
Geometric description of the cross linker.

Geometric relation between MOF and CL
The orientation of a CL symmetrically bridging two paddlewheels can be described by the distance between a coordinating nitrile group and the Cu site RCuN and the two angles δ and θ (see Supplementary Figure 3).

Supplementary Figure 3:
Geometric description of the relative orientation of the MOF and the cross linker.
The parameters defining the three subsystems are summarized in Supplementary Table 1.  In the following, it is shown that α, θ and γ are interdependent. Therefore, the auxiliary angle k, which is the angle in the Cu2-Cu4-γ triangle, is defined:

Supplementary
The angle between N-Cu2-Cu4, which is the auxiliary angle e, is therefore: Now, a right-angled triangle, spanned by the distances RCuN and ½(RCuCu -RNN) (including angle e), can be defined (see Supplementary Figure 5). In addition, a second triangle can be defined (see Supplementary Figure 6). In this triangle,

Supplementary
have an angle sum of 180°: This equation can be rearranged to: Consequently, the entire host-guest system can be described by six variables, i.e. RCuCu, RNN, RCuN, a, g, and d, or RCuCu, RNN, RCuN, a, q, and d, or RCuCu, RNN, RCuN, a, g, and q. In the retrofitting tool, the input parameters of a molecule (α and RNN) and of the MOF-system (RCuCu and γ) are used. As the description of the whole system requires 6 variables, a screening of the 2 missing variables (RCuN and d) is required in order to find the optimal position on a MIP (see below).

Structural parameters of the MOF system
The geometric information of the MOF system (RCuCu and γ) is extracted from the crystal structure (cif-file) of the MOF and is a manual input. Please refer to the How-To for further information.

Structural parameters of the cross linker
All molecule geometries were optimized with the Gaussian09 program package. 1 Optimization was performed with DFT with a B3LYP hybrid functional and a 6-31G basis set. 2,3 For the optimization the tight convergence criterion was used and the Hessian was recalculated after each optimization step (keyword 'calcall'). All molecules were symmetry restricted during the optimization process (see Supplementary Figure 7). The resulting Gaussian output file was converted to the xyz-format using Open Babel (version 2.3.2) and then imported into the retrofit tool using the Atomic Simulation Environment (ASE) (version 3.16.0) to compute RNN and α. Figure 7: Library of dicyano-CLs used in this work. CLs 2,3,4,12,13,16,17,19, and 29 were restricted to D2h symmetry while CLs 1, 5-11, 14, 15, and 18 were restricted to C2v symmetry during the optimization with Gaussian.

Single point DFT calculations
To access energies in the host-guest system, a simple model was chosen and transferred to the real system. Precisely, we optimized a Cu(II) formate paddlewheel and an acetonitrile molecule, respectively, and then arranged the two entities that the nitrile group points towards the OMS of the paddlewheel. By varying RCuN, δ and θ according to the parameters given in Supplementary Table 2, we obtained an energy of the system for every combination of the three parameters, which allows the generation of a model interaction potential (MIP). Two screenings were performed: one with the acetonitrile in the Cu-O plane and another one with the acetonitrile in the plane of the O-Cu-O angle bisector. Both screenings yielded similar energy values and therefore we proceeded with the values for the in-plane scan. The configuration with the lowest energy is defined as 0 kcal/mol and all energies are given as energy differences DE.
The single point calculations were done on a DFT level of theory using the TURBOMOLE (V7.1) software package. 4 The hybrid functional B3LYP 3, 5 was used with a TZVPP basis set 6 and a fine 'm5' grid 7 for all elements. The multipole-accelerated 8 resolution of the identity approximation 9, 10 was used for performance reasons. Grimmes D3 11 was employed to properly account for dispersive interactions. The SCF convergence criterion was set to 10 -6 Hartree.
The spin-state of the Cu 2+ dimer with a total of 18 d-electrons was assigned to be the ferromagnetically coupled triplett state (eight occupied d orbitals and two singly occupied d orbitals with alpha spin). Even though the spin state in a Copper paddlewheel MOF is the openshell singled state(eight occupied d orbitals and two singly occupied d-orbitals, one with alpha and one with beta spin), the excited state we chose is experimentally only 292.2 cm -1 , 12 and in a comparable hybrid (B3LYP/cc-pVDZ-PP/UDFT) calculation 372 cm -1 above the ground state energy, 13 which at room temperature makes it notably populated. Computing the antiferromagnetically coupled ground state or applying e.g. a multireference treatment however, requres a more involved theoretical description, which is why we chose here to focus on the low lying excited state. This was justified in the aforementioned computational study, where the two states were compared with respect to their energies, geometries and vibrational frequencies/modes and were found to be "nearly indistinguishable". 13 Due to the electronic configuration, the calculations were carried out spin-unrestricted. (R JKL , δ, θ) = (R JKL , 360 − δ, 360 − θ) (eq. 7)

Supplementary Note 3: Algorithm of the RetroFit program
The program code is written in the open source programming language Python TM (available at http://www.python.org). First, the data of the paddlewheel MIP (RMD, δ, θ and energy DE) is imported and interpolated. As interpolation method, a regular grid interpolator from the Scipy package was used (the class scipy.interpolate.RegularGridInterpolator). This interpolation method uses trilinear interpolation. As method-tag, 'linear' was chosen. The interpolation results in an object, which has as input RMD, δ and θ and returns the respective energy value E (i.e. DE) and works within the boundaries of the MIP as defined in Supplementary Table 2, including the mirroring of δ and θ for angles >180°. Second, the geometric information of the MOF system (RMM and ∢( . − / − 0 ), although γ is used as describing angle in rest of the program) and guest molecules (a, RDD) are imported. Note, for the ease of use the ∢( . − / − 0 ) is used as input value as it can be easily obtained from the MOF crystal structure. For non-paddlewheel MOFs this angle is defined by the vector M-M and a virtual vector between one metal center and the direction of its OMS. This data is used to calculate via (eq. 4) and (eq. 6) an energetically optimal triple of RMD, δ and θ. Hereby, θ can be easily calculated from a and γ with (eq. 6). In order to obtain RMD and δ, (eq. 6) is used. As this equation has two unknown variables, it is not straight forward to calculate both quantities simultaneously. Therefore, a set of δ values (δ -testlist) is used to calculate a respective set of RMD values. Subsequently, the δ -testlist, the set of RMD values and θ is used to calculate with the interpolation object a set of energies. The lowest energy value of the set is identified. By doing this, the energetically most favorable coordinates RMD, δ and θ of a given guest molecule with the geometry defined by a and RDD in a MOF system defined by RMM and γ are obtained for the respective input MIP data. A flowchart of the described algorithm is shown in Supplementary Figure 9. Figure 9: General workflow of the RetroFit program with emphasis on the various steps that are performed within the Python-based code. The result of one cycle is a ΔEmin value (purple) for a given CL@MOF system. When performed for various CLs for one parent MOF with OMS, CLs can be ranked with respect to their applicability in a retrofit experiment based on their ΔEmin values. A how-to guideline is provided as a separate part of the supplementary information.

Supplementary
The execution time of the code depends on the selected step size of the δ-testlist and the step size of the parameters a and RNN used to generate the energy map. Using a step size of 100 for the three parameters, which we found reasonable, the execution time is in the order of a few minutes sing a typical personal computer (Supplementary Figure 10). Figure 10: Screenshot of the reported runtime of RetroFit. The code was executed on a personal computer (Windows 10) using a step size of 100 for δ, a and RNN.

Supplementary Note 4: Restrictions and limitations of RetroFit
Open metal-site of MOF In this model, all four Cu atoms are located in one plane and consequently the Cu-OMS vector is also located in that plane. Therefore, this model works for 2-dimensional systems that have a mirror plane, meaning at least a C2v symmetry. For systems in which the Cu-OMS vectors are not in one plane additional assumptions are necessary (see results for the NOTT systems below).

Guest Molecule
All guest molecules are defined by their functional groups, which are in this case nitrile groups. The two nitrile groups (i.e. the C-N vectors) have to be in one plane for the system to be mirror symmetric to each other. Hence, a C2v symmetry of this four-atom system is required, or further assumptions are necessary.

Interaction potential of model system
For every new metal -functional group pair a new MIP has to be calculated, which requires some DFT calculations.

Interpolation
In the retrofit program an interpolation of the MIP data is performed. Hence, interpolated data are only available in the limits of the calculated MIP data (see Supplementary Table 2).

General
As no geometrical optimization is performed by the RetroFit program, all components are considered rigid. Therefore, the framework and the molecule do not change their geometry and bond lengths and angles are always constant in each subsystem. In addition, no dispersion interactions or interactions between two guest molecules are taken into consideration. For a more accurate simulation, the entire system had to be calculated by DFT at significantly higher computational costs.

Supplementary Note 6: Interpolation error
In order to verify the interpolation of the the MIP, we have checked for error values originating from the interpolation itself. The set of Energies ({ , , }) was linearly interpolated to obtain a hypothesis T ({ , , }) for any point inside the set of linearly spaced points U , U , U .
The Error of interpolation at any of these points is given by We divide the space into cubelets with dimensions 0.2 Å×5°×5°. For each cubelet, a maximum error is estimated.
In order to find an upper limit for the error in each cubelet, the diagonal of the cubelet is considered (from U , U , U to UV. , UV. , UV. ).
According to the mean value theorem, we can estimate the interpolation error for a linear interpolation of a point between two adjacent linearly spaced points U and UV. (e.g [ U , UV. ]) in the following way: (eq. 9) (/) ( i ) is the second derivative for [ U , UV. ]. Since the maximum value of the derivative is not known, we estimate it from the numerically calculated second derivatives at the nearest evaluation points U and UV. and choose the largest value of these two.
As we have a multi-dimensional problem, we estimate the error in each direction along the sides of the cubelet (from U , U , U to UV. , U , U , from U , U , U to UV. , U , U and from U , UV. , U to U , U , UV. ) and then estimate the total error in a cubelet as:

Supplementary Note 7: Results of the RetroFit algorithm
RetroFit calculates the optimal position of a CL (RCuN, δ, θ) within a given MOF and provides energy deviation DE from the ideal configuration. This allows to rank the tested CLs according to their fit, i.e. lowest DE values, and represents a guideline for experimentalists. The energies for all tested CLs and MOFs studied in this work are provided in Supplementary Table 3.